- How I Doubled My Income with Data Science and Machine Learning - Jun 1, 2021.
Many career opportunities exist in the ever-expanding domain of data. Finding your place -- and finding your salary -- is largely up to your dedication, focus, and drive to learn. If you are an aspiring Data Scientist or have already started your professional journey, there are multiple strategies for maximizing your earning potential.
Career Advice, Data Science, Data Science Skills, Machine Learning, Salary
- Supercharge Your Machine Learning Experiments with PyCaret and Gradio - May 31, 2021.
A step-by-step tutorial to develop and interact with machine learning pipelines rapidly.
Deployment, Machine Learning, Pipeline, PyCaret, Python
- Where Did You Apply Analytics, Data Science, Machine Learning in 2020/2021? - May 25, 2021.
Take part in the latest KDnuggets survey, and let us know where you have been applying Analytics, Data Science, Machine Learning in 2020/2021.
Analytics, Data Science, Machine Learning, Poll, Survey
- Write and train your own custom machine learning models using PyCaret - May 25, 2021.
A step-by-step, beginner-friendly tutorial on how to write and train custom machine learning models in PyCaret.
Machine Learning, Modeling, PyCaret, Python, Training
- Data Validation in Machine Learning is Imperative, Not Optional - May 24, 2021.
Before we reach model training in the pipeline, there are various components like data ingestion, data versioning, data validation, and data pre-processing that need to be executed. In this article, we will discuss data validation, why it is important, its challenges, and more.
Data Quality, Machine Learning, Production, Validation
- Easy MLOps with PyCaret + MLflow - May 18, 2021.
A beginner-friendly, step-by-step tutorial on integrating MLOps in your Machine Learning experiments using PyCaret.
Machine Learning, MLflow, MLOps, PyCaret, Python
- Best Python Books for Beginners and Advanced Programmers - May 14, 2021.
Let's take a look at nine of the best Python books for both beginners and advanced programmers, covering topics such as data science, machine learning, deep learning, NLP, and more.
Analytics, Books, Data Science, Deep Learning, Machine Learning, Python
- The Explainable Boosting Machine - May 13, 2021.
As accurate as gradient boosting, as interpretable as linear regression.
Decision Trees, Explainability, Gradient Boosting, Interpretability, Machine Learning
- A Comprehensive Guide to Ensemble Learning – Exactly What You Need to Know - May 6, 2021.
This article covers ensemble learning methods, and exactly what you need to know in order to understand and implement them.
CatBoost, Ensemble Methods, Machine Learning, Python, random forests algorithm, scikit-learn, XGBoost
- Feature stores – how to avoid feeling that every day is Groundhog Day - May 6, 2021.
Feature stores stop the duplication of each task in the ML lifecycle. You can reuse features and pipelines for different models, monitor models consistently, and sidestep data leakage with this MLOps technology that everyone is talking about.
Data Preparation, Feature Store, Machine Learning, MLOps
- What makes a winning entry in a Machine Learning competition? - May 5, 2021.
So you want to show your grit in a Kaggle-style competition? Many, many others have the same idea, including domain experts and non-experts, and academic and corporate teams. What does it take for your bright ideas and skills to come out on top of thousands of competitors?
Challenge, Competition, Kaggle, Machine Learning, PyTorch, TensorFlow
- XGBoost Explained: DIY XGBoost Library in Less Than 200 Lines of Python - May 3, 2021.
Understand how XGBoost works with a simple 200 lines of code that implement gradient boosting for decision trees.
Algorithms, Machine Learning, Python, XGBoost
- Gradient Boosted Decision Trees – A Conceptual Explanation - Apr 30, 2021.
Gradient boosted decision trees involve implementing several models and aggregating their results. These boosted models have become popular thanks to their performance in machine learning competitions on Kaggle. In this article, we’ll see what gradient boosted decision trees are all about.
CatBoost, Decision Trees, Gradient Boosting, Machine Learning, Python, scikit-learn, XGBoost
- FluDemic – using AI and Machine Learning to get ahead of disease - Apr 30, 2021.
We are amidst a healthcare data explosion. AI/ML will be more vital than ever in the prevention and handling of future pandemics. Here, we walk you through the different facets of modeling infectious diseases, focusing on influenza and COVID-19.
AI, COVID-19, Healthcare, Machine Learning
- Feature Engineering of DateTime Variables for Data Science, Machine Learning - Apr 29, 2021.
Learn how to make more meaningful features from DateTime type variables to be used by Machine Learning Models.
Data Science, Feature Engineering, Machine Learning, Python
- Best Podcasts for Machine Learning - Apr 28, 2021.
Podcasts, especially those featuring interviews, are great for learning about the subfields and tools of AI, as well as the rock stars and superheroes of the AI world. Here, we highlight some of the best podcasts today that are perfect for both those learning about machine learning and seasoned practitioners.
AI, Data Science, Machine Learning, Podcast
- Multiple Time Series Forecasting with PyCaret - Apr 27, 2021.
A step-by-step tutorial to forecast multiple time series with PyCaret.
Forecasting, Machine Learning, PyCaret, Python, Time Series
- Improving model performance through human participation - Apr 23, 2021.
Certain industries, such as medicine and finance, are sensitive to false positives. Using human input in the model inference loop can increase the final precision and recall. Here, we describe how to incorporate human feedback at inference time, so that Machines + Humans = Higher Precision & Recall.
Data Science Platform, Humans, Machine Learning, Model Performance, Precision, Recall
- Data Science Books You Should Start Reading in 2021 - Apr 23, 2021.
Check out this curated list of the best data science books for any level.
Books, Data Science, Data Scientist, Deep Learning, Machine Learning
- The Three Edge Case Culprits: Bias, Variance, and Unpredictability - Apr 22, 2021.
Edge cases occur for three basic reasons: Bias – the ML system is too ‘simple’; Variance – the ML system is too ‘inexperienced’; Unpredictability – the ML system operates in an environment full of surprises. How do we recognize these edge case situations, and what can we do about them?
Bias, iMerit, Machine Learning, Variance
- Top 10 Must-Know Machine Learning Algorithms for Data Scientists – Part 1 - Apr 22, 2021.
New to data science? Interested in the must-know machine learning algorithms in the field? Check out the first part of our list and introductory descriptions of the top 10 algorithms for data scientists to know.
Algorithms, Bagging, Data Science, Data Scientist, Decision Trees, Linear Regression, Machine Learning, SVM, Top 10
- Time Series Forecasting with PyCaret Regression Module - Apr 21, 2021.
PyCaret is an alternative low-code library that can replace hundreds of lines of code with only a few. See how to use PyCaret's Regression Module for Time Series Forecasting.
Machine Learning, PyCaret, Python, Regression, Time Series
- Free From Stanford: Machine Learning with Graphs - Apr 19, 2021.
Check out the freely available Stanford course Machine Learning with Graphs, taught by Jure Leskovec, and see how a world-renowned researcher teaches their topic of expertise. Accessible materials include slides, videos, and more.
Courses, Free, Graphs, Jure Leskovec, Machine Learning, Stanford
- 6 Mistakes To Avoid While Training Your Machine Learning Model - Apr 15, 2021.
Training an AI model involves multi-stage activities that aim to make the best use of the training data, so that the outcomes are satisfactory. Here are the 6 common mistakes you need to understand to make sure your AI model is successful.
Computer Vision, Data Labeling, Machine Learning, Mistakes
- Continuous Training for Machine Learning – a Framework for a Successful Strategy - Apr 14, 2021.
A basic appreciation by anyone who builds machine learning models is that the model is not useful without useful data. This doesn't change after a model is deployed to production. Effectively monitoring and retraining models with updated data is key to maintaining valuable ML solutions, and can be accomplished with effective approaches to production-level continuous training that is guided by the data.
Machine Learning, MLOps, Model Performance, Production, Real-time, Training Data
- 7 Must-Haves in your Data Science CV - Apr 13, 2021.
If you are looking for a new role as a Data Scientist -- either as a first job fresh out of school, a career change, or a shift to another organization -- then check off as many of these critical points as possible to stand out in the crowd and pass the hiring manager's initial CV screen.
Business, Career Advice, Data Scientist, Machine Learning
- How Noisy Labels Impact Machine Learning Models - Apr 6, 2021.
Not all training data labeling errors have the same impact on the performance of the Machine Learning system. The structure of the labeling errors makes a difference. Read iMerit’s latest blog to learn how to minimize the impact of labeling errors.
Data Labeling, Data Preparation, iMerit, Machine Learning
- How to Dockerize Any Machine Learning Application - Apr 6, 2021.
How can you -- an awesome Data Scientist -- also be known as an awesome software engineer? Docker. And these 3 simple steps to use it for your solutions over and over again.
Advice, Applications, Containers, Deployment, Docker, Machine Learning
- How to deploy Machine Learning/Deep Learning models to the web - Apr 5, 2021.
The full value of your deep learning models comes from enabling others to use them. Learn how to deploy your model to the web and access it as a REST API, and begin to share the power of your machine learning development with the world.
Deep Learning, Deployment, Machine Learning, RESTful API
- Awesome Tricks And Best Practices From Kaggle - Apr 5, 2021.
Easily learn what is only learned by hours of search and exploration.
Data Science, Kaggle, Machine Learning, Tips
- Shapash: Making Machine Learning Models Understandable - Apr 2, 2021.
Establishing an expectation for trust around AI technologies may soon become one of the most important skills provided by Data Scientists. Significant research investments are underway in this area, and new tools are being developed, such as Shapash, an open-source Python library that helps Data Scientists make machine learning models more transparent and understandable.
Explainability, Machine Learning, Python, SHAP
- Easy AutoML in Python - Apr 1, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Python
- Overview of MLOps - Mar 26, 2021.
Building a machine learning model is great, but to provide real business value, it must be made useful and maintained to remain useful over time. Machine Learning Operations (MLOps), overviewed here, is a rapidly growing space that encompasses everything required to deploy a machine learning model into production, and is a crucial aspect of delivering this sought-after value.
Data Science, Deployment, Machine Learning, MLOps, Monitoring
- Data Science Curriculum for Professionals - Mar 25, 2021.
If you are looking to expand or transition your current professional career that is buried in spreadsheet analysis into one powered by data science, then you are in for an exciting but complex journey with much to explore and master. To begin your adventure, follow this complete road map to guide you from a gnome in the forest of spreadsheets to an AI wizard known far and wide throughout the kingdom.
Cloud Computing, Data Science Education, Data Visualization, Machine Learning, Python, R, Roadmap, Statistics
- Top YouTube Machine Learning Channels - Mar 23, 2021.
These are the top 15 YouTube channels for machine learning as determined by our stated criteria, along with some additional data on the channels to help you decide if they may have some content useful for you.
Machine Learning, Youtube
- The Best Machine Learning Frameworks & Extensions for Scikit-learn - Mar 22, 2021.
Learn how to use a selection of packages to extend the functionality of Scikit-learn estimators.
Machine Learning, Python, scikit-learn
- Learning from machine learning mistakes - Mar 19, 2021.
Read this article and discover how to find weak spots of a regression model.
Machine Learning, Mistakes, Modeling, Regression
- Data Validation and Data Verification – From Dictionary to Machine Learning - Mar 16, 2021.
In this article, we will understand the difference between data verification and data validation, two terms which are often used interchangeably when we talk about data quality. However, these two terms are distinct.
Data Quality, Machine Learning, Validation
- 10 Amazing Machine Learning Projects of 2020 - Mar 15, 2021.
So much progress in AI and machine learning happened in 2020, especially in the areas of AI-generating creativity and low-to-no-code frameworks. Check out these trending and popular machine learning projects released last year, and let them inspire your work throughout 2021.
Chatbot, Deep Learning, Image Processing, Machine Learning, Project, Trends
- A Beginner’s Guide to the CLIP Model - Mar 11, 2021.
CLIP is a bridge between computer vision and natural language processing. I'm here to break CLIP down for you in an accessible and fun read! In this post, I'll cover what CLIP is, how CLIP works, and why CLIP is cool.
CLIP, Computer Vision, Machine Learning, NLP
- A Machine Learning Model Monitoring Checklist: 7 Things to Track - Mar 11, 2021.
Once you deploy a machine learning model in production, you need to make sure it performs. In this article, we suggest how to monitor your models and which open-source tools to use.
Checklist, Data Science, Deployment, Machine Learning, MLOps, Monitoring
- 4 Machine Learning Concepts I Wish I Knew When I Built My First Model - Mar 9, 2021.
Diving into building your first machine learning model will be an adventure -- one in which you will learn many important lessons the hard way. However, by following these four tips, your first and subsequent models will be put on a path toward excellence.
Feature Selection, Gradio, Hyperparameter, Machine Learning, Metrics, Python
- Speeding up Scikit-Learn Model Training - Mar 5, 2021.
If your scikit-learn models are taking a bit of time to train, then there are several techniques you can use to make the processing more efficient. From optimizing your model configuration to leveraging libraries to speed up training through parallelization, you can build the best scikit-learn model possible in the least amount of time.
Distributed Computing, Machine Learning, Optimization, scikit-learn
- Bayesian Hyperparameter Optimization with tune-sklearn in PyCaret - Mar 5, 2021.
PyCaret, a low-code Python ML library, offers several ways to tune the hyper-parameters of a created model. In this post, I'd like to show how Ray Tune is integrated with PyCaret, and how easy it is to leverage its algorithms and distributed computing to achieve results superior to the default random search method.
Bayesian, Hyperparameter, Machine Learning, Optimization, PyCaret, Python, scikit-learn
- Reducing the High Cost of Training NLP Models With SRU++ - Mar 4, 2021.
The increasing computation time and costs of training natural language processing (NLP) models highlight the importance of inventing computationally efficient models that retain top modeling power with reduced or accelerated computation. A single experiment training a top-performing language model on the 'Billion Word' benchmark would take 384 GPU days and as much as $36,000 using AWS on-demand instances.
Deep Learning, Machine Learning, Neural Networks, NLP
- Getting Started with Distributed Machine Learning with PyTorch and Ray - Mar 3, 2021.
Ray is a popular framework for distributed Python that can be paired with PyTorch to rapidly scale machine learning applications.
Distributed Systems, Machine Learning, Python, PyTorch
- Machine Learning Systems Design: A Free Stanford Course - Feb 26, 2021.
This freely-available course from Stanford should give you a toolkit for designing machine learning systems.
Courses, Deployment, Design, Machine Learning, Maintenance, Stanford
- Feature Store as a Foundation for Machine Learning - Feb 19, 2021.
With so many organizations now taking the leap into building production-level machine learning models, many lessons learned are coming to light about the supporting infrastructure. For a variety of important types of use cases, maintaining a centralized feature store is essential for higher ROI and faster delivery to market. In this review, the current feature store landscape is described, and you can learn how to architect one into your MLOps pipeline.
Data Engineering, Data Infrastructure, Data Lake, Feature Engineering, Feature Store, Machine Learning, Metadata, MLOps, Pipeline
- Approaching (Almost) Any Machine Learning Problem - Feb 18, 2021.
This freely-available book is a fantastic walkthrough of practical approaches to machine learning problems.
Deep Learning, Free ebook, Machine Learning, Python
- Distributed and Scalable Machine Learning [Webinar] - Feb 17, 2021.
Mike McCarty and Gil Forsyth work at the Capital One Center for Machine Learning, where they are building internal PyData libraries that scale with Dask and RAPIDS. For this webinar, Feb 23 @ 2 pm PST, 5pm EST, they’ll join Hugo Bowne-Anderson and Matthew Rocklin to discuss their journey to scale data science and machine learning in Python.
Capital One, Dask, Distributed, Machine Learning, Python, scikit-learn, XGBoost
- Easy, Open-Source AutoML in Python with EvalML - Feb 16, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Python
- How to Speed up Scikit-Learn Model Training - Feb 11, 2021.
Scikit-Learn is an easy-to-use Python library for machine learning. However, sometimes scikit-learn models can take a long time to train. The question becomes, how do you create the best scikit-learn model in the least amount of time?
Distributed Systems, Hyperparameter, Machine Learning, Optimization, Parallelism, Python, scikit-learn, Training
- Machine Learning – it’s all about assumptions - Feb 11, 2021.
Just as with most things in life, assumptions can directly lead to success or failure. Similarly in machine learning, appreciating the assumed logic behind machine learning techniques will guide you toward applying the best tool for the data.
Algorithms, Decision Trees, K-nearest neighbors, Linear Regression, Logistic Regression, Machine Learning, Naive Bayes, SVM, XGBoost
- A Critical Comparison of Machine Learning Platforms in an Evolving Market - Feb 11, 2021.
There’s a clear inclination towards the MLaaS model across industries, given the fact that companies today have an option to select from a wide range of solutions that can cater to diverse business needs. Here is a look at 3 of the top ML platforms for data excellence.
Google Cloud, IBM Watson, Machine Learning, Microsoft Azure, Platform
- My machine learning model does not learn. What should I do? - Feb 10, 2021.
This article presents 7 hints on how to get out of the quicksand.
Algorithms, Business Context, Data Quality, Hyperparameter, Machine Learning, Modeling, Tips
- Microsoft Explores Three Key Mysteries of Ensemble Learning - Feb 8, 2021.
A new paper studies three key puzzling characteristics of deep learning ensembles and some potential explanations.
Ensemble Methods, Machine Learning, Microsoft
- Saving and loading models in TensorFlow — why it is important and how to do it - Feb 3, 2021.
So much time and effort can go into training your machine learning models. But, shut down the notebook or system, and all those trained weights and more vanish with the memory flush. Saving your models to maximize reusability is key for efficient productivity.
Deep Learning, Machine Learning, TensorFlow
- Machine learning adversarial attacks are a ticking time bomb - Jan 29, 2021.
Software developers and cyber security experts have long fought the good fight against vulnerabilities in code to defend against hackers. A new, subtle approach to maliciously targeting machine learning models has been a recent hot topic in research, but its statistical nature makes it difficult to find and patch these so-called adversarial attacks. Such threats in the real world are becoming imminent as the adoption of machine learning spreads, and a systematic defense must be implemented.
Adversarial, Generative Adversarial Network, Machine Learning
- Top 5 Reasons Why Machine Learning Projects Fail - Jan 28, 2021.
The rise in machine learning project implementation is coming, as is the number of failures, due to several implementation and maintenance challenges. The first step in closing this gap lies in understanding the reasons for the failure.
Data Preparation, Data Science, Failure, Implementation, Machine Learning
- Machine learning is going real-time - Jan 28, 2021.
Extracting immediate predictions from machine learning algorithms on the spot based on brand-new data can offer a next level of interaction and potential value to its consumers. The infrastructure and tech stack required to implement such real-time systems is also next level, and many organizations -- especially in the US -- seem to be resisting. But, what even is real-time ML, and how can it deliver a better experience?
China, Machine Learning, MLOps, Real-time, Stream Processing
- Popular Machine Learning Interview Questions, part 2 - Jan 27, 2021.
Get ready for your next job interview requiring domain knowledge in machine learning with answers to these thirteen common questions.
Convolutional Neural Networks, Interview Questions, Linear Regression, Logistic Regression, Machine Learning, Regularization, Transfer Learning, Unbalanced
- Support Vector Machine for Hand Written Alphabet Recognition in R - Jan 27, 2021.
We attempt to break down a problem of hand written alphabet image recognition into a simple process rather than using heavy packages. This is an attempt to create the data and then build a model using Support Vector Machines for Classification.
Classification, Image Recognition, Machine Learning, R, Support Vector Machines
- Want to Be a Data Scientist? Don’t Start With Machine Learning - Jan 26, 2021.
Machine learning may appear like the go-to topic to start learning for the aspiring data scientist. But thinking these techniques are the key aspects of the role is the biggest misconception. So much more goes into becoming a successful data scientist, and machine learning is only one component of broader skills around processing, managing, and understanding the science behind the data.
Career Advice, Data Scientist, Machine Learning, Statistics
- The Ultimate Scikit-Learn Machine Learning Cheatsheet - Jan 25, 2021.
With the power and popularity of scikit-learn for machine learning in Python, this library is a foundation of any practitioner's toolset. Preview its core methods with this review of predictive modelling, clustering, dimensionality reduction, feature importance, and data transformation.
Cheat Sheet, Machine Learning, scikit-learn
- Cloud Computing, Data Science and ML Trends in 2020–2022: The battle of giants - Jan 22, 2021.
Kaggle’s survey of ‘State of Data Science and Machine Learning 2020’ covers a lot of diverse topics. In this post, we are going to look at the popularity of cloud computing platforms and products among the data science and ML professionals who participated in the survey.
AWS, Cloud Computing, Data Science, Google Cloud, Kaggle, Machine Learning, Microsoft Azure, Trends
- Going Beyond the Repo: GitHub for Career Growth in AI & Machine Learning - Jan 21, 2021.
Many online tools and platforms exist to help you establish a clear and persuasive online profile for potential employers to review. Have you considered how your go-to online code repository could also help you land your next job?
AI, Career Advice, GitHub, Machine Learning
- Popular Machine Learning Interview Questions - Jan 20, 2021.
Get ready for your next job interview requiring domain knowledge in machine learning with answers to these eleven common questions.
Bias, Confusion Matrix, Interview Questions, Machine Learning, Overfitting, Variance
- K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines - Jan 15, 2021.
K-means clustering is a powerful algorithm for similarity searches, and Facebook AI Research's faiss library is turning out to be a speed champion. With only a handful of lines of code shared in this demonstration, faiss outperforms the implementation in scikit-learn in speed and accuracy.
Algorithms, K-means, Machine Learning, scikit-learn
- 5 Tools for Effortless Data Science - Jan 11, 2021.
The sixth tool is coffee.
Data Science, Data Science Tools, Keras, Machine Learning, MLflow, PyCaret, Python
- CatalyzeX: A must-have browser extension for machine learning engineers and researchers - Jan 6, 2021.
CatalyzeX is a free browser extension that finds code implementations for ML/AI papers anywhere on the internet (Google, Arxiv, Twitter, Scholar, and other sites).
Implementation, Machine Learning, Programming, Research
- MLOps: Model Monitoring 101 - Jan 6, 2021.
Model monitoring using a model metric stack is essential to put a feedback loop from a deployed ML model back to the model building stage so that ML models can constantly improve themselves under different scenarios.
AI, Data Science, DevOps, Machine Learning, MLOps, Modeling
- All Machine Learning Algorithms You Should Know in 2021 - Jan 4, 2021.
Many machine learning algorithms exist that range from simple to complex in their approach, and together provide a powerful library of tools for analyzing and predicting patterns from data. If you are learning for the first time or reviewing techniques, then these intuitive explanations of the most popular machine learning models will help you kick off the new year with confidence.
Algorithms, Decision Trees, Explained, Gradient Boosting, K-nearest neighbors, Machine Learning, Naive Bayes, Regression, SVM
- 15 Free Data Science, Machine Learning & Statistics eBooks for 2021 - Dec 31, 2020.
We present a curated list of 15 free eBooks compiled in a single location to close out the year.
Automated Machine Learning, Data Science, Deep Learning, Free ebook, Machine Learning, NLP, Python, R, Statistics
- How to easily check if your Machine Learning model is fair? - Dec 24, 2020.
Machine learning models deployed today, like many more to come, impact people and society directly. With that power and influence resting in the hands of Data Scientists and machine learning engineers, taking the time to evaluate and understand if model results are fair will become the linchpin for the future success of AI/ML solutions. These are critical considerations, and using a recently developed fairness module in the dalex Python package is a unified and accessible way to ensure your models remain fair.
Bias, Dalex, Ethics, Machine Learning
- Can you trust AutoML? - Dec 23, 2020.
Automated Machine Learning, or AutoML, tries hundreds or even thousands of different ML pipelines to deliver models that often beat the experts and win competitions. But, is this the ultimate goal? Can a model developed with this approach be trusted without guarantees of predictive performance? The issue of overfitting must be closely considered because these methods can lead to overestimation -- and the Winner's Curse.
Accuracy, AutoML, Cross-validation, Machine Learning, Model Performance, Overfitting
- Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance - Dec 21, 2020.
A practical deep dive on production monitoring architectures for machine learning at scale using real-time metrics, outlier detectors, drift detectors, metrics servers and explainers.
AI, Deployment, Explainable AI, Machine Learning, Modeling, Outliers, Production, Python
- MLOps Is Changing How Machine Learning Models Are Developed - Dec 21, 2020.
Delivering machine learning solutions is so much more than the model. Three key concepts covering version control, testing, and pipelines are the foundation for machine learning operations (MLOps) that help data science teams ship models quicker and with more confidence.
Deployment, Machine Learning, MLOps
- ebook: Fundamentals for Efficient ML Monitoring - Dec 17, 2020.
We've gathered best practices for data science and engineering teams to create an efficient framework to monitor ML models. This ebook provides a framework for anyone who has an interest in building, testing, and implementing a robust monitoring strategy in their organization or elsewhere.
ebook, Machine Learning, Monitoring
- How to use Machine Learning for Anomaly Detection and Conditional Monitoring - Dec 16, 2020.
This article explains the goals of anomaly detection and outlines the approaches used to solve specific use cases for anomaly detection and condition monitoring.
Anomaly Detection, Machine Learning, Python, scikit-learn, Unsupervised Learning
- Data Science and Machine Learning: The Free eBook - Dec 15, 2020.
Check out the newest addition to our free eBook collection, Data Science and Machine Learning: Mathematical and Statistical Methods, and start building your statistical learning foundation today.
Data Science, Free ebook, Machine Learning, Python
- State of Data Science and Machine Learning 2020: 3 Key Findings - Dec 15, 2020.
Kaggle recently released its State of Data Science and Machine Learning report for 2020, based on compiled results of its annual survey. Read about 3 key findings in the report here.
Data Science, Kaggle, Machine Learning, Survey
- Implementing the AdaBoost Algorithm From Scratch - Dec 10, 2020.
The AdaBoost technique builds on decision trees with a depth of one, so an AdaBoost model is a forest of stumps rather than full trees. AdaBoost works by putting more weight on difficult-to-classify instances and less on those already handled well. The AdaBoost algorithm was developed to solve both classification and regression problems. Learn to build the algorithm from scratch here.
Adaboost, Algorithms, Ensemble Methods, Machine Learning, Python
- A Journey from Software to Machine Learning Engineer - Dec 10, 2020.
In this blog post, the author explains his journey from Software Engineer to Machine Learning Engineer. The focus of the blog post is on the areas that the author wishes he had focused on during his learning journey, and what you should look for outside of books and courses when pursuing your Machine Learning career.
Career Advice, Machine Learning, Machine Learning Engineer, Software Engineer
- Main 2020 Developments and Key 2021 Trends in AI, Data Science, Machine Learning Technology - Dec 9, 2020.
Our panel of leading experts reviews the main developments of 2020 and examines the key trends in AI, Data Science, Machine Learning, and Deep Learning Technology.
2021 Predictions, AI, AutoML, Bill Schmarzo, Carla Gentry, COVID-19, Doug Laney, GPT-3, Kirk D. Borne, Machine Learning, MLOps, Predictions, Ronald van Loon, Tom Davenport, Trends
- AI registers: finally, a tool to increase transparency in AI/ML - Dec 9, 2020.
Transparency, explainability, and trust are pressing topics in AI/ML today. While much has been written about why they are important and what you need to do, no tools have existed until now.
AI, Bias, Ethics, Explainability, Helsinki, Machine Learning, Trust
- Change the Background of Any Video with 5 Lines of Code - Dec 7, 2020.
Learn to blur, color, grayscale and create a virtual background for a video with PixelLib.
Computer Vision, Image Processing, Machine Learning, Python, Segmentation, Video
- Pruning Machine Learning Models in TensorFlow - Dec 4, 2020.
Read this overview to learn how to make your models smaller via pruning.
Machine Learning, Modeling, Python, TensorFlow
- AI, Analytics, Machine Learning, Data Science, Deep Learning Research Main Developments in 2020 and Key Trends for 2021 - Dec 3, 2020.
2020 is finally coming to a close. While likely not to register as anyone's favorite year, 2020 did have some noteworthy advancements in our field, and 2021 promises some important key trends to look forward to. As has become a year-end tradition, our collection of experts have once again contributed their thoughts. Read on to find out more.
2021 Predictions, AI, Ajit Jaokar, Analytics, Brandon Rohrer, Daniel Tunkelang, Data Science, Deep Learning, Machine Learning, Pedro Domingos, Predictions, Research, Rosaria Silipo
- How to Know if a Neural Network is Right for Your Machine Learning Initiative - Nov 26, 2020.
It is important to remember that there must be a business reason for even considering neural nets and it should not be because the C-Suite is feeling a bad case of FOMO.
Algorithms, Machine Learning, Neural Networks
- Better data apps with Streamlit’s new layout options - Nov 26, 2020.
Introducing new layout primitives - including columns, containers and expanders!
App, Data Science, Machine Learning, Streamlit
- Essential Math for Data Science: Integrals And Area Under The Curve - Nov 25, 2020.
In this article, you’ll learn about integrals and the area under the curve using the practical data science example of the area under the ROC curve used to compare the performances of two machine learning models.
Machine Learning, Mathematics, Metrics, numpy, Python, Unbalanced
- How to Incorporate Tabular Data with HuggingFace Transformers - Nov 25, 2020.
In real-world scenarios, we often encounter data that includes both text and tabular features. By leveraging the latest advances in transformers, you can handle both kinds of data effectively and increase the performance of your models.
Data Preparation, Deep Learning, Machine Learning, NLP, Python, Transformer
- Fraud through the eyes of a machine - Nov 24, 2020.
Data structured as a network of relationships can be modeled as a graph, which can then help extract insights into the data through machine learning and rule-based approaches. While these graph representations provide a natural interface to transactional data for humans to appreciate, caution and context must be applied when leveraging machine-based interpretations of these connections.
Fraud, Fraud Detection, Graph Analytics, Machine Learning
- Know-How to Learn Machine Learning Algorithms Effectively - Nov 23, 2020.
The takeaway from the story is that machine learning goes way beyond simple fit and predict methods. The author shares their approach to actually learning these algorithms beyond the surface.
Algorithms, Complexity, Interpretability, Machine Learning
- How Machine Learning Works for Social Good - Nov 21, 2020.
We often discuss applying data science and machine learning techniques in terms of how they help your organization or business goals. But these algorithms aren't limited to only increasing the bottom line. Developing new applications that leverage the predictive power of AI to benefit society and those communities in need is an equally valuable endeavor for Data Scientists that will further expand the positive impact of machine learning on the world.
Advice, Chicago, Machine Learning, Social Good
- Compute Goes Brrr: Revisiting Sutton’s Bitter Lesson for AI - Nov 19, 2020.
"It's just about having more compute." Wait, is that really all there is to AI? As Richard Sutton's 'bitter lesson' sinks in for more AI researchers, a debate has stirred that considers a potentially more subtle relationship between advancements in AI based on ever-more-clever algorithms and massively scaled computational power.
AI, AlphaGo, Machine Learning, OpenAI, Richard Sutton, Scalability, Trends
- 5 Most Useful Machine Learning Tools every lazy full-stack data scientist should use - Nov 18, 2020.
If you consider yourself a Data Scientist who can take any project from data curation to solution deployment, then you know there are many tools available today to help you get the job done. The trouble is that there are too many choices. Here is a review of five sets of tools that should turn you into the most efficient full-stack data scientist possible.
Data Science Tools, Data Scientist, GitHub, Heroku, Machine Learning, Postgres, PyCharm, PyTorch, scikit-learn, Streamlit
- 5 Things You Are Doing Wrong in PyCaret - Nov 16, 2020.
PyCaret is an alternative low-code library that can replace hundreds of lines of code with only a few words. This makes experiments much faster and more efficient. Find out 5 ways to improve your usage of the library.
Machine Learning, PyCaret, Python, Tips
- Top Python Libraries for Deep Learning, Natural Language Processing & Computer Vision - Nov 16, 2020.
This article compiles the 30 top Python libraries for deep learning, natural language processing & computer vision, as best determined by KDnuggets staff.
Computer Vision, Data Science, Deep Learning, Machine Learning, Neural Networks, NLP, Python
- tensorflow + dalex = :) , or how to explain a TensorFlow model - Nov 13, 2020.
Having a machine learning model that generates interesting predictions is one thing. Understanding why it makes these predictions is another. For a tensorflow predictive model, it can be straightforward and convenient to develop an explainable AI by leveraging the dalex Python package.
Dalex, Explainability, Explainable AI, Machine Learning, Python, TensorFlow
- Predicting Heart Disease Using Machine Learning? Don’t! - Nov 10, 2020.
I believe “Predicting Heart Disease using Machine Learning” is a classic example of how not to apply machine learning to a problem, especially where a lot of domain experience is required.
Advice, Failure, Healthcare, Machine Learning, Medical, Prediction
- Moving from Data Science to Machine Learning Engineering - Nov 10, 2020.
The world of machine learning — and software — is changing. Read this article to find out how, and what you can do to stay ahead of it.
Career Advice, Data Engineering, Data Science, Machine Learning, Machine Learning Engineer
- Doing the impossible? Machine learning with less than one example - Nov 9, 2020.
Machine learning algorithms are notorious for needing data, a lot of data -- the more data the better. But much research has gone into developing new methods that need fewer examples to train a model, such as "few-shot" or "one-shot" learning, which require only a handful of examples, or as few as one, for effective learning. Now, this lower boundary on training examples is being taken to the next extreme.
Algorithms, K-nearest neighbors, Machine Learning, Research
- Change the Background of Any Image with 5 Lines of Code - Nov 9, 2020.
Blur, color, grayscale and change the background of any image with a picture using PixelLib.
Computer Vision, Image Processing, Machine Learning, Python, Segmentation
- Top 5 Free Machine Learning and Deep Learning eBooks Everyone should read - Nov 5, 2020.
There is always so much new to learn in machine learning, and keeping well grounded in the fundamentals will help you stay up-to-date with the latest advancements while acing your career in Data Science.
Deep Learning, Free ebook, Machine Learning
- Interpretability, Explainability, and Machine Learning – What Data Scientists Need to Know - Nov 4, 2020.
The terms “interpretability,” “explainability” and “black box” are tossed about a lot in the context of machine learning, but what do they really mean, and why do they matter?
Explainability, Explainable AI, Interpretability, Machine Learning
- Dealing with Imbalanced Data in Machine Learning - Oct 29, 2020.
This article presents tools & techniques for handling data when it's imbalanced.
Balancing Classes, Machine Learning, Python
- An Introduction to AI, updated - Oct 28, 2020.
We provide an introduction to key concepts and methods in AI, covering Machine Learning and Deep Learning, with an updated extensive list that includes Narrow AI, Super Intelligence, and Classic Artificial Intelligence, as well as recent ideas of NeuroSymbolic AI, Neuroevolution, and Federated Learning.
AGI, AI, Beginners, Deep Learning, Machine Learning
- DeepMind Relies on this Old Statistical Method to Build Fair Machine Learning Models - Oct 23, 2020.
Causal Bayesian Networks are used to model the influence of fairness attributes in a dataset.
Bayesian Networks, Bias, DeepMind, Machine Learning
- Behavior Analysis with Machine Learning and R: The free eBook - Oct 22, 2020.
Check out this new free ebook to learn how to leverage the power of machine learning to analyze behavioral patterns from sensor data and electronic records using R.
Behavioral Analytics, Free ebook, Machine Learning, R
- 5 Must-Read Data Science Papers (and How to Use Them) - Oct 20, 2020.
Keeping ahead of the latest developments in a field is key to advancing your skills and your career. Five foundational ideas from recent data science papers are highlighted here with tips on how to leverage these advancements in your work, and keep you on top of the machine learning game.
Data Science, Machine Learning, P-value, Research, Software, Technical Debt, Transformer
- Feature Ranking with Recursive Feature Elimination in Scikit-Learn - Oct 19, 2020.
This article covers using scikit-learn to obtain the optimal number of features for your machine learning project.
Feature Selection, Machine Learning, Python, scikit-learn
- How to Explain Key Machine Learning Algorithms at an Interview - Oct 19, 2020.
While preparing for interviews in Data Science, it is essential to clearly understand a range of machine learning models -- with a concise explanation for each at the ready. Here, we summarize various machine learning models by highlighting the main points to help you communicate complex models.
Algorithms, Decision Trees, Interview Questions, K-nearest neighbors, Machine Learning, Naive Bayes, Regression, SVM
- Fast Gradient Boosting with CatBoost - Oct 16, 2020.
In this piece, we’ll take a closer look at a gradient boosting library called CatBoost.
CatBoost, Gradient Boosting, Machine Learning, Python
- Machine Learning’s Greatest Omission: Business Leadership - Oct 15, 2020.
Eric Siegel's business-oriented, vendor-neutral machine learning course is designed to fulfill vital unmet learner needs, delivering material critical for both techies and business leaders.
Business, Data Leadership, Eric Siegel, Machine Learning
- Uber Open Sources the Third Release of Ludwig, its Code-Free Machine Learning Platform - Oct 13, 2020.
The new release makes Ludwig one of the most complete open source AutoML stacks in the market.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Uber
- 5 Best Practices for Putting Machine Learning Models Into Production - Oct 12, 2020.
Our focus for this piece is to establish the best practices that make an ML project successful.
Best Practices, Machine Learning, Production
- Exploring The Brute Force K-Nearest Neighbors Algorithm - Oct 12, 2020.
This article discusses a simple approach to increasing the accuracy of k-nearest neighbors models in a particular subset of cases.
Algorithms, K-nearest neighbors, Machine Learning, Python
- Annotated Machine Learning Research Papers - Oct 9, 2020.
Check out this collection of annotated machine learning research papers, and no longer fear reading them.
Machine Learning, Research
- How LinkedIn Uses Machine Learning in its Recruiter Recommendation Systems - Oct 8, 2020.
LinkedIn uses some very innovative machine learning techniques to optimize candidate recommendations.
LinkedIn, Machine Learning, Recommendation Engine, Recommender Systems, Recruitment
- Free Introductory Machine Learning Course From Amazon - Oct 7, 2020.
Amazon's Machine Learning University offers an introductory course titled Accelerated Machine Learning, which is a good starting place for those looking for a foundation in generalized practical ML.
Amazon, Courses, Machine Learning, MOOC
- 5 Challenges to Scaling Machine Learning Models - Oct 7, 2020.
ML models are hard to translate into active business gains. In order to understand the common pitfalls in productionizing ML models, let’s dive into the top 5 challenges that organizations face.
Deployment, Machine Learning, Scalability
- 10 Best Machine Learning Courses in 2020 - Oct 6, 2020.
If you are ready to take your career in machine learning to the next level, then these top 10 Machine Learning Courses covering both practical and theoretical work will help you excel.
Courses, DataCamp, Deep Learning, fast.ai, Machine Learning, Online Education, Python, Stanford
- Key Machine Learning Technique: Nested Cross-Validation, Why and How, with Python code - Oct 5, 2020.
Selecting the best-performing machine learning model with optimal hyperparameters can sometimes still result in poorer performance once in production. This phenomenon might be the result of tuning the model and evaluating its performance on the same sets of train and test data. So, validating your model more rigorously can be key to a successful outcome.
Cross-validation, Machine Learning, Python
- Machine Learning Model Deployment - Sep 30, 2020.
Read this article on machine learning model deployment using serverless deployment. Serverless compute abstracts away provisioning, managing servers, and configuring software, simplifying model deployment.
Cloud, Deployment, Machine Learning, Modeling, Workflow
- Missing Value Imputation – A Review - Sep 29, 2020.
Detecting and handling missing values in the correct way is important, as they can impact the results of the analysis, and there are algorithms that can’t handle them. So what is the correct way?
Data Preprocessing, Knime, Machine Learning, Missing Values
- International alternatives to Kaggle for Data Science / Machine Learning competitions - Sep 29, 2020.
While Kaggle might be the most well-known, go-to data science competition platform to test your skills at model building and performance, additional regional platforms are available around the world that offer even more opportunities to learn... and win.
Competition, Data Science, Kaggle, Machine Learning
- LinkedIn’s Pro-ML Architecture Summarizes Best Practices for Building Machine Learning at Scale - Sep 23, 2020.
The reference architecture is powering mission critical machine learning workflows within LinkedIn.
Best Practices, LinkedIn, Machine Learning, Scalability
- How I Consistently Improve My Machine Learning Models From 80% to Over 90% Accuracy - Sep 23, 2020.
Data science work typically requires a big lift near the end to increase the accuracy of any model developed. These five recommendations will help improve your machine learning models and help your projects reach their target goals.
Accuracy, Ensemble Methods, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Missing Values, Tips
- KDnuggets™ News 20:n36, Sep 23: New Poll: What Python IDE / Editor you used the most in 2020?; Automating Every Aspect of Your Python Project - Sep 23, 2020.
New Poll: What Python IDE / Editor you used the most in 2020?; Automating Every Aspect of Your Python Project; Autograd: The Best Machine Learning Library You're Not Using?; Implementing a Deep Learning Library from Scratch in Python; Online Certificates/Courses in AI, Data Science, Machine Learning; Can Neural Networks Show Imagination?
Automation, Certificate, Courses, Data Science, Deep Learning, DeepMind, Machine Learning, Neural Networks, Python