- Explainable Forecasting and Nowcasting with State-of-the-art Deep Neural Networks and Dynamic Factor Model - Dec 27, 2021.
Review this detailed tutorial with code and revisit a decades-old problem using a democratized and interpretable AI framework: how precisely can we anticipate the future and understand its causal factors?
Data Exploration, Explainable AI, Feature Engineering, Forecasting
- Machine Learning Model Development and Model Operations: Principles and Practices - Oct 27, 2021.
ML model management and the delivery of a high-performing model are as important as the initial build of the model from the right dataset. The concepts of model retraining, model versioning, model deployment, and model monitoring are the basis for machine learning operations (MLOps), which helps data science teams deliver high-performing models.
Algorithms, Deployment, Feature Engineering, Machine Learning, MLOps
- 11 Most Practical Data Science Skills for 2022 - Oct 19, 2021.
While the field of data science continues to evolve with exciting new progress in analytical approaches and machine learning, there remains a core set of skills that are foundational for all general practitioners and specialists, especially those who want to be employable with full-stack capabilities.
Career Advice, Data Science Skills, Explainable AI, Feature Engineering, GitHub, NLP, Regression, SQL
- Date Processing and Feature Engineering in Python - Jul 15, 2021.
Have a look at some code to streamline the parsing and processing of dates in Python, including the engineering of some useful and common features.
Beginners, Data Preprocessing, Data Processing, Feature Engineering, Python, Time Series
- Feature Selection – All You Ever Wanted To Know - Jun 10, 2021.
Although your data set may contain a lot of information about many different features, selecting only the "best" of these to be considered by a machine learning model can mean the difference between a model that performs well--with better performance, higher accuracy, and more computational efficiency--and one that falls flat. The process of feature selection guides you toward working with only the data that may be the most meaningful, and to accomplish this, a variety of feature selection types, methodologies, and techniques exist for you to explore.
Feature Engineering, Feature Selection, Machine Learning
- Feature Engineering of DateTime Variables for Data Science, Machine Learning - Apr 29, 2021.
Learn how to make more meaningful features from DateTime type variables to be used by Machine Learning Models.
Data Science, Feature Engineering, Machine Learning, Python
- Data Science 101: Normalization, Standardization, and Regularization - Apr 20, 2021.
Normalization, standardization, and regularization all sound similar. However, each plays a unique role in your data preparation and model building process, so you must know when and how to use these important procedures.
Data Preprocessing, Feature Engineering, Normalization, Regression, Regularization, Statistics
- Feature Store as a Foundation for Machine Learning - Feb 19, 2021.
With so many organizations now taking the leap into building production-level machine learning models, many lessons learned are coming to light about the supporting infrastructure. For a variety of important types of use cases, maintaining a centralized feature store is essential for higher ROI and faster delivery to market. In this review, the current feature store landscape is described, and you can learn how to architect one into your MLOps pipeline.
Data Engineering, Data Infrastructure, Data Lake, Feature Engineering, Feature Store, Machine Learning, Metadata, MLOps, Pipeline
- How I Consistently Improve My Machine Learning Models From 80% to Over 90% Accuracy - Sep 23, 2020.
Data science work typically requires a big lift near the end to increase the accuracy of any model developed. These five recommendations will help improve your machine learning models and help your projects reach their target goals.
Accuracy, Ensemble Methods, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Missing Values, Tips
- Feature Engineering for Numerical Data - Sep 11, 2020.
Data feeds machine learning models, and the more the better, right? Well, sometimes numerical data isn't quite right for ingestion, so a variety of methods, detailed in this article, are available to transform raw numbers into something a bit more palatable.
Data Preparation, Data Science, Feature Engineering
- Feature Engineering in SQL and Python: A Hybrid Approach - Jul 2, 2020.
Set up your workstation, reduce workplace clutter, maintain a clean namespace, and effortlessly keep your dataset up-to-date.
Feature Engineering, Python, SQL
- The Architecture Used at LinkedIn to Improve Feature Management in Machine Learning Models - May 11, 2020.
The new typed feature schema streamlined the reusability of features across thousands of machine learning models.
Feature Engineering, Feature Selection, LinkedIn, Machine Learning
- A Key Missing Part of the Machine Learning Stack - Apr 20, 2020.
With many organizations having machine learning models running in production, some are discovering that inefficiencies exist in the first step of the process: feature definition and extraction. Robust feature management is now being recognized as a key missing part of the ML stack, and improving it by applying standard software development practices is gaining attention.
Feature Engineering, Feature Extraction, Feature Store, Machine Learning
- Diffusion Map for Manifold Learning, Theory and Implementation - Mar 25, 2020.
This article aims to introduce one of the manifold learning techniques called Diffusion Map. This technique enables us to understand the underlying geometric structure of high dimensional data as well as to reduce the dimensions, if required, by neatly capturing the non-linear relationships between the original dimensions.
Data Preparation, Data Science, Dimensionality Reduction, Feature Engineering, Machine Learning
- 4 Tips for Advanced Feature Engineering and Preprocessing - Aug 29, 2019.
Techniques for creating new features, detecting outliers, handling imbalanced data, and imputing missing values.
Data Preprocessing, Feature Engineering, Python, Tips
- Dealing with categorical features in machine learning - Jul 16, 2019.
Many machine learning algorithms require numerical input, so categorical features must be transformed into numerical features before we can use any of these algorithms.
Data Cleaning, Data Preprocessing, Feature Engineering, Machine Learning, Python
- The Hitchhiker’s Guide to Feature Extraction - Jun 3, 2019.
Check out this collection of tricks and code for Kaggle and everyday work.
Feature Engineering, Feature Extraction, Feature Selection, Kaggle, Python
- 7 Steps to Mastering Intermediate Machine Learning with Python — 2019 Edition - Jun 3, 2019.
This is the second part of this new learning path series for mastering machine learning with Python. Check out these 7 steps to help master intermediate machine learning with Python!
7 Steps, Classification, Cross-validation, Dimensionality Reduction, Feature Engineering, Feature Selection, Image Classification, K-nearest neighbors, Machine Learning, Modeling, Naive Bayes, numpy, Pandas, PCA, Python, scikit-learn, Transfer Learning
- Feature Reduction using Genetic Algorithm with Python - Mar 25, 2019.
This tutorial discusses how to use the genetic algorithm (GA) for reducing the feature vector extracted from the Fruits360 dataset in Python mainly using NumPy and Sklearn.
Deep Learning, Feature Engineering, Genetic Algorithm, Neural Networks, numpy, Python, scikit-learn
- 3 Reasons Why AutoML Won’t Replace Data Scientists Yet - Mar 6, 2019.
We dispel the myth that AutoML is replacing data scientists' jobs by highlighting three factors in data science development that AutoML can’t solve.
Automated Machine Learning, Automation, AutoML, Data Scientist, Feature Engineering, Reinforcement Learning
- A Quick Guide to Feature Engineering - Feb 11, 2019.
Feature engineering plays a key role in machine learning, data mining, and data analytics. This article provides a general definition for feature engineering, together with an overview of the major issues, approaches, and challenges of the field.
Feature Engineering, Feature Extraction, Feature Selection
- Good Feature Building Techniques and Tricks for Kaggle - Dec 31, 2018.
A selection of top tips to obtain great results on Kaggle leaderboards, including useful code examples showing how best to use Latitude and Longitude features.
Feature Engineering, Kaggle, Tips
- Feature Engineering for Machine Learning: 10 Examples - Dec 21, 2018.
A brief introduction to feature engineering, covering coordinate transformation, continuous data, categorical features, missing values, normalization, and more.
Data, Data Preparation, Data Processing, Feature Engineering, Normalization
- Data Mining Book – Chapter Download - Dec 4, 2018.
Download this immediately useful book chapter, and learn how to create derived variables, which allow statistical and data science modeling to incorporate human insights.
Data Mining, Data Visualization, Derived Variables, Feature Engineering, JMP, Michael Berry
- Data Mining Book – Chapter Download - Nov 2, 2018.
Download this immediately useful book chapter, and learn how to create derived variables, which allow statistical and data science modeling to incorporate human insights.
Data Mining, Data Visualization, Derived Variables, Feature Engineering, JMP, Michael Berry
- Implementing Automated Machine Learning Systems with Open Source Tools - Oct 25, 2018.
What if you want to implement an automated machine learning pipeline of your very own, or automate particular aspects of a machine learning pipeline? Rest assured that there is no need to reinvent any wheels.
Automated Machine Learning, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Open Source
- Why Automated Feature Engineering Will Change the Way You Do Machine Learning - Aug 20, 2018.
Automated feature engineering will save you time, build better predictive models, create meaningful features, and prevent data leakage.
Automated Machine Learning, Feature Engineering, Machine Learning, Python
- Implementing Deep Learning Methods and Feature Engineering for Text Data: FastText - May 1, 2018.
Overall, FastText is a framework for learning word representations and also performing robust, fast and accurate text classification. The framework is open-sourced by Facebook on GitHub.
Facebook, Feature Engineering, NLP, Python
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The GloVe Model - Apr 25, 2018.
GloVe, which stands for Global Vectors, is an unsupervised learning model that can be used to obtain dense word vectors, similar to Word2Vec.
Deep Learning, Feature Engineering, NLP, Python, Text Mining
- Robust Word2Vec Models with Gensim & Applying Word2Vec Features for Machine Learning Tasks - Apr 17, 2018.
The gensim framework, created by Radim Řehůřek, consists of a robust, efficient and scalable implementation of the Word2Vec model.
Feature Engineering, NLP, Python, Word Embeddings, word2vec
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The Skip-gram Model - Apr 10, 2018.
Just like we discussed in the CBOW model, we need to model this Skip-gram architecture now as a deep learning classification model such that we take in the target word as our input and try to predict the context words.
Deep Learning, Feature Engineering, NLP, Python, Text Mining, Word Embeddings
- Understanding Feature Engineering: Deep Learning Methods for Text Data - Mar 28, 2018.
Newer, advanced strategies for taming unstructured, textual data: In this article, we will be looking at more advanced feature engineering strategies which often leverage deep learning models.
Deep Learning, Feature Engineering, NLP, Python, Text Mining
- Quick Feature Engineering with Dates Using fast.ai - Mar 16, 2018.
The fast.ai library is a collection of supplementary wrappers for a host of popular machine learning libraries, designed to remove the necessity of writing your own functions to take care of some repetitive tasks in a machine learning workflow.
fast.ai, Feature Engineering, Machine Learning, Pandas, Python, Time Series
- Applied Data Science: Solving a Predictive Maintenance Business Problem Part 2 - Feb 20, 2018.
In this post, we further discuss how exploratory analysis can be used to gain insights for feature engineering.
Data Analysis, Data Exploration, Data Science, Feature Engineering
- Deep Feature Synthesis: How Automated Feature Engineering Works - Feb 7, 2018.
Automating feature engineering optimizes the process of building and deploying accurate machine learning models by handling necessary but tedious tasks so data scientists can focus more on other important steps.
Automated Machine Learning, Automation, Data Science, Feature Engineering, Machine Learning
- Automated Feature Engineering for Time Series Data - Nov 20, 2017.
We introduce a general framework for developing time series models, generating features, and preprocessing the data, and we explore the potential to automate this process in order to apply advanced machine learning algorithms to almost any time series problem.
Automated Machine Learning, Data Preparation, Feature Engineering, Feature Selection, Time Series
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part 3 - Jul 4, 2017.
In this last post of the series, I describe how I used more powerful machine learning algorithms for the click prediction problem, as well as the ensembling techniques that took me up to the 19th position on the leaderboard (top 2%).
Feature Engineering, Jupyter, Kaggle, Machine Learning, Python
- Data Mining Techniques, Free Chapter: Derived Variables – Making the Data Mean More - Jun 12, 2017.
Download this chapter by Gordon Linoff and Michael Berry, and learn how to create derived variables, which allow the statistical modeling process to incorporate human insights.
Data Mining, Derived Variables, Feature Engineering, JMP, Michael Berry
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part I - Jun 8, 2017.
As I scrolled through the leaderboard page, I found my name in the 19th position, which was in the top 2% of nearly 1,000 competitors. Not bad for the first Kaggle competition I had decided to put real effort into!
Apache Spark, Feature Engineering, Jupyter, Kaggle, Machine Learning, Python
- 17 More Must-Know Data Science Interview Questions and Answers, Part 2 - Feb 22, 2017.
The second part of 17 new must-know Data Science Interview questions and answers covers overfitting, ensemble methods, feature selection, ground truth in unsupervised learning, the curse of dimensionality, and parallel algorithms.
Algorithms, Data Science, Ensemble Methods, Feature Engineering, Feature Selection, High-dimensional, Interview Questions, Overfitting, Unsupervised Learning
- Data Mining Tip: How to Use High-cardinality Attributes in a Predictive Model - Aug 29, 2016.
High-cardinality nominal attributes can pose an issue for inclusion in predictive models. There exist a few ways to accomplish this, however, which are put forward here.
Feature Engineering, Feature Selection, Predictive Models
- In Deep Learning, Architecture Engineering is the New Feature Engineering - Jul 19, 2016.
A discussion of architecture engineering in deep neural networks, and its relationship with feature engineering.
Architecture, Deep Learning, Feature Engineering, Neural Networks
- Opening Up Deep Learning For Everyone - Feb 19, 2016.
Opening deep learning up to everyone is a noble goal. But is it achievable? Should non-programmers and even non-technical people be able to implement deep neural models?
Caffe, Deep Learning, Feature Engineering, Open Source, TensorFlow
- Useful Data Science: Feature Hashing - Jan 28, 2016.
Feature engineering plays a major role in solving data science problems. Here, we will learn feature hashing, or the hashing trick, a method for turning arbitrary features into a sparse binary vector.
Feature Engineering, Hashing, Python, Will McGinnis
- Anthony Goldbloom gives you the Secret to winning Kaggle competitions - Jan 20, 2016.
Kaggle CEO shares insights on best approaches to win Kaggle competitions, along with a brief explanation of how Kaggle competitions work.
Anthony Goldbloom, Competition, Deep Learning, Feature Engineering, Kaggle, Neural Networks, Success
- The Art of Data Science: The Skills You Need and How to Get Them - Dec 28, 2015.
Learn how to turn the deluge of data into gold through algorithms, feature engineering, reasoning out business value, and ultimately building a data-driven organization.
Algorithms, Data Science Skills, Feature Engineering, MapR
- Lessons from 2 Million Machine Learning Models on Kaggle - Dec 24, 2015.
Lessons from Kaggle competitions, including why XGBoost is the top method for structured problems, why neural networks and deep learning dominate unstructured problems (visuals, text, sound), and the 2 types of problems for which Kaggle is suitable.
Anthony Goldbloom, Boosting, Competition, Feature Engineering, Kaggle
- The Data Science Machine, or ‘How To Engineer Feature Engineering’ - Oct 22, 2015.
MIT researchers have developed what they refer to as the Data Science Machine, which combines feature engineering and an end-to-end data science pipeline into a system that beats nearly 70% of humans in competitions. Is this game-changing?
Automated, Data Science, Feature Engineering, Feature Extraction, MIT