- DataScience.com Releases Python Package for Interpreting the Decision-Making Processes of Predictive Models - May 24, 2017.
DataScience.com's new Python library, Skater, uses a combination of model interpretation algorithms to identify how models leverage data to make predictions.
Datascience.com, GitHub, Interpretability, Python
- Introduction to Anomaly Detection - Apr 3, 2017.
This overview will cover several methods of detecting anomalies, as well as how to build a detector in Python using a simple moving average (SMA) or a low-pass filter.
Anomaly Detection, Datascience.com, Python, Time Series
- The Challenges of Building a Predictive Churn Model - Mar 8, 2017.
Unlike other data science problems, there is no one method for predicting which customers are likely to churn in the next month. Here we review the most popular approaches.
Churn, Customer Analytics, Datascience.com, Survival Analysis
- What is Customer Churn Modeling? Why is it valuable? - Mar 1, 2017.
Getting new customers is much more expensive than retaining existing ones, so reducing churn is a top priority for many firms. Understanding why customers churn and estimating the risks are powerful components of a data-driven retention strategy.
Churn, Customer Analytics, Datascience.com
- Introduction to Correlation - Feb 22, 2017.
Correlation is one of the most widely used (and widely misunderstood) statistical concepts. We provide the definitions and intuition behind several types of correlation and illustrate how to calculate correlation using the Python pandas library.
Beginners, Correlation, Datascience.com, Pandas, Python, Statistics
- Introduction to Natural Language Processing, Part 1: Lexical Units - Feb 16, 2017.
This series explores core concepts of natural language processing, starting with an introduction to the field and explaining how to identify lexical units as a part of data preprocessing.
Data Preprocessing, Datascience.com, Feature Extraction, Natural Language Processing, NLP, Tokenization
- Introduction to Forecasting with ARIMA in R - Jan 16, 2017.
ARIMA models are a popular and flexible class of forecasting models that use historical information to make predictions. In this tutorial, we walk through an example of examining a time series of demand at a bike-sharing service, fitting an ARIMA model, and creating a basic forecast.
ARIMA, Datascience.com, Forecasting, R, Stationarity, Time Series
- Introduction to Bayesian Inference - Dec 16, 2016.
Bayesian inference is a powerful toolbox for modeling uncertainty, combining researcher understanding of a problem with data, and providing a quantitative measure of how plausible various facts are. This overview from Datascience.com introduces Bayesian probability and inference in an intuitive way, and provides examples in Python to help get you started.
Bayesian, Datascience.com, Inference, Probability
- Introduction to K-means Clustering: A Tutorial - Dec 9, 2016.
A beginner's introduction to the widely used K-means clustering algorithm, using a delivery fleet data example in Python.
Clustering, Datascience.com, K-means, Python