- 10 Key AI & Data Analytics Trends for 2022 and Beyond - Dec 17, 2021.
What AI and data analytics trends are taking the industry by storm this year? This comprehensive review highlights upcoming directions in AI to carefully watch and consider implementing in your personal work or organization.
2022 Predictions, AI, Data, Data Analysis, Deep Learning, Environment, Low-Code, No-Code, Python, Trends
- 7 Differences Between a Data Analyst and a Data Scientist - Sep 9, 2021.
This article discusses the 7 key differences between data analysts and data scientists with an aim to help potential data analysts/scientists determine which is the right one for them. I touch on day-to-day tasks, skill requirements, typical career progression, and salary and career prospects for both.
Career Advice, Data Analysis, Data Analyst, Data Science, Data Scientist
- How Visualization is Transforming Exploratory Data Analysis - Aug 4, 2021.
Data analysts are dealing with bigger datasets than ever before, making interrogation difficult. Visualized Exploratory Data Analysis, supported by advanced parallel computing, promises an answer.
Data Analysis, Data Exploration, Data Visualization, Geospatial
- A Lightning Fast Look at Single Line Exploratory Data Analysis - Jul 8, 2021.
Here's a very quick look at how you can perform EDA with a single line of code using D-Tale.
Data Analysis, Data Exploration, Data Science, Data Visualization
- Applying Python’s Explode Function to Pandas DataFrames - May 7, 2021.
Read about this applied Python method for accessing columns by date/year using the Pandas library together with lambda, list(), map(), and explode().
Data Analysis, Pandas, Programming, Python
- Data Analysis Using Tableau - Apr 20, 2021.
Read this overview of using Tableau for sales data analysis, and see how visualization can help tell the business story.
Business, Data Analysis, Ecommerce, Python, Sales, Tableau
- What makes a song popular? Analyzing Top Songs on Spotify - Apr 16, 2021.
With so many great (and not-so-great) songs out there, it can be hard to find those that match your musical preferences. Follow along with this ML model-building project to explore the extensive song data available on Spotify and design a recommendation engine that could help you discover your next favorite artist!
Beatles, Data Analysis, Data Exploration, Feature Selection, Music, Spotify
- E-commerce Data Analysis for Sales Strategy Using Python - Apr 7, 2021.
Check out this informative and concise case study applying data analysis using Python to a well-defined e-commerce scenario.
Business, Data Analysis, Ecommerce, Python, Sales
- How to frame the right questions to be answered using data - Mar 18, 2021.
Understanding your data first is a key step before going too far into any data science project. But, you can't fully understand your data until you know the right questions to ask of it.
Advice, Data Analysis, Data Exploration, Data Science, Data Visualization
- Forget Telling Stories; Help People Navigate - Mar 15, 2021.
When designing reporting & visualizations, think of them as part of a navigation framework rather than stand-alone information.
Data Analysis, Data Science, Infographic, KPI, Storytelling
- Know your data much faster with the new Sweetviz Python library - Mar 12, 2021.
One of the latest exploratory data analysis libraries is Sweetviz, a new open-source Python library built for finding out data types, missing information, distributions of values, correlations, and more. Find out more about the library and how to use it here.
Data Analysis, Data Exploration, Data Visualization, Python
- 11 Essential Code Blocks for Complete EDA (Exploratory Data Analysis) - Mar 5, 2021.
This article is a practical guide to exploring the data in any data science project and gaining valuable insights.
Data Analysis, Data Exploration, Data Visualization, Pandas, Python
- Pandas Profiling: One-Line Magical Code for EDA - Feb 24, 2021.
EDA can be automated using a Python library called Pandas Profiling. Let’s explore Pandas Profiling to do EDA in a very short time and with just a single line of code.
Data Analysis, Data Exploration, Data Science, Pandas, Python
- Powerful Exploratory Data Analysis in just two lines of code - Feb 22, 2021.
EDA is a fundamental early process for any Data Science investigation. Typical approaches for visualization and exploration are powerful, but can be cumbersome for getting to the heart of your data. Now, you can get to know your data much faster with only a few lines of code... and it might even be fun!
Data Analysis, Data Exploration, Data Visualization, Python
- Multidimensional multi-sensor time-series data analysis framework - Feb 19, 2021.
This blog post provides an overview of the “msda” package, which is useful for time-series sensor data analysis. A quick introduction to time-series data is also provided.
Data Analysis, Python, Sensors, Time Series
- One question to make your data project 10x more valuable - Feb 1, 2021.
If you are the "data person" for your organization, then providing meaningful results to stakeholder data requests can sometimes feel like shots in the dark. However, you can make sure your data analysis is actionable by asking one magic question before getting started.
Advice, Business, Data Analysis, Data Mining, Data Science, Deployment, Problem Definition
- Cleaner Data Analysis with Pandas Using Pipes - Jan 15, 2021.
Check out this practical guide on Pandas pipes.
Data Analysis, Data Cleaning, Pandas, Pipeline, Python
- R or Python? Why Not Both? - Dec 9, 2020.
Do you use both R and Python, either in different projects or in the same one? Check out prython, an IDE designed to handle your needs.
Data Analysis, Data Science, IDE, Programming, Python, R
- Why the Future of ETL Is Not ELT, But EL(T) - Dec 4, 2020.
The well-established technologies and tools around ETL (Extract, Transform, Load) are undergoing a potential paradigm shift with new approaches to data storage and expanding cloud-based compute. Decoupling the EL from T could reconcile analytics and operational data management use cases, in a new landscape where data warehouses and data lakes are merging.
Data Analysis, Data Engineering, Data Lakes, Data Preparation, ELT, ETL
- 10 Principles of Practical Statistical Reasoning - Nov 3, 2020.
Practical Statistical Reasoning is a term that covers the nature and objective of applied statistics/data science, principles common to all applications, and practical steps/questions for better conclusions. The following principles have helped me become more efficient with my analyses and clearer in my conclusions.
Data Analysis, Data Quality, Data Science, Statistical Analysis, Statistics
- 10 Underrated Python Skills - Oct 21, 2020.
Tips for feature analysis, hyperparameter tuning, data visualization and more.
Data Analysis, Data Science Skills, Data Visualization, MLflow, Pandas, Programming, Python, Time Series
- Powerful CSV processing with kdb+ - Jul 23, 2020.
This article provides a glimpse into the available tools to work with CSV files and describes how kdb+ and its query language q raise CSV processing to a new level of performance and simplicity.
Data Analysis, Data Processing, Python
- Clustering Uber Rideshare Data - Jul 14, 2020.
This blog discusses clustering the Uber ridesharing dataset, with a focus on interpretation and understanding the concepts in the real world.
Clustering, Data Analysis, Uber
- Understanding Time Series with R - Jul 9, 2020.
Analyzing time series is such a useful resource for essentially any business that data scientists entering the field should bring with them a solid foundation in the technique. Here, we decompose the logical components of a time series using R to better understand how each plays a role in this type of analysis.
Beginners, Business Analytics, Data Analysis, R, Time Series
- Exploratory Data Analysis on Steroids - Jul 6, 2020.
Exploratory data analysis is a central aspect of Data Science that sometimes gets overlooked. The first step of anything you do should be to know your data: understand it, get familiar with it. This becomes even more important as your data volume increases: imagine trying to parse through thousands or millions of records and make sense of them.
Data Analysis, Data Exploration, Data Preparation, Pandas, Python
- Python for data analysis… is it really that simple?!? - Apr 2, 2020.
The article addresses a simple data analytics problem, comparing a Python and Pandas solution to an R solution (using plyr, dplyr, and data.table), as well as kdb+ and BigQuery solutions. Performance improvement tricks for these solutions are then covered, as are parallel/cluster computing approaches and their limitations.
Data Analysis, Pandas, Python, R, SQL
- The Last SQL Guide for Data Analysis You’ll Ever Need - Oct 4, 2019.
This is it: the last SQL guide for data analysis you'll ever need! OK, maybe it’s actually the first. But it’ll give you a solid head start.
Cheat Sheet, Data Analysis, Data Science, SQL
- Exploratory Data Analysis Using Python - Aug 7, 2019.
In this tutorial, you’ll use Python and Pandas to explore a dataset and create visual distributions, identify and eliminate outliers, and uncover correlations between two datasets.
ActiveState, Data Analysis, Data Exploration, Pandas, Python
- 10 Simple Hacks to Speed up Your Data Analysis in Python - Jul 11, 2019.
This article lists some curated tips for working with Python and Jupyter Notebooks, covering topics such as easily profiling data, formatting code and output, debugging, and more. Hopefully you can find something useful within.
Data Analysis, Jupyter, Pandas, Python, Tips
- Who is your Golden Goose?: Cohort Analysis - May 30, 2019.
Step-by-step tutorial on how to perform customer segmentation using RFM analysis and K-Means clustering in Python.
Clustering, Data Analysis, K-means, Python, Retail
- Explaining the 68-95-99.7 rule for a Normal Distribution - Jul 19, 2018.
This post explains how those numbers were derived in the hope that they can be more interpretable for your future endeavors.
Data Analysis, Data Science, Normal Distribution, Python, Statistics
- Hands-on: Intro to Python for Data Analysis - May 2, 2018.
Learn one of the top languages used in data science and machine learning with this new hands-on course by TDWI Online Learning.
Data Analysis, Online Education, Python, TDWI
- Jupyter Notebook for Beginners: A Tutorial - May 1, 2018.
The Jupyter Notebook is an incredibly powerful tool for interactively developing and presenting data science projects. Although it is possible to use many different programming languages within Jupyter Notebooks, this article will focus on Python as it is the most common use case.
Data Analysis, GitHub, Jupyter, Matplotlib, Python
- Applied Data Science: Solving a Predictive Maintenance Business Problem Part 2 - Feb 20, 2018.
In this post, we further discuss how exploratory analysis can be used to gain insights for feature engineering.
Data Analysis, Data Exploration, Data Science, Feature Engineering
- Top 15 Scala Libraries for Data Science in 2018 - Feb 9, 2018.
For your convenience, we have prepared a comprehensive overview of the most important libraries used to perform machine learning and Data Science tasks in Scala.
Apache Spark, Data Analysis, Data Science, Data Visualization, Machine Learning, NLP, Scala
- An opinionated Data Science Toolbox in R from Hadley Wickham, tidyverse - Oct 10, 2017.
Get your productivity boosted with Hadley Wickham's powerful R package, tidyverse. It has all you need to start developing your own data science workflows.
Data Analysis, Data Science, Data Science Platform, Data Science Tools, Hadley Wickham, R, Tidyverse
- A Guide to Instagramming with Python for Data Analysis - Aug 17, 2017.
I am writing this article to show you the basics of using Instagram in a programmatic way. You can benefit from this if you want to use it in a data analysis, computer vision, or any other cool project you can think of.
Data Analysis, Image Recognition, Instagram, Python
- How to squeeze the most from your training data - Jul 27, 2017.
In many cases, getting enough well-labelled training data is a huge hurdle for developing accurate prediction systems. Here is an innovative approach which uses SVM to get the most from training data.
Data Analysis, Data Preparation, Machine Learning, Support Vector Machines, SVM, Training Data
- Exploratory Data Analysis in Python - Jul 7, 2017.
We view EDA very much like a tree: there is a basic series of steps you perform every time you perform EDA (the main trunk of the tree) but at each step, observations will lead you down other avenues (branches) of exploration by raising questions you want to answer or hypotheses you want to test.
Data Analysis, Data Exploration, Data Preparation, Jupyter, Python, SVDS
- Getting Started with Python for Data Analysis - Jul 5, 2017.
A guide for beginners to Python for getting started with data analysis.
Beginners, Data Analysis, Jupyter, numpy, Python
- K-means Clustering with Tableau – Call Detail Records Example - Jun 16, 2017.
We show how to use the Tableau 10 clustering feature to create statistically based segments that provide insights about similarities within different groups and how the groups perform when compared to each other.
Clustering, Data Analysis, GitHub, K-means, Tableau, Telecom
- K-means Clustering with R: Call Detail Record Analysis - Jun 6, 2017.
A Call Detail Record (CDR) is the information captured by telecom companies during a customer's call, SMS, and Internet activity. This information provides greater insight into the customer’s needs when combined with customer demographics.
Clustering, Data Analysis, K-means, Telecom
- The Guerrilla Guide to Machine Learning with R - May 11, 2017.
This post is a lean look at learning machine learning with R. It is a complete, if very short, course for the quick study hacker with no time (or patience) to spare.
Data Analysis, Machine Learning, R
- Did you know cavemen were already dealing with “Big Data” issues? - May 3, 2017.
We think of Big Data & Analytics as new, cutting-edge technologies, but humans actually started using data & analytics techniques 5,000 years ago. Let’s take a look.
Big Data, Big Data Analytics, Data Analysis, Data Science, History
- The Value of Exploratory Data Analysis - Apr 20, 2017.
In this post, we give a high-level overview of what exploratory data analysis (EDA) typically entails and then describe three of the major ways EDA is critical to successfully building a model and interpreting its results.
Data Analysis, Data Exploration, Data Visualization, SVDS
- What is Structural Equation Modeling? - Mar 27, 2017.
Structural Equation Modeling (SEM) is an extremely broad and flexible framework for data analysis, perhaps better thought of as a family of related methods rather than as a single technique. What is its relevance to Marketing Research?
Data Analysis, Market Research, Modeling, Psychology
- Time Series Analysis: A Primer - Jan 17, 2017.
Time series analysis is a complex subject but, in short, when we use our usual cross-sectional techniques such as regression on time series data, variables can appear "more significant" than they really are and we are not taking advantage of the information the serial correlation in the data provides.
Data Analysis, Time Series
- Free ebooks: Machine Learning with Python and Practical Data Analysis - Dec 5, 2016.
Two free ebooks: "Building Machine Learning Systems with Python" and "Practical Data Analysis" will give your skills a boost and make a great start in the New Year.
Data Analysis, Free ebook, Machine Learning, Packt Publishing, Python
- Comprehensive Guide to Learning Python for Data Analysis and Data Science - Apr 20, 2016.
Want to make a career change to Data Science using Python? Learning anything on your own can be a challenge, and a little guidance can be a great help. That is exactly what this article provides.
Data Analysis, Data Science Education, DataCamp, Python
- Integrating Python and R into a Data Analysis Pipeline, Part 1 - Oct 29, 2015.
The first in a series of blog posts that outline the basic strategy for integrating Python and R, run through the different steps involved in this process, and give a real example of how and why you would want to do this.
Data Analysis, Mango Solutions, Python, Python vs R, R
- Which Movie Sequels Are Really Better? A Data Science Answer - Oct 19, 2015.
The internet is filled with polls and lists of sequels that are better or worse than the other movies in their series. Yet such rankings are often based on personal judgement and rarely on data and statistics. Here is our solution for analyzing and visualizing movie series.
Data Analysis, Data Visualization, IMDb, James Bond, Movies, Silk.co
- Interview: David Kasik, Boeing on Data Analysis vs Data Analytics - Feb 23, 2015.
We discuss the impact of the increasing amount of data on visualization, the difference between Data Analysis and Data Analytics, motivation, trends, desired skills, and more.
3D, Boeing, Career, Data Analysis, Data Analytics, David Kasik, Trends, Visualization
- Domino – A Platform For Modern Data Analysis - Jun 26, 2014.
Tools that facilitate data science best practices have not yet matured to match their counterparts in the world of software engineering. Domino is a platform built from the ground up to fill in these gaps and accelerate modern analytical workflows.
Business Analytics, Data Analysis, Data Science Platform, Domino, Tools
- Top 10 Data Analysis Tools for Business - Jun 13, 2014.
Ten free, easy-to-use, and powerful tools to help you analyze and visualize data, analyze social networks, do optimization, search more efficiently, and solve your data analysis problems.
Data Analysis, Knime, RapidMiner, Tableau, Top 10, Wolfram