- Three R Libraries Every Data Scientist Should Know (Even if You Use Python) - Dec 20, 2021.
Check out these powerful R libraries built by the world’s biggest tech companies.
Data Science, Data Scientist, Python, R
- Path to Full Stack Data Science - Sep 27, 2021.
Start your journey toward mastering all aspects of the field of Data Science with this focused list of in-depth self-learning resources. Curated with the beginner in mind, these recommendations will help you learn efficiently, and can also offer existing professionals useful highlights for review or help filling in any gaps in skills.
Career Advice, Data Science, Data Science Education, Data Visualization, Mathematics, Python, R, Roadmap
- ebook: Learn Data Science with R – free download - Sep 7, 2021.
Check out this new book for data science beginners with many practical examples that covers statistics, R, graphing, and machine learning. As a source to learn the full breadth of data science foundations, "Learn Data Science with R" starts at the beginner level and gradually progresses into expert content.
Data Science, Data Science Education, ebook, R
- Introduction to Statistical Learning Second Edition - Aug 13, 2021.
The second edition of the classic "An Introduction to Statistical Learning, with Applications in R" was published very recently, and is now freely-available via PDF on the book's website.
Books, Data Science, Machine Learning, R, Statistical Learning, Statistics
- 5 Tips for Writing Clean R Code - Aug 9, 2021.
This article summarizes the most common mistakes to avoid and outlines best practices to follow in programming in general. Follow these tips to speed up the code review iteration process and be a rockstar developer in your reviewer’s eyes!
Programming, R
- The Most In-Demand Skills for Data Scientists in 2021 - Apr 15, 2021.
If you are preparing to make a career as a Data Scientist or are looking for opportunities to skill-up in your current role, this analysis of in-demand skills for 2021, based on over 15,000 Data Scientist job postings, should offer you a good idea as to which programming languages and software tools are increasing and decreasing in importance.
AWS, Data Science Skills, Python, PyTorch, R, scikit-learn, SQL, TensorFlow
- Data Science Curriculum for Professionals - Mar 25, 2021.
If you are looking to expand or transition your current professional career that is buried in spreadsheet analysis into one powered by data science, then you are in for an exciting but complex journey with much to explore and master. To begin your adventure, follow this complete road map to guide you from a gnome in the forest of spreadsheets to an AI wizard known far and wide throughout the kingdom.
Cloud Computing, Data Science Education, Data Visualization, Machine Learning, Python, R, Roadmap, Statistics
- Support Vector Machine for Hand Written Alphabet Recognition in R - Jan 27, 2021.
We attempt to break down a problem of hand written alphabet image recognition into a simple process rather than using heavy packages. This is an attempt to create the data and then build a model using Support Vector Machines for Classification.
Classification, Image Recognition, Machine Learning, R, Support Vector Machines
- Creating Good Meaningful Plots: Some Principles - Jan 12, 2021.
Here are some thought starters to help you create meaningful plots.
Charts, Data Visualization, Python, R
- 15 Free Data Science, Machine Learning & Statistics eBooks for 2021 - Dec 31, 2020.
We present a curated list of 15 free eBooks compiled in a single location to close out the year.
Automated Machine Learning, Data Science, Deep Learning, Free ebook, Machine Learning, NLP, Python, R, Statistics
- Undersampling Will Change the Base Rates of Your Model’s Predictions - Dec 17, 2020.
In classification problems, the proportion of cases in each class largely determines the base rate of the predictions produced by the model. Therefore if you use sampling techniques that change this proportion, there is a good chance you will want to rescale / calibrate your predictions before using them in the wild.
Classification, Modeling, Predictions, R, Sampling
- R or Python? Why Not Both? - Dec 9, 2020.
Do you use both R and Python, either in different projects or in the same? Check out prython, an IDE designed to handle your needs.
Data Analysis, Data Science, IDE, Programming, Python, R
- Simple & Intuitive Ensemble Learning in R - Dec 2, 2020.
Read about metaEnsembleR, an R package for heterogeneous ensemble meta-learning (classification and regression) that is fully-automated.
Classification, Ensemble Methods, R, Regression
- Top 6 Data Science Programs for Beginners - Nov 20, 2020.
Udacity has the best industry-leading programs in data science. Here are the top six data science courses for beginners to help you get started.
Beginners, Certificate, Data Engineer, Data Science Education, Data Visualization, Online Education, Python, R, SQL, Udacity
- Behavior Analysis with Machine Learning and R: The free eBook - Oct 22, 2020.
Check out this new free ebook to learn how to leverage the power of machine learning to analyze behavioral patterns from sensor data and electronic records using R.
Behavioral Analytics, Free ebook, Machine Learning, R
- Text Mining with R: The Free eBook - Oct 15, 2020.
This freely-available book will show you how to perform text analytics in R, using packages from the tidyverse.
Free ebook, R, Text Mining, Tidyverse
- Data Science Tools Illustrated Study Guides - Aug 25, 2020.
These data science tools illustrated guides are broken up into four distinct categories: data retrieval, data manipulation, data visualization, and engineering tips. Both online and PDF versions of these guides are available.
Cheat Sheet, Data Preprocessing, Data Processing, Data Science, Data Science Tools, Data Visualization, Python, R, SQL
- Wrapping Machine Learning Techniques Within AI-JACK Library in R - Jul 17, 2020.
The article shows an approach to solving the problem of selecting the best technique in machine learning. This can be done in R using just one library called AI-JACK, and the article shows how to use this tool.
Automated Machine Learning, AutoML, Machine Learning, Modeling, R
- Understanding Time Series with R - Jul 9, 2020.
Analyzing time series is such a useful resource for essentially any business that data scientists entering the field should bring with them a solid foundation in the technique. Here, we decompose the logical components of a time series using R to better understand how each plays a role in this type of analysis.
Beginners, Business Analytics, Data Analysis, R, Time Series
- An Introduction to Statistical Learning: The Free eBook - Jun 29, 2020.
This week's free eBook is a classic of data science, An Introduction to Statistical Learning, with Applications in R. If interested in picking up elementary statistical learning concepts, and learning how to implement them in R, this book is for you.
Free ebook, R, Robert Tibshirani, Statistical Learning, Trevor Hastie
- Practical Markov Chain Monte Carlo - Jun 26, 2020.
This is a slightly more intricate example of MCMC, compared to many with a fairly simple model, a single predictor (maybe two), and not much else, which highlights a couple of issues and tricks worth noting for a handwritten implementation.
Bayesian, Markov Chains, Monte Carlo, R
- Data Science Tools Popularity, animated - Jun 25, 2020.
Watch the evolution of the top 10 most popular data science tools based on KDnuggets software polls from 2000 to 2019.
About KDnuggets, Data Science Platform, Poll, Python, R
- Build a Branded Web Based GIS Application Using R, Leaflet and Flexdashboard - Jun 24, 2020.
By using R, Flexdashboard and Leaflet, we can build a customized and branded web application to showcase location based data interactively across the organization. Instead of crowding the application with many widgets, we use menu tabs and pages to separate the interactive aspects.
Data Scientist, Data Visualization, Geospatial, GIS, Leaflet, R, Rstudio
- modelStudio and The Grammar of Interactive Explanatory Model Analysis - Jun 19, 2020.
modelStudio is an R package that automates the exploration of ML models and allows for interactive examination. It works in a model-agnostic fashion and is therefore compatible with most ML frameworks.
Analysis, Explainability, Interpretability, Machine Learning, R
- Python for data analysis… is it really that simple?!? - Apr 2, 2020.
The article addresses a simple data analytics problem, comparing a Python and Pandas solution to an R solution (using plyr, dplyr, and data.table), as well as kdb+ and BigQuery solutions. Performance improvement tricks for these solutions are then covered, as are parallel/cluster computing approaches and their limitations.
Data Analysis, Pandas, Python, R, SQL
- Time Series Classification Synthetic vs Real Financial Time Series - Mar 18, 2020.
This article discusses distinguishing between real financial time series and synthetic time series using XGBoost.
Finance, R, Time Series, XGBoost
- Decision Boundary for a Series of Machine Learning Models - Mar 13, 2020.
I train a series of Machine Learning models using the iris dataset, construct synthetic data from the extreme points within the data, and test a number of Machine Learning models in order to draw the decision boundaries from which the models make predictions in a 2D space. This is useful for illustrative purposes and for understanding how different Machine Learning models make predictions.
Decision Boundaries, Machine Learning, Modeling, R
- Python and R Courses for Data Science - Feb 26, 2020.
Since Python and R are a must for today's data scientists, continuous learning is paramount. Online courses are arguably the best and most flexible way to upskill throughout one's career.
Coursera, Data Science, edX, MOOC, Programming, Python, R
- Getting Started with R Programming - Feb 19, 2020.
An end to end Data Analysis using R, the second most requested programming language in Data Science.
Data Science, Machine Learning, Programming, R
- Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau - Feb 14, 2020.
When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; and running an XGBoost model to determine where crimes will occur on a specific day.
Crime, Geospatial, R, SQL, Tableau, Time Series
- Basics of Audio File Processing in R - Feb 11, 2020.
This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.
Audio, Data Processing, R
- Serverless Machine Learning with R on Cloud Run - Feb 4, 2020.
Expedite the deployment of your machine learning models using serverless cloud infrastructure. In this tutorial, we explore creating and deploying a model which scrapes real-time Twitter data and returns an interactive visualization using R.
Cloud, Machine Learning, R, Twitter
- Classify A Rare Event Using 5 Machine Learning Algorithms - Jan 15, 2020.
Which algorithm works best for unbalanced data? Are there any tradeoffs?
Algorithms, Classification, Machine Learning, R, ROC-AUC, Unbalanced
- Beginner’s Guide to K-Nearest Neighbors in R: from Zero to Hero - Jan 3, 2020.
This post presents a pipeline of building a KNN model in R with various measurement metrics.
Beginners, K-nearest neighbors, Metrics, R
- Plotnine: Python Alternative to ggplot2 - Dec 12, 2019.
Python's plotting libraries such as matplotlib and seaborn do allow the user to create elegant graphics as well, but the lack of a standardized syntax for implementing the grammar of graphics, compared to the simple, readable, and layered approach of ggplot2 in R, makes it more difficult to implement in Python.
Data Science, Data Visualization, Python, R
- How to Visualize Data in Python (and R) - Nov 14, 2019.
Producing accessible data visualizations is a key data science skill. The following guidelines will help you create the best representations of your data using R and Python's Pandas library.
Data Visualization, Matplotlib, Python, R, SuperDataScience
- Orchestrating Dynamic Reports in Python and R with Rmd Files - Nov 8, 2019.
Do you want to extract CSV files with Python and visualize them in R? How does preparing everything in R and making conclusions with Python sound? Both are possible if you know the right libraries and techniques. Here, we’ll walk through a use case using both languages in one analysis.
Python, R, Report
- Customer Segmentation for R Users - Sep 26, 2019.
This article shows you how to separate your customers into distinct groups based on their purchase behavior. For the R enthusiasts out there, I demonstrated what you can do with r/stats, ggradar, ggplot2, animation, and factoextra.
Customer Analytics, R, Segmentation
- Scikit-Learn vs mlr for Machine Learning - Sep 10, 2019.
How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.
Exxact, Machine Learning, R, scikit-learn
- R Users’ Salaries from the 2019 Stackoverflow Survey - Aug 30, 2019.
Let’s take a look at what R users are saying about their salaries. Note that the following results could be biased because of unrepresentative and in some cases small samples.
R, Salary, StackOverflow, Survey
- Coding Random Forests® in 100 lines of code* - Aug 7, 2019.
There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established algorithms, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks.
Algorithms, Machine Learning, Multicollinearity, R, random forests algorithm
- Ten more random useful things in R you may not know about - Jul 31, 2019.
I had a feeling that R has developed as a language to such a degree that many of us are using it now in completely different ways. This means that there are likely to be numerous tricks, packages, functions, etc that each of us use, but that others are completely unaware of, and would find useful if they knew about them.
Advice, Analytics, Data Science, R
- Kaggle Kernels Guide for Beginners: A Step by Step Tutorial - Jul 23, 2019.
This is an attempt to hold the hands of a complete beginner and walk them through the world of Kaggle Kernels — for them to get started.
Kaggle, Python, R
- The Evolution of a ggplot - Jul 18, 2019.
A step-by-step tutorial showing how to turn a default ggplot into an appealing and easily understandable data visualization in R.
Data Visualization, ggplot2, R
- How to Make Stunning 3D Plots for Better Storytelling - Jul 17, 2019.
3D Plots built in the right way for the right purpose are always stunning. In this article, we’ll see how to make stunning 3D plots with R using ggplot2 and rayshader.
Data Visualization, ggplot2, R, Storytelling
- Ten random useful things in R that you might not know about - Jun 20, 2019.
Because the R ecosystem is so rich and constantly growing, people can often miss out on knowing about something that can really help them in a task that they have to complete.
Advice, Analytics, Data Science, R
- Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS - Jun 17, 2019.
Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.
Data Science, indeed, Jobs, Python, R, SAS, TensorFlow
- What you need to know: The Modern Open-Source Data Science/Machine Learning Ecosystem - Jun 10, 2019.
We identify the 6 tools in the modern open-source Data Science ecosystem, examine the Python vs R question, and determine which tools are used the most with Deep Learning and Big Data.
Anaconda, Apache Spark, Big Data Software, Deep Learning, Excel, Keras, Poll, Python, R, RapidMiner, scikit-learn, Software, SQL, Tableau, TensorFlow
- The Whole Data Science World in Your Hands - Jun 5, 2019.
Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.
Data Science, Data Scientist, Julia, Jupyter, MatrixDS, Python, R
- Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis - May 30, 2019.
Python continues to lead the top Data Science platforms, but R and RapidMiner hold their share; Almost 50% have used Deep Learning tools; SQL is steady; Consolidation continues.
Anaconda, Apache Spark, Deep Learning, Excel, Keras, Poll, Python, R, RapidMiner, scikit-learn, Software, SQL, TensorFlow
- The Mueller Report Word Cloud: A brief tutorial in R - Apr 22, 2019.
Word clouds are simple visual summaries of the most frequently used words in a text, presenting essentially the same information as a histogram but somewhat less precise and vastly more eye-catching. Get a quick sense of the themes in the recently released Mueller Report and its 448 pages of legal content.
Donald Trump, Politics, R, Word Cloud
- R vs Python for Data Visualization - Mar 25, 2019.
This article demonstrates creating similar plots in R and Python using two of the most prominent data visualization packages on the market, namely ggplot2 and Seaborn.
Data Visualization, ggplot2, Matplotlib, Python, Python vs R, R, Seaborn
- Top R Packages for Data Cleaning - Mar 15, 2019.
Data cleaning is one of the most important and time-consuming tasks for data scientists. Here are the top R packages for data cleaning.
Data Cleaning, Data Preparation, Data Science, Machine Learning, R
- Who is a typical Data Scientist in 2019? - Mar 11, 2019.
We investigate what a typical data scientist looks like and see how this differs from this time last year, looking at skill set, programming languages, industry of employment, country of employment, and more.
Career, Data Science Skills, Data Scientist, Industry, MATLAB, Python, R, SQL
- Running R and Python in Jupyter - Feb 19, 2019.
The Jupyter Project began in 2014 for interactive and scientific computing. Fast forward 5 years and now Jupyter is one of the most widely adopted Data Science IDEs on the market and gives the user access to Python and R.
IPython, Jupyter, Python, R
- Understanding Gradient Boosting Machines - Feb 6, 2019.
Despite the massive popularity of gradient boosting, many professionals still use this algorithm as a black box. As such, the purpose of this article is to lay an intuitive framework for this powerful machine learning technique.
Adaboost, Decision Trees, Gradient Boosting, R
- Using Caret in R to Classify Term Deposit Subscriptions for a Bank - Feb 4, 2019.
This article uses direct marketing campaign data from a Portuguese banking institution to predict if a customer will subscribe for a term deposit. We’ll be working with R’s Caret package to achieve this.
Banking, Classification, R
- Airbnb Rental Listings Dataset Mining - Jan 28, 2019.
An Exploratory Analysis of Airbnb’s Data to understand the rental landscape in New York City.
AirBnB, Data Exploration, Data Visualization, New York City, R, Real Estate
- 2018’s Top 7 R Packages for Data Science and AI - Jan 22, 2019.
This is a list of the best packages that changed our lives this year, compiled from my weekly digests.
AI, Data Science, R
- Deep learning in Satellite imagery - Dec 26, 2018.
This article outlines possible sources of satellite imagery, what its properties are and how this data can be utilised using R.
Deep Learning, Image Recognition, R
- Automated Web Scraping in R - Dec 11, 2018.
How to automatically web scrape periodically so you can analyze timely/frequently updated data.
Data Science Dojo, R, Web Scraping
- Data Science Projects Employers Want To See: How To Show A Business Impact - Dec 4, 2018.
The best way to create better data science projects that employers want to see is to provide a business impact. This article highlights the process using customer churn prediction in R as a case-study.
Career Advice, Churn, Data Preparation, Data Science, R
- Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools - Dec 3, 2018.
We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.
Big Data, Data Visualization, Deep Learning, Jupyter, Machine Learning, Python, R, Tableau
- SQL, Python, & R in One Platform - Oct 26, 2018.
No more jumping between applications. Mode Studio combines a SQL editor, Python and R notebooks, and a visualization builder in one platform.
Data Visualization, Mode Analytics, Python, R, SQL
- Apache Spark Introduction for Beginners - Oct 18, 2018.
An extensive introduction to Apache Spark, including a look at the evolution of the product, use cases, architecture, ecosystem components, core concepts and more.
Apache Spark, Beginners, Hadoop, R
- Evaluating the Business Value of Predictive Models in Python and R - Oct 11, 2018.
In these blogs for R and Python, we explain four valuable evaluation plots to assess the business value of a predictive model. We show how you can easily create these plots and help you to explain your predictive model to non-techies.
Business Value, Data Visualization, Lift charts, Predictive Models, Python, R
- Introducing Path Analysis Using R - Sep 27, 2018.
Path analysis is an extension of multiple regression. It allows for the analysis of more complicated models.
Analysis, Analytics, R
- Optimization 101 for Data Scientists - Aug 8, 2018.
We show how to use optimization strategies to make the best possible decision.
Football, Julia, Optimization, Python, R, Sports
- From Data to Viz: how to select the right chart for your data - Aug 1, 2018.
We offer an interactive, decision tree-style tool, which examines the data you have and proposes a set of potentially appropriate visualizations to represent your dataset.
Data, Data Visualization, ggplot2, GitHub, R, Tidyverse
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks - Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
Jupyter, Machine Learning, Microsoft, Python, R, SQL, SQL Server
- 5 of Our Favorite Free Visualization Tools - Jul 5, 2018.
5 key free data visualization tools that can provide flexible and effective data presentation.
Analytics, D3.js, Data Science, Data Visualization, Free Software, R, Tableau
- Stagraph – a general purpose R GUI, for data import, wrangling, and visualization - Jun 25, 2018.
Stagraph is a new simple visual interface for R, which focuses on data import, data wrangling and data visualization.
Data Preparation, Data Visualization, R, Tidyverse
- How to Execute R and Python in SQL Server with Machine Learning Services - Jun 25, 2018.
Machine Learning Services in SQL Server eliminates the need for data movement - you can install and run R/Python packages to build Deep Learning and AI applications on data in SQL Server.
Azure ML, Machine Learning, Microsoft, Python, R, SQL, SQL Server
- 7 Simple Data Visualizations You Should Know in R - Jun 22, 2018.
This post presents a selection of 7 essential data visualizations, and how to recreate them using a mix of base R functions and a few common packages.
Charts, Data Visualization, Graphs, R
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R? - Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
Anaconda, Apache Spark, Data Science, Keras, Machine Learning, Open Source, Poll, Python, R, RapidMiner, Scala, scikit-learn, TensorFlow
- Using Linear Regression for Predictive Modeling in R - Jun 1, 2018.
In this post, we’ll use linear regression to build a model that predicts cherry tree volume from metrics that are much easier for folks who study trees to measure.
Linear Regression, Predictive Modeling, R
- Top 20 R Libraries for Data Science in 2018 - May 25, 2018.
We have prepared an infographic of the Top 20 R packages for data science, which covers the libraries' main features and GitHub activity, as all of the libraries are open-source.
Data Science, Infographic, R
- Modelling Time Series Processes using GARCH - May 25, 2018.
To go into the turbulent seas of volatile data and analyze it in a time changing setting, ARCH models were developed.
Modeling, R, Time Series
- How to tackle common data cleaning issues in R - May 24, 2018.
R is a great choice for manipulating, cleaning, summarizing, producing probability statistics, and so on. In addition, it's not going away anytime soon; it is platform independent, so what you create will run almost anywhere; and it has awesome help resources.
Book, Data Cleaning, ebook, Packt Publishing, R
- Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis - May 22, 2018.
Python continues to eat away at R, RapidMiner gains, SQL is steady, TensorFlow advances pulling along Keras, Hadoop drops, Data Science platforms consolidate, and more.
Anaconda, Data Mining Software, Data Science Platform, Hadoop, Keras, Poll, Python, R, RapidMiner, SQL, TensorFlow, Trends
- Optimization Using R - May 18, 2018.
Optimization is a technique for finding the best possible solution to a given problem from among all the possible solutions. Optimization uses a rigorous mathematical model to find the most efficient solution to the given problem.
Excel, Linear Programming, Optimization, R
- New Book: Credit risk analytics, The R Companion - Mar 16, 2018.
Credit risk analytics in R will enable you to build credit risk models from start to finish. With access to real credit data on the accompanying website, you will master a wide range of applications.
Analytics, Bart Baesens, Credit Risk, R
- Choropleth Maps in R - Mar 12, 2018.
Choropleth maps provide a very simple and easy-to-understand visualization of a measurement across different geographical areas, be it states or countries.
Choropleth, Data Visualization, India, Maps, R
- Text Processing in R - Mar 9, 2018.
There are good reasons to want to use R for text processing, namely that we can do it, and that we can fit it in with the rest of our analyses. Furthermore, there is a lot of very active development going on in the R text analysis community right now.
Data Processing, R, Text Analytics, Text Mining
- Control Structures in R: Using If-Else Statements and Loops - Feb 23, 2018.
Control structures allow you to specify the flow of execution of your code. They are extremely useful if you want to run a piece of code multiple times, or if you want to run a piece of code only if a certain condition is met.
Decision Making, Programming Languages, R
- Deep Learning in H2O using R - Jan 22, 2018.
This article is about implementing Deep Learning (DL) using the H2O package in R. We start with a background on DL, followed by some features of H2O's DL framework, followed by an implementation using R.
Backpropagation, Deep Learning, Gradient Descent, H2O, Machine Learning, R
- Propensity Score Matching in R - Jan 18, 2018.
Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.
Bias, R, Statistics
- Topological Data Analysis for Data Professionals: Beyond Ayasdi - Jan 16, 2018.
We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.
Algorithms, Clustering, R, Regression, Topological Data Analysis
- A Primer on Web Scraping in R - Jan 12, 2018.
If you are a data scientist who wants to capture data from such web pages then you wouldn’t want to be the one to open all these pages manually and scrape the web pages one by one. To push away the boundaries limiting data scientists from accessing such data from web pages, there are packages available in R.
Data Cleaning, Data Curation, R, Web Scraping
- 10 Tools to Help You Learn R - Jan 4, 2018.
There are several tools to help you grasp the foundational principles and more. The list below gives you an idea of what’s available and how much it costs.
R, Tools, Training
- Simple Ways Of Working With Medium To Big Data Locally - Dec 27, 2017.
An overview of the installation and implementation of simple techniques for working with large datasets on your machine.
Big Data, iPhone, Python, R, SAS
- Extracting Tweets With R - Nov 14, 2017.
This article will give you a great, brief overview for extracting Tweets using R.
R, Twitter
- Tips for Getting Started with Text Mining in R and Python - Nov 8, 2017.
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
Python, R, Text Mining
- Process Mining with R: Introduction - Nov 2, 2017.
In the past years, several niche tools have appeared to mine organizational business processes. In this article, we’ll show you that it is possible to get started with “process mining” using well-known data science programming languages as well.
Data Mining, Data Science, Process Mining, R
- Top 10 Machine Learning with R Videos - Oct 24, 2017.
A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.
Algorithms, Clustering, K-nearest neighbors, Machine Learning, PCA, R, Text Mining, Top 10, Youtube
- Data Science Bootcamp in Zurich, Switzerland, January 15 – April 6, 2018 - Oct 12, 2017.
Come to the land of chocolate and Data Science, where the local tech scene is booming and jobs are aplenty. Learn the most important concepts from top instructors by doing and through projects. Use code KDNUGGETS to save.
Bootcamp, Data Science, Data Visualization, Machine Learning, NLP, Python, R, Switzerland, Zurich
- Introducing R-Brain: A New Data Science Platform - Oct 11, 2017.
R-Brain is a next-generation platform for data science built on top of JupyterLab with Docker, which supports not only R but also Python and SQL, and has integrated IntelliSense, debugging, packaging, and publishing capabilities.
Data Science Platform, Docker, Jen Underwood, Jupyter, R, R-Brain
- Learn Generalized Linear Models (GLM) using R - Oct 11, 2017.
In this article, we aim to discuss various GLMs that are widely used in the industry. We focus on: a) log-linear regression b) interpreting log-transformations and c) binary logistic regression.
Generalized Linear Models, Linear Regression, Logistic Regression, Machine Learning, R, Regression
- An opinionated Data Science Toolbox in R from Hadley Wickham, tidyverse - Oct 10, 2017.
Get your productivity boosted with Hadley Wickham's powerful R package, tidyverse. It has all you need to start developing your own data science workflows.
Data Analysis, Data Science, Data Science Platform, Data Science Tools, Hadley Wickham, R, Tidyverse
- Find Out What Celebrities Tweet About the Most - Oct 5, 2017.
The word cloud is a popular data visualisation method. Here we show how to use R to create Twitter word clouds of celebrities and politicians.
Andrew Ng, Data Visualization, Donald Trump, R, Twitter, Word Cloud
- Top 10 Videos on Machine Learning in Finance - Sep 29, 2017.
Talks, tutorials and playlists – you could not get a more gentle introduction to Machine Learning (ML) in Finance. Got a quick 4 minutes or ready to study for hours on end? These videos cover all skill levels and time constraints!
Credit Risk, Finance, Investment Portfolio, Machine Learning, Python, R, Stocks, Tutorials, Videolectures, Youtube
- Visualizing High Dimensional Data In Augmented Reality - Sep 25, 2017.
When Data Scientists first get a data set, they often use a matrix of 2D scatter plots to quickly see the contents and relationships between pairs of attributes. But for data with lots of attributes, such analysis does not scale.
Data Science, Data Visualization, IBM, Instacart, Machine Learning, R
- 30 Essential Data Science, Machine Learning & Deep Learning Cheat Sheets - Sep 22, 2017.
This collection of data science cheat sheets is not a cheat sheet dump, but a curated list of reference materials spanning a number of disciplines and tools.
Cheat Sheet, Data Science, Deep Learning, Machine Learning, Neural Networks, Probability, Python, R, SQL, Statistics
- A Solution to Missing Data: Imputation Using R - Sep 21, 2017.
Handling missing values is one of a data analyst's worst nightmares. In such situations, a wise analyst ‘imputes’ the missing values instead of dropping them from the data.
Data Preparation, Missing Values, R
- Videos for Business Analytics using Data Mining course - Sep 12, 2017.
Here we present links to very useful videos for the Business Analytics using Data Mining course.
Business Analytics, Data Mining, Galit Shmueli, Online Education, R, Youtube
- Python vs R – Who Is Really Ahead in Data Science, Machine Learning? - Sep 12, 2017.
We examine Google Trends, job trends, and more and note that while Python has only a small advantage among current Data Science and Machine Learning related jobs, this advantage is likely to increase in the future.
Data Science, Google Trends, Jobs, Kaggle, Machine Learning, Python, Python vs R, R
- Next Generation Data Manipulation with R and dplyr - Aug 31, 2017.
The idea behind the dplyr package is to do one thing at a time. dplyr has a separate function for every task, which makes its use crisp and easy to understand.
Data Cleaning, Data Exploration, R, R Packages
- Python overtakes R, becomes the leader in Data Science, Machine Learning platforms - Aug 28, 2017.
While Python did not "swallow" R, in 2017 the Python ecosystem overtook R as the leading platform for Analytics, Data Science, and Machine Learning, and it is pulling users from other platforms.
Data Science Platform, Poll, Python, Python vs R, R
- Insights from Data mining of Airbnb Listings - Aug 4, 2017.
AirBnB has 2 million listings and operates in 65,000 cities. Here we look at insights into the vacation rental space in the sharing economy, using property listings data for Texas, US.
AirBnB, Data Mining, R, TX
- Deep Learning with R + Keras - Jun 27, 2017.
Keras has grown in popularity and is supported on a wide set of platforms including TensorFlow, CNTK, Apple’s CoreML, and Theano. It is becoming the de facto language for deep learning.
Deep Learning, Keras, Neural Networks, R
- New Leader, Trends, and Surprises in Analytics, Data Science, Machine Learning Software Poll - May 22, 2017.
Python caught up with R and (barely) overtook it; Deep Learning usage surges to 32%; RapidMiner remains top general Data Science platform; Five languages of Data Science.
Anaconda, Data Mining Software, Poll, Python, R, RapidMiner, Spark, TensorFlow
- The Guerrilla Guide to Machine Learning with R - May 11, 2017.
This post is a lean look at learning machine learning with R. It is a complete, if very short, course for the quick study hacker with no time (or patience) to spare.
Data Analysis, Machine Learning, R
- Building Regression Models in R using Support Vector Regression - Mar 8, 2017.
The article studies the advantage of Support Vector Regression (SVR) over Simple Linear Regression (SLR) models for predicting real values, using the same basic idea as Support Vector Machines (SVM) use for classification.
R, Regression, Support Vector Machines
- Gartner Data Science Platforms – A Deeper Look - Mar 3, 2017.
Thomas Dinsmore's critical examination of the Gartner 2017 MQ for Data Science Platforms, including vendors who are out, in, or have big changes, Hadoop and Spark integration, open source software, and what Data Scientists actually use.
Apache Spark, Data Science Platform, Gartner, IBM, Python, R, SAS, Thomas Dinsmore
- Moving from R to Python: The Libraries You Need to Know - Feb 24, 2017.
Are you considering making a move from R to Python? Here are the libraries you need to know, how they stack up to their R contemporaries, and why you should learn them.
Jupyter, Pandas, Programming, Python, R, scikit-learn, Yhat
- Top R Packages for Machine Learning - Feb 3, 2017.
What are the most popular ML packages? Let's look at a ranking based on package downloads and social website activity.
Machine Learning, R, R Packages
- Introduction to Forecasting with ARIMA in R - Jan 16, 2017.
ARIMA models are a popular and flexible class of forecasting models that use historical information to make predictions. In this tutorial, we walk through an example of examining a time series of demand at a bike-sharing service, fitting an ARIMA model, and creating a basic forecast.
ARIMA, Datascience.com, Forecasting, R, Stationarity, Time Series
- The Most Popular Language For Machine Learning and Data Science Is … - Jan 11, 2017.
When it comes to choosing a programming language for Data Analytics projects or job prospects, people have different opinions depending on their career backgrounds and the domains they have worked in. Here is an analysis of data from indeed.com on the choice of programming language for machine learning and data science.
Data Science, Machine Learning, Programming Languages, Python, R, Scala
- 50+ Data Science, Machine Learning Cheat Sheets, updated - Dec 14, 2016.
Get up to speed and keep concepts and commands handy for Data Science, Data Mining, and Machine Learning algorithms with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
Cheat Sheet, Data Science, Django, Hadoop, Java, Machine Learning, MATLAB, Python, R
- Introduction to Machine Learning for Developers - Nov 28, 2016.
Whether you are integrating a recommendation system into your app or building a chat bot, this guide will help you get started in understanding the basics of machine learning.
Beginners, Classification, Clustering, Machine Learning, Pandas, Python, R, scikit-learn, Software Developer
- Eight Things an R user Will Find Frustrating When Trying to Learn Python - Nov 2, 2016.
Are you an R user considering learning Python? Here's some insight into what you may be up against, and what, specifically, you may find frustrating. But don't worry, it's not all terrible.
Python, R
- Data Science 101: How to get good at R - Nov 1, 2016.
Everybody talks about R programming: how to learn it and how to get good at it. In this article, Ari Lamstein tells his story of why and how he started with R, along with how to publish, market, and monetise R projects.
Ari Lamstein, Beginners, Data Science, Monetizing, Programming, R
- Top 10 Data Science Videos on Youtube - Oct 17, 2016.
Learning and the future are the key topics in recent Youtube videos on Data Science. The main questions revolve around: “how to become a Data Scientist”, “what is a data scientist”, and “where data science is going”. But why is there so little explanation of data science for the masses?
Data Science, Data Scientist, DJ Patil, Online Education, R, Videolectures, Youtube
- The R Graph Gallery Data Visualization Collection - Oct 13, 2016.
Welcome to the R graph gallery, a collection of R graph examples, organized by chart type, searchable by R function, with reproducible code and explanation.
Art, Data Visualization, ggplot2, Graphics, R, Visualization
- Understanding the Empirical Law of Large Numbers and the Gambler’s Fallacy - Aug 12, 2016.
The law of large numbers is an important concept for practising data scientists. In this post, the empirical law of large numbers is demonstrated via a simple simulation approach using the Bernoulli process.
Algorithms, R, Statistics
- A Beginner’s Guide to Neural Networks with R! - Aug 11, 2016.
In this article we will learn how Neural Networks work and how to implement them with the R programming language! We will see how we can easily create Neural Networks with R and even visualize them. Basic understanding of R is necessary to understand this article.
Beginners, Neural Networks, R, Udemy
- Short course: Statistical Learning and Data Mining IV, Washington, DC, Oct 19-20 - Aug 8, 2016.
This new two-day course gives a detailed and modern overview of statistical models used by data scientists for prediction and inference, including sparse models and deep learning.
Data Mining, DC, R, Robert Tibshirani, Statistical Learning, Trevor Hastie, Washington
- SAS vs R vs Python: Which Tool Do Analytics Pros Prefer? - Jul 22, 2016.
There are lots of flame wars involving different data science and analytics tools... but this isn't one of them. Check out the quantitative results and analysis of a Burtch Works survey on the subject.
Burtch Works, Python, R, SAS, Survey
- Interview: Florian Douetteau, Dataiku Founder, on Empowering Data Scientists - Jul 7, 2016.
Here is an interview with Florian Douetteau, founder of Dataiku, on how their tools empower data scientists, and how data science itself is evolving.
Ajay Ohri, API, Data Science Tools, Dataiku, Florian Douetteau, Python, R
- R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results - Jun 6, 2016.
R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R. RapidMiner remains the most popular general Data Science platform. Big Data tools are used by almost 40%, and Deep Learning usage doubles.
Data Mining Software, Data Science Platform, Poll, Python, Python vs R, R, RapidMiner, SQL