Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis
Python continues to eat away at R, RapidMiner gains, SQL is steady, Tensorflow advances pulling along Keras, Hadoop drops, Data Science platforms consolidate, and more.
Regional Participation
The participation by region was:
- Europe, 37.5%
- US/Canada, 36.6%
- Asia, 11.7%
- Latin America, 6.6%
- Africa/Middle East, 4.5%
- Australia/NZ, 3.1%
Full Results and 3-year trends
The following table shows the poll results in detail.
KDnuggets 2018 Poll: What Analytics, Big Data, Data Science, Machine Learning software you used in the past 12 months for a real project?
Tool (number of voters in 2018) | % voters in 2018, % voters in 2017, % voters in 2016 |
Python (1347) | |
RapidMiner (1081) | |
R (996) | |
SQL (813) | |
Excel (803) | |
Anaconda (686) | |
Tensorflow (614) | |
Tableau (542) | |
scikit-learn (500) | |
Keras (456) | na |
Apache Spark (442) | |
Java (309) | |
Microsoft SQL Server (283) | |
PyCharm (276) | na |
Microsoft Power BI (257) | |
KNIME (252) | |
Spark SQL (240) | na, na |
Weka (233) | |
Hadoop: Open Source Tools (225) | |
SQL on Hadoop tools (209) | |
MATLAB (191) | |
Unix shell/awk/gawk (188) | |
Other free analytics/data mining tools (170) | |
IBM SPSS Statistics (164) | |
Other programming and data languages (142) | |
C/C++ (140) | |
PyTorch (132) | na |
Dataiku (130) | |
H2O.ai (126) | |
Scala (121) | |
Hadoop: Commercial Tools (116) | na |
Microsoft Azure Machine Learning (113) | |
SAS Base (112) | |
IBM SPSS Modeler (100) | |
Theano (100) | |
Other Deep Learning Tools (100) | |
SAS Enterprise Miner (89) | |
QlikView (89) | |
Orange (85) | |
Alteryx (83) | |
MLlib (77) | |
DeepLearning4J (69) | |
Amazon Machine Learning (67) | |
IBM Watson / Watson Analytics (64) | |
TIBCO Spotfire (63) | |
Microsoft Cognitive Toolkit (Prev. CNTK) (62) | |
Other paid analytics/data mining/data science software (50) | |
Gnu Octave (44) | |
Teradata (44) | na |
Microsoft Machine Learning Server (former R Server) (43) | na |
Rattle (41) | |
Minitab/Salford Systems (36) | |
JMP (35) | |
MicroStrategy (35) | |
Pentaho (33) | |
Mathematica (32) | |
Apache MXnet (31) | |
Stata (31) | |
Caffe (30) | |
IBM Cognos (30) | |
IBM Data Science Experience (29) | na |
SAP Analytics/Predictive Analytics (28) | |
Microsoft other ML/Data Science tools (27) | |
SAP HANA (27) | |
Solver (former XLMiner) (27) | |
DataRobot (26) | |
TIBCO Statistica (26) | |
Databricks Unified Analytics Platform (25) | na, na |
Caffe2 (24) | na, na |
TFLearn (23) | na, na |
Perl (21) | |
Oracle Advanced Analytics (21) | |
C4.5/C5.0/See5 (20) | |
Torch (20) | |
BigML (18) | |
Julia (14) | |
DataScience.com (12) | na |
BayesiaLab (12) | |
Vowpal Wabbit (9) | |
Lasagne (7) | na |
RapidInsight/Veera (7) | |
Angoss/Datawatch (6) | |
Lisp (6) | |
Clojure (4) | |
Domino Data Labs (3) | na |
F# (3) | |
Ontotext GraphDB (3) |
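For readers who want to work with the voter counts above, here is a minimal Python sketch that parses rows of the form "Tool (count) | ..." and compares two tools. The names ROW and parse_rows are hypothetical helpers introduced for illustration, and the sketch assumes the last parenthesized integer before the first "|" is the 2018 voter count, as in the table. Only relative counts are computed, since the total number of voters is not stated in this section.

```python
import re

# Assumption: each row looks like "Tool name (count) | ..." and the
# parenthesized integer directly before the first "|" is the 2018 voter count.
ROW = re.compile(r"^(?P<tool>.+?)\s*\((?P<votes>\d+)\)\s*\|")


def parse_rows(lines):
    """Return a dict mapping tool name to its 2018 voter count."""
    counts = {}
    for line in lines:
        match = ROW.match(line.strip())
        if match:  # skip headers and anything that is not a data row
            counts[match.group("tool")] = int(match.group("votes"))
    return counts


# A few rows copied from the table above.
rows = [
    "Python (1347) | |",
    "R (996) | |",
    "Keras (456) | na |",
    "Microsoft Cognitive Toolkit (Prev. CNTK) (62) | |",
]

counts = parse_rows(rows)
ratio = counts["Python"] / counts["R"]
print(f"Python voters per R voter: {ratio:.2f}")
```

The non-greedy tool pattern keeps parenthesized text that belongs to the name, such as "(Prev. CNTK)", and only treats the final integer in parentheses as the count.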
Here are the results of the previous KDnuggets Polls on Analytics, Data Mining, Data Science Software:
- New Leader, Trends, and Surprises in Analytics, Data Science, Machine Learning Software Poll, 2017
- R, Python Duel As Top Analytics, Data Science software, 2016
- R leads RapidMiner, Python catches up, Big Data tools grow, Spark ignites, 2015
- RapidMiner Continues To Lead, 2014
- RapidMiner and R vie for first place, 2013
- KDnuggets 2012 Poll: Analytics, Data mining, Big Data software used
- KDnuggets 2011 Poll: Data Mining/Analytic Tools Used
- KDnuggets 2010 Poll: Data Mining / Analytic Tools Used
- KDnuggets 2009 Poll: Data Mining Tools Used
- KDnuggets 2008 Poll: Data Mining Software Used
- KDnuggets 2007 Poll: Data Mining/Analytics Software Tools
Comments
Jean-Francois Puget, @JFPuget :
I am a bit disappointed that latest @kdnuggets poll does not include anyway to indicate use of XGBoost or other gradient boosted machines. This is missing a real trend in #MachineLearning.
Miyuru, WSO2 Stream Processor:
WSO2 Stream Processor is an open source, scalable, and feature-rich stream processing platform currently used by many enterprises worldwide. It can ingest data from Kafka, HTTP requests, and message brokers, and you can query data streams using the "Streaming SQL" language. With just two commodity servers it can provide high availability and handle 100K+ TPS throughput. It can scale up to millions of TPS on top of Kafka. WSO2 Stream Processor is built on the Siddhi library, which performs both Stream Processing and Complex Event Processing. See https://wso2.com/analytics, https://github.com/wso2/siddhi
Naveen Goud Bobburi,
RStudio, which I use daily, is also not in the list.