- Managing Machine Learning Cycles: Five Learnings from comparing Data Science Experimentation/ Collaboration Tools - Jan 29, 2020.
Machine learning projects require handling different versions of data, source code, hyperparameters, and environment configuration. Numerous tools are on the market for managing this variety, and this review features important lessons learned from an ongoing evaluation of the current landscape.
Collaboration, Comet.ml, Data Operations, Data Workflow, DataOps, MLflow, MLOps, Pipeline, Reproducibility
- Reproducibility, Replicability, and Data Science - Nov 19, 2019.
As cornerstones of scientific processes, reproducibility and replicability ensure results can be verified and trusted. These two concepts are also crucial in data science, and as a data scientist, you must follow the same rigor and standards in your projects.
Best Practices, Data Science, Overfitting, Reproducibility, Trust, Validation
- Data Version Control: iterative machine learning - May 11, 2017.
ML modeling is an iterative process, and it is extremely important to keep track of all the steps and dependencies between code and data. A new open-source tool helps you do that.
CRISP-DM, DVC, GitHub, Machine Learning, Open Source, Reproducibility, Version Control
- What is Academic Torrents and Where is Data Sharing Going? - Oct 26, 2016.
Learn more about Academic Torrents, a platform for researchers to share data. It consists of a site where users can search for datasets and a BitTorrent backbone that makes sharing data scalable and fast.
Datasets, Reproducibility, Research
- Ten Simple Rules for Effective Statistical Practice: An Overview - Jun 23, 2016.
An overview of 10 simple rules to follow to ensure proper and effective statistical data analysis.
Advice, Data Quality, Noise, Replication, Reproducibility, Statistical Analysis
- We need a statistically rigorous and scientifically meaningful definition of replication - Oct 29, 2015.
Replication and confirmation are indispensable concepts that help define scientific facts. It seems that before continuing the debate over replication, we need a statistically meaningful definition of replication.
Replication, Reproducibility, Statistics
- The Elements of Data Analytic Style – checklist - Mar 4, 2015.
Jeff Leek's book "Elements of Data Analytic Style" had a rocket launch, thanks to the author's course on Coursera. The book includes a useful checklist that can guide beginning data analysts or serve as a basis for evaluating data analyses.
Book, Checklist, Data Analytics, Jeff Leek, Leanpub, Reproducibility