5 Free Books to Master Data Science

Want to break into data science? Check this list of free books for learning Python, statistics, linear algebra, machine learning and deep learning.



Figure: 5 Free Books to Master Data Science (illustration by author)

 

When you break into data science, you have a huge variety of resources at your fingertips, such as Udemy courses, YouTube videos, and articles. With so many options, you need a clear study plan to avoid feeling overwhelmed and losing motivation.

This article explores five free books that cover the basic concepts you should learn on your data science journey. Each book focuses on one of the following topics:

  • Python
  • Statistics
  • Linear Algebra
  • Machine Learning
  • Deep Learning 

 

A Whirlwind Tour of Python

 

Book link: A Whirlwind Tour of Python

If you are interested in starting to learn Python without taking too much time, this book can be a good match for you. It gives a very short overview of Python’s basic concepts. Together with the 100-page book, there is also a GitHub repository with exercises. 

In particular, you can quickly learn the principal data types of Python: integers, floating-point numbers, strings, Booleans, lists, tuples, dictionaries and sets. At the end of the book, there is a brief overview of four Python libraries: NumPy, Pandas, Matplotlib and SciPy.

It covers the following content:

  • Basic Syntax
  • Variables
  • Operators
  • Principal Data Types
  • For Loop
  • While Loop
  • Functions
  • If-elif-else
  • Quick Overview of Python Libraries
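
To give a feel for these topics, here is a minimal sketch that touches the principal data types, a function with if-elif-else, a for loop and a while loop. The variable names and values are made up for illustration and are not taken from the book.

```python
# Principal built-in data types (all values are illustrative)
count = 3                                    # integer
price = 2.5                                  # floating-point number
name = "data science"                        # string
is_ready = True                              # Boolean
scores = [70, 85, 90]                        # list: ordered and mutable
point = (1.0, 2.0)                           # tuple: ordered and immutable
ages = {"Ann": 31, "Bob": 27}                # dictionary: key -> value
unique_tags = {"python", "stats", "python"}  # set: duplicates are dropped


def describe(score):
    """Return a label for a score using if-elif-else."""
    if score >= 85:
        return "high"
    elif score >= 75:
        return "medium"
    else:
        return "low"


# For loop: label every score in the list
labels = []
for score in scores:
    labels.append(describe(score))

# While loop: halve a value until it drops below 1, counting the steps
value = 10.0
steps = 0
while value >= 1:
    value /= 2
    steps += 1

print(labels, steps, len(unique_tags))
```

Running a few lines like these in an interactive session is a quick way to check your understanding before moving on to the book's exercises.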

 

Think Stats: Probability and Statistics for Programmers

 

Book link: Think Stats: Probability and Statistics

It can be hard to acquire a good knowledge of probability and statistics without putting into practice what you study. The beauty of this book is that it focuses on a few basic concepts and pairs the theory with practical exercises written in Python.

The book covers:

  • Summary Statistics
  • Data Distribution
  • Probability Distributions
  • Bayes’s Theorem
  • Central Limit Theorem
  • Hypothesis Testing
  • Estimation
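
As an example of putting these ideas into practice, the sketch below computes summary statistics and checks the central limit theorem by simulation, using only Python's standard library. The die-roll population, the sample size and the seed are assumptions chosen for illustration; this is not code from the book.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical, clearly non-normal population: rolls of a fair six-sided die
population = [random.randint(1, 6) for _ in range(10_000)]

# Summary statistics of the population
pop_mean = statistics.mean(population)
pop_std = statistics.pstdev(population)

# Central limit theorem: the means of many samples of size n cluster
# around the population mean, with a spread close to pop_std / sqrt(n)
n = 50
sample_means = [
    statistics.mean(random.choices(population, k=n))  # sample with replacement
    for _ in range(2_000)
]
mean_of_means = statistics.mean(sample_means)
std_of_means = statistics.stdev(sample_means)
expected_std = pop_std / n ** 0.5

print(f"population mean: {pop_mean:.3f}, mean of sample means: {mean_of_means:.3f}")
print(f"spread of sample means: {std_of_means:.3f}, expected: {expected_std:.3f}")
```

The two printed pairs should be close to each other, which is what the theorem predicts even though individual die rolls are far from normally distributed.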

 

Introduction to Linear Algebra for Applied Machine Learning with Python

 

Book link: Introduction to Linear Algebra for Applied Machine Learning

When you study linear algebra at university, professors often explain all the theory without showing any practical application. So you take the exam and then forget every concept, because it all felt too abstract.

Luckily, I found this book, which gives a good introduction to the linear algebra fundamentals that you'll meet when you study machine learning models. Every theoretical concept is followed by a practical example written with NumPy, a well-known Python library for scientific computing.

These are the main topics covered:

  • Vectors
  • Matrices
  • Projections
  • Determinant
  • Eigenvectors and Eigenvalues
  • Singular Value Decomposition  
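
The topics above map directly onto a handful of NumPy calls. The following is a minimal sketch, assuming NumPy is installed; the vectors and the 2x2 matrix are made-up examples, not taken from the book.

```python
import numpy as np

# Vectors, and the projection of a onto b
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])
proj_a_on_b = (a @ b) / (b @ b) * b

# A small symmetric matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Determinant
det_A = np.linalg.det(A)

# Eigenvalues and eigenvectors; eigh is meant for symmetric matrices
# and returns the eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Singular value decomposition: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A)
A_rebuilt = U @ np.diag(s) @ Vt

print("projection:", proj_a_on_b)
print("determinant:", det_A)
print("eigenvalues:", eigenvalues)
print("SVD rebuilds A:", np.allclose(A, A_rebuilt))
```

Each column of `eigenvectors` is an eigenvector, so multiplying `A` by a column should give the same column scaled by the matching eigenvalue.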

 

Introduction to Machine Learning with Python

 

Book link: Introduction to Machine Learning with Python

After studying Python, statistics and linear algebra, it's finally time to learn how machine learning models are used to solve real-world problems. The book is aimed at people who are getting started, and it uses scikit-learn for the machine learning applications.

These are the main machine learning models explained:

  • Linear Regression 
  • Naïve Bayes
  • Decision Trees 
  • Ensembles of Decision Trees
  • Support Vector Machines
  • Principal Component Analysis
  • t-SNE
  • K-Means Clustering
  • DBSCAN
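
To show what working with scikit-learn looks like, here is a minimal sketch that trains a decision tree and runs K-Means clustering on the Iris dataset bundled with the library. It assumes scikit-learn is installed; the dataset, the split and the hyperparameters are choices made for illustration, not the book's own examples.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to evaluate the supervised model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised learning: a shallow decision tree classifier
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
accuracy = tree.score(X_test, y_test)

# Unsupervised learning: K-Means groups the flowers without using the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

print(f"decision tree test accuracy: {accuracy:.2f}")
print("clusters found:", sorted(set(cluster_labels)))
```

Most scikit-learn models follow the same pattern of creating an estimator, calling `fit`, and then calling `predict` or `score`, which makes it easy to swap one model for another.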

 

Deep Learning with Python

 

Book link: Deep Learning with Python

This fifth and last book is written for people who already know how to program in Python; no prior experience with machine learning is required. The author is François Chollet, a software engineer and AI researcher at Google, famous for creating Keras, a deep learning library released in 2015. These are the most important topics:

  • Neural Networks
  • Convolutional Neural Networks
  • Recurrent Neural Networks
  • LSTM
  • Generative Adversarial Networks

 

Final thoughts

 

These suggestions are all great for beginners who want to break into the data science field. They can also be useful for data scientists and researchers who know they have gaps in some areas and want to strengthen their understanding. I hope you have enjoyed this list of books. Do you know other helpful books about data science? Drop them in the comments if you have suggestions.
 

Eugenia Anello is currently a research fellow at the Department of Information Engineering of the University of Padova, Italy. Her research project is focused on Continual Learning combined with Anomaly Detection.