Should I Learn Julia?
Do you think learning Julia is better for your data science career? Let’s find out.
Image by Author
Data science is dominated by the Python and R programming languages. Their popularity is due to simple syntax, large communities, and open-source contributors. Even on job boards, you will see recruiters looking for developers and data scientists who are good at Python, SQL, and R.
But is this likely to change soon? There is a new contender in town called Julia, which is built for high-performance scientific computing.
Position | Programming Language | Ratings |
25 | PL/SQL | 0.52% |
26 | Lisp | 0.44% |
27 | Julia | 0.43% |
28 | Kotlin | 0.43% |
29 | Scala | 0.42% |
Currently, Julia is ranked 27th on the TIOBE Index, but it has the attributes to become a top 10 general-purpose language and a top 5 language for data science. Julia’s first stable version (1.0) was released in 2018, and about 90% of the promises made by the founders have been delivered.
The four founders, Jeff Bezanson, Stefan Karpinski, Alan Edelman, and Viral B. Shah, announced at the outset that Julia would be:
- As fast as C
- As dynamic as Ruby
- Matlab-like in mathematical notation
- Lisp-like in macros
- As general as Python
- As statistically friendly as R
- As simple for string operations as Perl
- Hadoop-like in distribution
- Powerful for linear algebra
In short, you will be getting the best of all worlds.
“Python is not the future of machine learning” - Jeremy Howard
In a YouTube video, he talks about how frustrating Python is when dealing with large datasets and machine learning applications. To achieve high performance, we have to use libraries that are written in other languages and rely on various packages to implement parallel computing effectively. He also explains how Julia could overtake Python in a few years.
In this post, we will dive deep into Julia and compare it with Python. Furthermore, we will look at how Julia can help your career and at some of the best learning resources.
What is Julia?
Julia is a high-level, general-purpose language that lets researchers and scientists implement their ideas quickly and run them fast. It is designed for high-performance calculations.
According to Julia’s developers, “Julia was built for scientific computing, statistical analysis, machine learning, data mining, large-scale linear algebra, distributed and parallel computing.”
You can download the installer from the official site and install it in a few seconds. At the time of writing, the current stable release is 1.8.2. If you are switching from R or Matlab, the syntax and functions will feel natural.
Image by Author
You can install packages using `Pkg` as shown below, or by typing `]` in the REPL to enter package mode.
using Pkg
Pkg.add(["PyPlot", "CSV", "DataFrames"])
Just like pandas in Python, we can simply load a CSV file as a dataframe without doing extra work.
In the example, we import the CSV and DataFrames packages, then use `CSV.read` to read the CSV file directly into a data frame.
using CSV
using DataFrames
df_raw = CSV.read("Heart_Disease_Dataset.csv", DataFrame)
Note: You can also import packages using `import` syntax, similar to Python.
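To make the difference between `using` and `import` concrete, here is a minimal sketch. It assumes nothing beyond the `LinearAlgebra` standard library that ships with Julia; the vector `v` and the variable names are only for illustration.

```julia
# LinearAlgebra ships with Julia, so no installation is needed.
import LinearAlgebra            # names stay behind the module prefix
using LinearAlgebra: dot        # bring only `dot` into scope

v = [3.0, 4.0]

len = LinearAlgebra.norm(v)     # qualified call, available after `import`
d = dot(v, v)                   # unqualified call, available after `using ...: dot`

println("norm = $len, dot = $d")
```

With `import`, every call goes through the module name, much like `import numpy` in Python. With `using`, exported names can be called without a prefix.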
Julia vs. Python
In this section, we will compare the king of all programming languages (Python) with Julia.
1. Syntax
Julia is not a class-based object-oriented programming language; it organizes code around functions and multiple dispatch instead. It is a dynamically typed language that helps scientists and researchers write code in a simple way that reads closer to English and to mathematical notation. Julia code is often cleaner than Python, especially for scientific computations.
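Rather than attaching methods to classes, Julia selects which method of a function to run based on the types of its arguments (multiple dispatch). The sketch below illustrates the idea. The `Shape`, `Circle`, and `Rectangle` types and the `area` and `kind` functions are hypothetical names made up for this example.

```julia
# Hypothetical shape types, used only for illustration.
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Rectangle <: Shape
    width::Float64
    height::Float64
end

# One generic function with one method per concrete type.
area(c::Circle) = pi * c.radius^2
area(r::Rectangle) = r.width * r.height

# Dispatch can depend on the types of several arguments at once.
kind(a::Shape, b::Shape) = "two shapes"
kind(a::Circle, b::Circle) = "two circles"

shapes = [Circle(1.0), Rectangle(2.0, 3.0)]
total = sum(area, shapes)   # calls the matching `area` method for each element

println("total area = $total")
println(kind(Circle(1.0), Circle(2.0)))
println(kind(Circle(1.0), Rectangle(1.0, 1.0)))
```

The structs hold only data, and behavior lives in ordinary functions. This is the main difference from a Python class hierarchy.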
You can write polynomials just like in Matlab. In the example below, we plot a polynomial as a solid red line and its logarithm as a dotted blue line.
We are using PyPlot to create the graphs; it is a Julia interface to Python’s Matplotlib.
using PyPlot
PyPlot.svg(true)
x = range(-3, stop=3, length=20)
y = 2*x.^2 .+ 0.7
figure(figsize=(6, 4.3))   # width and height, in inches
plot(x, y, linestyle="-", color="r", linewidth=1.0)
plot(x, log.(y), linestyle=":", color="b", linewidth=3.0)
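The dots in `x.^2`, `.+`, and `log.(y)` are Julia's broadcasting syntax, which applies an operation to every element. Here is a plotting-free sketch of the same polynomial that needs no packages; the function name `f` and the five sample points are assumptions chosen for illustration.

```julia
# Scalar definition: `2x^2` is Julia shorthand for `2 * x^2`.
f(x) = 2x^2 + 0.7

xs = range(-3, stop=3, length=5)   # five evenly spaced points from -3 to 3

ys = f.(xs)          # the dot applies f to every element of xs
log_ys = log.(ys)    # element-wise natural logarithm

for (x, y) in zip(xs, ys)
    println("f($x) = $y")
end
```

Writing the formula once for a scalar and broadcasting it with `f.(xs)` keeps the code close to the mathematical notation.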
Much like Python’s one-line conditional expressions, Julia lets us reduce a simple if statement to a single line by using short-circuit evaluation. Here is the standard form first:
x = 80; y = 43
if x > y
    println("$x is greater than $y")
end
>>> 80 is greater than 43
The short-circuit form with `&&` is much more compact:
x > y && println("$x is greater than $y")
>>> 80 is greater than 43
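Julia also has a true ternary operator, `condition ? a : b`, for cases that need an else branch. A short sketch follows; the helper name `compare` is hypothetical.

```julia
# Hypothetical helper: returns a message instead of printing it.
compare(x, y) = x > y ? "$x is greater than $y" : "$x is not greater than $y"

println(compare(80, 43))
println(compare(10, 43))

# `||` is the mirror image of `&&`: the right-hand side runs
# only when the left-hand side is false.
x = 10; y = 43
x > y || println("$x is not greater than $y")
```

Short-circuit `&&` and `||` work well for one-sided checks, while the ternary form is clearer when both outcomes produce a value.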
2. Performance
Benchmarks published for Julia show much shorter execution times than Python.
In one example, a coin toss was simulated 1 billion times and the results were stored in an array. Julia finished about 62 times faster than Python.
LANGUAGE | TIME (SECONDS) |
PYTHON | 1804 |
JULIA | 28.8 |
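For a sense of what such a simulation looks like, here is a minimal sketch. It is not the benchmark's actual code. The function name `toss_coins`, the seed, and the much smaller toss count are assumptions chosen so the example runs quickly.

```julia
using Random  # standard library, used here only to seed the generator

# Simulate `n` fair coin tosses; `true` stands for heads.
function toss_coins(n::Integer)
    return rand(Bool, n)
end

Random.seed!(42)     # arbitrary seed so repeated runs give the same tosses
n = 1_000_000        # far fewer than the 1 billion used in the benchmark
tosses = toss_coins(n)
heads = count(tosses)   # number of `true` entries

println("heads: $heads of $n ($(round(100 * heads / n, digits=2))%)")
```

To measure the run time on your own machine, wrap the call as `@time toss_coins(n)`. The first call includes compilation time, so time a second call for a fair reading.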
Julia comes with built-in support for parallel computing, and its execution speed is close to that of C.
3. Package Ecosystem
Julia has about 7,000 registered packages, and you can find all types of tools for data analytics, file handling, machine learning, scientific computation, and data engineering.
Even though the number of packages is far smaller than in Python, the main advantage of Julia packages is that most of them are written entirely in Julia.
You can even find Julia counterparts of your favorite Python libraries, such as FastAI.jl, which is built upon Flux.jl.
Image from FluxML/FastAI.jl
4. Community
Community support is the backbone of programming languages. A large and active community means more learning resources for problem-solving. As you already know, Julia is quite a new language, and it has a smaller but enthusiastic community. If you encounter an unusual issue, it will be harder to find a ready-made solution online than it would be with Python.
5. Parallelism
Both Julia and Python can run operations in parallel. Any demanding application needs to use all of the available resources (compute and memory). Julia comes with built-in parallel computing and better data management, whereas in Python you have to rely on various libraries to achieve high performance. In short, Julia’s parallelization syntax is less top-heavy than Python’s.
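As an illustration of the built-in support, the sketch below uses the `Threads.@threads` macro from Julia's standard `Base.Threads` module to split a loop across threads. The function name `parallel_squares` and the toy workload are hypothetical.

```julia
using Base.Threads  # built into Julia; no extra package needed

# Hypothetical workload for illustration: square each input in parallel.
function parallel_squares(xs::Vector{Float64})
    out = similar(xs)
    # Each iteration writes to its own slot, so there is no data race.
    @threads for i in eachindex(xs)
        out[i] = xs[i]^2
    end
    return out
end

xs = collect(1.0:10.0)
squares = parallel_squares(xs)

println("threads available: ", nthreads())
println(squares)
```

The loop only runs on several threads if Julia was started with more than one, for example with `julia --threads 4` or by setting the `JULIA_NUM_THREADS` environment variable. With a single thread, the same code still runs correctly.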
So, should you learn it?
The simple answer is “Yes,” but when to start depends on your career path. If you are a student aiming to work in a scientific field, I suggest you start learning Julia.
If you are starting your career in data science, I recommend Python. It is the industry standard, and companies will be more than happy to hire you.
Julia is also suitable for data scientists and engineers who are already experts in their field and want to future-proof their skills by learning a high-performance language.
The key reasons for learning Julia:
- It is a dynamically typed language which makes it easy to use.
- It is open source, and the code is available on GitHub.
- Julia allows you to call C and Fortran libraries directly, and Python libraries through packages such as PyCall.
- You get C-like performance with fewer lines of code.
- Julia can handle complex data analysis tasks easily.
- It has native machine learning and deep learning packages for larger and more complex models.
- Code conversion from Python and C is easy.
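The point about calling C directly can be sketched with Julia's built-in `ccall`. This example assumes a Unix-like system where the C standard library's `strlen` is visible to the Julia process. Calling Python requires an additional package and is not shown here.

```julia
# Call `strlen` from the C standard library without any wrapper code.
# Assumption: a Unix-like system where libc symbols are visible to Julia.
c_len = ccall(:strlen, Csize_t, (Cstring,), "Julia")

println("strlen(\"Julia\") = ", Int(c_len))
```

The arguments to `ccall` are the C function name, its return type, a tuple of argument types, and then the actual arguments.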
How do you Start Learning Julia?
If you are new to Julia, I highly recommend starting your journey with an Introduction to Julia course. It comes with interactive coding exercises and 16 videos covering the basic syntax, data structures, functions and packages, and DataFrames for data analysis tasks.
The learning process never ends, and you can learn more by reading books, cheat sheets, blogs, and tutorials. Furthermore, if you are already experienced in data science and only need to get used to the new syntax, you can start working on a data analytics project right away.
Additional resources:
- Explore the collection of Julia books on machine learning, data science, statistics, and many more.
- Free Julia tutorials on data science, scientific computation, and machine learning.
- Extensive playlists of tutorials and courses on YouTube.
- Fast track to Julia syntax and functionality by following the Julia Cheatsheet.
- Read community-contributed Medium blogs.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.