Is Data Science a Dying Career?
At the end of the day, the value a data scientist provides to an organization lies in their ability to apply data to real-world use cases.
Introduction
I recently read an article describing data science as an oversaturated field. The article predicted that ML engineers would replace data scientists in the coming years.
According to the author, most companies use data science to solve very similar business problems. Because of this, data scientists would rarely need to come up with novel methods of solving them.
The author went on to say that most data-driven organizations need only basic data science skills to solve their problems. That role could easily be filled by a machine learning engineer: a person with basic knowledge of data science algorithms who also knows how to deploy ML models.
I have read many similar articles in the past year.
Some of them state that the role of a data scientist will be replaced by tools like AutoML, while others refer to data science as a “dying field” that will soon be surpassed by roles like data engineering and ML operations.
As someone who works closely with different pillars of the data industry, I would like to provide my opinion on this topic, and answer questions along these lines:
- Is data science a dying career, and will there still be demand for it in the next few years?
- Will automated tools render data scientists jobless?
- Is data science oversaturated, and will the field be replaced by newer roles in the near future?
- Are data scientists profitable to organizations? How do they add value to businesses?
Are Data Scientists Needed?
The data science workflow within most organizations is pretty similar. Many companies hire data scientists to solve similar business problems. Most of the models built don’t require you to come up with novel solutions.
The approaches you will take to solve data-driven problems at these organizations have most likely been used before, and you can draw inspiration from the sea of resources available online.
Also, the rise of automated tools like AutoML and DataRobot has made predictive modelling even easier.
I use DataRobot for some business use cases, and it is a great tool. It iterates over many candidate values and chooses the best parameters for your model, so that you end up with the most accurate model it can find.
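To make the idea of iterating over candidate values concrete, here is a minimal sketch of a grid search in plain Python. It does not show how DataRobot works internally. The toy validation data, the `accuracy` and `grid_search` helpers, and the single `threshold` parameter are all assumptions made for illustration.

```python
# Toy validation set of (feature value, true label) pairs. Hypothetical data.
VALIDATION = [
    (0.1, 0), (0.25, 1), (0.3, 0), (0.4, 0),
    (0.55, 1), (0.7, 1), (0.8, 1), (0.9, 1),
]


def accuracy(threshold, rows):
    """Share of rows where the rule 'feature >= threshold' matches the label."""
    correct = sum(1 for x, y in rows if int(x >= threshold) == y)
    return correct / len(rows)


def grid_search(candidates, rows):
    """Score every candidate threshold and return (best_threshold, best_accuracy)."""
    scored = ((t, accuracy(t, rows)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# The "grid": a handful of candidate values for the one tunable parameter.
candidates = [0.2, 0.35, 0.5, 0.65, 0.8]
best_threshold, best_accuracy = grid_search(candidates, VALIDATION)
print(best_threshold, best_accuracy)
```

Automated tools search far larger spaces across many algorithms, but the loop is the same in spirit: try each candidate, score it on held-out data, keep the best.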
So if predictive modelling has become easier over time, why do companies still require data scientists? Why don’t they just use a combination of automated tools and ML engineers to manage their entire data science workflow?
The answer is simple:
Firstly, data science has never been about re-inventing the wheel or building highly complex algorithms.
The role of a data scientist is to add value to an organization with data. And in most companies, only a very small portion of this involves building ML algorithms.
Secondly, there will always be problems that cannot be solved by automated tools. These tools have a fixed set of algorithms you can pick from, and if you do find a problem that requires a combination of approaches to solve, you will need to do it manually.
Although this doesn’t happen often, it does happen, and as an organization you need to hire people skilled enough to handle it. Also, tools like DataRobot can’t do data pre-processing or any of the heavy lifting that comes before model building.
The Human Touch
As someone who has created data-driven solutions for startups and large companies alike, I can say that real-world work is very different from working with Kaggle datasets.
There is no fixed problem. Usually, you have a dataset, and you are given a business problem. It is up to you to figure out what to do with customer data to maximize sales for the company.
This means that it isn’t just technical or modelling skills that are required from a data scientist. You will need to connect the data with the problem at hand. You need to decide on external data sources that can improve your solution.
Data pre-processing is long and painstaking, not just because it requires strong programming skills, but also because you need to experiment with different variables and their relevance to the problem at hand.
You need to relate model accuracy to a metric like conversion rate.
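One way to tie a model to a business metric is to compare the conversion rate of the customers the model scores highest against the conversion rate of everyone. The sketch below assumes a hypothetical list of (model score, converted) pairs from a past campaign; the data and the helper names `conversion_rate` and `top_fraction` are invented for illustration.

```python
# Hypothetical campaign results: (model score, did the customer convert?).
CUSTOMERS = [
    (0.92, True), (0.85, True), (0.77, False), (0.64, True), (0.51, False),
    (0.43, False), (0.38, True), (0.25, False), (0.17, False), (0.08, False),
]


def conversion_rate(rows):
    """Fraction of customers in rows who converted."""
    return sum(1 for _, converted in rows if converted) / len(rows)


def top_fraction(rows, fraction):
    """Return the highest-scored share of customers, e.g. 0.3 for the top 30%."""
    ranked = sorted(rows, key=lambda row: row[0], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    return ranked[:cutoff]


baseline = conversion_rate(CUSTOMERS)                     # contact everyone
targeted = conversion_rate(top_fraction(CUSTOMERS, 0.3))  # contact only the top 30%
lift = targeted / baseline                                # how much targeting helps
print(f"baseline={baseline:.2f} targeted={targeted:.2f} lift={lift:.2f}")
```

With this toy data, contacting everyone converts 40% of customers, while contacting only the top 30% by score converts about 67%. A figure like that is usually easier for a business team to act on than an accuracy score.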
Model building isn’t always a part of this process. Sometimes, a simple calculation might suffice to perform a task like customer ranking. Only some problems require you to actually come up with a prediction.
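As an illustration of a simple calculation standing in for a model, customers can be ranked by total spend with a few lines of Python. The order history and the `rank_customers` helper below are hypothetical, and total spend is only one of many possible ranking criteria.

```python
# Hypothetical order history: customer id -> list of order amounts.
ORDERS = {
    "alice": [120.0, 80.0, 40.0],
    "bob": [300.0],
    "carol": [60.0, 60.0],
    "dave": [],
}


def rank_customers(orders):
    """Rank customers by total spend, highest first; ties broken by order count."""
    totals = [(name, sum(amounts), len(amounts)) for name, amounts in orders.items()]
    return sorted(totals, key=lambda row: (row[1], row[2]), reverse=True)


ranking = rank_customers(ORDERS)
for position, (name, total, count) in enumerate(ranking, start=1):
    print(position, name, total, count)
```

No training data or prediction is involved here, yet the output already answers the question of which customers to prioritize.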
At the end of the day, the value a data scientist provides to an organization lies in their ability to apply data to real-world use cases. Whether the task is building a segmentation model, building a recommendation system, or evaluating customer potential, there is no real benefit to an organization unless the results are interpretable.
As long as data scientists are able to solve problems with the help of data and bridge the gap between technical and business skills, the role will persist.
Natassha Selvaraj is a self-taught data scientist with a passion for writing. You can connect with her on LinkedIn.