5 Things That Set a Data Scientist Apart From Other Professions
Here are five things that help set the data scientist apart from other professions.
Photo by pressfoto on www.freepik.com
I recently wrote an article titled Data Scientist, Data Engineer & Other Data Careers, Explained, in which I did my best to concisely define and differentiate five popular data-related professions. Each of the professions received a very high-level, single-sentence summary in that article, and the data scientist, for reference, was described as follows:
The data scientist is concerned primarily with the data, the insights which can be extracted from it, and the stories that it can tell.
Along with the additional few paragraphs I wrote for each profession, I tried to come up with a single overarching differentiating feature for each. Together, the five features could be worked into a flowchart and used, perhaps, by an aspiring data professional to help decide which profession might be best for them.
I received some feedback from readers making it apparent that I put too much focus on predictive analytics as a defining feature of the data scientist profession, and that I leaned on this feature in a way that might make it seem as though data scientists do more predictive analytics than anything else — and that other data professionals might not do any of this at all.
This constructive criticism naturally got me thinking: what else is it that separates data scientists from other data professionals, specifically? There are lots of technical skills, and particular technical languages, systems, and tools, that are used by data scientists. There are also numerous soft skills that data scientists, like all sorts of other professionals, employ to excel at their careers. But what are some of the inherent characteristics of the successful data scientist, either those that come with the data scientist to the profession or those that can be developed after they reach the profession?
Here are five things that I came up with that, when taken as a whole, help set the data scientist apart from other professions.
Let's preface this by noting that all data scientist roles are different, but they all have some common connecting threads, and hopefully these points help connect some of these threads.
1. Predictive Analytics Mindset
The perceived focus of this feature is what I got some flak for. I am going to double down here, however, and say that the predictive analytics mindset is one of the major defining features of the data scientist, perhaps more so than any other. Is it the only defining feature? Of course not. Should it have been used in a flowchart to separate data scientist from all other occupations? In retrospect, no, probably not.
Do data scientists perform predictive analytics? Absolutely. Do non-data scientists also? Sure. However, if I were to put data scientist on one end of the predictive analytics see-saw, and <insert other data professional here> on the other end, I would expect the data scientist would always hit the ground.
But it's not just the application of predictive analytics in particular situations; it is a mindset. And it's not just an analytical mindset (minus the predictive), but one which is always thinking about how we might be able to leverage what we already know to find out what we don't yet know. This suggests that the predictive element is an integral part of the equation.
Data scientists don't solely have prediction in mind, but, in my view, working from within this mindset is one of the role-defining characteristics, and one that a lot of other professions, data-related or otherwise, do not share. Others that do share this characteristic likely place it further down the list of traits valued in the profession in question.
2. Curiosity
Looking to use what we do know to find out what we don't know is not enough, obviously. Data scientists must have a curiosity about them that other roles don't necessarily need to have (note that I didn't say that others absolutely do not have this curiosity). Curiosity is almost the flipside of the predictive analytics mindset: while the predictive analytics mindset is looking to solve for X with Y, curiosity will be determining what Y is in the first place.
- "How can we increase sales?"
- "Why is churn higher in some months than in others?"
- "Why does this need to be done like that?"
- "What happens if we do X to Y?"
- "How does X play into what happens over here?"
- "Have we tried...?"
- And so on...
A natural curiosity is required to be a useful data scientist, end of story. If you are the type of person to wake up in the morning and go through your entire day without giving much thought to the wonders of the universe — on any level — data science is not for you.
Before killing it, curiosity was responsible for the cat's very long and successful career as a data scientist.
3. Systems Thinking
Here's a hard-hitting piece of philosophy: the world is a complex place. Everything is connected in some way, well beyond the obvious, which leads to layer upon layer of real-world complexity. Complex systems interact with other complex systems to produce additional complex systems of their own, and so goes the universe. This game of complexity goes beyond just recognizing the big picture: where does this big picture fit into the bigger picture, and so on?
But this isn't just philosophical. Data scientists recognize this infinite, real-world web of complexity. They are interested in knowing as much as possible about relevant interactions, latent or otherwise, as they work through their problems. They look for situation-dependent known knowns, known unknowns, and unknown unknowns, understanding that any given change could have unintended consequences elsewhere.
It is the data scientist's job to know as much about their relevant systems as possible, and to leverage their curiosity and predictive analytics mindset to account for as much of these systems' operations and interactions as feasible, in order to keep them running smoothly even when being tweaked. If you aren't able to appreciate why no one person is able to fully explain how the economy works, data science is not for you.
4. Creativity
Now we've come to our requisite "thinking outside the box" characteristic. Don't we encourage everyone to do this to some degree? Of course we do. But I don't mean it the same way here.
Remember that data scientists don't work in a vacuum; we work with all types of different roles, and encounter all sorts of different domain experts in our journeys. These domain experts have particular ways of looking at their particular domains, even when thinking outside of the box. As a data scientist, with a unique set of skills and a particular type of mindset — which I am doing my best to describe in some fashion herein — you can approach problems from outside of the box in which domain experts reside. You can be the fresh set of eyes that looks at a problem in a new light — providing, of course, that you understand the problem well enough. Your creativity will help you conjure up fresh ideas and perspectives for doing so.
This isn't to diminish domain experts; in fact, it's the opposite. We data scientists are their support. By bringing a set of skills honed for what we do, we are (hopefully) able to offer a new perspective in our support role, one that helps domain experts excel at what they do. This new perspective will be driven by the creative thinking of the data scientist, a creativity that, when paired with curiosity, will lead to being able to ask questions and pursue answers.
Of course, we need the technical, statistical, and additional skills to be able to follow up on these questions, but these skills are useless if we don't have the creativity to think of interesting and non-obvious ways to be able to investigate and ultimately provide answers. This is why data scientists must inherently be creative.
5. Storytelling Sensibilities
Everyone needs to be able to communicate effectively with others, regardless of their station in life. Data scientists are no different.
But even beyond that, data scientists often have to do some hand-holding when explaining their work to other stakeholders who may not be (and may have no desire to be) fully immersed in the Statistical Analysis Cinematic Universe™. A data scientist must be able to narrate someone from point A to point B, even if that someone has little idea of what, exactly, either of those points is. Put bluntly, storytelling is being able to weave a realistic narrative from some data and your analytical process: how we got from this to this.
This doesn't end at simply stating facts; the data scientist has to see where the stakeholder fits into the equation and make the narrative journey relevant — perhaps with useful visuals or other props to help close the proverbial deal.
This storytelling is not like fictional storytelling; it's more like "fancy explaining," or providing an intuitive explanation tailored to the listener. You wouldn't tell a five-year-old a Stephen King story at bedtime, just like you wouldn't delve into a dry, long-winded narrative about supply chain metrics with someone working in research and development. Be aware of your audience.
This storytelling is also not persuasive in nature; it's explanatory. We are not data politicians; we are data scientists. Nothing good ever comes of scientists misrepresenting stats in order to bend others to their will. Leave that to the elected officials.
I hope this has helped paint a rich picture of what I believe to be important characteristics of a successful data scientist. I wish you well as you pursue your career.
Matthew Mayo (@mattmayo13) is a Data Scientist and the Editor-in-Chief of KDnuggets, the seminal online Data Science and Machine Learning resource. His interests lie in natural language processing, algorithm design and optimization, unsupervised learning, neural networks, and automated approaches to machine learning. Matthew holds a Master's degree in computer science and a graduate diploma in data mining. He can be reached at editor1 at kdnuggets[dot]com.