Generalizing about any profession is difficult because so many factors come into play (academic versus corporate, large company versus start-up). Still, I find that the role of Data Scientist can only truly be described by breaking it into three very different roles. For lack of better terms, I will call them Academic/Research Data Scientist, Applied Data Scientist, and Data Analyst/Scientist. These roles require different skill sets and different mindsets, which I discuss below.
First off, I would like to put my bias out front: I am an Applied Data Scientist. That does not mean I feel my current role is superior to any other data scientist role out there. I also work for a very large company with lots of resources, so I want to point out that the roles I describe below can overlap depending on your working environment.
Academic/Research Data Scientist
These are the people who design new machine learning algorithms and push the boundaries of data science and AI. When you see advances in self-driving cars and computer vision, these are the people behind them. They work either in a university setting or as part of a research team at companies like Google, Facebook, and Tesla. Most either hold a PhD or are currently working on one. These data scientists actively design and conduct research experiments that are written up and published in scientific journals. They are the pure scientists of the data science world.
To be amongst this crowd of data scientists, you need to be at the top of your game with advanced mathematics and programming skills. If you really love diving deep into the field and can handle the often glacial pace research moves at, this could be the job for you. The biggest drawback is that these jobs are limited. There are only so many academic or industry research positions out there. Most data scientists today instead fall into the other two categories.
Applied Data Scientist
Applied Data Scientists are a bit more pragmatic. They work for a company that drives their goals, instead of being funded by research grants. They do not have the luxury of time often afforded to those working on grants, so the solutions they build need to go into production sooner rather than later. (To be fair to researchers, applied data scientists also don’t have to deal with the headaches surrounding grant proposals.)
Typically, data scientists working in industry (not part of a research team) are not out developing new algorithms or trying to push the limits of machine learning. Instead, they use tools created by others to explore data and derive meaning from it that can be acted upon. When the boss wants actionable data, the boss wants it now. Most applied data scientists keep a few algorithms on hand that they know work for certain scenarios, and they spend most of their time gathering, cleaning, and prepping the data to build out the models.
I kind of like this position. I view myself more as an applied scientist. Even while going through my PhD, I always leaned towards applied over theoretical work.
To work as an applied data scientist, a candidate should have a master’s degree or at least six years of industry experience. They should be inquisitive and honestly interested in the domain in which they work. I work in cyber security, and I have spent a lot of time researching and studying the field so I can identify opportunities to provide a data-driven solution. Candidates should also have a wide-ranging skill set beyond just ML. An applied data scientist should be well versed in techniques such as dashboard development, optimization modeling, forecasting, and simulation modeling.
Data Analyst/Scientist
This is probably the most common data science position right now. What this position calls for is basically a top-level analyst familiar with data science tools. I am not denigrating this position. What these data scientists do is every bit as challenging and important as the work of the other two roles I described above. They are expected to operate as data scientists while also handling analyst or business intelligence duties.
People in this position are not lesser than, or incapable of doing, the other roles described above. Instead, they are either part of an organization that does not have a fully developed data science program or they work outside the data science organization. These data scientists are often embedded in a department, providing data management and analysis expertise. Truly jacks of all trades, they often stand up and manage data warehouses or data marts for their department. They can be expected to handle reporting duties as well as machine learning model development.
To sum this all up, data science is a broad and still evolving field. Solid industry definitions are not commonplace, and titles often do not represent actual job duties. However, most data scientists can be grouped under one of the three roles I discussed above: Academic/Research Data Scientist, Applied Data Scientist, or Data Analyst/Scientist. No one role is inherently better or more important, but the differences between the roles can definitely attract different individuals.