What skills do you need to be a data scientist?
I read an interesting data-driven approach to answering this question in the book Doing Data Science: Straight Talk from the Frontline.
In this post I summarize this self-assessment approach that you can use to evaluate your strengths as a data scientist and where you might fit into an amazing data science team.
You can use applied machine learning practitioner as a synonym for data scientist if you like.
Data Science Unicorns
Reviewing jobs for data scientists, the authors of Doing Data Science see that employers are looking for unicorns.
Job ads seek employees who do not exist, with strengths in computer science, statistics, communication, data visualization, and domain expertise all at once.
This is not surprising given how ill-defined the term “data scientist” is. Employers often don’t know what they need, or even what problems they need solved.
Cleverly, the authors make a list of the skills most commonly required of data scientists in job ads.
They use this list and suggest that you rank yourself on a relative scale (0-100) against each skill.
Finally, they suggest that you present the results as a bar graph or histogram.
A single person won’t have all the skills, but a well-designed data science team will.
The skills in this self-assessment are as follows:
- Computer science
- Math
- Statistics
- Machine learning
- Domain expertise
- Communication and presentation skills
- Data visualization
An example of a completed assessment for Rachel from page 11 of the book is as follows:
I think this is a useful tool to help you focus on your strengths and acknowledge the weaknesses that team members can help you cover.
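If you want to try the self-assessment yourself, here is a minimal Python sketch that turns a set of 0-100 scores into a plain-text bar graph. The skill names follow this post; the scores are invented placeholders, and the function name render_profile is just something I made up for illustration, not anything from the book.

```python
# Hypothetical self-assessment on the 0-100 scale. Replace with your own scores.
MY_SCORES = {
    "Computer science": 80,
    "Math": 50,
    "Statistics": 50,
    "Machine learning": 80,
    "Domain expertise": 40,
    "Communication and presentation skills": 70,
    "Data visualization": 30,
}


def render_profile(scores, width=50):
    """Return a text bar graph with one bar per skill, scaled from 0-100."""
    for skill, score in scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"score for {skill!r} must be in 0-100, got {score}")
    label_width = max(len(skill) for skill in scores)
    lines = []
    for skill, score in scores.items():
        # Integer arithmetic keeps bar lengths predictable (no float rounding).
        bar = "#" * (score * width // 100)
        lines.append(f"{skill:<{label_width}} | {bar} {score}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_profile(MY_SCORES))
```

The bars are only as meaningful as the scores you feed in, so treat the output as a conversation starter rather than a measurement.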
Ensemble of Skill Histograms
Good data science results require a team.
An individual may have a speciality and be generally weak in other areas. It is when individuals with diverse strengths are brought together into a team that you are able to do great data science.
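To make the team idea concrete, here is a rough Python sketch that combines several individual profiles and reports the strongest person for each skill. Everything in it is an assumption on my part: the team members and their scores are hypothetical, taking the best score per skill is just one simple way to define team coverage, and the threshold of 60 for flagging a gap is arbitrary.

```python
# Hypothetical team profiles on the 0-100 scale (made-up names and scores).
TEAM = {
    "Ana": {"Computer science": 85, "Statistics": 40, "Data visualization": 30},
    "Ben": {"Computer science": 35, "Statistics": 80, "Data visualization": 45},
    "Chen": {"Computer science": 50, "Statistics": 55, "Data visualization": 50},
}


def team_coverage(profiles):
    """Map each skill to (best score, member who holds it) across the team."""
    coverage = {}
    for member, scores in profiles.items():
        for skill, score in scores.items():
            if skill not in coverage or score > coverage[skill][0]:
                coverage[skill] = (score, member)
    return coverage


def gaps(coverage, threshold=60):
    """Return the skills where even the strongest member scores below threshold."""
    return sorted(skill for skill, (score, _) in coverage.items() if score < threshold)


if __name__ == "__main__":
    coverage = team_coverage(TEAM)
    for skill, (score, member) in coverage.items():
        print(f"{skill}: {member} ({score})")
    print("Gaps:", gaps(coverage))
```

A skill that shows up as a gap is a hint about who to recruit next, not a verdict on the people already on the team.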
The authors demonstrate this pictorially as follows (taken from page 12 of the book):
What is your strength from the 7 listed above?
Are you able to give yourself a subjective score between 0 and 100 on each of these skills?
See below for my attempt at a self-assessment.
It’s hard. I believe my strengths are perhaps in computer science, machine learning and communication. The graph above suggests that my visualization skills are not awesome.
I think it is very easy to inflate your skills. How good is good, and how do you compare one skill to another? Being good at discrete math in computer science does not help your math score if your calculus is rubbish. Stats is math, right? And so on. Nevertheless, you have to start somewhere.
The key learning here is to identify and double down on your strengths. You cannot master all the skills. Bring your strongest skill to the table.
Post your results below. I think it would be a fascinating way to group people together on small projects or Kaggle competitions.
Is there a skill missing from above?