What is the difference between a Data Analyst and a Data Scientist and what type of work do they do all day?
These and similar questions are answered in the new free ebook The Data Analytics Handbook: Data Analysts and Data Scientists.
The ebook was created by Brian Liou, Tristan Tao and Elizabeth Lin. Brian and Tristan are Computer Science + Statistics grads and run the blog statsguys. Although they have jobs, they took the initiative to interview data analysts and data scientists in industry and ask about their backgrounds, how they were hired and their daily routines.
Top 5 Takeaways
The book is beautifully put together by Elizabeth (the designer) and thoughtfully opens with the top 5 takeaway points from the interviews. In summary, these are:
- Communication skills are critical.
- Data collection and cleaning is the biggest challenge.
- Data analysts and data scientists are different.
- The industry is nascent.
- Curiosity trumps tech skills.
The book is 32 pages long and you will be able to read it very quickly. There are 7 interviews from 6 companies, as follows:
- Abraham Cabangbang from LinkedIn
- Josh Wills from Cloudera
- Ben Bregman from Facebook
- Leon Rudyak from Yelp
- Peter Harrington from HG Data
- John Yeung from Flurry
- Santiago Cortes from HG Data
I enjoyed Josh’s interview the most. His answers were clear and thoughtful and I took many notes.
He provided an excellent quote, reused in one of the book's takeaway points: a "data scientist is better at statistics than a software engineer and better at software engineering than a statistician". This really gels with me. He elaborated that a Data Scientist is required when the scale of the data demands thinking about computational complexity, that is, when the data requires an engineering mindset to answer questions.
I have not heard the role of the Data Scientist defined so eloquently.
Generally, the interviews highlight the importance of SQL for accessing data and the use of a variety of tools (Excel, Tableau, MicroStrategy) and languages (R and Python) to get things done. PowerPoint slide decks were often mentioned as the medium for presenting findings, rather than technical dashboards or webpages.
I recommend downloading a copy of this free ebook if you are interested in getting insight into the tools and considerations of those on the front line.