A friend of mine recently asked me to share some of my experiences in making the transition from a biophysics Ph.D. student to data scientist. I realized there are probably a lot of people interested in making a similar transition who could benefit from my experience.
Goals
A year and a half before I finished my Ph.D., I started wondering how I was going to keep a roof over my head after graduation. I had two main goals:
- Stay in the Bay Area.
- Never touch another pipette again as long as I live.
It was at this time that I decided data science was where I would focus my future efforts. While many companies are hiring quantitative science Ph.D.s, I realized that if I wanted one of the best, most interesting data science jobs, I was going to have to put in a lot of time learning and practicing. Below are the things that I found improved my skills the most.
Establish a data science portfolio
- Have a personal website (e.g. frankcleary.com). I learned a lot about the internet and using remote Linux machines by building my personal website. It also helped give me motivation to work on other projects because I had a place to share them. My site now gets 10-20 users per day from searches, mostly landing on pages where I've posted tutorials.
- Do data-related projects in your spare time (and then post them on your website!). I've worked on a variety of fun side projects, ranging from interactive D3.js graphs to tutorials on matrix decomposition, that made me a better programmer and a better data scientist. Having these projects on your website also gives you a portfolio that will help you stand out by showing commitment to learning and (hopefully) skill.
Study books and videos to learn more about computer science and data science
- Check out my recommended books page for the best books I've read.
- Many great talks from conferences and meetups are available for free online. Besides searching directly on YouTube, other sites like pyvideo.org have great video libraries. Take a look at my recommended videos for some of my favorites.
- Take a free online class. I took Introduction to Databases with Prof. Jennifer Widom and Machine Learning with Prof. Andrew Ng. Both were engaging and covered essential knowledge for a data scientist, which comes in handy both on the job and when interviewing for data scientist positions. Introduction to Databases was more polished and more rigorous.
Do an internship
- A summer internship is a great way to get firsthand experience doing data science in industry. I did one the summer before I graduated and it was very valuable. For opportunities, look into career fairs on campus (how I found mine), job posting websites, and company websites directly.
Learn to use Git for version control
- Git is great when writing software solo, and essential when working on a team (which you will be in a job). The more you learn about Git, the more efficient your code development process will be.
- Create an account on GitHub and post your portfolio projects to it. Link to the account on your resume to make sure your projects are visible.