How can a Fresher get into a Data Science career? (Part 2)

Published on 20 June 17

This post continues from Part 1, in which I began to share my thoughts on how to get into a career in Data Science, and then began with what might work for a graduate in computer science or any other branch of engineering.

I’ve met data science aspirants from these disciplines who are able to code in Python, R or one of the other popular languages used in this field, as well as those who are not able to code. As I mentioned earlier, being able to code is the starting point for them.

But what next? If you go online, there’s a bewildering range of courses of all kinds available, many of them seeming to be good ways to get introduced to data science. Some begin with machine learning theory, some begin with machine learning using a programming language, and some begin with the basics of the kind of mathematics underlying data science techniques. It can get quite confusing for the uninitiated, so here’s my two cents.

Remember that at this stage what you want to do is get on a path towards becoming a data scientist, not try to become one overnight. The theory behind machine learning, as well as the mathematics used in it, can get to pretty advanced levels, and you may not really need to know all of it to reach your first goal, which is to get an entry level job in data science.

For young aspirants, that first entry level job may be as a machine learning developer or as a data analyst, or maybe a role that combines some tasks from both. If your starting point is that you are an accomplished Python (or R, etc.) programmer, then the next step is to learn how to code machine learning algorithms and programs using that language.

Fortunately, many of the commonly used data pre-processing techniques and machine learning algorithms are already available in open source libraries such as scikit-learn. There are also libraries for other areas such as NLP and image processing. These implement the theory and algorithms very well, so most of the time you will not need to write the code for the most commonly used machine learning algorithms yourself. You just need to learn how to use these libraries. This is why, for an entry level job, you don’t really need to learn a whole lot of machine learning theory just yet. The decisions about which functions to use, what argument values to pass, and so on will very likely be made by a technical lead or a lead data scientist, so it may not be necessary for you to learn all the theory unless you are sure you can cope with it at this stage.
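To give a feel for what "learning how to use these libraries" can look like in practice, here is a minimal sketch in Python with scikit-learn. It is only an illustration under my own assumptions, not a recommended recipe: the bundled iris dataset, the choice of a scaler followed by logistic regression, the argument values, and the helper name train_and_score are all picked purely for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_and_score(test_size=0.25, random_state=0):
    """Hypothetical helper: fit a small classifier and return it with its test accuracy."""
    # Load a small dataset that ships with scikit-learn (no download needed).
    X, y = load_iris(return_X_y=True)

    # Hold back part of the data so the model is scored on rows it has not seen.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )

    # Pre-processing (feature scaling) and the learning algorithm both come
    # from the library; none of the underlying maths is written by hand.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
    model.fit(X_train, y_train)

    # score() reports the fraction of test rows classified correctly.
    return model, model.score(X_test, y_test)


if __name__ == "__main__":
    model, accuracy = train_and_score()
    print(f"Test accuracy: {accuracy:.2f}")
```

Notice that the whole program is a handful of library calls. Choices such as which model to use or what value to give max_iter are exactly the kind of decisions a lead data scientist would typically make on a real project.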

There are several courses that you could take to learn how to use one of the languages that you know to produce machine learning programs without having to learn everything about the science behind them.

One of the most popular is Machine Learning A-Z™: Hands-On Python & R in Data Science. This course covers the implementation of all the commonly used machine learning algorithms in both Python and R, and it’s entirely up to you if you want to learn both.

If you’re happy to just learn how to implement those programs in Python, there’s Python for Data Science and Machine Learning Bootcamp.

You could even do it the good old-fashioned way and just buy a book that covers the same content.

There are similar courses available online that cover equivalent ground for other languages as well, so just choose the one for the language you are comfortable with. It would be a good idea, of course, to browse quickly through some online job ads in your desired work location to find out what language most employers are using for their data science work.

That’s it for this post. In my next one I’ll talk about what comes next. Later on, I’ll also cover what I recommend as good paths for graduates in statistics/mathematics, and for those who come from non-STEM (science, technology, engineering and mathematics) backgrounds.
This blog is listed under Development & Implementations and Data & Information Management Community
