How to Break into Data Science in 2020


 

This year, I finished a bootcamp and immediately landed a data science job. If I had to go back and learn everything by myself, here’s how I would do it.

Nicole Janeway Bills

 

When it comes to learning data science online, these recos represent the best, mostly free resources I’ve run across in the three years I’ve been training on Python, analytics, and productionizing machine learning models. A data scientist should be a great programmer, a decent analyst, and a reasonably good engineer. You also need a rock-solid understanding of statistics; for that, there’s ESL (The Elements of Statistical Learning).

To learn everything else, let’s get started.


Disclaimer: this post is in no way sponsored, nor does it represent views of anyone but myself. Sling your arrows this way if these recommendations don’t work out for you.


Python has quickly grown into the lingua franca of machine learning. It’s outpaced R and offers abundant packages for scientific computing. A data scientist must be an adept Python programmer.

In addition to having good coding chops, it’s reasonable to expect a data scientist to possess some core analytics skills, including data visualization. I offer some tips on a popular third-party tool, Tableau, for drag-and-drop analytics. A data scientist should be comfortable communicating their insights, sometimes using visualizations.

Finally, to be truly full-stack, a data scientist should be comfortable with all the steps from prototyping to productionizing a model. I love this quote from a guide on operations: “Putting a model in production is the beginning of the model’s journey, not the end.” A data scientist should be aware of what it takes to productionize a model.


🐍 Learning Python

Develop muscle memory for coding

Codecademy is the first place I tell people to go in order to learn Python, command line, and Git before jumping into data science. The platform’s simple interface helps you practice coding until getting the computer to do what you want is no longer the hard part — this frees up brainspace to focus on the challenges associated with actual data science. (Not free — but worth it!)

Learn Python 3 | Codecademy

Learn the latest and greatest version of the most popular programming language in the world!

www.codecademy.com

 

Cover the essentials of Data Science

Flatiron School, the amazing institution where I did my Data Science Bootcamp, offers a totally free online mini-curriculum. Legitimately so helpful.

Learn Coding, Data Science, & Cybersecurity Analytics

Not at all. While making progress in our free courses is the best way to strengthen your application to our full-time…

flatironschool.com

 

Get acclimated to Machine Learning

Katie (of the inimitable, and now sadly defunct, Linear Digressions podcast) and Sebastian (of the victorious Stanley the self-driving car) offer a really friendly overview of Machine Learning.

Introduction to Machine Learning Course | Udacity

Machine Learning is a first-class ticket to the most exciting careers in data analysis today. As data sources…

www.udacity.com

 

Develop Computer Science fundamentals

Eric Grimson is kind of like your stern and brilliant uncle — the one who is harvesting pieces from a dozen no-longer-functioning laptops to build a quantum computer. His classic course is a great way to learn CS best practices while deepening your knowledge of Python.

 

Try a sample data science project

Here are several ideas to get you started:

Walkthrough: Mapping GIS Data in Python

Improve your understanding of geospatial information through GeoPandas DataFrames and Google Colab

towardsdatascience.com

 

Getting Started with Spotify’s API & Spotipy

A data scientist’s quick start guide to navigating Spotify’s Web API and accessing data using the Spotipy Python…

medium.com

 

12-Hour ML Challenge

How to build & deploy an ML app with Streamlit and DevOps tools

towardsdatascience.com

 

Learn more quickly by getting excited about data science

Pick something you’re passionate about and dive deep. To start, check out this list of general resources — blogs, YouTube channels, and podcasts. You can also follow me on LinkedIn and Twitter for real time updates on my favorite learning resources.

Resources to Supercharge your Data Science Learning in 2020

Advance your understanding of machine learning with this helpful collection of journals, videos, and lectures.

towardsdatascience.com

 

🎨 Learning Data Viz


Tableau worksheet with dimensions in blue and measures in green. Sidebar at far left shows out-of-the-box analytics tools for basic summary statistics. via Tableau.

When your data needs to get dressed up, Tableau is a fool-proof style service. It offers a sleek, drag-and-drop interface for data analytics with native integration to pull data from CSVs, JSON files, Google Sheets, SQL databases, and that back corner of the dryer where you’ve inevitably forgotten a sock.

Data is automatically separated into dimensions (qualitative) and measures (quantitative) — and presumed to be ready for chart-making. Of course, if there are still a few data cleaning steps to be undertaken, Tableau can handle the dirty laundry as well. For example, it supports re-formatting data types and pivoting data from wide to tall format.
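If you would rather do the wide-to-tall reshape before the data ever reaches Tableau, the idea can be sketched in a few lines of plain Python. The table, the column names, and the `wide_to_tall` helper below are all hypothetical, made up for illustration; in practice you would more likely reach for `pandas.melt`, which does the same job on a DataFrame.

```python
# Hypothetical wide-format table: one row per store, one column per month.
wide_rows = [
    {"store": "North", "jan": 120, "feb": 135, "mar": 150},
    {"store": "South", "jan": 90, "feb": 110, "mar": 95},
]


def wide_to_tall(rows, id_column, var_name="variable", value_name="value"):
    """Turn every non-id column into its own row (wide -> tall)."""
    tall = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on each output row instead
            tall.append(
                {id_column: row[id_column], var_name: column, value_name: value}
            )
    return tall


# Each (store, month) pair becomes one row, which is the shape
# chart tools generally expect for a measure like sales.
tall_rows = wide_to_tall(
    wide_rows, id_column="store", var_name="month", value_name="sales"
)
for r in tall_rows:
    print(r)
```

In the tall version, "month" behaves like a dimension and "sales" like a measure, which lines up with how Tableau splits fields.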

When ready to make a chart, simply ctrl+click the features of interest and choose an option from the “Show me” box of defaults. This simplicity of interaction enables even the most design-impaired data scientist to easily marshal data into a presentable format. Tableau will put your data into a suit and tie and send it to the boardroom.

Follow these tips to go from “good” to “great” in your data visualization abilities.

Gain inspiration from master chart-makers

Throughout my time as a business analyst at a Big Four firm, these three blogs were my go-tos for how to create a great looking, functional Tableau dashboard.

Tableau Blog

Regular dispatches from the Tableau Public Team.

public.tableau.com

 

Analytics Tips and Insights from Data Experts | Evolytics Blog

Evolytics shares how-to articles, analytics tips, expert advice, industry insights and news. Learn more about timely…

evolytics.com

 

Tableau Archives | InterWorks

Order up! We have another month’s worth of hot and fresh data resources ready for you. In this blog round up,…

interworks.com

 

Keep these 4 guidelines in mind

#1 — Sheets are the artist’s canvas and dashboards are the gallery wall. Sheets are for creating the artwork (ahem, charts), which you will then position onto a dashboard (using a tiled layout with containers — more on this in a second) along with any formatting elements.

#2 — To save yourself time, set Default Properties for dimensions and measures. This will provide a unified approach to color, number of decimal points, sort order, etc. and prevent you from having to fiddle with these settings each time you go to use a given field.

#3 — Along those lines, make use of the overarching Format Workbook and Format Dashboard options instead of one-off formatting tweaks.

#4 — Avoid putting floating objects into your dashboards. Dragging charts around becomes a headache once you have more than two or three to work with. You can make your legends floating objects, but otherwise stay away from this “long-cut.”

Instead, use the tiled layout, which forces objects to snap into place and automatically resizes if you change the size dimensions of your dashboard. Much faster and simpler in the long run.

Get started with your first dashboard

In summary, the Tableau platform is easier to use than finger paints. If you’re ready to get started, Tableau Public is the free version that will allow you to create publicly accessible visualizations (like this one I put together after web scraping some info on questionable exempted developments from the Washington DC Office of Zoning) and share them to the cloud.


Getting ready to present financials to the C-suite.

After investigating data from your local community, another good sample project is pulling your checking account data and pretending you’re presenting it to a CEO for analysis.

Read more about the difference between a data scientist and a data analyst:

What’s the Difference Between a Data Analyst, Data Scientist, and a Machine Learning Engineer?

Explore the distinction between these common job titles with the analogy of a track meet.

towardsdatascience.com

 

Now if you not-so-secretly love data viz and need to find more time to devote to putting your models into production (🙋‍♀️), let’s move on to…

Learning DevOps

Your machine learning model is only as good as its predictions and classifications on data in a real-world setting. Give your model a fighting chance by gaining at least a basic understanding of DevOps, the field responsible for integrating development and IT.

Reframe your thinking about what data science is or isn’t

In this brilliant article, hero of deep learning Andrej Karpathy argues that machine learning models are the new hotness in software — instead of following if-then rules, data is their codebase.

Software 2.0

I sometimes see people refer to neural networks as just “another tool in your machine learning toolbox”. They have some…

medium.com
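To make the "data is the codebase" idea concrete, here is a toy sketch. It is not from Karpathy's article and involves no neural network; a one-parameter threshold stands in for the model. The data, the spam framing, and the function names are all hypothetical.

```python
# Hypothetical labeled data: (message length in characters, is_spam).
examples = [(12, False), (18, False), (25, False), (40, True), (55, True), (61, True)]


def rule_based(length):
    """If-then style: a person picks the cutoff and hard-codes it."""
    return length > 100  # hand-chosen, hypothetical cutoff


def fit_threshold(data):
    """Data-driven style: try each observed length as a cutoff and
    keep the one that classifies the most examples correctly."""
    best_cutoff, best_correct = None, -1
    for cutoff in sorted(length for length, _ in data):
        correct = sum((length >= cutoff) == label for length, label in data)
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


cutoff = fit_threshold(examples)


def learned(length):
    """The 'program' here is just the cutoff found from the data."""
    return length >= cutoff


print("learned cutoff:", cutoff)
print("rule says spam?", rule_based(55), "| learned says spam?", learned(55))
```

Change the examples and the behavior of `learned` changes without anyone editing its logic, which is the point of the analogy. Real models simply have far more parameters than one cutoff.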

 

Get a sense for how this works in enterprise

This clever novel fictionalizes The DevOps Handbook and is surprisingly readable. (Not free — but if you buy a copy, give it to your coworker and hope they become super passionate about productionizing your models).

The Unicorn Project by Gene Kim (author of The Phoenix Project)

A story about rebel developers & business leaders racing against time to innovate, survive & thrive in a time of…

itrevolution.com

 

Introduce your machine learning model to the wild

Check out this article about how to use Streamlit for both deployment and data exploration. I’d be remiss if I didn’t also mention Docker and Kubernetes as enterprise-level tools for productionization.

The Most Useful ML Tools 2020

5 sets of tools every lazy full-stack data scientist should use

towardsdatascience.com

 

Other Useful Topics

Explain Computer Science Like I’m Five

Learn about the internet, programming, machine learning, and other computer science fundamentals through clear and…

medium.com

 

How to Ace the AWS Cloud Practitioner Certification with Minimal Effort

Forecast: cloudy with a 100% chance of passing on your first try.

medium.com

 

Comprehensive Guide to the Data Warehouse

Learn about the role of the data warehouse as the master store of analysis-ready datasets.

towardsdatascience.com

 
 

Except for the featured image, this story has not been edited by Javelynn and is published from a syndicated feed. Originally published on https://towardsdatascience.com/new-data-science-f4eeee38d8f6.
