Choosing the right tech stack for Data Science

During my internship this summer, I spent an immense amount of time outside the office learning new technical skills and refining my old ones. As I finish up my marketing automation/analytics internship, I’ve had the privilege of working with a tenured marketing analyst who has a very strong technical background and has taught me things I never would have thought of.

When I started my journey into analytics and data science, I had a very small tech stack. That is not to say that knowing a million languages and tools will help, but the tools I had experience with had very little to do with analytics and data science. Every tech stack is different and valuable in its own way, but I wanted to provide some perspective on the different tech stacks I’ve seen from colleagues of mine, as well as my own!

Many of you will see that my tech stack is not the most advanced thing in the world. I am not an AI expert or machine learning guru. I am still learning, and will be for quite a while, even after I graduate from college. What I have learned, however, has helped me so far, so I hope to give some direction to anyone who is self-taught or is moving from a business discipline into analytics and data science. Let’s get into it!

Disclaimer: I am not a full-time data professional; I am simply relaying my experience from my summer internships and personal practice. This is not meant to tell you exactly what you need to know. Rather, it is meant to give you some practical advice from experiences I’ve had. For advanced technical information, please check out Towards Data Science. If you are interested in some advice to help you get started in analytics, keep reading!

I. Novice Tech Stack

Image for post

   1. Microsoft Excel

   2. Tableau

   3. SQL

This is all you need to get started! You don’t need to know Python or a NoSQL database right off the bat. Something I learned the hard way is that you need to know how to ask questions, how to look past the numbers, and how to think like a data professional before you start coding and modeling. Every great musician learns basic music theory before they learn their instrument, and in the same way, data professionals learn the trade before the tools. After I scaled back my many lessons in many different tools, I came back to these three.

I’m going to be honest with you all as well: these are probably the three most commonly used tools in entry-level analytics. Most of the professionals who do the modeling and machine learning at my internship are at least 7–10 years into their careers, and that work is their specialty. All the analysts and data scientists at my internship use these three tools in some capacity. You can do all your analysis, ETL, and database management with them. Are they the fastest or the most fun? Not always. Personally, I like SQL, but for the most part these tools aren’t as exciting as Python or other advanced tools. They are, however, a great way to get started learning basic concepts!
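To give a feel for the kind of question SQL answers, here is a minimal sketch that totals revenue per region. It is only an illustration: the `sales` table, its columns, and the sample rows are hypothetical and do not come from my internship. Python's built-in `sqlite3` module is used only so the query can run without installing a database.

```python
import sqlite3

# Hypothetical sample data: (region, product, revenue) rows invented for illustration.
SALES = [
    ("East", "Widget", 120.0),
    ("East", "Gadget", 80.0),
    ("West", "Widget", 200.0),
    ("West", "Gadget", 50.0),
    ("West", "Widget", 25.0),
]


def revenue_by_region(rows):
    """Load rows into an in-memory SQLite table and total revenue per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        # The SQL itself: group the rows by region and sum each group's revenue,
        # with the highest-earning region listed first.
        query = """
            SELECT region, SUM(revenue) AS total_revenue
            FROM sales
            GROUP BY region
            ORDER BY total_revenue DESC
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in revenue_by_region(SALES):
        print(region, total)
```

The same question could be answered with an Excel pivot table or a Tableau bar chart, which is why these three tools cover so much ground together.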

II. Beginner Tech Stack

Image for post

  1. Microsoft Excel

  2. Tableau

  3. SQL

  4. Python (Novice/Beginner)

  5. Power BI

After you spend some time getting to know the different concepts that go into analytics and data science, such as mathematics, statistics, workflows, and asking the right questions, you can start adding tools like Power BI and the fan favorite, Python. When I first started learning Python, a mentor of mine said, “Python is an incredibly powerful tool, but can also be incredibly overwhelming. For data science, learn the basics, and then start learning the different packages one at a time. Python makes sense when you see how different elements come together to make one great language.” So I picked up the basics of Python first, such as the syntax, loops, and data types. After that, I started working on Pandas, Matplotlib, and NumPy. These are the three packages I recommend learning first: they are simple to learn, powerful, and used in a lot of the work you will do.
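As a rough idea of what "the basics" look like in practice, here is a small sketch that uses only core Python: a dictionary, a loop, a condition, and one standard-library function. The page-view numbers and the `summarize` function are made up for illustration.

```python
from statistics import mean

# Hypothetical daily page views, invented for illustration.
page_views = {"Mon": 120, "Tue": 95, "Wed": 143, "Thu": 101, "Fri": 160}


def summarize(views):
    """Return the average view count and the days that beat it."""
    average = mean(views.values())
    above_average = []
    for day, count in views.items():  # loop over each (day, count) pair
        if count > average:
            above_average.append(day)
    return average, above_average


if __name__ == "__main__":
    average, best_days = summarize(page_views)
    print("Average views:", average)
    print("Above-average days:", best_days)
```

Once a loop like this feels natural, packages such as Pandas are easier to pick up, because they wrap the same ideas (filtering, grouping, averaging) in shorter calls.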

III. Intermediate/Advanced Tech Stack

Image for post

    1. Microsoft Excel

    2. Tableau

    3. SQL

    4. Python (Intermediate/Advanced)

    5. Power BI

    6. Snowflake

    7. Spark

    8. NoSQL

I would like to start with a word of caution: you do not need to know every one of these tools! Excel, Tableau/Power BI, Python, and SQL are the main tools for analysts and data scientists, and they are part of my own tech stack. I would not add anything else to your stack until you have solid knowledge of and experience with those tools. Once you feel you have a grasp on them, that is a good time to add something else, depending on what you think you’ll need. Are you going to be working in databases? A NoSQL tool like MongoDB would be a good place to start. Data warehousing? Snowflake is a good choice. ETL or machine learning? Spark is a popular choice. Personally, I will be diving into Snowflake a little more because I know I will be working with data warehousing at the start of my career. Notice that I didn’t say I’m learning all three new tools, just one. As you find use cases for different tools, that is when you should start picking more up. “If you don’t use it, you lose it” is a real thing, so don’t fall into the trap of chasing the biggest tech stack! Companies would much rather you have three tools that you know really well than ten that you know nothing about.

IV. My Personal Tech Stack

Image for post

As you can see, I’m not a machine learning expert or database guru. I have some tools in my tool belt, but I still have so much to learn. I put this here, however, to show you that once you get started, you’ll get on a roll and add things as you go. I didn’t learn these tools just for the sake of learning them, though. I learned each of them because I was either going to be in a situation where that tool was needed, or I was in a situation where that tool was the only thing I could get. Don’t get bogged down in the different tools you want to learn. Learn the process and craft of analytics and data science, because that’s what it’s all about!

Conclusion

Hopefully this was helpful in some way! This was not meant to tell you exactly what you need to learn. Rather, it was meant to give you a map you can use to chart your own journey. Take it slow, learn one tool at a time, and pick up different tools as you need them. I promise that if you go slow, the process will be enjoyable and fulfilling. Slow and steady wins the race! Be sure to practice your skills on personal projects and coding challenges so you can solidify what you learn. So take a deep breath, fire up your web browser, and get to learning! There’s so much we don’t know about this world, especially during COVID-19, and the world needs the help of analysts and data scientists to understand and make sense of the different problems facing us. The images above should be downloadable, so feel free to download them as a guide when deciding what you should learn next. Happy learning!

 

Except for the headline and featured image, this story has not been edited by Javelynn and is published from a syndicated feed. Originally published on https://towardsdatascience.com/three-tech-stacks-for-aspiring-analysts-5cde49a22337?source=rss—-7f60cf5620c9—4&gi=9e94376a0524
