Today, I will explain the step-by-step process to find awesome data sets for your data analytics portfolio.
Many beginner data analysts have trouble finding a usable data set for their portfolio. This guide will show you how to find a data set that works for you and that you can build into a project for your data analytics portfolio. By the end, you’ll have a list of data sets that you can use for projects without wasting any time.
Unfortunately, most aspiring data analysts don’t follow this approach. They don’t realize that some data sets are better suited to portfolio projects than others.
It’s not your fault. Most data sets aren’t very good for creating portfolio-worthy projects.
If you Google “free data sets” you will find millions of results. But this is the wrong approach for 3 reasons.
- Reason 1: Most data sets you find on sites like Kaggle are geared toward data science projects, not data analytics.
- Reason 2: There’s a lack of usable examples with these data sets. You can’t see what other people have created.
- Reason 3: They are boring and don’t solve real-world problems.
This guide will give you the exact steps so you never feel stuck again.
Here’s how, step by step:
Step 1: Find a data set that’s actually interesting to you
When a data set is interesting to you, you’ll actually be excited to build a project around it.
Here are two places to find awesome data sets:
Place #1: Real World Fake Data
🕵️‍♀️ Real World Fake Data has dozens of ready-to-use data sets covering HR, healthcare, finance, and more!
- Interested in HR? Use the Human Resources data set
- Does customer analytics sound interesting? Use this one → Financial Services Consumer Complaints
- How about marketing analytics? Check out the Social Media/Marketing data set
Place #2: Data Is Plural
👨‍🔬 Data Is Plural provides a mega spreadsheet with 1,500 curated data sets, updated weekly.
- Like art? Use this → The Museum of Modern Art (MoMA) Collection
- RateBeer Beer Ratings is a full data set of beer ratings. Cheers! 🍻
- Coffees of Twin Peaks is a personal project from Judit Bekker on all the coffees consumed in Twin Peaks.
Between these two websites, you will have plenty of variety to choose a data set that interests you and is perfect for your next data analytics portfolio project.
What’s helpful is that other people have already used these curated data sets, so you can draw inspiration from their work instead of getting stuck.
Step 2: Get inspired by what others have created
Most data analysts think they have to start from scratch when building a portfolio project.
Fortunately, you can look for inspiration from work that other people have done — assuming you follow the advice in this guide.
Note: it’s REALLY important that you don’t just copy another data analyst’s work. That’s stealing and never ends well. Ever. Get inspired and build your own thing.
Let’s take the Human Resources data set I mentioned above. Now that you know about it, you can start looking for inspiration.
Here is a data project that uses that exact data set:
Example 1: HR Diversity Dashboard (RWFD) by Zak Geis
Another example is the MoMA data set that I found on Data Is Plural. Here is a project built on it that uses Python for data cleanup instead of Tableau:
Example 1: Museum of Modern Art Collection by Marc Reid
By following these two steps, you will never run out of ideas — and inspiration — for your data analytics portfolio.
Step 3: Build your own, and add it to your portfolio
Now that you have a few data sets in hand and some examples for inspiration, it’s time to build these projects and add them to your portfolio.
This may seem daunting at first, but just follow these steps based on your interest:
- Data Visualization: Use Tableau to build out a dashboard, using the data sets and examples I’ve provided as inspiration.
- Data Cleansing: If you are more interested in data cleansing, use Python to process the data, using the MoMA example above as inspiration.
Yes, there are other tools out there to get the job done.
But Tableau and Python are two widely used industry standards. That helps, because other data people will be able to help you more if you use these two tools.
If you use something besides Tableau and Python, you may have trouble finding examples or finding people that can help you when you get stuck.
That’s why I recommend using Python and Tableau.
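If you take the data cleansing route, a first pass in Python might look something like the sketch below. It uses only the standard library, and the sample rows, the column names, and the clean_rows helper are all made up for illustration; they are not taken from any of the data sets mentioned in this guide. It trims stray whitespace, normalizes department names, skips rows with a missing salary, and drops repeated employee IDs.

```python
import csv
import io

# Made-up sample rows for illustration only; not from any real data set.
RAW_CSV = """employee_id,department,hire_date,salary
101, Sales ,2021-03-15,52000
102,marketing,2020-11-01,
101, Sales ,2021-03-15,52000
103,SALES,2019-07-22,61000
"""


def clean_rows(raw_text):
    """Return cleaned rows from CSV text (hypothetical columns)."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen_ids = set()
    cleaned = []
    for row in reader:
        # Trim stray whitespace around every value.
        row = {key: value.strip() for key, value in row.items()}
        # Skip rows where the salary is missing.
        if not row["salary"]:
            continue
        # Keep only the first row for each employee ID.
        if row["employee_id"] in seen_ids:
            continue
        seen_ids.add(row["employee_id"])
        # Normalize casing so " Sales " and "SALES" become "Sales".
        row["department"] = row["department"].title()
        # Store salary as a number so it can be summed or averaged later.
        row["salary"] = int(row["salary"])
        cleaned.append(row)
    return cleaned


cleaned = clean_rows(RAW_CSV)
for row in cleaned:
    print(row)
```

For a real project you would read the CSV file you downloaded instead of the inline sample text, and many analysts would reach for a library like pandas, but the cleaning steps follow the same idea.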
After you’ve built out your project, add each project to your portfolio.
Use a no-code tool like Webflow to build your portfolio website.
Sign-up takes just a minute and it’s completely free. You don’t have to learn coding, web development, or anything else. After all, you are trying to build your data analytics portfolio, not become a web developer or GitHub pro.
Don’t let these tech distractions trip you up! Once you finish each project, add it to your portfolio immediately using Webflow.
Follow these steps until you have 5-10 different projects in your portfolio.
Having an awesome portfolio will set you apart from the crowd of job searchers!