Skip to content

Python Power-Ups: Boost Your Sales Analysis in 3 Simple Steps (And Yes, the Sample Code is Included!)

Every time I speak to sales managers, they share the same complaint.

“I have so much sales data, but I don’t know what to do with it.”

Sound familiar?

Drowning in numbers:

  • Monthly targets
  • Lead conversion rates
  • Quarterly revenue figures

A relentless barrage of data. Impossible to keep up with.

And even if you somehow manage, interpreting that data to make meaningful decisions? A colossal task.

You’ve considered learning programming, and diving into Python, perhaps. But even the thought of setting up complicated software, dealing with installations, and debugging code is daunting.

It’s a complete nightmare. And it doesn’t feel worth all the hassle.

Never feel overwhelmed with data again

In this guide, I break down the entire process into 3 manageable steps.

The best part?

No installations.

No fuss.

Just pure, actionable steps.

Here’s what’s waiting for you:

  • Quickly generate insights from your sales data
  • Guide your sales team based on real numbers
  • Make proactive decisions to drive business goals

Sounds like a dream, right?

The D.A.T.A Framework

Over the past 15+ years of my data career, I’ve followed the same 4 steps for every new project. And I call it the D.A.T.A. Framework. Clever, I know. 😉

  • Define: Start by defining sales objectives and the key metrics that matter.
  • Acquire: Gather the data you need. This might involve exporting data from different systems or tools. You also want to gather requirements at this step.
  • Transform: Convert raw data into a format suitable for analysis. Clean up inconsistencies, handle missing values, and structure your data to make your life easy.
  • Analyze: Use statistical tools, charts, and graphs to study trends, compare sales figures, and deduce patterns that can inform business strategies.

Don’t overcomplicate it! Just keep that framework in mind (like I do) to keep things simple.

👉 btw if you want me to share more about this framework (and others that I use all the time) just let me know!

Okay, let’s apply the D.A.T.A. Framework to solve the sales manager’s problem with Python.

Remember: there are only two types of data professionals: action-takers and sideliners.

This is where you decide if you’re an action-taker or a sideliner.

Use my Global Superstore Sales Analysis with Python notebook so you can follow along. A free Google Colab account is required, but you can also download the ipynb file.

Step 1: Setup and Configuration

The first step is setting everything up.

Protip: Always ensure you have the latest version of libraries to avoid compatibility issues.

# Install the latest version of the plotly library
!pip install plotly --upgrade

# Import necessary libraries
import pandas as pd             # Library for data manipulation and analysis
import as px    # Plotly Express for simple syntax plotting
import as pio        # To access Plotly's input-output module 

# Set the default plotly template to 'seaborn' for consistent and visually appealing plots
px.defaults.template = "seaborn"

Step 2: Load the Data and Get a Preview

Next, let’s load the Superstore Sales dataset and get a preview. Understanding your data structure is the first step in any analysis. This is the Acquire step of the D.A.T.A Framework.

Rookie mistake: Not checking the first few rows of your dataset. Always inspect the initial rows to understand your data’s structure.

# Load the Superstore Sales dataset

# URL pointing to the raw data file on GitHub
url = ''
# Read the CSV data from the URL into a pandas DataFrame
data = pd.read_csv(url)

# Display the first few rows of the data to understand its structure and get a quick overview

Here’s what it should look like:

Using data.head() to view the first 5 rows of our sample data set.

Step 3: Analyze the Data

The final step for using Python to build a sales analysis is to actually create the analysis.

Now, if you’ve been following the D.A.T.A. Framework, this is the final two parts. I’ll show you how to transform the raw CSV data into something useable and then we’ll build out three different analyses, all in Python.

Analysis 1: Sales Analysis by Category

Visualizing total sales by category provides a high-level overview of where the majority of sales are coming from.

Proper labeling in your plots is essential. It makes your charts easily understandable to anyone viewing them.

# Create a bar chart using Plotly Express
fig =
    # Group data by 'Category', sum the values, then reset the index for plotting
    # Set 'Category' as the x-axis variable
    # Set 'Sales' as the y-axis variable
    # Provide a title for the chart
    title='Total Sales by Category',
    # Rename the 'Sales' label for clarity
    labels={'Sales': 'Total Sales ($)'},
    # Use the 'seaborn' theme for the chart
# Update the trace to display y-values on the bars and format them as currency
fig.update_traces(texttemplate='%{y:$,.0f}', textposition='outside')
# Display the generated chart

Here’s what you should see

Analysis 2: Monthly Sales Analysis

Breaking down sales on a monthly basis helps in understanding trends, seasonal variations, and anomalies.

Protip: Be sure to convert any date columns into a datetime datatype. Always ensure date-related operations are performed on columns of the correct datatype.

# Convert the 'Order Date' column from string format to datetime format
data['Order Date'] = pd.to_datetime(data['Order Date'])

# Extract the month and year from 'Order Date' and store it in a new column 'Order Month'
data['Order Month'] = data['Order Date'].dt.to_period('M').astype(str)

# Group the data by 'Order Month' and calculate the sum for each month, then reset the index for plotting
monthly_sales = data.groupby('Order Month').sum().reset_index()

# Create a line chart to visualize monthly sales over time
fig = px.line(monthly_sales, 
              x='Order Month',  # x-axis: Month and Year
              y='Sales',        # y-axis: Sales amount
              labels={'Sales': 'Monthly Sales ($)', 'Order Month': 'Month'},
              title='Monthly Sales Over Time')

# Update the y-axis to display values as currency (with $ prefix)

# Display the generated chart

Here’s the sales data by month so you can easily spot trends:

Analysis 3: Monthly Sales Over Time by Category

Okay, now we’re getting somewhere! Let’s combine the insights we’ve gotten from the first two analyses into one where we can look at monthly trends by sales category.

# Group the data by both 'Order Month' and 'Category', 
# calculate the sum for each grouping, then reset the index for plotting
monthly_sales_by_category = data.groupby(['Order Month', 'Category']).sum().reset_index()

# Create a line chart to visualize monthly sales over time, segmented by product category
fig = px.line(monthly_sales_by_category, 
              x='Order Month',  # x-axis: Month and Year
              y='Sales',        # y-axis: Sales amount
              color='Category', # Differentiate lines by product category
              labels={'Sales': 'Monthly Sales ($)', 'Order Month': 'Month', 'Category': 'Product Category'},
              title='Monthly Sales Over Time by Category')

# Update the y-axis to display values as currency (with $ prefix)

# Display the generated chart

And here’s what it should look like. Nice!!

Analysis 4: Profit vs Sales Analysis by Subcategory

Visualizing the relationship between profit and sales for each subcategory can reveal which products are the most lucrative.

More profit = more cash money!

A scatter plot is particularly useful for this type of analysis as it visually separates high-profit, high-sales products from the rest.

# Group the data by 'Sub-Category', 
# calculate the sum for each subcategory, then reset the index for plotting
subcategory_data = data.groupby('Sub-Category').sum().reset_index()

# Create a scatter plot to visualize the relationship between profit and sales for each subcategory
fig = px.scatter(subcategory_data, 
                 x='Sales',          # x-axis: Sales amount
                 y='Profit',         # y-axis: Profit amount
                 color='Sub-Category', # Differentiate points by subcategory using color
                 size='Sales',       # Vary point size based on sales amount
                 labels={'Sales': 'Total Sales ($)', 'Profit': 'Total Profit ($)'},
                 title='Profit vs Sales by Subcategory')

# Update the x and y axes to display values as currency (with $ prefix)
fig.update_layout(xaxis_tickprefix='$', yaxis_tickprefix='$')

# Display the generated scatter plot

And here’s the profit vs. sales scatter plot (created in Python and Plotly from CSV data!)

Great job!

If you worked through those steps, you are well on your way to analyzing all sorts of data with Python.

Here’s a quick recap of what you learned

  • Getting everything set up and installed properly using Google Colab
  • Loading sales data and previewing it quickly
  • Creating a category bar chart and monthly sales line chart
  • Combining categories and monthly sales to understand the data at a deeper level
  • Understanding profit vs sales by subcategory using a scatterplot

By following the D.A.T.A. Framework, we streamlined the process of deriving actionable insights from raw sales data.

Ideas for next steps

  1. Expand: What other data could you include to make your analysis even more useful? Product details? Marketing data? Sales rep data?
  2. Predict: Python has a ton of features to help you with predictive analytics use cases. This could be a game-changer in proactive decision-making.
  3. Enhance: Enhance data visualization with interactive dashboards. Tools like Dash by Plotly can be integrated with Python for this purpose.
  4. Iterate: Implement a system to act on the insights and continuously monitor the outcomes. This iterative approach ensures the sales strategy remains dynamic and responsive.

Leave a Reply

Your email address will not be published. Required fields are marked *