Fun fact: when I got back into the data industry after spending 2 years as an IT Director my skills were RUSTY!
Well, it wasn’t ACTUALLY fun for me. I had to relearn everything I’d forgotten. Plus pick up new skills (like Python) to solve new business problems like automating dozens of reports. I felt very stuck.
Up until 2 AM for many nights, banging my head on the keyboard.
It’s a common challenge.
But there’s a solution.
When you’re dealing with lots of data:
- Manual processing becomes tedious.
- Generating graphs and charts is time-consuming.
- Errors and missed insights are almost inevitable.
Wasting time troubleshooting without clear direction.
That’s where Python comes in. It’s a total game-changer for data analysis.
With Python, you can:
- Process data quickly and accurately
- Create data visualizations effortlessly
- Get everything automated so you never have to run a stupid report again in your entire life.
No more hours on spreadsheets.
This guide provides a straightforward approach to automating call center reports using Python. From data setup to visualization and Excel export, everything is covered. Of course, these steps can be applied to any type of business analytics problem, but I’m picking call center reports as an example.
Escape reporting hell.
In this guide, you will learn:
- Data import and exploration
- Creating key business metrics
- Data cleaning and preprocessing techniques
- Creating data visualizations with Seaborn
- Saving Excel reports using XlsxWriter
Before we get started, make sure you have Google Colab set up and ready to go. You can follow along using this Python Notebook as a template to guide you.
Alright, let’s automate!
Step 1: Import Necessary Libraries
First things first, let’s import the libraries we’ll use. Paste this code into a new Google Collab notebook cell and run it.
!pip install xlsxwriter import pandas as pd import seaborn as sns import matplotlib.pyplot as plt import xlsxwriter
Step 2: Load Your Dataset
After importing the libraries, it’s time to load your data. Use the following code snippet to read the CSV file into a DataFrame.
# Load your dataset and preview it data = pd.read_csv('https://query.data.world/s/wdetqq6qpsxda7g46asbdagtq5iua3?dws=00000') data.head()
And here’s what it should look like after you’ve completed Step 2
Here’s what it should look like after loading your dataset |
Step 3: Perform Initial Data Exploration
You’ve got your data. Great! But what does it look like? Get a feel for your dataset by running this code:
data.describe()
This is a neat Pandas function that gives you some basic statistics for numeric data in your data frame.
You can see that there are about 32,000 records in the data set and that there are CSAT Scores for only 12,000 of them. Interesting! That means that there are some records without CSAT Scores, which could be an interesting thing to explore more.
Step 4: Calculate Key Metrics
What’s the average CSAT score? Which channel gets the most calls?
Let’s create a few visualizations:
- A horizontal bar chart for the average CSAT score.
- A pie chart displaying the distribution of channels with a highlight on the most popular one.
# Compute the average CSAT score and identify the most popular channel avg_csat = data['csat_score'].mean() popular_channel = data['channel'].value_counts().idxmax() # Visualization 1: Bar chart for average CSAT score plt.figure(figsize=(10, 4)) sns.barplot(x=['Average CSAT Score'], y=[avg_csat]) plt.ylim(0, 10) # Assuming CSAT scores range from 0 to 10 plt.ylabel('Score') plt.title('Average CSAT Score') plt.show() # Visualization 2: Pie chart for channels with a highlight on the most popular channel_counts = data['channel'].value_counts() explode = [0.1 if channel == popular_channel else 0 for channel in channel_counts.index] plt.figure(figsize=(10, 6)) plt.pie(channel_counts, labels=channel_counts.index, autopct='%1.1f%%', startangle=140, explode=explode) plt.title('Channel Distribution with Highlight on Most Popular Channel') plt.show()
And here’s what that looks like:
Using Python to calculate and visualize some key call center metrics |
Step 5: Visualize Data
Okay, based on our initial analysis and review with the call center manager, showing the average CSAT score is great, but we want to show it by Call Center Location so that the manager can follow up with each team lead.
Here’s the code for that:
# Calculate the average CSAT score by call center avg_csat_by_center = data.groupby('call_center')['csat_score'].mean().sort_values(ascending=False) # Create the bar chart ax = sns.barplot(x=avg_csat_by_center.index, y=avg_csat_by_center.values) # Add labels for each bar, with some padding for better visibility for index, value in enumerate(avg_csat_by_center.values): # +0.05 adds a slight upward shift to the label for better clarity ax.text(index, value + 0.05, f'{value:.2f}', ha='center') # Add title and axis labels plt.title('Average CSAT Score by Call Center') plt.xlabel('Call Center') plt.ylabel('Average CSAT Score') # Move the legend to a position where it doesn't overlap, for instance upper left ax.legend(['Average CSAT Score'], loc='upper left') # Rotate x-axis labels for better visibility plt.xticks(rotation=45) # Display the chart with a modified y-limit for additional padding at the top plt.ylim(0, 10) # Save the figure as an image plt.savefig('hist_csat_score.png', bbox_inches='tight') # Display the chart plt.show()
Step 6: Save Reports to Excel
All these insights are great, but they’re better if you can share them. Let’s save your graphs and metrics to an Excel file.
# Create a Pandas Excel writer using XlsxWriter as the engine. writer = pd.ExcelWriter('call_center_report.xlsx', engine='xlsxwriter') # Convert the dataframe to an XlsxWriter Excel object. data.to_excel(writer, sheet_name='Report') # Insert the histogram into the worksheet. workbook = writer.book worksheet = writer.sheets['Report'] worksheet.insert_image('J2', 'hist_csat_score.png') # Save it all writer.save()
And here’s what the final Excel report looks like:
Great job!
From here, the sky’s the limit. Here are some ideas on where you could take the analysis so you never have to think about it again (while keeping the managers happy at the same time):
Idea 1: Integrate Real-Time Data Feeds
Instead of manually uploading and working with static CSV files, connect directly to the call center’s database. This will allow the report to be dynamically updated with real-time data, ensuring that the analysis is always up-to-date.
Idea 2: Implement Anomaly Detection
If historical call center data is available, use machine learning to detect unusual patterns or anomalies in call metrics. For instance, a sudden dip in CSAT scores or an unexpected spike in call durations could be automatically flagged for review.
Idea 3: Automate Report Generation and Distribution
Really automate the full report generation process end-to-end. This is the “golden state” where you are completely hands-off except for any maintenance. I ended up automating pretty much every single report I was responsible for, which allowed me to grow in my career.