Skip to content

How to Build Customer Segments From Scratch With CASE WHEN + GROUP BY

    Hello ,

    Your CEO asks a common question: “Who are our best customers?”

    If you pull a list sorted by total spend, you’ll get a spreadsheet with hundreds of rows. That’s not useful — no one acts on a list of 500 customers.

    What they actually need are segments: 3-5 groups with clear labels, clear revenue impact, and clear recommended actions for each. “Our 19 VIP customers generate 10.3% of total revenue” is something a team can work with.

    This is where CASE WHEN and GROUP BY combine. CASE WHEN creates the categories. GROUP BY summarizes them.

    The Business Problem

    Gabby from marketing has budget for one targeted campaign next quarter. She needs to know: which customer group offers the best return? She can’t market to everyone, so she needs clear segments with numbers.

    Step 1: Define Your Segments With CASE WHEN

    The first decision is the hardest: what thresholds define each segment?

    There’s no universal rule, but here’s a practical approach. Start with the data:

    -- Quick distribution of customer spending
    SELECT 
        MIN(total_spent) AS min_spent,
        ROUND(AVG(total_spent)::numeric, 2) AS avg_spent,
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_spent) AS median_spent,
        PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY total_spent) AS p90_spent,
        MAX(total_spent) AS max_spent
    FROM (
        SELECT 
            c.customer_id,
            COALESCE(SUM(p.amount), 0) AS total_spent
        FROM customers c
            LEFT JOIN bookings b ON c.customer_id = b.customer_id
                AND b.status IN ('completed', 'confirmed')
            LEFT JOIN payments p ON b.booking_id = p.booking_id
                AND p.payment_status = 'completed'
        GROUP BY c.customer_id
    ) customer_totals
    WHERE total_spent > 0;

    This tells you the range and distribution. Then you choose breakpoints that create meaningful, actionable groups — not arbitrary cuts.

    Step 2: Build and Summarize Segments

    -- Customer value segmentation
    SELECT 
        CASE 
            WHEN total_spent >= 10000 THEN 'VIP (10K+)'
            WHEN total_spent >= 5000 THEN 'High Value (5K-10K)'
            WHEN total_spent >= 2000 THEN 'Regular (2K-5K)'
            WHEN total_spent > 0 THEN 'Light (Under 2K)'
            ELSE 'No Purchases'
        END AS customer_segment,
        COUNT(*) AS customer_count,
        ROUND(AVG(total_spent)::numeric, 2) AS avg_segment_spend,
        ROUND(SUM(total_spent)::numeric, 2) AS segment_revenue,
        ROUND(
            SUM(total_spent)::numeric / (
                SELECT SUM(amount) FROM payments 
                WHERE payment_status = 'completed'
            ) * 100, 1
        ) AS pct_of_revenue
    FROM (
        SELECT 
            c.customer_id,
            COALESCE(SUM(p.amount), 0) AS total_spent
        FROM customers c
            LEFT JOIN bookings b ON c.customer_id = b.customer_id
                AND b.status IN ('completed', 'confirmed')
            LEFT JOIN payments p ON b.booking_id = p.booking_id
                AND p.payment_status = 'completed'
        GROUP BY c.customer_id
    ) customer_totals
    GROUP BY 
        CASE 
            WHEN total_spent >= 10000 THEN 'VIP (10K+)'
            WHEN total_spent >= 5000 THEN 'High Value (5K-10K)'
            WHEN total_spent >= 2000 THEN 'Regular (2K-5K)'
            WHEN total_spent > 0 THEN 'Light (Under 2K)'
            ELSE 'No Purchases'
        END
    ORDER BY avg_segment_spend DESC;

    Results:

    | Segment | Customers | Avg Spend | Revenue | % of Revenue |
    |———|———–|———–|———|————-|
    | VIP (10K+) | 19 | $12,448 | $236,517 | 10.3% |
    | High Value (5K-10K) | 98 | $6,881 | $674,299 | 29.3% |
    | Regular (2K-5K) | 179 | $3,322 | $594,591 | 25.8% |
    | Light (Under 2K) | 225 | $1,057 | $237,873 | 10.3% |
    | No Purchases | 479 | $0 | $0 | 0.0% |

    This is immediately actionable. Let’s read the story these numbers tell:

    19 VIP customers (1.9% of the customer base) drive 10.3% of revenue. These are your most valuable relationships — protect them at all costs.

    98 High Value customers drive nearly 30% of all revenue. This is your core business. Retention matters here more than anywhere else.

    479 customers have never purchased. That’s 48% of the customer base sitting at $0. Even converting 5% of this group into Light customers would add roughly $25K in revenue.

    How CASE WHEN + GROUP BY Work Together

    The key insight is understanding the division of labor:

    CASE WHEN creates the labels. It looks at each row and assigns it to a category:

    CASE 
        WHEN total_spent >= 10000 THEN 'VIP (10K+)'
        WHEN total_spent >= 5000 THEN 'High Value (5K-10K)'
        ...
    END

    GROUP BY collapses those categories into summary rows. It takes 1,000 individual customer records and produces 5 segment totals.

    The aggregation functions (COUNT, AVG, SUM) calculate metrics within each group.

    Without CASE WHEN, you’d have to create temporary tables or manual filters for each segment. Without GROUP BY, you’d have labeled rows but no summaries. Together, they turn raw data into a strategy document.

    Building Better Segments: Three Variations

    Variation 1: Multi-dimensional segmentation

    Instead of just spending, combine value with recency:

    CASE 
        WHEN total_spent >= 5000 AND last_booking >= CURRENT_DATE - INTERVAL '6 months' 
            THEN 'High Value - Active'
        WHEN total_spent >= 5000 AND last_booking < CURRENT_DATE - INTERVAL '6 months' 
            THEN 'High Value - At Risk'
        WHEN total_spent > 0 AND last_booking >= CURRENT_DATE - INTERVAL '6 months' 
            THEN 'Active - Growth Potential'
        WHEN total_spent > 0 
            THEN 'Dormant'
        ELSE 'Never Purchased'
    END AS customer_segment

    Now “High Value – At Risk” is a retention emergency, not just a line in a report.

    Variation 2: Expedition engagement segments

    CASE 
        WHEN expedition_count >= 5 THEN 'Power User (5+ trips)'
        WHEN expedition_count >= 3 THEN 'Regular (3-4 trips)'
        WHEN expedition_count >= 1 THEN 'Tried It (1-2 trips)'
        ELSE 'Never Booked'
    END AS engagement_tier

    This answers: “How deeply engaged are our customers?”

    Variation 3: Ticket size segments

    CASE 
        WHEN avg_booking_value >= 3000 THEN 'Premium Buyer'
        WHEN avg_booking_value >= 1500 THEN 'Standard Buyer'
        ELSE 'Budget Buyer'
    END AS buyer_type

    This tells marketing which pricing tier each customer gravitates toward.

    Answering Gabby’s Question

    Back to the original ask: where should Gabby focus her campaign budget?

    Option A: Convert “No Purchases” into Light customers.
    Targeting 479 dormant sign-ups with a first-time booking offer. Even 10% conversion at $1,000 avg = $47,900.

    Option B: Upgrade “Light” to “Regular.”
    225 customers averaging $1,057. Moving 20% to the $3,322 average = additional $102K.

    Option C: Retain “High Value” customers.
    98 customers averaging $6,881. Preventing 10% from churning preserves $67K.

    The data says Option B — upgrading Light to Regular — has the highest potential return. These are customers who already trust you enough to buy. Getting a second or third booking from them is easier than acquiring new customers or preventing churn.

    Common Mistakes to Avoid

    Mistake 1: Too many segments

    If you have 10 segments, you’ve created a spreadsheet, not a strategy. Aim for 3-5 groups with clear action items for each.

    Mistake 2: Using the same CASE WHEN in SELECT but not GROUP BY

    If your CASE WHEN appears in SELECT, the same expression must appear in GROUP BY. Otherwise you’ll get an error or unexpected results.

    Mistake 3: Overlapping conditions

    CASE WHEN evaluates top-to-bottom and stops at the first match. Put your most specific conditions first (highest thresholds), then work down. Overlapping conditions create misclassified rows.

    Try This at Your Job

    Pick your most important table (customers, orders, users) and build one segmentation:

    1. Choose a metric to segment by (revenue, activity, engagement)
    2. Define 3-5 meaningful breakpoints
    3. Use CASE WHEN + GROUP BY to get counts and totals for each
    4. Calculate the percentage of revenue each segment represents
    5. Write one recommended action per segment

    This is one of the most frequently requested analyses in any business. Once you build it, you’ll use it constantly.

    Until next time,
    Brian ([say hi on twitter!](https://twitter.com/briangraves))

    P.S. Customer segmentation is a core skill in Module 2 (The Marie Kondo Blueprint) and Module 6 (The So What Framework) of SQL for Business Impact. The course teaches you to build segments that drive decisions, not just organize data. Check it out at [sqlforbusinessimpact.com](https://sqlforbusinessimpact.com).

    P.P.S. What’s the most useful customer segment you’ve built at work? Hit reply — I’d love to hear how you’re using this pattern. I read every response.


    Want More SQL Insights Like This?

    Join thousands of analysts getting weekly tips on turning data into business impact.

    Subscribe to Analytics in Action →

    Leave a Reply

    Your email address will not be published. Required fields are marked *