💻 Notebook Exercise: SQL Aggregate Functions

Author

Jacques Mock Schindler

Published

18.03.2026

NoteHow to run this notebook locally
  1. create a folder
  2. create a python virtual environment
    python -m venv .venv
  3. activate the virutal environment
    .venv\Scripts\activate
  4. install jupyter and pandas in this virtual environment
    pip install jupyter pandas
  5. download this file to your newly created folder
  6. start the local jupyter server
    jupyter-lab

Analysing a Sales Database

Context:
You are analysing a simplified sales database for an online shop. The database contains two tables: products (product catalogue) and orders (customer orders).

Goal:
Apply the SQL aggregate functions COUNT, SUM, AVG, MIN, and MAX in combination with GROUP BY and HAVING to a real SQLite database running inside this notebook.

Duration: approx. 15–20 minutes

Prerequisites:
Basic Python and pandas (DataFrame);
SQL basics: SELECT, FROM, WHERE

Boot Camp To get back on track with SQL querys do the following exercises:

  1. Basics
  2. SELECT

Database Schema

Table: products

product_id product_name category price
1 Laptop Electronics 1200.00
2 Mouse Electronics 25.00
3 Desk Furniture 450.00
4 Chair Furniture 320.00
5 Headset Electronics 80.00

Table: orders

order_id product_id quantity customer
1 1 2 Anna
2 2 5 Ben
3 3 1 Clara
4 1 1 Daniel
5 4 3 Anna
6 5 2 Ben
7 2 8 Clara
8 3 2 Evi
9 5 1 Anna
# --- Setup (do not modify) ---
import sqlite3
import pandas as pd

# Helper: run a SQL query and return a DataFrame.
# Works without SQLAlchemy on all pandas versions.
def sql(query):
    cursor.execute(query)
    cols = [d[0] for d in cursor.description]
    return pd.DataFrame(cursor.fetchall(), columns=cols)

# Create an in-memory SQLite database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

cursor.executescript("""
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT,
    price        REAL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    product_id  INTEGER,
    quantity    INTEGER,
    customer    TEXT,
    FOREIGN KEY (product_id)
        REFERENCES products(product_id)
);
INSERT INTO products VALUES
    (1, 'Laptop',   'Electronics', 1200.00),
    (2, 'Mouse',    'Electronics',   25.00),
    (3, 'Desk',     'Furniture',    450.00),
    (4, 'Chair',    'Furniture',    320.00),
    (5, 'Headset',  'Electronics',   80.00);
INSERT INTO orders VALUES
    (1, 1, 2, 'Anna'),
    (2, 2, 5, 'Ben'),
    (3, 3, 1, 'Clara'),
    (4, 1, 1, 'Daniel'),
    (5, 4, 3, 'Anna'),
    (6, 5, 2, 'Ben'),
    (7, 2, 8, 'Clara'),
    (8, 3, 2, 'Evi'),
    (9, 5, 1, 'Anna');
""")
conn.commit()
print('Database ready. Good luck!')
# --- Exercises ---
# Use sql(""" ... """) to run your queries.
# Replace the placeholder comment with your SQL.

# TASK 1: How many orders are there in total?
# Use COUNT(*) on the orders table.
result1 = sql("""
-- your query here
""")
print('Task 1:\n', result1)
# TASK 2: What is the cheapest and most expensive
# product price? Use MIN and MAX.
result2 = sql("""
-- your query here              
""")
print('Task 2:\n', result2)
# TASK 3: How many orders and what total quantity
# were placed per category?
# Join orders and products, group by category.
result3 = sql("""
-- your query here
""")
print('Task 3:\n', result3)
# TASK 4: Show only categories where the total
# quantity ordered is greater than 5.
# Use HAVING (not WHERE!).
result4 = sql("""
-- your query here
""")
print('Task 4:\n', result4)

Additional Training

For further aggregate functions training see SUM and COUNT on sqlzoo.net.

Reflection

Answer the following questions in this cell:

  1. Why does WHERE category = 'Electronics' before GROUP BY produce a different result than HAVING category = 'Electronics' after GROUP BY?

    Your answer:

  2. What is the difference between COUNT(*) and COUNT(product_id)? When could this make a difference?

    Your answer:

  3. In which real-world professional situations would you use queries like these? Give two concrete examples.

    Your answer: