# --- Setup (do not modify) ---
import sqlite3
import pandas as pd
# Helper: run a SQL query and return a DataFrame.
# Works without SQLAlchemy on all pandas versions.
def sql(query):
cursor.execute(query)
cols = [d[0] for d in cursor.description]
return pd.DataFrame(cursor.fetchall(), columns=cols)
# Create an in-memory SQLite database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.executescript("""
CREATE TABLE products (
product_id INTEGER PRIMARY KEY,
product_name TEXT,
category TEXT,
price REAL
);
CREATE TABLE orders (
order_id INTEGER PRIMARY KEY,
product_id INTEGER,
quantity INTEGER,
customer TEXT,
FOREIGN KEY (product_id)
REFERENCES products(product_id)
);
INSERT INTO products VALUES
(1, 'Laptop', 'Electronics', 1200.00),
(2, 'Mouse', 'Electronics', 25.00),
(3, 'Desk', 'Furniture', 450.00),
(4, 'Chair', 'Furniture', 320.00),
(5, 'Headset', 'Electronics', 80.00);
INSERT INTO orders VALUES
(1, 1, 2, 'Anna'),
(2, 2, 5, 'Ben'),
(3, 3, 1, 'Clara'),
(4, 1, 1, 'Daniel'),
(5, 4, 3, 'Anna'),
(6, 5, 2, 'Ben'),
(7, 2, 8, 'Clara'),
(8, 3, 2, 'Evi'),
(9, 5, 1, 'Anna');
""")
conn.commit()
print('Database ready. Good luck!')💻 Notebook Exercise: SQL Aggregate Functions
Analysing a Sales Database
Context:
You are analysing a simplified sales database for an online shop. The database contains two tables: products (product catalogue) and orders (customer orders).
Goal:
Apply the SQL aggregate functions COUNT, SUM, AVG, MIN, and MAX in combination with GROUP BY and HAVING to a real SQLite database running inside this notebook.
Duration: approx. 15–20 minutes
Prerequisites:
Basic Python and pandas (DataFrame);
SQL basics: SELECT, FROM, WHERE
Boot Camp To get back on track with SQL querys do the following exercises:
Database Schema
Table: products
| product_id | product_name | category | price |
|---|---|---|---|
| 1 | Laptop | Electronics | 1200.00 |
| 2 | Mouse | Electronics | 25.00 |
| 3 | Desk | Furniture | 450.00 |
| 4 | Chair | Furniture | 320.00 |
| 5 | Headset | Electronics | 80.00 |
Table: orders
| order_id | product_id | quantity | customer |
|---|---|---|---|
| 1 | 1 | 2 | Anna |
| 2 | 2 | 5 | Ben |
| 3 | 3 | 1 | Clara |
| 4 | 1 | 1 | Daniel |
| 5 | 4 | 3 | Anna |
| 6 | 5 | 2 | Ben |
| 7 | 2 | 8 | Clara |
| 8 | 3 | 2 | Evi |
| 9 | 5 | 1 | Anna |
Additional Training
For further aggregate functions training see SUM and COUNT on sqlzoo.net.
Reflection
Answer the following questions in this cell:
Why does
WHERE category = 'Electronics'beforeGROUP BYproduce a different result thanHAVING category = 'Electronics'afterGROUP BY?Your answer:
What is the difference between
COUNT(*)andCOUNT(product_id)? When could this make a difference?Your answer:
In which real-world professional situations would you use queries like these? Give two concrete examples.
Your answer: