
When working with SQL, window functions provide a powerful way to perform calculations across a set of table rows while maintaining individual row details. One key clause that enhances these functions is PARTITION BY
. In this article, I’ll walk you through exactly how PARTITION BY
works, why it’s useful, and provide the best examples for real-world applications.
Understanding PARTITION BY
in SQL
The PARTITION BY
clause groups rows into partitions so that window functions can be applied separately to each partition. Think of it as resetting an aggregate or analytic function within each group, without collapsing the result into a single row, like GROUP BY
would.
Here’s the basic syntax:
SELECT column_name,
window_function() OVER (PARTITION BY column_name ORDER BY column_name)
FROM table_name;
Unlike GROUP BY
, which collapses rows into one per group, PARTITION BY
keeps all rows while applying the function to defined partitions.
Key Differences Between PARTITION BY
and GROUP BY
Feature | PARTITION BY |
GROUP BY |
---|---|---|
Result Set | Keeps all rows | Reduces rows per group |
Function Use | Works with window functions | Works with aggregate functions |
Ordering | Allows ordering within partitions | Does not support ordering within groups |
Best Examples of PARTITION BY
Usage
1. Ranking with ROW_NUMBER()
One useful application of PARTITION BY
is generating row numbers within each group:
SELECT department, employee_name,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;
This assigns a unique rank to each employee per department based on salary.
2. Calculating Running Totals
To compute cumulative totals within partitions, use SUM()
as a window function:
SELECT customer_id, order_date, amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total
FROM orders;
This returns a progressive sum of amounts for each customer.
3. Finding the First and Last Entry Per Partition
To retrieve the first or last value within each partition:
SELECT employee_id, department, hire_date,
FIRST_VALUE(hire_date) OVER (PARTITION BY department ORDER BY hire_date ASC) AS first_hire,
LAST_VALUE(hire_date) OVER (PARTITION BY department ORDER BY hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_hire
FROM employees;
This query fetches the earliest and latest hire dates per department.
4. Calculating Percentile Ranks
To determine the relative rank of a value within partitions:
SELECT student_id, subject, score,
PERCENT_RANK() OVER (PARTITION BY subject ORDER BY score DESC) AS percentile_rank
FROM exam_results;
This helps analyze student performance distribution within subjects.
Common Mistakes and How to Avoid Them
- Omitting
ORDER BY
in window functions: Without specifying order, functions likeROW_NUMBER()
orRANK()
may provide unpredictable results. - Expecting
PARTITION BY
to filter results:PARTITION BY
creates logical partitions; it does not filter data. UseWHERE
orFILTER
for that. - Confusing
PARTITION BY
withGROUP BY
:GROUP BY
aggregates data, whilePARTITION BY
maintains original row details.
Conclusion
The PARTITION BY
clause is a game-changer when working with window functions in SQL. Whether you need rankings, cumulative totals, or percentile calculations, understanding PARTITION BY
will help you write more efficient and insightful queries. Applying the techniques shown in this guide will enable you to achieve sophisticated analytics directly in SQL.
Other interesting article:
How LAST_VALUE works in SQL? Best LAST_VALUE examples