How PARTITION BY works in SQL? Best PARTITION BY examples

How PARTITION BY works in SQL? Best PARTITION BY examples

When working with SQL, window functions provide a powerful way to perform calculations across a set of table rows while maintaining individual row details. One key clause that enhances these functions is PARTITION BY. In this article, I’ll walk you through exactly how PARTITION BY works, why it’s useful, and provide the best examples for real-world applications.

Understanding PARTITION BY in SQL

The PARTITION BY clause groups rows into partitions so that window functions can be applied separately to each partition. Think of it as resetting an aggregate or analytic function within each group, without collapsing the result into a single row, like GROUP BY would.

Here’s the basic syntax:

SELECT column_name, 
       window_function() OVER (PARTITION BY column_name ORDER BY column_name)
FROM table_name;

Unlike GROUP BY, which collapses rows into one per group, PARTITION BY keeps all rows while applying the function to defined partitions.

Key Differences Between PARTITION BY and GROUP BY

Feature PARTITION BY GROUP BY
Result Set Keeps all rows Reduces rows per group
Function Use Works with window functions Works with aggregate functions
Ordering Allows ordering within partitions Does not support ordering within groups

Best Examples of PARTITION BY Usage

1. Ranking with ROW_NUMBER()

One useful application of PARTITION BY is generating row numbers within each group:

SELECT department, employee_name,
       ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;

This assigns a unique rank to each employee per department based on salary.

2. Calculating Running Totals

To compute cumulative totals within partitions, use SUM() as a window function:

SELECT customer_id, order_date, amount,
       SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total
FROM orders;

This returns a progressive sum of amounts for each customer.

3. Finding the First and Last Entry Per Partition

To retrieve the first or last value within each partition:

SELECT employee_id, department, hire_date,
       FIRST_VALUE(hire_date) OVER (PARTITION BY department ORDER BY hire_date ASC) AS first_hire,
       LAST_VALUE(hire_date) OVER (PARTITION BY department ORDER BY hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_hire
FROM employees;

This query fetches the earliest and latest hire dates per department.

4. Calculating Percentile Ranks

To determine the relative rank of a value within partitions:

SELECT student_id, subject, score,
       PERCENT_RANK() OVER (PARTITION BY subject ORDER BY score DESC) AS percentile_rank
FROM exam_results;

This helps analyze student performance distribution within subjects.

Common Mistakes and How to Avoid Them

  • Omitting ORDER BY in window functions: Without specifying order, functions like ROW_NUMBER() or RANK() may provide unpredictable results.
  • Expecting PARTITION BY to filter results: PARTITION BY creates logical partitions; it does not filter data. Use WHERE or FILTER for that.
  • Confusing PARTITION BY with GROUP BY: GROUP BY aggregates data, while PARTITION BY maintains original row details.

Conclusion

The PARTITION BY clause is a game-changer when working with window functions in SQL. Whether you need rankings, cumulative totals, or percentile calculations, understanding PARTITION BY will help you write more efficient and insightful queries. Applying the techniques shown in this guide will enable you to achieve sophisticated analytics directly in SQL.

 

Other interesting article:

How LAST_VALUE works in SQL? Best LAST_VALUE examples