
Working with SQL, you often need to transform rows of data into a more manageable format. One of the most powerful and often underutilized aggregate functions for this purpose is ARRAY_AGG
. It allows you to group multiple rows into a single array, making it a fantastic tool for data aggregation and manipulation.
What Is ARRAY_AGG in SQL?
Simply put, ARRAY_AGG
is an aggregate function that collects values from a group of rows into a single array. This function exists in many modern SQL-based databases, including PostgreSQL, MySQL (8.0+), and SQL Server (via XML or JSON methods).
How ARRAY_AGG Works in SQL?
The ARRAY_AGG
function takes multiple row results from a query and returns them as an array. This is particularly useful when you need to collect multiple rows under a single field, such as gathering all product names related to a specific category.
Basic Syntax of ARRAY_AGG
SELECT ARRAY_AGG(column_name)
FROM table_name
GROUP BY another_column;
This query will collect all values of column_name
into an array, grouping them based on another_column
.
Best ARRAY_AGG Examples
Let’s break down some practical examples where ARRAY_AGG
can simplify query results.
1. Grouping Data into an Array
Consider a students
table with the following structure:
student_id | name | class |
---|---|---|
1 | John | Math |
2 | Mike | Math |
3 | Anna | Science |
Using ARRAY_AGG
, we can group students by their class:
SELECT class, ARRAY_AGG(name) AS students
FROM students
GROUP BY class;
The result:
class | students |
---|---|
Math | {John, Mike} |
Science | {Anna} |
2. Sorting the Array with ORDER BY
In many SQL databases, you can sort the elements inside the array using ORDER BY
inside ARRAY_AGG
:
SELECT class, ARRAY_AGG(name ORDER BY name ASC) AS students
FROM students
GROUP BY class;
This ensures your array is sorted alphabetically.
3. Using ARRAY_AGG with DISTINCT
To remove duplicate values inside your array, use DISTINCT
:
SELECT class, ARRAY_AGG(DISTINCT name) AS students
FROM students
GROUP BY class;
4. Filtering an ARRAY_AGG Result
If you want to include only specific values in the output array, use a FILTER
clause:
SELECT class, ARRAY_AGG(name) FILTER (WHERE name LIKE 'A%') AS filtered_students
FROM students
GROUP BY class;
This query will only include students whose names start with “A”.
Performance Considerations
While ARRAY_AGG
is powerful, using it improperly on large datasets can lead to performance issues. Here are a few best practices:
- Use
DISTINCT
only when necessary to avoid unnecessary computations. - Ensure indexes exist on columns involved in the grouping.
- Consider using
LIMIT
if the array size can become excessively large.
Conclusion
Understanding ARRAY_AGG
in SQL opens up new ways to manipulate data efficiently. Whether you’re grouping textual values, ordering them, or applying filters, this function simplifies aggregation tasks and makes queries more readable.
Other interesting article:
How JSON_VALUE works in SQL? Best JSON_VALUE examples