How does numpy.unique() work in Python? Best example

When working with arrays in Python, one of the most common tasks is identifying unique elements. Thankfully, NumPy provides an efficient and convenient way to achieve this with the numpy.unique() function. Let’s dive deep into how it works, explore its different parameters, and examine examples to better understand its capabilities.

Understanding numpy.unique()

At its core, numpy.unique() serves one primary function: returning the sorted unique elements of an array. However, it offers much more than just filtering duplicates. The function can also return the count of each value, the index of each value's first occurrence, and the inverse indices needed to reconstruct the original array.

Basic Usage

The simplest way to use numpy.unique() is by passing an array to it. It automatically removes duplicate values and sorts the output.

import numpy as np

arr = np.array([3, 1, 2, 3, 4, 1, 2, 5])
unique_elements = np.unique(arr)

print(unique_elements)  # Output: [1 2 3 4 5]

Notice that the returned array is sorted. Unlike regular Python sets, numpy.unique() guarantees a sorted output.

Exploring Additional Return Values

numpy.unique() can return more than just unique values. By passing specific parameters, we can retrieve additional metadata:

  • return_index=True – Returns the indices of the first occurrences of the unique values.
  • return_inverse=True – Returns indices that reconstruct the original array from the unique values.
  • return_counts=True – Returns the count of each unique element.

Example: Using All Return Values

# Reuses arr from the basic example: [3, 1, 2, 3, 4, 1, 2, 5]
unique_elements, indices, inverse_indices, counts = np.unique(
    arr, return_index=True, return_inverse=True, return_counts=True
)

print("Unique Elements:", unique_elements)
print("Indices:", indices)
print("Inverse Indices:", inverse_indices)
print("Counts:", counts)

The output gives us:

Return value      Result
Unique Elements   [1, 2, 3, 4, 5]
Indices           [1, 2, 0, 4, 7]
Inverse Indices   [2, 0, 1, 2, 3, 0, 1, 4]
Counts            [2, 2, 2, 1, 1]

Each row in the table represents the respective return value:

  • Unique Elements: The distinct values found in the original array.
  • Indices: The positions where these unique elements first appear in the input array.
  • Inverse Indices: A mapping that reconstructs the original array using the unique elements.
  • Counts: How many times each unique element appears in the array.
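The counts are often the most directly useful of these return values. As a small sketch (the names frequency and duplicates are illustrative, not part of NumPy), they can be paired with the unique values to build a frequency table or to pick out the values that occur more than once:

```python
import numpy as np

arr = np.array([3, 1, 2, 3, 4, 1, 2, 5])

# Unique values and how often each one occurs
values, counts = np.unique(arr, return_counts=True)

# Pair each value with its count (converted to plain Python ints)
frequency = {int(v): int(c) for v, c in zip(values, counts)}

# Boolean mask on counts selects the values that appear more than once
duplicates = values[counts > 1]

print(frequency)   # {1: 2, 2: 2, 3: 2, 4: 1, 5: 1}
print(duplicates)  # [1 2 3]
```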

Reconstructing the Original Array

With the inverse indices, we can recreate the original array:

reconstructed_array = unique_elements[inverse_indices]
print(reconstructed_array)  # Output: [3 1 2 3 4 1 2 5]

As expected, the reconstructed array matches the original input.
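The same inverse indices can double as integer codes for categorical data. The sketch below uses a made-up array of color labels (an assumption for illustration only) to show the round trip from labels to codes and back:

```python
import numpy as np

# Hypothetical category labels, invented for this example
labels = np.array(["red", "green", "blue", "green", "red"])

# categories holds the sorted unique labels; codes maps each
# original entry to its position in categories
categories, codes = np.unique(labels, return_inverse=True)

# Indexing categories with codes restores the original labels
decoded = categories[codes]

print(categories)  # ['blue' 'green' 'red']
print(codes)       # [2 1 0 1 2]
print(np.array_equal(decoded, labels))  # True
```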

Handling Multidimensional Arrays

Applying numpy.unique() to multidimensional arrays can be tricky. By default, it flattens the input before processing:

arr_2d = np.array([[3, 1, 2], [3, 4, 1]])
unique_elements = np.unique(arr_2d)

print(unique_elements)  # Output: [1 2 3 4]

To find unique rows or columns instead of unique scalars, pass the axis parameter: np.unique(arr_2d, axis=0) returns the unique rows, and np.unique(arr_2d, axis=1) returns the unique columns.
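As a brief illustration of row-wise uniqueness, the sketch below uses a slightly different array from the one above, with a deliberately repeated row (made-up data), so the effect of axis=0 is visible:

```python
import numpy as np

# Illustrative data: the third row repeats the first
arr_2d = np.array([[3, 1, 2],
                   [3, 4, 1],
                   [3, 1, 2]])

# axis=0 treats each row as one item and removes duplicate rows;
# the remaining rows come back in sorted (lexicographic) order
unique_rows, row_counts = np.unique(arr_2d, axis=0, return_counts=True)

print(unique_rows)
# [[3 1 2]
#  [3 4 1]]
print(row_counts)  # [2 1]
```

Without the axis argument, the same call would flatten the array and return [1 2 3 4].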

Performance Considerations

numpy.unique() is optimized for performance, but it finds duplicates by sorting, which adds overhead on large inputs. For large datasets, pandas’ pd.Series.unique() might be worth exploring: it is hash-based, skips the sort, and returns values in order of first appearance, which helps when order preservation is required.
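When order of first appearance matters but adding pandas is not desirable, one possible approach within NumPy is to combine return_index with a sort of the indices. This is only a sketch of that idea; it still pays the sorting cost, so it addresses ordering rather than speed:

```python
import numpy as np

arr = np.array([3, 1, 2, 3, 4, 1, 2, 5])

# first_idx holds the position of each unique value's first occurrence
_, first_idx = np.unique(arr, return_index=True)

# Sorting those positions and indexing the original array yields
# the unique values in the order they first appeared
in_order = arr[np.sort(first_idx)]

print(in_order)  # [3 1 2 4 5]
```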

Conclusion

Now that we’ve explored how numpy.unique() works, it’s clear that it is a powerful tool for filtering and analyzing arrays. Whether you’re interested in unique values, their counts, or reconstructing data, this function provides everything needed. Next time you’re dealing with duplicate values in NumPy, you’ll know exactly how to handle them!
