
Sorting is one of the fundamental operations in data manipulation, and thankfully, numpy.sort() makes it incredibly easy in Python. In this article, I’ll walk you through how it works, the different sorting algorithms it offers, and some practical examples to see it in action.
What Is numpy.sort()?
numpy.sort() is a function that returns a sorted copy of an array, in ascending order by default. It works with both one-dimensional and multi-dimensional arrays. It’s part of the NumPy library, which is a powerful toolkit for numerical computing in Python.
Basic Usage of numpy.sort()
Let’s start with the basics. Here is how you can sort a simple NumPy array:
import numpy as np
arr = np.array([5, 2, 8, 1, 9, 3])
sorted_arr = np.sort(arr)
print(sorted_arr) # Output: [1 2 3 5 8 9]
As you can see, numpy.sort() returns a new sorted array while leaving the original array unchanged.
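To make the copy-versus-original behavior concrete, here is a minimal sketch that contrasts np.sort() with the in-place ndarray.sort() method. The variable names are only illustrative.

```python
import numpy as np

arr = np.array([5, 2, 8, 1, 9, 3])

# np.sort() returns a sorted copy; the input array is left untouched.
sorted_copy = np.sort(arr)
print(arr)          # [5 2 8 1 9 3]
print(sorted_copy)  # [1 2 3 5 8 9]

# ndarray.sort() sorts the array itself and returns None.
in_place = arr.copy()
result = in_place.sort()
print(result)    # None
print(in_place)  # [1 2 3 5 8 9]
```

The in-place method can be handy when you no longer need the original order and want to avoid allocating a second array.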
Sorting a Multi-Dimensional Array
numpy.sort() can handle multi-dimensional arrays too. By default, it sorts along the last axis, meaning it will sort individual rows in a 2D array.
arr_2d = np.array([[12, 4, 7],
                   [9, 1, 3],
                   [5, 8, 6]])
sorted_2d = np.sort(arr_2d)
print(sorted_2d)
# Output:
# [[ 4  7 12]
#  [ 1  3  9]
#  [ 5  6  8]]
If you want to sort along a different axis, you can specify it using the axis parameter.
Sorting Along a Specific Axis
You can sort along rows (axis 1) or columns (axis 0). Here’s how it works:
- Sorting along rows (axis=1): each row is sorted independently.
- Sorting along columns (axis=0): each column is sorted independently.
sorted_rows = np.sort(arr_2d, axis=1) # Sort each row
sorted_columns = np.sort(arr_2d, axis=0) # Sort each column
print("Sorted Rows:\n", sorted_rows)
print("Sorted Columns:\n", sorted_columns)
This flexibility makes numpy.sort() highly useful when dealing with multidimensional datasets.
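To see what the axis argument does to actual values, the sketch below reuses the same 3x3 array and shows the column-wise result. It also passes axis=None, which flattens the array before sorting. This last case is an addition for illustration and is not covered in the text above.

```python
import numpy as np

arr_2d = np.array([[12, 4, 7],
                   [9, 1, 3],
                   [5, 8, 6]])

# axis=0 sorts down each column independently.
sorted_columns = np.sort(arr_2d, axis=0)
print(sorted_columns)
# [[ 5  1  3]
#  [ 9  4  6]
#  [12  8  7]]

# axis=None flattens the array first, then sorts all values together.
flat_sorted = np.sort(arr_2d, axis=None)
print(flat_sorted)  # [ 1  3  4  5  6  7  8  9 12]
```

Note that sorting along one axis breaks up the original rows or columns: after axis=0, the values in a given row no longer belong together.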
Sorting with Different Algorithms
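One of the cool things about numpy.sort() is that it allows you to choose the sorting algorithm using the kind parameter. The three classic options are listed below. The kind parameter also accepts 'stable', which, like 'mergesort', guarantees that equal elements keep their original relative order: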
Sorting Algorithm | Best For | Time Complexity |
---|---|---|
'quicksort' | Default, fast for most cases | O(n log n) |
'mergesort' | Stable sort, useful when the order of equal elements matters | O(n log n) |
'heapsort' | Not widely used, but has better worst-case performance | O(n log n) |
You can specify the sorting algorithm like this:
sorted_mergesort = np.sort(arr, kind='mergesort')
print(sorted_mergesort)
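Stability is hard to see when you sort plain numbers, because equal values are indistinguishable. A minimal sketch of what "the order of equal elements matters" means uses np.argsort() (covered later in this article) to reorder a second array by sort keys. The keys and labels arrays are made up for illustration.

```python
import numpy as np

# Hypothetical data: each label is paired with a sort key.
keys = np.array([2, 1, 2, 1, 3])
labels = np.array(["a", "b", "c", "d", "e"])

# A stable sort keeps tied keys in their original left-to-right order.
order = np.argsort(keys, kind="stable")
print(order)          # [1 3 0 2 4]
print(labels[order])  # ['b' 'd' 'a' 'c' 'e']
```

With a stable kind, "b" is guaranteed to stay ahead of "d" and "a" ahead of "c". With a non-stable kind, the order of tied elements is not guaranteed.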
Sorting Strings and Structured Arrays
Sorting isn’t limited to numbers. NumPy can also sort strings and structured arrays.
Sorting an Array of Strings
str_arr = np.array(["banana", "apple", "cherry"])
sorted_str_arr = np.sort(str_arr)
print(sorted_str_arr) # Output: ['apple' 'banana' 'cherry']
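One thing to watch for: string comparison is based on character code points, so uppercase letters sort before lowercase ones. The sketch below shows this, along with one possible way to get a case-insensitive order using np.char.lower() and np.argsort(). The words array is a made-up example.

```python
import numpy as np

# Hypothetical input with mixed case.
words = np.array(["banana", "apple", "Cherry"])

# Default sort compares code points, so "C" comes before "a".
print(np.sort(words))  # ['Cherry' 'apple' 'banana']

# Case-insensitive order: sort by the lowercased values,
# then use the resulting indices on the original strings.
order = np.argsort(np.char.lower(words))
case_insensitive = words[order]
print(case_insensitive)  # ['apple' 'banana' 'Cherry']
```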
Sorting a Structured Array
If you’re dealing with structured arrays, you can sort based on a specific field:
dtype = [('name', 'U10'), ('age', int)]
data = np.array([("Alice", 25), ("Bob", 30), ("Charlie", 22)], dtype=dtype)
sorted_data = np.sort(data, order='age')
print(sorted_data)
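The order parameter also accepts a list of field names, which is useful when the first field has ties. Here is a small sketch using the same dtype as above. The "Dana" record is a hypothetical addition so that two people share an age.

```python
import numpy as np

dtype = [('name', 'U10'), ('age', int)]
# "Dana" is a made-up record added to create a tie on age.
data = np.array([("Dana", 25), ("Bob", 30), ("Charlie", 22), ("Alice", 25)],
                dtype=dtype)

# Sort by age first, then by name to break ties.
sorted_data = np.sort(data, order=['age', 'name'])
print(sorted_data['name'].tolist())  # ['Charlie', 'Alice', 'Dana', 'Bob']
print(sorted_data['age'].tolist())   # [22, 25, 25, 30]
```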
The Difference Between np.sort() and np.argsort()
While np.sort() returns a sorted array, np.argsort() returns the indices that would sort the array.
arr = np.array([40, 10, 30, 20])
sorted_indices = np.argsort(arr)
print(sorted_indices) # Output: [1 3 2 0]
You can use these indices to reorder the array:
print(arr[sorted_indices]) # Output: [10 20 30 40]
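Where np.argsort() really pays off is reordering one array by the values of another. The sketch below assumes two parallel arrays, with made-up names and scores, and also shows one common way to get descending order by reversing the indices.

```python
import numpy as np

# Hypothetical parallel arrays: names[i] belongs to scores[i].
names = np.array(["Ann", "Ben", "Cal", "Dee"])
scores = np.array([40, 10, 30, 20])

# Indices that sort the scores in ascending order.
order = np.argsort(scores)
print(names[order])   # ['Ben' 'Dee' 'Cal' 'Ann']

# Reversing the indices gives descending order.
descending = order[::-1]
print(scores[descending])  # [40 30 20 10]
print(names[descending])   # ['Ann' 'Cal' 'Dee' 'Ben']
```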
Conclusion
Now you know how numpy.sort() works in Python! Whether you’re handling simple one-dimensional arrays, multi-dimensional arrays, or structured data, NumPy provides a powerful and flexible sorting function. By choosing the right kind and axis parameters, you can tailor sorting to different scenarios.