How pandas notna works in Python? Best example

When working with data in Python, you’ll often encounter missing values. Whether you’re handling financial data, scientific experiments, or simple user records, dealing with NaN (Not a Number) values is a crucial part of data processing. Enter pandas.notna() – a simple yet powerful function that helps us identify non-missing data. In this article, I’ll walk you through how pandas.notna() works, providing clear examples, use cases, and even a comparison with similar methods.

Understanding pandas.notna()

The function pandas.notna() is part of the pandas library and detects non-missing values in a scalar, an array-like, a Series, or a DataFrame. It is the boolean inverse of pandas.isna(): notna() returns True for values that are present and False for missing ones (None, NaN, and NaT). Empty strings are not treated as missing.

The syntax is straightforward:

pandas.notna(obj)

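Where obj can be any of the following:

  • A single scalar value
  • A NumPy array or other array-like, such as a list
  • A Pandas Series or Index
  • A Pandas DataFrame

A scalar input returns a single bool; every other input returns a boolean object of the same shape.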
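To make the accepted input types concrete, here is a minimal sketch that passes a few scalars and a NumPy array (which pandas.notna() also accepts). The variable name mask is just an illustrative choice.

```python
import numpy as np
import pandas as pd

# Scalars: the result is a single boolean.
print(pd.notna(3.14))    # True
print(pd.notna(None))    # False
print(pd.notna(np.nan))  # False
print(pd.notna(pd.NaT))  # False

# An empty string is a real value, not a missing one.
print(pd.notna(""))      # True

# Array-likes: the result is a boolean array of the same shape.
mask = pd.notna(np.array([1.0, np.nan, 3.0]))
print(mask)              # [ True False  True]
```

The empty-string case is worth remembering: if blank text should count as missing in your data, it has to be converted to NaN or None first.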

Basic Example of pandas.notna()

Let’s start with a simple example using a Pandas Series:

import pandas as pd

data = pd.Series([10, None, 25, float('NaN'), "Hello"])
result = pd.notna(data)

print(result)

Output:

0     True
1    False
2     True
3    False
4     True
dtype: bool

As you can see, both None and NaN are flagged as False, while every other entry, including the string "Hello", is marked True.
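It can help to see why None and NaN end up in the same bucket. The sketch below compares the mixed Series from the example with a purely numeric one (numeric is a name I'm introducing for illustration). In the numeric case pandas stores None as NaN in a float64 column, and notna() treats both the same way. It also shows that the method form Series.notna() gives the same result as the top-level function.

```python
import pandas as pd

mixed = pd.Series([10, None, 25, float("nan"), "Hello"])
numeric = pd.Series([1, None, 3])

# Mixed values are stored as generic Python objects.
print(mixed.dtype)    # object

# In an all-numeric Series, None is converted to NaN on construction.
print(numeric.dtype)  # float64

# Either way, the missing entry is reported as False.
print(pd.notna(numeric).tolist())  # [True, False, True]

# The method form is equivalent to the function form.
print(numeric.notna().equals(pd.notna(numeric)))  # True
```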

Using notna() with DataFrames

Now let’s apply the function to a DataFrame.

df = pd.DataFrame({
    "A": [1, 2, None, 4],
    "B": ["apple", None, "banana", "cherry"],
    "C": [None, 5.5, float('NaN'), 7.2]
})

result_df = pd.notna(df)
print(result_df)

Output:

       A      B      C
0   True   True  False
1   True  False   True
2  False   True  False
3   True   True   True

Each cell in the DataFrame is evaluated, returning True for valid values and False for missing ones.
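Because the result is a DataFrame of booleans, it can be summarized directly. As a small sketch (the names present and share are my own), summing the mask counts the non-missing cells per column, and taking the mean gives the fraction of each column that is filled in.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, None, 4],
    "B": ["apple", None, "banana", "cherry"],
    "C": [None, 5.5, float("nan"), 7.2],
})

# True counts as 1 and False as 0, so sum() counts non-missing cells.
present = pd.notna(df).sum()
print(present)  # A: 3, B: 3, C: 2

# mean() gives the share of non-missing cells in each column.
share = pd.notna(df).mean()
print(share)    # A: 0.75, B: 0.75, C: 0.50
```

The per-column counts match what df.count() reports, so the mask is mostly useful when you also need the fraction or want to combine it with other conditions.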

Practical Use Cases of notna()

Now that you’ve seen how pandas.notna() works, let’s explore some practical use cases where it can be incredibly useful:

1. Filtering Out Missing Values

One of the most common use cases is filtering missing values (both NaN and None) out of a Pandas Series:

filtered_data = data[pd.notna(data)]
print(filtered_data)

This returns a new Series containing only the non-missing values, with their original index labels. The original data is left unchanged.
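For reference, here is the same filter as a self-contained sketch, with the values and index labels that survive. It also checks the result against Series.dropna(), which is the built-in shortcut for this particular pattern.

```python
import pandas as pd

data = pd.Series([10, None, 25, float("nan"), "Hello"])

# Boolean indexing keeps only the positions where the mask is True.
filtered_data = data[pd.notna(data)]

print(filtered_data.tolist())        # [10, 25, 'Hello']
print(filtered_data.index.tolist())  # [0, 2, 4]

# dropna() produces the same Series for this simple case.
print(filtered_data.equals(data.dropna()))  # True
```

The explicit mask earns its keep when the missing-value check has to be combined with other conditions, which dropna() alone cannot express.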

2. Selecting Non-Empty Rows in a DataFrame

You may want to select only the rows where a specific column has non-missing values:

filtered_df = df[pd.notna(df["A"])]
print(filtered_df)

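In this case, rows where column “A” is missing are dropped. Missing values in the other columns do not affect the result, so the remaining rows can still contain gaps in “B” or “C”.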
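A quick sketch of what this filter keeps, using the same DataFrame as before. It compares the result with DataFrame.dropna(subset=["A"]), which expresses the same idea, and confirms that gaps in the other columns survive.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, None, 4],
    "B": ["apple", None, "banana", "cherry"],
    "C": [None, 5.5, float("nan"), 7.2],
})

# Keep only rows where column "A" has a value.
filtered_df = df[pd.notna(df["A"])]
print(filtered_df.index.tolist())  # [0, 1, 3]

# dropna restricted to column "A" selects the same rows.
print(filtered_df.equals(df.dropna(subset=["A"])))  # True

# Columns that were not checked can still contain missing values.
print(int(filtered_df["B"].isna().sum()))  # 1
print(int(filtered_df["C"].isna().sum()))  # 1
```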

Comparison: notna() vs. isna()

It’s worth comparing notna() with isna() to appreciate their differences:

Function       | Description                           | Returns True for
pandas.notna() | Checks whether a value is not missing | Valid (non-null) values
pandas.isna()  | Checks whether a value is missing     | NaN, None, NaT

In short, notna() is the element-wise inverse of isna(). If you ever find yourself writing ~pd.isna(x), you’re better off calling pd.notna(x) directly.
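The relationship between the two functions is easy to verify. This is a minimal check on a small Series (s is just an illustrative name):

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])

print(pd.isna(s).tolist())   # [False, True, False]
print(pd.notna(s).tolist())  # [True, False, True]

# Negating isna() element-wise gives exactly the notna() mask.
print((~pd.isna(s)).equals(pd.notna(s)))  # True

# For a scalar, use notna() or plain `not` rather than ~.
print(pd.notna(None) == (not pd.isna(None)))  # True
```

The last line hints at one practical reason to prefer notna(): on a scalar, isna() returns a plain Python bool, and ~ on a Python bool is bitwise inversion rather than logical negation, so it does not produce a usable True or False.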

Best Example of pandas.notna()

Here’s a real-world example where I’ll demonstrate how to clean a messy dataset:

data = pd.DataFrame({
    "Name": ["Alice", "Bob", None, "Dave"],
    "Age": [25, None, 30, 40],
    "City": ["New York", "Los Angeles", "Chicago", None]
})

cleaned_data = data[pd.notna(data["Name"]) & pd.notna(data["Age"])]
print(cleaned_data)

Output:

    Name   Age      City
0  Alice  25.0  New York
3   Dave  40.0      None

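In this example, we removed every row where “Name” or “Age” was missing. Note that Dave’s row is kept even though its “City” is None, because the filter only checks the two columns we named. If “City” is also required, it needs its own notna() condition.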
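When the list of required columns grows, chaining conditions with & gets unwieldy. One alternative, sketched here under the assumption that the required columns are collected in a list (required is a name I'm introducing), is to apply notna() to that subset and require every value in the row to be present.

```python
import pandas as pd

data = pd.DataFrame({
    "Name": ["Alice", "Bob", None, "Dave"],
    "Age": [25, None, 30, 40],
    "City": ["New York", "Los Angeles", "Chicago", None],
})

required = ["Name", "Age"]  # hypothetical list of must-have columns

# A row passes only if all required columns are non-missing.
mask = pd.notna(data[required]).all(axis=1)
cleaned_data = data[mask]
print(cleaned_data.index.tolist())  # [0, 3]

# dropna with the same subset selects the same rows.
print(cleaned_data.equals(data.dropna(subset=required)))  # True

# "City" was not required, so one missing value remains.
print(int(cleaned_data["City"].isna().sum()))  # 1
```

Adding "City" to required would drop Dave’s row as well, leaving only Alice.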

Conclusion

Understanding how pandas.notna() works in Python is an essential skill for any data scientist or analyst. Whether you’re cleaning data, filtering missing values, or preparing datasets for modeling, this function provides a quick and efficient way to identify non-missing entries.

The best part? It’s intuitive, easy to use, and complements Pandas’ missing data handling capabilities beautifully. The next time you’re debugging a DataFrame filled with NaN values, remember that pandas.notna() can be your best friend.
