Count number of non-NaN values in each column of DataFrame

# Table of Contents

# Count number of non-NaN values in each column of DataFrame

Use the count() method to count the number of non-NaN (or non-missing) values in each column of a DataFrame.

The method counts the non-NA cells for each column or row.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, None],
    'experience': [None, 5, None, None],
    'salary': [None, 180.2, 190.3, 205.4],
})

print(df)

print('-' * 50)

print(df.count())

The code for this article is available on GitHub

Running the code sample produces the following output.

shell

Copied!
    name  experience  salary
0  Alice         NaN     NaN
1  Bobby         5.0   180.2
2   None         NaN   190.3
3   None         NaN   205.4
--------------------------------------------------
name          2
experience    1
salary        3
dtype: int64

count number of non nan values in every column on dataframe

We used the pandas.DataFrame() method to create a DataFrame object.

shell

Copied!
    name  experience  salary
0  Alice         NaN     NaN
1  Bobby         5.0   180.2
2   None         NaN   190.3
3   None         NaN   205.4

You can use the DataFrame.count() method to count the non-NA (or non-missing) cells for each column or row of the DataFrame.

main.py

Copied!
print(df.count())

The method produces the following output.

shell

Copied!
name          2
experience    1
salary        3
dtype: int64

The name column has 2 non-missing values
The experience column has 1 non-missing value
The salary column has 3 non-missing values

If you need to get the number of non-NaN (or non-missing) values in each row, set the axis parameter to 1 when calling count().

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, None],
    'experience': [None, 5, None, None],
    'salary': [None, 180.2, 190.3, 205.4],
})

print(df)

print('-' * 50)

print(df.count())

print('-' * 50)

print(df.count(axis=1))

The code for this article is available on GitHub

Running the code sample produces the following output.

shell

Copied!
    name  experience  salary
0  Alice         NaN     NaN
1  Bobby         5.0   180.2
2   None         NaN   190.3
3   None         NaN   205.4
--------------------------------------------------
name          2
experience    1
salary        3
dtype: int64
--------------------------------------------------
0    1
1    3
2    1
3    1
dtype: int64

count number of non nan values is each row of dataframe

The row at index 0 (first row) has 1 non-missing value.
The row at index 1 (second row) has 3 non-missing values
The row at index 2 (third row) has 1 non-missing value.
The row at index 3 (fourth row) has 1 non-missing value.

Technically, the DataFrame.count() method counts the non-NA cells for each column or row.

The NA values are:

None
NaN
NaT
optionally numpy.inf (depending on pandas.options.mode.use_inf_as_na).

The method takes an optional axis parameter.

By default, the parameter is set to 0, which means that non-NA counts are generated for each column.

If the axis parameter is set to 1, then non-NA counts are generated for each row.

# Get the total number of the non-NaN values in the `DataFrame`

If you need to get the total number of the non-NaN (or non-missing) values in the DataFrame, use the numpy.sum() method.

First, make sure you have the numpy module installed by running the following command from your terminal.

shell

Copied!
pip install numpy

pip3 install numpy

Now import numpy and use the numpy.sum() method as follows.

main.py

Copied!
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, None],
    'experience': [None, 5, None, None],
    'salary': [None, 180.2, 190.3, 205.4],
})

print(df)

print('-' * 50)

print(np.sum(df.count()))  # 👉️ 6

get total number of non missing values in dataframe

The code for this article is available on GitHub

The example shows that the DataFrame contains 6 non-missing values.

The numpy.sum() method returns the sum of the array elements over a given axis.

# Counting the empty strings as NA values

In some cases, you might also want to count the empty strings as NA values.

You can use the DataFrame.replace() method to replace the empty strings with numpy.nan before calling count().

main.py

Copied!
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, np.nan],
    'experience': [np.nan, 5, '', None],
    'salary': [None, 180.2, '', 205.4],
})

print(df)

print('-' * 50)

print(df.replace('', np.nan).count())

The code for this article is available on GitHub

Running the code sample produces the following output.

shell

Copied!
    name experience salary
0  Alice        NaN   None
1  Bobby          5  180.2
2   None
3    NaN       None  205.4
--------------------------------------------------
name          2
experience    1
salary        2
dtype: int64

count empty strings as na values

We used the DataFrame.replace() method to replace each empty string with an np.nan value.

We directly called the count() method on the result, so the empty strings are counted as NA values.

# Count number of non-missing values using `notna()`

You can also use the DataFrame.notna() method to count the number of non-missing values.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, None],
    'experience': [None, 5, None, None],
    'salary': [None, 180.2, 190.3, 205.4],
})

#     name  experience  salary
# 0  Alice         NaN     NaN
# 1  Bobby         5.0   180.2
# 2   None         NaN   190.3
# 3   None         NaN   205.4
print(df)

print('-' * 50)


# name          2
# experience    1
# salary        3
# dtype: int64
print(df.notna().sum())

count number of not missing values using notna

The code for this article is available on GitHub

The pandas.DataFrame.notna() method detects the non-missing values in the DataFrame.

The method returns a DataFrame mask of boolean values for each element that indicates whether an element is not an NA value.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, None],
    'experience': [None, 5, None, None],
    'salary': [None, 180.2, 190.3, 205.4],
})

#     name  experience  salary
# 0  Alice         NaN     NaN
# 1  Bobby         5.0   180.2
# 2   None         NaN   190.3
# 3   None         NaN   205.4
print(df)

print('-' * 50)


#     name  experience  salary
# 0   True       False   False
# 1   True        True    True
# 2  False       False    True
# 3  False       False    True
print(df.notna())

The code for this article is available on GitHub

You can call the sum() method on the result to get the count of the non-missing values in each column.

main.py

Copied!
# name          2
# experience    1
# salary        3
# dtype: int64
print(df.notna().sum())

You might also see the DataFrame.notnull() method being used.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', None, None],
    'experience': [None, 5, None, None],
    'salary': [None, 180.2, 190.3, 205.4],
})

#     name  experience  salary
# 0  Alice         NaN     NaN
# 1  Bobby         5.0   180.2
# 2   None         NaN   190.3
# 3   None         NaN   205.4
print(df)

print('-' * 50)


# name          2
# experience    1
# salary        3
# dtype: int64
print(df.notnull().sum())

The code for this article is available on GitHub

The DataFrame.notnull() method is an alias for the DataFrame.notna() method.

# Additional Resources

You can learn more about the related topics by checking out the following tutorials:

I wrote a book in which I share everything I know about how to become a better, more efficient programmer.

You can use the search field on my Home Page to filter through all of my articles.

Count number of non-NaN values in each column of DataFrame

# Table of Contents