Last updated: Apr 12, 2024
The Pandas error "OutOfBoundsDatetime: Out of bounds nanosecond timestamp" occurs when you try to create a datetime object that is out of bounds.

To solve the error, set the errors argument to "coerce" to convert dates that are out of bounds to NaT.
Here is an example of how the error occurs.
    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05', '2023-03-25', '2362-01-24']
    })

    # ⛔️ pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2362-01-24 00:00:00, at position 2.
    print(pd.to_datetime(df['date']))
Here is the complete error message:
    pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2362-01-24 00:00:00, at position 2. You might want to try:
        - passing `format` if your strings have a consistent format;
        - passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
        - passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.
The issue in the code sample is that one of the dates is outside of the range of allowed dates.
You can use the pandas.Timestamp.min and pandas.Timestamp.max attributes to print the allowed date range.
    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05', '2023-03-25', '2362-01-24']
    })

    # 1677-09-21 00:12:43.145224193
    print(pd.Timestamp.min)

    # 2262-04-11 23:47:16.854775807
    print(pd.Timestamp.max)
The third date in the DataFrame (2362-01-24) falls outside the allowed range, which caused the error.
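If you want to know in advance which strings would trigger the error, one option is to parse them with the standard library and compare the results against the allowed range. The following is a minimal sketch rather than a definitive approach: the helper name find_out_of_bounds is hypothetical, the two bounds are hard-coded copies of pd.Timestamp.min and pd.Timestamp.max rounded inward to whole seconds, and every string is assumed to use the %Y-%m-%d format.

```python
import datetime as dt

# Conservative copies of pd.Timestamp.min / pd.Timestamp.max, rounded inward
# to whole seconds (assumption: the nanosecond-resolution bounds shown above).
LOWER_BOUND = dt.datetime(1677, 9, 21, 0, 12, 44)
UPPER_BOUND = dt.datetime(2262, 4, 11, 23, 47, 16)


def find_out_of_bounds(date_strings, fmt='%Y-%m-%d'):
    """Return the strings whose parsed value falls outside the bounds."""
    out_of_bounds = []
    for value in date_strings:
        parsed = dt.datetime.strptime(value, fmt)
        if not LOWER_BOUND <= parsed <= UPPER_BOUND:
            out_of_bounds.append(value)
    return out_of_bounds


dates = ['2023-01-05', '2023-03-25', '2362-01-24']
print(find_out_of_bounds(dates))  # ['2362-01-24']
```

Only the third date is reported, which matches the position mentioned in the error message.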
Set the errors argument to coerce to solve the error

You can solve the error by setting the errors argument to "coerce".

When the errors argument is set to "coerce", dates that cannot be parsed will be set to NaT.
    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05', '2023-03-25', '2362-01-24']
    })

    # 0   2023-01-05
    # 1   2023-03-25
    # 2          NaT
    # Name: date, dtype: datetime64[ns]
    print(pd.to_datetime(df['date'], errors='coerce'))
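Because coerced values silently become NaT, it can be useful to check which rows were affected. The sketch below compares the original column with the converted one. The variable names parsed and coerced_mask are illustrative, and the example assumes a pandas version that parses these strings at nanosecond resolution. On a version that accepts a wider range, no rows may be flagged.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'date': ['2023-01-05', '2023-03-25', '2362-01-24']
})

parsed = pd.to_datetime(df['date'], errors='coerce')

# Rows that had a value originally but ended up as NaT after conversion.
coerced_mask = parsed.isna() & df['date'].notna()

# Show the original strings that could not be converted.
print(df.loc[coerced_mask, ['name', 'date']])
```

Checking the mask before overwriting the column lets you decide whether losing those values is acceptable.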
The pandas.to_datetime() function and several other Pandas functions, such as pandas.to_numeric() and pandas.to_timedelta(), take an errors argument.

By default, the argument is set to "raise", which means that invalid parsing raises an exception.

When the argument is set to "coerce", invalid parsing will be set as NaT.

The argument can also be set to "ignore", in which case invalid parsing returns the input. Note that errors="ignore" is deprecated as of pandas 2.2, so recent versions emit a FutureWarning when it is used.
    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05', '2023-03-25', '2362-01-24']
    })

    # 0    2023-01-05
    # 1    2023-03-25
    # 2    2362-01-24
    # Name: date, dtype: object
    print(pd.to_datetime(df['date'], errors='ignore'))
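Another way to get the "return the input on failure" behavior is to catch the exception yourself. The following is a small sketch of that idea. The helper name to_datetime_or_original is hypothetical and not part of pandas.

```python
import pandas as pd


def to_datetime_or_original(values):
    """Hypothetical helper: convert if possible, else return the input as-is."""
    try:
        return pd.to_datetime(values)
    except ValueError:
        # OutOfBoundsDatetime is a subclass of ValueError, so this covers
        # out-of-bounds dates as well as strings that cannot be parsed.
        return values


in_range = pd.Series(['2023-01-05', '2023-03-25'])
unparseable = pd.Series(['2023-01-05', 'not a date'])

# Converted to a datetime dtype.
print(to_datetime_or_original(in_range).dtype)

# Conversion failed, so the original Series is returned unchanged.
print(to_datetime_or_original(unparseable) is unparseable)  # True
```

As with "ignore", the caller gets back unconverted data on failure, so the dtype of the result should be checked before any datetime operations are used.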
However, setting the errors argument to "coerce" is usually the better choice because timestamps have limitations. With "coerce", you can still process the non-NaT data points.

For nanosecond resolution, the time span that can be represented using a 64-bit integer is limited to approximately 584 years:
    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05', '2023-03-25', '2362-01-24']
    })

    # 1677-09-21 00:12:43.145224193
    print(pd.Timestamp.min)

    # 2262-04-11 23:47:16.854775807
    print(pd.Timestamp.max)
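The 584-year figure can be checked with a little arithmetic: a 64-bit integer has 2**64 distinct values, and each one stands for a nanosecond. The sketch below assumes an average year of 365.25 days; the constant names are illustrative.

```python
# A 64-bit integer can take 2**64 distinct values, one per nanosecond.
NANOSECONDS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year, including leap days

total_seconds = 2**64 / NANOSECONDS_PER_SECOND
total_years = total_seconds / SECONDS_PER_YEAR

print(round(total_years, 1))  # 584.5
```

Since the range is centered on 1970-01-01, roughly half of that span (about 292 years) lies on each side, which is consistent with the 1677 and 2262 limits printed above.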
Using strptime() to convert strings to datetime objects

If setting the errors argument to "coerce" doesn't suit your use case, try using the datetime.strptime() method to convert the strings in the DataFrame to datetime objects.
    import datetime as dt

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05', '2023-03-25', '2362-01-24']
    })

    # Parse each string with strptime(); non-string values become NaT.
    df['date'] = df['date'].apply(
        lambda x: dt.datetime.strptime(x, '%Y-%m-%d')
        if isinstance(x, str) else pd.NaT
    )

    #     name  salary                 date
    # 0  Alice   175.1  2023-01-05 00:00:00
    # 1  Bobby   180.2  2023-03-25 00:00:00
    # 2   Carl   190.3  2362-01-24 00:00:00
    print(df)

    print('-' * 50)

    # <class 'datetime.datetime'>
    print(type(df.iloc[0, 2]))
If your date strings also include the time, make sure to update the format string when calling strptime().
    import datetime as dt

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
        'date': ['2023-01-05 09:21:00', '2023-03-25 08:11:00', '2362-01-24 07:15:00']
    })

    # The format string now includes the time component.
    df['date'] = df['date'].apply(
        lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
        if isinstance(x, str) else pd.NaT
    )

    #     name  salary                 date
    # 0  Alice   175.1  2023-01-05 09:21:00
    # 1  Bobby   180.2  2023-03-25 08:11:00
    # 2   Carl   190.3  2362-01-24 07:15:00
    print(df)

    print('-' * 50)

    # <class 'datetime.datetime'>
    print(type(df.iloc[0, 2]))
Depending on how your date or datetime string is formatted, you might have to adjust the argument you pass to strptime().
The Python documentation for the strftime() and strptime() format codes lists all of the available directives and their meaning.
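As a rough illustration of how directives map onto different string layouts, the sketch below parses a few made-up sample strings. The samples are assumptions chosen for the example, not formats taken from any particular dataset.

```python
import datetime as dt

# Each pair is (example string, matching format string).
samples = [
    ('2362-01-24', '%Y-%m-%d'),
    ('24/01/2362', '%d/%m/%Y'),
    ('20230105', '%Y%m%d'),
    ('2023-03-25T08:11:00', '%Y-%m-%dT%H:%M:%S'),
]

for text, fmt in samples:
    parsed = dt.datetime.strptime(text, fmt)
    print(f'{text!r} with {fmt!r} -> {parsed}')
```

If the format string does not match the input exactly, strptime() raises a ValueError, so it is worth testing the format against a few real values first.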
Once you convert the values in the column to datetime objects, you can process them as you see fit.
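A column that holds plain datetime objects typically has the object dtype, so one way to process it is with ordinary Python functions applied to each value. The following sketch assumes such a column. The Series is built directly with dtype=object for illustration, and the variable names are hypothetical.

```python
import datetime as dt

import pandas as pd

# dtype=object keeps the values as plain datetime.datetime objects.
dates = pd.Series(
    [
        dt.datetime(2023, 1, 5),
        dt.datetime(2023, 3, 25),
        dt.datetime(2362, 1, 24),
    ],
    dtype=object,
)

# Extract the year from each value with a Python-level function.
years = dates.apply(lambda d: d.year)
print(years.tolist())  # [2023, 2023, 2362]

# Keep only the values after a cutoff date.
cutoff = dt.datetime(2023, 2, 1)
later = dates[dates.apply(lambda d: d > cutoff)]
print(len(later))  # 2
```

This is slower than vectorized datetime operations, but it works for dates outside the nanosecond timestamp range.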
We used the Series.apply() method to apply a function to each value in the date column.
    df['date'] = df['date'].apply(
        lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
        if isinstance(x, str) else pd.NaT
    )
The lambda function we passed to apply() converts each datetime string to a datetime object.

If the value in the column is not of type string, then a pd.NaT value is returned.
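The lambda only guards against non-string values, so a string in an unexpected format would still raise a ValueError. A named helper can handle both cases. The sketch below is one possible shape for it: the function name parse_date and its missing parameter are hypothetical, and it uses only the standard library.

```python
import datetime as dt


def parse_date(value, fmt='%Y-%m-%d', missing=None):
    """Parse value with strptime(); return `missing` if that is not possible."""
    if not isinstance(value, str):
        return missing
    try:
        return dt.datetime.strptime(value, fmt)
    except ValueError:
        # The string does not match the expected format.
        return missing


print(parse_date('2362-01-24'))  # 2362-01-24 00:00:00
print(parse_date('24/01/2362'))  # None
print(parse_date(None))          # None
```

A helper like this could be passed to apply() in place of the lambda, with pd.NaT supplied as the missing value to match the behavior described above.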