Last updated: Apr 12, 2024
Use the pandas.to_datetime() method to create a date column from year, month and day.
The method can be passed a dictionary containing the year, month and day and will convert the supplied arguments to datetime.
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012, 2013, 2014],
    'month': [1, 2, 3, 4],
    'day': [10, 12, 14, 16],
    'champion': ['Austria', 'Belgium', 'Canada', 'Denmark']
})

print(df)

# Build the date column from the year, month and day columns
df['date'] = pd.to_datetime(
    dict(year=df.year, month=df.month, day=df.day)
)

print('-' * 50)
print(df)
Running the code sample produces the following output.
   year  month  day champion
0  2011      1   10  Austria
1  2012      2   12  Belgium
2  2013      3   14   Canada
3  2014      4   16  Denmark
--------------------------------------------------
   year  month  day champion       date
0  2011      1   10  Austria 2011-01-10
1  2012      2   12  Belgium 2012-02-12
2  2013      3   14   Canada 2013-03-14
3  2014      4   16  Denmark 2014-04-16
The pandas.to_datetime method takes an argument and converts it to datetime.
We used the dict() class to construct a dictionary from the year, month and day column values.
df['date'] = pd.to_datetime(
    dict(year=df.year, month=df.month, day=df.day)
)
Note that the names of the columns you pass to pd.to_datetime() have to be year, month and day.
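The keys pandas looks for when assembling a date are year, month and day. If your DataFrame uses different names, one option is to rename the columns first. The following is a minimal sketch; the column names yr, mo and dy are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical column names that pandas does not recognise as date parts
df = pd.DataFrame({
    'yr': [2011, 2012],
    'mo': [1, 2],
    'dy': [10, 12],
})

# Map the custom names onto the names pd.to_datetime() expects.
# rename() returns a new DataFrame, so the original columns are untouched.
parts = df.rename(columns={'yr': 'year', 'mo': 'month', 'dy': 'day'})

df['date'] = pd.to_datetime(parts[['year', 'month', 'day']])

print(df)
```

Renaming a copy keeps the original column names in df while still giving pd.to_datetime() the keys it needs.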
You can also include the time components when creating the date column if necessary.
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012, 2013, 2014],
    'month': [1, 2, 3, 4],
    'day': [10, 12, 14, 16],
    'hour': [3, 1, 5, 10],
    'minute': [32, 11, 22, 14],
    'second': [5, 10, 15, 20],
    'champion': ['Austria', 'Belgium', 'Canada', 'Denmark']
})

print(df)

# Include the time components alongside year, month and day
df['date'] = pd.to_datetime(
    dict(year=df.year, month=df.month, day=df.day,
         hour=df.hour, minute=df.minute, second=df.second)
)

print('-' * 50)
print(df)
Running the code sample produces the following output.
   year  month  day  hour  minute  second champion
0  2011      1   10     3      32       5  Austria
1  2012      2   12     1      11      10  Belgium
2  2013      3   14     5      22      15   Canada
3  2014      4   16    10      14      20  Denmark
--------------------------------------------------
   year  month  day  ...  second  champion                date
0  2011      1   10  ...       5   Austria 2011-01-10 03:32:05
1  2012      2   12  ...      10   Belgium 2012-02-12 01:11:10
2  2013      3   14  ...      15    Canada 2013-03-14 05:22:15
3  2014      4   16  ...      20   Denmark 2014-04-16 10:14:20
Make sure the keyword arguments you pass to the dict() class when calling pd.to_datetime() are spelled correctly.
df['date'] = pd.to_datetime(
    dict(year=df.year, month=df.month, day=df.day,
         hour=df.hour, minute=df.minute, second=df.second)
)
You can also include only some of the time components, for example year, month, day and hour, or year, month, day, hour and minute. However, you must supply values for at least year, month and day.
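To illustrate the requirement, the sketch below leaves out day and catches the resulting error, then supplies a constant day of 1 as one possible workaround. Using 1 as the default day is an assumption made for this example, not something the data dictates.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012],
    'month': [1, 2],
})

# Omitting 'day' is expected to raise a ValueError
error_message = None
try:
    pd.to_datetime(dict(year=df.year, month=df.month))
except ValueError as err:
    error_message = str(err)

print(error_message)

# Assumed workaround: treat every row as the first day of its month
df['date'] = pd.to_datetime(dict(year=df.year, month=df.month, day=1))

print(df)
```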
Create a date column from a subset of the DataFrame
You can also call the pandas.to_datetime() method with a subset of the DataFrame that only contains the year, month and day columns.
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012, 2013, 2014],
    'month': [1, 2, 3, 4],
    'day': [10, 12, 14, 16],
    'champion': ['Austria', 'Belgium', 'Canada', 'Denmark']
})

print(df)

# Pass only the year, month and day columns
df['date'] = pd.to_datetime(
    df[['year', 'month', 'day']]
)

print('-' * 50)
print(df)
Running the code sample produces the following output.
   year  month  day champion
0  2011      1   10  Austria
1  2012      2   12  Belgium
2  2013      3   14   Canada
3  2014      4   16  Denmark
--------------------------------------------------
   year  month  day champion       date
0  2011      1   10  Austria 2011-01-10
1  2012      2   12  Belgium 2012-02-12
2  2013      3   14   Canada 2013-03-14
3  2014      4   16  Denmark 2014-04-16
This time, we used bracket notation to only select a subset of the DataFrame when calling pandas.to_datetime().
df['date'] = pd.to_datetime(
    df[['year', 'month', 'day']]
)
Here is the output of the expression.
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012, 2013, 2014],
    'month': [1, 2, 3, 4],
    'day': [10, 12, 14, 16],
    'champion': ['Austria', 'Belgium', 'Canada', 'Denmark']
})

#    year  month  day
# 0  2011      1   10
# 1  2012      2   12
# 2  2013      3   14
# 3  2014      4   16
print(df[['year', 'month', 'day']])
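Selecting the subset matters because the DataFrame also holds a non-date column. As a rough sketch of what happens otherwise, the example below passes the whole DataFrame, which is expected to raise a ValueError because of the extra champion column, and then repeats the call with only the date parts.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012],
    'month': [1, 2],
    'day': [10, 12],
    'champion': ['Austria', 'Belgium'],
})

# Passing every column, including 'champion', is expected to fail
error_message = None
try:
    pd.to_datetime(df)
except ValueError as err:
    error_message = str(err)

print(error_message)

# Passing only the date parts works
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

print(df)
```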
Create a date column using the format argument
You can also use the format argument of the pandas.to_datetime method to create a date column from year, month and day.
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012, 2013, 2014],
    'month': [1, 2, 3, 4],
    'day': [10, 12, 14, 16],
    'champion': ['Austria', 'Belgium', 'Canada', 'Denmark']
})

# Combine the columns into a YYYYMMDD integer and parse it
df['date'] = pd.to_datetime(
    df.year*10000+df.month*100+df.day,
    format='%Y%m%d'
)

#    year  month  day champion       date
# 0  2011      1   10  Austria 2011-01-10
# 1  2012      2   12  Belgium 2012-02-12
# 2  2013      3   14   Canada 2013-03-14
# 3  2014      4   16  Denmark 2014-04-16
print(df)
We used the multiplication and addition operators to combine the three columns into a single integer in YYYYMMDD form (for example, 20110110) that the format string can parse.
The format argument is a strftime-style string that tells pandas how to parse the date.
You can view all of the available directives and their descriptions in this section of the docs.
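To make the arithmetic concrete, here is a small sketch that prints the intermediate integers before parsing them. The variable name as_int is introduced only for this example, and the values are cast to strings before parsing as a cautious choice rather than a requirement stated above.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2011, 2012],
    'month': [1, 2],
    'day': [10, 12],
})

# year * 10000 shifts the year left by four digits,
# month * 100 shifts the month left by two digits,
# so single-digit months and days end up zero-padded.
as_int = df.year * 10000 + df.month * 100 + df.day
print(as_int.tolist())  # [20110110, 20120212]

# %Y = 4-digit year, %m = 2-digit month, %d = 2-digit day
df['date'] = pd.to_datetime(as_int.astype(str), format='%Y%m%d')

print(df)
```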