Last updated: Apr 12, 2024
The Pandas "TypeError: Cannot setitem on a Categorical with a new category (X), set the categories first" occurs when you try to assign values outside the categories of a Categorical column.

Use the Series.map() method to solve the error.
Here is an example of how the error occurs.

    import pandas as pd

    df = pd.DataFrame({
        'A': [100, 0]
    })

    print(df)

    df['A'] = pd.Categorical(df['A'])

    # ⛔️ TypeError: Cannot setitem on a Categorical with a new category (Yes), set the categories first
    df.loc['200', 'A'] = 'Yes'

We used the pandas.Categorical class for the A column.
The class is used to represent categorical variables.
All values of the Categorical are either in categories (the second argument the pandas.Categorical() class takes) or are np.nan.
Trying to assign values outside of categories raises an error.
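
The rule above can be checked directly on a Categorical. This is a minimal sketch, assuming a recent pandas version; the variable names `cat` and `error_message` are illustrative and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Only the values passed in (100 and 0) become categories.
cat = pd.Categorical([100, 0])
print(list(cat.categories))

try:
    # 'Yes' is not one of the categories, so this assignment fails.
    cat[0] = 'Yes'
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# Assigning an existing category or a missing value is allowed.
cat[0] = 0
cat[1] = np.nan
print(cat.isna().tolist())
```

The exact wording of the message may differ slightly between pandas versions, but the assignment of an unknown category is rejected while an existing category or np.nan is accepted.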

Use the Series.map() method to solve the error

One way to solve the error is to use the Series.map() method instead.

    import pandas as pd

    df = pd.DataFrame({
        'A': [100, 0]
    })

    print(df)

    df['A'] = pd.Categorical(df['A'])

    df['A'] = df['A'].map({100: 'Yes', 0: 'No'})

    print('-' * 50)
    print(df)

Running the code sample produces the following output.

         A
    0  100
    1    0
    --------------------------------------------------
         A
    0  Yes
    1   No

The Series.map() method maps the values of the Series according to an input mapping (or a function).
We passed a dictionary containing the previous row values and the new row values to Series.map().

    df['A'] = df['A'].map({100: 'Yes', 0: 'No'})
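
One thing to keep in mind with a dictionary mapping is that values without an entry are not passed through. The sketch below assumes a hypothetical third value (50) that is missing from the dictionary; the names `s` and `mapped` are illustrative.

```python
import pandas as pd

# Hypothetical Series with a value (50) that the mapping does not cover.
s = pd.Series(pd.Categorical([100, 0, 50]))

# 50 has no entry in the dictionary, so it becomes a missing value.
mapped = s.map({100: 'Yes', 0: 'No'})

print(mapped.tolist())
print(mapped.isna().tolist())
```

If every existing value should survive, make sure the dictionary covers all of them before calling Series.map().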

Use the cat.rename_categories() method to solve the error

You can also use the Series.cat.rename_categories() method to solve the error.

    import pandas as pd

    df = pd.DataFrame({
        'A': [100, 0]
    })

    print(df)

    df['A'] = pd.Categorical(df['A'])

    df['A'] = df['A'].cat.rename_categories(
        {100: 'Yes', 0: 'No'}
    )

    print('-' * 50)
    print(df)

Running the code sample produces the following output.

         A
    0  100
    1    0
    --------------------------------------------------
         A
    0  Yes
    1   No

The Series.cat.rename_categories() method returns the Categorical column with the given categories renamed.
We passed a dictionary to the method.
The dictionary stores a mapping from old categories to new ones.

    df['A'] = df['A'].cat.rename_categories(
        {100: 'Yes', 0: 'No'}
    )

Categories that are not contained in the mapping are passed through and no errors are raised.
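
The pass-through behavior can be seen with a mapping that covers only some of the categories. This is a small sketch under the assumption of a hypothetical extra category (50); the names `s` and `renamed` are illustrative.

```python
import pandas as pd

# Hypothetical Series with three categories: 0, 50 and 100.
s = pd.Series(pd.Categorical([100, 0, 50]))

# Only 100 and 0 are renamed; 50 is not in the mapping and is kept as is.
renamed = s.cat.rename_categories({100: 'Yes', 0: 'No'})

print(renamed.tolist())
print(list(renamed.cat.categories))
```

Unlike a dictionary passed to Series.map(), the unmapped category keeps its value and the column stays categorical.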

Use the cat.add_categories() method

If you need to add new categories to the Categorical column, use the Series.cat.add_categories() method.
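
    import pandas as pd

    df = pd.DataFrame({
        'A': [100, 0, None]
    })

    print(df)

    df['A'] = pd.Categorical(df['A'])

    # Register the new category before using it as a fill value
    df['A'] = df['A'].cat.add_categories('Another')

    # Assign the result back instead of calling fillna(inplace=True) on a column selection
    df['A'] = df['A'].fillna('Another')

    print('-' * 50)
    print(df)
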
Running the code sample produces the following output.

           A
    0  100.0
    1    0.0
    2    NaN
    --------------------------------------------------
             A
    0    100.0
    1      0.0
    2  Another

The cat.add_categories() method takes a category or a list of categories and adds them to the Categorical column.
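
Once a category has been added, assigning it no longer raises the error. The following is a minimal sketch that passes a list of categories and then assigns them with df.loc; the row labels 0 and 1 are the default integer index, and the setup is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': pd.Categorical([100, 0])})

# Register the new categories first.
df['A'] = df['A'].cat.add_categories(['Yes', 'No'])

# These assignments now succeed because 'Yes' and 'No' are known categories.
df.loc[0, 'A'] = 'Yes'
df.loc[1, 'A'] = 'No'

print(df['A'].tolist())
print(list(df['A'].cat.categories))
```

The original categories (0 and 100) remain registered even though no row uses them anymore.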