Pandas: Make new Column from string Slice of another Column

Borislav Hadzhiev

Last updated: Apr 12, 2024


# Table of Contents

  1. Pandas: Make new Column from string Slice of another Column
  2. Create a new Column from string Slice of another Column using apply()
  3. Create a Column from string Slice of another Column using find()

# Pandas: Make new Column from string Slice of another Column

To create a new column from a string slice of another column:

  1. Use the str attribute to get the string values of the given column.
  2. Use bracket notation to slice each string.
  3. Assign the sliced strings to the DataFrame using bracket notation.
main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'age': [29, 30, 31, 32],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

print(df)

df['initials'] = df['name'].str[:1]

print('-' * 50)
print(df)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
    name  age  salary
0  Alice   29   175.1
1  Bobby   30   180.2
2   Carl   31   190.3
3    Dan   32   205.4
--------------------------------------------------
    name  age  salary initials
0  Alice   29   175.1        A
1  Bobby   30   180.2        B
2   Carl   31   190.3        C
3    Dan   32   205.4        D


We used bracket notation [] to access the name column of the DataFrame.

main.py
df['initials'] = df['name'].str[:1]

You can then access the str attribute on the column, which lets you apply string operations, including slicing, to each value in the column.

The syntax for string slicing is my_str[start:stop:step].

The start index is inclusive, whereas the stop index is exclusive (up to, but not including).

Python indexes are zero-based, so the first character in a string has an index of 0, and the last character has an index of -1 or len(my_str) - 1.
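
As a quick illustration of these slicing rules, here is a minimal sketch on a plain Python string and on a small Series. The names `my_str` and `names` are only illustrative and are not part of the examples above.

```python
import pandas as pd

my_str = 'Alice'

first_two = my_str[0:2]     # start is inclusive, stop is exclusive
last_char = my_str[-1]      # same as my_str[len(my_str) - 1]
every_other = my_str[::2]   # a step of 2 takes every second character

names = pd.Series(['Alice', 'Bobby', 'Carl', 'Dan'])

# The same slice syntax works on each value through the str attribute
last_two = names.str[-2:]

print(first_two)          # Al
print(last_char)          # e
print(every_other)        # Aie
print(last_two.tolist())  # ['ce', 'by', 'rl', 'an']
```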

We used a stop index of 1 to only include the first character of each string in the new column.

Here is an example that takes the first two letters.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'age': [29, 30, 31, 32],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

print(df)

df['initials'] = df['name'].str[:2]

print('-' * 50)
print(df)

The code sample produces the following output.

shell
    name  age  salary
0  Alice   29   175.1
1  Bobby   30   180.2
2   Carl   31   190.3
3    Dan   32   205.4
--------------------------------------------------
    name  age  salary initials
0  Alice   29   175.1       Al
1  Bobby   30   180.2       Bo
2   Carl   31   190.3       Ca
3    Dan   32   205.4       Da

The slice starts at index 0 and goes up to but not including index 2.

main.py
df['initials'] = df['name'].str[:2]

You can also use the str.slice() method when slicing the row values.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'age': [29, 30, 31, 32],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

print(df)

df['initials'] = df['name'].str.slice(0, 1)

print('-' * 50)
print(df)

The Series.str.slice() method slices a substring from each string in the column.

The first argument the method takes is the start index (inclusive).

The second argument the method takes is the stop index (exclusive).
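
The method also accepts `start`, `stop` and `step` as keyword arguments. The following is a minimal sketch, using an illustrative `names` Series, of how each call lines up with the equivalent bracket-notation slice.

```python
import pandas as pd

names = pd.Series(['Alice', 'Bobby', 'Carl', 'Dan'])

first_char = names.str.slice(0, 1)      # same as names.str[:1]
from_second = names.str.slice(start=1)  # same as names.str[1:]
every_other = names.str.slice(step=2)   # same as names.str[::2]

print(first_char.tolist())   # ['A', 'B', 'C', 'D']
print(from_second.tolist())  # ['lice', 'obby', 'arl', 'an']
print(every_other.tolist())  # ['Aie', 'Bby', 'Cr', 'Dn']
```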

# Create a new Column from string Slice of another Column using apply()

You can also use the Series.apply() method to create a new column from a string slice of another column.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'age': [29, 30, 31, 32],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

print(df)

df['initials'] = df['name'].apply(lambda x: x[:1])

print('-' * 50)
print(df)

Running the code sample produces the following output.

shell
    name  age  salary
0  Alice   29   175.1
1  Bobby   30   180.2
2   Carl   31   190.3
3    Dan   32   205.4
--------------------------------------------------
    name  age  salary initials
0  Alice   29   175.1        A
1  Bobby   30   180.2        B
2   Carl   31   190.3        C
3    Dan   32   205.4        D


Because df['name'] is a Series, the call uses Series.apply(), which calls a function on each value of the Series.

The lambda function we passed to the method gets called with each value in the name column, takes the first character and returns the result.

main.py
df['initials'] = df['name'].apply(lambda x: x[:1])
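
One difference worth knowing about is how the two approaches treat missing values. The sketch below assumes a column that contains a missing entry, which the original example does not have. The str attribute passes missing values through, whereas a plain `lambda x: x[:1]` would raise a `TypeError` on them, so the lambda here guards with `isinstance()`.

```python
import pandas as pd

# Hypothetical data with a missing name
names = pd.Series(['Alice', None, 'Carl'])

# The str attribute leaves the missing value as missing
with_str = names.str[:1]

# Slice only actual strings and return anything else unchanged
with_apply = names.apply(lambda x: x[:1] if isinstance(x, str) else x)

print(with_str.isna().tolist())    # [False, True, False]
print(with_apply.isna().tolist())  # [False, True, False]
```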

# Create a Column from string Slice of another Column using find()

In some cases, you might not know the slice indices in advance.

You can use the str.find() method to get the index of a common character that is then used when slicing.

main.py
import pandas as pd

df = pd.DataFrame({
    'email': ['Alice@example.com', 'Bobby@example.com', 'Carl@example.com'],
    'age': [29, 30, 31],
    'salary': [175.1, 180.2, 190.3],
})

print(df)

df['stop_index'] = df['email'].str.find('@')

df['name'] = df.apply(lambda x: x['email'][:x['stop_index']], axis=1)

print('-' * 50)
print(df)

Running the code sample produces the following output.

shell
               email  age  salary
0  Alice@example.com   29   175.1
1  Bobby@example.com   30   180.2
2   Carl@example.com   31   190.3
--------------------------------------------------
               email  age  salary  stop_index   name
0  Alice@example.com   29   175.1           5  Alice
1  Bobby@example.com   30   180.2           5  Bobby
2   Carl@example.com   31   190.3           4   Carl

We used the find() method to get the index of each @ symbol in the email column.

main.py
df['stop_index'] = df['email'].str.find('@')
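
Note that str.find() returns -1 when the substring is not found. The sketch below uses a hypothetical value without an @ symbol to show why that matters: using -1 as a stop index silently drops the last character instead of raising an error.

```python
import pandas as pd

# The second value is a made-up example with no '@' in it
emails = pd.Series(['Alice@example.com', 'no-at-sign'])

stop_index = emails.str.find('@')
print(stop_index.tolist())  # [5, -1]

# Slicing with a stop index of -1 removes the last character
truncated = emails[1][:stop_index[1]]
print(truncated)  # no-at-sig
```

If your data might contain such values, you may want to check for -1 before slicing.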

The next step is to call the DataFrame.apply() method to slice each value in the email column, using the matching stop_index value as the stop index.

main.py
df['name'] = df.apply(lambda x: x['email'][:x['stop_index']], axis=1)

The axis argument determines the axis along which the supplied function is applied.

When the axis argument is set to 1, the function is applied to each row.

By default, the axis argument is set to 0, which means that the function is applied to each column.
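
If you don't need the helper column, there are other ways to get the same result. The following is a sketch of two alternatives, not the approach used above: splitting on the first @ with str.split(), and a list comprehension over the email values paired with their stop indices.

```python
import pandas as pd

df = pd.DataFrame({
    'email': ['Alice@example.com', 'Bobby@example.com', 'Carl@example.com'],
})

# Alternative 1: split on the first '@' and keep the part before it
df['name'] = df['email'].str.split('@', n=1).str[0]

# Alternative 2: pair each email with its stop index and slice in a loop
stops = df['email'].str.find('@')
df['name_zip'] = [email[:stop] for email, stop in zip(df['email'], stops)]

print(df['name'].tolist())      # ['Alice', 'Bobby', 'Carl']
print(df['name_zip'].tolist())  # ['Alice', 'Bobby', 'Carl']
```

The str.split() version returns the whole string when a value has no @ symbol, whereas the slice-based versions would cut off the last character in that case.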
