Split a string by tab in Python

Borislav Hadzhiev

Last updated: Jun 24, 2022

Use the str.split() method to split a string by tabs, e.g. my_list = my_str.split('\t'). The str.split method will split the string on each occurrence of a tab and will return a list containing the results.

main.py
import re

# ✅ split string on each occurrence of tab
my_str = 'one\ttwo\tthree\tfour'

my_list = my_str.split('\t')
print(my_list)  # 👉️ ['one', 'two', 'three', 'four']

# -----------------------------

# ✅ split string by one or more consecutive tabs
my_list_2 = re.split(r'\t+', my_str)
print(my_list_2)  # 👉️ ['one', 'two', 'three', 'four']

The str.split() method splits the string into a list of substrings using a delimiter.

The method takes the following 2 parameters:

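Name       Description
sep        The separator. The string is split into substrings on each occurrence of it (optional; if omitted, the string is split on runs of whitespace)
maxsplit   At most maxsplit splits are done, so the list has at most maxsplit + 1 elements (optional)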
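
To make the maxsplit parameter concrete, here is a minimal sketch. The variable names (first, rest, limited) are only illustrative and do not come from the article.

```python
my_str = 'one\ttwo\tthree\tfour'

# Split only on the first tab: everything after it stays in one piece
first, rest = my_str.split('\t', 1)
print(first)       # one
print(repr(rest))  # 'two\tthree\tfour'

# maxsplit can also be passed as a keyword argument
limited = my_str.split('\t', maxsplit=2)
print(limited)     # ['one', 'two', 'three\tfour']
```

This can be handy when only the first column or two of a tab-separated line matter and the rest should be left untouched.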

If the separator is not found in the string, a list containing only 1 element is returned.

main.py
my_str = 'one'

my_list = my_str.split('\t')

# 👇️ ['one']
print(my_list)

If your string starts or ends with a tab, you get empty strings in the list.

main.py
my_str = '\tone\ttwo\tthree\tfour\t'

my_list = my_str.split('\t')
print(my_list)  # 👉️ ['', 'one', 'two', 'three', 'four', '']

You can use the filter() function to remove any empty strings from the list.

main.py
my_str = '\tone\ttwo\tthree\tfour\t'

my_list = list(filter(None, my_str.split('\t')))
print(my_list)  # 👉️ ['one', 'two', 'three', 'four']

The filter function takes a function and an iterable as arguments and constructs an iterator from the elements of the iterable for which the function returns a truthy value.

If you pass None for the function argument, all falsy elements of the iterable are removed.

All values that are not truthy are considered falsy. The falsy values in Python are:

  • constants defined to be falsy: None and False.
  • 0 (zero) of any numeric type
  • empty sequences and collections: "" (empty string), () (empty tuple), [] (empty list), {} (empty dictionary), set() (empty set), range(0) (empty range).

Note that the filter() function returns a filter object, so we have to use the list() class to convert the filter object to a list.
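
If you would rather not use filter(), a list comprehension is one possible alternative that gives the same result. The sketch below also shows str.strip('\t'), which only removes leading and trailing tabs. The variable names are illustrative.

```python
my_str = '\tone\ttwo\tthree\tfour\t'

# Keep only the truthy (non-empty) strings
non_empty = [part for part in my_str.split('\t') if part]
print(non_empty)  # ['one', 'two', 'three', 'four']

# strip('\t') removes tabs at the ends only, so empty strings
# caused by consecutive tabs in the middle are kept
trimmed = '\tone\t\ttwo\t'.strip('\t').split('\t')
print(trimmed)    # ['one', '', 'two']
```

The comprehension drops every empty string, whereas stripping first keeps empty items in the middle, which may matter if an empty value between two tabs is meaningful in your data.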

An alternative is to use the re.split() function.

Use the re.split() function to split a string by tabs, e.g. my_list = re.split(r'\t+', my_str). With the \t+ pattern, re.split() splits the string on each run of one or more consecutive tabs and returns a list containing the results.

main.py
import re

my_str = 'one\t\t\ttwo\t\tthree\tfour'

my_list_2 = re.split(r'\t+', my_str)
print(my_list_2)  # 👉️ ['one', 'two', 'three', 'four']

The re.split() function takes a pattern and a string and splits the string on each occurrence of the pattern.

The \t character matches tabs.

The plus + is used to match the preceding character (tab) 1 or more times.

In its entirety, the regular expression matches one or more tab characters.

This is useful when you want to count multiple consecutive tabs as a single tab when splitting the string.
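
The difference between the two approaches can be seen side by side in the short sketch below. It also shows that re.split() still returns empty strings when the string starts or ends with a tab. The variable names (plain, collapsed, edge) are illustrative.

```python
import re

my_str = 'one\t\ttwo\tthree'

# str.split treats every tab as a separator, so two tabs in a row
# produce an empty string between them
plain = my_str.split('\t')
print(plain)      # ['one', '', 'two', 'three']

# re.split with \t+ treats a run of tabs as a single separator
collapsed = re.split(r'\t+', my_str)
print(collapsed)  # ['one', 'two', 'three']

# Leading and trailing tabs still yield empty strings at the ends
edge = re.split(r'\t+', '\tone\ttwo\t')
print(edge)       # ['', 'one', 'two', '']
```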
