Split text by empty line in Python

Borislav Hadzhiev

Last updated: Jun 25, 2022

To split text on empty lines, split the string on two consecutive newline sequences: my_str.split('\n\n') for text with POSIX-style line endings and my_str.split('\r\n\r\n') for text with Windows-style line endings.

main.py
my_str = """First line

Second line

Third line"""

# 👇️ shows newline characters
print(repr(my_str))  # 👉️ 'First line\n\nSecond line\n\nThird line'

# 👇️ POSIX style
print(my_str.split('\n\n'))  # 👉️ ['First line', 'Second line', 'Third line']

# 👇️ Windows style
print(my_str.split('\r\n\r\n'))

You can use the repr() function to print the newline characters in your string.

The newline sequence will be one of:

  • \n for text with POSIX-style line endings
  • \r\n for text with Windows-style line endings
  • \r for text with old Mac-style line endings

Once you see what newline characters your text uses, split on 2 of them.
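
The same check can be done in code instead of by eye. This is only a minimal sketch, assuming the text uses a single newline style throughout; detect_newline and split_on_empty_lines are hypothetical helper names, not part of the standard library.

```python
def detect_newline(text):
    """Return the newline sequence found in text (assumes one consistent style)."""
    if '\r\n' in text:
        return '\r\n'  # Windows style
    if '\r' in text:
        return '\r'    # old Mac style
    return '\n'        # POSIX style (also used as the fallback)


def split_on_empty_lines(text):
    """Split text on two consecutive newline sequences."""
    newline = detect_newline(text)
    return text.split(newline * 2)


windows_str = 'First line\r\n\r\nSecond line\r\n\r\nThird line'
print(split_on_empty_lines(windows_str))
# ['First line', 'Second line', 'Third line']
```

The '\r\n' check has to come first, because a Windows-style string also contains '\r' on its own.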

main.py
my_str = """First line

Second line

Third line"""

# 👇️ shows newline characters
print(repr(my_str))  # 👉️ 'First line\n\nSecond line\n\nThird line'

my_list = my_str.split('\n\n')
print(my_list)  # 👉️ ['First line', 'Second line', 'Third line']

for line in my_list:
    print(line)

We have to split on 2 newline characters because the first newline character is from the previous, non-empty line, and the second is from the empty line.
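
To see why one newline is not enough, it may help to compare both calls on the same string. The snippet below is a small illustration that assumes POSIX-style newlines.

```python
my_str = 'First line\n\nSecond line\n\nThird line'

# Splitting on a single newline leaves an empty string for each empty line
print(my_str.split('\n'))
# ['First line', '', 'Second line', '', 'Third line']

# Splitting on two newlines consumes the empty line together with the line break
print(my_str.split('\n\n'))
# ['First line', 'Second line', 'Third line']
```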

The str.split() method splits the string into a list of substrings using a delimiter.

The method takes the following 2 parameters:

Name        Description
separator   Split the string into substrings on each occurrence of the separator
maxsplit    At most maxsplit splits are done (optional)
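
As a quick illustration of the second parameter, the sketch below passes a maxsplit of 1 to separate only the first block from the rest of the text. The variable names first and rest are chosen for this example only.

```python
my_str = 'First line\n\nSecond line\n\nThird line'

# A maxsplit of 1 stops after the first split; the remainder stays intact
first, rest = my_str.split('\n\n', 1)

print(repr(first))  # 'First line'
print(repr(rest))   # 'Second line\n\nThird line'
```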

If the separator is not found in the string, a list containing only 1 element is returned.

So if split() returns a list with a single element, the separator you passed does not occur in the string.

main.py
my_str = """First line

Second line

Third line"""

# 👇️ shows newline characters
print(repr(my_str))  # 👉️ 'First line\n\nSecond line\n\nThird line'

my_list = my_str.split('\r\n\r\n')
print(my_list)  # 👉️ ['First line\n\nSecond line\n\nThird line']

Use the repr() function, like in the example above, to print the newline characters in your string and make sure to split using the correct newline character.
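
If the text might mix newline styles, or you would rather not inspect it first, one option is to convert every line ending to \n before splitting. This is a sketch of that alternative rather than the approach used in the examples above; normalize_newlines is a hypothetical helper name.

```python
def normalize_newlines(text):
    """Convert Windows and old Mac line endings to POSIX-style newlines."""
    # Replace '\r\n' first so it does not turn into two newlines
    return text.replace('\r\n', '\n').replace('\r', '\n')


mixed_str = 'First line\r\n\r\nSecond line\n\nThird line'

my_list = normalize_newlines(mixed_str).split('\n\n')
print(my_list)  # ['First line', 'Second line', 'Third line']
```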
