How to extract Strings between Quotes in Python

Borislav Hadzhiev

Last updated: Apr 9, 2024
3 min


# Extract strings between quotes in Python

Use the re.findall() method to extract strings between quotes.

The re.findall method will match the provided pattern in the string and will return a list containing the strings between the quotes.

main.py
import re

my_str = 'Bobby "Hadz" Com "ABC"'

my_list = re.findall(r'"([^"]*)"', my_str)
print(my_list)  # 👉️ ['Hadz', 'ABC']

print(my_list[0])  # 👉️ Hadz
print(my_list[1])  # 👉️ ABC


The code for this article is available on GitHub

The example extracts every string that is wrapped in double quotes.

If you need to extract a string between single quotes, use the following code sample instead.

main.py
import re

my_str = "Bobby 'Hadz' Com 'ABC'"

my_list = re.findall(r"'([^']*)'", my_str)
print(my_list)  # 👉️ ['Hadz', 'ABC']

The re.findall method takes a pattern and a string as arguments and returns a list of strings containing all non-overlapping matches of the pattern in the string.
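One consequence worth seeing in code: when nothing in the string matches, re.findall returns an empty list rather than None, so indexing the result directly would raise an IndexError. The sketch below uses a made-up input without quotes, and the `first` variable is only illustrative.

```python
import re

# Hypothetical input that contains no double quotes at all
my_str = 'Bobby Hadz Com'

my_list = re.findall(r'"([^"]*)"', my_str)
print(my_list)  # []

# Guard before indexing, because my_list[0] would raise IndexError here
first = my_list[0] if my_list else None
print(first)  # None
```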

Let's look at the regular expression in the first example.

main.py
import re

my_str = 'Bobby "Hadz" Com "ABC"'

my_list = re.findall(r'"([^"]*)"', my_str)
print(my_list)  # 👉️ ['Hadz', 'ABC']

print(my_list[0])  # 👉️ Hadz
print(my_list[1])  # 👉️ ABC

The regex starts and ends with double quotes because we want to match anything that is inside of double quotes in the string.

The parentheses () in the regular expression match whatever is inside and indicate the start and end of a group.

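Because the pattern contains exactly one group, re.findall returns only the group's contents for each match, which is why the surrounding quotes are not included in the results.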

The square brackets [] are used to indicate a set of characters.

The caret ^ at the beginning of the set means "NOT". In other words, match all characters that are NOT a double quote.

The asterisk * matches the preceding regular expression (anything but double quotes) zero or more times.

In its entirety, the regular expression matches zero or more characters that are not double quotes and are inside of double quotes.
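To make the role of the group and the asterisk concrete, here is a small sketch that runs the pattern with and without the parentheses. The input string is made up for illustration and includes an empty pair of quotes to show the "zero or more" case.

```python
import re

# Illustrative input: one quoted word and one empty pair of quotes
my_str = 'Bobby "Hadz" Com "" end'

# With a group, findall returns only what the group captured
with_group = re.findall(r'"([^"]*)"', my_str)
print(with_group)  # ['Hadz', '']

# Without a group, findall returns the whole match, quotes included
without_group = re.findall(r'"[^"]*"', my_str)
print(without_group)  # ['"Hadz"', '""']
```

The empty string in the first result comes from the empty pair of quotes, since the asterisk also allows zero characters.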

You can also use this approach to extract strings from between single quotes.

main.py
import re

my_str = "Bobby 'Hadz' Com 'ABC'"

my_list = re.findall(r"'([^']*)'", my_str)
print(my_list)  # 👉️ ['Hadz', 'ABC']

All we had to do was wrap the group in single quotes instead of double quotes and place a single quote in the set of characters.

In its entirety, the regex matches zero or more characters that are not single quotes and are inside of single quotes.
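If a string mixes both kinds of quotes, one possible variation is to capture the opening quote and require the same character to close it with a backreference. This is only a sketch, not part of the examples above: the helper name `extract_quoted` is hypothetical, and it uses re.finditer because a pattern with two groups makes re.findall return tuples.

```python
import re


def extract_quoted(text):
    # (["']) captures the opening quote and \1 requires the same closing quote.
    # (.*?) lazily captures the text in between.
    pattern = r"""(["'])(.*?)\1"""
    return [match.group(2) for match in re.finditer(pattern, text)]


mixed = extract_quoted('Bobby "Hadz" Com \'ABC\'')
print(mixed)  # ['Hadz', 'ABC']

# An apostrophe inside double quotes does not end the match early
nested = extract_quoted('"it\'s" and \'x\'')
print(nested)  # ["it's", 'x']
```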

# Extract strings between quotes using split()

You can also use the str.split() method to extract strings between quotes.

main.py
my_str = 'Bobby "Hadz" Com "ABC"'

my_list = my_str.split('"')[1::2]
print(my_list)  # 👉️ ['Hadz', 'ABC']



The str.split() method splits the string into a list of substrings using a delimiter.

The method takes the following 2 parameters:

| Name | Description |
| --- | --- |
| separator | Split the string into substrings on each occurrence of the separator |
| maxsplit | At most maxsplit splits are done (optional) |

main.py
my_str = 'Bobby "Hadz" Com "ABC"'

print(my_str.split('"'))  # 👉️ ['Bobby ', 'Hadz', ' Com ', 'ABC', '']

We split the string on each occurrence of a double quote and used list slicing.

The syntax for list slicing is a_list[start:stop:step].

The start index is inclusive and the stop index is exclusive (up to, but not including).

If the start index is omitted, it defaults to 0. If the stop index is omitted, the slice goes to the end of the list.

Python indexes are zero-based, so the first item in a list has an index of 0, and the last item has an index of -1 or len(a_list) - 1.
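The slicing rules above can be checked with a short sketch. The list `a_list` and the variable names below are made up for illustration.

```python
a_list = ['a', 'b', 'c', 'd', 'e']

# start is inclusive, stop is exclusive
middle = a_list[1:3]
print(middle)  # ['b', 'c']

# Omitted start defaults to 0, omitted stop goes to the end of the list
head = a_list[:2]
tail = a_list[2:]
print(head)  # ['a', 'b']
print(tail)  # ['c', 'd', 'e']

# A step of 2 selects every second item
every_second = a_list[1::2]
print(every_second)  # ['b', 'd']

# The last item can be reached with -1 or len(a_list) - 1
same_last = a_list[-1] == a_list[len(a_list) - 1]
print(same_last)  # True
```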

The slice a_list[1::2] starts at index 1 (the second list item) and selects every second item from there to the end of the list.

main.py
my_str = 'Bobby "Hadz" Com "ABC"'

my_list = my_str.split('"')[1::2]
print(my_list)  # 👉️ ['Hadz', 'ABC']

We start at the second list item to exclude the element before the first double quote.
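One difference between the two approaches shows up when the quotes are not balanced. The sketch below uses a made-up input with a missing closing quote: the split-based slice still returns the text after the unmatched quote, while the regular expression only returns strings that have both an opening and a closing quote.

```python
import re

# Illustrative input: the last double quote is never closed
my_str = 'Bobby "Hadz" Com "ABC'

from_split = my_str.split('"')[1::2]
print(from_split)  # ['Hadz', 'ABC']

from_regex = re.findall(r'"([^"]*)"', my_str)
print(from_regex)  # ['Hadz']
```

If the input might contain stray quotes, the regex version is probably the safer choice, since it never treats unterminated text as a quoted string.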


Copyright © 2024 Borislav Hadzhiev