Last updated: Apr 9, 2024
Reading time · 6 min
Use the re.sub() function to remove all non-numeric characters from a string.
The re.sub() function removes all non-numeric characters from the string by replacing them with empty strings.
import re

my_str = 'bo_1bby_2_ha_3_dz.com'

result = re.sub(r'[^0-9]', '', my_str)

print(result)  # '123'
If you need to remove all non-numeric characters except for the dot ".", skip ahead to the part of this article that keeps the dot.
We used the re.sub() function to remove all non-numeric characters from a string.
The re.sub method returns a new string that is obtained by replacing the occurrences of the pattern with the provided replacement.
import re

my_str = '1bobby, 2hadz, 3com'

result = re.sub(r'[^0-9]', '', my_str)

print(result)  # 123
If the pattern isn't found, the string is returned as is.
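The two points above can be checked with a short sketch. The sample strings here are made up for illustration: re.sub() returns a new string and leaves the original alone, and an input with nothing to replace comes back unchanged.

```python
import re

original = 'abc-123'
cleaned = re.sub(r'[^0-9]', '', original)

print(cleaned)   # 123
print(original)  # abc-123 (strings are immutable, so the original is unchanged)

# Nothing matches the pattern here, so the string is returned as is
unchanged = re.sub(r'[^0-9]', '', '2024')
print(unchanged)  # 2024
```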
The first argument we passed to re.sub() is a regular expression.
The square brackets [] are used to indicate a set of characters.
If the first character of the set is a caret ^, all characters that are not in the set will be matched. In other words, our set matches any character that is not a digit in the range 0-9.
The second argument we passed to re.sub() is the replacement for each match.
import re

my_str = 'bo_1bby_2_ha_3_dz.com'

result = re.sub(r'[^0-9]', '', my_str)

print(result)  # '123'
We want to remove all non-numeric characters, so we replace each with an empty string.
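If the same pattern is applied to many strings, it may be convenient to compile it once with re.compile() and reuse the resulting pattern object. This is only a sketch; the name NON_DIGITS and the sample list are assumptions made for the example.

```python
import re

# Hypothetical name for the compiled pattern
NON_DIGITS = re.compile(r'[^0-9]')

values = ['bo_1bby', '2_ha', '3_dz.com']

# Pattern.sub(repl, string) behaves like re.sub(pattern, repl, string)
cleaned = [NON_DIGITS.sub('', value) for value in values]

print(cleaned)  # ['1', '2', '3']
```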
There is also a shorthand for the [^0-9] character set.
import re

my_str = 'a1s2d3f4g5'

result = re.sub(r'\D', '', my_str)

print(result)  # '12345'
The \D special character matches any character that is not a digit. It is very similar to the [^0-9] character set, but for str patterns it treats every Unicode decimal digit as a digit, not only 0-9, so digits from other scripts are kept as well.
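A small sketch of where the two patterns differ, assuming a str (not bytes) pattern. The sample string is hypothetical and contains two Arabic-Indic digits followed by an ASCII digit.

```python
import re

# Hypothetical sample: 'a', U+0661, U+0662, 'b', '3'
my_str = 'a\u0661\u0662b3'

# [^0-9] removes everything except the ASCII digits 0-9
ascii_only = re.sub(r'[^0-9]', '', my_str)

# \D removes everything except Unicode decimal digits
unicode_digits = re.sub(r'\D', '', my_str)

print(ascii_only)                                # 3
print(unicode_digits == '\u0661\u0662' + '3')    # True
```

If only 0-9 should survive, the explicit [^0-9] set is the more predictable choice.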
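You can also use a generator expression with str.join() to remove the non-numeric characters from a string.
This is a three-step process:
1. Use a generator expression to iterate over the string.
2. Use the str.isdigit() method to check if each character is a digit.
3. Use the str.join() method to join the digits into a string.

my_str = 'bo_1bby_2_ha_3_dz.com'

result = ''.join(char for char in my_str if char.isdigit())

print(result)  # '123'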
We used a generator expression to iterate over the string.
On each iteration, we use the str.isdigit() method to check if the current character is a digit and return the result.
The generator object only contains the digits from the string.
my_str = 'bo_1bby_2_ha_3_dz.com'

# ['1', '2', '3']
print(list(char for char in my_str if char.isdigit()))
The last step is to join the digits into a string.
my_str = 'bo_1bby_2_ha_3_dz.com'

result = ''.join(char for char in my_str if char.isdigit())

print(result)  # '123'
The str.join() method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.
The string the method is called on is used as the separator between the elements.
For our purposes, we called the join() method on an empty string to join the digits without a separator.
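To make the role of the separator concrete, here is a brief sketch with a made-up list of digit strings. It also shows that join() expects strings, so other values need converting first.

```python
digits = ['1', '2', '3']

no_separator = ''.join(digits)
with_dash = '-'.join(digits)

print(no_separator)  # 123
print(with_dash)     # 1-2-3

# join() raises a TypeError for non-string items, so convert them first
from_ints = ''.join(str(n) for n in [1, 2, 3])
print(from_ints)     # 123
```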
If you need to remove the non-numeric characters except for ".", use the re.sub() function.
import re

my_str = 'a3.1b4c'

result = re.sub(r'[^0-9.]', '', my_str)

print(result)  # '3.14'
We used the re.sub() function to remove all non-numeric characters except for the dot from a string.
The re.sub() method returns a new string that is obtained by replacing the occurrences of the pattern with the provided replacement.
If the pattern isn't found, the string is returned as is.
The first argument we passed to re.sub() is a regular expression.
The square brackets [] are used to indicate a set of characters.
If the first character of the set is a caret ^, all characters that are not in the set will be matched. In other words, our set matches any character that is not a digit in the range 0-9 or a dot.
The second argument we passed to re.sub() is the replacement for each match.
import re

my_str = 'a3.1b4c'

result = re.sub(r'[^0-9.]', '', my_str)

print(result)  # '3.14'
We want to remove every character that is neither a digit nor a dot, so we replace each match with an empty string.
There is also a shorthand for the 0-9 range.
import re

my_str = 'a3.1b4c'

result = re.sub(r'[^\d.]', '', my_str)

print(result)  # '3.14'
The \d character matches any Unicode decimal digit. This includes [0-9] and many other digit characters.
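If \d should match only 0-9, one option is the re.ASCII flag, which restricts \d to ASCII digits in str patterns. The sketch below uses a hypothetical input containing a fullwidth digit five (U+FF15) to show the effect.

```python
import re

# Hypothetical input: U+FF15 is FULLWIDTH DIGIT FIVE
my_str = 'a3.1\uff15b4'

# By default, \d also keeps the fullwidth digit
default_result = re.sub(r'[^\d.]', '', my_str)

# With re.ASCII, \d is equivalent to [0-9]
ascii_result = re.sub(r'[^\d.]', '', my_str, flags=re.ASCII)

print(default_result == '3.1\uff15' + '4')  # True
print(ascii_result)                          # 3.14
```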
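You can also use a generator expression to remove all non-numeric characters except for the dot.
This is a three-step process:
1. Use a generator expression to iterate over the string.
2. Check if each character is a digit or a dot.
3. Use the str.join() method to join the characters that pass the test.

my_str = 'a3.1b4c'

result = ''.join(char for char in my_str if char in '0123456789.')

print(result)  # '3.14'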
We used a generator expression to iterate over the string.
On each iteration, we check if the current character is a digit or a dot and return the result.
The in operator tests for membership. For example, x in s evaluates to True if x is a member of s, otherwise it evaluates to False.
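One detail worth illustrating: when the right-hand side is a string, in tests for substrings, which is fine for single characters but behaves differently for longer values. The sketch below compares a string with a set of allowed characters; the name allowed is an assumption for the example.

```python
allowed_str = '0123456789.'
allowed = set(allowed_str)  # hypothetical name: a set of single characters

print('3' in allowed)  # True
print('a' in allowed)  # False

# A string on the right-hand side matches substrings as well
print('34' in allowed_str)  # True
print('34' in allowed)      # False
```

Because the examples in this article test one character at a time, both forms give the same result there.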
The generator object only contains the digits and dots from the string.
my_str = 'a3.1b4c'

# ['3', '.', '1', '4']
print(list(char for char in my_str if char in '0123456789.'))
The last step is to join the digits and the dot into a string.
my_str = 'a3.1b4c'

result = ''.join(char for char in my_str if char in '0123456789.')

print(result)  # '3.14'
The str.join() method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.
The string the method is called on is used as the separator between the elements.
For our purposes, we called the join() method on an empty string to join the digits and the dot without a separator.
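Keeping every dot can leave a result such as '1.2.3', which float() cannot parse. As a rough sketch, and assuming the goal is a number, the hypothetical helper below keeps only the first dot before converting. It still raises a ValueError if the text contains no digits.

```python
def to_float(text):
    """Hypothetical helper: keep digits and the first dot, then convert."""
    kept = ''.join(char for char in text if char in '0123456789.')

    # Split at the first dot and drop any later dots from the remainder
    whole, dot, fraction = kept.partition('.')
    cleaned = whole + dot + fraction.replace('.', '')

    return float(cleaned)


print(to_float('a3.1b4c'))  # 3.14
print(to_float('v1.2.3'))   # 1.23
```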
You can also use a for loop to remove the non-numeric characters from a string.
my_str = 'bo_1bby_2_ha_3_dz.com'

result = ''

for char in my_str:
    if char.isdigit():
        result += char

print(result)  # 123
We declared a new variable and initialized it to an empty string.
On each iteration of the for loop, we check if the current character is a digit.
If the condition is met, we add the character to the result variable.
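Note that str.isdigit() returns True for some characters outside 0-9, such as superscript digits. If only ASCII digits should be kept, one possible variation (a sketch, not the only option) is to test membership in string.digits instead.

```python
import string

# Hypothetical sample containing SUPERSCRIPT TWO (U+00B2)
sample = 'x\u00b2 = 49'

print('\u00b2'.isdigit())    # True
print('\u00b2'.isdecimal())  # False

ascii_only = ''

for char in sample:
    # string.digits is the string '0123456789'
    if char in string.digits:
        ascii_only += char

print(ascii_only)  # 49
```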
The same approach can be used to remove all non-numeric characters from a string except for the dot.
my_str = 'a3.1b4c'

result = ''

for char in my_str:
    if char.isdigit() or char == '.':
        result += char

print(result)  # 3.14
On each iteration, we check if the current character is a digit or a period.
If the condition is met, we add the character to the result variable.
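The two loops differ only in which extra characters they keep, so they could be folded into one reusable function. The sketch below is one way to do that; the helper keep_digits and its extra parameter are hypothetical names, and it keeps ASCII digits only.

```python
def keep_digits(text, extra=''):
    """Hypothetical helper: keep ASCII digits plus any characters in `extra`."""
    allowed = set('0123456789') | set(extra)
    kept = []

    for char in text:
        if char in allowed:
            kept.append(char)

    # Join once at the end instead of concatenating inside the loop
    return ''.join(kept)


print(keep_digits('bo_1bby_2_ha_3_dz.com'))         # 123
print(keep_digits('a3.1b4c', extra='.'))            # 3.14
print(keep_digits('tel: +1-555-0100', extra='+-'))  # +1-555-0100
```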