Check if string contains only letters, numbers, hyphens and underscores

Borislav Hadzhiev

Last updated: Sep 19, 2022


Use the re.match() method to check if a string contains only letters, numbers, hyphens and underscores, e.g. re.match(r'^[a-zA-Z0-9_-]*$', string). The re.match() method will return a match object if the string contains only the allowed characters, otherwise None is returned.

main.py
```python
import re


def validate(string):
    return re.match(r'^[a-zA-Z0-9_-]*$', string)


# 👇️ <re.Match object; span=(0, 18), match='bobby-hadz-com_123'>
print(validate('bobby-hadz-com_123'))

# 👇️ <re.Match object; span=(0, 0), match=''>
print(validate(''))

# 👇️ None
print(validate('bobbyhadz.com'))

# 👇️ None
print(validate(' '))
```

If you're looking for a non-regex solution, scroll down to the next subheading.

The re.match method returns a match object if the provided regular expression matches at the beginning of the string.

The match method returns None if the string doesn't match the regex pattern.
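As a small illustration of the two possible return values, the sketch below inspects a match object and checks for None explicitly. The variable names are placeholders chosen for this example.

```python
import re

pattern = r'^[a-zA-Z0-9_-]*$'

match = re.match(pattern, 'bobby-hadz-com_123')

if match is not None:
    print(match.group())  # 👉️ bobby-hadz-com_123
    print(match.span())  # 👉️ (0, 18)

no_match = re.match(pattern, 'bobbyhadz.com')

print(no_match is None)  # 👉️ True
```

Comparing with `is not None` makes the intent explicit, although a plain `if match:` works as well because match objects are always truthy.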

The first argument we passed to the re.match method is a regular expression.

The square brackets [] are used to indicate a set of characters.

main.py
```python
import re


def validate(string):
    return re.match(r'^[a-zA-Z0-9_-]*$', string)


if validate('bobby-hadz-com_123'):
    # 👇️ this runs
    print('The string contains only letters, numbers, hyphens and underscores')
else:
    print('The string does NOT contain only letters, numbers, hyphens and underscores')
```

The a-z and A-Z ranges match lowercase and uppercase ASCII letters.

The 0-9 characters represent a range of digits from 0 to 9.
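Because the ranges are spelled out explicitly, only ASCII letters and digits are accepted. The sketch below shows this with a couple of sample inputs; the function name is made up for this illustration.

```python
import re


def is_ascii_slug(string):
    # Only a-z, A-Z, 0-9, underscore and hyphen are in the set
    return bool(re.match(r'^[a-zA-Z0-9_-]+$', string))


print(is_ascii_slug('cafe'))  # 👉️ True
print(is_ascii_slug('café'))  # 👉️ False ('é' is outside a-z)
print(is_ascii_slug('١٢٣'))  # 👉️ False (Arabic-Indic digits are outside 0-9)
```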

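The last two characters are an underscore and a hyphen. The hyphen is placed last so that it is treated as a literal hyphen rather than as part of a range.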
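The position of the hyphen inside the square brackets matters. The following sketch compares a hyphen placed last, an escaped hyphen, and a hyphen that accidentally forms an invalid range. The variable names are illustrative only.

```python
import re

# Hyphen placed last: treated as a literal hyphen
literal_last = bool(re.match(r'^[a-z_-]+$', 'a-b_c'))
print(literal_last)  # 👉️ True

# Escaped hyphen: also literal, wherever it appears in the set
literal_escaped = bool(re.match(r'^[a-z\-_]+$', 'a-b_c'))
print(literal_escaped)  # 👉️ True

# Hyphen between two characters: parsed as a range.
# '_-9' is an invalid range because '_' sorts after '9'.
range_failed = False

try:
    re.compile(r'[a-z_-9]')
except re.error as exc:
    range_failed = True
    print('Invalid pattern:', exc)
```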

You can tweak the regex by adding characters between the square brackets.
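For example, here is a sketch that also allows dots and spaces. The function name and the extra characters are assumptions made for this illustration; note that the hyphen stays at the end of the set.

```python
import re


def validate_extended(string):
    # Same set as before, plus a dot and a space (hyphen kept last)
    return bool(re.match(r'^[a-zA-Z0-9_. -]+$', string))


print(validate_extended('bobbyhadz.com'))  # 👉️ True
print(validate_extended('bobby hadz'))  # 👉️ True
print(validate_extended('bobby@hadz'))  # 👉️ False
```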

The caret ^ matches the start of the string and the dollar sign $ matches the end of the string (or the position just before a trailing newline).
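To see how the anchors treat a string that ends in a newline, this sketch compares re.match() with re.fullmatch(), which requires the entire string to match and so needs no anchors. The variable names are placeholders.

```python
import re

with_newline = 'bobby_hadz\n'

# $ matches just before the trailing newline, so this passes
anchored = bool(re.match(r'^[a-zA-Z0-9_-]+$', with_newline))
print(anchored)  # 👉️ True

# fullmatch() must consume the whole string, including '\n'
full = bool(re.fullmatch(r'[a-zA-Z0-9_-]+', with_newline))
print(full)  # 👉️ False

full_clean = bool(re.fullmatch(r'[a-zA-Z0-9_-]+', 'bobby_hadz'))
print(full_clean)  # 👉️ True
```

If input may arrive with a trailing newline (for example, lines read from a file), re.fullmatch() is one way to reject it.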

The asterisk * matches the preceding character set (letters, numbers, hyphens and underscores) zero or more times.

Because the asterisk * allows zero repetitions, the regex also matches an empty string.

If you don't want to get a match for empty strings, use the + character instead of an asterisk.

main.py
```python
import re


def validate(string):
    return re.match(r'^[a-zA-Z0-9_-]+$', string)


# 👇️ <re.Match object; span=(0, 18), match='bobby-hadz-com_123'>
print(validate('bobby-hadz-com_123'))

print(validate(''))  # 👉️ None

print(validate('bobbyhadz.com'))  # 👉️ None

print(validate(' '))  # 👉️ None
```

The plus + causes the regular expression to match 1 or more repetitions of the preceding character set.

You can use the bool() class if you'd rather return a boolean from the function instead of a match object or None.

main.py
```python
import re


def validate(string):
    return bool(re.match(r'^[a-zA-Z0-9_-]+$', string))


print(validate('bobby-hadz-com_123'))  # 👉️ True

print(validate(''))  # 👉️ False

print(validate('bobbyhadz.com'))  # 👉️ False

print(validate(' '))  # 👉️ False
```

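The bool() class takes a value and converts it to a boolean (True or False). A match object is always truthy and None is falsy, so the function returns True for a match and False otherwise.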

If you ever need help reading or writing a regular expression, consult the Regular Expression Syntax section of the official re module documentation.

The page contains a list of all of the special characters with many useful examples.

Check if string contains only letters, numbers, hyphens and underscores using issubset()

To check if a string contains only letters, numbers, hyphens and underscores:

  1. Use the string module to construct a string of the allowed characters.
  2. Use the set() class to convert the string to a set object.
  3. Use the issubset() method to check if the string only contains the specified characters.
main.py
```python
import string

allowed_characters = string.ascii_letters + string.digits + '_-'

# 👇️ abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-
print(allowed_characters)

if set('bobby-hadz-com_123').issubset(allowed_characters):
    # 👇️ this runs
    print('The string contains only letters, numbers, hyphens and underscores')
else:
    print('The string does NOT contain only letters, numbers, hyphens and underscores')

# 👇️ True
print(set('bobby-hadz-com_123').issubset(allowed_characters))

# 👇️ False
print(set('a b ! @').issubset(allowed_characters))
```

We used the string module to construct a string containing all ASCII letters and digits and added an underscore and a hyphen.

We then used the set() class to convert the string to a set.

Set objects are an unordered collection of unique elements.

The set.issubset() method tests whether every element of the set is in the provided iterable.

The condition evaluates to True only if all of the characters of the string are present in the allowed_characters string. An empty string also passes the check because an empty set is a subset of every set.
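The check can be wrapped in a small helper function. This is a minimal sketch: the names ALLOWED and validate, and the choice to reject empty strings, are assumptions made for this example.

```python
import string

# Build the set once so it is not recreated on every call
ALLOWED = set(string.ascii_letters + string.digits + '_-')


def validate(text):
    # bool(text) rejects the empty string, which issubset() alone accepts
    return bool(text) and set(text).issubset(ALLOWED)


print(set('').issubset(ALLOWED))  # 👉️ True

print(validate(''))  # 👉️ False
print(validate('bobby-hadz-com_123'))  # 👉️ True
print(validate('bobbyhadz.com'))  # 👉️ False
```

Drop the bool(text) part if empty strings should be considered valid.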

You can use the addition (+) operator to append more characters to the string of allowed characters.
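As a short illustration, the sketch below appends a dot and a space to the allowed characters. The extended_characters name and the extra characters are chosen only for this example.

```python
import string

allowed_characters = string.ascii_letters + string.digits + '_-'

# Append a dot and a space to the allowed characters
extended_characters = allowed_characters + '. '

before = set('bobbyhadz.com').issubset(allowed_characters)
print(before)  # 👉️ False

after = set('bobbyhadz.com').issubset(extended_characters)
print(after)  # 👉️ True
```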
