# Python InvalidURL: URL can't contain control characters

Borislav Hadzhiev

Last updated: Apr 10, 2024

The Python urllib error "http.client.InvalidURL: URL can't contain control characters." occurs when the URL to which you're making a request contains a control character, most commonly a space.

To solve the error, replace the spaces with %20 or simply remove them.
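Before choosing a fix, it can help to see exactly which characters are the problem. The following is a minimal sketch, not part of the original article: the helper name `find_disallowed_chars` is hypothetical, and the pattern is an assumption written to cover the space and the ASCII control characters that `http.client` rejects.

```python
import re

# Assumption: ASCII control characters (0x00-0x1f), the space (0x20) and DEL (0x7f)
# are the characters that trigger the error. The pattern is written out here
# for illustration rather than imported from http.client.
DISALLOWED = re.compile(r'[\x00-\x20\x7f]')


def find_disallowed_chars(url):
    """Return (index, character) pairs for every offending character in url."""
    return [(match.start(), match.group()) for match in DISALLOWED.finditer(url)]


problems = find_disallowed_chars('http://www.python.org/ab cd')
print(problems)  # [(24, ' ')]
```

An empty list means the URL contains none of these characters.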

Here is the relevant part of the stack trace.

shell
```shell
  File "/usr/lib/python3.10/http/client.py", line 1235, in _validate_host
    raise InvalidURL(f"URL can't contain control characters. {host!r} "
http.client.InvalidURL: URL can't contain control characters. 'www.pyt hon.org' (found at least ' ')
```

Here is an example of how the error occurs.

main.py
```python
import urllib.request

url = 'http://www.python.org/ab cd'

# ⛔️ http.client.InvalidURL: URL can't contain control characters. '/ab cd' (found at least ' ')
with urllib.request.urlopen(url) as f:
    print(f.read(300))
```

Notice that the URL contains a space in the path.
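The same check can be triggered without sending a request, which is handy when you only want to validate a URL. This is a rough sketch that assumes a Python version whose `http.client` performs this validation. The helper name `validation_error` is hypothetical. It relies on `HTTPConnection` validating the host when the object is created and on `putrequest()` validating the path before anything is sent.

```python
import http.client


def validation_error(host, path):
    """Return the InvalidURL message for host/path, or None if both pass."""
    try:
        # Creating the connection object validates the host; no socket is opened yet.
        conn = http.client.HTTPConnection(host)
        # putrequest() validates the path and only buffers the request line.
        conn.putrequest('GET', path)
    except http.client.InvalidURL as err:
        return str(err)
    return None


print(validation_error('www.python.org', '/ab cd'))    # space in the path
print(validation_error('www.pyt hon.org', '/'))        # space in the host
print(validation_error('www.python.org', '/ab%20cd'))  # None
```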

# Replace the spaces in the URL with %20

URLs cannot contain literal spaces. When a URL is encoded, a space becomes %20; in query strings that use form encoding, it is often written as a plus (+) instead.
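The two encodings correspond to two functions in `urllib.parse`. As a small illustration, assuming you only need to encode a single value, `quote()` produces %20 and `quote_plus()` produces a plus:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = 'ab cd'

percent_encoded = quote(value)    # space becomes %20
plus_encoded = quote_plus(value)  # space becomes +

print(percent_encoded)  # ab%20cd
print(plus_encoded)     # ab+cd

# Only unquote_plus() turns the plus back into a space.
print(unquote(plus_encoded))       # ab+cd
print(unquote_plus(plus_encoded))  # ab cd
```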

You can use the str.replace() method to replace all spaces in the URL with %20 to resolve the error.

main.py
```python
import urllib.request

url = 'http://www.python.org/ab cd'

url = url.replace(' ', '%20')

with urllib.request.urlopen(url) as f:
    print(f.read(300))
```

The code for this article is available on GitHub

If you print the value of the url variable, you will see that the space has been replaced with %20.

main.py
```python
url = 'http://www.python.org/ab cd'

url = url.replace(' ', '%20')

print(url)  # 👉️ http://www.python.org/ab%20cd
```

You should only replace spaces with %20 if your URL contains spaces in its path or query string.

In some cases, you might have to remove the spaces from the URL string by replacing them with empty strings.

main.py
```python
import urllib.request

url = 'http://www.pyt hon.org/'

url = url.replace(' ', '')

with urllib.request.urlopen(url) as f:
    print(f.read(300))
```

Encoded spaces (%20) can appear in the path or query string of a URL, but never in the domain. A domain that contains a space makes the URL invalid.

In this case, you should replace all spaces with an empty string.

If you print the value of the url variable, you will see that it doesn't contain any spaces.

main.py
```python
import urllib.request

url = 'http://www.pyt hon.org/'

url = url.replace(' ', '')

print(url)  # 👉️ http://www.python.org/
```
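Both rules can be combined so that each part of the URL gets the right treatment. The sketch below is one possible approach, not the article's own code: the helper name `fix_spaces` is hypothetical, and it assumes that spaces in the domain are typos to remove while spaces elsewhere should become %20.

```python
from urllib.parse import urlsplit, urlunsplit


def fix_spaces(url):
    """Remove spaces from the domain and percent-encode them everywhere else."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme,
        parts.netloc.replace(' ', ''),       # a domain can never contain spaces
        parts.path.replace(' ', '%20'),      # spaces in the path become %20
        parts.query.replace(' ', '%20'),     # same for the query string
        parts.fragment.replace(' ', '%20'),  # and for the fragment
    ))


fixed = fix_spaces('http://www.pyt hon.org/ab cd')
print(fixed)  # http://www.python.org/ab%20cd
```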

The str.replace() method takes the following parameters:

| Name | Description |
| --- | --- |
| old | The substring we want to replace in the string |
| new | The replacement for each occurrence of old |
| count | Only the first count occurrences are replaced (optional) |

Note that the method doesn't change the original string. Strings are immutable in Python.
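A short illustration of the `count` parameter and of the fact that the original string is left untouched (the variable names are only for this example):

```python
original = 'a b c d'

# Replace every space.
all_spaces = original.replace(' ', '%20')

# Replace only the first occurrence.
first_only = original.replace(' ', '%20', 1)

print(all_spaces)  # a%20b%20c%20d
print(first_only)  # a%20b c d
print(original)    # a b c d  (unchanged, replace() returned new strings)
```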

# Using the urlparse and quote methods to encode the URL

You can also use the urlparse() and quote() functions from the urllib.parse module to encode the URL.

main.py
```python
import urllib.request
from urllib.parse import urlparse, quote

url = 'http://www.python.org/ab cd ef'

parsed_url = urlparse(url)

url = parsed_url.scheme + '://' + parsed_url.netloc + \
    quote(parsed_url.path)

if parsed_url.query:
    # Keep '=' and '&' unescaped so the key=value pairs stay intact.
    url += '?' + quote(parsed_url.query, safe='=&')

print(url)  # 👉️ http://www.python.org/ab%20cd%20ef
```

The URL string in the example contains spaces in its path, so we used the urlparse() function from the urllib.parse module to split the URL before encoding the path.

The urlparse function takes a URL and parses it into six components.

main.py
```python
from urllib.parse import urlparse, quote

url = 'http://www.python.org/ab cd ef'

parsed_url = urlparse(url)

# ParseResult(
#   scheme='http', netloc='www.python.org',
#   path='/ab cd ef', params='', query='', fragment='')
print(parsed_url)
```

| Attribute Name | Description |
| --- | --- |
| scheme | The protocol, e.g. http or https |
| netloc | The network location (the host, plus the port if one is given), e.g. www.python.org |
| path | The hierarchical path |
| params | The parameters for the last path element |
| query | The query string |
| fragment | The fragment identifier (the part after #) |
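To see all six attributes populated, here is a small example with a made-up URL (the URL itself is only for illustration) that has a port, path parameters, a query string and a fragment:

```python
from urllib.parse import urlparse

# Hypothetical URL chosen so that every component is non-empty.
parts = urlparse('https://www.python.org:443/a/b;v=1?x=1#top')

print(parts.scheme)    # https
print(parts.netloc)    # www.python.org:443
print(parts.path)      # /a/b
print(parts.params)    # v=1
print(parts.query)     # x=1
print(parts.fragment)  # top

# The result also exposes the host and port separately.
print(parts.hostname)  # www.python.org
print(parts.port)      # 443
```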

The quote function takes a string and replaces special characters in the string using the %xx escape.

The function is mainly used for quoting the path section of a URL.

main.py
```python
from urllib.parse import urlparse, quote

url = 'http://www.python.org/ab cd ef'

parsed_url = urlparse(url)

# 👇️ /ab%20cd%20ef
print(quote(parsed_url.path))
```
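The reason for quoting only the path becomes clear when `quote()` is applied to a whole URL. By default only `/` is treated as safe, so the colon after the scheme gets escaped as well. The example below illustrates this, and shows the `safe` parameter as one possible workaround for simple URLs:

```python
from urllib.parse import quote

url = 'http://www.python.org/ab cd'

whole = quote(url)            # the ':' after the scheme is escaped too
kept = quote(url, safe=':/')  # keep ':' and '/' as literal characters

print(whole)  # http%3A//www.python.org/ab%20cd
print(kept)   # http://www.python.org/ab%20cd
```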

The code sample uses the addition (+) operator to concatenate the URL components and replaces the spaces in the path with %20.

main.py
```python
import urllib.request
from urllib.parse import urlparse, quote

url = 'http://www.python.org/ab cd ef'

parsed_url = urlparse(url)

url = parsed_url.scheme + '://' + parsed_url.netloc + \
    quote(parsed_url.path)

if parsed_url.query:
    # Keep '=' and '&' unescaped so the key=value pairs stay intact.
    url += '?' + quote(parsed_url.query, safe='=&')

print(url)  # 👉️ http://www.python.org/ab%20cd%20ef
```

We also check if the URL has a query string, in which case we append it to the URL string and escape the spaces if necessary.
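An alternative for the query string is to re-encode it key by key instead of quoting it as one string. The following is a sketch under a few assumptions: the helper name `encode_url` is hypothetical, the input URL is not already percent-encoded (otherwise the `%` signs would be escaped a second time), and spaces in query values should become %20 rather than a plus.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path, query values and fragment of an unencoded URL."""
    parts = urlsplit(url)

    # Split the query into (key, value) pairs, then encode each one separately
    # so that '=' and '&' keep their meaning as delimiters.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(pairs, quote_via=quote)

    return urlunsplit((
        parts.scheme,
        parts.netloc,
        quote(parts.path),
        query,
        quote(parts.fragment),
    ))


encoded = encode_url('http://www.python.org/ab cd?q=hello world&page=1')
print(encoded)  # http://www.python.org/ab%20cd?q=hello%20world&page=1
```

Passing `quote_via=quote` to `urlencode()` is what produces %20 here; its default, `quote_plus`, would write the spaces as plus signs instead.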

Copyright © 2025 Borislav Hadzhiev