Join URL path components when constructing a URL in Python

Borislav Hadzhiev

Last updated: Jun 18, 2022

Use the urljoin() function from the urllib.parse module to join URL path components when constructing a URL, e.g. urljoin('/global/images/', 'static/nature.webp'). The function constructs a full URL by combining a base URL with another URL.

main.py
from urllib.parse import urljoin

# 👇️ /global/images/static/nature.webp
print(urljoin('/global/images/', 'static/nature.webp'))

urllib.parse.urljoin() takes a base URL and another URL as parameters. The second URL is resolved relative to the base, the same way a browser resolves a link found on a page.
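
The same resolution applies when the base is a full address. The sketch below is only an illustration: the example.com and cdn.example.org hosts are placeholders chosen for this example. It shows a relative path being appended to the base and an absolute URL replacing the base altogether.

```python
from urllib.parse import urljoin

# Placeholder base URL, used only for illustration
base = 'https://example.com/global/images/'

# A relative path is resolved against the base
relative_result = urljoin(base, 'static/nature.webp')
print(relative_result)  # 👉️ https://example.com/global/images/static/nature.webp

# An absolute URL as the second argument replaces the base entirely
absolute_result = urljoin(base, 'https://cdn.example.org/nature.webp')
print(absolute_result)  # 👉️ https://cdn.example.org/nature.webp
```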

If you have multiple URL components, use the posixpath module to join them before passing them to the urljoin() method.

main.py
import posixpath
from urllib.parse import urljoin

path_1 = 'images'
path_2 = 'static'
path_3 = 'nature.webp'

path = posixpath.join(path_1, path_2, path_3)
print(path)  # 👉️ images/static/nature.webp

result = urljoin('https://example.com', path)
print(result)  # 👉️ https://example.com/images/static/nature.webp

When joining URL path components with urljoin(), check that the output is what you expect.

The result can be surprising when the first URL component doesn't end with a forward slash /.

main.py
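from urllib.parse import urljoin

# 👇️ /global/static/nature.webp
print(urljoin('/global/images', 'static/nature.webp'))

Notice that urljoin() dropped images from the first component before appending the second one. Without a trailing slash, the last segment of the base is treated as a file name rather than a directory, so it gets replaced.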

The function behaves as expected when the first component ends with a forward slash.

main.py
from urllib.parse import urljoin

# 👇️ /global/images/static/nature.webp
print(urljoin('/global/images/', 'static/nature.webp'))
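
If you can't control whether the base ends with a slash, one option is to normalize it before joining. This is a minimal sketch, not part of the standard library: join_under is a hypothetical helper name, and it assumes the base should always be treated as a directory and carries no query string or fragment.

```python
from urllib.parse import urljoin


def join_under(base, relative):
    """Hypothetical helper: treat base as a directory, then join."""
    # Add a trailing slash only when one is missing
    if not base.endswith('/'):
        base += '/'
    return urljoin(base, relative)


without_slash = join_under('/global/images', 'static/nature.webp')
with_slash = join_under('/global/images/', 'static/nature.webp')

print(without_slash)  # 👉️ /global/images/static/nature.webp
print(with_slash)  # 👉️ /global/images/static/nature.webp
```

Both calls return the same path because the helper makes the trailing slash explicit before urljoin() sees the base.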

You might also notice confusing behavior if the second component starts with a forward slash.

main.py
from urllib.parse import urljoin

# 👇️ /static/nature.webp
print(urljoin('/global/images/', '/static/nature.webp'))

When the second component starts with a forward slash, it is treated as a path that starts at the root, so it replaces the entire path of the first component.
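
If the second component is meant to be relative despite its leading slash, you could strip the slash before joining. This is a sketch under that assumption only; the variable names are illustrative, and stripping is not appropriate when the component really is a root-relative path.

```python
from urllib.parse import urljoin

base = '/global/images/'
component = '/static/nature.webp'

# Removing leading slashes makes the component relative to the base
joined = urljoin(base, component.lstrip('/'))
print(joined)  # 👉️ /global/images/static/nature.webp
```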

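The posixpath.join() function is a bit more predictable and can also be used to join URL path components. It keeps the last segment of the first component whether or not it ends with a forward slash. Like urljoin(), it still discards everything before a component that starts with a forward slash.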

main.py
import posixpath

# 👇️ /global/images/static/nature.webp
print(posixpath.join('/global/images', 'static/nature.webp'))

# 👇️ /global/images/static/nature.webp
print(posixpath.join('/global/images/', 'static/nature.webp'))

# 👇️ /static/nature.webp
print(posixpath.join('/global/images', '/static/nature.webp'))

The posixpath.join() function also accepts more than two path components.

main.py
import posixpath

# 👇️ /global/images/static/nature.webp
print(posixpath.join('/global', 'images', 'static', 'nature.webp'))
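
Because posixpath.join() only works on plain path strings, one way to use it with a full URL is to split the URL first and join just the path portion. The following is a rough sketch: append_path is a hypothetical helper, and the URL and query string are made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def append_path(url, *segments):
    """Hypothetical helper: append path segments to a URL, keeping its query."""
    parts = urlsplit(url)
    # Join only the path portion; scheme, host and query stay untouched
    new_path = posixpath.join(parts.path or '/', *segments)
    return urlunsplit(parts._replace(path=new_path))


extended = append_path(
    'https://example.com/global?size=large', 'images', 'nature.webp'
)
print(extended)  # 👉️ https://example.com/global/images/nature.webp?size=large
```

Joining on the path alone avoids the query string ending up in the middle of the result, which is what would happen if the whole URL string were passed to posixpath.join().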