Last updated: Apr 8, 2024
Use the urljoin() function from the urllib.parse module to join a base URL with another URL.
The urljoin() function constructs a full (absolute) URL by combining a base URL with another URL.
    from urllib.parse import urljoin

    base_url = 'https://bobbyhadz.com'
    path = 'images/static/cat.jpg'

    # Join a base URL with another URL
    result = urljoin(base_url, path)

    # Result: https://bobbyhadz.com/images/static/cat.jpg
    print(result)

    # ---------------------------------------

    # Join URL path components when constructing a URL
    # Result: /global/images/static/dog.png
    print(urljoin('/global/images/', 'static/dog.png'))
If you have multiple URL components, use the posixpath module to join them before passing the result to urljoin().
    import posixpath
    from urllib.parse import urljoin

    base_url = 'https://bobbyhadz.com'
    path_1 = 'images'
    path_2 = 'static'
    path_3 = 'cat.jpg'

    path = posixpath.join(path_1, path_2, path_3)
    print(path)  # Result: images/static/cat.jpg

    result = urljoin(base_url, path)

    # Result: https://bobbyhadz.com/images/static/cat.jpg
    print(result)
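The two steps can be wrapped in a small function. The sketch below is only an illustration: join_url is a hypothetical helper name, not part of the standard library, and it assumes the base URL has no path of its own (or ends with a forward slash).

```python
import posixpath
from urllib.parse import urljoin


def join_url(base_url, first, *rest):
    """Join a base URL with one or more path components.

    Hypothetical helper for illustration; not a standard library function.
    """
    # Build the relative path first, e.g. 'images/static/cat.jpg'
    path = posixpath.join(first, *rest)
    # Then resolve that path against the base URL
    return urljoin(base_url, path)


result = join_url('https://bobbyhadz.com', 'images', 'static', 'cat.jpg')
print(result)  # https://bobbyhadz.com/images/static/cat.jpg
```

Requiring at least one component in the signature avoids calling posixpath.join() with no arguments, which would raise a TypeError.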
The urllib.parse.urljoin() function takes a base URL and another URL (usually a relative one) as parameters and constructs a full (absolute) URL by resolving the second against the first.
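To make "combining" more concrete, here is a short sketch of how urljoin() resolves a few kinds of second argument. The example.com, other.example and cdn.example.com URLs are placeholders chosen for illustration and do not come from the article.

```python
from urllib.parse import urljoin

# Placeholder base URL, used only for illustration
base = 'https://example.com/docs/guide/index.html'

# A relative path replaces the last segment of the base path
sibling = urljoin(base, 'intro.html')

# '..' moves up one directory before the path is resolved
parent = urljoin(base, '../api.html')

# An absolute URL is returned as-is; the base is ignored
absolute = urljoin(base, 'https://other.example/page')

# A scheme-relative URL keeps only the scheme of the base
scheme_relative = urljoin(base, '//cdn.example.com/app.js')

print(sibling)          # https://example.com/docs/guide/intro.html
print(parent)           # https://example.com/docs/api.html
print(absolute)         # https://other.example/page
print(scheme_relative)  # https://cdn.example.com/app.js
```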
You can also use urljoin() to join URL path components when constructing a URL.
    from urllib.parse import urljoin

    # Join URL path components
    # Result: /global/images/static/dog.png
    print(urljoin('/global/images/', 'static/dog.png'))
Make sure the output is what you expect, because urljoin() can be confusing when the first component doesn't end in a forward slash (/).
Here is an example.
    from urllib.parse import urljoin

    # Result: /global/static/dog.png
    print(urljoin('/global/images', 'static/dog.png'))
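Notice that urljoin() dropped images from the first component before appending the second one. Without a trailing slash, the last segment of the base path is treated like a file name and is replaced by the relative path.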
The function behaves as expected when the first component ends with a forward slash.
    from urllib.parse import urljoin

    # Result: /global/images/static/dog.png
    print(urljoin('/global/images/', 'static/dog.png'))
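If you can't control whether the first component ends with a forward slash, one option is to add the slash yourself before joining. The sketch below assumes the first component always names a directory rather than a file; join_under is a hypothetical helper name.

```python
from urllib.parse import urljoin


def join_under(base, relative):
    """Join `relative` beneath `base`, keeping the last base segment.

    Hypothetical helper; assumes `base` refers to a directory.
    """
    # Add a trailing slash only when it is missing
    if not base.endswith('/'):
        base += '/'
    return urljoin(base, relative)


without_slash = join_under('/global/images', 'static/dog.png')
with_slash = join_under('/global/images/', 'static/dog.png')

print(without_slash)  # /global/images/static/dog.png
print(with_slash)     # /global/images/static/dog.png
```

This would be the wrong choice if the base may point to a file such as index.html, because the file name would then be kept as a directory in the result.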
You might also notice confusing behavior if the second component starts with a forward slash.
    from urllib.parse import urljoin

    # Result: /static/dog.png
    print(urljoin('/global/images', '/static/dog.png'))
When the second component starts with a forward slash, it is treated as an absolute path that starts at the root, so it replaces the entire path of the first component.
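The same rule applies when the first argument is a full URL. As a rough sketch (the example.com base is a placeholder), stripping the leading slash with str.lstrip() keeps the join relative to the base path, assuming that is what you want.

```python
from urllib.parse import urljoin

# Placeholder base URL, used only for illustration
base = 'https://example.com/global/images/'

# Leading slash: the path is resolved from the site root
from_root = urljoin(base, '/static/dog.png')

# Without the leading slash, the path is resolved under the base path
relative = urljoin(base, '/static/dog.png'.lstrip('/'))

print(from_root)  # https://example.com/static/dog.png
print(relative)   # https://example.com/global/images/static/dog.png
```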
The posixpath.join() function is a bit more predictable and can also be used to join URL path components.
    import posixpath

    # Result: /global/images/static/dog.png
    print(posixpath.join('/global/images', 'static/dog.png'))

    # Result: /global/images/static/dog.png
    print(posixpath.join('/global/images/', 'static/dog.png'))

    # Result: /static/dog.png
    print(posixpath.join('/global/images', '/static/dog.png'))
The posixpath.join() function can also be passed more than 2 paths.
    import posixpath

    # Result: /global/images/static/dog.png
    print(posixpath.join('/global', 'images', 'static', 'dog.png'))
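Keep in mind that posixpath.join() also restarts from any later component that begins with a forward slash. If your components may carry stray slashes, a possible approach is to strip them first. In the sketch below, join_path is a hypothetical helper that assumes the result should always be a root-relative path.

```python
import posixpath


def join_path(*parts):
    """Join path components, ignoring stray slashes between them.

    Hypothetical helper for illustration; always returns a path
    that starts with a single forward slash.
    """
    # Strip surrounding slashes and drop components that end up empty
    cleaned = [part.strip('/') for part in parts if part.strip('/')]
    if not cleaned:
        return '/'
    # Re-add a single leading slash so the result is root-relative
    return '/' + posixpath.join(*cleaned)


# A later component with a leading slash discards the earlier ones
plain = posixpath.join('/global', '/images', 'dog.png')

# The helper keeps every component regardless of stray slashes
safe = join_path('/global/', '/images', 'static/', '/dog.png')

print(plain)  # /images/dog.png
print(safe)   # /global/images/static/dog.png
```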