Borislav Hadzhiev
Mon Sep 20 2021·3 min read
Photo by Victoria Bilsborough
To download an S3 folder to your local file system, use the s3 cp AWS CLI
command, passing in the --recursive parameter. The s3 cp command takes the S3
source folder and the destination directory as inputs.
Create a folder on your local file system where you'd like to store the
downloads from the bucket, open your terminal in that directory and run the
s3 cp command.
First, we will run the command in test mode by setting the --dryrun
parameter, to verify that everything looks good.
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive --dryrun
The dry run prints the operations that would be performed without actually
downloading anything. If the output looks as expected, we can proceed.
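For a large folder, it can help to summarize the dry run instead of reading every line. The sketch below assumes that each planned download is printed on a line starting with "(dryrun) download:". The name count_planned_downloads is a hypothetical helper, not part of the AWS CLI.

```shell
# Hypothetical helper: count the files a dry run would download.
# Reads dry-run output on stdin and prints the number of lines that
# start with "(dryrun) download:" (assumed output format).
count_planned_downloads() {
  grep -c '^(dryrun) download:'
}

# Usage (requires configured AWS credentials):
#   aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive --dryrun | count_planned_downloads
```

If the count is far from what you expect, check the folder name before running the real download.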
Now let's run the s3 cp command in real mode and download the contents of
the S3 folder to our local file system.
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive
Replace the YOUR_BUCKET and YOUR_FOLDER placeholders with your bucket and
directory names.
In the command above, we've set the --recursive parameter. It makes the s3 cp
command apply to all files under the specified directory.
The . character signifies that the destination of the downloads is the
current directory.
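The two steps above (dry run first, then the real download) can be wrapped in a small shell function. This is a minimal sketch, not an official tool: s3_download_folder is a hypothetical name, and it assumes the AWS CLI is installed and configured. Any extra arguments, such as --dryrun, are passed through to s3 cp.

```shell
# Hypothetical wrapper around "aws s3 cp ... --recursive".
# Usage: s3_download_folder BUCKET FOLDER DEST [extra s3 cp flags...]
s3_download_folder() {
  if [ "$#" -lt 3 ]; then
    echo "usage: s3_download_folder BUCKET FOLDER DEST [flags...]" >&2
    return 2
  fi
  bucket="$1"
  folder="$2"
  dest="$3"
  shift 3

  # Create the destination directory if it does not exist yet.
  mkdir -p "$dest" || return 1

  # Remaining arguments ("$@") are forwarded unchanged, e.g. --dryrun.
  aws s3 cp "s3://$bucket/$folder" "$dest" --recursive "$@"
}

# Usage:
#   s3_download_folder YOUR_BUCKET YOUR_FOLDER ./downloads --dryrun   # preview
#   s3_download_folder YOUR_BUCKET YOUR_FOLDER ./downloads            # download
```

Note that the destination directory is created even when --dryrun is passed, because mkdir runs before the CLI call.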
To filter the files that are downloaded to our local file system from the
bucket, use the --include and --exclude parameters when running the s3 cp
command.
By default, all files in the folder are included. For example, to exclude all
files with a .jpg extension from the download, run the following command:
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive --exclude "*.jpg"
We can also exclude an entire directory from being downloaded:
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "movies/*"
The above command would exclude all files located in the movies/ directory,
e.g. movies/movie-1.mp4 and movies/movie-2.mp4.
Conversely, to only download .txt files we have to exclude everything and
include only files with the .txt extension:
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "*" --include "*.txt"
Note that the order matters when setting both the --exclude and --include
parameters.
The filters that appear later in the command have higher precedence. For
instance, the following command excludes all files because the --exclude
parameter comes last and overrides --include.
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --include "*.jpg" --exclude "*"
However, if we specify the --include parameter second, we would download all
.jpg files from the specified directory.
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "*" --include "*.jpg"
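The precedence rule can be illustrated with a small model in plain shell. This is a simplified sketch of the "later filter wins" behavior described above, not the AWS CLI's actual implementation. would_download is a hypothetical function, and the key is assumed to be a path relative to the source folder.

```shell
# Simplified model of --include/--exclude evaluation (illustration only).
# Usage: would_download KEY [--include PATTERN | --exclude PATTERN]...
# Returns 0 if KEY would be downloaded, 1 otherwise.
would_download() {
  key="$1"
  shift
  decision=include   # by default every file is included

  # Walk the filters left to right; each matching filter overwrites the
  # decision, so the last matching filter has the final say.
  while [ "$#" -ge 2 ]; do
    case "$key" in
      $2)  # unquoted on purpose so the pattern is treated as a glob
        if [ "$1" = "--exclude" ]; then
          decision=exclude
        else
          decision=include
        fi
        ;;
    esac
    shift 2
  done

  [ "$decision" = include ]
}

# Usage:
#   would_download photo.jpg --include "*.jpg" --exclude "*"   # returns 1 (skipped)
#   would_download photo.jpg --exclude "*" --include "*.jpg"   # returns 0 (downloaded)
```

To check real behavior against a bucket, run the actual command with --dryrun rather than relying on this model.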
Lastly, we can set the --exclude and --include parameters as many times as we
want. For instance, the following command downloads all .jpg and .png files
from the folder:
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "*" --include "*.jpg" --include "*.png"
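When the list of extensions grows, the repeated --include flags can be generated instead of typed by hand. The sketch below is one possible approach under the same assumptions as before (AWS CLI installed and configured). s3_download_by_ext is a hypothetical helper name.

```shell
# Hypothetical helper: download only files with the given extensions.
# Usage: s3_download_by_ext SOURCE DEST EXT [EXT...]
s3_download_by_ext() {
  if [ "$#" -lt 3 ]; then
    echo "usage: s3_download_by_ext SOURCE DEST EXT [EXT...]" >&2
    return 2
  fi
  src="$1"
  dest="$2"
  shift 2

  # Append one --include "*.EXT" pair per extension to the argument list,
  # then drop the original extension arguments from the front.
  n=$#
  for ext in "$@"; do
    set -- "$@" --include "*.$ext"
  done
  shift "$n"

  # Exclude everything first so that only the later includes match.
  aws s3 cp "$src" "$dest" --recursive --exclude "*" "$@"
}

# Usage:
#   s3_download_by_ext s3://YOUR_BUCKET/YOUR_FOLDER/ . jpg png
```

The --exclude "*" flag is placed before the generated --include flags on purpose, since later filters take precedence.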