Download a Folder or an Entire Bucket from AWS S3

Borislav Hadzhiev

Last updated: Feb 26, 2024
5 min

# Table of Contents

  1. How to Download a Folder from AWS S3
  2. Downloading an Entire S3 Bucket
  3. Downloading an S3 Bucket with Filters

# How to Download a Folder from AWS S3

Use the s3 cp command with the --recursive parameter to download an S3 folder to your local file system.

The s3 cp command takes the S3 source folder and the destination directory as inputs and downloads the folder.

Create a folder on your local file system where you'd like to store the downloads from the bucket, open your terminal in that directory, and run the s3 cp command.

First, set the --dryrun parameter to run the command in test mode. It shows which operations would be performed without actually downloading anything.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive --dryrun

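The dry run output lists every file that would be downloaded, each line prefixed with (dryrun), without transferring anything. If the list looks right, the command works as expected.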

Now let's run the s3 cp command in real mode to download the contents of the S3 folder to our local file system.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive

Replace the YOUR_BUCKET and YOUR_FOLDER placeholders with the names of your bucket and folder.

The --recursive parameter applies the s3 cp command to every file under the specified folder, including files in nested subfolders.

The . character signifies that the destination of the downloads is the current directory.
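
The two steps above (preview with --dryrun, then download) can be wrapped in a small shell function. This is only a sketch: the function name download_s3_folder and its three arguments are made up for illustration, and it assumes the AWS CLI is installed and configured with credentials that can read the bucket.

```shell
# Hypothetical helper (not part of the AWS CLI): preview, then download.
# Usage: download_s3_folder BUCKET FOLDER DEST_DIR
download_s3_folder() {
  if [ "$#" -ne 3 ]; then
    echo "usage: download_s3_folder BUCKET FOLDER DEST_DIR" >&2
    return 2
  fi
  bucket=$1
  folder=$2
  dest=$3

  # Create the local destination directory if it does not exist yet.
  mkdir -p "$dest" || return 1

  # Step 1: test mode. Lists what would be downloaded, transfers nothing.
  aws s3 cp "s3://$bucket/$folder" "$dest" --recursive --dryrun || return 1

  # Step 2: real mode. Only reached if the dry run succeeded.
  aws s3 cp "s3://$bucket/$folder" "$dest" --recursive
}
```

Calling download_s3_folder my-bucket photos ./photos (with my-bucket and photos standing in for real names) would first print the dry run listing and then perform the download into ./photos.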

# Set Filters when Downloading a Folder from AWS S3

Set the --include and --exclude parameters when running the cp command to filter the files that are downloaded to your local file system from a bucket.

The default behavior is that all files in the folder are included. For example, to exclude all files with a .jpg extension from being downloaded, run the following command.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER . --recursive --exclude "*.jpg"

We can also exclude an entire directory from being downloaded.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "movies/*"

The command excludes all files located in the movies/ directory, e.g. movies/movie-1.mp4 and movies/movie-2.mp4.

Conversely, to only download .txt files, we have to exclude everything and include only files with the .txt extension.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "*" --include "*.txt"

Note that the order is important when setting both the --exclude and --include parameters.

The filters that appear later in the command have higher precedence. For instance, the following command excludes all files because the --exclude parameter overrides --include.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --include "*.jpg" --exclude "*"

However, if we specify the --include parameter second, we would download all .jpg files from the specified directory.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "*" --include "*.jpg"

Lastly, we can set the --exclude and --include parameters as many times as necessary.

For instance, the following command downloads all .jpg and .png files from the specified S3 folder.

shell
aws s3 cp s3://YOUR_BUCKET/YOUR_FOLDER/ . --recursive --exclude "*" --include "*.jpg" --include "*.png"
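
To make the precedence rule concrete, here is a plain-shell simulation that never contacts S3. Treat it as an approximation: the function name is_downloaded is hypothetical, and it relies on the shell's own case pattern matching, which handles * and ? much like the CLI's filters but is not the CLI's actual implementation.

```shell
# Hypothetical simulation of --exclude/--include precedence (no AWS calls).
# Usage: is_downloaded KEY [--exclude PATTERN | --include PATTERN]...
# Every file starts out included. Each filter that matches overrides the
# decision made by the filters before it, so the last matching filter wins.
is_downloaded() {
  key=$1
  shift
  decision=include
  while [ "$#" -ge 2 ]; do
    case "$1" in
      --exclude) action=exclude ;;
      --include) action=include ;;
      *) echo "unknown filter: $1" >&2; return 2 ;;
    esac
    # $2 is left unquoted on purpose so that it is used as a glob pattern.
    case "$key" in
      $2) decision=$action ;;
    esac
    shift 2
  done
  [ "$decision" = include ]
}

# --exclude first, --include second: .jpg files are downloaded.
is_downloaded "cat.jpg" --exclude "*" --include "*.jpg" && echo "cat.jpg: downloaded"

# --include first, --exclude second: everything is excluded.
is_downloaded "cat.jpg" --include "*.jpg" --exclude "*" || echo "cat.jpg: skipped"
```

The first call reports the file as downloaded and the second as skipped, mirroring the two cp commands above that differ only in the order of their filters.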

# Downloading an Entire S3 Bucket

To download an entire bucket to your local file system, use the AWS CLI sync command (aws s3 sync s3://YOUR_BUCKET .), passing it the S3 bucket as the source and a directory on your file system as the destination.

The sync command recursively copies new and updated files from the source to the destination.

Let's first run the sync command in test mode by setting the --dryrun parameter. This demonstrates which files and directories would be downloaded to the local file system once the command is run.

shell
aws s3 sync s3://YOUR_BUCKET . --dryrun

Make sure to replace the YOUR_BUCKET placeholder with the name of your bucket. Otherwise, the command fails with an error such as AccessDenied.

The . character signifies the current directory. Create a directory that will store the files and folders of the S3 bucket and navigate to it before running the sync command in real mode.

Let's run the sync command to download all of the files and folders of the S3 bucket to the current directory on the local file system.

shell
aws s3 sync s3://YOUR_BUCKET .
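
A forgotten placeholder can be caught before the AWS CLI is ever called. The sketch below checks a bucket name against the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; a letter or digit at each end) and then runs the sync. Both function names are hypothetical, the check does not cover every S3 naming rule, and the wrapper assumes a configured AWS CLI.

```shell
# Hypothetical helper: basic sanity check for an S3 bucket name.
is_valid_bucket_name() {
  name=$1
  # Must be between 3 and 63 characters long.
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 63 ] || return 1
  # Remove every allowed character; anything left over is forbidden.
  leftover=$(printf '%s' "$name" | LC_ALL=C tr -d 'a-z0-9.-')
  [ -z "$leftover" ] || return 1
  # Must not start or end with a dot or a hyphen.
  case "$name" in
    [.-]*|*[.-]) return 1 ;;
  esac
  return 0
}

# Hypothetical wrapper: download a whole bucket into DEST_DIR.
# Usage: sync_bucket BUCKET DEST_DIR
sync_bucket() {
  bucket=$1
  dest=$2
  if ! is_valid_bucket_name "$bucket"; then
    echo "'$bucket' is not a valid bucket name (placeholder left in?)" >&2
    return 2
  fi
  # Create the destination directory, then download into it.
  mkdir -p "$dest" || return 1
  aws s3 sync "s3://$bucket" "$dest"
}
```

With this in place, sync_bucket YOUR_BUCKET ./backup stops with a local error message, because uppercase letters and underscores are not allowed in bucket names.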

# Downloading an S3 Bucket with Filters

To filter the files that are downloaded from the bucket to the local file system, make use of the --include and --exclude parameters when running the sync command, e.g. aws s3 sync s3://YOUR_BUCKET . --quiet --exclude "*.png".

By default, all files are included. To exclude all .png images from the downloaded files, run the following command:

shell
aws s3 sync s3://YOUR_BUCKET . --quiet --exclude "*.png"

The command downloads all of the bucket's files, except for files with the .png extension.

We also used the --quiet parameter to suppress the sync command's output.

Note that the --quiet parameter also suppresses any errors the command reports, which can make failures confusing and hard to debug.
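
If the goal is to hide per-file progress without losing error messages, the AWS CLI's s3 commands also accept an --only-show-errors parameter. The sketch below illustrates that alternative; the function name sync_without_noise and its arguments are hypothetical, and it assumes a configured AWS CLI.

```shell
# Hypothetical helper: sync quietly, but keep errors and warnings visible.
# Usage: sync_without_noise BUCKET DEST_DIR [extra aws s3 sync options...]
sync_without_noise() {
  if [ "$#" -lt 2 ]; then
    echo "usage: sync_without_noise BUCKET DEST_DIR [options...]" >&2
    return 2
  fi
  bucket=$1
  dest=$2
  shift 2
  # --only-show-errors hides normal output; errors and warnings still print.
  if aws s3 sync "s3://$bucket" "$dest" --only-show-errors "$@"; then
    echo "sync finished without errors"
  else
    rc=$?
    echo "sync failed with exit code $rc" >&2
    return "$rc"
  fi
}
```

For example, sync_without_noise my-bucket . --exclude "*.png" (my-bucket being a stand-in name) downloads everything except .png files and prints only problems plus a one-line summary.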

We can also exclude an entire directory from being downloaded.

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "images/*"

We excluded all of the files located in the images/ folder, e.g. images/cat.jpg, images/dog.png, etc.

Conversely, to download only .pdf files to the local file system, you would exclude everything and include only files with the .pdf extension.

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.pdf"

Note that the order is important when specifying both the --exclude and --include parameters.

Filters that appear later in the command have higher precedence. For example, the following command excludes all files because the --exclude parameter overrides --include.

shell
aws s3 sync s3://YOUR_BUCKET . --include "*.png" --exclude "*"

However, if we specify the --include parameter second, we would download all .png images to our local file system:

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.png"

Lastly, you can set the --exclude and --include parameters as many times as necessary. For example, the following command downloads all .txt and .pdf files from the bucket to the local file system.

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.txt" --include "*.pdf"
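
When the list of extensions varies, the repeated --include parameters can be generated instead of typed by hand. The following is a minimal sketch under a few assumptions: the function name sync_only_extensions is hypothetical, extensions are passed without the leading dot, and the AWS CLI is installed and configured.

```shell
# Hypothetical helper: download only files with the given extensions.
# Usage: sync_only_extensions BUCKET DEST_DIR EXT [EXT...]
sync_only_extensions() {
  if [ "$#" -lt 3 ]; then
    echo "usage: sync_only_extensions BUCKET DEST_DIR EXT [EXT...]" >&2
    return 2
  fi
  bucket=$1
  dest=$2
  shift 2

  # Rebuild the argument list as: --exclude "*" --include "*.EXT" ...
  # The exclude comes first so that the later includes take precedence.
  filters_started=no
  for ext in "$@"; do
    if [ "$filters_started" = no ]; then
      set -- --exclude "*"
      filters_started=yes
    fi
    set -- "$@" --include "*.$ext"
  done

  aws s3 sync "s3://$bucket" "$dest" "$@"
}
```

Calling sync_only_extensions my-bucket . txt pdf (my-bucket is a placeholder) builds the same command as the one above: aws s3 sync s3://my-bucket . --exclude "*" --include "*.txt" --include "*.pdf".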
