Download an Entire S3 Bucket - Complete Guide

Borislav Hadzhiev

Last updated: Jul 26, 2022

Table of Contents

  1. Downloading an Entire S3 Bucket
  2. Downloading an S3 Bucket with Filters

Downloading an Entire S3 Bucket

To download an entire bucket to your local file system, use the AWS CLI sync command, passing it the S3 bucket as the source and a directory on your file system as the destination, for example: aws s3 sync s3://YOUR_BUCKET . (the trailing dot is the destination directory).

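The sync command recursively copies the contents of the source to the destination. It only transfers files that are new or have changed, so running it again does not download everything a second time.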

Let's first run the sync command in test mode by setting the --dryrun parameter. This shows which files and directories would be downloaded to the local file system, without actually transferring anything.

shell
aws s3 sync s3://YOUR_BUCKET . --dryrun

Make sure to replace the YOUR_BUCKET placeholder with the name of your bucket, otherwise the command fails with an error such as AccessDenied.

The . character signifies the current directory. Create a directory that will store the files and folders of the S3 bucket and navigate to it before running the sync command without --dryrun.

Let's run the sync command to download all of the files and folders of the S3 bucket to the current directory on the local file system.

shell
aws s3 sync s3://YOUR_BUCKET .
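
If you would rather not cd into the target directory first, the two steps can be wrapped in a small function. This is only a minimal sketch: download_bucket is a hypothetical helper name (not an AWS CLI command), and it assumes the aws command is installed and configured.

```shell
# Hypothetical helper: create a target directory and sync a bucket into it.
# Usage: download_bucket YOUR_BUCKET ./bucket-backup
download_bucket() {
  bucket=$1
  dest=$2
  # mkdir -p succeeds even if the directory already exists
  mkdir -p "$dest" || return 1
  # Passing the directory explicitly avoids having to cd into it first
  aws s3 sync "s3://$bucket" "$dest"
}
```

Calling download_bucket YOUR_BUCKET ./bucket-backup should leave the bucket's contents in ./bucket-backup instead of the current directory.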

Downloading an S3 Bucket with Filters

To filter the files that are downloaded from the bucket to the local file system, make use of the --include and --exclude parameters when running the sync command, e.g. aws s3 sync s3://YOUR_BUCKET . --quiet --exclude "*.png".

By default, all files are included. To exclude all .png images from the downloaded files, run the following command:

shell
aws s3 sync s3://YOUR_BUCKET . --quiet --exclude "*.png"

The command downloads all of the bucket's files, except for files with the .png extension.

We also used the --quiet parameter to suppress the sync command's output.

Keep in mind that the --quiet parameter also suppresses any error messages the command prints, which can make failures confusing and hard to debug.
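
If hiding errors is a concern, the s3 commands also accept an --only-show-errors parameter, which hides the per-file progress lines but still prints errors and warnings. The wrapper below is a minimal sketch of how that might be combined with an exit status check. The function name sync_errors_only is hypothetical, and the sketch assumes the aws command is installed and configured.

```shell
# Hypothetical wrapper: sync with only errors and warnings printed,
# and report the exit status if the sync fails.
sync_errors_only() {
  bucket=$1
  dest=$2
  if aws s3 sync "s3://$bucket" "$dest" --only-show-errors; then
    echo "sync finished"
  else
    # In the else branch, $? still holds the exit status of the aws command
    status=$?
    echo "sync failed with exit status $status" >&2
    return "$status"
  fi
}
```

You would call it as sync_errors_only YOUR_BUCKET . and check the message it prints.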

We can also exclude an entire directory from being downloaded.

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "images/*"

We excluded all of the files located in the images/ folder, e.g. images/cat.jpg, images/dog.png, etc.

Conversely, to download only .pdf files to the local file system, you would exclude everything and include only files with the .pdf extension.

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.pdf"

Note that the order is important when specifying both the --exclude and --include parameters.

Filters that appear later in the command have higher precedence. For example, the following command excludes all files, because the --exclude parameter comes last and overrides the earlier --include.

shell
aws s3 sync s3://YOUR_BUCKET . --include "*.png" --exclude "*"

However, if we specify the --include parameter second, we download only the .png images to our local file system:

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.png"
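
To make the precedence rule concrete, here is a minimal local sketch that mimics it without calling AWS. The would_transfer function is a hypothetical helper, not part of the AWS CLI. It relies on the shell's own glob matching in case, which is only an approximation of how the CLI matches patterns.

```shell
# Hypothetical helper: decide whether a key would be transferred,
# applying the filters in order so that the last matching one wins.
would_transfer() {
  key=$1
  shift
  decision=include            # by default, all files are included
  while [ "$#" -ge 2 ]; do
    case "$1" in
      --exclude)
        # an unquoted $2 in a case pattern is treated as a glob
        case "$key" in $2) decision=exclude ;; esac ;;
      --include)
        case "$key" in $2) decision=include ;; esac ;;
    esac
    shift 2
  done
  echo "$decision"
}

would_transfer cat.png --include "*.png" --exclude "*"     # prints: exclude
would_transfer cat.png --exclude "*" --include "*.png"     # prints: include
would_transfer notes.txt --exclude "*" --include "*.png"   # prints: exclude
```

The first call ends up excluded because the catch-all --exclude is evaluated last, which matches the behavior of the two sync commands above.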

Lastly, you can set the --exclude and --include parameters as many times as necessary. For example, the following command downloads all .txt and .pdf files from the bucket to the local file system.

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.txt" --include "*.pdf"
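
If the list of extensions changes often, the repeated --include parameters can be generated instead of typed out. This is a rough sketch under a few assumptions: download_extensions is a hypothetical helper name, the extensions are passed without the leading dot, and the aws command is installed and configured.

```shell
# Hypothetical helper: download only files with the given extensions.
# Usage: download_extensions YOUR_BUCKET . txt pdf
download_extensions() {
  bucket=$1
  dest=$2
  shift 2
  for ext in "$@"; do
    # Append one --include per extension, then drop the original argument
    set -- "$@" --include "*.$ext"
    shift
  done
  # Exclude everything first so that only the included patterns remain
  aws s3 sync "s3://$bucket" "$dest" --exclude "*" "$@"
}
```

Calling download_extensions YOUR_BUCKET . txt pdf should run the same command as the one above.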
