Download an Entire S3 Bucket - Complete Guide

Borislav Hadzhiev

Mon Sep 20 2021 · 3 min read

Table of Contents #

  1. Downloading an Entire S3 Bucket
  2. Downloading an S3 Bucket with Filters

Downloading an Entire S3 Bucket #

To download an entire bucket to your local file system, use the AWS CLI sync command, passing in the S3 bucket as the source and a directory on your file system as the destination.

The sync command recursively copies new and updated files from the source to the destination, so running it again only transfers what has changed.

Let's first run the sync command in test mode by setting the --dryrun parameter. This shows which files and directories would be downloaded to your local file system, without actually downloading anything.

shell
aws s3 sync s3://YOUR_BUCKET . --dryrun

Make sure to replace the YOUR_BUCKET placeholder with the name of your bucket, otherwise the command fails (for example, with an AccessDenied error).
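If the bucket is large, the dry-run output can be long. As a rough sketch, you could pipe it through a small helper that counts the planned downloads. The count_planned_downloads function is a hypothetical name, and the sketch assumes that each planned transfer is printed on its own line starting with "(dryrun) download:".

```shell
# Hypothetical helper: count the planned downloads in dry-run output read from stdin.
# Assumes each planned transfer is printed as "(dryrun) download: <source> to <target>".
count_planned_downloads() {
  # grep -c prints the number of matching lines; it exits with status 1
  # when nothing matches, so "|| true" keeps the helper's status at 0.
  grep -c '^(dryrun) download:' || true
}

# Example usage (commented out because it needs the AWS CLI and credentials):
# aws s3 sync s3://YOUR_BUCKET . --dryrun | count_planned_downloads
```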

The . character signifies the current directory. Create a directory that will store the files and folders of the S3 bucket and navigate to it before running the sync command in real mode.
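A minimal sketch of that preparation step is shown below. The bucket-download directory name is only an assumption for illustration; any name works.

```shell
# "bucket-download" is a hypothetical directory name; pick any name you like.
mkdir -p bucket-download   # -p avoids an error if the directory already exists
cd bucket-download
pwd                        # confirm where the files will be downloaded
```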

Let's run the sync command to download all of the files and folders of the S3 bucket to the current directory of our local file system:

shell
aws s3 sync s3://YOUR_BUCKET .

Downloading an S3 Bucket with Filters #

To filter the files that are downloaded from the bucket to our local file system, use the --include and --exclude parameters when executing the sync command.

By default all files are included. To exclude all .png images from the downloaded files, run the following command:

shell
aws s3 sync s3://YOUR_BUCKET . --quiet --exclude "*.png"

The above command downloads all of the bucket's files, except for files with the .png extension.

In the call to the sync command we also used the --quiet parameter to suppress its output.

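Note that the --quiet parameter also suppresses any error messages the command prints, which can make failures confusing and hard to debug. If you only want to hide the progress output, the --only-show-errors parameter still displays errors and warnings.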
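Because nothing is printed in quiet mode, one way to notice a failure is to check the exit status of the command. The sketch below wraps the call in a hypothetical quiet_sync function. It assumes the AWS CLI still returns a non-zero exit status on failure when --quiet is set.

```shell
# Hypothetical wrapper: run a quiet sync and report failure based on the exit status.
# Assumes a failed sync still exits with a non-zero status when --quiet is set.
# Usage: quiet_sync SOURCE DESTINATION
quiet_sync() {
  if aws s3 sync "$1" "$2" --quiet; then
    echo "sync finished"
  else
    status=$?   # exit status of the aws command above
    echo "sync failed with exit status $status" >&2
    return "$status"
  fi
}

# Example usage (commented out because it needs the AWS CLI and credentials):
# quiet_sync s3://YOUR_BUCKET .
```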

We can also exclude an entire directory from being downloaded:

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "images/*"

In the command above, we've excluded all of the files located in the images/ folder, for example images/cat.jpg, images/dog.png, etc.

Conversely, to only download .pdf files to the local file system, you would exclude everything and only include files with the .pdf extension:

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.pdf"

The order matters when you specify both the --exclude and --include parameters.

Filters that appear later in the command take precedence over earlier ones. For example, the following command excludes all files, because the --exclude parameter comes last and overrides --include:

shell
aws s3 sync s3://YOUR_BUCKET . --include "*.png" --exclude "*"

However, if we specify the --include parameter second, we would download all .png images to our local file system:

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.png"
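To make the "later filters win" rule concrete, here is a simplified model of it in plain shell. The would_download function is a hypothetical helper, not part of the AWS CLI, and it only approximates the matching with shell glob patterns. It walks the filters from left to right and lets each matching filter overwrite the previous decision.

```shell
# Hypothetical helper, not part of the AWS CLI: a simplified model of how a
# chain of --exclude/--include filters is evaluated for a single key.
# Usage: would_download KEY [--exclude PATTERN | --include PATTERN]...
would_download() {
  key=$1
  shift
  included=yes   # by default all files are included
  while [ "$#" -ge 2 ]; do
    case $key in
      $2)        # unquoted on purpose, so the pattern is matched as a glob
        if [ "$1" = "--exclude" ]; then included=no; else included=yes; fi
        ;;
    esac
    shift 2      # move on to the next filter; later matches overwrite earlier ones
  done
  [ "$included" = yes ]
}

# --exclude comes last, so it wins and the file is skipped.
would_download "cat.png" --include "*.png" --exclude "*" || echo "cat.png is skipped"
# --include comes last, so it wins and the file is downloaded.
would_download "cat.png" --exclude "*" --include "*.png" && echo "cat.png is downloaded"
```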

Lastly, you can set the --exclude and --include parameters as many times as you want. For example, the following command downloads all .txt and .pdf files from the bucket to our local file system:

shell
aws s3 sync s3://YOUR_BUCKET . --exclude "*" --include "*.txt" --include "*.pdf"
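If the list of extensions changes often, you could build the repeated --include parameters in a small wrapper. This is only a sketch: sync_only_extensions is a hypothetical function name, and it simply assembles the same kind of command as the one above.

```shell
# Hypothetical helper: download only files with the given extensions.
# Usage: sync_only_extensions BUCKET DESTINATION EXT [EXT...]
sync_only_extensions() {
  bucket=$1
  dest=$2
  shift 2
  # Replace each extension argument with an "--include *.EXT" pair:
  # append the pair to the end, then drop the extension from the front.
  for ext in "$@"; do
    set -- "$@" --include "*.$ext"
    shift
  done
  # Exclude everything first, then re-include the selected extensions.
  aws s3 sync "s3://$bucket" "$dest" --exclude "*" "$@"
}

# Example usage (commented out because it needs the AWS CLI and credentials):
# sync_only_extensions YOUR_BUCKET . txt pdf
```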
