
How to delete an S3 folder recursively with aws-sdk in Node.js

Updated by Codemzy on March 5th, 2024

In this blog post, we will create a function to delete a folder or directory in AWS S3. Because folders don't actually exist in S3, the function will get the objects at the prefix, and recursively delete them all.

This post contains affiliate links. If you use these links, I may earn a commission (at no cost to you). I only recommend products I use myself that solve a specific problem. In this post, I recommend DigitalOcean Spaces for S3-compatible storage with affordable and predictable pricing. Get started with a $200 credit when you use my referral.

You've probably got your S3 buckets organised into folders. For example, if you let users upload files, you might have a structure like:

📁 user1/
├── 📁 project1/
│   ├── 📄 first-file.png
│   ├── 📄 second-file.jpg
│   └── 📄 third-file.pdf
└── 📁 project2/
    ├── 📄 first-file.pdf
    ├── 📄 second-file.jpg
    └── 📄 third-file.jpg

Let's imagine that user1 deletes project1. Since the project has been deleted, you don't need the files anymore. Let's delete all their file uploads so we can save on storage costs.

What we want to do is delete /user1/project1.

But there are no actual folders or directories in S3 buckets - just the concept of folders. What look like folders are really shared object prefixes, pretending to be directories.

Amazon S3 has a flat structure instead of a hierarchy like you would see in a file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. It does this by using a shared name prefix for objects (that is, objects have names that begin with a common string).

- AWS Organizing objects in the Amazon S3 console using folders

So that folder structure we think we have isn't the reality of how our objects are stored.

// folder-based file system ❌
📁 user1/
├── 📁 project1/
│   ├── 📄 first-file.png
│   ├── 📄 second-file.jpg
│   └── 📄 third-file.pdf
└── 📁 project2/
    ├── 📄 first-file.pdf
    ├── 📄 second-file.jpg
    └── 📄 third-file.jpg

// S3 flat file system ✅
📄 /user1/project1/first-file.png
📄 /user1/project1/second-file.jpg
📄 /user1/project1/third-file.pdf
📄 /user1/project2/first-file.pdf
📄 /user1/project2/second-file.jpg
📄 /user1/project2/third-file.jpg

Since folders don't really exist, there is no DeleteFolder or DeleteDirectory command. Instead, we'll need to delete all of the objects with the folder prefix (in this case, user1/project1).

Node.js @aws-sdk setup

If you already have @aws-sdk installed and configured, you can skip to the next section. Let's start by installing the latest version (v3.x) of the AWS SDK.

npm install @aws-sdk/client-s3

Now we will configure it with an S3-compatible service.

I'm a big fan of (and currently use) DigitalOcean Spaces as my S3-compatible object storage provider. So my setup looks like this:

const { S3Client } = require('@aws-sdk/client-s3');

const s3 = new S3Client({
  endpoint: "https://ams3.digitaloceanspaces.com",
  forcePathStyle: false,
  region: "ams3",
  credentials: {
    accessKeyId: process.env.S3_KEY,
    secretAccessKey: process.env.S3_SECRET
  }
});

If you use AWS directly, the setup is similar, but it will look more like this:

const { S3Client } = require('@aws-sdk/client-s3');

const s3 = new S3Client({
  region:'eu-west-1',
  credentials: {
    accessKeyId: process.env.S3_KEY,
    secretAccessKey: process.env.S3_SECRET
  }
});

I use DigitalOcean Spaces because as an indie developer, I value the predictable pricing and I found it much easier to get started compared to using AWS directly. I'd recommend it to a friend - you can get a $200 credit to try out DigitalOcean Spaces here.

In both the examples above, you will switch the region to wherever your buckets are located, and pass in your own credentials.

Ok, now we have S3 set up in Node.js, let's start coding a deleteFolder function.

function deleteFolder() {
  // we will add the code here
};

ListObjects

We can't delete folders (since they don't exist) - we can only delete objects. So to delete the folder, what we need to do instead is delete all of the objects in that folder* path.

*When I use the word folder in S3 going forward, what I really mean is the path before the filename - also known as the prefix.

Basically, any object with the prefix /user1/project1/... needs to get in the bin.

We can give the deleteFolder function a location argument, so we can pass the path of whatever folder we want to delete - e.g. deleteFolder("/user1/project1/").

function deleteFolder(location) {
  // we will add the code here
};

And we will use [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) (the latest version of ListObjects) to get a list of all the objects at that location.

Since sending the command is asynchronous, we will make the deleteFolder function async so we can await the response when we send the command to our s3 client (that we set up earlier).

const { S3Client, ListObjectsV2Command } = require('@aws-sdk/client-s3');

// const s3 = new S3Client({ ...

async function deleteFolder(location) {
  const listCommand = new ListObjectsV2Command({
    Bucket: "your-bucket", // the bucket
    Prefix: location, // the 'folder'
  });
  let list = await s3.send(listCommand); // get the list
};
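
If you log the list response, you'll see a Contents array (each item includes the object's Key), along with a KeyCount and - when there are more results to fetch - a NextContinuationToken. Trimmed down, the response looks something like this (using the keys from our earlier example):

console.log(list);
// {
//   KeyCount: 3,
//   Contents: [
//     { Key: "/user1/project1/first-file.png", ... },
//     { Key: "/user1/project1/second-file.jpg", ... },
//     { Key: "/user1/project1/third-file.pdf", ... }
//   ],
//   IsTruncated: false
// }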

Ok, now we have our list of objects, it's time to do what we came for and delete them!

DeleteObjects

To delete the objects with the folder - now that we have the list of the objects - we can use the DeleteObjectsCommand.

We will need to pass in a list of Keys to delete (each Key is the full path to an object within our folder). We can get this from the ListObjectsV2Command response, which is stored in the list variable.

list.Contents.map((item) => ({ Key: item.Key }));
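
For our example folder, that map gives us an array in the shape DeleteObjectsCommand expects:

// [
//   { Key: "/user1/project1/first-file.png" },
//   { Key: "/user1/project1/second-file.jpg" },
//   { Key: "/user1/project1/third-file.pdf" }
// ]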

This list will be passed to the DeleteObjectsCommand to tell it what files to delete.

We will only get Contents in the response if some items exist in the folder, so we will wrap the DeleteObjectsCommand in an if statement, to only run if there are files to delete.

const { S3Client, ListObjectsV2Command, DeleteObjectsCommand } = require('@aws-sdk/client-s3');

// const s3 = new S3Client({ ...

async function deleteFolder(location) {
  // get the files
  const listCommand = new ListObjectsV2Command({
    Bucket: "your-bucket", 
    Prefix: location,
  });
  let list = await s3.send(listCommand);
  if (list.KeyCount) { // if items to delete
    // delete the files
    const deleteCommand = new DeleteObjectsCommand({
      Bucket: "your-bucket",
      Delete: {
        Objects: list.Contents.map((item) => ({ Key: item.Key })), // array of keys to be deleted
        Quiet: false, // provides info on successful deletes
      },
    });
    let deleted = await s3.send(deleteCommand); // delete the files
    // log any errors deleting files
    if (deleted.Errors) {
      deleted.Errors.map((error) => console.log(`${error.Key} could not be deleted - ${error.Code}`));
    }
    return `${deleted.Deleted.length} files deleted.`;
  }
};
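
Here's a rough sketch of calling it (the path is just an example, and this assumes the client and bucket set up above):

deleteFolder("/user1/project1/")
  .then((result) => console.log(result)) // e.g. "3 files deleted."
  .catch((error) => console.error(error));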

Make it recursive

So far our code works (yay!), but it will only delete the first 1,000 objects in a folder. And that's because ListObjectsV2 only returns up to 1,000 objects, and DeleteObjects will only delete up to 1,000 objects.

Returns some or all (up to 1,000) of the objects in a bucket with each request.

- AWS ListObjectsV2

That might be fine for your use case, but if there could be more than 1,000 objects in a folder path, we need to do a little more work to delete them all.

Let's make it recursive, just in case!

If ListObjectsV2 returns a NextContinuationToken we know there are more objects to fetch. So we can wrap all of our code inside a recursiveDelete function, and call it again after we have deleted the first 1,000 files, to fetch the next batch of keys.

async function deleteFolder(location) {
  let bucket = "your-bucket"; // your bucket name
  let count = 0; // number of files deleted
  async function recursiveDelete(token) {
    // get the files
    const listCommand = new ListObjectsV2Command({
      Bucket: bucket, 
      Prefix: location,
      ContinuationToken: token
    });
    let list = await s3.send(listCommand);
    if (list.KeyCount) { // if items to delete
      // delete the files
      const deleteCommand = new DeleteObjectsCommand({
        Bucket: bucket,
        Delete: {
          Objects: list.Contents.map((item) => ({ Key: item.Key })),
          Quiet: false,
        },
      });
      let deleted = await s3.send(deleteCommand);
      count += deleted.Deleted.length;
      // log any errors deleting files
      if (deleted.Errors) {
        deleted.Errors.map((error) => console.log(`${error.Key} could not be deleted - ${error.Code}`));
      }
    }
    // repeat if more files to delete (return so the promise only resolves after the last batch)
    if (list.NextContinuationToken) {
      return recursiveDelete(list.NextContinuationToken);
    }
    // return total deleted count when finished
    return `${count} files deleted.`;
  };
  // start the recursive function
  return recursiveDelete();
};

I've added a count variable to keep track of the total number of files deleted. Notice that when there's a NextContinuationToken we return the recursiveDelete call - that way the promise only resolves with the final count once the last batch has been deleted.

And that's it! You can now delete S3 folders from Node.js with the deleteFolder function.

Add a bucket argument

So far, we hard-coded the bucket variable.

let bucket = "your-bucket"; // your bucket name

Since you might be working with multiple buckets in your service, let's instead have the deleteFolder function take a second bucket argument.

I'll pass an object as a parameter to the deleteFolder function so we don't have to worry about the order.

async function deleteFolder({ bucket, location }) {
  let count = 0; // number of files deleted
  async function recursiveDelete(token) {
    // get the files
    const listCommand = new ListObjectsV2Command({
      Bucket: bucket, 
      Prefix: location,
      ContinuationToken: token
    });
    let list = await s3.send(listCommand);
    if (list.KeyCount) { // if items to delete
      // delete the files
      const deleteCommand = new DeleteObjectsCommand({
        Bucket: bucket,
        Delete: {
          Objects: list.Contents.map((item) => ({ Key: item.Key })),
          Quiet: false,
        },
      });
      let deleted = await s3.send(deleteCommand);
      count += deleted.Deleted.length;
      // log any errors deleting files
      if (deleted.Errors) {
        deleted.Errors.map((error) => console.log(`${error.Key} could not be deleted - ${error.Code}`));
      }
    }
    // repeat if more files to delete (return so the promise only resolves after the last batch)
    if (list.NextContinuationToken) {
      return recursiveDelete(list.NextContinuationToken);
    }
    // return total deleted count when finished
    return `${count} files deleted.`;
  };
  // start the recursive function
  return recursiveDelete();
};

If you only plan to use one bucket, you can still set a default value, so you only need to pass the bucket argument if you switch from the default.

async function deleteFolder({ bucket = "your-bucket", location }) {
  // ...
};
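
Calling it now looks like this (the bucket name and path are placeholders):

// uses the default bucket
deleteFolder({ location: "/user1/project1/" });

// or override the default for another bucket
deleteFolder({ bucket: "another-bucket", location: "/user1/project1/" });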

Final Code

const { S3Client, ListObjectsV2Command, DeleteObjectsCommand } = require('@aws-sdk/client-s3');

// s3 client
const s3 = new S3Client({
  region: "your-region",
  credentials: {
    accessKeyId: process.env.S3_KEY,
    secretAccessKey: process.env.S3_SECRET
  }
});

// delete all files in a folder on s3
async function deleteFolder({ bucket, location }) {
  let count = 0; // number of files deleted
  async function recursiveDelete(token) {
    // get the files
    const listCommand = new ListObjectsV2Command({
      Bucket: bucket, 
      Prefix: location,
      ContinuationToken: token
    });
    let list = await s3.send(listCommand);
    if (list.KeyCount) { // if items to delete
      // delete the files
      const deleteCommand = new DeleteObjectsCommand({
        Bucket: bucket,
        Delete: {
          Objects: list.Contents.map((item) => ({ Key: item.Key })),
          Quiet: false,
        },
      });
      let deleted = await s3.send(deleteCommand);
      count += deleted.Deleted.length;
      // log any errors deleting files
      if (deleted.Errors) {
        deleted.Errors.map((error) => console.log(`${error.Key} could not be deleted - ${error.Code}`));
      }
    }
    // repeat if more files to delete (return so the promise only resolves after the last batch)
    if (list.NextContinuationToken) {
      return recursiveDelete(list.NextContinuationToken);
    }
    // return total deleted count when finished
    return `${count} files deleted.`;
  };
  // start the recursive function
  return recursiveDelete();
};
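
And here's a minimal sketch of using the final function inside an async function (the wrapper name, bucket, and path are all hypothetical):

// e.g. when a user deletes a project
async function handleProjectDelete() {
  try {
    let result = await deleteFolder({ bucket: "your-bucket", location: "/user1/project1/" });
    console.log(result); // e.g. "1234 files deleted."
  } catch (error) {
    console.error(error); // e.g. network or permissions errors
  }
};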

If you want your files to be deleted automatically after a certain amount of time, you can use lifecycle rules in S3 instead!
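
As a rough sketch, you can set a lifecycle rule with the PutBucketLifecycleConfigurationCommand - here expiring everything under a temp/ prefix 30 days after upload (the prefix and expiry are just examples, and lifecycle support can vary between S3-compatible providers):

const { PutBucketLifecycleConfigurationCommand } = require('@aws-sdk/client-s3');

const lifecycleCommand = new PutBucketLifecycleConfigurationCommand({
  Bucket: "your-bucket",
  LifecycleConfiguration: {
    Rules: [
      {
        ID: "expire-temp-files", // example rule name
        Filter: { Prefix: "temp/" }, // only applies to this 'folder'
        Status: "Enabled",
        Expiration: { Days: 30 }, // delete 30 days after upload
      },
    ],
  },
});

await s3.send(lifecycleCommand); // inside an async function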