This post contains affiliate links. If you use these links, I may earn a commission (at no cost to you). I only recommend products I use myself that solve a specific problem. In this post, I recommend DigitalOcean Spaces for S3-compatible storage with affordable, predictable pricing. Get started with a $200 credit when you use my referral.
While this blog post is written for DigitalOcean Spaces, if you're using another S3 storage provider (and it supports lifecycle rules - more on those shortly), this code should work!
Sometimes you only need to store files temporarily and want them to expire (and get deleted) after a certain amount of time.
For example, you might want to store logs for a year. Or create downloads or documents for users that are only stored for a few days - like a temp directory.
📁 temp/
├── 📄 first-file.png
├── 📄 second-file.jpg
├── 📄 third-file.pdf
└── 📄 ... more files
But you don't need those files forever. Forever is a long time, especially for file storage that you pay for every month! And that temp folder is going to keep growing and growing.
DigitalOcean Spaces is pretty cheap, but even so, getting rid of old files when you (or your users) don't need them anymore is good for the wallet - and the mind.
I originally thought I could just go in and delete old files periodically myself. "Why automate it?" I thought. I'll just clean up annually and it won't take long. News flash - I didn't.
A year goes by pretty quickly and these things just never get done. Or maybe they do get done, but not as often as you planned, and it's just another thing weighing on your mind. Or is that just me?
And for every month you don't delete those files, you're paying to store them.
Since I never got around to deleting files manually, I started wondering, "How can I automate this?"
My first thought was to set up a cron job on my server to run a function once a week or once a month. This function would check through all the files and delete any that were older than a certain date.
But that wasn't ideal, because running the function uses bandwidth and resources.
It would involve listing all the files, checking through them all, and then sending requests to delete each file that was too old. And while that might not take too long, it could take quite a few requests.
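For illustration, here's a rough sketch of what that scheduled cleanup function might have looked like with the AWS SDK for JavaScript (@aws-sdk/client-s3). The bucket name, the temp/ prefix, and the 30-day cutoff are all placeholder assumptions:

const { S3Client, ListObjectsV2Command, DeleteObjectCommand } = require("@aws-sdk/client-s3");

const s3 = new S3Client({
  endpoint: "https://nyc3.digitaloceanspaces.com",
  region: "nyc3",
  credentials: {
    accessKeyId: process.env.YOUR_KEY,
    secretAccessKey: process.env.YOUR_SECRET
  }
});

// delete everything under temp/ that's older than 30 days, one request per file
async function cleanUpTempFiles() {
  const cutoff = Date.now() - 30 * 24 * 60 * 60 * 1000;
  let ContinuationToken;
  do {
    // listing returns at most 1000 keys per request, so paginate
    const page = await s3.send(new ListObjectsV2Command({
      Bucket: "your-bucket-name",
      Prefix: "temp/",
      ContinuationToken
    }));
    for (const file of page.Contents ?? []) {
      if (file.LastModified.getTime() < cutoff) {
        await s3.send(new DeleteObjectCommand({
          Bucket: "your-bucket-name",
          Key: file.Key
        }));
      }
    }
    ContinuationToken = page.NextContinuationToken;
  } while (ContinuationToken);
}

// this is what the weekly or monthly cron job would run
cleanUpTempFiles();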
So what if I grouped my files into date folders?
📁 temp/
├── 📁 23-09-22/
│   ├── 📄 first-file.png
│   ├── 📄 second-file.jpg
│   └── 📄 third-file.pdf
└── 📁 23-09-21/
    ├── 📄 first-file.pdf
    ├── 📄 second-file.jpg
    └── 📄 third-file.jpg
Every day I could run a function that deletes the folder from 30 days earlier. Or 7 days. Or however long I want the files saved for.
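For illustration, building that kind of dated key at upload time could look something like this. The YY-MM-DD format and the datedKey helper are just assumptions to match the tree above:

// build a dated key like "temp/23-09-22/first-file.png"
function datedKey(filename, date = new Date()) {
  const yy = String(date.getFullYear()).slice(-2);
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `temp/${yy}-${mm}-${dd}/${filename}`;
}

console.log(datedKey("first-file.png")); // e.g. "temp/23-09-22/first-file.png"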
But even if I saved the files in dated directories, you can't just delete a folder in S3. Because S3 doesn't actually have folders, the directory structure shown above actually looks more like this in a flat file system like S3:
📄 /temp/23-09-22/first-file.png
📄 /temp/23-09-22/second-file.jpg
📄 /temp/23-09-22/third-file.pdf
📄 /temp/23-09-21/first-file.pdf
📄 /temp/23-09-21/second-file.jpg
📄 /temp/23-09-21/third-file.jpg
And because there are no folders, you still need to list the contents and then delete the files, up to 1,000 at a time.
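To make that concrete, here's a rough sketch of the batch approach: list the keys under one dated prefix, then delete them with DeleteObjectsCommand, which accepts up to 1,000 keys per request. The deleteDayFolder helper is hypothetical, and it assumes a client like the one in the sketch above:

const { ListObjectsV2Command, DeleteObjectsCommand } = require("@aws-sdk/client-s3");

// delete every object under one dated prefix, in batches of up to 1000 keys
// e.g. await deleteDayFolder(s3, "your-bucket-name", "temp/23-09-21/");
async function deleteDayFolder(s3, bucket, prefix) {
  let ContinuationToken;
  do {
    const page = await s3.send(new ListObjectsV2Command({
      Bucket: bucket,
      Prefix: prefix,
      ContinuationToken
    }));
    const keys = (page.Contents ?? []).map(({ Key }) => ({ Key }));
    if (keys.length > 0) {
      // each listing page holds at most 1000 keys, so it fits in one DeleteObjects request
      await s3.send(new DeleteObjectsCommand({
        Bucket: bucket,
        Delete: { Objects: keys }
      }));
    }
    ContinuationToken = page.NextContinuationToken;
  } while (ContinuationToken);
}

That would be better than the one-by-one approach above, but there's an even better way to do it…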
Lifecycle rules!
And luckily for me (and you if you're reading this!), DigitalOcean Spaces supports lifecycle rules.
And it turns out it's quicker to automate file deletion with a lifecycle rule than it is to figure out how to delete files in a cron job every week or month. You can set a lifecycle rule on the bucket and tell it to delete files that are older than a certain number of days, years, or whatever.
And that's good because:
- you don't need to remember to do it yourself
- you don't need to run a server or cron job to do it
- you can set it up and forget about it!
But - there's a catch. It's not possible to set up lifecycle rules in the web interface. So you're going to need to run a bit of code to get the lifecycle rule set up on your bucket.
How to set up S3 lifecycle rules with the AWS SDK in Node.js
Since we can't use the web interface, we'll use the AWS SDK to set up the lifecycle rule instead.
The good news is that since this code uses the AWS SDK for JavaScript S3 Client for Node.js (@aws-sdk/client-s3), it should be pretty compatible with most S3 providers that support lifecycle rules on buckets.
This code example sets an expiration of 30 days, so don't forget to change that if you want a longer or shorter expiration.
const { S3Client, PutBucketLifecycleConfigurationCommand } = require("@aws-sdk/client-s3");

// connect to Spaces
const s3 = new S3Client({
  endpoint: "https://nyc3.digitaloceanspaces.com",
  forcePathStyle: false,
  region: "nyc3",
  credentials: {
    accessKeyId: process.env.YOUR_KEY,
    secretAccessKey: process.env.YOUR_SECRET
  }
});

// create the lifecycle policy: expire objects 30 days after they're created
const command = new PutBucketLifecycleConfigurationCommand({
  Bucket: "your-bucket-name",
  LifecycleConfiguration: {
    Rules: [{
      ID: "autodelete_rule",
      Expiration: { Days: 30 },
      Status: "Enabled",
      Prefix: "" // unlike AWS, DigitalOcean requires this parameter
    }]
  }
});

// `await` only works inside an async function in CommonJS, so wrap the call
async function applyLifecycleRule() {
  try {
    await s3.send(command);
    console.log("Lifecycle policy enabled!");
  } catch (error) {
    console.error(error);
  }
}

applyLifecycleRule();
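If you want to double-check that the rule took effect, you can read it back with GetBucketLifecycleConfigurationCommand. A quick sketch, reusing the s3 client from above:

const { GetBucketLifecycleConfigurationCommand } = require("@aws-sdk/client-s3");

// fetch the bucket's lifecycle rules to confirm the new rule is in place
async function checkLifecycleRule() {
  const { Rules } = await s3.send(new GetBucketLifecycleConfigurationCommand({
    Bucket: "your-bucket-name"
  }));
  console.log(Rules); // should include autodelete_rule with Days: 30
}

checkLifecycleRule();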
Now, files get deleted automatically without any further action or brain space on my part! Nice!