Here is an interesting use case I worked on recently: I had to process about 4 TB of 360° panoramic images stored in AWS S3 and generate tile images for them.

I had a Lambda function listening for s3:ObjectCreated events on an S3 bucket, which would process each image and generate the tiles. So all I had to do was copy the existing images from the original bucket to a temporary bucket and let the Lambda function pick up the S3 events.
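The handler side of that pipeline can be sketched roughly like this. The `generate_tiles` function is a hypothetical stand-in for the actual tiling work (which would use boto3 and an imaging library); the event parsing, though, follows the shape of S3 notification records:

```python
import urllib.parse

def generate_tiles(bucket, key):
    # Hypothetical placeholder: download the panorama from `bucket`,
    # cut it into tiles, and upload the tiles back to S3.
    pass

def handler(event, context):
    """Entry point invoked for each s3:ObjectCreated event."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+'), so decode first.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        generate_tiles(bucket, key)
        processed.append(key)
    return processed
```

One invocation can carry multiple records, which is why the handler loops rather than assuming a single object per event.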

Pretty simple .. eh?

Well, here is the catch. I needed to control the rate at which the objects were copied, to make sure I didn't push it through the roof and make the Lambda function throttle. By default, an AWS account is limited in how many Lambda invocations can run in parallel - only 1000 concurrent executions per account per region.
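A quick back-of-the-envelope check shows why the copy rate matters. By Little's law, the number of concurrent executions is roughly the arrival rate times the average processing duration; the numbers below are illustrative, not from my actual workload:

```python
def concurrent_executions(invocations_per_second, avg_duration_seconds):
    # Little's law: in-flight work = arrival rate x average time in system.
    return invocations_per_second * avg_duration_seconds

# e.g. copying 100 objects/s, with each image taking ~15 s to tile,
# would need ~1500 concurrent executions - already over the 1000 limit.
needed = concurrent_executions(100, 15)
```

So even a modest per-object processing time means the copy rate has to stay well under what the CLI would happily push by default.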

I was using the AWS CLI to copy the images from one bucket to another, and pretty soon I hit that limit and the Lambda functions started to throttle.


Luckily, the AWS CLI has S3 transfer configuration settings for concurrency, which I could easily tweak to fit my needs.

Setting max_concurrent_requests in your AWS config (~/.aws/config):

[default]
s3 =
  max_concurrent_requests = 500
  max_queue_size = 10000
  use_accelerate_endpoint = true
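If you'd rather not edit the file by hand, the same settings can be written with `aws configure set`, and then the bulk copy kicked off with `aws s3 sync`. The bucket names here are placeholders:

```shell
# Equivalent to editing ~/.aws/config under the [default] profile.
aws configure set default.s3.max_concurrent_requests 500
aws configure set default.s3.max_queue_size 10000

# Start the bulk copy; every object landing in the temporary bucket
# fires an s3:ObjectCreated event and triggers the Lambda function.
aws s3 sync s3://source-bucket s3://temp-bucket
```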

All I had to do was tune the max_concurrent_requests value, and after a little trial and error while monitoring the results, I was able to control the number of objects transferred per second and keep it within limits.
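One way to do that monitoring from the command line is to watch the Lambda `Throttles` metric in CloudWatch while tuning; the function name and time window below are placeholders:

```shell
# Sum throttled invocations over 5-minute windows for one function.
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Throttles \
  --dimensions Name=FunctionName,Value=generate-tiles \
  --start-time 2019-01-01T00:00:00Z \
  --end-time 2019-01-01T01:00:00Z \
  --period 300 \
  --statistics Sum
```

A non-zero sum means the copy rate is still too high for the concurrency limit.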


While in my case I wanted to throttle the number of objects being copied, tweaking the same settings would also let you copy objects much faster for a different use case. If your machine has the resources to spawn more threads, you can increase the value of max_concurrent_requests and have the objects copied much faster.


Hope it's helpful for someone out there.