Introduction
In this article we will discuss how to work around the API Gateway timeout, but first I have to admit there is no way to directly increase it: at the time of this writing, AWS API Gateway enforces a hard limit of 30 seconds, so we will talk about working around that limitation instead.
I have been working in the software industry for over 23 years now as a software architect, manager, developer and engineer. I am a machine learning and crypto enthusiast with an emphasis on security, and I have experience in various industries such as entertainment, broadcasting, healthcare, security, education, retail and finance.
I’ll explain the best way to deal with it, but first let me explain why Amazon imposes this limit. They simply want the user to have a good experience; it makes no sense for a client to sit in their browser for minutes waiting for an API request to complete. As such, our solution focuses on keeping the user happy while completing the long-running task in the background, either in a Lambda or on EC2 if you are using a middleware proxy.
If you would like to familiarize yourself with what API Gateway is and how it is used, you can read more about it on Amazon’s AWS API Gateway page. It has a lot of information about the service’s limitations, why they are in place and what the best practices are.
Using a status flag
Now that we understand the problem, let’s discuss how to solve it. I will start with a method I refer to as the ‘status’ approach. In layman’s terms, this means we introduce a variable called status, kept either in a database or in a store like Redis. This status acts as the connecting point between our Lambda execution and the web or app portal.
More specifically, one way to deal with this is to use two Lambdas. The first Lambda takes the task at hand and simply schedules it for execution. How it gets scheduled is beyond the scope of this article, but some high-level options are using a queue system like SQS or having the Lambda invoke itself. Once the first Lambda has scheduled the task, it can immediately send the web client an API response similar to this:
{
"status": "SCHEDULED"
}
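As a minimal sketch, the first, scheduling Lambda could look like the following. This assumes a Python Lambda and an SQS queue; the queue URL is a placeholder, and the SQS client is injected (in production you would pass `boto3.client("sqs")`) so the snippet stays self-contained:

```python
import json
import uuid

# Placeholder queue URL; replace with your own SQS queue.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/long-tasks"

def schedule_handler(event, context, sqs_client=None):
    """First Lambda: enqueue the task and return immediately."""
    task_id = str(uuid.uuid4())  # id the client will poll with later
    if sqs_client is not None:
        # A worker Lambda consumes this queue and performs the
        # actual long-running task, recording progress as it goes.
        sqs_client.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({"task_id": task_id, "payload": event}),
        )
    return {
        "statusCode": 202,
        "body": json.dumps({"status": "SCHEDULED", "task_id": task_id}),
    }
```

The handler returns within milliseconds, well inside the 30-second window, while the heavy work happens elsewhere.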
The web client in return now has to poll the same API periodically to check the status of the background task, which is executing in parallel. While the task is still running, the status-check Lambda can inspect the queue (or the state of the self-invocation) via a status flag kept in a persistent store like a database, and provide an update. The web client will then receive something like this:
{
"status": "IN_PROGRESS"
}
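A sketch of the matching status-check Lambda, assuming the worker records progress under the task id in some persistent store (here stubbed with a plain dict so the example is self-contained; in production this would be a DynamoDB table or Redis lookup):

```python
import json

def status_handler(event, context, store=None):
    """Second Lambda: report the current status of a scheduled task."""
    store = store if store is not None else {}  # stand-in for DynamoDB/Redis
    params = event.get("queryStringParameters") or {}
    task_id = params.get("task_id")
    record = store.get(task_id)
    if record is None:
        return {"statusCode": 404, "body": json.dumps({"status": "UNKNOWN"})}
    return {"statusCode": 200, "body": json.dumps(record)}
```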
Finally, by repeating this process, the Lambda that checks the status of the task will eventually respond with the result of whatever the task was meant to do, or with a failure status message accordingly.
This approach lets us execute long-running tasks while staying compatible with API Gateway’s short 30-second limit.
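On the client side, the repeated polling described above can be sketched as a simple loop. The terminal status names and intervals here are illustrative, not prescribed by AWS:

```python
import time

TERMINAL = {"DONE", "FAILED"}  # assumed terminal statuses for our tasks

def poll_until_done(check_status, interval=2.0, timeout=300.0):
    """Call check_status() until it returns a terminal status or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = check_status()  # e.g. an HTTP GET against the status endpoint
        if result.get("status") in TERMINAL:
            return result
        time.sleep(interval)
    raise TimeoutError("background task did not finish in time")
```

Each individual status request completes quickly, so no single API Gateway call ever approaches the 30-second ceiling.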
Using AWS Batch
As described above, the first solution sticks to the technology stack you were already using: AWS Lambda in combination with API Gateway. Without deviating too much from this approach, you can use a different technology that Amazon offers called AWS Batch. It lets you run jobs, invoked from a Lambda or orchestrated via Step Functions, inside a temporary Docker container, with no such time limits. Furthermore, you can optionally combine this with SQS as I will describe below. The idea is again to work around the short timing constraints you have with API Gateway.
Similar to what we described above, we follow the same paradigm, but instead of invoking a second Lambda or putting a message in an SQS queue, we leverage AWS Batch jobs. The Lambda submits an AWS Batch job, which in turn starts a workflow that executes the long-pending task. In a similar fashion, the web client or application frontend can check the status of the AWS Batch job and see when it completes; once it does, the result is reported back to the client. Using AWS Batch minimizes the amount of boilerplate code that needs to be written, but it carries some maintenance and initial setup overhead in your cloud infrastructure. Furthermore, being able to visually inspect the details of all the Batch jobs you have started helps with debugging and diagnosing problems that may arise from the execution of your task.
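A minimal sketch of the submitting Lambda, assuming a Batch job queue and job definition that you have already set up in your account. The names here are placeholders, and the boto3 Batch client is injected (in production you would pass `boto3.client("batch")`) so the snippet stays self-contained:

```python
import json
import uuid

# Placeholder names; replace with your Batch job queue and job definition.
JOB_QUEUE = "long-task-queue"
JOB_DEFINITION = "long-task-def"

def batch_submit_handler(event, context, batch_client=None):
    """Submit the long-running task as an AWS Batch job."""
    job_name = f"long-task-{uuid.uuid4().hex[:8]}"
    if batch_client is not None:
        resp = batch_client.submit_job(
            jobName=job_name,
            jobQueue=JOB_QUEUE,
            jobDefinition=JOB_DEFINITION,
            # Pass the request payload into the job's container.
            containerOverrides={
                "environment": [
                    {"name": "TASK_PAYLOAD", "value": json.dumps(event)}
                ]
            },
        )
        job_id = resp["jobId"]
    else:
        job_id = job_name  # dry run without AWS credentials
    return {
        "statusCode": 202,
        "body": json.dumps({"status": "SCHEDULED", "jobId": job_id}),
    }
```

The status endpoint can then call `describe_jobs` on the Batch client with the returned job id and relay whether the job is SUBMITTED, RUNNING, SUCCEEDED, or FAILED.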
Conclusion
As you can see, with both of the proposed workarounds the idea remains the same. You receive the incoming API request from whatever frontend is calling you, and you schedule or batch it to run in the background. At the same time, your API returns a response to your client without blocking or delaying them. I hope the promise of increasing the AWS API Gateway timeout didn’t mislead you; the reality is that it cannot be increased, so I wanted to give you alternatives for getting around it.
Whichever method you decide to go with, weigh the pros and cons against the stack you are using. If you have both technologies in your stack, I would recommend AWS Batch jobs, as they are easier to manage and debug in case of problems.
I hope I helped you find some workarounds for this limitation. If you know of another, more clever way to deal with it, or have something else to recommend, please leave it in the comments below.
If you found this article useful and think it may have helped you, please drop me a cheer below; I would appreciate it.
If you have any questions, comment below; I check periodically and try to answer them in the order they come in.
Also, if you have any corrections, please do let me know and I’ll update the article.
Do you like using API Gateway, or do you prefer another framework?
If you enjoyed this article you can find some similar ones below: