Hi,
Getting the output of the 'email' API for a build usually takes a long time. In some cases we're close to hitting the 30-second timeout. I think the timeout is inevitable when there are a lot of results with a large number of changes. To avoid the timeout, maybe the API should do the work in the background? This would work as follows:
1. GET call to /api/builds/<id>/email
2. the server creates a 'cached report' object in the database and immediately returns its URL to the user
3. in the background, the server adds a report generation task to the queue
4. using the URL received in step 2, the user can retrieve the final results or check the progress
5. once the result is generated, it can be a short-lived object in the database (removed after 1 day, for example)
Is this the solution we should aim for? The downside is that it requires active polling from the client side.
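The proposed flow, including the client-side polling, could look roughly like the sketch below. This is only an illustration: the in-memory dict and `queue.Queue` stand in for the database and the task queue, and the `/api/reports/<id>/` URL layout and all function names are assumptions, not existing SQUAD code.

```python
import queue
import threading
import time
import uuid

# Hypothetical in-memory stand-ins for the database table and the task queue.
REPORTS = {}
TASKS = queue.Queue()


def request_report(build_id):
    """Steps 1-3: create the cached report record, queue the work, return a URL."""
    report_id = uuid.uuid4().hex
    REPORTS[report_id] = {
        "build": build_id,
        "status": "pending",
        "output": None,
        "created": time.time(),
    }
    TASKS.put(report_id)
    return "/api/reports/%s/" % report_id  # hypothetical URL layout


def worker(generate):
    """Background worker: takes queued report ids and fills in the output."""
    while True:
        report_id = TASKS.get()
        if report_id is None:  # sentinel used to stop the worker
            break
        report = REPORTS[report_id]
        report["output"] = generate(report["build"])
        report["status"] = "done"


def poll(url, interval=0.01, timeout=5.0):
    """Step 4: the client polls the URL until the report is ready."""
    report_id = url.rstrip("/").rsplit("/", 1)[-1]
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        report = REPORTS[report_id]
        if report["status"] == "done":
            return report["output"]
        time.sleep(interval)
    raise TimeoutError("report %s not ready" % report_id)


def expire(max_age=24 * 3600, now=None):
    """Step 5: drop reports older than max_age seconds."""
    now = time.time() if now is None else now
    stale = [rid for rid, rep in REPORTS.items() if now - rep["created"] > max_age]
    for report_id in stale:
        del REPORTS[report_id]
```

The request itself returns immediately; only `poll` waits, and it does so on the client side, which is the downside mentioned above.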
milosz
On Tue, Oct 30, 2018 at 05:30:01PM +0000, Milosz Wasilewski wrote:
I think it's a good idea, and might be a useful pattern for other query/report types in the future.
It may need a different endpoint than /email - perhaps /email-async or /email?async=true? It's best not to break existing behavior, which is probably suitable for most projects.
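As a rough illustration of the /email?async=true variant, the check could default to the existing synchronous behavior when the parameter is absent. The helper name and the accepted values are assumptions made for this sketch.

```python
from urllib.parse import parse_qs, urlsplit


def wants_async(url):
    """Return True only when the caller explicitly opts in with ?async=..."""
    query = parse_qs(urlsplit(url).query)
    # A missing parameter means "false", so existing clients keep the old behavior.
    value = query.get("async", ["false"])[-1]
    return value.lower() in ("1", "true", "yes")
```

With this, /api/builds/<id>/email keeps working as before, and only requests that add async=true get the new background behavior.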
What would the maximum delay be? I hope it wouldn't get put at the end of the queue and be subject to tens of minutes of waiting. That might break the expectations of some clients.
Dan
_______________________________________________
Squad-dev mailing list
Squad-dev@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/squad-dev
On Tue, 30 Oct 2018 at 17:55, Dan Rue dan.rue@linaro.org wrote:
> It may need a different endpoint than /email - perhaps /email-async or /email?async=true? It's best not to break existing behavior, which is probably suitable for most projects.
sounds reasonable
> What would the maximum delay be? I hope it wouldn't get put at the end of the queue and be subject to tens of minutes of waiting. That might break the expectations of some clients.
I will need to check, but I think that if we put it on a different queue from other tasks, it will only wait for other similar items. This means that other types of tasks won't affect the wait time, but similar tasks will. Right now we're not getting an awful lot of these requests: I only found around 50 since October 17th. If this is the case, the wait will most likely be in the tens of seconds (depending on the number of compared results).
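A back-of-the-envelope version of that estimate is below. It assumes a queue dedicated to report tasks, identical run time for every report, and workers that are never borrowed by other queues. The function name and the 20-second figure are made up for illustration.

```python
def estimated_wait(pending_reports, seconds_per_report, workers=1):
    """Rough wait before a new report task starts on a dedicated queue."""
    if workers < 1:
        raise ValueError("need at least one worker")
    # Reports already in the queue are processed `workers` at a time,
    # so only full "rounds" ahead of the new task delay it.
    return (pending_reports // workers) * seconds_per_report


# Two reports already queued, one worker, 20 s each (hypothetical numbers).
print(estimated_wait(2, 20))  # 40
```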
milosz
On Tue, Oct 30, 2018 at 06:05:18PM +0000, Milosz Wasilewski wrote:
> > It may need a different endpoint than /email - perhaps /email-async or /email?async=true? It's best not to break existing behavior, which is probably suitable for most projects.
> sounds reasonable
I would rather have a new, totally different endpoint for "report" objects, this one being the first type of them.
Also, maybe it would be worth it trying to squeeze some performance out of the current code to see if we can avoid having to do this.
On Tue, 30 Oct 2018 at 19:39, Antonio Terceiro antonio.terceiro@linaro.org wrote:
> Also, maybe it would be worth it trying to squeeze some performance out of the current code to see if we can avoid having to do this.
I haven't done the math yet, but I'm afraid that the time to generate the current report will grow at least linearly with the number of results. If this is the case, we'll hit the timeout eventually. To avoid that, the execution time would have to be constant regardless of the number of results. I'm not sure if this is possible.
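To make the linear-growth argument concrete: if generating a report costs a fixed overhead plus a fixed amount of time per result, there is always some number of results beyond which the 30-second limit is exceeded. The sketch below computes that number; the cost model and the sample figures are assumptions, not measurements.

```python
import math


def max_results_before_timeout(seconds_per_result, fixed_overhead=0.0, timeout=30.0):
    """Largest n with fixed_overhead + seconds_per_result * n <= timeout."""
    if seconds_per_result <= 0:
        raise ValueError("seconds_per_result must be positive")
    budget = timeout - fixed_overhead
    if budget < 0:
        return 0
    return math.floor(budget / seconds_per_result)


# Hypothetical figures: 5 s of fixed overhead and 0.25 s per result.
print(max_results_before_timeout(0.25, fixed_overhead=5.0))  # 100
```

Making the per-result cost smaller moves this limit further out but does not remove it.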
milosz
On Wed, Oct 31, 2018 at 08:57:55AM +0000, Milosz Wasilewski wrote:
> I haven't done the math yet, but I'm afraid that the time to generate the current report will grow at least linearly with the number of results. If this is the case, we'll hit the timeout eventually. To avoid that, the execution time would have to be constant regardless of the number of results. I'm not sure if this is possible.
Just to follow up on this and our IRC conversation, today we found a report that cannot complete in 30 seconds and times out [1]. Antonio reports that memory grows on the server and that, if the timeout were increased, memory usage on the front-end web server could grow too high.
I expect we will continue to hit this until a workaround or fix is available. In the meantime, we will send a modified report upstream when this happens.
Dan
[1] https://qa-reports.linaro.org/api/builds/10934/email/?template=9&baselin...
On Fri, 9 Nov 2018 at 16:37, Dan Rue dan.rue@linaro.org wrote:
> Just to follow up on this and our IRC conversation, today we found a report that cannot complete in 30 seconds and times out [1]. Antonio reports that memory grows on the server and that, if the timeout were increased, memory usage on the front-end web server could grow too high.
> I expect we will continue to hit this until a workaround or fix is available. In the meantime, we will send a modified report upstream when this happens.
I created a branch with the new feature:
https://github.com/mwasilew/squad/tree/long-running-reports
There is only one commit there:
https://github.com/mwasilew/squad/commit/b85d30526a9e2801ddc4ead3a7f12f7f776...
It's not ready for review yet as it's missing unit tests and some docs. I fixed all of the problems with the existing tests, but haven't written tests for the new features. I also haven't tested the callback yet (email notification works). Feel free to try it out and comment. I hope to finish it tomorrow morning.
milosz