Improve queue management in tryton worker

sergyo · November 14, 2019, 3:27pm

In deployments with many workers or workers for dedicated queues I found some limitations in the worker configuration.
For instance, I have an interface with another application, and all the model synchronizations are managed by queues. So I have prioritized workers based on queue names, that is, a queue for products, another one for parties, another queue for sales, … and a default worker with no queue name for the rest of the data.
In this scenario I cannot guarantee that all records for the queue “product” are handled by its dedicated worker, because the default worker (with no queue name) does not apply any kind of filtering.

I propose to improve the trytond-worker parameters to allow some configuration that exists in other distributed systems like Celery:

manage many queue names (-n will accept a list of queue names)
add exclude queue names parameter
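To make the two proposed parameters concrete, here is a minimal sketch of the selection rule a worker could apply. The function name and the include/exclude arguments are hypothetical; nothing like this exists in trytond-worker today.

```python
from typing import Iterable, Optional


def queue_is_selected(
        queue_name: str,
        include: Optional[Iterable[str]] = None,
        exclude: Optional[Iterable[str]] = None) -> bool:
    """Return True if a worker with these settings should take the task.

    Hypothetical helper, only to illustrate the proposal.
    """
    include = set(include or [])
    exclude = set(exclude or [])
    if queue_name in exclude:
        return False
    # An empty include list means "any queue", like today's unnamed worker.
    return not include or queue_name in include


tasks = ["product", "party", "sale", "default"]
# Dedicated worker: only the "product" and "party" queues.
dedicated = [
    t for t in tasks if queue_is_selected(t, include=["product", "party"])]
# Catch-all worker that leaves those two queues alone.
catch_all = [
    t for t in tasks if queue_is_selected(t, exclude=["product", "party"])]
```

With this rule the catch-all worker would never pick up tasks that belong to a dedicated queue, which is the control I am missing.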

Opinions?

ced · November 14, 2019, 4:14pm

For me, it will make the code more complex for almost no benefit.
I do not see the point of preventing a worker from working on a queue if it has free resources.
Also, such an implementation will require SQL queries with an IN instead of =, which is more difficult to optimize.
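As a rough illustration of the two query shapes, here is a sketch using Python's sqlite3 module. The table and column names are made up for the example; this is not the query trytond actually runs, and it says nothing about how a given database plans either form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up table standing in for the task queue.
conn.execute("CREATE TABLE task_queue (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO task_queue (name) VALUES (?)",
    [("product",), ("sale",), ("default",), ("product",)])

# One queue name per worker: a plain equality test.
single = conn.execute(
    "SELECT id FROM task_queue WHERE name = ? ORDER BY id",
    ("product",)).fetchall()

# A list of queue names: the clause becomes IN, one placeholder per name.
names = ["product", "sale"]
placeholders = ", ".join("?" for _ in names)
multiple = conn.execute(
    f"SELECT id FROM task_queue WHERE name IN ({placeholders}) ORDER BY id",
    names).fetchall()
```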

sergyo · November 15, 2019, 10:20am

From my point of view, the benefit is to have full control over queues and workers.

I see your point.

albert · March 3, 2022, 11:27am

We just hit another use case which, I think, makes the need for the proposed feature clearer.

Our customer needs to post many invoices.

We’ve got a worker with “-n 8” which is used to process other queue tasks and a new worker with “-n 1” and “--name post” to do only the posting. The problem is that the generic worker with its 8 processes will also start trying to post the invoices added to the queue, but given that posting must be done sequentially, we consume lots of resources for no benefit, as Tryton will only allow 1 out of the 9 invoices being processed in parallel to succeed.

Not only do we consume lots of resources to post just one invoice at a time, but the probability of reaching the maximum number of retries also increases greatly, to the point that some invoices are not posted and need to be handled somehow.

The alternative in the current situation is not ideal because we must create one worker with “-n 1” and “--name post” and as many other workers as queues exist.

The proposed parameter --except would solve this issue and is my personal preference.

However, as an alternative we could make all queue_name values configurable via trytond.conf. For example, these:

https://hg.tryton.org/modules/sale/file/tip/sale.py#l851
https://hg.tryton.org/modules/purchase/file/55722fc603cd/purchase.py#l881

could be set with:

[sale]
queue = sale

[purchase]
queue = purchase

In this case, I would make “default” the default queue name for all modules.
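A minimal sketch of that lookup, using Python's standard configparser rather than trytond's own config module. The helper name queue_name_for is hypothetical and only illustrates the fallback to “default”.

```python
import configparser

# The same sections as in the example above.
CONFIG = """
[sale]
queue = sale

[purchase]
queue = purchase
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)


def queue_name_for(module: str) -> str:
    """Hypothetical helper: the configured queue name, else "default"."""
    # fallback covers both a missing section and a missing option.
    return parser.get(module, "queue", fallback="default")


sale_queue = queue_name_for("sale")
stock_queue = queue_name_for("stock")  # no [stock] section configured
```

Modules without a section would keep using the “default” queue, so nothing changes unless the administrator asks for it.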

albert · March 6, 2022, 1:27am

I uploaded a patch that allows configuring queue names.

It seems a reasonable solution that does not penalize performance.

ced · March 6, 2022, 2:04am

I do not see the point of creating so many undocumented configuration options. All this customization can easily be done with an extension.

This has been solved with the post_batch.

For me the queue is the wrong place to solve this issue. It sounds more like a workflow design issue.

It may be interesting to have an easy mechanism to hook a default queue name setup. Maybe by extending the Queue.push method.

albert · March 6, 2022, 8:05pm

The point is that the current queue names are also undocumented, so I don’t think the patch makes this matter worse. The patch makes queue management more flexible without much complexity and avoids the need for customization.

We don’t use post_batch in this case because there are some other tasks we have to do that can be parallelized across several workers. As soon as each invoice is ready, we queue a post for it so that posting can start without waiting for all of them to be ready (which is what we would have to do if we used post_batch). Of course, adding individual invoices to the queue to be posted has the drawback that, when the other tasks have finished, their workers start “helping” the “post” queue by executing several post operations in parallel.

If the solution must be done through customization it doesn’t look like a hook should be necessary. One can override ir.queue push() calling super() with the desired queue name.
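Roughly, the override could look like the sketch below. The Queue class here is a stand-in so the example runs on its own; it is not the real ir.queue model. The simplified push(name, data) signature, the shape of the data dictionary, and the _queue_names mapping are all assumptions made for illustration. In a real module this would be an extension of ir.queue registered in the pool.

```python
class Queue:
    """Stand-in for the ir.queue model; not the real trytond class."""
    pushed = []

    @classmethod
    def push(cls, name, data):
        # Record the task instead of storing it in the database.
        cls.pushed.append((name, data))
        return len(cls.pushed)


class CustomQueue(Queue):
    """Customization that forces a queue name for selected methods."""
    # Hypothetical mapping from method name to queue name.
    _queue_names = {"post": "post"}

    @classmethod
    def push(cls, name, data):
        # Replace the name only for mapped methods, then call super().
        name = cls._queue_names.get(data.get("method"), name)
        return super().push(name, data)


CustomQueue.push("default", {"model": "account.invoice", "method": "post"})
CustomQueue.push("default", {"model": "sale.sale", "method": "process"})
queue_names = [name for name, _ in Queue.pushed]
```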

The thing is that, for me, it’s better to manage that through system administration/configuration. One can try to add new servers for executing queues, add workers to some of them, using the same worker for processing sales and purchases, and leave product cost recomputation to another worker, etc. I think this kind of experimenting should not require writing code.

ced · March 6, 2022, 11:04pm

It is documented as there is only the default: Task Queue — Tryton server

For me it is worse because it adds many configuration options without documentation.
But also (and probably the big blocker) it is fixing a single schema for naming queues which is only based on the method.

For me it is less flexible as-is.

For me it will never be a target to “avoid” customization. The goal is to have a generally good behavior, but above all the simplest possible one.

I cannot understand this argument.

I do not think the system should help you if you do not embrace its design.

I never saw a case where posting invoices is a race.

It cannot be done cleanly for now because the default name is already computed.

We have always considered customization a possible way to configure, and we favor it over too much configuration.

albert · March 8, 2022, 12:24pm

I don’t understand. There are other queue names being used apart from “default”. “sale” is one example:

https://hg.tryton.org/modules/sale/file/tip/sale.py#l936

ced · March 8, 2022, 12:29pm

It is a mistake; they should not be set.

albert · March 8, 2022, 12:47pm

Then, everything is clear now.

We allow no configuration, but at least the default queue names don’t get in the way of a specific need.