Improve processing errors

Continuing the discussion from Make sale warehouse not required until processing:

I agree with albert here that this is currently an issue as the user will not now that we failed to process a sale or not due to a missing configuration.

I’m wondering if it won’t be better to mark the sale as processing in case of error but set the shipment/invoice state to exception to let him know where the problem is.

There will be a side effect of this behaviour: currently on a sale with both shipment and invoice methods set to order the shipment is not created when there is a missing account. If we split the process (which is required to have two separate exceptions) we should be able to correctly create the shipment when there is some error creating the invoice.

This does not look correct to me because there is no exception to handle. We can not change the meaning of the state for convenience.

More over it is technically impossible nor correct to update record if any exception occurs. This may break the integrity of the data or commit third party transaction.

Well this is not completly right. There is an execption to handle and is the user who should handle it. Here is an scenario to reproduce the problem:

  1. Create a sale with invoice method set to “On shipment”
  2. Add a product without accounting configuration set
  3. Quote, Confirm and process the sale
  4. Open the related shipment and deliver it.

Once the shipment is done, the following exception is raised at the server log:

Traceback (most recent call last):
  File "/home/pokoli/projectes/nclone/trytond/trytond/worker.py", line 114, in run_task
    task.run()
  File "/home/pokoli/projectes/nclone/trytond/trytond/ir/queue.py", line 176, in run
    getattr(Model, self.data['method'])(
  File "/home/pokoli/projectes/nclone/trytond/trytond/model/modelview.py", line 773, in wrapper
    return func(cls, records, *args, **kwargs)
  File "/home/pokoli/projectes/nclone/trytond/trytond/modules/sale/sale.py", line 985, in process
    sale.create_invoice()
  File "/home/pokoli/projectes/nclone/trytond/trytond/modules/sale/sale.py", line 815, in create_invoice
    invoice_lines.append(line.get_invoice_line())
  File "/home/pokoli/projectes/nclone/trytond/trytond/modules/sale/sale.py", line 1482, in get_invoice_line
    invoice_line.account = self.product.account_revenue_used
  File "/home/pokoli/projectes/nclone/trytond/trytond/modules/account_product/product.py", line 51, in prop
    return getattr(self.template, field_name)
  File "/home/pokoli/projectes/nclone/trytond/trytond/modules/account_product/product.py", line 38, in wrapper
    raise AccountError(
trytond.modules.account_product.exceptions.AccountError: There is no "Account Category" defined for "[PD000023]Product without accounting". - 

But the user nevers sees that this error is raised. This does not sound wrong to me because the stock workflow should not be blocked because of wrong configuration.

The problem is that if you go back to the sale, you will see:

  • State is “Processing”
  • Invoice state is “None”
  • Shipment state is “Waiting”

From there you can not know that there is a problem with this sale so it’s very complex to find it for the end user and it will have wrong consequences for the company.

You are right, resuing the same invoice_state or shipment_state will be complex and will raise the handle exception wizards when there is nothing to do with it. But this does not means that we should hide the error to the end user.

So I think we only have another option here: Add a new “Error” state on the document workflow.

Absolutly right, but we should find a way to trigger another transaction in the case there is an error with the task. For the sale/purchase workflow I think it can be a function like:

@classmethod
def process_exception(cls, sales):
    for sale in sales:
         sale.set_invoice_state()
         sale.set_shiment_state()
         sale.state = 'error'
    cls.save(sales)

And then show the sales with error on the exception tab.
From there, the user will be able to click the process button and the error will be shown to their screen. So he will be able to fix and create the invoice.

Also it should be possible to trigger notifications (using notification_email) to send an email to the responsible of the current sale.

This is very tailored to the sale purchase workflow, but as far as we have a way to trigger an “exception task” when there is an error on the main task it can be reused for any task queue.

I’m open to any other solution that may fix the problem if you have any but I think this is something that we need to deal with once we have the background processing.

This still not works because these method may also fail.
For me there is only one solution that is to look at the logging.

The only case when a method that updates the record state may fail is when there is an error on the database and in this case we will retry the tasks (like we do for others) so no issue from my side.

Of course one may have a broken code due to wrong development of custom module but we can not prevent this on any function.

Users do not have the possibility to access the logs. Furthermore, the account administrator or product administrator user normally does not understand a log. That’s why we have nice errors to shown to our users. Otherwise we can just remove all the UserErrors and just make our users “look at the logging.

Please note that the scenario that I explained has exactly the same behaviour when using a task queue in background and when using only the tryton server. So also much simpler setups will suffer from the same problem.

I do not agree. There are many cases where updating a record on the database may fail especially complex record like a sale.

For me it is clear that by design it is not possible nor good to store on a business record an error because this error may no more exist. Also this would not provide a generic solution for all background processes like tasks and scheduled jobs.

I think the solution would be to provide access to the logs.
On the worker and cron we could catch UserError and UserWarning and store them in a table dedicated to it (very low constraints which should work for insert only). I think we can just store the message of the error (no need for the traceback) as they are most of the time good.
The administrator could have access to such records (and can delete them). And they will be cleaned by a cron task after few configurable time (like 3 months).
The only problem would be that the message are in a single language but normally they should be in the configuration language.

1 Like

Then we may use a query to skip the ORM which will be less error prone.

Fixing the error will not regerate the related documents, so the user still needs to process the sale to fix the exception.

For me the problem is that the sale admin may not be administrator of the system. So wr have the same problem that with logs: the user that should fix the error may not know that there is something to fix.

This is not related to the ORM but to the nature of the ACID of SQL server.

This is not a counter argument against storing error on business record.

The sale admin may not have the access to fix the problem neither.

So for now there are no better solution proposed.

I see a couple of possibilities:

  • When the task is added to the queue we could allow an optional user to be notified in case of a problem. That user would see the errors that should handle. Something like: with Transaction().set_context(queue_name='x', user=user.id):
  • Add those notifications as ir.notes added to the record. This would have the problem of deciding what to do if more than one record id was provided to the queue call.

In both cases, the administrator would see all the errors in the system.

How do you decide which user must be set? What if no user is provider? What if the user cannot do anything about the error? What do you mean by notify? If it is via the bus, what if the user is not connected?

Why? Who should read it? How will it help to find the problem? What happens once the problem is fixed?

I think it would be per model. The programmer is responsible for setting that user. The main problem I see with the case of sales, for example, is that what we store in “confirmed_by” is an employee, which could potentially be linked to more than one user. But maybe a “owner_user” field could be added.

Administrators will still have access to those records. It’s up to the developer to decide if it makes sense to specify an user.

If the user cannot do anything about the error it would report to the person in charge of it. Just like it currently happens: if the sales user presses the “process” button in sales and sees an error for a product misconfiguration it may fix it himself (if he manages products too) or notify whoever is responsible for it.

Notify for me it would mean:

  • The user has access to the list of all errors in which she is involved (maybe from a menu entry or a new button next to the preferences). For me the behaviour should be similar to ir.note: the user sees a badge, clicks on the button sees the list of notifications and just by clicking accept it is marked as read.
  • If the bus is available it could be sent to the bus, too.

So it will never be filled.
Also why hard-coded such property on the model. What about the multi-company design?

Your proposal is too much centered on a single case. We need a generic solution.

Why would an “owner” (what ever it means) should be the one to be notified.
Also I do not see how you could link an unexpected error to a single record? How would it work when a process is run a on bunch of records?

How could the developer make such decision. By nature the problem is that it is unexpected errors so I do not see how a developer would anticipate them.

Or not but there will be no clear workflow to fix all problem in contrary to my proposal.

But the goal is to improve the current workflow.

Finally I do not understand at all your proposal. I think there are at least 3 different proposals mixed up. So it is very difficult to discuss.

No idea if this is going to work. When a record is processed in the queue it knows the user by the create_uid or write_uid right? If there is an error, check if the user has an email address filled in. If so, send the traceback by email to the user which can take action. Otherwise send the traceback to the default admin.

The workflow isn’t changed so it still works the same which can be a problem. But the user is made aware of the error.

Sounds like a good Idea, but I guess we should have the same problems related to ACID.
If another user modified the record before the task is processed on the queue the notification will be sent to the wrong user.

For me this is not a big deal if somebody is noticed, thats why I was proposing to have a general flag on the record so anyone that has access to it may be able to fix the error.

This is again not a general solution. And more over I do not see how it is possible to relate an unexpected error to a record.

I still think the only valid solution for now is a logging table with all the UserError generated.

Is not the record the record put into the queue to be processed? So if another user is changing the record, before the task is processed, the queue will hold two tasks?

A user made changes to a record and wants to store the information into the database. The user does this by pressing ‘Save’ in the client. The data is send over to the server side of Tryton. That for me is the starting point. So when something goes wrong in the storage process, the initiator of the ‘save’ action should be informed.

This is already the behavior.
This topic is about unexpected error in background processes which by definition are not related to a user request.

Not a single record, but a list of records as we may trigger the process for multiple records.
For example when a statement is encoded and some invoices are paid, all the related sales are processed (in order to trigger the shipment workflow if necessary).

If the save triggers the process in the queue it generates a new task. So if the old has not still been processed we have two pending tasks.

But not all changes trigger a task in the queue.

No, this is not the behaviour. The user does not get any message once the task is added to the queue when the workers are not activated. For example on purchase workflow the record is in confirmed state when there is an error and the user should click on process to see the error.

But yes, the main issue is when workers are activated because the user does not have any diference with the standard behaviour and she will never notice there is an error.

It’s not to a single records but to all of the records related to the failed task. I know this may be more than the wrong ones but as this records have not been correctly processed it’s state may be wrong.

The user must reprocess them (one by one) to know which is the wrong one and to fix trigger the tasks of the others that failed due to being in the same transaction as the wrong record.

Having a single table with an UserError will not give the proper information to fix the Error. Let me put an example:

  • He have 3 purchases (A, B, C) which are processed in the same task (which failed).
  • Purchase B is the culprit one because it have a product1 that is missing account configuration.

The user will see a message: “The product1 is misssing an account category” will go to the product and set the proper accounting values. At this point the user fixed the error but does not know which purchases should be reprocessed (because we do not allow the user to search for the products that are inside a purchase). So she must go to process all the purchase with are in processing or confirmed state which will lead to manually processing records that may not be still processed (because the process delay had not been passed yet).

The right solution will allow the user to know which records should be fixed and let any user on the system see the error if it has not been already fixed (this is already solved by the manual process button)

I guess there is no single solution for all tasks, so we should let the developer define some method to be called with the failing instances and provide a sample implementation for sale and sale and purchase workflows.

I disagree the state is not wrong in such case. But there is just a pending task on it.

That’s why I propose to expose the user error in a table.

This is wrong. The task is already rescheduled so the user has only to fix the problem.

He does not care because there is a task that will do it later.

Why not add an extra top level menu item and a group. So we can decide which users should be able to access the Error Log?