I have been experimenting with a traditional server deployment in production: three Trytond instances running behind an Nginx proxy that does the load balancing. After running this setup for a while, I noticed some strange, almost “race-condition”-like behavior in SAO. I don’t know whether this is a bug or a limitation, so I am raising it here.
The strange behavior happens as follows:
1- I have two companies created in the same DB.
2- While working on company A, I switch to company B by opening the preferences dialog box and changing the “current company” to B.
3- Upon clicking the “Save” button, the interface should refresh and change the company label from A to B, but A is still displayed. However, internally all the information displayed by the modules (sales, accounts, stock, etc.) comes from company B, just as expected. It is as if the label and the internal company switch are not in sync.
However, this behavior does not occur under either of the following conditions:
a- A login dialog is requested, i.e. a password must be entered to switch to company B
b- Only one Trytond instance is running, or the load balancer uses ip_hash to ensure that a particular client is always handled by the same server instance.
In my humble opinion, the problem may be a limitation or a race condition between Trytond server instances, where the XHRs happen faster than the server instances can stay in sync. In other words, the first XHR, a res.user.set_preferences call handled by server instance X to change the current company from A to B, has not yet finished storing the newly selected company in the DB when the browser issues another XHR, a res.user.get_preferences call. The load balancer sends that second call to server instance Y, which returns a cached or “not-yet-updated” current company A.
Has anyone experienced this problem before? And, more importantly, how can it be solved?
You don’t need a complicated setup to get this behaviour.
Let’s say you make a query changing the company, but you add a sleep of 10 seconds before it returns.
If you make another query during those 10 seconds, it will still return the initial company.
This is due to the fact that each request is isolated.
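To make that isolation concrete, here is a minimal sketch in plain Python. It is not Tryton code: FakeDatabase and set_company are hypothetical names, and the 10-second sleep is replaced by a threading.Event so the run is deterministic. The second request reads while the first has not committed yet, so it still sees company A.

```python
import threading


class FakeDatabase:
    """Toy stand-in for a database: a value is only visible once committed."""

    def __init__(self, company):
        self._committed = company
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._committed

    def commit(self, company):
        with self._lock:
            self._committed = company


def set_company(db, company, started, may_commit):
    started.set()        # the request has reached the server
    may_commit.wait()    # stands in for the sleep: nothing is committed yet
    db.commit(company)


db = FakeDatabase("A")
started = threading.Event()
may_commit = threading.Event()
writer = threading.Thread(
    target=set_company, args=(db, "B", started, may_commit))
writer.start()

started.wait()
during = db.read()       # second request, sent while the first is in flight
may_commit.set()         # let the first request finish
writer.join()
after = db.read()        # a request sent after the first one has returned

print(during, after)
```

Only a request sent after the first one has returned is guaranteed to see company B.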
I guess this is probably a bug in sao: the query displaying the company name is sent before the query saving the new company has finished.
I understand this, and it is the nature of all asynchronous XHR calls. But the interesting part is that none of this strange behavior happens if all XHR requests are handled by the same server instance. Hence I assume there is no bug in SAO. I haven’t explored this issue using the Python client.
The company label example above is just one of many similar strange behaviors. In another case, I added a function field to account.invoice that displays a “selection” of analytic accounts. When the user selects an analytic account, it sets the analytic_account of all the invoice lines to the selected account and saves them. But sometimes, between the XHR to save and the one to get, the lines get back the initial analytic account, as if it had never been saved. This is confusing and looks buggy to the end user.
What I did to work around this problem was to use ip_hash as the load-balancing method in my Trytond upstream pool, so that all XHR calls from the same client are handled by the same server instance. But this is sub-optimal and not very scalable.
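As a rough illustration of why sticky routing hides the problem, the sketch below contrasts round-robin dispatch with routing on a hash of the client address. This is only in the spirit of ip_hash, not Nginx's actual algorithm; the server names, the example address and the use of zlib.crc32 are all assumptions made for the example.

```python
import zlib
from itertools import cycle

# Hypothetical names for the three Trytond instances.
SERVERS = ["trytond-1", "trytond-2", "trytond-3"]


def sticky_server(client_ip):
    """Pick a server from a hash of the client address (ip_hash-like)."""
    return SERVERS[zlib.crc32(client_ip.encode()) % len(SERVERS)]


# Round robin: two consecutive requests land on different instances.
round_robin = cycle(SERVERS)
rr = [next(round_robin) for _ in range(2)]

# Sticky: two requests from the same client land on the same instance.
sticky = [sticky_server("203.0.113.7") for _ in range(2)]

print(rr, sticky)
```

With sticky routing, the save request and the following read are served by the same process, so a stale copy held by another instance never comes into play.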
For me this is due to the cache on User.get_preferences. The cache is correctly cleared on User.set_preferences, but the propagation of the cache clearing is probably slower than the second call to User.get_preferences, which is routed to another server.
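A toy model of that suspicion, assuming each instance keeps its own in-process cache and learns about invalidations with some delay. ServerInstance and receive_invalidation are hypothetical names for this sketch and do not reflect how trytond implements its cache.

```python
class ServerInstance:
    """Toy server with a per-process preferences cache (not Tryton code)."""

    def __init__(self, db):
        self.db = db      # shared dict standing in for the database
        self.cache = {}   # local to this instance

    def get_preferences(self, user):
        if user not in self.cache:
            self.cache[user] = dict(self.db[user])
        return self.cache[user]

    def set_preferences(self, user, values):
        self.db[user].update(values)
        self.cache.pop(user, None)   # only the local cache is cleared at once

    def receive_invalidation(self, user):
        # Called when the "clear the cache" notice finally reaches this instance.
        self.cache.pop(user, None)


db = {"alice": {"company": "A"}}
x, y = ServerInstance(db), ServerInstance(db)

y.get_preferences("alice")                      # Y has company A cached
x.set_preferences("alice", {"company": "B"})    # saved through X
stale = y.get_preferences("alice")["company"]   # Y answers before the notice
y.receive_invalidation("alice")
fresh = y.get_preferences("alice")["company"]   # Y answers after the notice

print(stale, fresh)
```

In this model the database already holds B when Y answers A; only the cached copy on Y is out of date.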
I think the best would be for User.set_preferences to return the values of User.get_preferences, so the client gets the correct values from the same server.
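Roughly what this proposal could look like, again as a toy model and not the real trytond method signatures: set_preferences answers with the fresh preferences computed on the instance that just wrote them, so the client has no need to send a second request that might reach a stale instance.

```python
class ServerInstance:
    """Toy server with a per-process preferences cache (not Tryton code)."""

    def __init__(self, db):
        self.db = db      # shared dict standing in for the database
        self.cache = {}

    def get_preferences(self, user):
        if user not in self.cache:
            self.cache[user] = dict(self.db[user])
        return self.cache[user]

    def set_preferences(self, user, values):
        self.db[user].update(values)
        self.cache.pop(user, None)
        # Proposed change: answer with the fresh preferences directly.
        return self.get_preferences(user)


db = {"alice": {"company": "A"}}
x, y = ServerInstance(db), ServerInstance(db)

y.get_preferences("alice")                       # Y caches company A
returned = x.set_preferences("alice", {"company": "B"})
label = returned["company"]                      # what the client would display
still_stale = y.get_preferences("alice")["company"]

print(label, still_stale)
```

The other instance can still hold a stale copy for a moment, but the label no longer depends on it.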