Rational
Tryton clients make an extensive use of the search_count RPC call which in turn executes a “SELECT count(*) FROM table” on the database. Although newer PostgreSQL versions can, in some cases, return that information just by using an index only scan, that is till an expensive operation because it may need to process all the index. In other cases, reading the whole table is still necessary for just returning the number of tuples in it.
When a table grows (+2M sales, +14M stock moves, +20M account move lines) the count operation can easily take between 0.5s and 1.5s on modern hardware.
The operation is used in the total count (up right corner on the Tryton client) but also in the domain counts, although in domains, it only shows +99 when the number of rows is above large.
Proposal
I propose we accept that by default the total count will not be as precise as it currently is.
The same way that in domains +99 is shown when the number is, well, >99, the same could be applied to the total count. For example, it brings not relevant information that there are 78.923 records. At a first glance, simply knowing that the number is >9999 is more than enough. For example, we could show a +9999 but instead of having 9999 hard coded it could be (10*limit)-1 where limit is the total number of records loaded by default (which is 1000 by default in Tryton client).
We could let the user click on the number, if she really needs the exact number.
That in itself is not enough to improve performance, so the idea is to add a “limit” parameter so the client can tell the server to stop counting if the number of records is greater than a given number. The user
Implementation
We could change the “search_count” RPC call to accept a “limit” parameter which would execute the following query:
SELECT count(*) FROM (SELECT 1 FROM table LIMIT 10000)
instead of:
SELECT count(*) FROM table
In case the user is in the second (or third, or whatever) page we should pass 10*limit + offset as limit parameter of the search_count() call.
The same parameter would be used by domain counts, but in that case, the value given to the limit parameter would simply be “100”.
I did some tests at SQL level and using the proposed limit parameter can take <10ms vs >800ms for large tables. Times depending on the size of the table and the filters applied, but it’s easily 2 orders of magnitude faster.
I think it’s worth noting that the proposed query can be somewhat slower for small tables. That seems quite obvious as we’re adding a subquery. For example, for a very small table, such as ir_module, the take goes from 0.6ms to 0.7ms.
So we’d propose applying the subquery behaviour to only the tables configured in a trytond.conf parameter. Something like:
[database]
count_subquery = stock_move, sale_sale, sale_line
Opinions?