Hello,
We encountered a few problems caused by the limitations of Hash
indexes in PostgreSQL.
As per the docs:
Hash indexes have been designed to cope with uneven distributions of hash values. Direct access to the bucket pages works well if the hash values are evenly distributed. When inserts mean that the bucket page becomes full, additional overflow pages are chained to that specific bucket page, locally expanding the storage for index tuples that match that hash value. When scanning a hash bucket during queries, we need to scan through all of the overflow pages. Thus an unbalanced hash index might actually be worse than a B-tree in terms of number of block accesses required, for some data.
As a result of the overflow cases, we can say that hash indexes are most suitable for unique, nearly unique data or data with a low number of rows per hash bucket. One possible way to avoid problems is to exclude highly non-unique values from the index using a partial index condition, but this may not be suitable in many cases.
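To make the overflow behaviour concrete, here is a toy model of bucket fill, not PostgreSQL's actual hash access method: each indexed value hashes into a fixed number of buckets, and all tuples sharing a hash value land in the same bucket. A low-cardinality column therefore concentrates every row into a handful of buckets, which is exactly the overflow-chain case the docs warn about.

```python
# Toy model of hash-bucket fill; the bucket count and hashing are
# illustrative only, not how PostgreSQL's hash AM actually works.
from collections import Counter

NUM_BUCKETS = 64
ROWS = 10_000

def bucket_loads(values):
    """Count how many index tuples land in each bucket."""
    return Counter(hash(v) % NUM_BUCKETS for v in values)

# High-cardinality column: near-unique values spread evenly.
unique_loads = bucket_loads(range(ROWS))

# Low-cardinality column (like a boolean): at most 2 buckets are
# ever used, each holding half the table.
boolean_loads = bucket_loads(i % 2 == 0 for i in range(ROWS))

print(max(unique_loads.values()))                    # roughly ROWS / NUM_BUCKETS
print(len(boolean_loads), max(boolean_loads.values()))
```

An equality lookup on the boolean-like column has to walk a chain holding half the table, which is why it can end up worse than a B-tree in block accesses.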
What I get from this is that Hash indexes are only a good fit for
columns with diverse (high-cardinality) data, and a poor fit for
low-cardinality columns.
The symptoms we observed are extremely slow index builds, and sub-par performance when the index is used.
I’m wondering if a warning could be emitted when one uses an Index.Equality
on such columns (for fields.Selection
/ fields.Boolean
it should maybe even be forbidden), since developers may assume that a valid optimisation for a ('state', '=', ...)
query is to add an Index.Equality
.
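As a rough sketch of what such a warning could look like: the function and field-type names below are hypothetical and do not reflect Odoo's actual internals; the idea is simply to intercept an equality (hash) index declaration on a known low-cardinality field type.

```python
# Hypothetical sketch of the proposed check; "selection", "boolean"
# and check_equality_index are illustrative names, not Odoo's API.
import warnings

LOW_CARDINALITY_FIELD_TYPES = {"selection", "boolean"}

def check_equality_index(field_name, field_type, index_type):
    """Warn when an equality (hash) index is requested on a
    low-cardinality field, where a hash index is a poor fit."""
    if index_type == "equality" and field_type in LOW_CARDINALITY_FIELD_TYPES:
        warnings.warn(
            f"Field {field_name!r} ({field_type}) has low cardinality; "
            "an equality (hash) index is likely counter-productive.",
            UserWarning,
        )

check_equality_index("state", "selection", "equality")  # warns
check_equality_index("id", "integer", "equality")       # silent
```

The same hook could be escalated to an error for fields.Boolean / fields.Selection if forbidding it outright is preferred.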