Managing very large tables

msabater · July 12, 2021, 3:16pm

Hi,

A customer that uses tryton has a fairly large database that suffers performance issues.
Three tables in particular (stock_move, account_move, account_move_line) have grown to reach tens of GBs.
We have tried several methods of optimization, such as profiling queries, adding indexes, removing locks, modifying processes and adding hardware. While these attempts have improved performance, the continous growth of these 3 tables is still a problem, so we are looking for more a more agressive approach.

The options we are working with include database clustering, table partitioning and just deleting old data.

Has anybody faced these problems? If so, how did you overcome then?

albert · July 12, 2021, 3:43pm

Cluster is a task we periodically run on tables based on ID but that will not solve your issue. If you meant some kind of replication, I think it’s probably too complex, and probably better to start with table partitioning. Fortunately since v12 PostgreSQL supports foreign keys to reference partitioned tables.

You may be interested in GitHub - pgpartman/pg_partman: Partition management extension for PostgreSQL

ced · July 12, 2021, 3:56pm

What are they? Which queries are too slow?

msabater · July 12, 2021, 4:22pm

The entire system slows down or just freezes. The problem is usually solved by recreating the container.
We are now auditing the system but the problem does not seem to be related to a specific query.

ced · July 12, 2021, 4:26pm

I do not see how cold data could slow the system. For me you must first find the exact source of problem and not assume it is the database.

albert · July 12, 2021, 5:00pm

Which container? Tryton’s one?

As Cédric suggests it’d be good to ensure the problem is at PostgreSQL side.

Some recommendations & ideas:

In order to see the activity that is going on in PostgreSQL server I’d recommend pg_activity.

We also always enable log_min_duration_statement in all production servers so that all queries that take longer than a certain threshold will be logged. See: PostgreSQL: Documentation: 16: 20.8. Error Reporting and Logging, then you can grep + count the number of queries above that threshold to keep an eye to understand if the situation is improving.

Maybe you’re reaching a limit on the number of connections on uwsgi or proxy you’re using? That may explain the freeze?

Depending on the Tryton version you’re using it’d recommend moving to v6. There you’ll find trytond-stat for better understanding what’s going on. Also SAO has several improvements that hopefully will minimize some performance issues present in previous versions (at least until 5.4).

In larger databases, counters can also be a problem, see Counters & performance.

numaelis · July 14, 2021, 5:08pm

in all modules or a specific module?

pokoli · July 15, 2021, 8:06am

If the system just freezes may be related to a Python process which is doing some calculations which take so much time. This is a common souce of performance problems and can be easly detected using pg_stat_activity. You will see that the query is most of the time in state idle in transaction and the last query is a select from a table to get the results to do the computation.