Clustering postgresql - define different tryton read and write operations

Blepas · June 19, 2019, 6:55am

Hi, i don’t know if actually is possible, if that I would like to know please.
In AWS for example you can define a Postgresql Cluster with one endpoint acting as a Read/write and other endpoint performing read-only operations through read replicas nodes.

It will be powerfull if you could define different endpoints for read operations and write operations

Regards!

ced · June 19, 2019, 7:25am

Hi,

For now, there is no such option. Maybe tools like pgpool could help but I’m not sure they will be able to load balance with the REPEATABLE_READ isolation level.
Any way, I think it could be a good improvement to be able to define a second list of database connections and have the postgres backend use it for readonly connection.
But there is one point that should be checked, the common workflow is to make a write call (UPDATE) and just after a read call (SELECT) if they are done on two different databases is the second SELECT will see the written data. Or should we have a mechanism to force some SELECT to be done on the master database to be sure to get new data.

Blepas · June 19, 2019, 7:29am

I think the solution should pass to define 2 endpoints on database connections, One for read operations (select) and other for write operations (updates, inserts, deletes…) , and modify tryton to use it. All other jobs are made by AWS that provides 2 endpoints (1 for read and 1 for write).
I think it would be a great improvment!

albert · June 19, 2019, 8:17am

Although that’s probably simple from the Tryton POV because trytond knows whether an RPC request has to be readonly or readwrite, as Cédric mentioned that’s not guaranteed to work by PostgreSQL ATM.

To manage replication there are several options in PostgreSQL but probably none allows to do what you need in a reliable way.

Options such as PgPool which have existed for many years usually don’t work with the Tryton usecase because they usually don’t handle sequences reliably. And those are heavily used in Tryton. I haven’t looked at it recently so it may have changed, but I guess that’s not the case.

Built-in PostgreSQL Streaming replication should ideally be the best choice but currently it does not guarantee that the transaction on the read-only node is visible when the transaction finishes in the read-write one. That’s exactly the point that Cédric mentioned when he said that in many cases Tryton reads record just after writing them. See currently available options.

There’s been discussion and work to support that, but it is not yet available and won’t be in the upcomming v12 expected after the summer.

That said, I agree that when PostgreSQL supports this feature, your proposal would be a great addition to Tryton.

Blepas · June 19, 2019, 8:22am

Ok, thanks. But we could gif Amazon a try… may be the replication is very quick and may be if the recently added data has not been replicated yet, the master node (acting as a read node too) will serve it.

but i understand is not trivial to implement these 2 different connections in tryton

thanks.

ced · June 19, 2019, 10:21am

I do not think we should wait for synchronous commit and even once it is there, it will probably not be the best solution for performance.
Indeed my proposal to rely on readonly parameter is obviously not enough. So we could improve the proposal by adding new parameter. For example the client could specify that it is doing a request linked to another, so in this case even if it is readonly, it should use the main connection. I think it will have anyway a good impact on performance because most of the readonly queries are not linked to another.

msabater · July 12, 2021, 2:53pm

Can you please add more info on the use of these 2 suggested parameters?
I I understand correctly, the developer will use them to state that certaint methods are readonly or are linked to another one that has modified certain data.