Taking advantage of LLMs

It seems that Tryton could benefit from the recent advancements in Large Language Models like GPT-3/4 or LLaMA.

My proposal is not to bind Tryton to one specific implementation, but to provide the necessary abstractions to connect to these or other language models, or other ML tools.

One idea on how to take advantage of these tools would be the following:

Add a button in the top menu bar of Tryton that allows users to add content. The user may write some text, drag & drop a file, or push a button to record some audio.

In the case of audio, a speech recognition tool would be used to generate the text.

Files would be treated as URLs or converted into text (we should decide which is the best option).

That content would be sent to an LLM, asking it to return JSON with the necessary actions to be executed by trytond and/or sao.

The inbox shown in sao should be able to "explain" to the user what actions will be executed and, if the user agrees, then they would be executed.

For example, the following prompt in ChatGPT-4:

Given the prompt, return JSON with a list of actions with the following structure:

[{
    'action': 'create|search|update',
    'model': 'party.party|account.invoice|sale.sale|purchase.purchase',
    'values': {  # Only in create|update actions
        key: value,
    },
}]

Available fields for party.party are name, code.
Available fields for account.invoice are number, party, invoice_date, lines.
Available fields for sale.sale are reference, party, sale_date, lines.
Available fields for purchase.purchase are reference, party, purchase_date, lines.

Here's the prompt:

Hi!
the purchase manager of Zara made an order (their reference is 2314) of 25 light bulbs model 42321 that should be delivered ASAP. Remember that they have never purchased yet.

returns:

Here's the JSON with a list of actions to create a new party and a new purchase:

[{
    'action': 'create',
    'model': 'party.party',
    'values': {
        'name': 'Zara',
        'code': 'ZARA'
    }
},
{
    'action': 'create',
    'model': 'purchase.purchase',
    'values': {
        'reference': 'Compra_Zara_2314',
        'party': 'Zara',
        'purchase_date': '2023-03-21',
        'lines': [
            {'product': 'light bulbs model 42321', 'quantity': 25}
        ]
    }
}]
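Of course, whatever the LLM returns should never be trusted blindly. A minimal validation sketch in Python, assuming the field whitelist from the prompt above (the function and schema names are illustrative, not an existing Tryton API):

```python
import json

# Whitelist of actions and fields the integration accepts; anything
# outside it is rejected before reaching trytond. (Hypothetical schema.)
ALLOWED_ACTIONS = {'create', 'search', 'update'}
ALLOWED_FIELDS = {
    'party.party': {'name', 'code'},
    'account.invoice': {'number', 'party', 'invoice_date', 'lines'},
    'sale.sale': {'reference', 'party', 'sale_date', 'lines'},
    'purchase.purchase': {'reference', 'party', 'purchase_date', 'lines'},
}

def validate_actions(raw):
    """Parse the LLM output and keep only well-formed actions."""
    actions = json.loads(raw)
    valid = []
    for action in actions:
        if action.get('action') not in ALLOWED_ACTIONS:
            continue
        model = action.get('model')
        if model not in ALLOWED_FIELDS:
            continue
        values = action.get('values', {})
        # Drop any action that uses a field the prompt did not declare
        if set(values) - ALLOWED_FIELDS[model]:
            continue
        valid.append(action)
    return valid
```

Anything filtered out here would simply never be shown to the user for confirmation.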

So, given that the LLM does most of the work by itself, I think Tryton should provide an integrated interface, the JSON description that must be passed to the LLM so it can learn the correct output format and options, as well as a mechanism to process that JSON so new modules can "plug in" new possibilities.
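The "plug-in" mechanism could be as simple as a registry that modules extend at import time. A hypothetical sketch (all names here are mine, not an existing Tryton API):

```python
# Registry mapping action names to handler callables; a module would
# register its own handlers when it is loaded. (Illustrative only.)
ACTION_HANDLERS = {}

def register_action(name):
    """Decorator for modules to plug in new action types."""
    def decorator(func):
        ACTION_HANDLERS[name] = func
        return func
    return decorator

@register_action('create')
def handle_create(model, values):
    # A real handler would call the model's create(); here we just echo.
    return ('create', model, values)

def dispatch(action):
    """Route one validated action dict to its registered handler."""
    handler = ACTION_HANDLERS.get(action['action'])
    if handler is None:
        raise ValueError('Unsupported action: %s' % action['action'])
    return handler(action['model'], action.get('values', {}))
```

A new module adding, say, a 'schedule' action would only need to define one decorated function.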

My opinion is that some actions could be executed by the server (one may be interested in supplying that kind of information using web services), but other actions should be executed by sao itself. For example, consider the case where the text provided by the user contains all the information needed to create an invoice except the name of the party, or where the user has to choose between a couple of parties before the rest of the invoice can be saved.

What's your opinion on the kind of integration Tryton should provide, if any?


I don't find your example to be very user friendly, in fact :slight_smile:. Which users able to define the JSON structure of the API calls would need an LLM to help them generate those?

But I get that while writing a proteus script (or something like that) this kind of knowledge is useful and can speed up development; in that case, though, the LLM should be integrated into your editor (there are already some plugins for neovim).

Of course, it's your duty as a programmer to ask yourself questions about the usage of code that comes from this source; it's not without issues regarding the copyright of the code produced, and of your own code that has been fed to the LLM.

Anyway, for users of Tryton an LLM could be useful in some cases, I suppose. But it would need to have access to the data generated by the users and processes that are stored (and used) by Tryton. From my point of view, this is where we can make something: by providing an API to which the LLM can connect and be fed a lot of information. That way, after the "algorithm" has worked its magic, the user can ask it questions about their own data (answers they could not discover themselves because of the volume or some hidden correlation). And they will be ready to welcome our :robot: overlord :smiley:.

But I don't think many companies are ready yet to send their whole database to another company in order to solve unidentified issues.

I can foresee a lot of security threats in such usage, because it's kind of giving an external "AI" user-level access to your company system. And writing access rules against such an "AI" will be as complicated as creating the "AI" itself.

I see I explained it very badly :frowning:

The idea is that the user just enters this part of the text:

Hi!
the purchase manager of Zara made an order (their reference is 2314) of 25 light bulbs model 42321 that should be delivered ASAP. Remember that they have never purchased yet.

Tryton sends that prompt inside a larger prompt message that gives the LLM the instructions to follow, just like I did in the first post. The LLM returns the JSON, and it is Tryton that interprets the output and carries out the necessary actions or asks the user for confirmation. For example, with the resulting JSON I put above, Tryton could show a list of actions with something like this:

A "Party" will be created with the following values:

Name: Zara
Code: ZARA

----

A "Purchase" will be created with the following values:

Party: Zara
Reference: Compra_Zara_2314
Purchase Date: 2023-03-21
Lines:
    Product: light bulbs model 42321
    Quantity: 25

Next to each action there would be an OK button that, when the user presses it, would create the records.
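Turning the JSON into the confirmation text above is mechanical. A rough sketch (in a real implementation the labels would come from the model and field definitions, not from string munging):

```python
def describe_action(action, model_labels=None):
    """Render one 'create' action as a human-readable summary."""
    labels = model_labels or {'party.party': 'Party', 'sale.sale': 'Sale',
                              'purchase.purchase': 'Purchase'}
    model = labels.get(action['model'], action['model'])
    lines = ['A "%s" will be created with the following values:' % model, '']
    for key, value in action.get('values', {}).items():
        label = key.replace('_', ' ').title()
        if isinstance(value, list):
            # One2many-style values: indent each sub-record's fields
            lines.append('%s:' % label)
            for item in value:
                for k, v in item.items():
                    lines.append('    %s: %s' % (k.replace('_', ' ').title(), v))
        else:
            lines.append('%s: %s' % (label, value))
    return '\n'.join(lines)
```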

There's no need to use an external AI. The recent LLaMA model, which is said to be similar to GPT-3 (I have not tested it myself), has been trained in 5 hours on a stock GPU. So it's something that can run on the same Tryton server, provided it has the right hardware.

That's why I suggest that Tryton only provides the tools for such an integration; then it's up to each company to decide whether they want to use a language model, and which one.

External or not, the threat is still there, as you cannot review the algorithm used. So you cannot know exactly what it can do.

OK, I get it now :slight_smile:

It's indeed something nice. Maybe it could be the base for a plugin of some sort?

Indeed, as Albert said, there are "portable" LLMs, and this threat could be mitigated by putting the LLM in a DMZ (if it can get out of this jail, it has earned its freedom :smiley:).

But I don't think the portable LLMs include the learning algorithms, just the result of those.

Of course, there is always the threat of an attack on any piece of the software stack.
But the threat I'm talking about is a new one. You cannot analyze an algorithm built from machine learning to ensure that it will always behave as you expect. You cannot be sure that, for a specific input, it will not generate as an answer "DROP DATABASE", for example, or "create a payment to Asimov and validate it".

So for me, such a tool should only be used to get input that can be cancelled without consequences, or that is validated by a human prior to usage.

I think it should be only on the client side, mainly to limit the risk, but also because the user should not be allowed to trigger actions that they could not do from the client.

So globally I think the main difficulty is to create a descriptive language to manipulate (script) the client. For me this sounds a lot like the Scripting Tryton Tutorial Video.
The remaining part is just the input/output API.

That's an approach I share. I was thinking of something like an 'ai_draft' state for each model. Also, I'm wondering about required fields, since sometimes not all the required information is supplied by the text or by the AI.

I think this can rely on the confirm and warning dialogs. So the scripting language should not support validating them.

That's one of the main difficulties, I think: how to make such a scripting language pause on an error to let the user fix it. The simple solution is to stop the execution on the first error (and never resume).
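One way to get "pause on error, let the user fix it, resume" rather than "stop and never resume" is to keep an explicit cursor into the script, so execution can restart at the failing step. A toy sketch (not an existing Tryton mechanism, just an illustration of the idea):

```python
class ScriptRunner:
    """Execute a list of steps, pausing on the first failure so the
    user can repair the failing step and resume from where it stopped."""

    def __init__(self, steps):
        self.steps = list(steps)   # each step is a callable
        self.index = 0             # cursor: next step to execute
        self.results = []

    def run(self):
        """Run until done or until a step raises.

        Returns the exception that paused execution, or None on success;
        on failure, self.index still points at the failing step."""
        while self.index < len(self.steps):
            step = self.steps[self.index]
            try:
                self.results.append(step())
            except Exception as error:
                return error
            self.index += 1
        return None
```

After the user fixes the problem (e.g. supplies the missing party), calling `run()` again continues from the paused step instead of replaying the whole script.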

For me it's not a good solution, because I think the interesting part is doing this integration in background mode, like reading external communications from some inbox or customer chat, and using the result as a pre-draft.
With the ai_draft approach I was thinking of not making fields required while the record is in this state. But that does not convince me either; for example, on a sale we need the party to compute the right taxes.

Another approach I just had is creating a table ai_json_loads like:

model (party.party, sale.sale ...)
json (ai_generated - something like the AI response to @albert's example)
name (ai_generated) -> rec_name
used (bool)

So when you enter, for example, a sale, with something like the action/attachment button you could select one of the unused JSON loads to try to fill as many gaps as possible.
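In plain Python terms the idea could look like this (a real implementation would of course be a ModelSQL with these columns; the class and function names here are just mine):

```python
from dataclasses import dataclass

@dataclass
class AIJsonLoad:
    """One parsed AI suggestion waiting to be applied (hypothetical table)."""
    model: str         # e.g. 'party.party', 'sale.sale'
    json: str          # raw AI-generated values
    name: str          # AI-generated rec_name shown to the user
    used: bool = False # set once the suggestion has been applied

def unused_loads(loads, model):
    """Suggestions the user can still pick from for a given model."""
    return [l for l in loads if l.model == model and not l.used]
```

The action/attachment button would then list `unused_loads(...)` for the current model and mark the chosen one as `used`.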

Another option, for another use case, is saving an ir.note. For example, when importing @albert's sale as I just proposed, there should be a note with:

delivered ASAP

But also for those AI-parsed messages that are not about creating anything new. So if the AI parses from the customer chat something like "I will come to pick up the shipment A123 on Monday 12am", it should generate an ir.note with this info on the shipment.

Sure. That's why I proposed that the user could validate, and that we only allow a set of actions that we define.

We could allow full automation on the server side as long as we limit the set of actions or values allowed.

For example, I could allow creating and confirming sales from e-mails as long as the party is X.
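Such a rule could be expressed declaratively as an allow-list checked before any server-side automation runs. A hypothetical sketch (the rule format is made up for illustration):

```python
# Each rule says which action/model pairs may run unattended, optionally
# restricted by a predicate on the values. (Hypothetical config format.)
AUTOMATION_RULES = [
    {'action': 'create', 'model': 'sale.sale',
     'condition': lambda values: values.get('party') == 'X'},
]

def allowed_unattended(action):
    """True only if some rule explicitly permits running this action
    without human confirmation; everything else needs validation."""
    for rule in AUTOMATION_RULES:
        if (rule['action'] == action['action']
                and rule['model'] == action['model']
                and rule.get('condition', lambda v: True)(action.get('values', {}))):
            return True
    return False
```

Default-deny keeps the "DROP DATABASE"-style worry out of scope: an action the rules do not mention simply falls back to the confirmation inbox.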

A scripting API for sao makes sense to me.

What about using it as a source?

FTR, the suggested scripting API for sao could be used for more than LLM extraction. OCR could also parse image-to-text and send that as suggested input for invoices, which the user can validate before it is included.