Matching existing move lines on statement lines

pokoli · October 30, 2024, 2:45pm

Rational

When doing entering statement it is common to have lines matching receivable or payable lines in the system. Currently on tryton you need to select the party and account and do the reconciliation latter.

When the amount of lines to match is big, this is a long task for the user.
As tryton has all the information, it will be good to automate the creation of such lines, so the user does not need to type all the information.

Proposal

Add a button on the statement lines to match existing payments.
The matching can be done by amount and the line will be keep only if there is a single line matching the same amount.

The matching criteria should be customized and probably it will be good to match the same line using different criteria. For example, if a line for the same amount is found, we may try to find if the matturity date matches the origin date, so the search is redefined and the number of matching lines is increased.

Implementation

We have implement it as custom code using the following code: as extension of the apply rules button.

    @classmethod
    @ModelView.button
    def apply_rules(cls, statements):
        pool = Pool()
        Line = pool.get('account.statement.line')
        MoveLine = pool.get('account.move.line')
        Invoice = pool.get('account.invoice')

        amount2lines = defaultdict(list)
        for line in MoveLine.search([
                ('reconciliation', '=', None),
                ('account.party_required', '=', True),
                ['OR',
                    ('account.type.payable', '=', True),
                    ('account.type.receivable', '=', True),
                 ],
                ]):
            amount2lines[line.amount].append(line)

        super().apply_rules(statements)
        to_save = []
        for statement in statements:
            for origin in statement.origins:
                if origin.lines:
                    continue
                lines = amount2lines.get(origin.amount)
                if lines and len(lines) == 1:
                    origin_line, = lines
                    line = Line()
                    line.statement = origin.statement
                    line.number = origin.number
                    line.description = origin.description
                    line.date = origin.date
                    line.origin = origin
                    line.amount = origin.amount
                    line.party = origin_line.party
                    line.account = origin_line.account
                    if isinstance(origin_line.move.origin, Invoice):
                        line.related_to = origin_line.move.origin
                    to_save.append(line)
        if to_save:
            Line.save(to_save)

This misses the implementation of matching multiple criterias, but it should not be hard to implement by using the criteria as key of the dictionary.

ced · October 30, 2024, 2:55pm

For me it does not make much sense to use amount as only criteria.

pokoli · October 30, 2024, 2:58pm

Why not? Which will be the the criteria which makes sense for you?

I guess matching by dates will be also interesting, but if there is just a single line which maches the amount it is clear that such line is the one we are searching. Don’t it?

ced · October 30, 2024, 3:17pm

It is just random.

An invoice or contract reference, a customer account number, bank account number etc. So mainly the current criteria of the statement rules.

pokoli · October 30, 2024, 3:20pm

If you say amount is random probably you do not understand the problem we are trying to solve with the proposal. I do not think it make sense to invest more time in this discussion.

Thanks for your fast reply!

ced · October 30, 2024, 4:07pm

Indeed I do not understand but it is probably because you did not explain it.
I just read the proposed code and it does a random matching base only on amount.

albert · October 30, 2024, 5:09pm

For me, the right approach is to classify potential criterias with some probability criteria.

When the probability is very high, we should simply fill in the information for the user, but when that probability is lower, we just show the user a set of suggestions so she just has to click the most appropriate one.

We implemented something in those lines in the account_statement_enable_banking module, and of course it would be great if something along those lines was considered into core.

For that to be usable other things need to happen though (all implemented in the same module – we may split it into an independent one in the future):

Have a menu entry that brings directly to origins
Make each origin have its own workflow

That makes even more sense when using the integration with EnableBanking or Open Banking in general given that a cron job obtains the bank transactions and does that even more than once a day.

By the way, we decided not to use the rule system at all: no user creates rules manually and there are mechanisms to make reasonable suggestions based on past decisions. That is also implemented in the module and users are loving it AFAIK.

ced · October 30, 2024, 6:06pm

I still do not understand. Reference match or does not match. I do not see the point to suggest matching with an invoice number with 1 number different.

albert · October 30, 2024, 6:17pm

What do you mean by this?

ced · October 30, 2024, 6:21pm

You are talking about probability so I guess you are comparing the statement line information with existing information like invoice number etc.
So for me a probability, in this context, is a distance which make invoice number almost the same (differ by 1 number) are very close. But this does not make sense for me, I do not want to match with the invoice of another customer that has ordered just before the paying customer.

In other words, for me in an ERP the information space is not enough discret for approximate distance means something.

nicoe · October 30, 2024, 8:54pm

The percentage could be based on rules and not a computation of some kind of distance.

eg:

Let’s say we can match on

reference number
party
amount

We could have rules so that:

identical reference number → 100% probability
different reference number → 0% probability
no reference number but same amount and same party → 60% probability
same party → 40% probability
different party → 0% probability

I don’t have any idea how we could model that but it could be useful because sometimes people forget to put the relevant information and some fuzzy matching can help.

I guess this is the kind of idea @albert is talking about.

ced · October 30, 2024, 9:11pm

I do not see how it is different from the existing rule system as you will still have to decide for a threshold (which will be a fixed value and so no different from the first workable rule).

But I still do not understand the reason to try to match with amount. It is the most random data available on a statement.

In contrario I can see there could be improvement for filling the statement line from the origin. For now either the field is pre-filled with the origin because we found only one matching value or the field is empty. It would be good if we could provide a kind of completion with possible matching or prefill the input with the raw origin text data. This will prevent user to copy/paste the raw data to the statement line.
This is specially for the party because it is mainly the only field that may be missing because often it is incomplete or with typo.
And the “related to” is indeed optional and should probably not be filled when dealing with a high amount of lines (it is better to reconcile by account than by document).

albert · October 30, 2024, 9:21pm

No, I did not suggest that.

Probably my proposal was a bit off-topic or more generic than what Sergi was talking about. The thing is that there are cases such as the one proposed by Sergi that can seem random but in reality correlate very well with what the user needs.

For example, if you have one and only one invoice of 12.723,33€ and you have a transaction in the bank with that amount. The probability that the transaction corresponds to the payment of the invoice is extremely high.

What I suggest is that in those cases in which we (“we” could depend on the company) think that the probability is very high, we can automatically put that invoice in the statement line. In the case that we consider that the probability is not very high but quite high, then we can have another place in the Origin view in which we tell the user “hey, you may want to consider to use this invoice”.

albert · October 30, 2024, 9:28pm

Those are examples, but when you obtain the information from the bank, you also get a text that can contain any of the above information or something else.

For example, you may get “TRANSFER FROM SUPERCUSTOMER WHATEVER”. Maybe you have that party as “My Company, SA” so a fuzzy match between party name and the text will never match. However, the next time you get a bank transfer from the same customer we can look at the previous ones and guess quite confidently that we must use the same party the user selected the last time.

Creating a probability out of this is tricky because one wants to combine hardcoded heuristics with information previously provided by the user but it is certainly doable, and obviously it is possible to improve the current behaviour.

ced · October 30, 2024, 9:33pm

I’m strongly against to implement such weak logic in Tryton.
The troubles of matching with the wrong party is immense.

The simplest way is to create a statement rule.
But it may be better to have a “special” kind of rule that is able to search in the past relation made by the user between raw data and target. This would avoid to create too much rules.
(see Improve statement rule party matching - #7 by ced).

ced · October 30, 2024, 9:37pm

Indeed it is already implemented: Store rule keyword and use them for future matching (5deaa442f09a) · Commits · Tryton / Tryton · GitLab

ced · October 30, 2024, 9:47pm

For the record I checked on a large production database and 95% of the invoices have duplicate amount.
Of course on small database it is 10% but small database does not need such heuristic as matching or reconciliation is much more simple.

albert · October 30, 2024, 11:06pm

Having duplicate amounts highly depends on the nature of the business. If you invoice recurring services it will be very frequent. In other cases it is much more infrequent.
Having many duplicates does not mean that the duplicate is pending to be reconciled at the moment you reconcile the banks. Specially if you do that work daily.
Sergi’s proposal aimed to match if there is a single invoice that matches. If there are more than one, nothing is suggested.
AFAIU, your strong opposition does not consider the possibility of having some suggestions that the user only has to select from a set of options.
We’ve been using this kinds of proposals for years (probably since we started with Tryton 10 years ago) with satisfactory results.

ced · October 30, 2024, 11:38pm

This is not the case of the business I checked and that is quite logical following the Benford’s law.
And indeed the state does not matter because there is no reason that the distribution of open invoices is different than the paid invoices. Only the size of open invoices matter, if it is a small number than it may not show the same distribution because it will no be representative sample. But if it is small there is no difficulties to make the matching.

It does not really matter because your proposal is to reconcile no matter the party.

This does not make it right.

But I just talked about possible improvement.

Maybe you have special case, maybe you do not care about mistakes… But this stays random.

pokoli · October 31, 2024, 11:43am

We did tests on real databases with the proposed code and the results are quite good. The number of matching lines that won’t be found with standard rules is increased.

Thats exactly the point!

If there is a single line matching the amount, it is not possible to match with the wrong party.

Yes, I remember that this was implemented in custom nan-tic statement modules and this is something that users that used such feature are really missing when using the account_statement_rule module.

Any statement rule makes it right. I found several cases where the user creates a rule that does fuzzy matches and the proposal is not right. So any rule may be wrong.

Another problem is that the user does not have any way to know which lines have been manually reviewed and which ones are untrusted (create by the system).
But this is probably another discussion.