Using C extensions to speed up Tryton

nicoe · January 21, 2021, 3:22pm

These last days I had been working on making some core methods of Tryton a bit more efficient. Here are a small post to show you what I have done and some conclusions from those tests

Benchmarking Trytond

The first thing to do is to benchmark what you’re working on. In my case it concerns access to Tryton fields. I used a small script with trytond-console in order to create 200 000 move lines.

import decimal
import datetime
import time
import os
import warnings

warnings.filterwarnings('ignore')

Move = pool.get('account.move')  # noqa
MoveLine = pool.get('account.move.line')  # noqa
Journal = pool.get('account.journal')  # noqa
Account = pool.get('account.account')  # noqa

journal, = Journal.search([('code', '=', 'CASH')])
acc1, acc2 = Account.search([
        ('party_required', '=', False),
        ], limit=2)

moves = []
for i in range(10_000):
    lines = []
    move = Move(
        description=f'Move {i}', journal=journal, company=1, period=1,
        date=datetime.date(2021, 1, 1))
    nb_line = 20
    for j in range(nb_line):
        line = MoveLine(debit=decimal.Decimal(j + 1), account=acc2)
        lines.append(line)
    lines.append(MoveLine(
            credit=decimal.Decimal(nb_line * (nb_line + 1) / 2), account=acc1))
    move.lines = lines
    moves.append(move)

Move.save(moves)
transaction.commit()

Then here’s the script I use to profile the calls used to fetch and work on the lines’ fields:

import cProfile
import pstats
import timeit

MoveLine = pool.get('account.move.line')  # noqa
move_lines = MoveLine.search([])

with cProfile.Profile() as profile:
    sum(l.debit - l.credit for l in move_lines)  # noqa

stats = pstats.Stats(profile)
stats.sort_stats(pstats.SortKey.CUMULATIVE, pstats.SortKey.FILENAME)
stats.print_stats(80)

other_moves = MoveLine.search([], limit=200)

t = timeit.Timer(
    'sum(l.debit - l.credit for l in other_moves)',
    'sum(l.debit - l.credit for l in other_moves)',
    globals={'other_moves': other_moves})
print(t.timeit(500))

I use cProfile in order to get a breakdown of the time spent in the different methods of Tryton.
But unfortunately, for methods implemented in C, profiling does not work (as the python documentation explains), in this case you should rely on timeit which allow to run small snippets a great number of time. In order to get consistent timings, timeit will disable the garbage collector.

The Timer class takes numerous arguments, the only noticeable thing is that I run a first time the snippet so that Tryton’s cache is filled when timing the snippet.

Beware that when profiling you should use PYTHONOPTIMIZE=1 it removes the assert in the codes (and there is particularly nasty one in ModelStorage.__init__). You should also know that the results you will have are variable, the load on your computer could be a factor (and yes something as mundane as watching a BEAK> video while waiting can result in some differences).

Trytond’s benchmark result

         68569122 function calls (63706962 primitive calls) in 43.894 seconds

   Ordered by: cumulative time, file name
   List reduced from 190 to 80 due to restriction <80>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.052    0.052   43.894   43.894 {built-in method builtins.sum}
   210001    0.152    0.000   43.842    0.000 <string>:9(<genexpr>)
5280200/420000    1.348    0.000   43.690    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/fields/field.py:333(__get__)
   420000    8.913    0.000   43.368    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1441(__getattr__)
  1470000    5.041    0.000   22.344    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1555(instantiate)
   630000    2.441    0.000    4.372    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:83(get_cache)
   630000    1.699    0.000    2.707    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1403(__init__)
   630000    0.790    0.000    2.499    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/fields/many2one.py:98(get_target)
   420200    0.475    0.000    2.358    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:359(__missing__)
   630000    0.841    0.000    1.906    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:386(__init__)
  1680200    1.232    0.000    1.825    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:374(_check_size_limit)
  1260000    1.176    0.000    1.792    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:348(__init__)
   210000    0.829    0.000    1.779    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:439(_update)
      210    0.051    0.000    1.777    0.008 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelsql.py:679(read)
   420200    0.446    0.000    1.655    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:355(__setitem__)
   630000    1.167    0.000    1.548    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:37(freeze)
   630210    0.986    0.000    1.451    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:176(set_context)
  3360400    0.839    0.000    1.277    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:371(_setitem)
   210210    0.174    0.000    1.261    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/tools/__init__.py:27(cursor_dict)
  7351240    1.197    0.000    1.197    0.000 {built-in method builtins.setattr}
   630630    0.732    0.000    1.136    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:26(__exit__)
  4418820    1.079    0.000    1.079    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:61(__new__)
  4865302    0.829    0.000    1.016    0.000 {built-in method builtins.isinstance}
   631260    0.627    0.000    0.902    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/pool.py:59(__new__)
   630000    0.754    0.000    0.754    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:248(__init__)
   210000    0.712    0.000    0.712    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/tools/__init__.py:34(<dictcomp>)
   630000    0.449    0.000    0.665    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:92(cache_size)
4436442/4436232    0.658    0.000    0.660    0.000 {built-in method builtins.getattr}
   420000    0.572    0.000    0.633    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:286(__getattr__)
   630000    0.543    0.000    0.543    0.000 {built-in method builtins.sorted}
   631260    0.379    0.000    0.514    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/pool.py:73(__init__)
      210    0.038    0.000    0.502    0.002 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1622(<dictcomp>)
  3996720    0.491    0.000    0.491    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/fields/field.py:324(_get_context)
   210210    0.195    0.000    0.464    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1543(unique)
   210000    0.428    0.000    0.428    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1645(<dictcomp>)
   630630    0.266    0.000    0.408    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:23(__enter__)
   630000    0.397    0.000    0.397    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:85(<genexpr>)
   210210    0.376    0.000    0.376    0.000 {method 'fetchmany' of 'psycopg2.extensions.cursor' objects}
  1890000    0.359    0.000    0.359    0.000 {method 'setdefault' of 'dict' objects}
      210    0.332    0.002    0.332    0.002 {method 'index' of 'list' objects}
   416400    0.322    0.000    0.322    0.000 {method 'popitem' of 'collections.OrderedDict' objects}
   631680    0.297    0.000    0.297    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/pool.py:174(get)
  2097860    0.271    0.000    0.271    0.000 {built-in method builtins.len}
  2311460    0.262    0.000    0.262    0.000 {method 'items' of 'dict' objects}
   210000    0.147    0.000    0.255    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1535(filter_)
  2521050    0.254    0.000    0.254    0.000 {method 'pop' of 'dict' objects}
      210    0.001    0.000    0.251    0.001 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/backend/postgresql/database.py:66(execute)
      210    0.247    0.001    0.250    0.001 {function LoggingCursor.execute at 0x7fd5ff754dc0}
   435100    0.240    0.000    0.240    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1434(_cache)
   420200    0.182    0.000    0.228    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:361(__init__)
   630840    0.184    0.000    0.228    0.000 {method 'update' of 'dict' objects}
   210000    0.072    0.000    0.187    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/abc.py:96(__instancecheck__)
  1474830    0.175    0.000    0.175    0.000 {method 'get' of 'dict' objects}
   630630    0.157    0.000    0.157    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/transaction.py:20(__init__)
   420610    0.090    0.000    0.141    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:365(_getitem)
   630420    0.129    0.000    0.129    0.000 {method 'copy' of 'dict' objects}
   845040    0.125    0.000    0.125    0.000 {method 'startswith' of 'str' objects}
   210000    0.115    0.000    0.115    0.000 {built-in method _abc._abc_instancecheck}
   630000    0.094    0.000    0.094    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:41(<genexpr>)
   847143    0.090    0.000    0.090    0.000 {method 'append' of 'list' objects}
   420000    0.090    0.000    0.090    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/cache.py:395(refresh)
      420    0.031    0.000    0.066    0.000 {method 'sort' of 'list' objects}
      210    0.050    0.000    0.065    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/tools/misc.py:95(reduce_ids)
      630    0.001    0.000    0.057    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:193(__iter__)
      630    0.001    0.000    0.044    0.000 {built-in method builtins.max}
      210    0.021    0.000    0.042    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelsql.py:794(<genexpr>)
      210    0.005    0.000    0.042    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:550(__str__)
     7140    0.018    0.000    0.037    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1493(not_cached)
   210000    0.036    0.000    0.036    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1624(<lambda>)
  841/630    0.002    0.000    0.029    0.000 {method 'join' of 'str' objects}
     3360    0.005    0.000    0.017    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:479(_format_column)
      210    0.005    0.000    0.014    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:591(params)
   210000    0.014    0.000    0.014    0.000 {method 'add' of 'set' objects}
    14490    0.006    0.000    0.010    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/model.py:377(_contains)
     3782    0.003    0.000    0.008    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:1306(__str__)
     3360    0.003    0.000    0.008    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/fields/field.py:358(sql_type)
     7140    0.005    0.000    0.007    0.000 /home/nicoe/projets/tryton/tryton-env/tryton-env/trytond/trytond/model/modelstorage.py:1500(to_load)
      420    0.005    0.000    0.007    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:496(_window_functions)
      210    0.007    0.000    0.007    0.000 {built-in method fromkeys}
     3992    0.002    0.000    0.007    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:249(alias)


0.8432038589962758

As you can see the call to __get__ yet quite simple is still quite long. That’s why we had the idea to rewrite it in C. Another point is the call to __getattr__ in ModelStorage, in there there is an index() call on the list of ids which in the profile case is quite big.

`get` in C’s benchmark result

The implementation of __get__ is quite naive, but it’s only for testing purpose. But anyway given the dynamicity of Python I think that the only part where some improvements could be done would be in the way we handle the name == 'id' case (by fetching directly the data from the object’s struct).

PyObject *
Field_getter(FieldObject *self, PyObject *args) {

    const char *field_name;
    PyObject *inst, *type, *result;
    PyArg_UnpackTuple(args, "__get__", 2, 2, &inst, &type);

    if (inst == NULL || inst == Py_None) {
        /* Not binding to an object, or already bound */
        Py_INCREF(self);
        return (PyObject *)self;
    }

    field_name = PyUnicode_AsUTF8(
            PyObject_GetAttrString((PyObject *) self, "name"));
    if (strcmp(field_name, "id") == 0) {
        result = PyObject_GetAttrString(inst, "_id");
    } else {
        result = PyObject_CallMethod(inst, "__getattr__", "s", field_name);
    }

    if (result == NULL) {
        PyTypeObject *tp = Py_TYPE(inst);
        PyErr_Format(PyExc_AttributeError,
                "'%.50s' object has no attribute '%U'",
                tp->tp_name, PyUnicode_FromString(field_name));
        return NULL;
    }

    Py_INCREF(result);
    return result;
}

The result where quite disappointing:

         63288922 function calls (63286962 primitive calls) in 46.804 seconds

   Ordered by: cumulative time, file name
   List reduced from 189 to 80 due to restriction <80>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.054    0.054   46.804   46.804 {built-in method builtins.sum}
   210001    0.627    0.000   46.750    0.000 <string>:9(<genexpr>)
   420000   10.996    0.000   46.123    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1441(__getattr__)
  1470000    5.273    0.000   23.211    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1555(instantiate)
   630000    2.603    0.000    4.639    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:83(get_cache)
   630000    1.590    0.000    2.664    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1403(__init__)
   630000    0.856    0.000    2.594    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/fields/many2one.py:98(get_target)
   420200    0.490    0.000    2.449    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:359(__missing__)
   630000    0.897    0.000    1.991    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:386(__init__)
      210    0.055    0.000    1.893    0.009 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelsql.py:679(read)
  1680200    1.242    0.000    1.872    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:374(_check_size_limit)
   210000    0.863    0.000    1.868    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:439(_update)
  1260000    1.213    0.000    1.837    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:348(__init__)
   420200    0.465    0.000    1.712    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:355(__setitem__)
   630000    1.221    0.000    1.608    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:37(freeze)
   630210    1.014    0.000    1.514    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:176(set_context)
   210210    0.176    0.000    1.355    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/tools/__init__.py:27(cursor_dict)
  3360400    0.837    0.000    1.347    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:371(_setitem)
  7351240    1.287    0.000    1.287    0.000 {built-in method builtins.setattr}
   630630    0.768    0.000    1.175    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:26(__exit__)
  4418820    1.104    0.000    1.104    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:61(__new__)
  4865302    0.897    0.000    1.092    0.000 {built-in method builtins.isinstance}
   631260    0.661    0.000    0.951    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/pool.py:59(__new__)
   630000    0.811    0.000    0.811    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:248(__init__)
4436442/4436232    0.805    0.000    0.808    0.000 {built-in method builtins.getattr}
   210000    0.779    0.000    0.779    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/tools/__init__.py:34(<dictcomp>)
   420000    0.662    0.000    0.727    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:286(__getattr__)
   630000    0.455    0.000    0.683    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:92(cache_size)
   630000    0.601    0.000    0.601    0.000 {built-in method builtins.sorted}
      210    0.039    0.000    0.531    0.003 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1622(<dictcomp>)
   631260    0.388    0.000    0.521    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/pool.py:73(__init__)
  3996720    0.510    0.000    0.510    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/fields/field.py:336(_get_context)
   210210    0.206    0.000    0.492    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1543(unique)
   210000    0.481    0.000    0.481    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1645(<dictcomp>)
   630630    0.281    0.000    0.426    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:23(__enter__)
   630000    0.408    0.000    0.408    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:85(<genexpr>)
   210210    0.400    0.000    0.400    0.000 {method 'fetchmany' of 'psycopg2.extensions.cursor' objects}
  1890000    0.395    0.000    0.395    0.000 {method 'setdefault' of 'dict' objects}
      210    0.349    0.002    0.349    0.002 {method 'index' of 'list' objects}
   416400    0.337    0.000    0.337    0.000 {method 'popitem' of 'collections.OrderedDict' objects}
  2097860    0.292    0.000    0.292    0.000 {built-in method builtins.len}
  2311460    0.273    0.000    0.273    0.000 {method 'items' of 'dict' objects}
   631680    0.272    0.000    0.272    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/pool.py:174(get)
   210000    0.207    0.000    0.272    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1535(filter_)
  2521050    0.264    0.000    0.264    0.000 {method 'pop' of 'dict' objects}
      210    0.001    0.000    0.256    0.001 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/backend/postgresql/database.py:66(execute)
      210    0.253    0.001    0.255    0.001 {function LoggingCursor.execute at 0x7f7680138ca0}
   630840    0.200    0.000    0.248    0.000 {method 'update' of 'dict' objects}
   420200    0.198    0.000    0.247    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:361(__init__)
   435100    0.228    0.000    0.228    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1434(_cache)
   420610    0.097    0.000    0.198    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:365(_getitem)
   210000    0.072    0.000    0.195    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/abc.py:96(__instancecheck__)
  1474830    0.189    0.000    0.189    0.000 {method 'get' of 'dict' objects}
   630630    0.169    0.000    0.169    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/transaction.py:20(__init__)
   630420    0.137    0.000    0.137    0.000 {method 'copy' of 'dict' objects}
   845040    0.131    0.000    0.131    0.000 {method 'startswith' of 'str' objects}
   210000    0.123    0.000    0.123    0.000 {built-in method _abc._abc_instancecheck}
   420000    0.099    0.000    0.099    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:395(refresh)
   630000    0.099    0.000    0.099    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/cache.py:41(<genexpr>)
   847143    0.095    0.000    0.095    0.000 {method 'append' of 'list' objects}
      420    0.032    0.000    0.073    0.000 {method 'sort' of 'list' objects}
      210    0.052    0.000    0.067    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/tools/misc.py:95(reduce_ids)
      630    0.001    0.000    0.058    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:193(__iter__)
      630    0.001    0.000    0.049    0.000 {built-in method builtins.max}
      210    0.022    0.000    0.048    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelsql.py:794(<genexpr>)
      210    0.004    0.000    0.043    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:550(__str__)
   210000    0.041    0.000    0.041    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1624(<lambda>)
     7140    0.026    0.000    0.040    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1493(not_cached)
  841/630    0.002    0.000    0.030    0.000 {method 'join' of 'str' objects}
     3360    0.006    0.000    0.017    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:479(_format_column)
      210    0.006    0.000    0.015    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:591(params)
   210000    0.014    0.000    0.014    0.000 {method 'add' of 'set' objects}
    14490    0.007    0.000    0.011    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/model.py:377(_contains)
     3782    0.003    0.000    0.008    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:1306(__str__)
     3360    0.003    0.000    0.008    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/fields/field.py:362(sql_type)
     7140    0.006    0.000    0.008    0.000 /home/nicoe/projets/tryton/tryton-env/FieldGetC/trytond/trytond/model/modelstorage.py:1500(to_load)
      210    0.007    0.000    0.007    0.000 {built-in method fromkeys}
      420    0.005    0.000    0.007    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:496(_window_functions)
     3992    0.002    0.000    0.007    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:249(alias)
      210    0.001    0.000    0.006    0.000 /home/nicoe/.local/share/virtualenvs/tryton-trunk/lib/python3.8/site-packages/sql/__init__.py:262(select)


1.2678403660247568

Due to limitations in the profiler, the call to __get__ is not present (but if I add a call that will just make a call to super() the results would be quite like the results of the previous point).

The timeit figures is a bit slower than the pure python implementation, I attribute that to some overhead that is probably added by the way I implemented the Field and its accelerator class.

Conclusion

It’s quite hard to find some place where writing some part of the implementation in C would make a big difference.

In fact retrospectively I think that good candidates are part of the framework that could be “self-contained”: by not being too tied to the other part of Trytond, they could be completely implemented in C thus minimizing the need to use the Python datastructures (to some extent of course).

A good candidate would be our cache implementations as they could be quite independent of the rest and yet the do appear at the top of the methods that are slow (not THAT slow but a bit slow anyway).

JCavallo · January 21, 2021, 4:01pm

Hi,

From working on “performances” related problem rather frequently, I think there may be a misinterpretation, probably caused by CPU throttling / other processes.

If you compare other calls, they all took longer anyway, so it might be worth repeating the test multiple times, ideally by forcing the CPU stepping so it does not change.

(for instance, the second line was 4-5 times in the second call)

Maybe you already did that, but I cannot see how rewriting in C would slow the call (unless the implementation is flawed, which I honestly did not check)

ced · January 21, 2021, 4:11pm

Indeed instead of using timeit, pyperf should be used.
https://twitter.com/VictorStinner/status/1349666308682612736

nicoe · January 21, 2021, 4:55pm

In fact pyperf is just a wrapper around timeit to provide some statistics. But it provides the nice pyperf system tune command that can be used to set IRQ affinity and so on (I should reboot in order to isolate a CPU afterwards to test that as well).

To be honest, the results here are consistent with those I had yesterday (I ran those in the console without running anything else not even the cronjobs). But as I did not have them in a file, I just ran the test again and checked that they where consistents.

Now that I have use pyperf system tune here are the results:

C implementation: 2.10175000000163
Pure python implementation: 1.5666235190001316

And indeed the timing of the calls in the profiler are more alike.

I know it seems strange, as I said one of the reason for the difference is to search in the way I implemented the accelerator maybe:

try:
    from ._field import Field as CField
except ImportError:
    class CField(object):
        def __get__(self, inst, cls):
            if inst is None:
                return self
            assert self.name is not None
            if self.name == 'id':
                return inst._id
            return inst.__getattr__(self.name)

And then Field inherits from CField (that way we can fallback on a pure Python implementation if the accelerator is not present). Maybe that this additional level of indirection is what slowly builds up to this difference.

rmu · January 22, 2021, 9:41am

Did you try running the benchmark with the pypy jit?

albert · January 22, 2021, 9:50am

You could try leaving __get__ implementation in “class Field” as:

def __get__(self, inst, cls):
    if inst is None:
        return self
    return ins.__getattr__(self.name)

and add a new class FieldId with its own __get__ implementation:

def __get__(self, inst, cls):
    if inst is None:
        return self
    return inst._id

albert · January 22, 2021, 9:58am

Also, I would compare how much it takes running the script with just a 1000 moves (that is, without going beyond the cache boundary).

This is important in order to understand what is the performance of the methods involved in just the plain access to fields once they’re in memory and what is the performance when data must be loaded.

In the latter case, you may simply increase record cache size to say 10000 and see its impact on the performance.

albert · January 23, 2021, 4:35pm

I also suggest you take a deep look at this:

You’re just looping over 200.000 records but you get 1.47M records instantiated (7 times the number of records you need) which is basically the same I already reported.

You can improve that situation a lot by:

Lazy loading create_uid/write_uid
Lazy loading all M2o fields

I would also do some checks with:

Lazy loading write_date, create_date
Lazy loading all the fields that you’re not using in the loop

albert · January 23, 2021, 4:44pm

Mabye you can make a quick comparison with this C implementation of LRU Dict:

That said, it seems that we spend a lot of time in _check_size_limit. Probably because once the cache is filled-in with each new record the oldest one must be removed. Maybe it would be better if the cache was more “flexible”. So instead of having a hard limit (which is 1000 by default) there was a soft and hard limit. So, once the hard limit is reached “hard - soft” records are removed from the cache at once. Say hard = 1500 records, and soft = 1000. Then, once we reach 1500 records, the 500 least recently used are removed at once.

This idea supposes that it’s faster to remove 500 records at once rather than one by one, but I’m not sure that is true. Just an idea for a possible improvement.

Another possibility would be to make record_cache_size dynamic. So, by default it is 100, but once the first 100 records have been loaded, for that record list, then it becomes 1000, and after we’ve read those 1000 records, then it is increased to 10000. So, the larger the number of records we read, the larger the number of records we read at once (with upper limits of course).

ced · January 23, 2021, 9:45pm

I doubt there will be any improvement against an optimized version of LRUDict.
By optimized I mean that it should be possible now to make an implementation using only dict and not OrderedDict which adds a doubly linked list.

There is no method to remove more than one record at once.

And so there is no limit and all memory is used for cache.

albert · January 24, 2021, 12:24am

There’d be a limit as I already stated. You could set that limit to 10000 or whatever is defined. The speed at which the cache_size increases could also be defined.

The problem with the current implementation is that say that you have a loop like this:

for invoice in invoices:
    if invoice.lines:
        break

if invoices have a large number of lines, they will all be loaded even if it’s never necessary. So if cache_size is set to a large number because you need to process 200 thousand invoices (just in Nicolas example) you’re loading lots of lines even if you only need a large cache size for invoices (not for the lines).

So if preloading starts with 10 records, later jumps to 100, later 1000, and eventually 10000 (and stay at 10000 records from there on if 10000 has been set as the upper limit), we will only be using a lot of cache memory for those models and lists that really use it.

Seeing the results of lru-cache I think that is the way to go but another approach may be to clear the whole cache when read() is going to completely fill it

Something in the lines:

if len(result) >= cache.size_limit:
    cache.clear()

This should prevent the removal of item by item which is probably going to be slow. Would only apply if records are not already in the cache, so some extra logic would be required.

ced · January 24, 2021, 1:17am

That’s always the risk of pre-loading data. But it can always be tune by the developer.

This is not how the code works. We do not fill the cache but we read per IN_MAX.

lru-cache can not be used because it is missing default_factory.

Sorry but it makes no sense to me. It is not the job of the caller to worry about the internal of the container.
Also normally we always read less than the size of the cache. So your case will never happen.

It could be a way to implement LRUDict that when reaching the limit, it clear itself (of course before adding the new value).
But clearing a dict is still O(n) (as it has to loop over all keys) but deleting a key is average O(1) (worst O(n)). So in the best case it has the same complexity and the worst it is twice faster. But this is without counting the benefit that former cache value may be reused for example when there are duplicate ids in a list or if records has the same target etc.

nicoe · January 25, 2021, 9:59am

When I said that it’s where I think we could see improvements it’s indeed the kind of idea I had in mind but it was just an idea.

In case we implement our own cache nothing prevent us from creating one (apart the fact that it’s an added source of bug and so on obviously).

But I agree wholeheartedly with the idea of first testing with the new constraint on the order of the keys that brings python 3.7 (although I like the fact that OrderedDict conveys the semantic way better and that I find this constraint a bit strange but this is more a philosophical objection ).

Hello back @rmu, no I didn’t try using pypi because the goal is the find a better implementation for Tryton not to test if another interpreter would be better . But you’re probably right I guess that the JIT would bring some improvements in this area without any dev …

numaelis · April 15, 2024, 1:49am

Hi!
using some sort of load cache before fetching the table values?
like this:

        suma = Decimal(0)
        for move in move_lines:
            move.select_cache(["debit","credit"],'account_move_line')
            res = move.debit - move.credit
            suma+=res

def select_cache(self, list_names, table):
        if self.id:
            cur = Transaction().connection.cursor()
            field_str = ",".join(list_names)
            cur.execute("SELECT "+ field_str+"  FROM "+table+" where id="+str(self.id)+" ;")
            result = cur.fetchone()            
            if result:
                ix = 0
                for name in list_names:
                    if self._local_cache.get(self.id,-1)==-1:
                        self._local_cache[self.id] = {}
                    self._local_cache[self.id][name] = result[ix]                    
                    ix+=1

numaelis · April 15, 2024, 1:53am

using cprofile:

suma = sum(l.debit - l.credit for l in move_lines)  # noqa

27421 function calls (25756 primitive calls) in 1.575 seconds

        suma = Decimal(0)
        for move in move_lines:
            move.select_cache(["debit","credit"],'account_move_line')
            res = move.debit - move.credit
            suma+=res

27421 function calls (25756 primitive calls) in 0.632 seconds

Maybe this is useful when there are many records and we want to maintain some pythonic style

ced · April 15, 2024, 7:11am

What are the condition of such test?
Because I doubt that making a SQL query at each iteration would be faster than the current preload of cache per bunch.
Is the list of move_lines has the cache correctly align?
How much records are there?

numaelis · April 15, 2024, 12:12pm

The previous test was in an old version of tryton.
Now I have another test for version 6.4
account_move_line table has 1609 rows
wearing:
sum = sum(l.debit - l.credit for l in move_lines) # noqa
I obtained(adjusting cprofile):

892669 function calls (796329 primitive calls) in 0.466 seconds

 Ordered by: cumulative time

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 76683/3238    0.016    0.000    0.385    0.000 field.py:398(__get__)
 6449/3231    0.063    0.000    0.383    0.000 modelstorage.py:1622(__getattr__)

although sometimes the time was considerably reduced to this:

    104045 function calls (87219 primitive calls) in 0.122 seconds

    Ordered by: cumulative time

    ncalls tottime percall cumtime percall filename:lineno(function)
       2/1 0.001 0.000 0.093 0.093 modelsql.py:1744(search)
         5 0.000 0.000 0.059 0.012 database.py:65(execute)

using select before

105657 function calls (95267 primitive calls) in 0.273 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1614    0.002    0.000    0.183    0.000 database.py:65(execute)
     1614    0.178    0.000    0.181    0.000 {function LoggingCursor.execute at 0x7f2870ff6a60}
     1609    0.013    0.000    0.150    0.000 modelstorage.py:1601(select_cache)

ced · April 15, 2024, 1:15pm

I do not understand what I’m reading.

numaelis · April 15, 2024, 2:03pm

ok. what I understand happens in:

sum = sum(l.debit - l.credit for l in move_lines) # noqa

is that when doing l.debit, somewhere in the code it is a select debit from table
and when doing l.credit, it is another select credit from table
there are two select.
but if the select is reduced, it will logically be more efficient when you want to traverse a table. That’s why we use classmethod function fields with pysql. but sometimes it is more comfortable (practical) to use the orm, like l.debit - l.credit. which will not cause problems with few rows.
here I simply wanted to speed up the example:
sum = sum(l.debit - l.credit for l in move_lines) # noqa
but I don’t know if it would be good to add an annotation to modelsql before going through large data.

numaelis · April 15, 2024, 2:09pm

The cprofile code I used was this.
I used it temporarily in the invoices module, and from the client I went to invoices, I clicked on the search button

@classmethod
    def read(cls, ids, fields_names):
        result = super(Invoice, cls).read(ids, fields_names)
        
        pr = cProfile.Profile()
        pr.enable()

        MoveLine = Pool().get('account.move.line')  # noqa
        move_lines = MoveLine.search([])
       
        #suma = sum(l.debit - l.credit for l in move_lines)  # noqa
        
        suma = Decimal(0)
        for move in move_lines:
            move.select_cache(["debit","credit"],'account_move_line')
            res = move.debit - move.credit
            suma+=res
        
        pr.disable()
        out_stream = open('estadis.txt', 'w')
        ps = pstats.Stats(pr, stream=out_stream)
        ps.strip_dirs().sort_stats('cumulative').print_stats()
        return result

Using C extensions to speed up Tryton

Benchmarking Trytond

Trytond’s benchmark result

__get__ in C’s benchmark result

Conclusion

`get` in C’s benchmark result