The following code:
products = Product.search([], order=[('id', 'ASC')], limit=500)
logger.info('Before: %s', t)
products[0].cost_price
logger.info('Elapsed: %s', t)
where t is a Timer class that helps measuring elapsed time results in as much as 3.6452 seconds in a real world database.
cProfile shows the following (just measuring products[0].cost_price):
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
312791/1 0.080 0.000 5.786 5.786 field.py:328(__get__)
28001/1 0.444 0.000 5.786 5.786 modelstorage.py:1401(__getattr__)
76/1 0.009 0.000 5.783 5.783 modelsql.py:682(read)
1 0.000 0.000 5.782 5.782 function.py:84(get)
2 0.006 0.003 5.782 2.891 function.py:106(<genexpr>)
1 0.000 0.000 5.776 5.776 function.py:96(call)
501 0.001 0.000 5.773 0.012 function.py:101(<genexpr>)
500 0.001 0.000 5.772 0.012 multivalue.py:55(_multivalue_getter)
500 0.002 0.000 5.768 0.012 model.py:21(get_multivalue)
500 0.001 0.000 5.765 0.012 multivalue.py:43(get_multivalue)
500 0.001 0.000 5.704 0.011 multivalue.py:37(__values)
500 0.001 0.000 4.912 0.010 model.py:12(multivalue_records)
86035 0.508 0.000 4.182 0.000 modelstorage.py:1493(instantiate)
--- cut ---
Although we only request 500 records, 86035 objects are instantiated. This is because cost_price is a MultiValue field and fetches the information from product.cost_price
model.
However, that’s still a lots of instances and I tried making create_uid/write_uid fields use lazy loading (after all those fields are hardly ever used I guess). Here’s the output after this change:
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
270693/1 0.067 0.000 4.073 4.073 field.py:328(__get__)
28001/1 0.326 0.000 4.073 4.073 modelstorage.py:1401(__getattr__)
76/1 0.008 0.000 4.070 4.070 modelsql.py:682(read)
1 0.000 0.000 4.069 4.069 function.py:84(get)
2 0.002 0.001 4.069 2.034 function.py:106(<genexpr>)
1 0.000 0.000 4.067 4.067 function.py:96(call)
501 0.000 0.000 4.062 0.008 function.py:101(<genexpr>)
500 0.001 0.000 4.061 0.008 multivalue.py:55(_multivalue_getter)
500 0.002 0.000 4.057 0.008 model.py:21(get_multivalue)
500 0.001 0.000 4.053 0.008 multivalue.py:43(get_multivalue)
500 0.001 0.000 4.002 0.008 multivalue.py:37(__values)
500 0.001 0.000 3.446 0.007 model.py:12(multivalue_records)
43929 0.311 0.000 2.730 0.000 modelstorage.py:1493(instantiate)
--- cut ---
Only 50% of instances created, and the execution time drops to 2.5985 seconds (still a lot of time, but 30% less).
Time measured with cprofile disabled.
What do you think? Should we make create_uid/write_uid use lazy loading?
Other tricks we should use to lower the number of instances created? I’m wondering if the cache should just store id’s by default and create instances only if the field is actually accessed in any of the records of the cache.