Sentry SDK integration

Rational

The existing sentry_tryton module, yet powerful and happily used in production environments, works with the deprecated raven package and some hacks.

Namely: only few and known logger.error calls should exist in the application codebase which are carefully skipped except for one in the trytond.protocols.dispatcher._dispatch function except block. There, the SentryTryton log handler will raise a different unchained exception that will ultimately be caught by the WSGI app error handling and serialized to the RPC client.

But

  1. Raising exceptions in a log handler emit method is rather unexpected: a failure in the logging system shouldn’t interrupt a service
  2. Custom code or third party dependencies could legitimately try to emit error log events in their exception control blocks. Or anywhere! And this would be beyond the foreseeable hardcoded skips of the SentryTryton log handler, which raises really unexpected exceptions in 3rd party code. This can only be mitigated by limiting the extent of that handler to only the logger of “qualname=trytond.protocols.dispatcher”. This is always something wise to do, but still hacky and poorly documented.

Proposal

The goal would be to distribute a Trytond server integration with the new sentry_sdk package just as other widespread Python frameworks do.

This will allow to write a wsgi.py script which initializes the sentry_sdk and then exposes the wsgi app

# wsgi.py

import sentry_sdk
import sentry_sdk.integrations.trytond

sentry_sdk.init(
    "https://<key>@sentry.io/<project>",
    integrations=[sentry_sdk.integrations.trytond.TrytondWSGIIntegration()]
)

from trytond.application import app as application

# ...

Which will be loaded by your favorite WSGI server. For development, a python wsgi server like gunicorn or a werkzeug script can be used to load it.

The trytond shipped binary won’t be able to start this integration unless a feature is developed so it can start arbitrary wsgi scripts. (see https://docs.djangoproject.com/en/2.2/ref/settings/#std:setting-WSGI_APPLICATION for example) But this shouldn’t be an issue since the integration is most wanted in production where the server should not be started with that binary.

Using the recently shipped error handlers feature, the sentry event id could be reported to the RPC client by registering a custom error handler with a custom message in a TrytondUserError

# wsgi.py

# ...

@application.error_handler
def _(app, request, e):
    if isinstance(e, TrytondBaseException):
        return
    else:
        event_id = sentry_sdk.last_event_id()
        data = TrytondUserError('Custom Message', f'{event_id}\n{e}')
        return app.make_response(request, data)

that would only report the event id for actually unhandled exceptions, but not for trytond user errors/warnings nor internally emitted error log events.

Custom UserApplication exception handling could also craft a sentry event id report shaped for custom APIs.

Implementation

A first approach can be seen here https://github.com/getsentry/sentry-python/pull/548

Tryton community feedback is very welcome!

Wishlist

After this, a trytond-cron, trytond-worker and tryton client integration (by means of plugins?) could come, also with a trytond-specific sentry context processor

1 Like

It will be better if we did not need to write a custom script but that WSGI wrappers could be loaded from the configuration (using qualname for example).

The problem is that the content of such message could be hardly translated.
Maybe the clients could have a dedicated error message translated for such case.

I guess the logger integration should be enough. But again it will be good to be able to load external tools by configuration.

I guess it is the only way.
Or we could have the client to log to the OS and let companies collect OS journals.

It could be better, but it also would require quite a lot of work to abstract and document the injection of aribtrary WSGI middleware and its configuration. Think about environment variables, secrets managing, system hooks… In the meanwhile, writing a custom wsgi.py for any app deployment is something very usual that provides full control on how modules are loaded. (actually, it is very convenient so as to properly configure the logging of the very trytond wsgi app, for example)

Such a script is so convenient that many frameworks actually ease into it, promoting its writing and its execution by their very development toolchains. And even if Tryton didn’t ease into it, no operator can be stopped of wrapping the trytond wsgi app with arbitrary wsgi.py scripts. I’d actually give for granted the existence of such scripts.

Indeed, it will be cumbersome to translate messages at this level. I can depict at least 2 ways

  • A pure gettext translation system with an even more complex wsgi.py script.
  • The messages are registered in a custom trytond module and a second transaction is started so as to use the Tryton translation system. But this would not be advisable if the very database connection was to be the source of the exception!

This is definitely on the road so we should think about it even if the community integration does not solve all use cases.

Precisely because it is not yet straightforward to load external tools by configuration, a workaround is provided in the pull request linked above https://github.com/getsentry/sentry-python/pull/548/files#diff-f12c2c0e27b01b231c946f1fc734cb75R45
Essentially, it injects a sentry_sdk.init call in a null logger handler that will only start the standard integrations. But again, this is hackish and it is the behavior of the deprecated raven package. And it could be a framework independent workaround, meaning that it could be shipped not in the sentry_sdk.integrations.trytond submodule. But AFAIU sentry_sdk does purposely not ship such a feature nor anything similar.

I can only have a feeling that a plugin is the way to go, but I am clueless about its design. All insights are welcome.

I do not see why. WSGI has a well defined interface. Look for example all the middleware available in werkzeug.contrib.
Indeed we already have such mechanism for the bus and the cache.

Except that in both case, you do not know the language of the request.

Then we could add error handlers to both services like we have for the WSGI application.
All error handlers could be filled by configuration (like done for bus and cache).

Well plugin was not really designed for that. But it is code that is loaded by the client on startup.
Such plugin could replace the current sys.excepthook set in client.py.

Precisely, using all that middleware in custom wsgi.py scripts gives you a lot of power. Declaratively configuring few middlewares in an app that will use them upfront is feasible (trytond does!). Creating a config system that will power the setup of arbitrary middleware feels like designing a programming language to me. That’s why I’d fallback to wsgi scripts.

I don’t want to dicourage you from taking that path. I just wanted to state that is is full of pitfalls while the wsgi.py path will always be there, will always be easier, will always be more powerful, and I find no reason to regret abusing it.

Furthermore, if the trytond binary was to load aribtrary wsgi scripts with a cli/config parameter, it would instantly gain all that power “for free” instead of designing such a config system. Otherwise gunicorn is already a solution for that, but a framework-specific toolchain is always an interesting feature.

Said this, I think this thread has reached an off-topic status. Should we open a new discussion linking what has been discussed here?

Request language could be inferred from the very request, which will be accessible. It will be in the context of a RPC client call or it could be in HTTP headers if a browser was consuming a user application. In the worst case, it will be in a per-user record in the database but potentially stored in a cache. Maybe every operator will have to settle for a different heuristic but I believe it is feasible.

Next week I’ll check out those features because I feel like I am missing your point.

That looks promising. Yet I still can’t imagine how each server would tell sentry-powered clients the DSN where to report events. Should it be a per-client configuration instead of per-server? How is a plugin distributed and configured if not manually? How a server tracks which clients have what plugins? I must admit this feature was never in my short-term plans.

Not at all, it can be done in few lines of code:

for middleware in config.get('wsgi', 'middleware'):
    Middleware = resolve(middleware)
    app.wsgi = Middleware(app.wsgi)

Re-parsing the request at this place seems bad and duplicate code.

They will and should not. The client error are not specific to any Tryton server but only to the specific client.

It is the only possible option.

I do not know. It depends on the infrastructure of each one.

It does not. The server must be agnostic about the client.

Indeed I already implemented it: Issue 8807: Load wsgi middleware from configuration - Tryton issue tracker
I still need to write the documentation.