Skip to content

Conversation

@MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Sep 23, 2025

Fix no active span when trying to log tracing error on startup.

Example error:

synapse.logging.opentracing - 427 - ERROR - wake_destinations_needing_catchup-0 - There was no active span when trying to log. Did you forget to start one or did a context slip?
Stack (most recent call last):
  File "/usr/lib/python3.13/threading.py", line 1014, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.13/threading.py", line 1043, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.13/threading.py", line 994, in run
    self._target(*self._args, **self._kwargs)
  File "python3.13/site-packages/twisted/_threads/_threadworker.py", line 75, in work
    task()
  File "python3.13/site-packages/twisted/_threads/_team.py", line 192, in doWork
    task()
  File "python3.13/site-packages/twisted/python/threadpool.py", line 269, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "python3.13/site-packages/twisted/python/threadpool.py", line 285, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "python3.13/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "python3.13/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "python3.13/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "synapse/synapse/storage/database.py", line 1094, in inner_func
    return func(db_conn, *args, **kwargs)
  File "synapse/synapse/storage/database.py", line 822, in new_transaction
    opentracing.log_kv({"message": "commit"})
  File "synapse/synapse/logging/opentracing.py", line 427, in ensure_active_span_inner_2
    logger.error(

Why did this happen before?

This previously occurred because we called init_tracer(...) after the reactor started up in _base.start(). But we actually attempt some database transactions earlier than that which try to do some tracing because of that oidc = hs.get_oidc_handler() line.

Notice oidc = hs.get_oidc_handler() happened before _base.start(hs):

async def start() -> None:
# Load the OIDC provider metadatas, if OIDC is enabled.
if hs.config.oidc.oidc_enabled:
oidc = hs.get_oidc_handler()
# Loading the provider metadata also ensures the provider config is valid.
await oidc.load_metadata()
await _base.start(hs)
hs.get_datastores().main.db_pool.updates.start_doing_background_updates()
register_start(hs, start)

With this PR, I've updated things to init_tracer(...) earlier on alongside where we setup_logging(...).

Dev notes

OpenTracing

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

Example:
```log
synapse.logging.opentracing - 427 - ERROR - wake_destinations_needing_catchup-0 - There was no active span when trying to log. Did you forget to start one or did a context slip?
Stack (most recent call last):
  File "/usr/lib/python3.13/threading.py", line 1014, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.13/threading.py", line 1043, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.13/threading.py", line 994, in run
    self._target(*self._args, **self._kwargs)
  File "python3.13/site-packages/twisted/_threads/_threadworker.py", line 75, in work
    task()
  File "python3.13/site-packages/twisted/_threads/_team.py", line 192, in doWork
    task()
  File "python3.13/site-packages/twisted/python/threadpool.py", line 269, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "python3.13/site-packages/twisted/python/threadpool.py", line 285, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "python3.13/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "python3.13/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "python3.13/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "synapse/synapse/storage/database.py", line 1094, in inner_func
    return func(db_conn, *args, **kwargs)
  File "synapse/synapse/storage/database.py", line 822, in new_transaction
    opentracing.log_kv({"message": "commit"})
  File "synapse/synapse/logging/opentracing.py", line 427, in ensure_active_span_inner_2
    logger.error(
```
Comment on lines +363 to +364
# Start the tracer
init_tracer(hs) # noqa
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We setup tracing next to setup_logging which I think makes sense (early on)

@MadLittleMods MadLittleMods marked this pull request as ready for review September 24, 2025 00:35
@MadLittleMods MadLittleMods requested a review from a team as a code owner September 24, 2025 00:35
@MadLittleMods MadLittleMods merged commit 25fa555 into develop Sep 24, 2025
76 of 78 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/fix-no-active-span-on-error-startup branch September 24, 2025 15:12
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @reivilibre 🐀

@MadLittleMods MadLittleMods added the A-Tracing OpenTracing (maybe OpenTelemetry in the future) label Oct 2, 2025
Michael-Ixo pushed a commit to ixoworld/synapse that referenced this pull request Oct 23, 2025
Deployments that make use of the
[synapse-s3-storage-provider](https://github.com/matrix-org/synapse-s3-storage-provider)
module must upgrade to
[v1.6.0](https://github.com/matrix-org/synapse-s3-storage-provider/releases/tag/v1.6.0).
Using older versions of the module with this release of Synapse will prevent
users from being able to upload or download media.

No significant changes since 1.140.0rc1.

- Add [a new Media Query by ID Admin API](https://element-hq.github.io/synapse/v1.140/admin_api/media_admin_api.html#query-a-piece-of-media-by-id) that allows server admins to query and investigate the metadata of local or cached remote media via
  the `origin/media_id` identifier found in a [Matrix Content URI](https://spec.matrix.org/v1.14/client-server-api/#matrix-content-mxc-uris). ([\element-hq#18911](element-hq#18911))
- Add [a new Fetch Event Admin API](https://element-hq.github.io/synapse/v1.140/admin_api/fetch_event.html) to fetch an event by ID. ([\element-hq#18963](element-hq#18963))
- Update [MSC4284: Policy Servers](matrix-org/matrix-spec-proposals#4284) implementation to support signatures when available. ([\element-hq#18934](element-hq#18934))
- Add experimental implementation of the `GET /_matrix/client/v1/rtc/transports` endpoint for the latest draft of [MSC4143: MatrixRTC](matrix-org/matrix-spec-proposals#4143). ([\element-hq#18967](element-hq#18967))
- Expose a `defer_to_threadpool` function in the Synapse Module API that allows modules to run a function on a separate thread in a custom threadpool. ([\element-hq#19032](element-hq#19032))

- Fix room upgrade `room_config` argument and documentation for `user_may_create_room` spam-checker callback. ([\element-hq#18721](element-hq#18721))
- Compute a user's last seen timestamp from their devices' last seen timestamps instead of IPs, because the latter are automatically cleared according to `user_ips_max_age`. ([\element-hq#18948](element-hq#18948))
- Fix bug where ephemeral events were not filtered by room ID. Contributed by @frastefanini. ([\element-hq#19002](element-hq#19002))
- Update Synapse main process version string to include git info. ([\element-hq#19011](element-hq#19011))

- Explain how `Deferred` callbacks interact with logcontexts. ([\element-hq#18914](element-hq#18914))
- Fix documentation for `rc_room_creation` and `rc_reports` to clarify that a `per_user` rate limit is not supported. ([\element-hq#18998](element-hq#18998))

- Remove deprecated `LoggingContext.set_current_context`/`LoggingContext.current_context` methods which already have equivalent bare methods in `synapse.logging.context`. ([\element-hq#18989](element-hq#18989))
- Drop support for unstable field names from the long-accepted [MSC2732](matrix-org/matrix-spec-proposals#2732) (Olm fallback keys) proposal. ([\element-hq#18996](element-hq#18996))

- Cleanly shutdown `SynapseHomeServer` object, allowing artifacts of embedded small hosts to be properly garbage collected. ([\element-hq#18828](element-hq#18828))
- Update OEmbed providers to use 'X' instead of 'Twitter' in URL previews, following a rebrand. Contributed by @HammyHavoc. ([\element-hq#18767](element-hq#18767))
- Fix `server_name` in logging context for multiple Synapse instances in one process. ([\element-hq#18868](element-hq#18868))
- Wrap the Rust HTTP client with `make_deferred_yieldable` so it follows Synapse logcontext rules. ([\element-hq#18903](element-hq#18903))
- Fix the GitHub Actions workflow that moves issues labeled "X-Needs-Info" to the "Needs info" column on the team's internal triage board. ([\element-hq#18913](element-hq#18913))
- Disconnect background process work from request trace. ([\element-hq#18932](element-hq#18932))
- Reduce overall number of calls to `_get_e2e_cross_signing_signatures_for_devices` by increasing the batch size of devices the query is called with, reducing DB load. ([\element-hq#18939](element-hq#18939))
- Update error code used when an appservice tries to masquerade as an unknown device using [MSC4326](matrix-org/matrix-spec-proposals#4326). Contributed by @tulir @ Beeper. ([\element-hq#18947](element-hq#18947))
- Fix `no active span when trying to log` tracing error on startup (when OpenTracing is enabled). ([\element-hq#18959](element-hq#18959))
- Fix `run_coroutine_in_background(...)` incorrectly handling logcontext. ([\element-hq#18964](element-hq#18964))
- Add debug logs wherever we change current logcontext. ([\element-hq#18966](element-hq#18966))
- Update dockerfile metadata to fix broken link; point to documentation website. ([\element-hq#18971](element-hq#18971))
- Note that the code is additionally licensed under the [Element Commercial license](https://github.com/element-hq/synapse/blob/develop/LICENSE-COMMERCIAL) in SPDX expression field configs. ([\element-hq#18973](element-hq#18973))
- Fix logcontext handling in `timeout_deferred` tests. ([\element-hq#18974](element-hq#18974))
- Remove internal `ReplicationUploadKeysForUserRestServlet` as a follow-up to the work in element-hq#18581 that moved device changes off the main process. ([\element-hq#18988](element-hq#18988))
- Switch task scheduler from raw logcontext manipulation to using the dedicated logcontext utils. ([\element-hq#18990](element-hq#18990))
- Remove `MockClock()` in tests. ([\element-hq#18992](element-hq#18992))
- Switch back to our own custom `LogContextScopeManager` instead of OpenTracing's `ContextVarsScopeManager` which was causing problems when using the experimental `SYNAPSE_ASYNC_IO_REACTOR` option with tracing enabled. ([\element-hq#19007](element-hq#19007))
- Remove `version_string` argument from `HomeServer` since it's always the same. ([\element-hq#19012](element-hq#19012))
- Remove duplicate call to `hs.start_background_tasks()` introduced from a bad merge. ([\element-hq#19013](element-hq#19013))
- Split homeserver creation (`create_homeserver`) and setup (`setup`). ([\element-hq#19015](element-hq#19015))
- Swap near-end-of-life `macos-13` GitHub Actions runner for the `macos-15-intel` variant. ([\element-hq#19025](element-hq#19025))
- Introduce `RootConfig.validate_config()` which can be subclassed in `HomeServerConfig` to do cross-config class validation. ([\element-hq#19027](element-hq#19027))
- Allow any command of the `release.py` script to accept a `--gh-token` argument. ([\element-hq#19035](element-hq#19035))

* Bump Swatinem/rust-cache from 2.8.0 to 2.8.1. ([\element-hq#18949](element-hq#18949))
* Bump actions/cache from 4.2.4 to 4.3.0. ([\element-hq#18983](element-hq#18983))
* Bump anyhow from 1.0.99 to 1.0.100. ([\element-hq#18950](element-hq#18950))
* Bump authlib from 1.6.3 to 1.6.4. ([\element-hq#18957](element-hq#18957))
* Bump authlib from 1.6.4 to 1.6.5. ([\element-hq#19019](element-hq#19019))
* Bump bcrypt from 4.3.0 to 5.0.0. ([\element-hq#18984](element-hq#18984))
* Bump docker/login-action from 3.5.0 to 3.6.0. ([\element-hq#18978](element-hq#18978))
* Bump lxml from 6.0.0 to 6.0.2. ([\element-hq#18979](element-hq#18979))
* Bump phonenumbers from 9.0.13 to 9.0.14. ([\element-hq#18954](element-hq#18954))
* Bump phonenumbers from 9.0.14 to 9.0.15. ([\element-hq#18991](element-hq#18991))
* Bump prometheus-client from 0.22.1 to 0.23.1. ([\element-hq#19016](element-hq#19016))
* Bump pydantic from 2.11.9 to 2.11.10. ([\element-hq#19017](element-hq#19017))
* Bump pygithub from 2.7.0 to 2.8.1. ([\element-hq#18952](element-hq#18952))
* Bump regex from 1.11.2 to 1.11.3. ([\element-hq#18981](element-hq#18981))
* Bump serde from 1.0.224 to 1.0.226. ([\element-hq#18953](element-hq#18953))
* Bump serde from 1.0.226 to 1.0.228. ([\element-hq#18982](element-hq#18982))
* Bump setuptools-rust from 1.11.1 to 1.12.0. ([\element-hq#18980](element-hq#18980))
* Bump twine from 6.1.0 to 6.2.0. ([\element-hq#18985](element-hq#18985))
* Bump types-pyyaml from 6.0.12.20250809 to 6.0.12.20250915. ([\element-hq#19018](element-hq#19018))
* Bump types-requests from 2.32.4.20250809 to 2.32.4.20250913. ([\element-hq#18951](element-hq#18951))
* Bump typing-extensions from 4.14.1 to 4.15.0. ([\element-hq#18956](element-hq#18956))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Tracing OpenTracing (maybe OpenTelemetry in the future)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants