Skip to content

fix(ffi): manually drop room if it's last in Arc#6545

Open
articice wants to merge 1 commit intomatrix-org:mainfrom
articice:manually-drop-room
Open

fix(ffi): manually drop room if it's last in Arc#6545
articice wants to merge 1 commit intomatrix-org:mainfrom
articice:manually-drop-room

Conversation

@articice
Copy link
Copy Markdown

@articice articice commented May 7, 2026

The FFI Client.inner is already wrapped in AsyncRuntimeDropped because dropping a sqlite-backed MatrixClient outside a tokio runtime causes SyncWrapper<rusqlite::Connection>::drop to call tokio::task::spawn_blocking. Without runtime context that panics, and a panic inside Drop aborts the process (SIGABRT).

Room.inner: SdkRoom was not wrapped. An SdkRoom holds an Arc<ClientInner>. When the FFI Client had already been released and React Native's Hermes GC finalized the last JS reference to a Room on the JS thread (non-tokio), Room::dropSdkRoom::drop → last Arc<ClientInner>::dropsqlite drop → spawn_blocking panic → SIGABRT.

This completes the pattern from #4168, with #2815 / #2821 being reference fixes for deadpool/sqlite drop-context root cause.

TL;DR

Element X (iOS/Android) and other Tauri-style consumers don't hit this because they keep a single long-lived Client for the app/session lifetime, so Room drops never release the last Arc. React Native + Hermes hits it because the JS Client can be released while old Room JS references await GC on a non-tokio thread; when Hermes finalizes them, the drop chain reaches sqlite::Connection::drop → spawn_blocking outside any runtime → SIGABRT.

The wrapper fixes that one corner case without affecting any other consumer.

Timeline FFI type is the next strongest candidate, will follow-up.

History of the Drop-on-non-tokio-thread problem

2023 — original problem and first hack
  • PR fix(ffi): don't leak Client instances #2815 fix(ffi): don't leak Client instances — Benjamin Bouvier, Nov 3 2023 (commit ad5761b)
  • First documented FFI Drop crash. Quote: "the Client holds references to deadpool's pools, which require to be in a tokio Runtime because their Drop impl will use block_on. Now, we can't just drop the stores without breaking all the abstractions."
  • Fix: hack Drop for Client that did RUNTIME.block_on(async move { self.inner = MatrixClient::builder()...build().await.unwrap() }) — i.e., replace the inner client with a dummy in-memory one inside a tokio context, so the original gets dropped under the runtime.
  • PR ffi: Simplify Drop implementation for Client #2821 ffi: Simplify Drop implementation for Client — Jonas Platte, Nov 6 2023 (commit e788a6b)
  • Replaced Bouvier's swap-with-dummy hack with ManuallyDrop plus an explicit Drop impl that does let _guard = RUNTIME.enter(); ManuallyDrop::drop(&mut self.inner);.
  • Comment changes from "deadpool's block_on" to "deadpool drops sqlite connections in the DB pool on tokio's blocking threadpool to avoid blocking async worker threads" — i.e., it's actually spawn_blocking (not block_on) that fails without runtime context.
  • This is the canonical pattern still in use today.
2024 — pattern proliferation, then helper
  • PR chore(room_preview): add RoomListItem::preview_room #4152 chore(room_preview): add RoomListItem::preview_room — Oct 25 2024 (commit 40f4fc1)
  • Same ManuallyDrop recipe copy-pasted into RoomList and RoomPreview because each holds its own Arc.
  • PR chore(ffi): introduce AsyncRuntimeDropped helper #4168 chore(ffi): introduce AsyncRuntimeDropped helper — Benjamin Bouvier, Oct 25 2024 (commit c08194a)
  • Quote: "avoids proliferation of ManuallyDrop in the code base, by having a single type that's used for dropping under an async runtime."
  • Three call sites converted: Client.inner, RoomList.inner, RoomPreview.inner.
  • Room.inner was NOT converted — it remained a bare SdkRoom. (This is the gap our fix closes.)
2025 — wasm extension
2026 — this PR
  • manually-drop-room branch (commit b78d14a, today)
  • Wraps Room.inner in AsyncRuntimeDropped, completing the pattern that chore(ffi): introduce AsyncRuntimeDropped helper #4168 left half-finished.
  • Adds regression tests for both Client::drop and Room::drop on non-tokio threads to prevent any of these wrappers from being silently removed in future refactors.

Why upstream didn't catch it

  • Element X (iOS/Android) drops the FFI Client first; Room objects either die with it under the same tokio runtime or are dropped from a tokio task. The bug only surfaces when the FFI Client outlives no rooms or when a Room outlives the Client and is then dropped from a non-tokio thread.
  • React Native's Hermes runs JS on a dedicated non-tokio thread. If a JS room reference is GC'd after its parent client has already been released — and the FFI Room is the last Arc holder — the destructor runs on Hermes'
    thread. SdkRoom → Arc last-strong-drop → SyncWrapperrusqlite::Connection::drop → tokio::task::spawn_blocking from a non-tokio thread → panic in Drop → SIGABRT.
  • This is the exact scenario Client.inner's wrapper has prevented since 2023; Room.inner only needed the same treatment.

  • I've documented the public API Changes in the appropriate CHANGELOG.md files.
  • This PR was made with the help of AI.

Signed-off-by:

@articice articice requested a review from a team as a code owner May 7, 2026 08:25
@articice articice requested review from poljar and removed request for a team May 7, 2026 08:25
@articice articice force-pushed the manually-drop-room branch 3 times, most recently from 352b7d9 to 66c5f80 Compare May 7, 2026 08:34
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 7, 2026

Merging this PR will degrade performance by 49.4%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 1 regressed benchmark
✅ 49 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation Restore session [memory store] 113.9 ms 225.1 ms -49.4%

Comparing articice:manually-drop-room (b78d14a) with main (777ce05)

Open in CodSpeed

@articice articice force-pushed the manually-drop-room branch from 66c5f80 to b78d14a Compare May 7, 2026 12:09
@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.90%. Comparing base (99cfce8) to head (b78d14a).
⚠️ Report is 19 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6545      +/-   ##
==========================================
- Coverage   89.93%   89.90%   -0.04%     
==========================================
  Files         381      381              
  Lines      106780   106780              
  Branches   106780   106780              
==========================================
- Hits        96036    96002      -34     
- Misses       7096     7116      +20     
- Partials     3648     3662      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant