Ensure StateStore lock is acquired before updating RoomInfo#6478
Ensure StateStore lock is acquired before updating RoomInfo#6478poljar merged 16 commits intomatrix-org:mainfrom
StateStore lock is acquired before updating RoomInfo#6478Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #6478 +/- ##
==========================================
- Coverage 89.95% 89.90% -0.05%
==========================================
Files 381 381
Lines 106066 106780 +714
Branches 106066 106780 +714
==========================================
+ Hits 95412 96001 +589
- Misses 7007 7115 +108
- Partials 3647 3664 +17 ☔ View full report in Codecov by Sentry. |
66b178b to
2f89722
Compare
0db9ead to
bdc3c56
Compare
poljar
left a comment
There was a problem hiding this comment.
Ok, this PR looked way scarier than it is. Most of the changes are mechanical because of the addition of another argument, namely the lock guard.
It's a bit annoying that we need to add some more implementations of the storage trait but I don't think I have a better idea for this.
I think this looks sensible and removes one of the footguns our codebase had, I can't spot anything I would object to so approving it.
Could you just resolve the merge conflicts?
|
Will do! And glad it was not so scary :) Any thoughts, by the way, about the differences mentioned above between |
…ave_changes Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
…to room info Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
… sync Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
Signed-off-by: Michael Goldenberg <m@mgoldenberg.net>
bdc3c56 to
315fc51
Compare
Hmm, this just deletes the room completely, it might be unlikely to have this clash with a sync but I think it needs to be synchronized as well for it to behave correctly without edge-cases.
I think that should be fine, as we're only calling
Yeah, but we can make a rule to avoid the usage of the
Yes, this looks like a clear mistake. |
|
I'm going to merge this now. If you have the time, the remaining issues you raised would be neat to have fixed as well. Otherwise please open an issue so someone else takes care of them. |
Overview
Currently, updates to a
RoomInfoobject are typically persisted in two places.Roomobject viaRoom::set_room_infoStateStoreviaStateStore::save_changesIt is generally expected that the caller acquire a lock - often referred to as a state store lock - to synchronize these updates, but this is not enforced in any systematic way and is easy to forget.
The primary purpose of this pull request is to provide a better - albeit imperfect - way of ensuring all
RoomInfoupdates are synchronized, while also allowing theRoomInfoobject of aRoomto be read at any time - even during an update (see #6384).Changes
Synchronizing
StateStore::save_changesThe primary change was to add a type,
SaveLockedStateStore, that wraps an existingStateStoreand ensures that calls to the underlyingStateStore::save_changesare synchronized.This is accomplished by implementing
StateStoreforSaveLockedStateStoreand ensuring that the lock is acquired before calling the underlyingStateStore::save_changes.Additionally, the
SaveLockedStateStoreprovides a means of acquiring the underlying lock and making callsStateStore::save_changesby providing the guard for that lock. This is ensured by checking the guard provided references the sameMutexheld bySaveLockedStateStore.The benefit is that the lock may be held for an extended period where many operations may need to take place in order to build up the
StateChanges- e.g., while processing a sync response.Finally,
SaveLockedStateStorereplaces the underlying lock and store inBaseStateStoreand clones are provided to downstream structures.Synchronzing
RoomInfoupdates toRoomThe
SaveLockedStateStoreis cloned and provided to eachRoomcreated via theBaseStateStore. This allows aRoomto synchronize its actions using the underlying lock inSaveLockedStateStore.Furthermore, previous functions in
Roomfor making changes toRoomInfoare replaced with atomic versions.Room::update_room_infoensures updates to the in-memory representations of theRoomInfoare atomic andRoom::update_and_save_room_infoensures that updates to in-memory as well as persistent representations ofRoomInfoare atomic. Both have versions which allow the caller to pass a guard forSaveLockedStateStore::lock.Note that the proposal in #6384 suggested using a token issued by the state store lock to ensure updates to
RoomInfowere synchronized. Instead, this has been accomplished by checking reference equality between the guard and the underlying mutex.Considerations
While the abstractions outlined above offer some useful benefits, there are also some shortcomings to consider.
1.
SaveLockedStateStoredoes not synchronize calls to allStateStorefunctionsWhile
StateStore::save_changesis the primary function we care about, it's possible there are others that would be worth synchronizing. For instance,StateStore::remove_room, which modifiesRoom-related data.2.
SaveLockedStateStoredoes not provide atomic updatesWhile calls to
StateStore::save_changesare synchronized, read-then-edit-then-write operations may overlap. For example, overlapping calls toStateStore::get_room_infosfollowed byStateStore::save_changescould overwrite one another's modifications.3. It is easy to enter a deadlock
Because access is provided to the underlying lock in
SaveLockedStateStore, it is a bit too easy to enter a deadlock. This can happen by acquiring the lock and then making a call toStateStore::save_changesrather thanSaveLockedStateStore::save_changes_with_guard.An alternative here could be to make
StateStore::save_changessimply try to acquire the lock and return an error if it fails.Questions
Currently, while executing a sync via
SlidingSync::sync_once, a downstream function - i.e.,SlidingSync::handle_response- acquires the underlying state store lock right before processing the response and holds it until it has finished processing the response. As of this pull request, the guard for that lock is also passed to many downstream functions so that updates toRoomInfodo not cause a deadlock.matrix-rust-sdk/crates/matrix-sdk/src/sliding_sync/mod.rs
Lines 280 to 329 in 4e8ceaa
On the other hand, while executing a sync via
Client::sync_once, downstream functions - e.g.,Client::process_sync- do not hold the state store lock for the duration of the sync. They simply acquires it at very particular points during the process - e.g., inClient::receive_sync_response_with_requested_required_states.matrix-rust-sdk/crates/matrix-sdk-base/src/client.rs
Lines 756 to 766 in fd6cb66
Should executing a sync through
Client::sync_oncemimic the behavior of aSlidingSync::sync_oncein this regard? It seems to me they should, but I may be overlooking a reason thatClient::sync_onceshould not be holding a lock for the duration of the response processing logic.Closes #6384.
CHANGELOG.mdfiles.Signed-off-by: Michael Goldenberg m@mgoldenberg.net