feat(sdk): add high-level Search helpers + ffi bindings#6394
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6394 +/- ##
========================================
Coverage 89.85% 89.86%
========================================
Files 378 379 +1
Lines 104569 104893 +324
Branches 104569 104893 +324
========================================
+ Hits 93964 94264 +300
- Misses 6998 7012 +14
- Partials 3607 3617 +10 ☔ View full report in Codecov by Sentry. |
6ed213a to
5f0244b
Compare
|
This PR is in a sufficiently good state that it would allow FFI embedders to try out the search feature, build a search UI and so on. Doesn't do automatic indexation of all content, that problem is currently left as an exercise to the reader/embedder. |
poljar
left a comment
There was a problem hiding this comment.
This looks mostly good, there is some missing documentation here and there though.
I'm also wondering if this would be better modeled as a stream, which is my main concern.
| EventLoadError(#[from] matrix_sdk::Error), | ||
| } | ||
|
|
||
| /// An async iterator for a search query in a single room. |
There was a problem hiding this comment.
Hmm, a bit worried to call this an async iterator as some folks might expect this.
There was a problem hiding this comment.
I wasn't too worried about that (async iterators are nightly only and have been for a while, so I wouldn't expect them to become stable any time soon?), but I like the idea of being able to add stream combinators much better 🥳
| #[derive(thiserror::Error, Debug)] | ||
| pub enum SearchError { |
There was a problem hiding this comment.
Missing docs on the type and enum variants.
There was a problem hiding this comment.
Since both error variants are transparent, I don't think it's really worth documenting them, since the interesting documentation should rather be on the wrapped inner error type. And ditto for the overall error, not sure that adding "an error that happened during search" makes it clearer, but I'll add a tiny one though.
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
There was a problem hiding this comment.
The module is public but there's no module-level documentation.
It would be nice to have some sort of introductory example on how this should be used in the module level docs.
There was a problem hiding this comment.
Still no example? Could we please add one as it makes things for newcommers much easier.
|
|
||
| #[matrix_sdk_ffi_macros::export] | ||
| impl Room { | ||
| pub fn search(&self, query: String) -> RoomSearchIterator { |
There was a problem hiding this comment.
Missing docs.
In fact, a lot of the new types/functions here don't have docs.
There was a problem hiding this comment.
Yeah, I don't think adding docs must be a religion, when the doc only repeats the function name or states something clearly obvious 🤷
There was a problem hiding this comment.
I'll add some for your pleasure, where I think it makes the most sense.
| return Ok(None); | ||
| } | ||
|
|
||
| let result = self.room.search(&self.query, num_results, self.offset).await?; |
There was a problem hiding this comment.
This function only uses the local index, which is fine for a first prototype, but it should use the homeserver for rooms that aren't encrypted.
A TODO item to record this might make sense.
There was a problem hiding this comment.
For what it's worth: I do NOT think we should use the homeserver search for rooms that aren't encrypted, otherwise we need to make sure that the behavior of local search is exactly (and not more) than that of the homeserver search. Otherwise, it would be observable that we're using one or the other implementation, according to the search results, and the room encryption state.
This extra requirement may restrict what we're able to do (after all, we could do better search locally than what the search CS APIs ought to offer, because tantivy has fancy keywords, we could index more), and it might be hard to follow the implementation-specific details, especially as they evolve (which could be super slow as adding features to the homeserver is usually a slow process; or involve implementation-specific details, and show preference to a HS implementation rather than another).
As a result, I would strongly advocate for using a local index even in the case of unencrypted rooms, otherwise the semantics of our local search API would likely be harder to define.
There was a problem hiding this comment.
I agree with all the arguments why it would be better to use local search for both types of rooms.
Sadly none of them matter since you can't expect every user, and potentially every device, to locally index something like Matrix HQ. It's just not possible for storage, bandwidth, and CPU usage on the device as well as on the homeserver, reasons.
You could argue that you don't need to index the whole lot of Matrix HQ, but then you have incomplete search results. This would be strictly worse than a complete but less feature-full search implementation.
It's just not feasible to implement full-text search in this manner.
| /// if there are no more results. | ||
| pub async fn next( | ||
| &mut self, | ||
| num_results: usize, |
There was a problem hiding this comment.
I wonder if we should put this argument into the builder.
Then we could convert this into a proper Stream which would allow users to use many of the available adapters to post-process the results.
This would also then require a rename of the type, but I think that's fine as well as we wouldn't confuse people by calling something an iterator which doesn't implement Iterator or AsyncIterator.
There was a problem hiding this comment.
(I still need to address this, first.)
232b5bc to
89c2da8
Compare
3e7e41f to
a37fdc2
Compare
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
There was a problem hiding this comment.
Still no example? Could we please add one as it makes things for newcommers much easier.
0cdf60b to
7fbbdba
Compare
Starting to extract relevant helpers that can be merged ahead of having search spider all rooms in background.
Part of #5350.
CHANGELOG.mdfiles.