Better edit validation and don't apply invalid edits in the latest event#6454
Better edit validation and don't apply invalid edits in the latest event#6454
Conversation
5267ab9 to
cec40bf
Compare
cec40bf to
dca2c4f
Compare
bnjbvr
left a comment
There was a problem hiding this comment.
Hmm, I'm afraid that modifying every observer of the event cache to do the manual check will lead to proliferation of very similar code. And, it makes it a bit too easy to forget about other places where such a check would be required:
- in the thread latest event (
maybe_update_thread_summarypredates theLatestEventsystem, and theLatestEventintegration with threads is NYI) - in
TimelineItemContent::from_event(which reconstructs anEventTimelineItemfor out-of-timeline events, for usage in mutiple FFI APIs)
These are the two cases I'm thinking about, from the top of my head, so there might be more.
Instead, what we'd do ideally would be to not save an invalid edit event in the first place in the event cache. This complicates things a bit, because federation implies that the edit event can be observed before the original event, so we might want to keep track of edit events not matched against their target event, and discard them if they don't validate.
With such an implementation, you wouldn't have to worry about other subsystems creating observable invalid edits, I think, because the events wouldn't be emitted, or they would be removed beforehand even.
How does that sound? If you prefer keeping an a posteriori validation, then I think the other two functions I've mentioned need a fix and tests too 🫠
Well that sounds good, and I would say that we should go even further. The whole aggregations logic which lives inside of the timeline should be moved to the event cache. Almost each of those subsystems will be interested in more than just edits and will want to have consistent handling. That being said, I don't think I'll want to tackle such a big refactoring right now, it's more important to get the fix for this merged. |
|
Alright, I handled the thread summary in 6d1c566 as well. As for In both cases we eventually check if this contains a valid edit, well at least after this PR gets merged. |
This is relevant to a fundamental design question around the event cache: should the event cache contain raw events (like now) or some kind of aggregation of events (like the timeline Unless you meant something by this sentence (?), I think it's still better to keep raw, individual, non-aggregated events in the event cache, and let the consumers reaggregate them as they wish. One API change that would be interesting to explore, would be some kind of output transformer for the event cache events, which split them into "rendered" and "aggregations" categories; each rendered item would get a list of all its aggregations at the same time. Might not be that efficient to represent, but would be definitely more ergonomic for multiple consumers (timeline and latest event, at the minimum), and this would give us a single place where to apply the edit validation logic, among other things. |
Why is that better? Do we expect different consumers would reaggregate things differently?
If I understand this correctly, sure it doesn't need to be exactly in the event cache but might be a layer on top. What shouldn't be the goal is to have multiple places where aggregation logic needs to happen. |
No, this is rather that, if we want to be able to accomodate all the use cases (first example that comes to mind being show individual events, for debugging purposes), we need to provide access to individual raw events anyways, i.e. the least common denominator. If we provided only a high-level, aggregated view of events, then we couldn't cover the whole variety of use cases we have now and can't really anticipate in all consumers of the SDK.
Definitely agree here that having multiple places where applying aggregation would be an anti goal! |
I think we already can't represent the raw view of events like the CS API does, as we're not storing the As we're already applying transformations on the raw events, i.e. we decrypt them, I'm not sure I see a massive difference between transforming events by decrypting them or by applying an edit. But alas, this is becoming a bit off-topic for this PR. |
9faa761 to
9651630
Compare
…deciding on a latest event
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #6454 +/- ##
==========================================
- Coverage 89.90% 89.87% -0.04%
==========================================
Files 378 379 +1
Lines 104559 104689 +130
Branches 104559 104689 +130
==========================================
+ Hits 94002 94087 +85
- Misses 6963 6994 +31
- Partials 3594 3608 +14 ☔ View full report in Codecov by Sentry. |
This PR fixes an issue where we would not adhere to the spec when it comes to edit validation in the latest event logic.
Since the event cache isn't the one applying edits this validation logic needs to happen in two places. The first place is the timeline aggregation and the second one the latest event module.
I created a common method which operates on the raw JSON which is now used in the two relevant places.
A review commit by commit would be the easiest.
CHANGELOG.mdfiles.