Conversation
|
@HarelM I wish I could help but I believe this is too advanced for me |
|
Using a reserved string to handle nested properties via regular JSON would likely be fine if we stuck with MVT. If the existing parser from |
|
I tried removing all the references to vt-pbf in order to see the effect on bundle size, it is negligible, the code there is simply using pbf so there's very little code there, even less than what I added here. |
|
When you ran your benchmarks on updateData how did you wait for the pending transmission of the pbf back to the main thread? Did you use the waitForCompletion option? |
|
I used the existing benchmarks, which use map idle event as far as I remember. |
|
That should do it. It's surprising to me that at 3X you aren't seeing any gain |
|
I think it's not a bottleneck, it might be saving a few milliseconds and not noticeable overall, I guess, not sure... |
|
Maybe give it a try with 1,000 versus 100,000 points? |
Just a heads up — the existing benchmarks use features that have no |
|
True, although the serialization of geometries is also a part of this change, which is also interesting. |
|
Seems to me that it'd be significantly less code & more performant to use the copy of the GeoJSON that's already on the main thread rather than encoding, transferring, and decoding it. The only trick is finding a stable ID to associate the vector tile features back to the GeoJSON features (if one isn't provided). That's a rather easy problem to solve, though. |
|
As far as I understand, which might not be a lot, the main problem here, is query render features - you need to know which geojson geometry is in the relevant tile. |
I'm fairly certain this can work. |
|
I might be lost here but I thought we were sending back a sliced tile with extremely simplified geometry at low zoom levels. Additionally, the advantage is that we are incrementally clipping and don't have to re-process feature simplification and re-clip untouched geometry unless the specific tile is affected by a feature - but then we are seeking bottom up versus top down here. (deleted) |
|
@wayofthefuture I think you're overestimating the scope of this change. The only consumers of this serialized JSON format are |
|
Yes you're right I've been reading the code and am starting to see what you guys are talking about. The vast majority of my time has been spent in the geojson worker. |
|
I wish we could put a separate loadTile method in geojson_worker_source so it wasn't so closely tied-in to vector_tile_worker_source... Or even better, remove |
|
I don't have any plans to work on this issue. Free free to Ping me on the OSM Slack if you have any questions about what I'm suggesting. |
|
Would a possible solution be to save the generated vector tile in the worker and send the query to the worker? Then the processing for the tile is only done once and the worker can respond with the features? |
|
I've decided to use the json serialization "hack" approach. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6992 +/- ##
=======================================
Coverage 92.66% 92.67%
=======================================
Files 289 289
Lines 24061 24080 +19
Branches 5094 5101 +7
=======================================
+ Hits 22297 22317 +20
+ Misses 1764 1763 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/util/vectortile_to_geojson.ts
Outdated
| this._z = z; | ||
|
|
||
| this.properties = vectorTileFeature.properties; | ||
| this.properties = Object.fromEntries(Object.entries(vectorTileFeature.properties).map(e => [e[0], e[1]?.toString().startsWith(JSON_PREFIX) ? JSON.parse(e[1]?.toString().replace(JSON_PREFIX, '')) : e[1]])); |
There was a problem hiding this comment.
Not sure how this will affect performance, but here is an alternative if you prefer:
- No allocations when no prefixed values.
- Avoids
Object.entries()array creation. - Avoids
.map()intermediate array creation. - Avoids
Object.fromEntries()work entirely. - Skips calls on non-strings.
toString() - Computes prefix length once.
constructor(vectorTileFeature: VectorTileFeatureLike, z: number, x: number, y: number, id: string | number | undefined) {
this.type = 'Feature';
this._vectorTileFeature = vectorTileFeature;
this._x = x;
this._y = y;
this._z = z;
this.properties = this.getJSONProperties(vectorTileFeature.properties);
this.id = id;
}
private getJSONProperties(properties: Record<string, any>): Record<string, any> {
// Fast path: if no value is a JSON-prefixed string, avoid allocations entirely.
for (const key in properties) {
const value = properties[key];
if (typeof value !== 'string') continue;
if (!value.startsWith(JSON_PREFIX)) continue;
// Slower path: decode all JSON-prefixed strings.
return this.decodeJSONPrefixedProperties(properties);
}
return properties;
}
private decodeJSONPrefixedProperties(properties: Record<string, any>): Record<string, any> {
const decoded: Record<string, any> = {};
for (const key in properties) {
const value = properties[key];
if (typeof value !== 'string' || !value.startsWith(JSON_PREFIX)) {
decoded[key] = value;
continue;
}
decoded[key] = JSON.parse(value.slice(JSON_PREFIX.length));
}
return decoded;
}There was a problem hiding this comment.
It's worth remembering this is only applied to returned features when running query rendered features.
And usually features don't have a lot of properties.
But I was concerned about performance too.
Also you can't break as there might be more than one property with nested objects.
But I'll write something a bit more performant.
There was a problem hiding this comment.
I'm pretty sure it doesn't break... it's an early return in detection versus decoding. If it continues all the way through it just returns the original object - mimicking the previous implementation. I believe the performance key is the avoidance of .toString and .parse
There was a problem hiding this comment.
I'll update this, give me a sec...
There was a problem hiding this comment.
Done in a778796.
In theory, this means that if the same feature is used multiple times the json parse will only happen once per JSON property, it will be done lazily only when converting the feature to geojson.
No extra allocation, minimal time spent for features without this.
There was a problem hiding this comment.
Much better than my version!
|
|
||
| this.properties = Object.fromEntries(Object.entries(vectorTileFeature.properties).map(e => [e[0], e[1]?.toString().startsWith(JSON_PREFIX) ? JSON.parse(e[1]?.toString().replace(JSON_PREFIX, '')) : e[1]])); | ||
| for (const key in vectorTileFeature.properties) { | ||
| if (typeof vectorTileFeature.properties[key] !== 'string' || !vectorTileFeature.properties[key].startsWith(JSON_PREFIX)) { |
There was a problem hiding this comment.
blows my version away! consider const value = vectorTileFeature.properties[key]... I'm not sure though I guess this is a style preference or something i don't know
There was a problem hiding this comment.
I wanted to avoid allocation as much as possible...
This adds a custom binary serializer to facilitate for geojson nested properties.
I've looked into
cbor,masgpkr,bsonand other binary serializer for json objects and they increase the bundle size too much.geobufuses geojson, but the worker data, even though its called geojson worker is holding a tile like data, which in not in geojson format, mainly because it's in tile space coordinates and not wgs84.Another approach I thought about is adding a flag to the vt-pbf serialization in order to prefix with a special string the
JSON.stringifythat happens there so that later on we can simply callJSON.parseand return that, this will reduce the need for a custom serialization code and should not impact performance much.I've benchmarked the serializer against the existing one, and it's about 3x faster for the test I ran, but I don't think it changes the performance of geojson much from the benchmark test that is already there.
@mwilsnd @lucaswoj @wayofthefuture
I would appreciate your thoughts here as I'm a bit stuck in terms of which option to use.
Tests failure: while initially I thought the problem is with the serializer that causes the tests to fail, when I looked into it, I saw that the problem is that the serializer works "too good" in the sense that it serialized and desrialized to the same object, and for
MultiPointthis was not the case forfromVectorTileJsmethod, as it changed the input so that when deserializing it in the main thread it changed multi point points array - from an array of two points to two array of one point, this is the reason for the query test failure.Launch Checklist
CHANGELOG.mdunder the## mainsection.