Change materialize to use the parquet encoders and df_loaders #2549
Draft
arienandalibi wants to merge 77 commits into db_v4 from db_v4_bulk_ingestion
Changes from 37 commits
Commits
b3881c1 (arienandalibi) Updating Parquet* structs to support manually passed export vids, eid…
fe45031 (arienandalibi) Allowed IDs to be passed to parquet serialization. Will allow us to p…
c6d43e1 (arienandalibi) Changed Parquet encoding to take GraphView instead of GraphStorage. L…
d4d63ea (arienandalibi) Fixed node and edge parallel iterator creation
8ddff63 (arienandalibi) Making the parquet encoders generic over the writer (now sink). We st…
2bb7b49 (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
dbfa3f1 (arienandalibi) Changed Parquet writer from ArrowWriter to generic sink for nodes, ed…
e261d28 (arienandalibi) Fixed possible ParquetDelEdge layer_id and layer_name issues by calli…
c469443 (arienandalibi) Fixed path error
d90debf (arienandalibi) Made all the encode_* functions generic over the sink. A sink factory…
a10a251 (arienandalibi) Adding Receiver side on materialize
82935de (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
33790e3 (arienandalibi) Hid new materialize behind IO feature and added a test to test the ne…
9ca997e (arienandalibi) Adding logic to ingest data using load_*_from_df functions
f11458f (arienandalibi) Fixed deadlock. It had to do with LayerMappers being shared between e…
80dfc59 (arienandalibi) Removed unused variable bindings
93382c1 (arienandalibi) Fixed deadlock caused by DictMapper deep_clone not creating a new loc…
68d1233 (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
365f777 (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
bb6730e (arienandalibi) Working on making materialize stream RecordBatches properly instead o…
321ddbc (arienandalibi) Changed std::thread::scope for a rayon::scope
8ff88fc (arienandalibi) Added a test that times the old and new materialize functions
a37485b (arienandalibi) Debugging materialize_using_recordbatches to see why it freezes/hangs…
aa9f7d9 (arienandalibi) Changed to make encoding using its own thread pool and ingestion use …
3df56dc (arienandalibi) Switched materialize test to use graph paths and have disk backed sto…
0387dc1 (arienandalibi) Improved ingestion time on the "load_*_from_df" path by avoiding resc…
f170591 (arienandalibi) Switched assert_graph_equals to be parallel instead of multi-threaded…
68a9928 (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
eef6972 (arienandalibi) Rustfmt
f856a7b (arienandalibi) Use graph_equals instead of our custom GraphSummary. Update tests to …
e31a7a4 (arienandalibi) Set up environment variables to configure database properly before ma…
6e6cb4c (arienandalibi) Added Jemalloc
b949238 (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
48c5257 (arienandalibi) Removed some unnecessary #[cfg(feature = "io")] gates. Use constants …
7310890 (arienandalibi) Added a test to time loading SF10 dumped parquet files using the df_l…
9c4646a (arienandalibi) Brought zips back in df_loaders/edges.rs for passing data such as vid…
4045c8c (arienandalibi) Removed flushing of graph before ingesting RecordBatch in df_loaders.…
8584a84 (arienandalibi) Removed unused imports, changed jemalloc to only be used on MacOS, an…
a736540 (arienandalibi) Moving df_loaders out of io feature
43345a0 (arienandalibi) Move LOAD_POOL out of "io" feature
c810389 (arienandalibi) Move ENCODE_POOL out of "io" feature
5c7e3ca (arienandalibi) Removing some #[cfg(feature = "io")] gates related to materialize_usi…
8c319df (arienandalibi) Moved folder from serialise::parquet out of serialise folder (so out …
0fe99bd (arienandalibi) Fixed feature gating behind io and progress
3a5efc7 (arienandalibi) Moved SNB SF1, SF3, SF10 tests to their own separate file
9bc6e2f (arienandalibi) Added test for a filtered graph
f9c41dd (arienandalibi) Renamed parquet folder to parquet_encoder
a80fd13 (arienandalibi) Fixed encoders to pass relevant information in NodesT, EdgesC, and Ed…
e97d323 (arienandalibi) Lower channel size
bbce32c (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
cc8f453 (arienandalibi) Fixes after merge
c8110ea (arienandalibi) Fixed test
40d0687 (arienandalibi) Fixed io feature gating
aa079cd (arienandalibi) Added layer creation before creating the temporal graph to ensure emp…
b6fb8bc (arienandalibi) Updated edges iteration in parquet encoders so that EIDs get resolved…
11ade14 (arienandalibi) Clean up after filtered sf1 test
166f3e2 (arienandalibi) No need to set the env vars for raphtory settings, they are imported …
f86c7c9 (arienandalibi) Merge branch 'db_v4' into db_v4_bulk_ingestion
ad382bf (arienandalibi) Added layer names to the parquet files to avoid filename collision wh…
78616fe (arienandalibi) Cleaned up test_materialize.rs imports
3586a90 (arienandalibi) Switched old materialize for the new one to run tests
6f72d82 (arienandalibi) Fix bug in resolve_node_and_meta_for_node_col where nodes were not be…
1a53892 (arienandalibi) Materialize edge deletions before edge c props (edge metadata) to fix…
23a6cdd (arienandalibi) Attempting to fix temporal properties not being serialized properly o…
643bd6c (arienandalibi) Got rid of layer_n in parquet filenames. They were causing problems w…
967ecbd (arienandalibi) Preserve property mappers in materialize
28ac52a (arienandalibi) Fix bugs in materialize. Switch rayon::scope for std::thread::scope t…
1e7e8f4 (arienandalibi) Remove sf3 paths in test_materialize_sf10.rs,
9909cde (arienandalibi) Remove channel for producer in materialize
a6e7fb9 (arienandalibi) Added flag to resolve nodes when materializing in load_node_props_fro…
fd838a4 (arienandalibi) First try at is_materializing flag in load_node_props_from_df
9075538 (arienandalibi) Fixed test_materialize_sf10.rs feature gating on imports
9ba9c32 (arienandalibi) Merge branch 'refs/heads/db_v4' into db_v4_bulk_ingestion
245ad95 (arienandalibi) Added t_len for NodeStorageInner
65daec9 (arienandalibi) Clean up imports a lil
ad56266 (arienandalibi) Fix normalise_temporal_map not properly defining a stable determinist…
028e000 (arienandalibi) Added edge.properties().temporal().iter_ids() and used it in the seri…