Draft
76 commits
b3881c1
Updating Parquet* structs to support manually passed export vids, eid…
arienandalibi Mar 12, 2026
fe45031
Allowed IDs to be passed to parquet serialization. Will allow us to p…
arienandalibi Mar 13, 2026
c6d43e1
Changed Parquet encoding to take GraphView instead of GraphStorage. L…
arienandalibi Mar 17, 2026
d4d63ea
Fixed node and edge parallel iterator creation
arienandalibi Mar 18, 2026
8ddff63
Making the parquet encoders generic over the writer (now sink). We st…
arienandalibi Mar 18, 2026
2bb7b49
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Mar 19, 2026
dbfa3f1
Changed Parquet writer from ArrowWriter to generic sink for nodes, ed…
arienandalibi Mar 19, 2026
e261d28
Fixed possible ParquetDelEdge layer_id and layer_name issues by calli…
arienandalibi Mar 19, 2026
c469443
Fixed path error
arienandalibi Mar 20, 2026
d90debf
Made all the encode_* functions generic over the sink. A sink factory…
arienandalibi Mar 23, 2026
a10a251
Adding Receiver side on materialize
arienandalibi Mar 25, 2026
82935de
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Mar 25, 2026
33790e3
Hid new materialize behind IO feature and added a test to test the ne…
arienandalibi Mar 26, 2026
9ca997e
Adding logic to ingest data using load_*_from_df functions
arienandalibi Mar 26, 2026
f11458f
Fixed deadlock. It had to do with LayerMappers being shared between e…
arienandalibi Mar 26, 2026
80dfc59
Removed unused variable bindings
arienandalibi Mar 26, 2026
93382c1
Fixed deadlock caused by DictMapper deep_clone not creating a new loc…
arienandalibi Mar 27, 2026
68d1233
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Mar 28, 2026
365f777
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Mar 30, 2026
bb6730e
Working on making materialize stream RecordBatches properly instead o…
arienandalibi Mar 30, 2026
321ddbc
Changed std::thread::scope for a rayon::scope
arienandalibi Mar 31, 2026
8ff88fc
Added a test that times the old and new materialize functions
arienandalibi Mar 31, 2026
a37485b
Debugging materialize_using_recordbatches to see why it freezes/hangs…
arienandalibi Apr 1, 2026
aa9f7d9
Changed to make encoding using its own thread pool and ingestion use …
arienandalibi Apr 2, 2026
3df56dc
Switched materialize test to use graph paths and have disk backed sto…
arienandalibi Apr 2, 2026
0387dc1
Improved ingestion time on the "load_*_from_df" path by avoiding resc…
arienandalibi Apr 7, 2026
f170591
Switched assert_graph_equals to be parallel instead of multi-threaded…
arienandalibi Apr 8, 2026
68a9928
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Apr 8, 2026
eef6972
Rustfmt
arienandalibi Apr 8, 2026
f856a7b
Use graph_equals instead of our custom GraphSummary. Update tests to …
arienandalibi Apr 8, 2026
e31a7a4
Set up environment variables to configure database properly before ma…
arienandalibi Apr 9, 2026
6e6cb4c
Added Jemalloc
arienandalibi Apr 10, 2026
b949238
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Apr 10, 2026
48c5257
Removed some unnecessary #[cfg(feature = "io")] gates. Use constants …
arienandalibi Apr 10, 2026
7310890
Added a test to time loading SF10 dumped parquet files using the df_l…
arienandalibi Apr 14, 2026
9c4646a
Brought zips back in df_loaders/edges.rs for passing data such as vid…
arienandalibi Apr 14, 2026
4045c8c
Removed flushing of graph before ingesting RecordBatch in df_loaders.…
arienandalibi Apr 14, 2026
8584a84
Removed unused imports, changed jemalloc to only be used on MacOS, an…
arienandalibi Apr 15, 2026
a736540
Moving df_loaders out of io feature
arienandalibi Apr 15, 2026
43345a0
Move LOAD_POOL out of "io" feature
arienandalibi Apr 15, 2026
c810389
Move ENCODE_POOL out of "io" feature
arienandalibi Apr 15, 2026
5c7e3ca
Removing some #[cfg(feature = "io")] gates related to materialize_usi…
arienandalibi Apr 15, 2026
8c319df
Moved folder from serialise::parquet out of serialise folder (so out …
arienandalibi Apr 16, 2026
0fe99bd
Fixed feature gating behind io and progress
arienandalibi Apr 16, 2026
3a5efc7
Moved SNB SF1, SF3, SF10 tests to their own separate file
arienandalibi Apr 16, 2026
9bc6e2f
Added test for a filtered graph
arienandalibi Apr 16, 2026
f9c41dd
Renamed parquet folder to parquet_encoder
arienandalibi Apr 17, 2026
a80fd13
Fixed encoders to pass relevant information in NodesT, EdgesC, and Ed…
arienandalibi Apr 20, 2026
e97d323
Lower channel size
arienandalibi Apr 20, 2026
bbce32c
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Apr 20, 2026
cc8f453
Fixes after merge
arienandalibi Apr 20, 2026
c8110ea
Fixed test
arienandalibi Apr 20, 2026
40d0687
Fixed io feature gating
arienandalibi Apr 21, 2026
aa079cd
Added layer creation before creating the temporal graph to ensure emp…
arienandalibi Apr 21, 2026
b6fb8bc
Updated edges iteration in parquet encoders so that EIDs get resolved…
arienandalibi Apr 21, 2026
11ade14
Clean up after filtered sf1 test
arienandalibi Apr 21, 2026
166f3e2
No need to set the env vars for raphtory settings, they are imported …
arienandalibi Apr 22, 2026
f86c7c9
Merge branch 'db_v4' into db_v4_bulk_ingestion
arienandalibi Apr 22, 2026
ad382bf
Added layer names to the parquet files to avoid filename collision wh…
arienandalibi Apr 22, 2026
78616fe
Cleaned up test_materialize.rs imports
arienandalibi Apr 22, 2026
3586a90
Switched old materialize for the new one to run tests
arienandalibi Apr 22, 2026
6f72d82
Fix bug in resolve_node_and_meta_for_node_col where nodes were not be…
arienandalibi Apr 23, 2026
1a53892
Materialize edge deletions before edge c props (edge metadata) to fix…
arienandalibi Apr 23, 2026
23a6cdd
Attempting to fix temporal properties not being serialized properly o…
arienandalibi Apr 23, 2026
643bd6c
Got rid of layer_n in parquet filenames. They were causing problems w…
arienandalibi Apr 23, 2026
967ecbd
Preserve property mappers in materialize
arienandalibi Apr 23, 2026
28ac52a
Fix bugs in materialize. Switch rayon::scope for std::thread::scope t…
arienandalibi Apr 24, 2026
1e7e8f4
Remove sf3 paths in test_materialize_sf10.rs,
arienandalibi Apr 24, 2026
9909cde
Remove channel for producer in materialize
arienandalibi Apr 24, 2026
a6e7fb9
Added flag to resolve nodes when materializing in load_node_props_fro…
arienandalibi Apr 24, 2026
fd838a4
First try at is_materializing flag in load_node_props_from_df
arienandalibi Apr 24, 2026
9075538
Fixed test_materialize_sf10.rs feature gating on imports
arienandalibi Apr 24, 2026
9ba9c32
Merge branch 'refs/heads/db_v4' into db_v4_bulk_ingestion
arienandalibi Apr 27, 2026
245ad95
Added t_len for NodeStorageInner
arienandalibi Apr 27, 2026
65daec9
Clean up imports a lil
arienandalibi Apr 27, 2026
ad56266
Fix normalise_temporal_map not properly defining a stable determinist…
arienandalibi Apr 28, 2026
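
Several of the commits above describe encoders that are generic over a sink, batches streamed over a channel, and a `std::thread::scope`. A minimal sketch of that shape follows, under stated assumptions: `BatchSink`, `ChannelSink`, `encode_ids` and `stream_and_ingest` are hypothetical names, `Vec<u64>` stands in for an Arrow `RecordBatch`, and `std::sync::mpsc::sync_channel` stands in for `crossbeam-channel` so the example needs no external crates. It is not the PR's implementation.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread;

/// Hypothetical sink abstraction: anything that can accept an encoded batch.
pub trait BatchSink {
    fn send_batch(&mut self, batch: Vec<u64>) -> Result<(), String>;
}

/// Hypothetical sink that forwards batches over a bounded channel.
pub struct ChannelSink {
    tx: SyncSender<Vec<u64>>,
}

impl BatchSink for ChannelSink {
    fn send_batch(&mut self, batch: Vec<u64>) -> Result<(), String> {
        // `send` blocks while the channel is full, which gives back-pressure.
        self.tx.send(batch).map_err(|e| e.to_string())
    }
}

/// Encoder that is generic over the sink, so the same code could write to
/// a file, a buffer, or a channel.
pub fn encode_ids<S: BatchSink>(
    ids: &[u64],
    batch_size: usize,
    sink: &mut S,
) -> Result<(), String> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    for chunk in ids.chunks(batch_size) {
        sink.send_batch(chunk.to_vec())?;
    }
    Ok(())
}

/// Runs the encoder on a scoped thread and ingests batches on the caller's
/// thread as they arrive.
pub fn stream_and_ingest(ids: &[u64], batch_size: usize, channel_cap: usize) -> Vec<u64> {
    let (tx, rx) = sync_channel::<Vec<u64>>(channel_cap);
    thread::scope(|s| {
        s.spawn(move || {
            let mut sink = ChannelSink { tx };
            encode_ids(ids, batch_size, &mut sink).expect("receiver dropped early");
            // `sink` (and its sender) is dropped here, which closes the channel.
        });
        let mut ingested = Vec::new();
        // The loop ends once the sender has been dropped and the channel is drained.
        for batch in rx {
            ingested.extend(batch);
        }
        ingested
    })
}
```

For instance, `stream_and_ingest(&[1, 2, 3, 4, 5], 2, 1)` would return the five ids in their original order. A small `channel_cap` bounds how many batches sit in memory at once and makes the encoder wait for the ingesting side.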
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion db4-storage/src/pages/mod.rs
@@ -176,7 +176,7 @@ impl<
             node_meta.get_or_create_node_type_id(node_type);
         }

-        let t_len = edge_storage.t_len();
+        let t_len = edge_storage.t_len() + node_storage.t_len();

         Ok(Self {
             nodes: node_storage,
4 changes: 4 additions & 0 deletions db4-storage/src/pages/node_store.rs
@@ -170,6 +170,10 @@ impl<NS: NodeSegmentOps<Extension = EXT>, EXT: PersistenceStrategy<NS = NS>>
         self.segments.count()
     }

+    pub fn t_len(&self) -> usize {
+        self.segments.iter().map(|(_, page)| page.t_len()).sum()
+    }
+
     // pub fn segments(&self) -> &boxcar::Vec<Arc<NS>> {
     // &self.segments
     // }
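
The two hunks above make the graph-level `t_len` the sum of the edge and node stores, with the node store summing over its segments. A small self-contained sketch of that arithmetic, assuming `t_len` counts temporal entries per segment; `Segment`, `Store`, `from_counts` and `graph_t_len` are hypothetical stand-ins, not the types in `db4-storage`:

```rust
/// Hypothetical stand-in for a storage segment (page) holding temporal entries.
pub struct Segment {
    timestamps: Vec<i64>,
}

impl Segment {
    /// Number of temporal entries in this segment.
    pub fn t_len(&self) -> usize {
        self.timestamps.len()
    }
}

/// Hypothetical stand-in for a node or edge store made of indexed segments.
pub struct Store {
    segments: Vec<(usize, Segment)>,
}

impl Store {
    /// Builds a store with one segment per entry in `counts`, each holding
    /// that many dummy timestamps.
    pub fn from_counts(counts: &[usize]) -> Self {
        let segments = counts
            .iter()
            .enumerate()
            .map(|(i, &n)| (i, Segment { timestamps: (0..n as i64).collect() }))
            .collect();
        Store { segments }
    }

    /// Same shape as the added `t_len`: sum the per-segment counts.
    pub fn t_len(&self) -> usize {
        self.segments.iter().map(|(_, page)| page.t_len()).sum()
    }
}

/// Graph-level count: edge entries plus node entries.
pub fn graph_t_len(nodes: &Store, edges: &Store) -> usize {
    edges.t_len() + nodes.t_len()
}
```

With node segments of sizes 2 and 3 and one edge segment of size 4, `graph_t_len` gives 9, whereas counting edges alone (the previous behaviour in `mod.rs`) would give 4.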
4 changes: 2 additions & 2 deletions python/Cargo.toml
@@ -39,5 +39,5 @@ proto = ["raphtory/proto"]
 [build-dependencies]
 pyo3-build-config = { workspace = true }

-#[target.'cfg(not(target_env = "msvc"))'.dependencies]
-#tikv-jemallocator.workspace = true
+[target.'cfg(target_os = "macos")'.dependencies]
+tikv-jemallocator.workspace = true
6 changes: 6 additions & 0 deletions python/src/lib.rs
@@ -10,6 +10,12 @@ use raphtory::python::{
 };
 use raphtory_graphql::python::pymodule::base_graphql_module;

+#[cfg(target_os = "macos")]
+use tikv_jemallocator::Jemalloc;
+#[cfg(target_os = "macos")]
+#[global_allocator]
+static GLOBAL: Jemalloc = Jemalloc;
+
 /// Raphtory graph analytics library
 #[pymodule]
 fn _raphtory(py: Python<'_>, m: &Bound<PyModule>) -> PyResult<()> {
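
The hunk above installs a global allocator only when building for macOS. As a rough, dependency-free illustration of the same `#[cfg]` plus `#[global_allocator]` pattern, the sketch below swaps Jemalloc for a hypothetical `CountingAlloc` that forwards to the system allocator; `CountingAlloc`, `ALLOC_CALLS` and `allocator_choice` are names made up for this example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocations that went through `CountingAlloc`.
pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for Jemalloc: forwards to the system allocator
/// and counts allocation calls.
pub struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Only macOS builds replace the global allocator; every other target keeps
// the default one because this item is compiled out.
#[cfg(target_os = "macos")]
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Reports which allocator this build uses.
pub fn allocator_choice() -> &'static str {
    if cfg!(target_os = "macos") {
        "custom"
    } else {
        "system default"
    }
}
```

Both attributes need the same `cfg` condition as the dependency declaration in `Cargo.toml`; otherwise the `use` or the `static` would reference a crate that is not built for the current target.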
1 change: 1 addition & 0 deletions raphtory/Cargo.toml
@@ -23,6 +23,7 @@ storage.workspace = true
 iter-enum = { workspace = true, features = ["rayon"] }
 hashbrown = { workspace = true }
 chrono = { workspace = true }
+crossbeam-channel = { workspace = true }
 itertools = { workspace = true }
 num-traits = { workspace = true }
 num-integer = { workspace = true }
6 changes: 2 additions & 4 deletions raphtory/examples/eth_loader.rs
@@ -1,8 +1,6 @@
 #[cfg(feature = "io")]
-use raphtory::io::{
-    arrow::df_loaders::edges::ColumnNames, parquet_loaders::load_edges_from_parquet,
-};
-use raphtory::{errors::GraphError, prelude::*};
+use raphtory::io::parquet_loaders::load_edges_from_parquet;
+use raphtory::{arrow_loader::df_loaders::edges::ColumnNames, errors::GraphError, prelude::*};
 use std::path::{Path, PathBuf};

 /// Load ETH data from Parquet files into a Raphtory Graph.
7 changes: 2 additions & 5 deletions raphtory/examples/snb_loader.rs
@@ -1,9 +1,6 @@
 #[cfg(feature = "io")]
-use raphtory::io::{
-    arrow::df_loaders::edges::ColumnNames,
-    parquet_loaders::{load_edges_from_parquet, load_nodes_from_parquet},
-};
-use raphtory::{errors::GraphError, prelude::*};
+use raphtory::io::parquet_loaders::{load_edges_from_parquet, load_nodes_from_parquet};
+use raphtory::{arrow_loader::df_loaders::edges::ColumnNames, errors::GraphError, prelude::*};
 use std::path::{Path, PathBuf};

 /// Construct the path to a named Parquet file inside `parquet_dir`.

@@ -1,7 +1,7 @@
 use crate::{
     api::core::utils::time::TryIntoTime,
+    arrow_loader::node_col::{lift_node_col, NodeCol},
     errors::{into_load_err, GraphError, LoadError},
-    io::arrow::node_col::{lift_node_col, NodeCol},
 };
 use arrow::{
     array::{cast::AsArray, Array, ArrayRef, PrimitiveArray},

@@ -1,10 +1,10 @@
 #[cfg(feature = "progress")]
-use crate::io::arrow::df_loaders::build_progress_bar;
+use crate::arrow_loader::df_loaders::build_progress_bar;
+#[cfg(feature = "progress")]
+use kdam::BarExt;

 use crate::{
-    db::api::view::StaticGraphViewOps,
-    errors::{into_graph_err, GraphError, LoadError},
-    io::arrow::{
+    arrow_loader::{
         dataframe::{DFChunk, DFView},
         df_loaders::{
             edges::{get_or_resolve_node_vids, store_node_ids, ColumnNames},
@@ -13,13 +13,14 @@ use crate::{
         layer_col::lift_layer_col,
         prop_handler::*,
     },
+    db::api::view::StaticGraphViewOps,
+    errors::{into_graph_err, GraphError, LoadError},
     prelude::*,
 };
 use arrow::{array::AsArray, datatypes::UInt64Type};
 use bytemuck::checked::cast_slice_mut;
 use db4_graph::WriteLockedGraph;
 use itertools::izip;
-use kdam::BarExt;
 use raphtory_api::{
     atomic_extra::atomic_usize_from_mut_slice,
     core::entities::{properties::prop::AsPropRef, LayerId, EID},