Adding node layers to tests and loaders #2597
Draft
arienandalibi wants to merge 116 commits into
…re-compute new IDs and turn them into RecordBatches
…ock the graph to get parallel iterators over edges. We filter to respect GraphView filtering behaviour.
…ill use ArrowWriter<File> for now, but we will add support for loading into a graph
# Conflicts: # raphtory/src/serialise/parquet/mod.rs
…ng explode_layers() on each EdgeView.
… function can now be passed to these functions to determine how the sinks will be created. This will allow us to pass a sink which is a crossbeam_channel to send RecordBatches elsewhere.
# Conflicts: # raphtory/src/serialise/parquet/mod.rs
…w materialize function
…dge_meta and node_meta.
…k and reusing the old one.
…f encoding everything and then ingesting everything (which would keep everything in memory at once).
… when run on a big graph.
…another thread pool.
…rage so that it doesn't run out of memory
…anning each segment for each row. Now using this path in the new materialize_using_recordbatches function.
… as much as possible
…separate out running materialize and parquet decoding. Test using SF10 for now.
# Conflicts: # raphtory/src/arrow_loader/df_loaders/nodes.rs # raphtory/src/db/api/view/graph.rs # raphtory/src/io/parquet_loaders.rs # raphtory/src/parquet_encoder/edges.rs # raphtory/src/parquet_encoder/mod.rs # raphtory/src/parquet_encoder/model.rs # raphtory/src/parquet_encoder/nodes.rs # raphtory/src/python/graph/io/arrow_loaders.rs # raphtory/src/serialise/parquet.rs # raphtory/tests/df_loaders.rs # raphtory/tests/test_materialize_sf10.rs
…'re now back to ingesting using VIDs instead of resolving GIDs.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Rust Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 2.
| Benchmark suite | Current: ae5856a | Previous: 9823ef7 | Ratio |
|---|---|---|---|
| `lotr_graph/num_edges` | 5 ns/iter (± 0) | 0 ns/iter (± 0) | +∞ |
| `lotr_graph/num_nodes` | 5 ns/iter (± 0) | 1 ns/iter (± 0) | 5 |
| `lotr_graph/graph_latest` | 3 ns/iter (± 0) | 0 ns/iter (± 0) | +∞ |
| `lotr_graph_materialise/materialize` | 8112181 ns/iter (± 92144) | 1564816 ns/iter (± 35303) | 5.18 |
| `lotr_graph_window_100/num_nodes` | 14 ns/iter (± 0) | 5 ns/iter (± 0) | 2.80 |
| `lotr_graph_window_100/iterate_exploded_edges` | 771669 ns/iter (± 1905) | 325242 ns/iter (± 847) | 2.37 |
| `lotr_graph_window_100_materialise/materialize` | 8439124 ns/iter (± 20322) | 1669150 ns/iter (± 10700) | 5.06 |
| `lotr_graph_window_10/has_node_existing` | 146 ns/iter (± 7) | 62 ns/iter (± 11) | 2.35 |
| `lotr_graph_window_10/iterate nodes` | 31982 ns/iter (± 64) | 11339 ns/iter (± 40) | 2.82 |
| `lotr_graph_window_10/iterate_exploded_edges` | 370371 ns/iter (± 789) | 155788 ns/iter (± 1001) | 2.38 |
| `lotr_graph_window_10_materialise/materialize` | 3616229 ns/iter (± 13006) | 971980 ns/iter (± 4278) | 3.72 |
| `lotr_graph_subgraph_10pc_materialise/materialize` | 1731301 ns/iter (± 16067) | 334634 ns/iter (± 1287) | 5.17 |
| `lotr_graph_subgraph_10pc_windowed/has_node_existing` | 153 ns/iter (± 9) | 62 ns/iter (± 14) | 2.47 |
| `lotr_graph_subgraph_10pc_windowed/iterate nodes` | 5234 ns/iter (± 168) | 1365 ns/iter (± 3) | 3.83 |
| `lotr_graph_subgraph_10pc_windowed_materialise/materialize` | 1065311 ns/iter (± 8431) | 230399 ns/iter (± 2617) | 4.62 |
| `lotr_graph_window_50_layered/num_edges_temporal` | 152259 ns/iter (± 4117) | 70121 ns/iter (± 7586) | 2.17 |
| `lotr_graph_window_50_layered/has_node_existing` | 439 ns/iter (± 17) | 129 ns/iter (± 12) | 3.40 |
| `lotr_graph_window_50_layered/iterate nodes` | 74981 ns/iter (± 219) | 19308 ns/iter (± 47) | 3.88 |
| `lotr_graph_window_50_layered/iterate edges` | 191543 ns/iter (± 3827) | 83616 ns/iter (± 1318) | 2.29 |
| `lotr_graph_window_50_layered/graph_latest` | 79594 ns/iter (± 2869) | 36649 ns/iter (± 916) | 2.17 |
| `lotr_graph_window_50_layered_materialise/materialize` | 32210207 ns/iter (± 272878) | 3488825 ns/iter (± 24948) | 9.23 |
| `lotr_graph_persistent_window_50_layered/num_edges_temporal` | 596036 ns/iter (± 3556) | 192686 ns/iter (± 1569) | 3.09 |
| `lotr_graph_persistent_window_50_layered/has_node_existing` | 471 ns/iter (± 293) | 174 ns/iter (± 83) | 2.71 |
| `lotr_graph_persistent_window_50_layered/iterate nodes` | 98411 ns/iter (± 266) | 35886 ns/iter (± 191) | 2.74 |
| `lotr_graph_persistent_window_50_layered/iterate edges` | 174062 ns/iter (± 3028) | 84161 ns/iter (± 596) | 2.07 |
| `lotr_graph_persistent_window_50_layered/iterate_exploded_edges` | 4310197 ns/iter (± 19958) | 1659940 ns/iter (± 19402) | 2.60 |
| `lotr_graph_persistent_window_50_layered_materialise/materialize` | 58225321 ns/iter (± 109281) | 5298035 ns/iter (± 147912) | 10.99 |
| `lotr_graph/proto_encode` | 9914727 ns/iter (± 75745) | 1157897 ns/iter (± 73709) | 8.56 |
This comment was automatically generated by workflow using github-action-benchmark.
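For readers checking the Ratio column by hand, here is a minimal sketch of the arithmetic. It assumes the ratio is simply the current timing divided by the previous one, which matches the rows reported in the alert, and that a row is flagged when the ratio exceeds the threshold of 2. The function names `ratio` and `is_regression` are hypothetical and chosen for illustration; this is not the benchmark action's own code.

```rust
/// Ratio of the current timing to the previous one, both in ns/iter.
/// Assumption: plain division, as the reported rows suggest.
/// A previous timing of 0 with a non-zero current timing yields +infinity.
fn ratio(current_ns: u64, previous_ns: u64) -> f64 {
    current_ns as f64 / previous_ns as f64
}

/// A row is flagged when its ratio exceeds the alert threshold.
/// 0/0 produces NaN, which compares false and is therefore not flagged.
fn is_regression(current_ns: u64, previous_ns: u64, threshold: f64) -> bool {
    ratio(current_ns, previous_ns) > threshold
}

fn main() {
    // Threshold taken from the alert text ("exceeding threshold 2").
    let threshold = 2.0;

    // Three rows copied from the alert: (name, current, previous).
    let rows = [
        ("lotr_graph/num_edges", 5u64, 0u64),
        ("lotr_graph/num_nodes", 5, 1),
        ("lotr_graph_materialise/materialize", 8_112_181, 1_564_816),
    ];

    for (name, current, previous) in rows {
        println!(
            "{name}: ratio {:.2}, flagged: {}",
            ratio(current, previous),
            is_regression(current, previous, threshold)
        );
    }
}
```

The `+∞` rows come from a previous timing reported as 0 ns/iter, so they likely reflect rounding of sub-nanosecond measurements rather than an unbounded slowdown. The `materialize` and `proto_encode` rows, with millisecond-scale timings, are the more meaningful signals.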
# Conflicts: # raphtory/src/db/api/view/graph.rs
…c. resolve_layer fast path when layer ids are present is gone temporarily while debugging, will bring it back. Fix node_updates_window in persistent_semantics.rs to account for the entire timestamp at the window's beginning so that properties persist properly.
…ders, and bringing back the fast path that uses these when resolving layers.
… back to Option, if it's not there then we imply STATIC_GRAPH_LAYER
… added layer and layer id to nodes_t
…ing both functions into one. updating callsites. fixing variable names. cleaning up comments
# Conflicts: # raphtory/src/db/api/view/internal/time_semantics/persistent_semantics.rs # raphtory/src/parquet_encoder/nodes.rs
…the disk graphs are unreadable or data is loaded incorrectly
…the disk graphs are unreadable or data is loaded incorrectly
…the source graph doesn't change each time the test runs. Added .gitkeep empty files so empty directories are picked up by git.
…st. Add an entry in the makefile to run this example and create the necessary sentinel .gitkeep files in empty directories.
… to pometry-storage
What changes were proposed in this pull request?
Node layers were previously not tested rigorously. They are now being added to tests, proptests, the loaders, and the parquet encoders. The loaders and parquet encoders are also used by materialize.
Why are the changes needed?
To surface and fix node-layer-related bugs as we find them.
Does this PR introduce any user-facing change? If yes is this documented?
It shouldn't.
How was this patch tested?
With proptests.
Are there any further changes required?
There shouldn't be.