rust(feat): Updated Rust CLI Parquet imports to detect config on clientside #541

Draft
Brandon-Shippy wants to merge 2 commits into main from brandonshippy/rust/parquet_config_import_clientside

@Brandon-Shippy

Added detect.rs, which adds methods for parsing Parquet on the client side instead of polling the Sift API.

@solidiquis solidiquis changed the title Rust(Feat) Updated Rust CLI Parquet imports to detect config on clientside rust(feat): Updated Rust CLI Parquet imports to detect config on clientside Apr 21, 2026
@Brandon-Shippy Brandon-Shippy marked this pull request as draft April 21, 2026 00:43
Collaborator

@solidiquis solidiquis left a comment

Left comments

let arrow_schema = parquet_to_arrow_schema(
    metadata.file_metadata().schema_descr(),
    metadata.file_metadata().key_value_metadata()
).unwrap();
Collaborator

Prefer anyhow context and an early return with ?, and avoid unwrap.
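
For illustration, here is a minimal sketch of that pattern. It uses only the standard library so it compiles on its own; `DetectError`, `parse_column_count`, and `detect_column_count` are hypothetical names, not code from this PR. With anyhow, the `map_err` call would be replaced by `.context("...")` from the `anyhow::Context` trait and the return type would be `anyhow::Result<usize>`.

```rust
use std::fmt;

// Hypothetical error type standing in for anyhow::Error so the
// sketch needs no external crates.
#[derive(Debug)]
struct DetectError(String);

impl fmt::Display for DetectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for DetectError {}

// Hypothetical stand-in for a fallible step such as the schema conversion.
fn parse_column_count(raw: &str) -> Result<usize, std::num::ParseIntError> {
    raw.trim().parse::<usize>()
}

// Attach context to the failure and return early with `?` instead of
// panicking via `unwrap`.
fn detect_column_count(raw: &str) -> Result<usize, DetectError> {
    let count = parse_column_count(raw)
        .map_err(|e| DetectError(format!("failed to read column count: {e}")))?;
    Ok(count)
}
```

The caller then decides how to report the error, rather than the CLI aborting inside the detection code.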

metadata.file_metadata().key_value_metadata()
).unwrap();

let mut time_column: Option<ParquetTimeColumn> = None;
Collaborator

Avoid explicitly typing your variables and allow Rust to do type inference everywhere it can.
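
A small sketch of what this could look like. The `TimeColumn` and `DataColumn` structs, the `split_columns` function, and the "timestamp" name check are hypothetical stand-ins, not the types or logic in this PR. The point is only that both bindings are declared without annotations and the compiler infers them from later use and from the return type.

```rust
// Hypothetical stand-ins for ParquetTimeColumn and ParquetDataColumn.
#[derive(Debug, PartialEq)]
struct TimeColumn {
    name: String,
}

#[derive(Debug, PartialEq)]
struct DataColumn {
    name: String,
}

fn split_columns(names: &[&str]) -> (Option<TimeColumn>, Vec<DataColumn>) {
    // No annotations: the types are inferred from the assignments below
    // and from the function's return type.
    let mut time_column = None;
    let mut data_columns = Vec::new();

    for &name in names {
        // Assumption for this sketch: the time column is named "timestamp".
        if name == "timestamp" {
            time_column = Some(TimeColumn { name: name.to_string() });
        } else {
            data_columns.push(DataColumn { name: name.to_string() });
        }
    }

    (time_column, data_columns)
}
```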

).unwrap();

let mut time_column: Option<ParquetTimeColumn> = None;
let mut data_columns: Vec<ParquetDataColumn> = Vec::new();
Collaborator

ditto

let mut data_columns: Vec<ParquetDataColumn> = Vec::new();

for field in arrow_schema.fields() {
// TODO: If no time_column handle that (invalid)
Collaborator

Address todo
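
One possible way to handle the missing time column, sketched under assumptions: `find_time_column` is a hypothetical helper, the match on the name "timestamp" is a placeholder for whatever detection rule the PR ends up using, and the `String` error type is only there to keep the example free of external crates.

```rust
// Returns the name of the time column, or an error if the schema has none.
fn find_time_column(names: &[&str]) -> Result<String, String> {
    let mut time_column = None;

    for &name in names {
        // Placeholder detection rule for this sketch only.
        if name.eq_ignore_ascii_case("timestamp") {
            time_column = Some(name.to_string());
            break;
        }
    }

    // A file without a time column is treated as invalid input,
    // so surface an error instead of continuing with None.
    time_column.ok_or_else(|| "no time column found in Parquet schema".to_string())
}
```

Converting the `Option` into a `Result` after the loop lets the caller propagate the failure with `?`.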

use windows::{FooterMetadata, get_footer};

pub mod flat_dataset;
pub mod detect;
Collaborator

module name needs work to better express intent
