Skip to content

kdtree_cuda#86

Open
koide3 wants to merge 12 commits intomasterfrom
kdtree_cuda
Open

kdtree_cuda#86
koide3 wants to merge 12 commits intomasterfrom
kdtree_cuda

Conversation

@koide3
Copy link
Copy Markdown
Owner

@koide3 koide3 commented Dec 6, 2025

No description provided.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds GPU-accelerated GICP (Generalized Iterative Closest Point) matching using KdTree on CUDA. The implementation provides a GPU-based alternative to the existing CPU-based GICP factor, allowing for faster point cloud registration on systems with CUDA support.

Changes:

  • Introduces KdTreeGPU class for GPU-accelerated nearest neighbor search using KdTree data structure
  • Adds IntegratedGICPFactorGPU factor class for GPU-accelerated GICP matching cost computation
  • Implements CUDA kernels for KdTree search, correspondence computation, and GICP derivatives linearization
  • Adds comprehensive test coverage for both KdTree GPU functionality and GICP GPU factor integration
  • Updates demo and CI configuration to support the new GPU GICP factor

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
include/gtsam_points/ann/kdtree_gpu.hpp Header defining KdTreeGPU class with GPU nearest neighbor search interface
include/gtsam_points/factors/integrated_gicp_factor_gpu.hpp Header defining IntegratedGICPFactorGPU class for GPU-accelerated GICP matching
include/gtsam_points/factors/integrated_gicp_derivatives.cuh CUDA header for GICP derivatives computation utilities
include/gtsam_points/cuda/kernels/kdtree.cuh CUDA kernel for KdTree nearest neighbor search on GPU
include/gtsam_points/cuda/kernels/gicp_derivatives.cuh CUDA kernels for computing GICP derivatives and error
include/gtsam_points/cuda/kernels/correspondence.hpp Header defining Correspondence structure for point matching
src/gtsam_points/ann/kdtree_gpu.cpp CPU-side implementation of KdTreeGPU (tree construction, memory management)
src/gtsam_points/ann/kdtree_gpu.cu GPU-side implementation of KdTree nearest neighbor search kernel
src/gtsam_points/factors/integrated_gicp_factor_gpu.cpp Implementation of IntegratedGICPFactorGPU factor methods
src/gtsam_points/factors/integrated_gicp_derivatives.cu Base implementation of GICP derivatives computation
src/gtsam_points/factors/integrated_gicp_derivatives_inliers.cu Inlier correspondence management for GICP
src/gtsam_points/factors/integrated_gicp_derivatives_linearize.cu GICP factor linearization on GPU
src/gtsam_points/factors/integrated_gicp_derivatives_compute.cu GICP error computation on GPU
src/test/test_kdtree.cpp Adds KdTreeGPU unit tests for nearest neighbor search accuracy
src/test/test_matching_cost_factors.cpp Adds GICP_CUDA to integration tests for factor optimization
src/demo/demo_matching_cost_factors.cpp Adds GICP_GPU option to matching cost factors demo
CMakeLists.txt Adds new CUDA source files to build configuration
.github/workflows/build.yml Removes CUDA 12.2 build, keeping 12.5 and 13.1

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +52 to +55
* @param targfixed_target_pose Fixed target pose
* @param source_key Source key
* @param target Target kdtree
* @param source Source frame
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter documentation is missing for "target_tree" in the unary constructor. The binary constructor on line 37 documents it, but it's missing here. This inconsistency in documentation should be fixed for clarity.

Suggested change
* @param targfixed_target_pose Fixed target pose
* @param source_key Source key
* @param target Target kdtree
* @param source Source frame
* @param fixed_target_pose Fixed target pose
* @param source_key Source key
* @param target Target frame
* @param source Source frame
* @param target_tree Target kdtree

Copilot uses AI. Check for mistakes.
check_error << cudaMemcpyAsync(nodes, h_nodes.data(), sizeof(KdTreeNodeGPU) * num_nodes, cudaMemcpyHostToDevice, stream);
}

KdTreeGPU::~KdTreeGPU() {
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent stream parameter in cudaFreeAsync calls. In the destructor, cudaFreeAsync is called with nullptr for the stream parameter, but these allocations were done with a specific stream in the constructor (line 52-55). The stream should match the allocation stream or be synchronized before freeing. Using nullptr may cause undefined behavior if the allocation stream is still using the memory.

Suggested change
KdTreeGPU::~KdTreeGPU() {
KdTreeGPU::~KdTreeGPU() {
// Ensure all GPU work that might use 'indices' or 'nodes' has completed
check_error << cudaDeviceSynchronize();

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +60
if (stack_size > MAX_STACK_SIZE - 2) {
printf("kdtree stack overflow!!");
} else if (cut_sq_dist < result.second) {
search_stack[stack_size].first = other_child;
search_stack[stack_size++].second = cut_sq_dist;
}

search_stack[stack_size].first = best_child;
search_stack[stack_size++].second = 0.0f;
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stack overflow handling is incomplete. When stack_size exceeds MAX_STACK_SIZE - 2, a warning is printed but execution continues, potentially writing beyond array bounds on lines 59-60. After printing the warning, the function should either return early with an error result or skip adding to the stack to prevent buffer overflow.

Copilot uses AI. Check for mistakes.
Comment thread include/gtsam_points/cuda/kernels/kdtree.cuh

/**
* @brief Create a unary GICP_GPU factor between a fixed target pose and an active source pose.
* @param targfixed_target_pose Fixed target pose
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in parameter name documentation. "targfixed_target_pose" should be "fixed_target_pose" to match the actual parameter name on line 60.

Suggested change
* @param targfixed_target_pose Fixed target pose
* @param fixed_target_pose Fixed target pose

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants