Harden PicoD server: graceful shutdown, symlink guard, doc fix#337
Harden PicoD server: graceful shutdown, symlink guard, doc fix#337Abhinav-kodes wants to merge 3 commits into
Conversation
|
@Abhinav-kodes: The label(s) DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request implements graceful shutdown for the picod server by introducing a signal-aware context and updating the server's Run method. Additionally, it updates the default execution timeout documentation and switches to a safer directory creation method. The review feedback correctly identifies a race condition in the shutdown logic where the main process could exit prematurely, as well as an issue where a clean shutdown is incorrectly reported as a fatal error. A code suggestion was provided to address these concerns using a synchronization channel.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #337 +/- ##
==========================================
+ Coverage 47.57% 48.93% +1.36%
==========================================
Files 30 30
Lines 2819 2869 +50
==========================================
+ Hits 1341 1404 +63
+ Misses 1338 1312 -26
- Partials 140 153 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements graceful shutdown for the picod server by integrating signal handling and context-aware execution. It also increases the default command execution timeout to 60 seconds and refactors directory creation to use a safer internal method. Feedback focuses on improving the server's state management by using a local variable for the HTTP server instead of a struct field to prevent potential race conditions, and increasing the shutdown timeout to 90 seconds to better accommodate long-running requests.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements graceful shutdown for the picod server by integrating signal handling in the main entry point and utilizing the http.Server.Shutdown method. Additionally, it updates the default command timeout documentation and refactors directory creation to use a safer internal method. A review comment suggests avoiding the use of a magic number for the shutdown timeout by defining it as a constant or making it configurable.
Signed-off-by: Abhinav Singh <abhinavsingh717073@gmail.com>
Signed-off-by: Abhinav Singh <abhinavsingh717073@gmail.com>
Signed-off-by: Abhinav Singh <abhinavsingh717073@gmail.com>
bd41de4 to
02cb899
Compare
What type of PR is this?
/kind security
/kind enhancement
What this PR does / why we need it:
This PR hardens the PicoD server with three targeted improvements:
Graceful shutdown —
Run()now accepts acontext.Contextand shuts down the HTTP server cleanly onSIGINT/SIGTERM, allowing in-flight requests to complete before exit. This is foundational for the stateful execution direction ([Proposal] Position AgentCube as a Stateful, Isolated, Concurrent Rollout Execution Layer for Agentic RL and Verifiable Agentic Tasks #267).Symlink traversal guard in upload handlers —
handleMultipartUploadandhandleJSONBase64Uploadusedos.MkdirAllto create parent directories, which follows symlinks in existing path components. An attacker with workspace write access could place a symlink pointing outside the jail, causing directory creation beyond the workspace boundary. Both call sites now use the existingmkdirSafeguard, consistent withExecuteHandler.Doc fix — The
ExecuteRequest.Timeoutcomment stated the default was"30s"but the actual default is60s.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
setWorkspacecall site intentionally keepsos.MkdirAll— it creates the workspace root itself at startup from admin config, andmkdirSafeusesworkspaceDiras the jail root, so using it there would be circular.Run()mirrors the pattern used by the workloadmanager'sShutdown()method.Does this PR introduce a user-facing change?: