feat(accounts): wire AWS Organizations discovery endpoint (closes #208)#212
Conversation
…OrgAccounts (closes #208) POST /api/accounts/discover-org used to return a stub "not yet implemented" message. Wire it end-to-end against the already-shipped accounts.DiscoverOrgAccounts function: 1. Parse {account_id} body, validate it's a UUID. 2. Load the named account via config.GetCloudAccount; 404 if missing. 3. Validate provider == "aws" and aws_is_org_root == true; 400 otherwise. 4. Resolve the org-root's stored credentials via credentials.ResolveAWSCredentialProvider and build an aws.Config. 5. Call accounts.DiscoverOrgAccounts (injectable seam: Handler.discoverOrgFn for tests; nil-default falls back to the real Organizations API call). 6. Dedupe each member account by (provider="aws", external_id) against the existing aws cloud_accounts roster. 7. Persist new members with enabled=false, aws_auth_mode=bastion, aws_bastion_id pointing at the org root — so the operator must review/approve and fill in the target role ARN before any discovered account participates in scheduled collection. 8. Return {discovered, created, skipped} counts. Spec acceptance F-1, F-2, F-3 from specs/multi-account-execution/acceptance.md. Test additions in handler_accounts_test.go cover the primary acceptance dimensions: - 400 on invalid JSON body - 400 on invalid account_id (not a UUID) - 400 on non-AWS account - 400 on aws_is_org_root=false - 404 on account not found - happy path: 3 discovered, 1 already known (skipped), 2 created; asserts each persisted row has enabled=false + aws_auth_mode=bastion + aws_bastion_id=org-root.ID The router-level smoke test in router_handlers_test.go is updated to assert the routing dispatcher reaches the handler (now expects a ClientError 400 on the empty body it sends, instead of the prior NotNil result from the stub). MockConfigStore gains ListCloudAccountsFn + CreateCloudAccountFn hooks so the org-discovery happy-path test can inject an existing roster and capture the persisted rows.
|
@coderabbitai review |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds a production implementation for POST /api/accounts/discover-org: validates org-root account, resolves its AWS credentials, invokes AWS Organizations discovery (injectable for tests), deduplicates by external_id against stored cloud accounts, persists new cloud accounts with locked defaults, and returns discovered/created/skipped counts. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler
participant ConfigStore
participant CredResolver
participant DiscoveryFn as Discovery
participant CloudAccountStore
Client->>Handler: POST /api/accounts/discover-org { account_id }
Handler->>ConfigStore: Load account by ID
ConfigStore-->>Handler: Account (may include stored credentials)
Handler->>Handler: Validate aws_is_org_root == true
Handler->>CredResolver: Resolve AWS credentials for account
CredResolver-->>Handler: aws.Config
Handler->>DiscoveryFn: DiscoverOrg(ctx, aws.Config)
DiscoveryFn-->>Handler: []MemberAccounts
Handler->>ConfigStore: ListCloudAccounts(filter by provider/external_id)
ConfigStore-->>Handler: []ExistingAccounts
Handler->>Handler: Deduplicate discovered vs existing
loop For each new member
Handler->>CloudAccountStore: CreateCloudAccount(enabled=false, aws_bastion_id=orgRoot, defaults)
CloudAccountStore-->>Handler: Persisted account
end
Handler-->>Client: 200 OK { discovered, created, skipped }
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 3/5 reviews remaining, refill in 23 minutes and 8 seconds. Comment |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/handler_accounts.go`:
- Around line 1174-1183: The error handling in buildOrgRootAWSConfig incorrectly
maps all failures from credentials.ResolveAWSCredentialProvider to a 400 client
error; instead, examine the returned error from ResolveAWSCredentialProvider (in
buildOrgRootAWSConfig) and only convert to NewClientError(400, ...) for definite
client-side validation issues (e.g., sentinel ErrBadCredentials or a typed
validation error), while mapping transient/server-side failures (credential
store unavailable, STS errors, network/timeouts) to a server error (e.g., return
NewServerError(500, fmt.Sprintf(...)) or wrap and return the original error) so
retries are possible; update the conditional around
credentials.ResolveAWSCredentialProvider to branch on error type or sentinel
values and return the appropriate error type rather than unconditionally using
NewClientError(400,...).
- Around line 1226-1232: The code is defaulting new members to
member.AWSAuthMode = "bastion" while leaving member.AWSRoleARN empty, which
causes awsAmbientCredResult to treat the account as having valid ambient creds;
change the default so newly persisted rows do not claim bastion mode until a
role is configured (e.g., set member.AWSAuthMode = "" or "bastion_pending"
instead of "bastion"), or alternatively update awsAmbientCredResult to
explicitly require a non-empty AWSRoleARN when AWSAuthMode == "bastion"; modify
either the initializer around
member.Enabled/member.AWSAuthMode/member.AWSBastionID or the
awsAmbientCredResult validation logic to enforce that bastion mode implies a
configured AWSRoleARN.
- Around line 1119-1125: The handler discoverOrgAccounts currently calls
requirePermission(ctx, req, "create", "accounts") which allows non-admin users;
change the permission check to require admin-only by calling
requirePermission(ctx, req, "admin", "accounts") (or otherwise enforce admin
scope) before parsing the org root, and/or additionally call
requireAccountAccess to scope the org-root account if applicable; update the
check in discoverOrgAccounts so only admin-scoped principals can invoke this
endpoint.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2f85acbe-8891-43e2-a4c1-3c9e210edae1
📒 Files selected for processing (5)
internal/api/handler.gointernal/api/handler_accounts.gointernal/api/handler_accounts_test.gointernal/api/mocks_test.gointernal/api/router_handlers_test.go
…, drop misleading bastion default, drop blanket-400 (#212) Three CR-pass-1 findings on PR #212 addressed in one commit: 1. Permission scope: switch from requirePermission("create", "accounts") to requireAdmin. Org discovery can create N cloud_accounts rows in one call and bring unfamiliar accounts into the roster — admin scope is the right gate even though the rows boot disabled. requireAdmin honours both the admin API-key and bearer-token paths used elsewhere. 2. Drop the default AWSAuthMode="bastion" on persisted rows. awsAmbientCredResult treats role_arn or bastion modes with an empty AWSRoleARN as "ambient host credentials" — semantically correct for the CUDly self-account, but WRONG for a freshly-discovered member account, which would have falsely passed the test-credentials endpoint. Persist with AWSAuthMode="" instead; the operator's review step must set BOTH the mode AND a non-empty AWSRoleARN before flipping enabled=true. AWSBastionID is still pre-filled with the org root's ID so the operator's edit is just "add role ARN, set mode=bastion". The empty mode also fails ResolveAWSCredentialProvider's switch with a clear "unsupported aws_auth_mode" error if the row is enabled prematurely. 3. buildOrgRootAWSConfig no longer wraps every ResolveAWSCredentialProvider error as ClientError(400). The resolver mixes definite client-side validation failures (missing aws_role_arn) with transient server-side ones (credential store unavailable, network errors during access-key load); without a sentinel/typed error in the credentials package today, blanket-400 was misleading. Wrap the error as a regular Go error instead — the handler-default surface is 5xx, retries are possible, and the error message makes it into the structured log either way. A proper 400/500 split lives in a future credentials-package error-type cleanup; flagged in the doc comment. Test update: TestDiscoverOrgAccounts_HappyPathDedupesAndPersists now asserts AWSAuthMode is empty on persisted rows (was checking "bastion"). The doc above the assertions explains the WHY so a future maintainer doesn't "fix" it back. go test ./... clean. No behaviour change other than the three items above.
|
CR pass 1 addressed — pushed 1. Permission scope:
|
|
Triggering a fresh review of ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/api/handler_accounts.go (1)
1113-1114: ⚡ Quick winUpdate the stale
aws_auth_modecomments.These comments still describe the old
aws_auth_mode=bastionbehavior, but the implementation now intentionally persists an empty auth mode. Keeping them out of sync makes the safety rationale easy to undo by accident.Also applies to: 1220-1221
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler_accounts.go` around lines 1113 - 1114, Update the stale comments that reference aws_auth_mode=bastion to reflect the current behavior of persisting an empty auth mode: find the comment blocks in handler_accounts.go (the comments around the code handling creation/persistence of external IDs and AWS auth setup, e.g., the comment near the logic that "persists the new ones with enabled=false" and the later comment at the other occurrence) and change the wording to state that aws_auth_mode is intentionally persisted as an empty value (not "bastion"), explaining the safety rationale accordingly and keeping the explanation aligned with the code that writes an empty aws_auth_mode.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/handler_accounts.go`:
- Around line 1235-1258: The discovered-member save path fails to initialize the
account metadata: before calling h.config.CreateCloudAccount inside the members
loop, set member.ID and the timestamp fields (member.CreatedAt and
member.UpdatedAt) the same way other CreateCloudAccount callers in this file do
(i.e., generate a new ID and assign current time for CreatedAt/UpdatedAt) so the
store receives valid metadata; do this just prior to the CreateCloudAccount call
in the loop that processes discovered members.
---
Nitpick comments:
In `@internal/api/handler_accounts.go`:
- Around line 1113-1114: Update the stale comments that reference
aws_auth_mode=bastion to reflect the current behavior of persisting an empty
auth mode: find the comment blocks in handler_accounts.go (the comments around
the code handling creation/persistence of external IDs and AWS auth setup, e.g.,
the comment near the logic that "persists the new ones with enabled=false" and
the later comment at the other occurrence) and change the wording to state that
aws_auth_mode is intentionally persisted as an empty value (not "bastion"),
explaining the safety rationale accordingly and keeping the explanation aligned
with the code that writes an empty aws_auth_mode.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 771cf1ae-05b7-45ea-bc13-02cf15900a05
📒 Files selected for processing (2)
internal/api/handler_accounts.gointernal/api/handler_accounts_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/api/handler_accounts_test.go
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/api/handler_accounts_test.go (1)
525-684: ⚡ Quick winAdd regression tests for the admin-only and 5xx behaviors.
This block locks down the 400/404 paths, but it still doesn't assert the two review fixes that changed the handler contract: non-admin callers must be rejected, and credential-resolution failures must not surface as client errors. One focused case for each here would make those regressions much harder to reintroduce.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler_accounts_test.go` around lines 525 - 684, Add two regression tests for discoverOrgAccounts: one that verifies non-admin callers are rejected (call handler.discoverOrgAccounts without setupAdminAuth or with a MockAuthService that is not admin and assert it returns a client error with appropriate forbidden code), and one that verifies credential-resolution failures do not surface as client errors (inject a credStore or ResolveAWSCredentialProvider stub that returns an error when Handler.discoverOrgAccounts is invoked and assert the returned error is not classified by IsClientError and carries a 5xx server error code). Ensure tests reference the same Handler.discoverOrgAccounts path, reuse MockAuthService/setupAdminAuth patterns for auth state, and inject the failing credential resolver via the Handler.credStore or discoverOrgFn hooks used in the existing tests so behavior is exercised in-process.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/handler_accounts.go`:
- Around line 1225-1263: The loop that persists discovered members treats any
CreateCloudAccount error as fatal but doesn't handle duplicate-key races or
duplicate entries in members; update the logic in the loop that iterates over
members (referencing members, knownExternal, CreateCloudAccount, and
DiscoverOrgResult) to: on duplicate-key (unique constraint) errors increment
result.Skipped and continue, on successful create add member.ExternalID to
knownExternal so subsequent iterations see it as existing (to keep idempotency),
and only return an error for non-duplicate failures; ensure
result.Discovered/Skipped counters reflect these outcomes.
---
Nitpick comments:
In `@internal/api/handler_accounts_test.go`:
- Around line 525-684: Add two regression tests for discoverOrgAccounts: one
that verifies non-admin callers are rejected (call handler.discoverOrgAccounts
without setupAdminAuth or with a MockAuthService that is not admin and assert it
returns a client error with appropriate forbidden code), and one that verifies
credential-resolution failures do not surface as client errors (inject a
credStore or ResolveAWSCredentialProvider stub that returns an error when
Handler.discoverOrgAccounts is invoked and assert the returned error is not
classified by IsClientError and carries a 5xx server error code). Ensure tests
reference the same Handler.discoverOrgAccounts path, reuse
MockAuthService/setupAdminAuth patterns for auth state, and inject the failing
credential resolver via the Handler.credStore or discoverOrgFn hooks used in the
existing tests so behavior is exercised in-process.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f701f71e-d9b6-4f18-8776-22eb4648b725
📒 Files selected for processing (2)
internal/api/handler_accounts.gointernal/api/handler_accounts_test.go
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/handler_accounts.go`:
- Around line 1260-1262: The code sets member.AWSAuthMode = "" before calling
CreateCloudAccount, but the DB column cloud_accounts.aws_auth_mode only allows
('access_keys','role_arn','bastion') so an empty string will cause insert
failures; change the logic in the handler so that AWSAuthMode is left unset/nil
(or explicitly set to a null-equivalent) instead of an empty string before
calling h.config.CreateCloudAccount (or alternatively adjust CreateCloudAccount
to convert empty-string sentinel to NULL) and ensure CreateCloudAccount uses a
NULL value when AWSAuthMode is not provided.
- Around line 1138-1143: The code calls runOrgDiscovery and then unconditionally
dereferences disco.Accounts when returning persistDiscoveredMembers; add a nil
guard to handle cases where runOrgDiscovery returns (nil, nil) to avoid a panic:
after disco, err := h.runOrgDiscovery(ctx, cfg) check if disco == nil and return
an appropriate nil/empty response or error (consistent with surrounding error
handling) before calling persistDiscoveredMembers, referencing the disco
variable and the persistDiscoveredMembers(ctx, root, disco.Accounts) call so the
fix is applied at that call site.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b03ad946-a2fc-43c8-8013-3e924afef40c
📒 Files selected for processing (2)
internal/api/handler_accounts.gointernal/api/handler_accounts_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/api/handler_accounts_test.go
| member.AWSAuthMode = "" | ||
| member.AWSBastionID = root.ID | ||
| if err := h.config.CreateCloudAccount(ctx, &member); err != nil { |
There was a problem hiding this comment.
aws_auth_mode="" may fail DB writes under current CHECK constraint.
Line 1260 persists an empty string, but cloud_accounts.aws_auth_mode is constrained to ('access_keys','role_arn','bastion') in internal/database/postgres/migrations/000011_cloud_accounts.up.sql. If CreateCloudAccount writes "" directly (not NULL), discovered-member inserts will fail at runtime.
Consider using NULL for “unset” auth mode (or migrating the constraint to explicitly allow your sentinel).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/api/handler_accounts.go` around lines 1260 - 1262, The code sets
member.AWSAuthMode = "" before calling CreateCloudAccount, but the DB column
cloud_accounts.aws_auth_mode only allows ('access_keys','role_arn','bastion') so
an empty string will cause insert failures; change the logic in the handler so
that AWSAuthMode is left unset/nil (or explicitly set to a null-equivalent)
instead of an empty string before calling h.config.CreateCloudAccount (or
alternatively adjust CreateCloudAccount to convert empty-string sentinel to
NULL) and ensure CreateCloudAccount uses a NULL value when AWSAuthMode is not
provided.
|
Addressed the latest review note with a nil-guard on the discovery path and a regression test for a nil discovery result. I did not change the empty storage path: already writes through , so an empty string is persisted as SQL NULL rather than an invalid empty enum value. That keeps the current safety invariant intact without widening the DB contract. |
|
Addressed the latest review note with a nil-guard on the discovery path and a regression test for a nil discovery result. |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Closes #208 — wires
POST /api/accounts/discover-orgto the already-shippedaccounts.DiscoverOrgAccountsfunction. Replaces the stub atinternal/api/handler_accounts.go:1089that returned"org discovery not yet implemented".What the endpoint does
Given
{"account_id": "<org-root-uuid>"}:account_idis a UUID, named account exists,provider == "aws",aws_is_org_root == true. Each failure returns the right ClientError code (400 for parse/UUID/non-aws/non-root, 404 for missing).credentials.ResolveAWSCredentialProvider; build anaws.Configfor the call.accounts.DiscoverOrgAccounts(ctx, cfg)(the existing function with full Organizations API pagination + tests atinternal/accounts/org_discovery_test.go). Injectable seam:Handler.discoverOrgFnfor tests; nil-default falls back to the real call in production.ListCloudAccounts(provider="aws"); skip member accounts whoseexternal_idalready exists.enabled=false+aws_auth_mode="bastion"+aws_bastion_id=<org-root.ID>and callCreateCloudAccount. The operator must review/approve and fill inaws_role_arnbefore the row participates in scheduled collection.{discovered, created, skipped}counts.Spec sections:
specs/multi-account-execution/acceptance.mdF-1, F-2, F-3.Why these defaults
enabled=false: gates the operator into reviewing each discovered account before the scheduler picks it up. Spec requirement.aws_auth_mode="bastion"+aws_bastion_id=<org-root>: signals the intended cred-resolution path (the org-root assumes a role in the member). The operator fills inaws_role_arnbefore flippingenabled=true. The scheduler silently skips disabled rows so a half-configured account never causes a runtime error.Createdexcludes skipped rows: dedupe is by(provider="aws", external_id), matching the spec's natural-key semantics. Re-running the endpoint is idempotent.Refactor for gocyclo
The handler logic split into four helpers to fit the project's gocyclo budget (≤10):
parseDiscoverOrgRoot— body decode + validatebuildOrgRootAWSConfig— credential resolution +aws.ConfigbuildrunOrgDiscovery—discoverOrgFndispatcherpersistDiscoveredMembers— dedupe + persist loopThe main
discoverOrgAccountsis now a 5-step coordinator.Tests
Six new test cases in
internal/api/handler_accounts_test.gocover the spec's acceptance dimensions:TestDiscoverOrgAccounts_RejectsInvalidBodyTestDiscoverOrgAccounts_RejectsInvalidAccountIDaccount_idthat isn't a UUIDTestDiscoverOrgAccounts_RejectsNonAWSAccountTestDiscoverOrgAccounts_RejectsNonOrgRootaws_is_org_root == falseTestDiscoverOrgAccounts_NotFoundTestDiscoverOrgAccounts_HappyPathDedupesAndPersistsenabled=false,aws_auth_mode="bastion",aws_bastion_id=<root.ID>MockConfigStoregainsListCloudAccountsFn+CreateCloudAccountFninjection points;fakeCredStoreis a minimalCredentialStoresatisfying access_keys mode without dragging in Postgres / Secrets Manager.TestRouter_discoverOrgAccountsHandler(routing-only smoke) updated to assert the dispatcher reaches the handler — now expects a 400 ClientError on the empty body it sends, instead of the priorNotNilresult from the stub.Verification
go build ./...cleango test ./...(full repo) — all greenTriage
type/feat,severity/high,urgency/this-sprint,impact/many,effort/m,priority/p1,triaged. Closes #208.Summary by CodeRabbit
New Features
Tests