perf(azure): batched SKU catalogue lookup eliminates N+1 in recommendation converters#81
Conversation
…ation converters The four Azure recommendation converters (compute, database, cache, cosmosdb) previously left SKU-derived `Details` fields empty because populating them inline would have triggered an N+1 SDK call per recommendation. This change wires three of them to a lazily-cached, single-shot SKU catalogue lookup, gated by `sync.Once` and held for the client's lifetime. Per-service: - **Cache** (`services/cache`): caches `Properties.ShardCount` per SKU from `armredis.Client.NewListBySubscriptionPager`. Converter populates `CacheDetails.Shards` for Premium-tier clustered caches. - **Cosmos** (`services/cosmosdb`): caches the dominant Cosmos APIType (mongodb / cassandra / gremlin / table / sql) across the subscription's accounts via `armcosmos.DatabaseAccountsClient.NewListPager`, mapping `account.Kind` + `Capabilities`. Converter populates `NoSQLDetails.APIType`. Multi-API-type subscriptions leave APIType empty rather than guessing. - **Database** (`services/database`): caches engine version per SKU from `armsql.CapabilitiesClient.ListByLocation`, traversing ServerVersion → Editions → ServiceLevelObjectives. Converter populates `DatabaseDetails.EngineVersion`. Each converter calls a per-service `cachedSKULookup` (or `cachedAPIType` on cosmos). The catalogue is fetched ONCE per client lifetime; many converter calls in the same `GetRecommendations` run hit the in-memory map. A failed catalogue fetch leaves the cache nil and converters fall back to the previous empty-Details behaviour with a one-time WARN log — the conversion itself never fails, preserving the graceful-degradation contract. **Compute scoped back**: `common.ComputeDetails` (in `pkg/common/types.go`) exposes only InstanceType/Platform/Tenancy/Scope. The two enrichments the SKU catalogue would supply for compute are vCPU and MemoryGB — neither has a struct field to write into. Adding fields to the shared type would touch all 3 cloud providers' converters, the frontend, and the matchers, well outside this perf change's scope. Documented as a narrower follow-up in `known_issues/10_azure_provider.md`. Tests per service in the existing `*_test.go`: - `_PopulatesShardsFromSKUCache` / `_PopulatesAPIType` / `_PopulatesEngineVersion` — cache hit populates the previously-deferred field. - `_PagerErrorFallsBack` / `_CapabilitiesErrorFallsBack` — fetch error leaves the field empty and conversion still succeeds. - `_FetchedOnce` — many converter calls share a single catalogue fetch (the N+1 invariant). - The pre-existing `_PopulatesAllFields` tests now inject empty mock pagers so they don't hit the real Azure API on first lookup. Refs #49
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 31 minutes and 52 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
Two follow-up GitHub issues filed for the deferrals this PR explicitly scopes back:
Both reference back here and propose mirroring the cachedAPIType / cachedSKULookup patterns this PR establishes. |
Adds optional VCPU (int) and MemoryGB (float64) fields to common.ComputeDetails so per-provider catalogue lookups can enrich compute recommendations with sizing data. Both fields use json omitempty — converters that don't yet wire a catalogue leave them at the zero value and the API payload stays clean. GetDetailDescription appends " (N vCPU / M GB)" when both values are > 0, otherwise returns the existing "platform/tenancy" form. Format uses %g so 16 GB renders as "16 GB" (not "16.000000 GB") while 0.5 GB SKUs render as "0.5 GB". Scope decision — schema-only foundation: - Azure: PR #81 (perf/azure-batched-sku-lookup) introduces the cachedSKULookup helper for cache/cosmos/database. Compute was scoped back from #81 because these fields didn't exist. With the schema in place, a follow-up PR can wire compute on top of #81 once it merges (filed as a separate issue post-merge to avoid an unmerged-PR dependency in this changeset). - AWS: ce.EC2InstanceDetails (Cost-Explorer) has no vCPU/Memory fields; populating would require a new ec2:DescribeInstanceTypes caller + cache + mocks. Substantial net-new code, tracked as a separate follow-up. - GCP: no compute recommendation converter exists today. N/A. - Frontend: api.Recommendation has no `details` field and the detail drawer renders a hard-coded list. Surfacing the new fields needs separate API + UI work; tracked as follow-up once at least one provider populates the fields. Tests: extend pkg/common/types_test.go::TestComputeDetails_GetDetail Description with 4 new table rows — VCPU-only, MemoryGB-only, integer-GB, and fractional-GB (Azure 0.5 GB SKU shape). Refs #82
Summary
Adds a lazy, single-shot SKU catalogue lookup to three of the four Azure recommendation converters (cache, cosmos, database) so the previously-deferred Details fields populate without N+1 SDK calls.
CacheDetails.Shardsfromarmredis.Properties.ShardCount.NoSQLDetails.APIType(sql / mongodb / cassandra / gremlin / table) from the dominant Cosmos accountKind+Capabilitiesin the subscription.DatabaseDetails.EngineVersionfromarmsql.CapabilitiesClient.ListByLocationtraversal.Each converter calls a per-service
cachedSKULookup(orcachedAPITypeon cosmos). The catalogue is fetched ONCE per client lifetime viasync.Once; many converter calls in the sameGetRecommendationsrun hit the in-memory map. A failed catalogue fetch leaves the cache nil and converters fall back to the previous empty-Details behaviour with a one-time WARN log — the conversion itself never fails.Compute scoped back:
common.ComputeDetails(inpkg/common/types.go) currently exposes only InstanceType / Platform / Tenancy / Scope. The two enrichments compute would supply (vCPU, MemoryGB) have no struct field to write into. Adding fields to a sharedpkg/commontype would touch all 3 cloud providers, the frontend, and the matchers — out of scope for this perf change. Tracked as a narrower follow-up inknown_issues/10_azure_provider.md.Test plan
New per-service tests in the existing
*_test.go:_PopulatesShardsFromSKUCache/_PopulatesAPIType(5 sub-cases) /_PopulatesEngineVersion— cache hit populates the previously-deferred field._PagerErrorFallsBack/_CapabilitiesErrorFallsBack— fetch error leaves the field empty and conversion still succeeds (graceful-degradation contract)._CachedSKULookup_FetchedOnce/_CachedAPIType_FetchedOnce— many converter calls share a single catalogue fetch (the N+1 invariant pinned in tests)._AmbiguousAPIType— multi-API-type Cosmos subscription leaves APIType empty rather than guessing._PopulatesAllFieldstests now inject empty mock pagers so they don't hit the real Azure API on first lookup.cd providers/azure && go test ./...— all green.go test ./...(root) — all green.go vet ./...— clean.Closes #49 for cache / cosmos / database. Compute follow-up filed as a narrower issue in the known-issues doc.