Skip to content

feat(aws/recs): populate ComputeDetails.VCPU + MemoryGB via DescribeInstanceTypes catalogue #218

@cristim

Description

@cristim

Background

PR #97 (merged) added VCPU int + MemoryGB float64 to common.ComputeDetails. The AWS RI parser at providers/aws/recommendations/parser_ri.go does not populate them today and PR #97's scope decision deferred it:

AWS: ce.EC2InstanceDetails (Cost-Explorer) has no vCPU/Memory fields. Populating would require a new ec2:DescribeInstanceTypes caller + region-scoped catalogue cache + mocks/tests — substantial net-new code, separate follow-up.

This is the follow-up.

What needs to happen

Add an AWS-side instance-type catalogue and use it to populate ComputeDetails.VCPU + MemoryGB on EC2 reservation recommendations.

Sketch

  1. New caller: a providers/aws/services/ec2/sku.go (or extend the existing EC2 service client) that wraps ec2:DescribeInstanceTypes. Region-scoped — instance type availability can vary by region, but the vCPU/Memory shape is uniform; we can hit any region the caller already has a session for.
  2. Catalogue cache: a per-(region, instance-type) cache so a single recommendations refresh batches lookups instead of hitting DescribeInstanceTypes once per rec. Mirror the shape of the Azure cachedSKULookup helper (PR perf(azure): batched SKU catalogue lookup eliminates N+1 in recommendation converters #81) where reasonable.
  3. Wire into the parser: parser_ri.go::parseServiceSpecificDetails (or its EC2 branch) consults the cache when building ComputeDetails and populates VCPU from InstanceTypeInfo.VCpuInfo.DefaultVCpus and MemoryGB from InstanceTypeInfo.MemoryInfo.SizeInMiB / 1024.
  4. IAM: the deploy SAs already have ec2:Describe* in most environments; verify the per-cloud ci-cd-permissions/ modules don't need a narrowing change. If the runtime SA's IAM is tighter, add ec2:DescribeInstanceTypes explicitly to its policy.
  5. Tests: a fake EC2 client returning a known DescribeInstanceTypes payload + assertions that the rec ends up with the right VCPU/MemoryGB. Existing parser tests stay valid (none reference the new fields today).

Caching considerations

  • TTL: instance-type metadata changes only when AWS launches new families. A multi-day TTL is fine; the recommendations refresh runs daily-at-2-AM (per recommendation_schedule = "0 2 * * *"), so a 24h TTL keeps the cache warm across one cycle without staleness risk.
  • Negative cache: a region that doesn't offer a given instance type returns an empty result. Cache the negative so we don't re-query each refresh.
  • Memory bound: ~700 instance types × ~30 regions = ~21k entries worst case at a few hundred bytes each — under 10 MB. Fine to keep in-process.

Acceptance

  • AWS EC2 recommendations populate ComputeDetails.VCPU + MemoryGB.
  • A single recommendations refresh issues at most one DescribeInstanceTypes per (region, instance-type) — verified by a counter on the fake client in tests.
  • IAM is documented (either "already in scope" or "this PR adds it").
  • Existing AWS parser tests still pass.

References

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions