⚡ Replace Scan in list_resources_with_defaults() with resource registry pattern #233

@sodre

Problem or Use Case

list_resources_with_defaults() in repository.py (line 893) performs a full DynamoDB table Scan with a FilterExpression of begins_with(PK, "RESOURCE#"). A Scan reads every item in the table; DynamoDB applies the filter only after the items have been read, so read capacity is consumed for the whole table no matter how few items match. As the table grows with entities, buckets, audit events, and usage snapshots, this operation becomes increasingly expensive in both cost and latency.

Current implementation:

async def list_resources_with_defaults(self) -> list[str]:
    # ...
    response = await client.scan(
        TableName=self.table_name,
        FilterExpression="begins_with(PK, :prefix)",
        ExpressionAttributeValues={":prefix": {"S": schema.RESOURCE_PREFIX}},
        ProjectionExpression="PK",
    )

This is called by RateLimiter.list_resources_with_defaults() and the CLI resource list command. For a table with 100K items but only 10 resources, the Scan reads all 100K items to find 10.
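To make the cost gap concrete, here is a rough back-of-the-envelope estimate. It is only a sketch: the 400-byte average item size and the 200-byte registry record are assumptions for illustration, not measurements from this table, and the function names are hypothetical. It relies on the fact that reads are billed in 4 KB units and that an eventually consistent read costs half a unit.

```python
import math

READ_UNIT_BYTES = 4096  # DynamoDB bills reads in 4 KB units


def scan_rcus(item_count: int, avg_item_bytes: int, consistent: bool = False) -> float:
    """Approximate RCUs for a full table Scan.

    The estimate is based on the total bytes read, not on how many
    items survive the FilterExpression.
    """
    units = math.ceil(item_count * avg_item_bytes / READ_UNIT_BYTES)
    return units * (1.0 if consistent else 0.5)


def get_item_rcus(item_bytes: int, consistent: bool = False) -> float:
    """Approximate RCUs for a single GetItem on an item of the given size."""
    units = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    return units * (1.0 if consistent else 0.5)


# Assumed sizes: 100K items averaging 400 bytes, registry record of 200 bytes.
print(scan_rcus(100_000, 400))   # eventually consistent Scan of the whole table
print(get_item_rcus(200))        # eventually consistent GetItem on the registry
```

Under these assumptions the Scan costs on the order of thousands of RCUs per call, while the registry read stays at a fraction of one.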

Proposed Solution

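Maintain a resource registry record (PK=SYSTEM#, SK=#RESOURCES) that tracks the set of known resource names. This replaces the Scan with a single GetItem, which costs 1 RCU for a strongly consistent read (0.5 RCU if eventually consistent) as long as the record stays under 4 KB.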

Schema

# Resource registry record (FLAT)
{
    "PK": "SYSTEM#",
    "SK": "#RESOURCES",
    "resources": {"gpt-4", "claude-3", "dall-e-3"},  # String Set (SS type)
}
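The key builders for this record could look roughly like the following. This is a sketch only: the constant and function names (SYSTEM_PREFIX, SK_RESOURCES, pk_system, sk_resources, resource_registry_key) are hypothetical and would need to follow whatever naming schema.py already uses. The key is expressed in the low-level attribute-value format, matching the client.scan call shown earlier.

```python
# Hypothetical constants; the real names in schema.py may differ.
SYSTEM_PREFIX = "SYSTEM#"
SK_RESOURCES = "#RESOURCES"


def pk_system() -> str:
    """Partition key shared by system-level records."""
    return SYSTEM_PREFIX


def sk_resources() -> str:
    """Sort key of the resource registry record."""
    return SK_RESOURCES


def resource_registry_key() -> dict[str, dict[str, str]]:
    """Primary key of the registry record in low-level DynamoDB format."""
    return {"PK": {"S": pk_system()}, "SK": {"S": sk_resources()}}
```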

Write Path

Update the registry when resources are added or removed:

  • set_resource_defaults(resource, limits) -- ADD the resource name to the registry set
  • delete_resource_defaults(resource) -- DELETE the resource name from the registry set

Use DynamoDB UpdateItem with ADD (for sets) and DELETE (for sets) operations, which are atomic and idempotent.
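As an illustration of the write path, the request parameters for both operations might be built like this. The helper name registry_update_kwargs is hypothetical, and the key values are hard-coded here only to keep the sketch self-contained. The attribute name is aliased as #r so the expression does not depend on whether "resources" collides with a reserved word.

```python
from typing import Any


def registry_update_kwargs(
    table_name: str, resource: str, *, remove: bool = False
) -> dict[str, Any]:
    """Build UpdateItem parameters that add or remove one resource name.

    ADD inserts the name into the string set (creating the record or the
    attribute if missing); DELETE removes it. Repeating either call with
    the same name leaves the set unchanged.
    """
    action = "DELETE" if remove else "ADD"
    return {
        "TableName": table_name,
        "Key": {"PK": {"S": "SYSTEM#"}, "SK": {"S": "#RESOURCES"}},
        "UpdateExpression": f"{action} #r :name",
        "ExpressionAttributeNames": {"#r": "resources"},
        # Set operations take a set operand, so the single name is wrapped in SS.
        "ExpressionAttributeValues": {":name": {"SS": [resource]}},
    }


# Usage (assuming an aioboto3-style client, as in the current implementation):
#   await client.update_item(**registry_update_kwargs(table, "gpt-4"))
#   await client.update_item(**registry_update_kwargs(table, "gpt-4", remove=True))
```

One detail worth keeping in mind: DynamoDB does not store empty sets, so removing the last name drops the resources attribute entirely. The read path should therefore treat a missing attribute the same as an empty registry.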

Read Path

list_resources_with_defaults() becomes a single GetItem on PK=SYSTEM#, SK=#RESOURCES, extracting the string set.
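A minimal sketch of that read, written as a free function so it stands alone. It assumes an aioboto3-style async client like the one used in the current implementation; the function name is hypothetical, and sorting the result is a choice made here for deterministic output rather than something the issue requires.

```python
from typing import Any


async def list_resources_from_registry(client: Any, table_name: str) -> list[str]:
    """Read the registry record with one GetItem and return the resource names."""
    response = await client.get_item(
        TableName=table_name,
        Key={"PK": {"S": "SYSTEM#"}, "SK": {"S": "#RESOURCES"}},
        ProjectionExpression="#r",
        ExpressionAttributeNames={"#r": "resources"},
    )
    # A missing record, or a record without the set attribute, both mean
    # "no resources registered" and yield an empty list.
    item = response.get("Item", {})
    return sorted(item.get("resources", {}).get("SS", []))
```

Handling the missing-record and missing-attribute cases in one place covers the "empty registry returns an empty list" criterion without a special branch.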

Migration / Backfill

Add a one-time backfill mechanism (e.g., a migration or CLI command) that scans existing RESOURCE# PKs and populates the registry record. After backfill, the Scan code path can be removed.
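One possible shape for the backfill is sketched below. It assumes that a resource name is simply the PK with the RESOURCE# prefix stripped, which should be checked against schema.py, and both function names are hypothetical. Because a single Scan call returns at most 1 MB of data, the loop follows LastEvaluatedKey until the table is exhausted.

```python
from typing import Any


async def collect_resource_names(
    client: Any, table_name: str, prefix: str = "RESOURCE#"
) -> set[str]:
    """Scan the whole table once and collect the distinct resource names."""
    names: set[str] = set()
    kwargs: dict[str, Any] = {
        "TableName": table_name,
        "FilterExpression": "begins_with(PK, :prefix)",
        "ExpressionAttributeValues": {":prefix": {"S": prefix}},
        "ProjectionExpression": "PK",
    }
    while True:
        page = await client.scan(**kwargs)
        for item in page.get("Items", []):
            # Assumption: PK has the form RESOURCE#<name>.
            names.add(item["PK"]["S"][len(prefix):])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return names
        kwargs["ExclusiveStartKey"] = last_key


async def backfill_registry(client: Any, table_name: str) -> set[str]:
    """Populate the registry record from existing RESOURCE# items."""
    names = await collect_resource_names(client, table_name)
    if names:  # ADD needs a non-empty set operand
        await client.update_item(
            TableName=table_name,
            Key={"PK": {"S": "SYSTEM#"}, "SK": {"S": "#RESOURCES"}},
            UpdateExpression="ADD #r :names",
            ExpressionAttributeNames={"#r": "resources"},
            ExpressionAttributeValues={":names": {"SS": sorted(names)}},
        )
    return names
```

Using ADD rather than overwriting the attribute means the backfill can be re-run safely and will not clobber names registered by concurrent writers.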

Alternatives Considered

  1. GSI on PK prefix -- Adding a GSI for RESOURCE# prefix queries. More expensive (GSI storage + write throughput) and over-engineered for listing unique resource names.
  2. Query with begins_with on PK -- Not possible; a DynamoDB Query requires an equality condition on the partition key, and begins_with is only supported on the sort key.
  3. Keep Scan with smaller ProjectionExpression -- Still reads every item; only reduces data transfer, not RCU consumption.

Acceptance Criteria

  • schema.py defines key builders for the resource registry record (PK=SYSTEM#, SK=#RESOURCES)
  • set_resource_defaults() atomically adds the resource name to the registry set via UpdateItem with ADD
  • delete_resource_defaults() atomically removes the resource name from the registry set via UpdateItem with DELETE
  • list_resources_with_defaults() reads the registry record via GetItem instead of Scan
  • Unit tests verify registry is updated on set/delete of resource defaults
  • Unit test verifies list_resources_with_defaults() returns resources from the registry record
  • Unit test verifies empty registry returns an empty list
  • CLAUDE.md "DynamoDB Access Patterns" table updated with the resource registry access pattern
  • Existing Scan code path removed (no fallback to Scan)

    Labels

    area/limiter (Core rate limiting logic), performance (Performance optimization)
