fix(deployment): cache cosmos REST fallback routes #3158

Open
baktun14 wants to merge 2 commits into main from fix/deployment-cache-cosmos-fallback-routes

@baktun14 baktun14 commented May 8, 2026

Why

/akash/deployment/{version}/deployments/info and /list mirror the Cosmos REST API but are served from console's chain DB by FallbackDeploymentReaderService. They exist so external Cosmos tooling can keep working against deployments whose state has been pruned from chain RPC nodes.

Logs show ~50K requests in 6 hours from a single Azure IP polling one specific draining deployment (dseq=26735741) with userAgent: node. Every call goes through Sequelize against the chain DB — the same DB that's saturated with 235+ active connections (see #3157).

This PR caches at two layers, since neither alone is sufficient:

  • Cache-Control headers help browsers and any cooperating CDN.
  • In-process memoization is what actually defeats server-side bots that ignore HTTP cache headers — identical queries within a block window are served from an in-memory LRU and never reach Sequelize.

What

  1. Route-level cache: { maxAge: 10, staleWhileRevalidate: 30 } on both fallback routes — emits Cache-Control: public, max-age=10, stale-while-revalidate=30.
  2. Service-level memoization in FallbackDeploymentReaderService:
    • findByOwnerAndDseq: @Memoize({ ttlInSeconds: averageBlockTime }) (matches the existing pattern in AkashBlockService).
    • findAll: wrapped with cacheResponse(averageBlockTime, ...) keyed on the params (the @Memoize decorator only handles primitive args).

The cacheResponse helper has stale-while-revalidate semantics in-process and dedupes concurrent identical requests via pendingRequests, so a request burst still results in a single DB hit.
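
The repository's actual cacheResponse implementation is not shown in this thread, so the following is only a minimal sketch of the behavior described above: a fresh hit is served from memory, a stale hit is served immediately while one background refresh runs, and concurrent misses for the same key share one in-flight promise. The names cacheResponseSketch, CachedValue, refresh, and the injectable now parameter are hypothetical; pendingRequests is named after the map mentioned above.

```typescript
type CachedValue<T> = { value: T; storedAt: number };

const responseCache = new Map<string, CachedValue<unknown>>();
const pendingRequests = new Map<string, Promise<unknown>>();

// Starts a fetch for `key` unless one is already in flight, in which case
// the existing promise is returned so callers share a single DB hit.
function refresh<T>(key: string, fetcher: () => Promise<T>, now: () => number): Promise<T> {
  const inFlight = pendingRequests.get(key) as Promise<T> | undefined;
  if (inFlight) return inFlight;

  const request = fetcher().then(
    value => {
      pendingRequests.delete(key);
      responseCache.set(key, { value, storedAt: now() });
      return value;
    },
    error => {
      pendingRequests.delete(key);
      throw error;
    }
  );
  pendingRequests.set(key, request);
  return request;
}

// Hypothetical stand-in for the real helper, with the same argument order
// as the call quoted in this PR: (ttlInSeconds, key, fetcher).
function cacheResponseSketch<T>(
  ttlInSeconds: number,
  key: string,
  fetcher: () => Promise<T>,
  now: () => number = Date.now
): Promise<T> {
  const entry = responseCache.get(key) as CachedValue<T> | undefined;
  if (!entry) return refresh(key, fetcher, now);

  const isStale = now() - entry.storedAt > ttlInSeconds * 1000;
  if (isStale) {
    // Serve the stale value now and refresh in the background.
    // A failed refresh keeps the stale entry and is retried on the next call.
    refresh(key, fetcher, now).catch(() => undefined);
  }
  return Promise.resolve(entry.value);
}
```

Note that this sketch never evicts entries, so a real helper would also need a size bound or an expiry sweep.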

Summary by CodeRabbit

  • Performance Improvements
    • Enhanced HTTP caching for deployment data endpoints
    • Optimized repeated deployment queries for faster response times

The /akash/deployment/{version}/deployments/{info,list} routes mirror
chain REST and are hit at high frequency by external Cosmos tooling
polling for draining deployments. Each call queries the chain DB
through Sequelize and contributes to connection-pool saturation.

Data only changes when the indexer ingests a new block (~6s), so a
short Cache-Control (max-age=10, stale-while-revalidate=30) lets
browsers and any CDN safely deduplicate without staleness concerns.
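
For reference, the mapping from the route option to the header quoted in the PR description can be sketched as follows. This is not the router's actual code; RouteCacheOptions and toCacheControl are hypothetical names, and the `public` directive is taken from the header string in the description.

```typescript
interface RouteCacheOptions {
  maxAge: number;
  staleWhileRevalidate?: number;
}

// Builds a Cache-Control value from the route's cache options.
function toCacheControl({ maxAge, staleWhileRevalidate }: RouteCacheOptions): string {
  const directives = ["public", `max-age=${maxAge}`];
  if (staleWhileRevalidate !== undefined) {
    directives.push(`stale-while-revalidate=${staleWhileRevalidate}`);
  }
  return directives.join(", ");
}

const cacheControl = toCacheControl({ maxAge: 10, staleWhileRevalidate: 30 });
// cacheControl === "public, max-age=10, stale-while-revalidate=30"
```

With these values a cooperating cache may reuse a response for 10 seconds, then serve it stale for up to 30 more seconds while it revalidates in the background.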

coderabbitai Bot commented May 8, 2026

Walkthrough

Adds TTL-based caching to the fallback deployment reader (cached findAll via cacheResponse, memoized findByOwnerAndDseq) and sets HTTP cache parameters (maxAge=10, staleWhileRevalidate=30) on two database-fallback deployment routes.

Changes

Deployment Cache Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Service imports / TTL: apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts | Adds cacheResponse, @Memoize, and imports averageBlockTime used as TTL. |
| findAll cache wrapper: apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts | findAll now returns cacheResponse(...) delegating work to a private findAllUncached. |
| findAll uncached implementation: apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts | Original findAll body moved into findAllUncached (pagination and transform logic unchanged). |
| findByOwnerAndDseq memoization: apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts | findByOwnerAndDseq annotated with @Memoize({ ttlInSeconds: averageBlockTime }). |
| Route cache configuration: apps/api/src/deployment/routes/deployments/deployments.router.ts | fallbackListRoute and fallbackInfoRoute updated to cache.maxAge: 10 and cache.staleWhileRevalidate: 30. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


codecov Bot commented May 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.10%. Comparing base (ed84f34) to head (4b9317f).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3158      +/-   ##
==========================================
- Coverage   62.82%   62.10%   -0.73%     
==========================================
  Files        1066     1010      -56     
  Lines       25286    24139    -1147     
  Branches     6229     6040     -189     
==========================================
- Hits        15887    14991     -896     
+ Misses       8211     7980     -231     
+ Partials     1188     1168      -20     
| Flag | Coverage Δ | *Carryforward flag |
| --- | --- | --- |
| api | 84.19% <100.00%> (+0.02%) ⬆️ | |
| deploy-web | 46.34% <ø> (ø) | Carriedforward from e1c2ad7 |
| log-collector | ? | |
| notifications | 90.70% <ø> (ø) | Carriedforward from e1c2ad7 |
| provider-console | 81.48% <ø> (ø) | Carriedforward from e1c2ad7 |
| provider-inventory | ? | |
| provider-proxy | 85.21% <ø> (ø) | Carriedforward from e1c2ad7 |
| tx-signer | ? | |

*This pull request uses carry forward flags.

Files with missing lines Coverage Δ
...yment-reader/fallback-deployment-reader.service.ts 90.36% <100.00%> (+0.23%) ⬆️

... and 57 files with indirect coverage changes


Cache-Control alone does not deter the server-side bots that drive
this traffic — they ignore the header. Memoize the service methods
so identical queries within a block window are served from an
in-process LRU and never reach Sequelize.
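
The internals of the repository's @Memoize decorator are not shown here, so the following is only a sketch of the general technique this commit message describes: TTL-based memoization keyed on primitive arguments, with a size-bounded LRU so the cache cannot grow without limit. It is written as a plain function wrapper rather than a decorator, and memoizeWithTtl, maxEntries, and the injectable now parameter are all hypothetical.

```typescript
function memoizeWithTtl<A extends Array<string | number | boolean>, R>(
  fn: (...args: A) => R,
  ttlInSeconds: number,
  maxEntries = 1000,
  now: () => number = Date.now
): (...args: A) => R {
  // Map preserves insertion order, so the first key is the least recently used.
  const entries = new Map<string, { value: R; expiresAt: number }>();

  return (...args: A): R => {
    const key = JSON.stringify(args);
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) {
      // Re-insert to mark this key as most recently used.
      entries.delete(key);
      entries.set(key, hit);
      return hit.value;
    }

    const value = fn(...args);
    entries.delete(key);
    entries.set(key, { value, expiresAt: now() + ttlInSeconds * 1000 });
    if (entries.size > maxEntries) {
      // Evict the least recently used entry.
      const oldest = entries.keys().next().value as string;
      entries.delete(oldest);
    }
    return value;
  };
}
```

If fn returns a promise, this sketch caches the promise itself, so concurrent callers share it, but a rejected promise also stays cached until its TTL expires. A production version would likely drop failed entries.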
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts (1)

27-29: ⚡ Quick win

Normalize the findAll cache key to avoid cache fragmentation.

Using raw JSON.stringify(params) can split cache entries for equivalent requests (e.g., defaulted vs omitted fields, key-order differences), which reduces hit rate on hot endpoints.

Proposed change
 async findAll(params: DatabaseDeploymentListParams): Promise<RestAkashDeploymentListResponse> {
-  return cacheResponse(averageBlockTime, `FallbackDeploymentReaderService#findAll#${JSON.stringify(params)}`, () => this.findAllUncached(params));
+  const normalizedParams = Object.entries({
+    ...params,
+    skip: params.skip ?? 0,
+    limit: params.limit ?? 100,
+    key: params.key ?? "",
+    countTotal: params.countTotal ?? true
+  }).sort(([a], [b]) => a.localeCompare(b));
+
+  return cacheResponse(
+    averageBlockTime,
+    `FallbackDeploymentReaderService#findAll#${JSON.stringify(normalizedParams)}`,
+    () => this.findAllUncached(params)
+  );
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts`
around lines 27 - 29, The cache key for FallbackDeploymentReaderService.findAll
uses JSON.stringify(params) which fragments cache; normalize params before
keying by creating a canonical representation (e.g., fill in omitted defaults
from DatabaseDeploymentListParams, sort object keys or pick/serialize only the
deterministic subset of fields used for the query) and use that normalized
string in the cacheResponse key; update the call in findAll to compute
normalizedParams (or a stableKeyFromParams helper) and use
`FallbackDeploymentReaderService#findAll#${normalizedKey}` when calling
cacheResponse so equivalent requests map to the same cache entry.
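
One possible shape for the stableKeyFromParams helper suggested above is sketched here. It is an illustration only: the KeyParams type, the defaults argument, and the field values in the usage lines are assumptions, since the real DatabaseDeploymentListParams definition is not shown in this thread.

```typescript
type KeyParams = Record<string, string | number | boolean | undefined>;

// Produces the same string for equivalent requests regardless of key order
// and regardless of whether a defaulted field was omitted or left undefined.
function stableKeyFromParams(params: KeyParams, defaults: KeyParams = {}): string {
  const merged: KeyParams = { ...defaults };
  for (const name of Object.keys(params)) {
    // Explicit undefined falls back to the default instead of overriding it.
    if (params[name] !== undefined) merged[name] = params[name];
  }
  const sortedPairs = Object.keys(merged)
    .filter(name => merged[name] !== undefined)
    .sort()
    .map(name => [name, merged[name]]);
  return JSON.stringify(sortedPairs);
}

const listDefaults = { skip: 0, limit: 100 };
const keyA = stableKeyFromParams({ limit: 100, owner: "akash1abc" }, listDefaults);
const keyB = stableKeyFromParams({ owner: "akash1abc", skip: 0, limit: undefined }, listDefaults);
// keyA === keyB === '[["limit",100],["owner","akash1abc"],["skip",0]]'
```

Serializing sorted key/value pairs, rather than the object itself, keeps the result independent of property insertion order.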
📥 Commits

Reviewing files that changed from the base of the PR and between e1c2ad7 and 4b9317f.

📒 Files selected for processing (1)
  • apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts

};
}

@Memoize({ ttlInSeconds: averageBlockTime })
Contributor

question: I'm not sure about the details of @Memoize, but the cache key seems highly variable. Wouldn't that open another point of failure, such as OOM?

Contributor

And when the blockchain is available, we could probably just redirect the request to the blockchain API.

Contributor

Also, we need to find out why the pool is exhausted:

  • Is it because of slow DB queries?
  • Is it because the number of requests is too high?

Contributor Author

This is for the fallback when the chain is unavailable. From what I've seen, the cause is bursts of queries. But yes, this is not a permanent fix for the problem.
