fix(deployment): cache cosmos REST fallback routes #3158
Conversation
The /akash/deployment/{version}/deployments/{info,list} routes mirror
chain REST and are hit at high frequency by external Cosmos tooling
polling for draining deployments. Each call queries the chain DB
through Sequelize and contributes to connection-pool saturation.
Data only changes when the indexer ingests a new block (~6s), so a
short Cache-Control (max-age=10, stale-while-revalidate=30) lets
browsers and any CDN safely deduplicate without staleness concerns.
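As an illustration of what those cache parameters produce, here is a hypothetical helper that builds the header string. The option names (`maxAge`, `staleWhileRevalidate`) follow the PR description; the actual framework wiring in the codebase may differ:

```typescript
// Illustrative only: builds the Cache-Control value the route options describe.
interface CacheOptions {
  maxAge: number; // seconds the response is considered fresh
  staleWhileRevalidate: number; // extra seconds a stale copy may be served while revalidating
}

function buildCacheControl({ maxAge, staleWhileRevalidate }: CacheOptions): string {
  return `public, max-age=${maxAge}, stale-while-revalidate=${staleWhileRevalidate}`;
}

// buildCacheControl({ maxAge: 10, staleWhileRevalidate: 30 })
//   → "public, max-age=10, stale-while-revalidate=30"
```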
📝 Walkthrough

Adds TTL-based caching to the fallback deployment reader (cached `findAll` via `cacheResponse`, memoized `findByOwnerAndDseq`) and sets HTTP cache parameters (maxAge=10, staleWhileRevalidate=30) on two database-fallback deployment routes.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #3158      +/-   ##
==========================================
- Coverage   62.82%   62.10%    -0.73%
==========================================
  Files        1066     1010       -56
  Lines       25286    24139     -1147
  Branches     6229     6040      -189
==========================================
- Hits        15887    14991      -896
+ Misses       8211     7980      -231
+ Partials     1188     1168       -20

This pull request uses carry forward flags.
Cache-Control alone does not deter the server-side bots that drive this traffic — they ignore the header. Memoize the service methods so identical queries within a block window are served from an in-process LRU and never reach Sequelize.
🧹 Nitpick comments (1)

apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts (1)

Lines 27-29: ⚡ Quick win: normalize the `findAll` cache key to avoid cache fragmentation. Using raw `JSON.stringify(params)` can split cache entries for equivalent requests (e.g., defaulted vs omitted fields, key-order differences), which reduces the hit rate on hot endpoints.

Proposed change:

```diff
 async findAll(params: DatabaseDeploymentListParams): Promise<RestAkashDeploymentListResponse> {
-  return cacheResponse(averageBlockTime, `FallbackDeploymentReaderService#findAll#${JSON.stringify(params)}`, () => this.findAllUncached(params));
+  const normalizedParams = Object.entries({
+    skip: params.skip ?? 0,
+    limit: params.limit ?? 100,
+    key: params.key ?? "",
+    countTotal: params.countTotal ?? true,
+    ...params
+  }).sort(([a], [b]) => a.localeCompare(b));
+
+  return cacheResponse(
+    averageBlockTime,
+    `FallbackDeploymentReaderService#findAll#${JSON.stringify(normalizedParams)}`,
+    () => this.findAllUncached(params)
+  );
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/deployment/services/fallback-deployment-reader/fallback-deployment-reader.service.ts` around lines 27 - 29, The cache key for FallbackDeploymentReaderService.findAll uses JSON.stringify(params) which fragments cache; normalize params before keying by creating a canonical representation (e.g., fill in omitted defaults from DatabaseDeploymentListParams, sort object keys or pick/serialize only the deterministic subset of fields used for the query) and use that normalized string in the cacheResponse key; update the call in findAll to compute normalizedParams (or a stableKeyFromParams helper) and use `FallbackDeploymentReaderService#findAll#${normalizedKey}` when calling cacheResponse so equivalent requests map to the same cache entry.
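The `stableKeyFromParams` helper named in the prompt does not exist in the codebase; it is a suggestion. A minimal sketch, assuming illustrative defaults of `skip=0` and `limit=100`, could look like this:

```typescript
// Hypothetical helper: produce one canonical cache key for equivalent
// param objects by filling defaults and sorting keys before serializing.
function stableKeyFromParams(params: Record<string, unknown>): string {
  const withDefaults: Record<string, unknown> = { skip: 0, limit: 100, ...params };
  const sortedEntries = Object.keys(withDefaults)
    .sort()
    .map(k => [k, withDefaults[k]]);
  return JSON.stringify(sortedEntries);
}
```

With this, an omitted field and its explicit default map to the same entry, and key order no longer matters.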
The comment thread below is anchored on this line of the diff:

```ts
@Memoize({ ttlInSeconds: averageBlockTime })
```
question: I'm not sure about the details of `@Memoize`, but the cache key looks highly variable; wouldn't that open another point of failure, such as OOM?
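One way to mitigate the OOM concern raised here is to bound the memo cache by entry count. The sketch below is illustrative; whether the actual `@Memoize` decorator evicts entries at all would need to be checked against its implementation:

```typescript
// Illustrative bounded LRU cache: a Map preserves insertion order, so the
// first key is always the least recently used once we re-insert on access.
class BoundedCache<K, V> {
  private map = new Map<K, V>();

  constructor(private maxEntries: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Re-insert to mark this entry as most recently used.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // Evict the oldest entry (first key in insertion order).
      this.map.delete(this.map.keys().next().value as K);
    }
  }
}
```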
Also, when the blockchain is available, we could probably just redirect the request to the blockchain API.
We also need to find out why the pool is exhausted:
- Is it because of a slow DB query?
- Is it because the request volume is too high?
This is for the fallback when the chain is unavailable; from what I've seen, it's bursts of queries. But yeah, this is not a permanent fix for the problem.
Why

`/akash/deployment/{version}/deployments/info` and `/list` mirror the Cosmos REST API but are served from console's chain DB by `FallbackDeploymentReaderService`. They exist so external Cosmos tooling can keep working against deployments whose state has been pruned from chain RPC nodes.

Logs show ~50K requests in 6 hours from a single Azure IP polling one specific draining deployment (`dseq=26735741`) with `userAgent: node`. Every call goes through Sequelize against the chain DB — the same DB that's saturated with 235+ active connections (see #3157).

This PR caches at two layers, since neither alone is sufficient: `Cache-Control` headers help browsers and any cooperating CDN, while the in-process caches cover clients that ignore the header.

What

- `cache: { maxAge: 10, staleWhileRevalidate: 30 }` on both fallback routes — emits `Cache-Control: public, max-age=10, stale-while-revalidate=30`.
- `FallbackDeploymentReaderService`:
  - `findByOwnerAndDseq`: `@Memoize({ ttlInSeconds: averageBlockTime })` (matches the existing pattern in `AkashBlockService`).
  - `findAll`: wrapped with `cacheResponse(averageBlockTime, ...)` keyed on the params (the `@Memoize` decorator only handles primitive args).

The `cacheResponse` helper has stale-while-revalidate semantics in-process and dedupes concurrent identical requests via `pendingRequests`, so a request burst still results in a single DB hit.
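A minimal sketch of the `pendingRequests` dedupe described above (not the project's actual `cacheResponse` implementation): concurrent callers for the same key share one in-flight promise, so a burst of identical requests produces a single DB hit.

```typescript
// Illustrative request-coalescing: while a fetch for a key is in flight,
// every caller for that key gets the same promise instead of a new fetch.
const pendingRequests = new Map<string, Promise<unknown>>();

function dedupe<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const pending = pendingRequests.get(key);
  if (pending) return pending as Promise<T>; // join the in-flight request
  const promise = fetcher().finally(() => pendingRequests.delete(key));
  pendingRequests.set(key, promise);
  return promise;
}
```

Once the promise settles, the entry is removed, so the next request after the burst starts a fresh fetch.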