Skip to content

feat: add api keys support for ingestion/query authentication#1620

Merged
nikhilsinhaparseable merged 7 commits intoparseablehq:mainfrom
nikhilsinhaparseable:apikey
Apr 22, 2026
Merged

feat: add api keys support for ingestion/query authentication#1620
nikhilsinhaparseable merged 7 commits intoparseablehq:mainfrom
nikhilsinhaparseable:apikey

Conversation

@nikhilsinhaparseable
Copy link
Copy Markdown
Contributor

@nikhilsinhaparseable nikhilsinhaparseable commented Apr 13, 2026

use api keys as an alternative auth mechanism for ingestion
use header X-API-KEY in place of basic auth
middleware validates the header on ingest action
key store in object store at .settings/apikeys/.json

Summary by CodeRabbit

  • New Features
    • Introduced API key management: create, list (masked), delete, and validate keys.
    • Added API-key authentication for ingestion and query via X-API-KEY (can bypass session flow when applicable).
    • Query keys create short-lived query sessions with scoped permissions.
    • API keys are tenant-scoped with a default-tenant fallback and persisted to storage.
    • Ingestion and query requests with API keys enforce tenant suspension checks.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 13, 2026

Walkthrough

Adds a tenant-partitioned in-memory API key store with metastore persistence and object-store layout, integrates X-API-KEY handling into HTTP middleware (including Kinesis header extraction and short-lived query sessions), extends Metastore trait/impl for apikey CRUD, exposes apikeys module, and adds uuid dependency.

Changes

Cohort / File(s) Summary
Dependencies
Cargo.toml
Added uuid = { version = "1", features = ["v4"] }.
API Key Core Module
src/apikeys.rs
New module: ApiKey models, ApiKeyStore global (API_KEYS) using tenant-partitioned RwLock<HashMap<...>>, async load/create/delete/list/get/validate/sync methods, ApiKeyError, MetastoreObject impl, and masked list entries.
Auth Middleware
src/handlers/http/middleware.rs
Refactored Kinesis header parsing into extract_kinesis_headers; added X-API-KEY short-circuit path that snapshots tenant header, checks suspension (check_suspension_for_tenant), validates keys via API_KEYS.validate_key, and for KeyType::Query creates a short-lived session id tracked in mut_sessions() with Drop-based cleanup.
Module Export
src/lib.rs
Exported pub mod apikeys;.
Metastore Trait
src/metastore/metastore_traits.rs
Added async trait methods: get_api_keys() -> Result<HashMap<String, Vec<ApiKey>>, MetastoreError>, put_api_key(obj, tenant_id), and delete_api_key(obj, tenant_id).
ObjectStore Metastore
src/metastore/metastores/object_store_metastore.rs
Implemented get_api_keys (per-tenant listing + JSON parse of apikeys/*.json), put_api_key (write object), and delete_api_key (delete object).
Storage Constants & Helpers
src/storage/mod.rs, src/storage/object_storage.rs
Added APIKEYS_ROOT_DIRECTORY: &str = "apikeys" and pub fn apikey_json_path(key_id, tenant_id) -> RelativePathBuf helper for object-storage paths.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Middleware as AuthMiddleware
    participant ApiKeyStore as API_KEYS
    participant Metastore
    participant Storage

    Client->>Middleware: HTTP request (may include X-API-KEY / Kinesis header)
    Middleware->>Middleware: extract_kinesis_headers (if ingest)
    Middleware->>Middleware: read tenant header & x-api-key
    alt x-api-key present and action maps to KeyType
        Middleware->>ApiKeyStore: validate_key(api_key, tenant, required_type)
        ApiKeyStore->>ApiKeyStore: lookup in tenant-partitioned RwLock
        alt Key valid
            ApiKeyStore-->>Middleware: valid
            Middleware->>Middleware: create short-lived session if KeyType::Query
            Middleware-->>Client: forward request to service
        else Key invalid
            ApiKeyStore-->>Middleware: invalid
            Middleware-->>Client: 401/403 response
        end
    else no x-api-key
        Middleware->>Middleware: normal session extraction & RBAC
    end

    Note over ApiKeyStore,Metastore: bootstrap/sync flow
    ApiKeyStore->>Metastore: get_api_keys()
    Metastore->>Storage: list/read SETTINGS_ROOT/apikeys/*.json per-tenant
    Storage-->>Metastore: ApiKey JSON blobs
    Metastore-->>ApiKeyStore: HashMap<tenant, Vec<ApiKey>>
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • parmesant
  • nitisht

Poem

🐰 I hopped through keys both new and bright,
Tenants tucked in neat and tight,
X-API-KEY opens the gate,
RwLock guards each tiny state,
A rabbit cheers — the services take flight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request description is incomplete. It lacks the structured sections from the template including an issue reference (Fixes #XXXX), a proper Description section with goals and rationale, and the checklist items. Add a proper description following the template: include issue reference, detailed description of the goal/solution/changes, and complete the testing and documentation checklist items.
✅ Passed checks (4 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title accurately describes the main change: adding API key support for both ingestion and query authentication, which aligns with the substantial changes across apikeys module, middleware auth flow, and metastore integration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/apikeys.rs`:
- Around line 42-53: The ApiKey struct currently persists plaintext secrets
(field api_key) and validate_key() compares plaintext; change storage to persist
only a keyed hash/digest (e.g., api_key_hash) and remove plaintext from
serialization (make plaintext returned only at creation), update ApiKey struct
(replace or annotate api_key -> api_key_hash and adjust serde to not expose
secret), update the code paths that create and store keys to compute and save
the derived hash while returning the plaintext once, and modify validate_key()
to compute the same digest for the presented key and compare against
api_key_hash (constant-time compare); apply the same changes to the other
locations that persist or compare keys (the creation/storage logic and any
validation/auth functions referenced in the diff).
- Around line 121-129: When loading keys in get_api_keys(), don't overwrite
default-tenant keys with Some(DEFAULT_TENANT); preserve tenant == None so
apikey_json_path() continues to generate the legacy path
(".settings/apikeys/..."). Locate the loop that iterates (tenant_id, keys) and,
instead of unconditionally setting k.tenant = Some(tenant_id.clone()), set
k.tenant = None when tenant_id == DEFAULT_TENANT (otherwise set
Some(tenant_id.clone())). This keeps delete/update paths consistent with
apikey_json_path() for single-tenant/default entries.
- Around line 33-35: API_KEYS is never hydrated at startup; call
API_KEYS.load().await as part of the concurrent initialization sequence that
currently runs load_on_init() for other global stores so persisted keys are
available after restart. Edit the async startup loader where the other globals
are awaited (the block that triggers load_on_init() for CORRELATIONS, FILTERS,
DASHBOARDS, ALERTS, TARGETS) and add API_KEYS.load().await to that set of
concurrent tasks (referencing the static API_KEYS and its method load() on
ApiKeyStore) so the cache is populated from persistent storage during server
startup.

In `@src/handlers/http/middleware.rs`:
- Around line 194-217: The API-key branch currently validates the key with
API_KEYS.validate_key(...) but then returns the service future without running
the normal authorization flow, bypassing self.auth_method (which ultimately
calls Users.authorize(...)); after a successful validate_key(), invoke
self.auth_method (or the same authorization helper used for other branches) for
the request/resource (Action::Ingest) and only proceed to await fut if that
authorization succeeds, otherwise return the appropriate error (e.g.,
unauthorized/forbidden); keep the existing header_error and suspension checks
before validation and ensure any auth call uses the same request/tenant context
as the other paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e82a3074-a0eb-4c16-8ce5-111e492a771d

📥 Commits

Reviewing files that changed from the base of the PR and between eacb1b9 and 77e2894.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml
  • src/apikeys.rs
  • src/handlers/http/middleware.rs
  • src/lib.rs
  • src/metastore/metastore_traits.rs
  • src/metastore/metastores/object_store_metastore.rs
  • src/storage/mod.rs
  • src/storage/object_storage.rs

Comment thread src/apikeys.rs
Comment thread src/apikeys.rs
Comment thread src/apikeys.rs
Comment thread src/handlers/http/middleware.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/apikeys.rs (1)

234-249: Consider constant-time comparison for API key validation.

The current comparison uses == (lines 239, 245), which is susceptible to timing attacks. While network jitter typically masks timing differences and the high entropy of UUIDv4 keys makes exploitation difficult in practice, using constant-time comparison is the security best practice for credential validation.

♻️ Proposed fix using constant-time comparison

Add subtle to dependencies:

subtle = "2"

Then update the comparison:

+use subtle::ConstantTimeEq;
+
 pub async fn validate_key(&self, api_key_value: &str, tenant_id: &Option<String>) -> bool {
     let map = self.keys.read().await;
     if let Some(tenant_id) = tenant_id {
         // Multi-tenant: check keys for the specific tenant
         if let Some(tenant_keys) = map.get(tenant_id) {
-            return tenant_keys.values().any(|k| k.api_key == api_key_value);
+            return tenant_keys.values().any(|k| {
+                k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).into()
+            });
         }
         false
     } else {
         // Single-tenant: check keys under DEFAULT_TENANT
         if let Some(tenant_keys) = map.get(DEFAULT_TENANT) {
-            return tenant_keys.values().any(|k| k.api_key == api_key_value);
+            return tenant_keys.values().any(|k| {
+                k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).into()
+            });
         }
         false
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/apikeys.rs` around lines 234 - 249, In validate_key, replace the plain
equality checks of k.api_key == api_key_value with a constant-time comparison
using the subtle crate: add subtle = "2" to Cargo.toml, import
subtle::ConstantTimeEq, and use
k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).unwrap_u8() == 1 (or
equivalent API) when iterating tenant_keys.values().any(...); update both places
in the validate_key method to use this constant-time byte comparison instead of
== to mitigate timing attacks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/apikeys.rs`:
- Around line 234-249: In validate_key, replace the plain equality checks of
k.api_key == api_key_value with a constant-time comparison using the subtle
crate: add subtle = "2" to Cargo.toml, import subtle::ConstantTimeEq, and use
k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).unwrap_u8() == 1 (or
equivalent API) when iterating tenant_keys.values().any(...); update both places
in the validate_key method to use this constant-time byte comparison instead of
== to mitigate timing attacks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ed5dd247-c054-4a2d-ac96-87f7fb53d2dc

📥 Commits

Reviewing files that changed from the base of the PR and between 5359317 and fc7e4a8.

📒 Files selected for processing (2)
  • src/apikeys.rs
  • src/handlers/http/middleware.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 13, 2026
use api keys as an alternative auth mechanism for ingestion
use header `X-API-KEY` in place of basic auth
middleware validates the header on ingest action
key store in object store at .settings/apikeys/<key-id>.json
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/apikeys.rs`:
- Around line 168-180: The code currently holds the global write lock
(self.keys.write().await) across the awaited metastore call
(PARSEABLE.metastore.put_api_key), causing long blocking; change this to do the
duplicate-name check under the global lock, make a short-lived reservation
(e.g., insert a placeholder entry or per-tenant reservation key in tenant_keys)
or acquire a per-tenant lock, then drop the global write guard before awaiting
put_api_key(&api_key, &api_key.tenant). After put_api_key completes, reacquire
the global write lock and replace the placeholder with the real api_key (or
remove the reservation on error), returning
ApiKeyError::DuplicateKeyName(api_key.key_name) as before when duplicate
detected. Ensure any cleanup on metastore failure removes the reservation to
avoid dead reservations.

In `@src/handlers/http/middleware.rs`:
- Around line 157-178: The tenant header is being overwritten by
get_user_and_tenant, so in the API-key short-circuit branch capture the incoming
tenant header before calling get_user_and_tenant and use that captured value for
tenant_scoped checks (e.g., for suspension via check_suspension and for
validating API key scope) instead of reading req.headers() after
get_user_and_tenant; specifically, read TENANT_ID into a local tenant_id_before
(Option<String>) before invoking get_user_and_tenant, then use tenant_id_before
in the api-key branch where api_key_type_for_action, extract_api_key, and
check_suspension are used.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 16ad85d2-9099-4462-8af9-9c071809b95e

📥 Commits

Reviewing files that changed from the base of the PR and between fc7e4a8 and 63b62c8.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml
  • src/apikeys.rs
  • src/handlers/http/middleware.rs
  • src/lib.rs
  • src/metastore/metastore_traits.rs
  • src/metastore/metastores/object_store_metastore.rs
  • src/storage/mod.rs
  • src/storage/object_storage.rs
✅ Files skipped from review due to trivial changes (3)
  • src/storage/mod.rs
  • Cargo.toml
  • src/lib.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/metastore/metastore_traits.rs

Comment thread src/apikeys.rs Outdated
Comment thread src/handlers/http/middleware.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/apikeys.rs (1)

274-286: Index API keys by secret on the auth path.

validate_key() does a full scan of every key in the tenant on each request. Since this now sits on ingest/query authentication, the cost grows linearly with key count and can become a tenant-local bottleneck. A second in-memory index keyed by the API-key value (or a derived hash) would keep validation O(1) while preserving the current key_id map for CRUD/listing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/apikeys.rs` around lines 274 - 286, validate_key currently scans all
tenant keys (self.keys) which is O(n); add a secondary in-memory index (e.g.,
self.keys_by_secret or self.key_secret_index) keyed by the API secret string or
its derived hash to allow O(1) lookup per tenant. Change validate_key to first
compute tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT) and then check
the new index for an entry matching api_key_value and required_type instead of
iterating tenant_keys.values().any(...). Ensure all CRUD paths that modify
self.keys (create/update/delete functions that reference key_id/key maps) also
update the new keys_by_secret index under the same lock semantics (write lock on
mutation, read lock on validation), and consider storing a hash instead of raw
secret if you want to avoid duplicating secret strings in memory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/handlers/http/middleware.rs`:
- Around line 185-223: The code currently creates a temporary query session
(mut_sessions().track_new(...), SessionKey::SessionId, and
req.extensions_mut().insert(session_key)) before performing header_error,
suspension checks and API_KEYS.validate_key(...).await, causing sessions to be
created for failed requests; move the session creation and insertion (the
track_new call and req.extensions_mut().insert) to after the checks (after
header_error/suspension handling and after API_KEYS.validate_key(...) returns
true) so the session is only created when the API key is validated, keeping
SessionCleanupGuard(query_session_id) logic intact to still clean up the session
if created.

---

Nitpick comments:
In `@src/apikeys.rs`:
- Around line 274-286: validate_key currently scans all tenant keys (self.keys)
which is O(n); add a secondary in-memory index (e.g., self.keys_by_secret or
self.key_secret_index) keyed by the API secret string or its derived hash to
allow O(1) lookup per tenant. Change validate_key to first compute tenant =
tenant_id.as_deref().unwrap_or(DEFAULT_TENANT) and then check the new index for
an entry matching api_key_value and required_type instead of iterating
tenant_keys.values().any(...). Ensure all CRUD paths that modify self.keys
(create/update/delete functions that reference key_id/key maps) also update the
new keys_by_secret index under the same lock semantics (write lock on mutation,
read lock on validation), and consider storing a hash instead of raw secret if
you want to avoid duplicating secret strings in memory.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c3abb88d-0abf-43b5-8bb8-347ab05cf8c5

📥 Commits

Reviewing files that changed from the base of the PR and between 63b62c8 and 72286a2.

📒 Files selected for processing (2)
  • src/apikeys.rs
  • src/handlers/http/middleware.rs

Comment thread src/handlers/http/middleware.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/apikeys.rs (1)

274-288: Consider constant-time comparison for API key validation.

The string comparison k.api_key == api_key_value is not constant-time, which could theoretically leak timing information about the API key. While practical exploitation is difficult (network jitter dominates, and the any() iterator already introduces variable timing), using constant-time comparison is a defense-in-depth best practice for secret validation.

🔒 Proposed fix using subtle crate

Add subtle to dependencies, then:

+use subtle::ConstantTimeEq;
+
 pub async fn validate_key(
     &self,
     api_key_value: &str,
     tenant_id: &Option<String>,
     required_type: KeyType,
 ) -> bool {
     let map = self.keys.read().await;
     let tenant = tenant_id.as_deref().unwrap_or(DEFAULT_TENANT);
     if let Some(tenant_keys) = map.get(tenant) {
-        return tenant_keys
-            .values()
-            .any(|k| k.api_key == api_key_value && k.key_type == required_type);
+        for k in tenant_keys.values() {
+            if k.key_type == required_type
+                && k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).into()
+            {
+                return true;
+            }
+        }
     }
     false
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/apikeys.rs` around lines 274 - 288, The validate_key function uses a
standard equality check (k.api_key == api_key_value) which leaks timing info;
replace it with a constant-time comparison: add the subtle crate to Cargo.toml
and use k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).unwrap_u8() == 1
(or equivalent ConstantTimeEq API) inside the any() closure in validate_key so
you still check k.key_type == required_type but perform the API key comparison
in constant time; reference the validate_key method and the api_key field on the
key struct when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/apikeys.rs`:
- Around line 274-288: The validate_key function uses a standard equality check
(k.api_key == api_key_value) which leaks timing info; replace it with a
constant-time comparison: add the subtle crate to Cargo.toml and use
k.api_key.as_bytes().ct_eq(api_key_value.as_bytes()).unwrap_u8() == 1 (or
equivalent ConstantTimeEq API) inside the any() closure in validate_key so you
still check k.key_type == required_type but perform the API key comparison in
constant time; reference the validate_key method and the api_key field on the
key struct when making this change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 661190ac-4fb6-4c3b-909e-128ead64876c

📥 Commits

Reviewing files that changed from the base of the PR and between 63b62c8 and aa93cca.

📒 Files selected for processing (2)
  • src/apikeys.rs
  • src/handlers/http/middleware.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/handlers/http/middleware.rs

@nikhilsinhaparseable nikhilsinhaparseable changed the title feat: add api keys support for ingestion authentication feat: add api keys support for ingestion/query authentication Apr 22, 2026
@nikhilsinhaparseable nikhilsinhaparseable merged commit 76ff210 into parseablehq:main Apr 22, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants