Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion docker-compose-library.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@ services:
dockerfile: deploy/lightspeed-stack/Containerfile
platform: linux/amd64
container_name: lightspeed-stack
# Wrapper seeds the RAG kvstore into a writable path before starting the app.
entrypoint: ["/bin/bash", "/app-root/lightspeed-stack-entrypoint.sh"]
ports:
- "8080:8080"
depends_on:
Expand All @@ -18,7 +20,12 @@ services:
- ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:Z
- ./run.yaml:/app-root/run.yaml:Z
- ${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}:/opt/app-root/.gcp-keys:ro
- ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z
# Deliver the seeded RAG kvstore read-only; the entrypoint copies it into
# a writable location so the embedded llama-stack can write to it as a
# non-root user (the registry shares this kvstore and must be writable).
- ./tests/e2e/rag:/opt/app-root/rag-seed:ro,Z
# Host copy so `docker compose up` picks up script changes without rebuilding
- ./scripts/lightspeed-stack-entrypoint.sh:/app-root/lightspeed-stack-entrypoint.sh:ro,z
Comment on lines +26 to +28

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Inconsistent SELinux label casing between mounts.

The RAG seed mount uses :ro,Z (private label) while the entrypoint script mount uses :ro,z (shared label). For consistency and clarity, consider using the same casing unless the different labeling is intentional.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docker-compose-library.yaml` around lines 26 - 28, The SELinux mount label
casing is inconsistent between the two volume entries
("./tests/e2e/rag:/opt/app-root/rag-seed:ro,Z" vs
"./scripts/lightspeed-stack-entrypoint.sh:/app-root/lightspeed-stack-entrypoint.sh:ro,z");
pick one casing and make both mounts consistent (either change the RAG seed
mount to use :ro,z or the entrypoint script to :ro,Z) unless the differing
semantics (private vs shared) are intentional—update the line containing
"./tests/e2e/rag:/opt/app-root/rag-seed:ro,Z" or the line containing
"./scripts/lightspeed-stack-entrypoint.sh:/app-root/lightspeed-stack-entrypoint.sh:ro,z"
accordingly.

- ./tests/e2e/secrets/mcp-token:/tmp/mcp-token:ro,z
- ./tests/e2e/secrets/invalid-mcp-token:/tmp/invalid-mcp-token:ro,z
environment:
Expand Down
5 changes: 4 additions & 1 deletion docker-compose.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,10 @@ services:
- ${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}:/opt/app-root/.gcp-keys:ro
- ./lightspeed-stack.yaml:/opt/app-root/lightspeed-stack.yaml:ro,z
- llama-storage:/opt/app-root/src/.llama/storage
- ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:z
# Deliver the seeded RAG kvstore read-only; the entrypoint copies it into
# the writable storage volume so llama-stack can write to it as a non-root
# user (the registry shares this kvstore and must be writable at startup).
- ./tests/e2e/rag:/opt/app-root/rag-seed:ro,z
- mock-tls-certs:/certs:ro
environment:
- BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
Expand Down
25 changes: 25 additions & 0 deletions scripts/lightspeed-stack-entrypoint.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
#!/bin/bash
# Entrypoint for the library-mode lightspeed-stack container.
# Seeds the RAG kvstore into a writable location, then starts lightspeed-stack.

set -e

# Seed the RAG kvstore into the writable storage volume.
#
# The seed db is mounted read-only from the host (owned by the host user), but
# the embedded llama-stack runs as a non-root user and must write to this
# kvstore at startup (the resource registry shares it). Copying it into the
# storage tree makes the runtime db owned by the container user, so it is
# writable regardless of the host UID. See run.yaml -> storage.backends.kv_default.
RAG_SEED_DIR="${RAG_SEED_DIR:-/opt/app-root/rag-seed}"
STORAGE_RAG_DIR="${STORAGE_RAG_DIR:-/opt/app-root/src/.llama/storage/rag}"
if [ -d "$RAG_SEED_DIR" ]; then
echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
mkdir -p "$STORAGE_RAG_DIR"
cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
fi
Comment on lines +16 to +20

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Improve error handling for missing seed files.

If $RAG_SEED_DIR exists but contains no *.db files, the glob pattern will fail to expand and cp will exit with a cryptic error (cp: cannot stat '.../*.db': No such file or directory). Consider adding explicit validation to provide a clearer error message.

🛡️ Proposed fix to add validation
 if [ -d "$RAG_SEED_DIR" ]; then
     echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
     mkdir -p "$STORAGE_RAG_DIR"
+    if ! compgen -G "$RAG_SEED_DIR/*.db" > /dev/null; then
+        echo "ERROR: No .db files found in $RAG_SEED_DIR"
+        exit 1
+    fi
     cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
 fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if [ -d "$RAG_SEED_DIR" ]; then
echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
mkdir -p "$STORAGE_RAG_DIR"
cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
fi
if [ -d "$RAG_SEED_DIR" ]; then
echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
mkdir -p "$STORAGE_RAG_DIR"
if ! compgen -G "$RAG_SEED_DIR/*.db" > /dev/null; then
echo "ERROR: No .db files found in $RAG_SEED_DIR"
exit 1
fi
cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/lightspeed-stack-entrypoint.sh` around lines 16 - 20, When
$RAG_SEED_DIR exists, check whether any .db files actually exist before running
cp to avoid the cryptic "cannot stat" error; test the glob (e.g. set --
"$RAG_SEED_DIR"/*.db and verify [ -e "$1" ] or loop to detect files) and if none
are found, emit a clear error/warning mentioning RAG_SEED_DIR and skip or exit
instead of running cp into STORAGE_RAG_DIR; update the block referencing
RAG_SEED_DIR and STORAGE_RAG_DIR and the cp invocation accordingly.


# Use the venv interpreter explicitly: overriding the image entrypoint changes
# PATH ordering, so a bare `python3.12` may resolve to the system interpreter
# (without the app's dependencies) instead of the venv at /app-root/.venv.
exec /app-root/.venv/bin/python3.12 src/lightspeed_stack.py "$@"
15 changes: 15 additions & 0 deletions scripts/llama-stack-entrypoint.sh
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,21 @@ INPUT_CONFIG="${LLAMA_STACK_CONFIG:-/opt/app-root/run.yaml}"
ENRICHED_CONFIG="/tmp/enriched-run.yaml"
LIGHTSPEED_CONFIG="${LIGHTSPEED_CONFIG:-/opt/app-root/lightspeed-stack.yaml}"

# Seed the RAG kvstore into the writable storage volume.
#
# The seed db is mounted read-only from the host (owned by the host user), but
# llama-stack runs as a non-root user and must write to this kvstore at startup
# (the resource registry shares it). Copying it into the storage volume makes
# the runtime db owned by the container user, so it is writable regardless of
# the host UID. See run.yaml -> storage.backends.kv_default.
RAG_SEED_DIR="${RAG_SEED_DIR:-/opt/app-root/rag-seed}"
STORAGE_RAG_DIR="${STORAGE_RAG_DIR:-/opt/app-root/src/.llama/storage/rag}"
if [ -d "$RAG_SEED_DIR" ]; then
echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
mkdir -p "$STORAGE_RAG_DIR"
cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
fi
Comment on lines +18 to +24

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider extracting duplicated seeding logic.

The RAG seeding block (lines 18-24) is identical to the logic in scripts/lightspeed-stack-entrypoint.sh (lines 14-20). Consider extracting this into a shared function or sourced script to reduce duplication and ensure consistency.

♻️ Example: Extract to a shared function

Create a shared script (e.g., scripts/seed-rag-kvstore.sh):

#!/bin/bash
# Shared function to seed RAG kvstore

seed_rag_kvstore() {
    local RAG_SEED_DIR="${RAG_SEED_DIR:-/opt/app-root/rag-seed}"
    local STORAGE_RAG_DIR="${STORAGE_RAG_DIR:-/opt/app-root/src/.llama/storage/rag}"
    
    if [ -d "$RAG_SEED_DIR" ]; then
        echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
        mkdir -p "$STORAGE_RAG_DIR"
        if ! compgen -G "$RAG_SEED_DIR/*.db" > /dev/null; then
            echo "ERROR: No .db files found in $RAG_SEED_DIR"
            exit 1
        fi
        cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
    fi
}

Then source and call it in both entrypoint scripts:

source /app-root/seed-rag-kvstore.sh
seed_rag_kvstore
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/llama-stack-entrypoint.sh` around lines 18 - 24, The RAG seeding
block in scripts/llama-stack-entrypoint.sh is duplicated in
scripts/lightspeed-stack-entrypoint.sh; extract that logic into a shared
function (e.g., seed_rag_kvstore) in a new script (e.g.,
scripts/seed-rag-kvstore.sh) and then source and call seed_rag_kvstore from both
entrypoint scripts to remove duplication; ensure the function uses the same
environment variables (RAG_SEED_DIR, STORAGE_RAG_DIR), checks that the seed
directory exists and contains .db files before copying, emits the same echo
messages, and returns non-zero on fatal errors so both llama-stack-entrypoint.sh
and lightspeed-stack-entrypoint.sh simply call seed_rag_kvstore to perform the
seeding.

Comment on lines +20 to +24

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Improve error handling for missing seed files.

If $RAG_SEED_DIR exists but contains no *.db files, the glob pattern will fail to expand and cp will exit with a cryptic error. Consider adding explicit validation for a clearer error message (same issue as in lightspeed-stack-entrypoint.sh).

🛡️ Proposed fix to add validation
 if [ -d "$RAG_SEED_DIR" ]; then
     echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
     mkdir -p "$STORAGE_RAG_DIR"
+    if ! compgen -G "$RAG_SEED_DIR/*.db" > /dev/null; then
+        echo "ERROR: No .db files found in $RAG_SEED_DIR"
+        exit 1
+    fi
     cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
 fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if [ -d "$RAG_SEED_DIR" ]; then
echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
mkdir -p "$STORAGE_RAG_DIR"
cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
fi
if [ -d "$RAG_SEED_DIR" ]; then
echo "Seeding RAG kvstore from $RAG_SEED_DIR into $STORAGE_RAG_DIR..."
mkdir -p "$STORAGE_RAG_DIR"
if ! compgen -G "$RAG_SEED_DIR/*.db" > /dev/null; then
echo "ERROR: No .db files found in $RAG_SEED_DIR"
exit 1
fi
cp -f "$RAG_SEED_DIR"/*.db "$STORAGE_RAG_DIR"/
fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/llama-stack-entrypoint.sh` around lines 20 - 24, The seed-copy block
that checks RAG_SEED_DIR and runs cp may fail with a cryptic error when there
are no *.db files; update the shell snippet that uses RAG_SEED_DIR and
STORAGE_RAG_DIR to validate that at least one .db file exists before running cp,
e.g., test the glob (for f in "$RAG_SEED_DIR"/*.db; do [ -e "$f" ] && break;
done) and if no files found print a clear error/warning and skip or exit,
otherwise proceed to mkdir -p "$STORAGE_RAG_DIR" and cp the files; ensure the
message mentions the missing .db files and uses the same variable names
(RAG_SEED_DIR, STORAGE_RAG_DIR) so it’s easy to locate.


# Enrich config if lightspeed config exists
if [ -f "$LIGHTSPEED_CONFIG" ]; then
echo "Enriching llama-stack config..."
Expand Down
22 changes: 17 additions & 5 deletions src/app/endpoints/streaming_query.py
Original file line number Diff line number Diff line change
Expand Up @@ -644,11 +644,23 @@ async def generate_response(
should_generate = context.query_request.generate_topic_summary
if should_generate:
logger.debug("Generating topic summary for new conversation")
topic_summary = await get_topic_summary(
context.query_request.query,
context.client,
responses_params.model,
)
# The stream has already started here, so a raised exception cannot
# be turned into a clean HTTP error response ("response already
# started"). Topic summaries are non-essential metadata, so failures
# (e.g. context-length overflow on a very large query) are caught and
# logged rather than aborting the already-streamed answer.
try:
topic_summary = await get_topic_summary(
context.query_request.query,
context.client,
responses_params.model,
)
except Exception: # pylint: disable=broad-except
logger.exception(
"Failed to generate topic summary for new conversation, "
"request %s",
context.request_id,
)

# Consume tokens
logger.info("Consuming tokens")
Expand Down
67 changes: 67 additions & 0 deletions tests/unit/app/endpoints/test_streaming_query.py
Original file line number Diff line number Diff line change
Expand Up @@ -1216,6 +1216,73 @@ async def mock_generator() -> AsyncIterator[str]:

assert len(result) > 0

@pytest.mark.asyncio
async def test_generate_response_topic_summary_failure_does_not_abort_stream(
self, mocker: MockerFixture
) -> None:
"""Test stream completes when post-stream topic summary generation fails.

The topic summary is generated after the stream has already started, so a
raised exception must be caught and logged rather than propagated (which
would otherwise trigger "response already started").
"""

async def mock_generator() -> AsyncIterator[str]:
yield "data: token\n\n"

mock_context = mocker.Mock(spec=ResponseGeneratorContext)
mock_context.conversation_id = "conv_123"
mock_context.user_id = "user_123"
mock_context.vector_store_ids = []
mock_context.rag_id_mapping = {}
mock_context.inline_rag_context = RAGContext()
mock_context.query_request = QueryRequest(
query="test", generate_topic_summary=True
) # pyright: ignore[reportCallIssue]
mock_context.started_at = "2024-01-01T00:00:00Z"
mock_context.skip_userid_check = False
mock_context.request_id = "123e4567-e89b-12d3-a456-426614174000"
mock_context.client = mocker.AsyncMock(spec=AsyncLlamaStackClient)

mock_responses_params = mocker.Mock(spec=ResponsesApiParams)
mock_responses_params.model = "provider1/model1"

mock_turn_summary = TurnSummary()
mock_turn_summary.token_usage = TokenCounter(input_tokens=10, output_tokens=5)

mock_config = mocker.Mock()
mock_config.quota_limiters = []
mocker.patch("app.endpoints.streaming_query.configuration", mock_config)
mocker.patch("app.endpoints.streaming_query.consume_query_tokens")
mocker.patch(
"app.endpoints.streaming_query.get_available_quotas", return_value={}
)
mocker.patch(
"app.endpoints.streaming_query.get_topic_summary",
new=mocker.AsyncMock(
side_effect=HTTPException(status_code=500, detail="context length")
),
)
store_results_mock = mocker.patch(
"app.endpoints.streaming_query.store_query_results"
)

result = []
async for item in generate_response(
mock_generator(),
mock_context,
mock_responses_params,
mock_turn_summary,
):
result.append(item)

# The stream still completes (post-stream side effects run) and the
# topic summary failure does not propagate.
assert len(result) > 0
store_results_mock.assert_called_once()
call_kwargs = store_results_mock.call_args.kwargs
assert call_kwargs["topic_summary"] is None

@pytest.mark.asyncio
async def test_generate_response_connection_error(
self, mocker: MockerFixture
Expand Down
Loading