14 changes: 14 additions & 0 deletions .github/workflows/e2e_tests.yaml
@@ -114,6 +114,20 @@ jobs:
echo "=== lightspeed-stack.yaml ==="
grep -A 3 "llama_stack:" lightspeed-stack.yaml

- name: Cache HuggingFace embedding model
uses: actions/cache@v4

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin action to commit SHA for supply-chain security.

The actions/cache@v4 reference is a mutable tag, not a commit SHA. GitHub's security hardening guidance recommends pinning actions to a full-length commit SHA, because a tag can later be moved to different code, which exposes the workflow to supply-chain attacks.

🔒 Recommended fix
       - name: Cache HuggingFace embedding model
-        uses: actions/cache@v4
+        uses: actions/cache@3624ceb22c1c5a301c8db4169662070a689d9ea8  # v4.1.1

Use actions/cache@3624ceb22c1c5a301c8db4169662070a689d9ea8 (v4.1.1) or the commit SHA of a newer v4 release. Keep the trailing version comment for maintainability.
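
To make the finding concrete, here is a minimal Python sketch of the kind of check behind an "unpinned action reference" report. It is a simplified illustration, not zizmor's implementation: the function name `find_unpinned_uses` and both regular expressions are assumptions made for this example, and it only looks at `uses:` lines of the form `owner/repo@ref`.

```python
import re

# A full-length commit SHA is exactly 40 hexadecimal characters.
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")
# Matches "uses: owner/repo@ref" (optionally after "- "), stopping before a trailing comment.
USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s@'\"#]+)@([^\s'\"#]+)")


def find_unpinned_uses(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, ref) for each `uses:` whose ref is not a 40-char SHA."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_LINE.match(line)
        if match is None:
            # Not a `uses:` line, or a local action such as ./path with no "@ref".
            continue
        action, ref = match.groups()
        if action.startswith("docker://"):
            # Container image references are pinned by digest, not by commit SHA.
            continue
        if not FULL_SHA.match(ref):
            findings.append((number, action, ref))
    return findings
```

Running it over the workflow text would flag `actions/cache@v4` and accept the 40-character SHA form shown in the recommended fix.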

🧰 Tools
🪛 zizmor (1.25.2)

[error] 118-118: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e_tests.yaml at line 118, Replace the floating action
reference "uses: actions/cache@v4" with a pinned commit SHA (e.g., "uses:
actions/cache@3624ceb22c1c5a301c8db4169662070a689d9ea8") to hard-pin the action
for supply-chain security and add a trailing comment noting the semantic version
(v4.1.1) you pinned from for future maintainability; update any tests or docs
that expect the unpinned form if necessary.

with:
path: /tmp/hf-cache
key: hf-sentence-transformers-all-mpnet-base-v2
Comment on lines +117 to +121

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add versioning to cache key to enable invalidation.

The cache key hf-sentence-transformers-all-mpnet-base-v2 is static and doesn't include version information. If the model is updated upstream or the sentence-transformers library changes model handling, stale cached artifacts will persist until manually invalidated.

📦 Suggested improvement
       - name: Cache HuggingFace embedding model
         uses: actions/cache@v4
         with:
           path: /tmp/hf-cache
-          key: hf-sentence-transformers-all-mpnet-base-v2
+          key: hf-sentence-transformers-all-mpnet-base-v2-${{ hashFiles('**/pyproject.toml') }}
+          restore-keys: |
+            hf-sentence-transformers-all-mpnet-base-v2-

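Include a hash of the dependency files in the key, as in the suggestion above. A per-run value such as ${{ github.run_number }} or ${{ github.sha }} also invalidates the cache, but it never hits the exact key, so every run restores through restore-keys and then saves a new cache entry. restore-keys lets a miss on the exact key fall back to the most recently created cache whose key starts with the given prefix.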
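
The following Python sketch illustrates why a hash-suffixed key invalidates itself when the hashed files change, and how a prefix-style restore key still matches older entries. It is modeled loosely on the documented behavior of `hashFiles` (a SHA-256 per file, then a SHA-256 over those hashes) and is not guaranteed to reproduce its exact output. The helper names `versioned_cache_key` and `restore_key_matches` are hypothetical.

```python
import hashlib


def versioned_cache_key(prefix: str, file_contents: list[bytes]) -> str:
    """Build '<prefix>-<digest>' where the digest covers every given file's content."""
    combined = hashlib.sha256()
    for content in file_contents:
        # Hash each file on its own, then fold that hash into the combined digest.
        combined.update(hashlib.sha256(content).digest())
    return f"{prefix}-{combined.hexdigest()}"


def restore_key_matches(restore_key: str, existing_key: str) -> bool:
    """A restore key matches any existing cache key that starts with it."""
    return existing_key.startswith(restore_key)
```

With the prefix `hf-sentence-transformers-all-mpnet-base-v2`, editing `pyproject.toml` yields a different key (an exact-key miss), while the restore key `hf-sentence-transformers-all-mpnet-base-v2-` still matches the previously saved entry.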

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e_tests.yaml around lines 117 - 121, Update the static
cache key used with actions/cache@v4 (the key value
"hf-sentence-transformers-all-mpnet-base-v2") to include a versioning component
so caches can be invalidated automatically; modify the `key` to append a dynamic
value such as `${{ github.run_number }}`, `${{ github.sha }}`, or a hash of
lockfiles, and add a `restore-keys` fallback pattern to speed up restores on
partial hits (keep the `path` and action unchanged, only adjust the `key` and
add `restore-keys`).


- name: Pre-download HuggingFace embedding model
env:
HF_HOME: /tmp/hf-cache
run: |
pip install -q sentence-transformers

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Pin sentence-transformers version for reproducibility.

Installing sentence-transformers without a version constraint can lead to non-deterministic CI behavior if the library releases breaking changes or updates model-loading logic.

📌 Suggested fix
-          pip install -q sentence-transformers
+          pip install -q sentence-transformers==3.3.1

Pin to the version currently in use (check pyproject.toml or requirements.txt if available). If the project doesn't use sentence-transformers directly, pin to a known-compatible version.
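
One way to find "the version currently in use" is to ask the environment that already runs the tests. The sketch below uses the standard-library `importlib.metadata` to turn an installed package into a pinned requirement string. The helper name `pinned_requirement` and its injectable `lookup` parameter are assumptions made for illustration.

```python
from importlib import metadata
from typing import Callable, Optional


def pinned_requirement(
    package: str, lookup: Callable[[str], str] = metadata.version
) -> Optional[str]:
    """Return 'package==version' for an installed package, or None if it is not installed."""
    try:
        return f"{package}=={lookup(package)}"
    except metadata.PackageNotFoundError:
        # Nothing to pin from: the package is not present in this environment.
        return None
```

Calling `pinned_requirement("sentence-transformers")` in an environment where the library is installed returns a string that can be pasted into the `pip install -q` line of the workflow step.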

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e_tests.yaml at line 127, Replace the unpinned pip
install command "pip install -q sentence-transformers" with a pinned version to
ensure CI reproducibility; pick the exact version used by the project (from
pyproject.toml or requirements.txt) or a known-compatible release and update the
workflow step to install that specific version instead of the floating package.

python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-mpnet-base-v2')"
echo "HF_CACHE_PATH=/tmp/hf-cache" >> $GITHUB_ENV

- name: Docker Login for quay access
if: matrix.mode == 'server'
env:
3 changes: 3 additions & 0 deletions docker-compose-library.yaml
@@ -19,6 +19,7 @@ services:
- ./run.yaml:/app-root/run.yaml:Z
- ${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}:/opt/app-root/.gcp-keys:ro
- ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z
- ${HF_CACHE_PATH:-./tmp/.hf-cache}:/opt/app-root/src/.cache/huggingface
- ./tests/e2e/secrets/mcp-token:/tmp/mcp-token:ro,z
- ./tests/e2e/secrets/invalid-mcp-token:/tmp/invalid-mcp-token:ro,z
environment:
@@ -57,6 +58,8 @@ services:
- LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-}
# FAISS test and inline RAG config
- FAISS_VECTOR_STORE_ID=${FAISS_VECTOR_STORE_ID:-}
# Prevent HuggingFace Hub update checks (HTTP 429 rate-limiting in CI from parallel jobs).
- HF_HUB_OFFLINE=1
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/liveness"]
interval: 10s # how often to run the check
3 changes: 3 additions & 0 deletions docker-compose.yaml
@@ -21,6 +21,7 @@ services:
- llama-storage:/opt/app-root/src/.llama/storage
- ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:z
- mock-tls-certs:/certs:ro
- ${HF_CACHE_PATH:-./tmp/.hf-cache}:/opt/app-root/src/.cache/huggingface
environment:
- BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
- TAVILY_SEARCH_API_KEY=${TAVILY_SEARCH_API_KEY:-}
@@ -57,6 +58,8 @@ services:
- LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-}
# FAISS test
- FAISS_VECTOR_STORE_ID=${FAISS_VECTOR_STORE_ID:-}
# Prevent HuggingFace Hub update checks (HTTP 429 rate-limiting in CI from parallel jobs).
- HF_HUB_OFFLINE=1
Comment on lines +61 to +62

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for documentation on local vs CI compose usage
rg -i -C3 'docker.compose.*local|local.*docker|development.*setup' --type=md README.md CONTRIBUTING.md docs/

# Check if there's a separate compose override file for local development
fd 'docker-compose.*override|docker-compose.*local' --type=f

Repository: lightspeed-core/lightspeed-stack

Length of output: 5038


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate docker-compose.yaml
echo "== docker-compose.yaml candidates =="
fd -i 'docker-compose\.ya?ml' . || true

# Show the referenced lines (if file at repo root; otherwise find the closest match and print the requested vicinity)
echo
echo "== docker-compose.yaml excerpt around lines 55-75 =="
for f in $(fd -i 'docker-compose\.ya?ml' . -x echo 2>/dev/null | tr '\n' ' '); do
  echo "--- $f ---"
  nl -ba "$f" | sed -n '55,80p' || true
done

echo
echo "== occurrences of HF_HUB_OFFLINE / HF_CACHE_PATH =="
rg -n --hidden --no-ignore -S 'HF_HUB_OFFLINE|HF_CACHE_PATH' . || true

echo
echo "== occurrences of huggingface offline/cache/pre-download =="
rg -n --hidden --no-ignore -S 'hugging.?face|hf-cache|hf cache|pre-?download' . || true

Repository: lightspeed-core/lightspeed-stack

Length of output: 11390


🏁 Script executed:

#!/bin/bash
set -euo pipefail

mapfile -t files < <(fd -i 'docker-compose\.ya?ml' .)

echo "== docker-compose.yaml candidates =="
printf '%s\n' "${files[@]:-<none>}"

echo
echo "== docker-compose.yaml excerpt around lines 55-80 (each match) =="
if [ "${#files[@]}" -eq 0 ]; then
  exit 0
fi

for f in "${files[@]}"; do
  echo "--- $f ---"
  nl -ba "$f" | sed -n '55,80p'
  echo
done

echo "== occurrences of HF_HUB_OFFLINE / HF_CACHE_PATH =="
rg -n --hidden --no-ignore -S 'HF_HUB_OFFLINE|HF_CACHE_PATH' . || true

echo
echo "== occurrences of huggingface offline/cache/pre-download =="
rg -n --hidden --no-ignore -S 'hugging.?face|hf-cache|HF_HOME|HF_DATASETS_CACHE|pre-?download' . || true

Repository: lightspeed-core/lightspeed-stack

Length of output: 133


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== docker-compose.yaml lines 50-80 =="
sed -n '50,80p' docker-compose.yaml

echo
echo "== docker-compose-library.yaml lines 50-80 =="
sed -n '50,80p' docker-compose-library.yaml

echo
echo "== Search: HF_HUB_OFFLINE / HF_CACHE_PATH guidance in docs README workflows =="
rg -n --hidden --no-ignore -S 'HF_HUB_OFFLINE|HF_CACHE_PATH|hf-cache|HF_HOME' README.md CONTRIBUTING.md docs .github/workflows || true

echo
echo "== Show e2e_tests workflow around HF cache pre-download lines =="
# Tight context around the lines indicated earlier
rg -n 'Cache HuggingFace embedding model|Pre-download HuggingFace embedding model|HF_CACHE_PATH=/tmp/hf-cache' .github/workflows/e2e_tests.yaml -n
awk 'NR>=90 && NR<=160 {print NR":"$0}' .github/workflows/e2e_tests.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 6528


🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== docker-compose override/local candidates =="
fd -i "docker-compose.*override.*\.ya?ml|docker-compose.*local.*\.ya?ml|docker-compose\.override\.ya?ml" .
echo
echo "== docker-compose_library.yaml override/local candidates =="
fd -i "docker-compose-library.*override.*\.ya?ml|docker-compose-library.*local.*\.ya?ml" .

Repository: lightspeed-core/lightspeed-stack

Length of output: 184


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find Containerfile/Dockerfile references to .cache/huggingface =="
rg -n --hidden --no-ignore -S '\.cache/huggingface|HF_HOME|HF_HUB_OFFLINE|sentence-transformers' deploy .github/workflows || true

echo
echo "== Show relevant parts of deploy/lightspeed-stack/Containerfile =="
if [ -f deploy/lightspeed-stack/Containerfile ]; then
  nl -ba deploy/lightspeed-stack/Containerfile | sed -n '90,170p'
fi

echo
echo "== Show relevant parts of deploy/llama-stack/test.containerfile (if present) =="
if [ -f deploy/llama-stack/test.containerfile ]; then
  nl -ba deploy/llama-stack/test.containerfile | sed -n '1,120p'
fi

Repository: lightspeed-core/lightspeed-stack

Length of output: 894


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== deploy/lightspeed-stack/Containerfile (around .cache/huggingface) =="
sed -n '105,150p' deploy/lightspeed-stack/Containerfile

echo
echo "== deploy/llama-stack/test.containerfile (around line ~36) =="
sed -n '1,80p' deploy/llama-stack/test.containerfile

Repository: lightspeed-core/lightspeed-stack

Length of output: 3857


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n --hidden --no-ignore -S 'HF_HUB_OFFLINE' README.md CONTRIBUTING.md docs .github/workflows deploy || true
echo
rg -n --hidden --no-ignore -S 'hf-cache|HF_CACHE_PATH|pre-download HuggingFace|sentence-transformers/all-mpnet-base-v2' README.md docs .github/workflows || true

Repository: lightspeed-core/lightspeed-stack

Length of output: 2310


Hardcoded HF_HUB_OFFLINE=1 will block local embedding downloads unless the HF cache is pre-populated.

docker-compose.yaml (and docker-compose-library.yaml) unconditionally sets HF_HUB_OFFLINE=1 while mounting ${HF_CACHE_PATH:-./tmp/.hf-cache} into the container’s HuggingFace cache. CI avoids failures by pre-downloading the embedding model into /tmp/hf-cache and exporting HF_CACHE_PATH=/tmp/hf-cache before docker compose up.

A local docker compose up starts with no pre-populated cache (the Containerfile only creates the cache directory), and the docs say the embedding model is downloaded automatically on first start-up. Unconditional offline mode contradicts this.

      # Prevent HuggingFace Hub update checks (HTTP 429 rate-limiting in CI from parallel jobs).
      - HF_HUB_OFFLINE=1
💡 Recommended fixes

Option 1: Conditional offline mode via environment variable (preferred)

       # Prevent HuggingFace Hub update checks (HTTP 429 rate-limiting in CI from parallel jobs).
-      - HF_HUB_OFFLINE=1
+      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-0}

Set HF_HUB_OFFLINE=1 explicitly in the CI workflow environment (where the cache is already pre-filled). Apply the same change to docker-compose-library.yaml.
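
To show what the `${HF_HUB_OFFLINE:-0}` form resolves to in each situation, here is a small Python sketch of the `${VAR:-default}` rule: the default applies when the variable is unset or empty. It only emulates the interpolation rule for illustration. The function name `resolve_with_default` is hypothetical, and Compose does the real substitution itself.

```python
def resolve_with_default(name: str, default: str, env: dict[str, str]) -> str:
    """Emulate ${NAME:-default}: use the default when NAME is unset or empty."""
    value = env.get(name, "")
    return value if value != "" else default


# Local developer shell: nothing exported, so the container stays online.
local_value = resolve_with_default("HF_HUB_OFFLINE", "0", {})
# CI job: the workflow exports HF_HUB_OFFLINE=1 after pre-filling the cache.
ci_value = resolve_with_default("HF_HUB_OFFLINE", "0", {"HF_HUB_OFFLINE": "1"})
```

Under this rule a plain local `docker compose up` gets `0` and can download the model, while CI keeps offline mode by exporting the variable.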

Option 2: Local pre-download step + docs

Add a local setup script to pre-download the embedding model into ./tmp/.hf-cache and document running it before docker compose up.
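
A local setup script along these lines could look like the sketch below. It mirrors the CI step by loading the model once in a child process with HF_HOME pointing at the cache directory. It rests on assumptions: the function names are hypothetical, and the `hub/models--<org>--<name>/snapshots` path used for the "already cached" check is an assumption about the Hugging Face cache layout, which should be verified against the installed huggingface_hub version.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Callable

MODEL_ID = "sentence-transformers/all-mpnet-base-v2"  # model named in the CI step


def snapshot_dir(hf_home: Path, repo_id: str) -> Path:
    """Directory where the Hub cache is assumed to keep snapshots of a model."""
    return hf_home / "hub" / f"models--{repo_id.replace('/', '--')}" / "snapshots"


def is_cached(hf_home: Path, repo_id: str) -> bool:
    """True if at least one snapshot of the model exists under hf_home."""
    snapshots = snapshot_dir(hf_home, repo_id)
    return snapshots.is_dir() and any(snapshots.iterdir())


def download_with_sentence_transformers(hf_home: Path, repo_id: str) -> None:
    """Load the model once in a child process so HF_HOME is set before any import."""
    env = {**os.environ, "HF_HOME": str(hf_home.resolve()), "HF_HUB_OFFLINE": "0"}
    code = (
        "import sys; from sentence_transformers import SentenceTransformer; "
        "SentenceTransformer(sys.argv[1])"
    )
    subprocess.run([sys.executable, "-c", code, repo_id], env=env, check=True)


def ensure_model(
    hf_home: Path,
    repo_id: str = MODEL_ID,
    download: Callable[[Path, str], None] = download_with_sentence_transformers,
) -> bool:
    """Download the model unless it is already cached; return True if a download ran."""
    hf_home.mkdir(parents=True, exist_ok=True)
    if is_cached(hf_home, repo_id):
        return False
    download(hf_home, repo_id)
    return True
```

Calling `ensure_model(Path("./tmp/.hf-cache"))` before `docker compose up` would fill the directory that the compose files mount by default, provided sentence-transformers is installed on the host.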

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docker-compose.yaml` around lines 61 - 62, Remove the hardcoded
HF_HUB_OFFLINE=1 entry that forces offline mode and instead make offline mode
conditional: stop setting HF_HUB_OFFLINE in the compose env block and
document/expect the CI pipeline to export HF_HUB_OFFLINE=1 (or set it in the CI
job environment) when HF_CACHE_PATH is pre-populated; alternatively add a local
setup script to pre-download the embedding into the mounted HF_CACHE_PATH (or
./tmp/.hf-cache) before running docker compose up so the container can start
without offline mode; apply the same change to the docker-compose-library.yaml
and ensure references to HF_CACHE_PATH remain unchanged.

# OKP/Solr RAG
- RH_SERVER_OKP=${RH_SERVER_OKP:-}
- SOLR_URL=${SOLR_URL:-}