fix: run embedding novelty check before evaluation to avoid wasted compute #459

Open
octo-patch wants to merge 1 commit into algorithmicsuperintelligence:main from octo-patch:fix/issue-439-pre-eval-novelty-check
@octo-patch

Fixes #439

Problem

When novelty checking is enabled via embedding_model in the database config, the embedding-based similarity check runs inside database.add(), which is called only after the program has been evaluated. Every program is therefore fully evaluated (potentially running expensive external code) before the check can find it too similar to an existing program and reject it.

Current flow:

  1. Generate child program code (LLM call)
  2. Evaluate program (expensive — runs the actual program)
  3. database.add() → _is_novel() → embedding similarity check
  4. If too similar → reject and discard evaluation result

Solution

Move the embedding-based cosine similarity check into the worker process, executed immediately after code generation and before evaluation.

New flow:

  1. Generate child program code (LLM call)
  2. Embedding similarity check (cheap — one embedding API call + vector math)
  3. If too similar → return early, skip evaluation
  4. Evaluate program (only for novel candidates)
  5. database.add() → _is_novel() (still runs as a safety fallback)
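
The reordering in the new flow can be sketched as a small gate in front of the evaluator. This is a minimal illustration rather than the code in this PR: `run_iteration` and its three callables are hypothetical stand-ins for the LLM call, the embedding check, and the evaluator.

```python
from typing import Callable, Optional


def run_iteration(
    generate: Callable[[], str],
    is_novel: Callable[[str], bool],
    evaluate: Callable[[str], dict],
) -> Optional[dict]:
    """Generate a candidate, gate it on novelty, then evaluate it."""
    code = generate()           # step 1: LLM call
    if not is_novel(code):      # steps 2-3: cheap embedding check
        return None             # too similar: return early, skip evaluation
    return evaluate(code)       # step 4: only novel candidates reach here


# Stand-in callables that record how often the expensive step runs.
eval_calls = []


def fake_evaluate(code: str) -> dict:
    eval_calls.append(code)
    return {"score": 1.0}


seen = {"print('a')"}  # pretend this program is already in the island
rejected = run_iteration(lambda: "print('a')", lambda c: c not in seen, fake_evaluate)
accepted = run_iteration(lambda: "print('b')", lambda c: c not in seen, fake_evaluate)
```

Here only the second candidate reaches `fake_evaluate`; the duplicate is dropped before any evaluation cost is paid.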

Implementation details:

  • Add _worker_embedding_client global to worker process, lazily initialized in _lazy_init_worker_components() using config.database.embedding_model
  • Add _pre_eval_novelty_check() function that gets the embedding for the generated code and compares it with embeddings of existing programs in the target island (from the db snapshot)
  • Call it in _run_iteration_worker() between code generation and evaluation
  • Errors in the pre-check are caught and logged as warnings — the evaluation proceeds if the check fails unexpectedly
  • The LLM-based novelty judge in database.add() is preserved as a fallback for borderline cases
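
A rough sketch of what such a pre-check might look like, assuming the behavior listed above. The signature, the `get_embedding` method on the client, and the plain list of island embeddings are all assumptions made for illustration; the actual `_pre_eval_novelty_check()` in this PR may differ.

```python
import logging
import math
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pre_eval_novelty_check(
    code: str,
    embedding_client,  # hypothetical: any object with get_embedding(str)
    island_embeddings: Sequence[Optional[Sequence[float]]],
    threshold: float,
) -> bool:
    """Return True if the candidate should proceed to evaluation."""
    if embedding_client is None or threshold <= 0:
        return True  # check disabled: no-op
    try:
        candidate = embedding_client.get_embedding(code)
    except Exception as exc:
        # Fail open: a broken pre-check must not block evaluation.
        logger.warning("Pre-eval novelty check failed: %s", exc)
        return True
    for existing in island_embeddings:
        if existing is None:
            continue  # programs without an embedding cannot reject anything
        if cosine_similarity(candidate, existing) >= threshold:
            return False  # too similar to an existing island program
    return True


class FakeClient:
    """Stand-in embedding client that returns a fixed vector."""

    def __init__(self, vector):
        self.vector = vector

    def get_embedding(self, text: str):
        return self.vector


similar = pre_eval_novelty_check("x", FakeClient([1.0, 0.0]), [[1.0, 0.0]], 0.95)
orthogonal = pre_eval_novelty_check("x", FakeClient([0.0, 1.0]), [[1.0, 0.0]], 0.95)
```

Failing open on errors keeps the pre-check a pure optimization: in the worst case the candidate is evaluated and the existing check in database.add() still applies.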

Testing

  • 6 new unit tests in tests/test_pre_eval_novelty_check.py covering:
    • Disabled client (no-op)
    • Zero threshold (no-op)
    • Missing embeddings in island programs (novel by default)
    • Similar code rejection (similarity ≥ threshold)
    • Orthogonal code acceptance (similarity < threshold)
    • Graceful handling of embedding API errors
  • All 356 existing tests continue to pass
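
For illustration, a few of these cases might be written as below with `unittest.mock`. The `is_novel` helper is a compact stand-in for the function under test, not the code in tests/test_pre_eval_novelty_check.py, and the `get_embedding` method name is an assumption.

```python
import unittest
from unittest.mock import Mock


def is_novel(code, client, island_embeddings, threshold):
    """Compact stand-in for the function under test (not the PR's code)."""
    if client is None or threshold <= 0:
        return True
    try:
        vec = client.get_embedding(code)
    except Exception:
        return True  # fail open on embedding API errors

    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
        return dot / norm if norm else 0.0

    return all(e is None or cos(vec, e) < threshold for e in island_embeddings)


class PreEvalNoveltyCheckTest(unittest.TestCase):
    def test_disabled_client_is_noop(self):
        self.assertTrue(is_novel("x", None, [[1.0, 0.0]], 0.9))

    def test_zero_threshold_skips_embedding_call(self):
        client = Mock()
        self.assertTrue(is_novel("x", client, [[1.0, 0.0]], 0.0))
        client.get_embedding.assert_not_called()

    def test_api_error_fails_open(self):
        client = Mock()
        client.get_embedding.side_effect = RuntimeError("embedding API down")
        self.assertTrue(is_novel("x", client, [[1.0, 0.0]], 0.9))

    def test_similar_code_rejected(self):
        client = Mock()
        client.get_embedding.return_value = [1.0, 0.0]
        self.assertFalse(is_novel("x", client, [None, [1.0, 0.0]], 0.9))
```

Mocking the client keeps the tests offline and lets the zero-threshold case assert that no embedding call is made at all.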

…mpute (fixes algorithmicsuperintelligence#439)

When novelty checking is enabled, programs that are too similar to existing
island programs were being fully evaluated before the similarity check rejected
them. Move the embedding-based cosine similarity check into the worker process,
executed immediately after code generation and before the expensive evaluation
step. A global _worker_embedding_client is lazily initialized per worker using
the same config as the database. The LLM-based novelty judge in database.add()
is preserved as a fallback for borderline cases.

Add tests for _pre_eval_novelty_check covering: disabled client, zero threshold,
missing embeddings, similar code rejection, orthogonal code acceptance, and API
error graceful handling.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


octo-patch does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

Development

Successfully merging this pull request may close these issues.

Novelty check runs after evaluation, wasting compute on rejected programs