agent-knowledge

Source-grounded, eval-gated knowledge growth primitives for agents.

This package turns raw sources and generated markdown knowledge into a versionable graph that agents can search, lint, evaluate, and improve over time. It is intentionally domain-agnostic: legal, tax, coding, research, finance, business, and scientific workflows define their own policies and rubrics on top.

Install

pnpm add @tangle-network/agent-knowledge @tangle-network/agent-eval

CLI

agent-knowledge init --root .
agent-knowledge source-add ./docs/spec.md --root .
agent-knowledge sources --root .
agent-knowledge apply-write-blocks ./proposal.txt --root .
agent-knowledge index --root .
agent-knowledge search "portfolio risk" --root .
agent-knowledge inspect --root .
agent-knowledge explain knowledge/concepts/risk.md --root .
agent-knowledge graph --root . --format json
agent-knowledge lint --root .
agent-knowledge validate --strict --root .
agent-knowledge export --root . --format json
agent-knowledge viz --root .

The default layout is:

raw/
  sources/
knowledge/
  index.md   # scaffold: human-navigation only, excluded from the page index
  log.md     # scaffold: human-navigation only, excluded from the page index
.agent-knowledge/
  sources.json
  index.json

initKnowledgeBase writes knowledge/index.md and knowledge/log.md for authors to curate by hand. They are deliberately excluded from buildKnowledgeIndex / searchKnowledge so they do not inflate page counts or pollute search hits. Any nested <dir>/index.md or <dir>/log.md is treated the same way. The shared predicate is isScaffoldPath, exported from @tangle-network/agent-knowledge.
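The exclusion rule can be pictured with a small predicate. This is a minimal sketch that assumes the check only looks at the file name; `isScaffoldPathSketch` is a hypothetical stand-in, not the exported `isScaffoldPath`, whose actual matching rules may differ.

```typescript
// Hypothetical stand-in for the exported isScaffoldPath; the real rules may differ.
const SCAFFOLD_FILE_NAMES = new Set(['index.md', 'log.md'])

function isScaffoldPathSketch(path: string): boolean {
  // Normalize Windows separators, then look only at the final path segment.
  const segments = path.replace(/\\/g, '/').split('/')
  const fileName = segments[segments.length - 1] ?? ''
  return SCAFFOLD_FILE_NAMES.has(fileName)
}

const pages = [
  'knowledge/index.md',
  'knowledge/log.md',
  'knowledge/concepts/index.md',
  'knowledge/concepts/risk.md',
]

// Only non-scaffold pages would be counted or searched.
const indexable = pages.filter((p) => !isScaffoldPathSketch(p))
```

With the sample paths, only `knowledge/concepts/risk.md` survives the filter, which matches the behavior described for nested `<dir>/index.md` and `<dir>/log.md` files.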

Design

  • Raw sources are immutable evidence.
  • Generated knowledge is editable but validated.
  • Claims should cite source records when promoted.
  • Lint fails on pages that cite unknown source IDs.
  • Text sources get deterministic anchors (all, l1, l51, ...) for precise citations like [^src_id#all].
  • Agent write proposals can be safely applied with apply-write-blocks.
  • KbStore keeps storage consumer-owned; use MemoryKbStore, FileSystemKbStore, or implement D1 in the app.
  • Discovery uses worker/dispatcher contracts, with a local dispatcher for dev and tests.
  • runKnowledgeResearchLoop() provides thin loop mechanics for researcher agents: ingest sources, apply safe write blocks, rebuild the index, lint/validate, score readiness, and return a transcript. The agent still decides what to research, what to write, and when the wiki is good enough.
  • createKnowledgeControlLoopAdapter() maps those mechanics into agent-eval's runAgentControlLoop() so products can plug in their own proposer, reviewer, and driver policies.
  • Zod schemas define the stable wire shape.
  • Graph/search/lint are deterministic and fast.
  • searchKnowledge returns hits with three score fields. score and rrfScore are the raw reciprocal-rank-fusion value (typically 0.01–0.05); use them when intent matters or when fusing across queries. normalizedScore is the same value scaled into [0, 1] relative to the top hit in this result set (top hit = 1, others = score / topScore) — use it when comparing against natural confidence thresholds. The normalization is within-set ranking, not a cross-query absolute confidence.
  • Optimization uses @tangle-network/agent-eval internally instead of reimplementing eval gates.
  • buildEvalKnowledgeBundle() maps wiki/search evidence into agent-eval KnowledgeRequirement, KnowledgeBundle, and KnowledgeReadinessReport contracts so control loops can block, ask, or acquire data before running an agent.
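To make the three score fields concrete, here is a rough sketch of reciprocal rank fusion followed by top-hit normalization. It assumes the textbook RRF formula with `k = 60`; the constant, the `fuseAndNormalize` name, and the two input rankings are illustrative assumptions, not the package's internals.

```typescript
// Sketch only: textbook reciprocal rank fusion with an assumed k = 60,
// followed by the top-hit normalization described above.
interface ScoredHit {
  id: string
  rrfScore: number
  normalizedScore: number
}

function fuseAndNormalize(rankings: string[][], k = 60): ScoredHit[] {
  const raw = new Map<string, number>()
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // Ranks are 1-based, so the first item in a list contributes 1 / (k + 1).
      raw.set(id, (raw.get(id) ?? 0) + 1 / (k + i + 1))
    })
  }
  const sorted = Array.from(raw.entries()).sort((a, b) => b[1] - a[1])
  const topScore = sorted[0]?.[1] ?? 0
  return sorted.map(([id, rrfScore]) => ({
    id,
    rrfScore,
    // Top hit = 1, others = score / topScore.
    normalizedScore: topScore > 0 ? rrfScore / topScore : 0,
  }))
}

const hits = fuseAndNormalize([
  ['risk.md', 'hedging.md', 'liquidity.md'], // hypothetical first ranker
  ['risk.md', 'liquidity.md'],               // hypothetical second ranker
])
```

Under these assumptions the raw values land in the 0.01 to 0.05 range noted above, and `normalizedScore` only says how a hit compares with the best hit of the same query.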

The /viz subpath exports graph insight helpers without UI dependencies.

Agent-Eval Integration

Use runKnowledgeBaseOptimization() when the question is whether a candidate knowledge base actually improves agent task success. The candidate is passed through runMultiShotOptimization, so n=1 single-turn tasks and variable-length multi-turn traces use the same path.

Use knowledgeReleaseReportFromOptimization() to turn optimizer output into release confidence evidence using agent-eval release gates and RunRecord validation.

Use buildEvalKnowledgeBundle() before execution when the question is whether the agent has enough task-world context to run:

import { buildEvalKnowledgeBundle } from '@tangle-network/agent-knowledge'

const readiness = buildEvalKnowledgeBundle({
  taskId: 'sdk-migration',
  index,
  specs: [{
    id: 'repo-build-command',
    description: 'Repository build and typecheck command',
    query: 'build typecheck command',
    requiredFor: ['coding'],
    category: 'codebase_specific',
    acquisitionMode: 'inspect_repo',
    importance: 'blocking',
    freshness: 'weekly',
    sensitivity: 'public',
    confidenceNeeded: 0.9,
    minSources: 1,
  }],
})

console.log(readiness.report.recommendedAction)

Pass readiness.report to blockingKnowledgeEval() from @tangle-network/agent-eval; use readiness.questions and readiness.acquisitionPlans to drive UI or connector workflows.
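How the package derives the report is not spelled out here, so the following is only a sketch of the kind of gate the spec fields suggest: a blocking requirement counts as missing when no evidence was found, or when the evidence falls short of `confidenceNeeded` or `minSources`. The `RequirementSketch` and `EvidenceSketch` shapes, the `missingBlockingRequirements` helper, and the `'advisory'` importance value are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real contracts come from
// @tangle-network/agent-eval and may carry more fields.
interface RequirementSketch {
  id: string
  importance: string // only the value 'blocking' appears in the example above
  confidenceNeeded: number
  minSources: number
}

interface EvidenceSketch {
  requirementId: string
  confidence: number // assumed to share the [0, 1] scale of confidenceNeeded
  sourceCount: number
}

function missingBlockingRequirements(
  requirements: RequirementSketch[],
  evidence: EvidenceSketch[],
): string[] {
  const byRequirement = new Map<string, EvidenceSketch>()
  for (const e of evidence) byRequirement.set(e.requirementId, e)

  return requirements
    .filter((r) => r.importance === 'blocking')
    .filter((r) => {
      const found = byRequirement.get(r.id)
      if (found === undefined) return true
      return found.confidence < r.confidenceNeeded || found.sourceCount < r.minSources
    })
    .map((r) => r.id)
}

const missing = missingBlockingRequirements(
  [
    { id: 'repo-build-command', importance: 'blocking', confidenceNeeded: 0.9, minSources: 1 },
    // 'advisory' is a made-up value used only to show a non-blocking requirement.
    { id: 'style-guide', importance: 'advisory', confidenceNeeded: 0.5, minSources: 1 },
  ],
  [{ requirementId: 'repo-build-command', confidence: 0.75, sourceCount: 2 }],
)
```

Here `repo-build-command` is reported as missing because its evidence has confidence 0.75 against a required 0.9, while the non-blocking requirement never blocks even with no evidence at all.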

Research Loop

Use runKnowledgeResearchLoop() when an agent is acting as a researcher or librarian. Keep the loop small: the package handles deterministic mechanics; your agent handles judgment.

import {
  defineReadinessSpec,
  runKnowledgeResearchLoop,
} from '@tangle-network/agent-knowledge'

await runKnowledgeResearchLoop({
  root: './kb',
  goal: 'Build a grounded onboarding wiki for billing support',
  readinessSpecs: [defineReadinessSpec({
    id: 'refund-policy',
    description: 'Refund policy grounding',
    query: 'refund policy customer request',
    requiredFor: ['support-agent'],
  })],
  async step({ iteration, index, readiness }) {
    // Call your researcher/LLM/browser/connector workflow here.
    if (iteration > 1 && readiness?.report.blockingMissingRequirements.length === 0) {
      return { done: true, notes: 'ready for eval' }
    }
    return {
      sourceTexts: [{
        uri: 'research://refund-policy',
        title: 'Refund Policy Source',
        text: 'Source text gathered by the researcher.',
      }],
      proposalText: [
        '---FILE: knowledge/support/refund-policy.md---',
        '---',
        'id: refund-policy',
        'title: Refund Policy',
        '---',
        '# Refund Policy',
        'Grounded summary written by the researcher.',
        '---END FILE---',
      ].join('\n'),
    }
  },
})

This is intentionally not a crawler, prompt framework, or agent. It is the repeatable shell around one.
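The `proposalText` in the example uses `---FILE: <path>---` and `---END FILE---` markers. The sketch below shows one way such blocks could be parsed and path-checked before anything touches disk. It assumes that "safe" means at least confining writes to markdown files under `knowledge/`; `parseWriteBlocksSketch` and `isSafeKnowledgePath` are hypothetical helpers, not the logic behind `apply-write-blocks`.

```typescript
interface WriteBlock {
  path: string
  content: string
}

// Hypothetical parser for the ---FILE: ... --- / ---END FILE--- markers shown above.
function parseWriteBlocksSketch(proposal: string): WriteBlock[] {
  const blocks: WriteBlock[] = []
  let current: { path: string; lines: string[] } | null = null
  for (const line of proposal.split('\n')) {
    if (current === null) {
      const open = /^---FILE: (.+)---$/.exec(line)
      if (open) current = { path: open[1] ?? '', lines: [] }
      continue // text outside a block is ignored
    }
    if (line === '---END FILE---') {
      blocks.push({ path: current.path, content: current.lines.join('\n') })
      current = null
    } else {
      current.lines.push(line) // includes front-matter '---' lines
    }
  }
  return blocks
}

// Assumed safety rule: markdown files under knowledge/, with no empty,
// '.' or '..' segments that could escape the knowledge root.
function isSafeKnowledgePath(path: string): boolean {
  const segments = path.split('/')
  return (
    path.startsWith('knowledge/') &&
    path.endsWith('.md') &&
    !segments.some((s) => s === '' || s === '.' || s === '..')
  )
}

const proposal = [
  '---FILE: knowledge/support/refund-policy.md---',
  '---',
  'id: refund-policy',
  'title: Refund Policy',
  '---',
  '# Refund Policy',
  'Grounded summary written by the researcher.',
  '---END FILE---',
].join('\n')

const safeBlocks = parseWriteBlocksSketch(proposal).filter((b) => isSafeKnowledgePath(b.path))
```

Filtering before writing keeps a malformed or adversarial proposal, such as one targeting `knowledge/../secrets.md`, from reaching the filesystem.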

For full agent-eval control-loop integration, use createKnowledgeControlLoopAdapter() and provide decide yourself:

import { runAgentControlLoop } from '@tangle-network/agent-eval'
import { createKnowledgeControlLoopAdapter } from '@tangle-network/agent-knowledge'

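// readinessSpecs, proposerAgent, reviewerAgent, and driverPolicy are supplied by your application.
const adapter = createKnowledgeControlLoopAdapter({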
  root: './kb',
  goal: 'Maintain the billing support wiki',
  readinessSpecs,
})

await runAgentControlLoop({
  ...adapter,
  async decide({ state, evals }) {
    if (state.previousSteps.length > 0 && evals.every((e) => e.passed)) {
      return { type: 'stop', pass: true, reason: 'knowledge ready' }
    }
    const proposal = await proposerAgent(state)
    const review = await reviewerAgent({ ...state, proposal })
    return {
      type: 'continue',
      reason: review.summary,
      action: driverPolicy({ proposal, review }),
    }
  },
})

Researcher profile

@tangle-network/agent-knowledge/profiles ships a sandbox-SDK AgentProfile preset for source-grounded research agents. It pairs with runLoop from @tangle-network/agent-runtime/loops: the profile owns the prompt, output adapter, and validator; the kernel owns iteration, concurrency, cost, and trace emission.

import { runLoop } from '@tangle-network/agent-runtime/loops'
import { multiHarnessResearcherFanout } from '@tangle-network/agent-knowledge/profiles'

const research = multiHarnessResearcherFanout({
  harnesses: ['opencode/zai-coding-plan/glm-5.1', 'claude-code', 'codex'],
})

const result = await runLoop({
  driver: research.driver,
  agentRuns: research.agentRuns,
  output: research.output,
  validator: research.validator,
  task: {
    question: 'What content does cpg-founder ICP engage with on Twitter?',
    knowledgeNamespace: 'cust_42',
    sources: ['twitter', 'web'],
    maxItems: 20,
    minConfidence: 0.6,
  },
  ctx: { sandboxClient }, // sandboxClient is created by your application
})

if (result.winner?.verdict?.valid) {
  // result.winner.output.proposedWrites: KnowledgeUpdate[]
  // The profile does NOT materialize. Decide whether to apply.
  for (const write of result.winner.output.proposedWrites) {
    // route through applyKnowledgeWriteBlocks / a KbStore put when ready
  }
}

Three invariants are enforced by the validator:

  • Namespace isolation: every KnowledgeItem and KnowledgeUpdate must carry task.knowledgeNamespace. Cross-tenant writes hard-fail.
  • Provenance: every item carries at least one evidence entry.
  • Citation density: the ratio of quotes-with-source to items must be at least 0.7 by default.
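As an illustration of the three invariants above, here is a minimal sketch of a checker. The `ItemSketch` and `EvidenceRef` shapes are hypothetical, and citation density is read here as "items with at least one sourced quote, divided by all items", which is only one plausible interpretation of the rule.

```typescript
// Hypothetical shapes; the profile's real KnowledgeItem type is not shown in this README.
interface EvidenceRef {
  sourceUrl: string
  quote?: string
}

interface ItemSketch {
  namespace: string
  claim: string
  evidence: EvidenceRef[]
}

function checkInvariantsSketch(
  items: ItemSketch[],
  taskNamespace: string,
  minCitationDensity = 0.7,
): string[] {
  const errors: string[] = []
  for (const item of items) {
    // Namespace isolation: every item must belong to the task's namespace.
    if (item.namespace !== taskNamespace) errors.push(`namespace mismatch: ${item.claim}`)
    // Provenance: at least one evidence entry per item.
    if (item.evidence.length === 0) errors.push(`no evidence: ${item.claim}`)
  }
  // Citation density (assumed reading): share of items backed by a sourced quote.
  const quoted = items.filter((i) =>
    i.evidence.some((e) => e.quote !== undefined && e.quote !== ''),
  ).length
  const density = items.length === 0 ? 0 : quoted / items.length
  if (density < minCitationDensity) errors.push(`citation density ${density.toFixed(2)} below threshold`)
  return errors
}

const violations = checkInvariantsSketch(
  [
    { namespace: 'cust_42', claim: 'A', evidence: [{ sourceUrl: 'https://example.com/a', quote: 'quoted text' }] },
    { namespace: 'cust_42', claim: 'B', evidence: [{ sourceUrl: 'https://example.com/b', quote: 'quoted text' }] },
    { namespace: 'cust_99', claim: 'C', evidence: [] },
  ],
  'cust_42',
)
```

Item C trips all three rules at once: wrong namespace, no evidence, and it drags the density to 2/3, below the 0.7 default.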

Validator scoring (default; overridable):

score = 0.4 · citation_density
      + 0.2 · source_diversity
      + 0.2 · recency_match
      + 0.2 · gap_coverage
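The default formula above is a plain weighted sum. A small sketch, assuming each component is already a number in [0, 1]; the `ScoreInputs` field names and the way weights are overridden here are illustrative, not the profile's actual option names.

```typescript
interface ScoreInputs {
  citationDensity: number
  sourceDiversity: number
  recencyMatch: number
  gapCoverage: number
}

// Default weights taken from the formula above.
const DEFAULT_WEIGHTS: ScoreInputs = {
  citationDensity: 0.4,
  sourceDiversity: 0.2,
  recencyMatch: 0.2,
  gapCoverage: 0.2,
}

function validatorScoreSketch(inputs: ScoreInputs, weights: ScoreInputs = DEFAULT_WEIGHTS): number {
  return (
    weights.citationDensity * inputs.citationDensity +
    weights.sourceDiversity * inputs.sourceDiversity +
    weights.recencyMatch * inputs.recencyMatch +
    weights.gapCoverage * inputs.gapCoverage
  )
}

// 0.4 * 0.8 + 0.2 * 0.5 + 0.2 * 1.0 + 0.2 * 0.25, roughly 0.67
const exampleScore = validatorScoreSketch({
  citationDensity: 0.8,
  sourceDiversity: 0.5,
  recencyMatch: 1.0,
  gapCoverage: 0.25,
})
```

Because citation density carries twice the weight of any other term, a run with weak sourcing cannot score well on diversity or recency alone.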

The output preserves what the agent produced: items, citations, and proposedWrites are typed, while gaps, notes, and any extras the agent emitted land in raw instead of being dropped.

Pluggable Knowledge Sources

Static knowledge rots. Authorities like Cornell LII, the IRS, and state Secretaries of State change without warning — a ruling vacates an FTC non-compete rule, a CFR section renumbers, a state replaces Beverly-Killea with RULLCA. The @tangle-network/agent-knowledge/sources subpath ships three primitives that bridge "live authority" → "eval re-runs":

  • KnowledgeSource — pluggable contract (fetch(opts) → KnowledgeFragment[]). Every fragment carries provenance (URL, source-attested timestamp, jurisdiction, verifiable flag) and dimensionHints (which eval dimensions a change in this fragment should re-score).
  • KnowledgeFreshnessStore — per-(workspaceId, sourceId) last-refresh tracker. Filesystem adapter ships in-package; D1 / Postgres adapter scaffold is shipped as createD1FreshnessStoreStub(adapter).
  • detectChanges(prev, next) — diffs two fragment snapshots, emits KnowledgeChange[] tagged with the affected eval dimensions so a cron scheduler knows exactly which campaigns to re-run.
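To show how a snapshot diff can drive selective re-runs, here is a simplified sketch. It compares fragments by id and raw text and carries each fragment's `dimensionHints` onto the change. `FragmentSketch`, `ChangeSketch`, `detectChangesSketch`, and `dimensionsToRescore` are hypothetical; the shipped `detectChanges` may hash content, track provenance, or classify changes differently.

```typescript
// Simplified, hypothetical shapes; real fragments also carry provenance.
interface FragmentSketch {
  id: string
  text: string
  dimensionHints: string[]
}

interface ChangeSketch {
  fragmentId: string
  kind: 'added' | 'removed' | 'modified'
  dimensions: string[]
}

function detectChangesSketch(
  prev: FragmentSketch[],
  next: FragmentSketch[],
): { changes: ChangeSketch[] } {
  const before = new Map<string, FragmentSketch>()
  for (const f of prev) before.set(f.id, f)

  const changes: ChangeSketch[] = []
  const seen = new Set<string>()
  for (const f of next) {
    seen.add(f.id)
    const old = before.get(f.id)
    if (old === undefined) {
      changes.push({ fragmentId: f.id, kind: 'added', dimensions: f.dimensionHints })
    } else if (old.text !== f.text) {
      changes.push({ fragmentId: f.id, kind: 'modified', dimensions: f.dimensionHints })
    }
  }
  for (const f of prev) {
    if (!seen.has(f.id)) changes.push({ fragmentId: f.id, kind: 'removed', dimensions: f.dimensionHints })
  }
  return { changes }
}

// Collapse changes into the unique set of eval dimensions a scheduler should re-run.
function dimensionsToRescore(changes: ChangeSketch[]): string[] {
  const unique = new Set<string>()
  for (const c of changes) for (const d of c.dimensions) unique.add(d)
  return Array.from(unique).sort()
}

const prevSnapshot: FragmentSketch[] = [
  { id: 'wex/restraint_of_trade', text: 'old entry', dimensionHints: ['jurisdictional_accuracy'] },
  { id: 'uscode/18/1836', text: 'statute text', dimensionHints: [] },
]
const nextSnapshot: FragmentSketch[] = [
  { id: 'wex/restraint_of_trade', text: 'updated entry', dimensionHints: ['jurisdictional_accuracy'] },
  { id: 'uscode/18/1836', text: 'statute text', dimensionHints: [] },
]
const diff = detectChangesSketch(prevSnapshot, nextSnapshot)
const rescore = dimensionsToRescore(diff.changes)
```

Only the modified Wex fragment produces a change, so only campaigns tagged `jurisdictional_accuracy` would need to run again.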

Three concrete sources ship in-package:

import {
  createCornellLiiSource,
  createIrsPublicationsSource,
  createStateSosSource,
  createFileSystemFreshnessStore,
  detectChanges,
  type KnowledgeChange,
  type KnowledgeFragment,
} from '@tangle-network/agent-knowledge'

const sources = [
  // Federal statutes + Wex encyclopedia from law.cornell.edu.
  createCornellLiiSource({
    selectors: [
      { kind: 'uscode', path: '18/1836' },               // DTSA
      { kind: 'wex', path: 'restraint_of_trade', dimensionHints: ['jurisdictional_accuracy'] },
    ],
  }),
  // IRS publications index + named publications + revenue procedures.
  createIrsPublicationsSource({
    publications: ['p15', 'p17', 'p463'],
    revenueProcedures: [],
  }),
  // Generic state SOS adapter — one config per state you need tracked.
  createStateSosSource({
    state: 'CA',
    baseUrl: 'https://www.sos.ca.gov',
    entities: [{
      id: 'business-entities-forms',
      path: '/business-programs/business-entities/forms',
      title: 'CA Business Entities Forms',
      selector: { kind: 'whole' },
    }],
  }),
]

const freshness = createFileSystemFreshnessStore({ root: './kb' })

// Worked example: Cornell LII updates the Wex `restraint_of_trade` entry
// to reflect Ryan LLC v. FTC. The cron tick below detects the change,
// extracts the `jurisdictional_accuracy` dimension hint, and hands it to
// the eval scheduler which re-runs only the campaigns tagged with that
// dimension.
async function tick({ workspaceId, prevSnapshots }: {
  workspaceId: string
  prevSnapshots: Record<string, KnowledgeFragment[]>
}): Promise<KnowledgeChange[]> {
  const allChanges: KnowledgeChange[] = []
  for (const source of sources) {
    const stale = await freshness.stale({
      workspaceId,
      sourceId: source.id,
      ttlMs: 24 * 60 * 60 * 1000,
    })
    if (!stale) continue

    const next = await source.fetch({ cacheDir: './.agent-knowledge/http-cache' })
    const prev = prevSnapshots[source.id] ?? []
    const { changes } = detectChanges(prev, next)
    allChanges.push(...changes)

    await freshness.mark({ workspaceId, sourceId: source.id, when: new Date() })
    prevSnapshots[source.id] = next
  }
  return allChanges
}
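The `stale` / `mark` calls in the tick above amount to a per-(workspaceId, sourceId) TTL check. A minimal in-memory sketch follows, written synchronously for brevity. `MemoryFreshnessSketch` is hypothetical, the `now` argument is an addition to make the example deterministic, and the exact boundary behavior of the shipped stores is an assumption.

```typescript
// Hypothetical in-memory tracker mirroring the call shapes used in tick().
class MemoryFreshnessSketch {
  private readonly lastRefresh = new Map<string, number>()

  private key(workspaceId: string, sourceId: string): string {
    return JSON.stringify([workspaceId, sourceId])
  }

  mark(args: { workspaceId: string; sourceId: string; when: Date }): void {
    this.lastRefresh.set(this.key(args.workspaceId, args.sourceId), args.when.getTime())
  }

  // `now` is an illustrative extra so the check can be tested without a real clock.
  stale(args: { workspaceId: string; sourceId: string; ttlMs: number; now?: Date }): boolean {
    const last = this.lastRefresh.get(this.key(args.workspaceId, args.sourceId))
    if (last === undefined) return true // never refreshed counts as stale
    const now = (args.now ?? new Date()).getTime()
    return now - last >= args.ttlMs
  }
}

const DAY_MS = 24 * 60 * 60 * 1000
const HOUR_MS = 60 * 60 * 1000
const t0 = new Date('2025-01-01T00:00:00Z')
const store = new MemoryFreshnessSketch()

const neverSeen = store.stale({ workspaceId: 'ws1', sourceId: 'irs', ttlMs: DAY_MS, now: t0 })
store.mark({ workspaceId: 'ws1', sourceId: 'irs', when: t0 })
const afterOneHour = store.stale({
  workspaceId: 'ws1', sourceId: 'irs', ttlMs: DAY_MS, now: new Date(t0.getTime() + HOUR_MS),
})
const afterTwentyFiveHours = store.stale({
  workspaceId: 'ws1', sourceId: 'irs', ttlMs: DAY_MS, now: new Date(t0.getTime() + 25 * HOUR_MS),
})
const otherWorkspace = store.stale({ workspaceId: 'ws2', sourceId: 'irs', ttlMs: DAY_MS, now: t0 })
```

Keying on both identifiers means one workspace refreshing a source does not suppress the refresh for another workspace.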

Polite-by-default: every HTTP fetch carries the package User-Agent, is throttled to 1 req/sec/origin, caches successful responses to disk, and marks verifiable: false on block pages / 4xx rather than promoting un-grounded content. See src/sources/http.ts for the invariants.
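The per-origin throttle can be pictured as a small reservation table. The sketch below is not the code in src/sources/http.ts; `OriginThrottleSketch` and `originOf` are hypothetical names, and the clock is passed in explicitly so the behavior is easy to reason about.

```typescript
// Extract scheme://host[:port] without relying on a URL implementation.
function originOf(url: string): string {
  const match = /^([a-z][a-z0-9+.-]*:\/\/[^/?#]+)/i.exec(url)
  return match?.[1]?.toLowerCase() ?? url
}

// Hypothetical throttle: at most one request per interval per origin.
class OriginThrottleSketch {
  private readonly nextAllowedAt = new Map<string, number>()
  private readonly minIntervalMs: number

  constructor(minIntervalMs = 1000) {
    this.minIntervalMs = minIntervalMs
  }

  // Reserves the next send slot for the URL's origin and returns how many
  // milliseconds the caller should wait before sending.
  reserve(url: string, nowMs: number): number {
    const origin = originOf(url)
    const earliest = this.nextAllowedAt.get(origin) ?? nowMs
    const sendAt = Math.max(earliest, nowMs)
    this.nextAllowedAt.set(origin, sendAt + this.minIntervalMs)
    return sendAt - nowMs
  }
}

const throttle = new OriginThrottleSketch()
const firstWait = throttle.reserve('https://www.law.cornell.edu/uscode/text/18/1836', 0)
const secondWait = throttle.reserve('https://www.law.cornell.edu/wex/restraint_of_trade', 200)
const otherOriginWait = throttle.reserve('https://www.irs.gov/publications', 200)
const laterWait = throttle.reserve('https://www.law.cornell.edu/wex', 2500)
```

A caller would sleep for the returned number of milliseconds before issuing the request; requests to different origins never delay each other.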
