feat(fts)!: make v2 the default index format #7512

Merged: BubbleCal merged 3 commits into main from yang/fts-v2-default-index-param on Jun 30, 2026
@BubbleCal (Contributor) commented Jun 29, 2026

Feature

What is the new feature?

This makes FTS v2 the default format for newly-created inverted / full-text indexes and replaces the previous environment-variable switch with an explicit format_version index creation parameter.

Why do we need this feature?

FTS v2 is the latest format, but format selection should be part of the index creation API instead of process-wide environment state. Existing v1 indexes must remain queryable and must continue to be maintained as v1 after append, incremental indexing, and optimize.

How does it work?

  • Defaults new FTS index creation to v2.
  • Adds explicit format_version handling in Rust params and exposes it through Python and Java index creation APIs.
  • Preserves the stored FTS format when deriving params from existing indexes.
  • Ensures mem-wal / maintained-index paths use the resolved index format instead of the default.
  • Restores the FTS format from IndexMetadata.index_version when rebuilding the mem-wal FTS config: index_version 0 (legacy) and 1 both map to v1, and 2 maps to v2.
  • Updates compatibility tests so old wheels can still use LANCE_FTS_FORMAT_VERSION while new code uses explicit format_version.
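
The version mapping and the "keep the stored format during maintenance" rule in the list above can be sketched as follows. This is a minimal illustration in Python, not the PR's Rust implementation: the names FtsFormat, format_from_index_version, and resolve_maintenance_format are hypothetical, and raising ValueError for an unrecognized index_version is an assumption, since the description does not say how unknown values are handled.

```python
from enum import IntEnum


class FtsFormat(IntEnum):
    """Hypothetical stand-in for the FTS index format (not a Lance type)."""
    V1 = 1
    V2 = 2


def format_from_index_version(index_version: int) -> FtsFormat:
    """Map a stored index_version to an FTS format.

    Per the PR description: legacy 0 and 1 both mean v1, 2 means v2.
    Rejecting other values is an assumption made for this sketch.
    """
    if index_version in (0, 1):
        return FtsFormat.V1
    if index_version == 2:
        return FtsFormat.V2
    raise ValueError(f"unrecognized FTS index_version: {index_version}")


def resolve_maintenance_format(stored_index_version: int) -> FtsFormat:
    """Pick the format for append / incremental indexing / optimize.

    Maintenance uses the format recorded on the existing index,
    never the creation-time default, so a v1 index stays v1.
    """
    return format_from_index_version(stored_index_version)


if __name__ == "__main__":
    for version in (0, 1, 2):
        print(version, "->", resolve_maintenance_format(version).name)
```

Keeping the mapping in one function means every maintained-index path resolves the format the same way, which is the property the review comment below asks to be covered end to end.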

Breaking Change

BREAKING CHANGE: Newly-created FTS / inverted indexes now default to v2 instead of v1. Workflows that require the v1 layout, including compatibility with older Lance readers, must pass format_version=1; LANCE_FTS_FORMAT_VERSION no longer controls new Lance index creation.
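
To make the new creation-time behavior concrete, here is a small self-contained sketch of the resolution rule described above. It does not call Lance: resolve_creation_format is a hypothetical helper, and rejecting values other than 1 and 2 is an assumption. Only the parameter name format_version, the default of v2, and the fact that LANCE_FTS_FORMAT_VERSION is no longer consulted come from the PR description.

```python
import os
from typing import Optional

SUPPORTED_FORMATS = (1, 2)   # v1 and v2, as named in the PR description
DEFAULT_FORMAT = 2           # new default for newly-created FTS indexes


def resolve_creation_format(format_version: Optional[int] = None) -> int:
    """Resolve the format for a newly-created FTS index (hypothetical helper).

    The explicit parameter is the only input; the process environment
    is deliberately not read.
    """
    if format_version is None:
        return DEFAULT_FORMAT
    if format_version not in SUPPORTED_FORMATS:
        # Assumption: unsupported values are an error rather than a fallback.
        raise ValueError(f"unsupported FTS format_version: {format_version}")
    return format_version


if __name__ == "__main__":
    # Setting the old environment variable has no effect on the result.
    os.environ["LANCE_FTS_FORMAT_VERSION"] = "1"
    print(resolve_creation_format())                  # default: v2
    print(resolve_creation_format(format_version=1))  # explicit opt-in to v1
```

Callers that still need the v1 layout for older readers would pass format_version=1 at every index creation site rather than relying on process-wide state.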

Compatibility

Existing v1 FTS indexes remain readable. The regression coverage includes explicit v1 append + optimize_indices(OptimizeOptions::append()) and verifies the resulting FTS index metadata remains v1.

Verification

  • GitHub Actions: 30/30 checks passed on commit e6a2ca759.
  • git diff --check passed locally.
  • Required local Rust / Python / Java checks could not run in this environment because cargo, uv, and a Java Runtime are not installed locally; CI covered the PR checks.

@github-actions bot added the A-python, A-index, A-java, and enhancement labels and removed the A-python, A-index, and A-java labels on Jun 29, 2026
@github-actions bot added the A-python, A-index, and A-java labels on Jun 29, 2026
@BubbleCal BubbleCal marked this pull request as ready for review June 29, 2026 14:12
@BubbleCal (Contributor, Author) commented

@claude review

@BubbleCal changed the title from "feat(fts): make v2 the default index format" to "feat(fts)!: make v2 the default index format" on Jun 29, 2026

@Xuanwo (Collaborator) left a comment

The mem-wal maintained-index path should have end-to-end v1 coverage before this lands. The key compatibility contract is that an existing v1 FTS index remains v1 after maintenance; if this path accidentally flushes the maintained index as v2, FTS queries may still pass while older readers lose compatibility.

@BubbleCal BubbleCal requested a review from Xuanwo June 30, 2026 04:54

@Xuanwo (Collaborator) left a comment

Thank you for working on this. We will need a breaking-change upgrade note. That can be a follow-up PR.

@BubbleCal BubbleCal merged commit c57864c into main Jun 30, 2026
30 checks passed
@BubbleCal BubbleCal deleted the yang/fts-v2-default-index-param branch June 30, 2026 07:18

Labels: A-index, A-java, A-python, breaking-change, enhancement
