@wilbertharriman
vlasky added a commit to vlasky/sqlite-vec that referenced this pull request Nov 28, 2025
Implements a custom 'optimize' command (similar to SQLite FTS5) that allows
reclaiming disk space after DELETE operations:

  INSERT INTO vec_table(vec_table) VALUES ('optimize');
  VACUUM;

How it works:
- Identifies fragmented chunks from deletions
- Migrates all vectors to new, contiguous chunks
- Preserves partition keys and metadata during migration
- Deletes old fragmented chunks
- Allows VACUUM to reclaim freed disk space
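The migration steps above can be pictured with a toy model. This is only a sketch under stated assumptions: the chunk size, the slot layout, and the `compact` helper are hypothetical and do not reflect the actual sqlite-vec chunk format, and partition keys are ignored for brevity.

```python
# Toy model only: CHUNK_SIZE, the slot layout, and compact() are illustrative
# assumptions, not the sqlite-vec implementation.
CHUNK_SIZE = 4  # hypothetical number of slots per chunk


def compact(chunks):
    """Copy live entries from fragmented chunks into new, densely packed chunks.

    Each chunk is a list of slots. A slot is either None (left behind by a
    DELETE) or a (rowid, vector, metadata) tuple.
    """
    new_chunks = []
    for chunk in chunks:
        for slot in chunk:
            if slot is None:
                continue  # skip holes left by deleted rows
            if not new_chunks or len(new_chunks[-1]) == CHUNK_SIZE:
                new_chunks.append([])  # start a fresh contiguous chunk
            new_chunks[-1].append(slot)  # vector and metadata move together
    return new_chunks


fragmented = [
    [(1, [0.1, 0.2], "a"), None, None, (4, [0.3, 0.4], "b")],
    [None, (6, [0.5, 0.6], "c"), None, None],
    [(9, [0.7, 0.8], "d"), (10, [0.9, 1.0], "e"), None, None],
]
compacted = compact(fragmented)
live_rowids = [slot[0] for chunk in compacted for slot in chunk]
```

In this model three half-empty chunks collapse into two, and the old chunks can then be dropped so that VACUUM has free pages to reclaim.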

Implementation details:
- Adds hidden 'table_name' column to trigger special insert commands
- vec0Update_SpecialInsert_Optimize(): Main optimization logic
  - Iterates all rows and copies to new chunks
  - Copies metadata values to new chunk positions
  - Cleans up old chunks and vector data
- vec0Update_SpecialInsert_OptimizeCopyMetadata(): Handles metadata migration

Schema improvements:
- Change PRIMARY KEY → INTEGER PRIMARY KEY in shadow tables
- Makes the key column an alias for rowid instead of a separately indexed column
- Reduces storage overhead and improves performance
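The effect of the schema change can be checked with plain SQLite. The sketch below uses hypothetical table names (`chunks_old`, `chunks_new`) rather than the real shadow tables; it relies only on standard SQLite behavior, where an untyped `PRIMARY KEY` column gets an automatic index while `INTEGER PRIMARY KEY` aliases the rowid.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical stand-ins for a shadow table before and after the change.
db.execute("CREATE TABLE chunks_old(id PRIMARY KEY, vectors BLOB NOT NULL)")
db.execute("CREATE TABLE chunks_new(id INTEGER PRIMARY KEY, vectors BLOB NOT NULL)")


def index_names(table):
    """Return the names of all indexes SQLite keeps for the given table."""
    rows = db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (table,),
    ).fetchall()
    return [name for (name,) in rows]


old_indexes = index_names("chunks_old")  # automatic index backing the key
new_indexes = index_names("chunks_new")  # no index: id is the rowid itself
print(old_indexes, new_indexes)
```

With the untyped key, every row is stored twice (once in the table b-tree, once in the automatic index); with the rowid alias, lookups go straight to the table b-tree.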

Use cases:
- After bulk deletions to reclaim disk space
- Periodic maintenance to defragment vector storage
- Before backups to minimize database file size

Caveats:
- Can be slow on large tables (rebuilds all chunks)
- Should be run during maintenance windows
- Not safe to run while other connections are reading the table
- Requires VACUUM afterward to actually free space
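The last caveat follows from how SQLite handles freed pages: with the default `auto_vacuum` setting, deleted pages go onto a free list and the file keeps its size until VACUUM rebuilds it. A minimal sketch with an ordinary table (the `blobs` table and file name are made up for illustration; no vec0 table is involved):

```python
import os
import sqlite3
import tempfile

# Hypothetical scratch database in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE blobs(id INTEGER PRIMARY KEY, data BLOB)")
db.executemany(
    "INSERT INTO blobs(data) VALUES (?)",
    [(b"x" * 4096,) for _ in range(500)],
)
db.commit()
size_full = os.path.getsize(path)

# Deleting rows only moves their pages to the free list.
db.execute("DELETE FROM blobs")
db.commit()
size_after_delete = os.path.getsize(path)

# VACUUM rewrites the database without the free pages.
# It must run outside a transaction, hence the commit above.
db.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
db.close()

print(size_full, size_after_delete, size_after_vacuum)
```

The same reasoning applies after 'optimize': dropping the old chunks frees pages, but the file only shrinks once VACUUM runs.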

Merged from upstream PR asg017#210 by wilbertharriman.
Fixes issue asg017#185.

Co-Authored-By: Wilbert Harriman <[email protected]>