Slow performance and many small segments

I'm running Tantivy alongside a SQLite database to augment it with (better) full-text search. Whenever the app alters a record in the database, it creates a new index writer, issues the corresponding `delete_term` and `add_document` calls against the Tantivy index, then commits and drops the writer.
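As a rough illustration of why this pattern can pile up segments, here is a toy model in plain Rust. It does not call Tantivy at all: `segments_created` is a hypothetical helper, and it assumes that every commit containing changes writes exactly one new segment and that no background merging happens. Both are simplifications, and the batch size of 100 is an arbitrary example value.

```rust
/// Toy model (not Tantivy): number of segments written when `changes`
/// record updates are committed in groups of `batch_size`.
/// Assumes one new segment per non-empty commit and no merging.
fn segments_created(changes: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch_size must be at least 1");
    // Ceiling division: a final partial batch still needs its own commit.
    (changes + batch_size - 1) / batch_size
}

fn main() {
    let changes = 2_000; // same order of magnitude as the segment count I'm seeing

    // One commit per altered record, as described above.
    println!("commit per record: {} segments", segments_created(changes, 1));

    // Hypothetical alternative: buffer 100 changes before each commit.
    println!("commit per 100:    {} segments", segments_created(changes, 100));
}
```

Under those assumptions, the segment count tracks the number of commits rather than the number of documents.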

Over time, although the SQLite database is only 93 MB, the Tantivy index has grown to roughly 2,000 individual segments averaging ~26 KB each, and loading them all from a cold cache takes several seconds on my ZFS drives.
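To put those figures in perspective, here is a back-of-the-envelope calculation. It treats "2k" as exactly 2,000 segments and uses decimal units (1 KB = 1,000 bytes), so the result is only an estimate; `total_bytes` is a helper made up for this sketch.

```rust
/// Estimated total size in bytes of `segments` segments of `avg_bytes` each.
fn total_bytes(segments: u64, avg_bytes: u64) -> u64 {
    segments * avg_bytes
}

fn main() {
    // ~2,000 segments at ~26 KB each (approximate figures from this report).
    let fragmented = total_bytes(2_000, 26_000);
    println!("estimated index size: {} MB", fragmented / 1_000_000);
}
```

That puts the whole index at around 52 MB, spread across thousands of small segments.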

After deleting the index and recreating it from the data in the database, I can get it down to 15 segments, each around 1MB, but that's not a workaround I want to apply often. I'm not applying any custom options to the index writer, and it should be using the default merger.

#194 mentions this problem as well, but has been open for a long time.

Is there a better way to index small numbers of documents like this without having to constantly rebuild the index?

Slow performance and many small segments #2932
