Merged
3 changes: 2 additions & 1 deletion docs.json
@@ -288,7 +288,8 @@
"learn/indexing/indexing_best_practices",
"learn/indexing/ram_multithreading_performance",
"learn/indexing/tokenization",
"learn/indexing/multilingual-datasets"
"learn/indexing/multilingual-datasets",
"learn/indexing/optimize_indexing_performance"
]
},
{
119 changes: 119 additions & 0 deletions learn/indexing/optimize_indexing_performance.mdx
@@ -0,0 +1,119 @@
---
title: Optimize indexing performance with batch statistics
description: Learn how to analyze the `progressTrace` to identify and resolve indexing bottlenecks in Meilisearch.
---

# Optimize indexing performance by analyzing batch statistics

Indexing performance can vary significantly depending on your dataset, index settings, and hardware. The [batch object](/reference/api/batches) provides information about the progress of asynchronous indexing operations.

The `progressTrace` field within the batch object offers a detailed breakdown of where time is spent during the indexing process. Use this data to identify bottlenecks and improve indexing speed.
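
As a rough illustration of how the trace could be pulled out of a batch object for inspection, here is a minimal Python sketch. The `/batches/{uid}` route, the `Authorization: Bearer` header, and the nesting of `progressTrace` under a `stats` key are assumptions made for this example; check them against the batches API reference. `fetch_batch` and `progress_trace_of` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_batch(base_url, api_key, batch_uid):
    """GET a single batch object (route and auth header are assumptions)."""
    request = urllib.request.Request(
        f"{base_url}/batches/{batch_uid}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def progress_trace_of(batch):
    """Return the trace as a dict, or {} when the batch carries none."""
    # Assumption: the trace is nested under "stats"; fall back to the top level.
    stats = batch.get("stats") or {}
    return stats.get("progressTrace") or batch.get("progressTrace") or {}
```

With a running instance, something like `progress_trace_of(fetch_batch("MEILISEARCH_URL", "API_KEY", 0))` would then yield the mapping of step names to durations.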

## Understanding the `progressTrace`

`progressTrace` is a hierarchical trace showing each phase of indexing and how long it took.
Each entry follows the structure:

```json
"processing tasks > indexing > extracting word proximity": "33.71s"
```

This means:

- The step occurred during **indexing**.
- The subtask was **extracting word proximity**.
- It took **33.71 seconds**.

Focus on the **longest-running steps** and investigate which index settings or data characteristics influence them.
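
To make "longest-running steps" concrete, the following Python sketch ranks trace entries by duration. It is an illustration under stated assumptions, not part of Meilisearch: the sample values are invented, the unit suffixes (`s`, `ms`, `µs`, `ns`) are assumed, and parent entries are assumed to include the time of their children, which is why only leaf steps are ranked. `parse_duration` and `slowest_steps` are hypothetical helper names.

```python
import re

# Multipliers to seconds. The set of unit suffixes is an assumption.
_UNITS = {"s": 1.0, "ms": 1e-3, "µs": 1e-6, "us": 1e-6, "ns": 1e-9}


def parse_duration(text):
    """Convert a duration string such as "33.71s" to seconds."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)\s*(s|ms|µs|us|ns)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def slowest_steps(progress_trace, limit=5):
    """Return up to `limit` leaf steps as (name, seconds), longest first."""
    keys = list(progress_trace)
    # A key is a leaf when no other key extends it with " > ".
    leaves = [k for k in keys if not any(o.startswith(k + " > ") for o in keys)]
    timed = [(k, parse_duration(progress_trace[k])) for k in leaves]
    return sorted(timed, key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical trace; only the word proximity entry comes from the text above.
trace = {
    "processing tasks": "40.10s",
    "processing tasks > indexing": "39.90s",
    "processing tasks > indexing > extracting word proximity": "33.71s",
    "processing tasks > indexing > extracting words": "4.50s",
    "processing tasks > indexing > waiting for database writes": "850.00ms",
}

for name, seconds in slowest_steps(trace, limit=3):
    print(f"{seconds:9.2f}s  {name}")
```

Ranking leaves only keeps umbrella entries such as `processing tasks > indexing` from crowding out the steps you can actually act on.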

## Key phases and how to optimize them

### `computing document changes` and `extracting documents`

| Description | Optimization |
|--------------|--------------|
| Meilisearch compares incoming documents to existing ones. | No direct optimization possible. Process duration scales with the number and size of incoming documents.|

### `extracting facets` and `merging facet caches`

| Description | Optimization |
|--------------|--------------|
| Extracts and merges filterable attributes. | Keep the number of [**filterable attributes**](/reference/api/settings#filterable-attributes) to a minimum. |

### `extracting words` and `merging word caches`

| Description | Optimization |
|--------------|--------------|
| Tokenizes text and builds the inverted index. | Ensure the [searchable attributes](/reference/api/settings#searchable-attributes) list only includes the fields you want to be checked for query word matches. |

### `extracting word proximity` and `merging word proximity`

| Description | Optimization |
|--------------|--------------|
| Builds data structures for phrase and attribute ranking. | Lower the precision of this operation by setting [proximity precision](/reference/api/settings#proximity-precision) to `byAttribute`. |

### `waiting for database writes`

| Description | Optimization |
|--------------|--------------|
| Time spent writing data to disk. | No direct optimization possible. Either the disk is too slow or you are writing too much data in a single operation. Avoid hard disk drives (HDDs). |

### `waiting for extractors`

| Description | Optimization |
|--------------|--------------|
| Time spent waiting for CPU-bound extraction. | No direct optimization possible. Indicates a CPU bottleneck. Use more cores or scale horizontally with [sharding](/learn/advanced/sharding). |

### `post processing facets > strings bulk` / `numbers bulk`

| Description | Optimization |
|--------------|--------------|
| Processes equality or comparison filters. | - Disable unused [**filter features**](/reference/api/settings#features), such as comparison operators on string values. <br /> - Reduce the number of [**sortable attributes**](/reference/api/settings#sortable-attributes). |

### `post processing facets > facet search`

| Description | Optimization |
|--------------|--------------|
| Builds structures for the [facet search API](/reference/api/facet_search). | If you don’t use the facet search API, [disable it](/reference/api/settings#update-facet-search-settings).|

### Embeddings

| Trace key | Description | Optimization |
|------------|--------------|--------------|
| `writing embeddings to database` | Time spent saving vector embeddings. | - Use embedding vectors with fewer dimensions. <br /> - Consider [disabling embedding regeneration on document update](/reference/api/documents#vectors). <br /> - Consider enabling [binary quantization](/reference/api/settings#binaryquantized). |

### `post processing words > word prefix *`

| Description | Optimization |
|--------------|--------------|
| Builds prefix data for autocomplete. Allows matching documents that begin with a specific query term, instead of only exact matches. | Disable [**prefix search**](/reference/api/settings#prefix-search) (`prefixSearch: disabled`). _This can severely impact search result relevancy._ |

### `post processing words > word fst`

| Description | Optimization |
|--------------|--------------|
| Builds the word FST (finite state transducer). | No direct action possible, as FST size reflects the number of different words in the database. Using documents with fewer searchable words may improve operation speed. |
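
The tables above can be condensed into a small lookup that maps a trace key to the setting worth reviewing. This Python sketch is only a convenience built from the recommendations on this page; `HINTS` and `suggestion_for` are hypothetical names, and matching on key fragments is an assumption about how trace keys are spelled.

```python
# (fragment of the trace key, suggestion) pairs, paraphrased from the tables above.
HINTS = [
    ("extracting facets", "keep filterable attributes to a minimum"),
    ("merging facet caches", "keep filterable attributes to a minimum"),
    ("extracting words", "trim the searchable attributes list"),
    ("merging word caches", "trim the searchable attributes list"),
    ("extracting word proximity", "set proximity precision to byAttribute"),
    ("merging word proximity", "set proximity precision to byAttribute"),
    ("facet search", "disable facet search if you do not use it"),
    ("strings bulk", "disable unused filter features; reduce sortable attributes"),
    ("numbers bulk", "disable unused filter features; reduce sortable attributes"),
    ("writing embeddings to database", "use fewer dimensions or binary quantization"),
    ("word prefix", "disable prefix search (can hurt relevancy)"),
]


def suggestion_for(step):
    """Return the suggestion for a trace key, or None if no setting applies."""
    for fragment, advice in HINTS:
        if fragment in step:
            return advice
    # Phases such as "waiting for extractors" have no direct setting.
    return None


print(suggestion_for("processing tasks > indexing > extracting word proximity"))
```

Steps that return `None` correspond to the phases listed above as having no direct optimization, where hardware or batch size is the lever rather than an index setting.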

## Example analysis

If you see:

```json
"processing tasks > indexing > post processing facets > facet search": "1763.06s"
```

[Facet searching](/learn/filtering_and_sorting/search_with_facet_filters#searching-facet-values) is taking significant indexing time. If your application doesn’t use facet search, disable the feature:

```bash
curl \
-X PUT 'MEILISEARCH_URL/indexes/INDEX_UID/settings/facet-search' \
-H 'Content-Type: application/json' \
--data-binary 'false'
```

## Learn more

- [Indexing best practices](/learn/indexing/indexing_best_practices)
- [Impact of RAM and multi-threading on indexing performance](/learn/indexing/ram_multithreading_performance)
- [Configuring index settings](/learn/configuration/configuring_index_settings)
7 changes: 7 additions & 0 deletions snippets/samples/code_samples_compact_index_1.mdx
@@ -0,0 +1,7 @@
<CodeGroup>

```bash cURL
curl \
-X POST 'MEILISEARCH_URL/indexes/INDEX_UID/compact'
```
</CodeGroup>
4 changes: 4 additions & 0 deletions snippets/samples/code_samples_webhooks_delete_1.mdx
@@ -12,4 +12,8 @@ client.deleteWebhook(WEBHOOK_UUID)
```go Go
client.DeleteWebhook("WEBHOOK_UUID");
```

```rust Rust
client.delete_webhook("WEBHOOK_UUID").await.unwrap();
```
</CodeGroup>
4 changes: 4 additions & 0 deletions snippets/samples/code_samples_webhooks_get_1.mdx
@@ -12,4 +12,8 @@ client.getWebhooks()
```go Go
client.ListWebhooks();
```

```rust Rust
let webhooks = client.get_webhooks().await.unwrap();
```
</CodeGroup>
4 changes: 4 additions & 0 deletions snippets/samples/code_samples_webhooks_get_single_1.mdx
@@ -12,4 +12,8 @@ client.getWebhook(WEBHOOK_UUID)
```go Go
client.GetWebhook("WEBHOOK_UUID");
```

```rust Rust
let webhook = client.get_webhook("WEBHOOK_UUID").await.unwrap();
```
</CodeGroup>
9 changes: 9 additions & 0 deletions snippets/samples/code_samples_webhooks_patch_1.mdx
@@ -26,4 +26,13 @@ client.UpdateWebhook("WEBHOOK_UUID", &meilisearch.UpdateWebhookRequest{
},
});
```

```rust Rust
let mut update = meilisearch_sdk::webhooks::WebhookUpdate::new();
update.remove_header("referer");
let webhook = client
.update_webhook("WEBHOOK_UUID", &update)
.await
.unwrap();
```
</CodeGroup>
8 changes: 8 additions & 0 deletions snippets/samples/code_samples_webhooks_post_1.mdx
@@ -32,4 +32,12 @@ client.AddWebhook(&meilisearch.AddWebhookRequest{
},
});
```

```rust Rust
let mut payload = meilisearch_sdk::webhooks::WebhookCreate::new("WEBHOOK_TARGET_URL");
payload
.insert_header("authorization", "SECURITY_KEY")
.insert_header("referer", "https://example.com");
let webhook = client.create_webhook(&payload).await.unwrap();
```
</CodeGroup>