
Commit 1a00ac6

fix for staying under openai limits (#2041)
1 parent 1e6c2ce commit 1a00ac6

File tree

1 file changed: +13 -1 lines changed

vector_search/utils.py

Lines changed: 13 additions & 1 deletion
@@ -262,7 +262,19 @@ def _process_content_embeddings(serialized_content):
             )
             for md in split_metadatas
         ]
-        split_embeddings = list(encoder.embed_documents(split_texts))
+        split_embeddings = []
+        """
+        Break up requests according to chunk size to stay under openai limits
+        600,000 tokens per request
+        max array size: 2048
+        see: https://platform.openai.com/docs/guides/rate-limits
+        """
+        request_chunk_size = int(
+            600000 / settings.CONTENT_FILE_EMBEDDING_CHUNK_SIZE_OVERRIDE
+        )
+        for i in range(0, len(split_texts), request_chunk_size):
+            split_chunk = split_texts[i : i + request_chunk_size]
+            split_embeddings.extend(list(encoder.embed_documents(split_chunk)))
         if len(split_embeddings) > 0:
             resource_points.append(
                 models.PointVectors(
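
For context outside the diff, here is a minimal standalone sketch of the batching approach this commit introduces. It assumes, as the added comment implies, that each split text holds at most the configured chunk size in tokens; the embed_in_batches helper, the encoder argument, and the 512-token CHUNK_SIZE below are illustrative stand-ins, not names or values from this repository.

    # Illustrative sketch only: placeholder names and values, not the
    # repository's actual settings or encoder.

    MAX_TOKENS_PER_REQUEST = 600_000  # OpenAI embeddings: max tokens per request
    CHUNK_SIZE = 512  # stand-in for settings.CONTENT_FILE_EMBEDDING_CHUNK_SIZE_OVERRIDE


    def embed_in_batches(encoder, split_texts):
        """Embed texts in batches that stay under the per-request token limit.

        Each element of split_texts is assumed to hold at most CHUNK_SIZE tokens,
        so a batch of MAX_TOKENS_PER_REQUEST / CHUNK_SIZE texts keeps a single
        encoder.embed_documents() call under the 600,000-token limit.
        """
        request_chunk_size = int(MAX_TOKENS_PER_REQUEST / CHUNK_SIZE)
        embeddings = []
        for i in range(0, len(split_texts), request_chunk_size):
            batch = split_texts[i : i + request_chunk_size]
            embeddings.extend(encoder.embed_documents(batch))
        return embeddings

With a 512-token chunk size this yields batches of 1171 texts per request, which also stays below the 2048-item array limit mentioned in the added comment.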
