ONNX Runtime uses too much memory when I embed large collections of data. Usage keeps accumulating, which I believe is because it isn't freeing unused memory.
Am I missing something, or is this a problem with the runtime itself?
I am trying to embed about 10,000 documents (average size: 3,000 characters) using the Jina AI ColBERT model (a late-interaction model).
GPU: Tesla T4, 16 GB VRAM
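
For a rough sense of scale, here is a back-of-the-envelope sketch of how large the embeddings themselves would be if every output were kept in memory at once. A late-interaction model emits one vector per token rather than one per document, so the outputs alone can be sizable. The figures for characters per token (4), embedding dimension (128), and float32 storage are assumptions for illustration, not measured values, and the function name is hypothetical. This only estimates output size; it does not show where the runtime's memory is actually going.

```python
def estimate_embedding_bytes(num_docs, avg_chars,
                             chars_per_token=4.0,  # assumption: rough English average
                             dim=128,              # assumption: per-token vector size
                             bytes_per_value=4):   # assumption: float32 outputs
    """Estimate bytes needed to hold every per-token embedding at once."""
    tokens_per_doc = int(avg_chars / chars_per_token)
    total_vectors = num_docs * tokens_per_doc
    return total_vectors * dim * bytes_per_value


# Numbers from the post: about 10,000 documents of about 3,000 characters each.
total_bytes = estimate_embedding_bytes(10_000, 3_000)
print(f"{total_bytes / 1024**3:.2f} GiB")  # prints "3.58 GiB"
```

Under these assumptions the outputs come to several GiB, so part of the growth could simply be accumulated results rather than memory the runtime failed to free.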