Hello everyone!
I recently discovered that Ollama has an `embed` function. In this repo there is an example of using it with `llama3.2` (the 3B variant by default). The output of this function is a single vector whose length equals the model's `hidden_size` parameter (= 3072).
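For reference, this is roughly how I am calling it. This is only a minimal sketch against the `/api/embed` REST endpoint of a locally running Ollama server; the example in the repo may use a different client or wrapper:

```python
import requests

# Minimal sketch: ask a local Ollama server for an embedding.
# Assumes Ollama is running on the default port and that the
# llama3.2 model has already been pulled.
resp = requests.post(
    "http://localhost:11434/api/embed",
    json={"model": "llama3.2", "input": "Hello, world!"},
)
resp.raise_for_status()

embedding = resp.json()["embeddings"][0]
print(len(embedding))  # 3072 for the 3B variant, i.e. hidden_size
```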
Does this mean the model is doing more than just passing the input through the network (excluding the output layer), and is instead applying some form of pooling?
Specifically, how is the final hidden representation transformed from shape `[input_tokenized_len, hidden_size]` to `[1, hidden_size]`?
I understand how specialized sentence transformers achieve this, but in a typical decoder-only LLM like LLaMA, the hidden states keep the `[input_tokenized_len, hidden_size]` shape right up to the output layer. Could someone clarify what transformation is applied here?
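To make the question concrete, here is the kind of reduction I have in mind. The two strategies below (mean pooling over tokens, or taking the last token's hidden state) are just my guesses at what might be happening; I have not confirmed either in the Ollama source:

```python
import numpy as np

# Toy stand-in for the final hidden states of a decoder-only model:
# one row per input token, hidden_size columns (3072 for llama3.2 3B).
input_tokenized_len, hidden_size = 7, 3072
hidden_states = np.random.randn(input_tokenized_len, hidden_size)

# Guess 1: mean pooling over the token dimension.
mean_pooled = hidden_states.mean(axis=0, keepdims=True)  # -> (1, 3072)

# Guess 2: take only the last token's hidden state (plausible for a
# causal LM, since that position has attended to the whole input).
last_token = hidden_states[-1:, :]                        # -> (1, 3072)

print(mean_pooled.shape, last_token.shape)
```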