Question about text generation of a GPT model #912
FlyingFish760 started this conversation in General
Replies: 1 comment 4 replies
No worries, that's what the discussions here are for :) This is the typical usage. But it's a good observation: the previous queries are not used, so it is quite wasteful to recompute them. In practice, that's where the KV cache comes into play: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch04/03_kv-cache Does that address your question?
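To build some intuition for why the cache helps, here is a minimal single-head sketch of cached generation. It is only an illustration of the idea, not the implementation in ch04/03_kv-cache; the projection matrices W_q, W_k, W_v and the helper attend_with_cache are hypothetical names used just for this example:

```python
import torch

def attend_with_cache(x_new, W_q, W_k, W_v, cache=None):
    # x_new: (batch, 1, emb_dim) -- the embedding of only the newest token.
    q = x_new @ W_q                                # compute a query for the new position only
    k_new = x_new @ W_k
    v_new = x_new @ W_v
    if cache is None:
        k, v = k_new, v_new
    else:
        k = torch.cat([cache["k"], k_new], dim=1)  # reuse keys from earlier steps
        v = torch.cat([cache["v"], v_new], dim=1)  # reuse values from earlier steps
    cache = {"k": k, "v": v}
    scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
    context = torch.softmax(scores, dim=-1) @ v    # (batch, 1, head_dim)
    return context, cache

# Toy usage: feed one token embedding at a time, carrying the cache along.
torch.manual_seed(0)
emb_dim, head_dim = 8, 4
W_q, W_k, W_v = (torch.randn(emb_dim, head_dim) for _ in range(3))
cache = None
for _ in range(5):
    x_new = torch.randn(1, 1, emb_dim)             # stand-in for the newest token's embedding
    context, cache = attend_with_cache(x_new, W_q, W_k, W_v, cache)
```

Because the causal mask already restricts each position to attend only to itself and earlier positions, this single-query step produces the same attention output for the last position as recomputing attention over the full sequence would, so the next token that gets picked is identical; the cache only removes redundant computation.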
Hello! I have a maybe-stupid question about the text-generation process of a GPT model. In the generate_text_simple() function in the main code of chapter 4, "ch04.ipynb", I saw that only the logits for the last output position are selected.
So if only the last logits are selected, why are all the input tokens used as queries when generating the text? In other words, would the chosen next token be the same if we used only the last input token as the query and all the input tokens as keys and values? Thanks!
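For context, the loop being referenced looks roughly like the following; this is a paraphrase from memory of the chapter 4 generate_text_simple() function, not a verbatim copy, and model is assumed to be the chapter's GPT model returning logits of shape (batch, num_tokens, vocab_size):

```python
import torch

def generate_text_simple(model, idx, max_new_tokens, context_size):
    # idx: (batch, n_tokens) tensor of token IDs for the current context.
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -context_size:]        # crop the context to the supported length
        with torch.no_grad():
            logits = model(idx_cond)             # (batch, n_tokens, vocab_size)
        logits = logits[:, -1, :]                # keep only the last position's logits
        idx_next = torch.argmax(logits, dim=-1, keepdim=True)  # greedy next-token choice
        idx = torch.cat((idx, idx_next), dim=1)  # append the new token and feed everything back in
    return idx
```

Only logits[:, -1, :] feeds into the choice of idx_next; everything computed for the earlier positions in each forward pass is discarded in this simple loop, which is exactly the redundancy the KV cache discussed in the reply above avoids.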