Example which uses a tokeniser? #162

Open
hakanai opened this issue Dec 12, 2023 · 3 comments

Comments

@hakanai

hakanai commented Dec 12, 2023

I have been studying the Python demo code for llama.onnx, found here:
https://github.com/tpoisonooo/llama.onnx/blob/main/demo_llama.py#L184

I have looked through all the current kinference examples, but none of them does tokenisation yet. You would expect an example like POSTagger to tokenise its input, but it seems to skip the hard part and load the already-tokenised result directly as the input. (Unless I'm misreading the code?)

How do I go from a string prompt to an ONNXData object that this model would accept?

@AnastasiaTuchina
Contributor

You are correct: KInference is an inference-only library, so it expects you to do all input preprocessing (e.g. tokenization) yourself. To get ONNXData, you have to implement your own tokenizer that converts the input string to an NDArray, and then call .asTensor(name) on it.
Unfortunately, we don't have any plans to add built-in tokenization yet.
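
As a rough illustration of the preprocessing step described above, here is a minimal Python sketch (Python because the linked demo is Python; the logic would port directly to Kotlin). Everything in it is an assumption made for illustration: the toy vocabulary, the `encode` and `to_batch` helpers, and the `[1, seq_len]` integer layout, which should be checked against the model's real inputs. A real model needs its own tokenizer and vocabulary. The flat id buffer and shape it produces are the kind of data you would put into an NDArray before calling .asTensor(name).

```python
from typing import Dict, List, Tuple

# Hypothetical toy vocabulary; a real model ships its own vocabulary file.
TOY_VOCAB: Dict[str, int] = {"<unk>": 0, "<s>": 1, "hello": 2, "world": 3}


def encode(prompt: str, vocab: Dict[str, int]) -> List[int]:
    """Map a prompt to token ids: the <s> id first, then one id per word."""
    ids = [vocab["<s>"]]
    for word in prompt.lower().split():
        # Words missing from the vocabulary fall back to the <unk> id.
        ids.append(vocab.get(word, vocab["<unk>"]))
    return ids


def to_batch(ids: List[int]) -> Tuple[List[int], List[int]]:
    """Return (flat_data, shape) for a batch of one sequence.

    The [1, seq_len] shape is an assumption; confirm it for your model.
    """
    return list(ids), [1, len(ids)]


ids = encode("Hello brave world", TOY_VOCAB)
data, shape = to_batch(ids)
print(data, shape)  # prints: [1, 2, 0, 3] [1, 4]
```

The tensor name passed to .asTensor(name) has to match the model's input name, which is why the missing "input" error mentioned later in this thread is a useful hint.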

@hakanai
Author

hakanai commented Dec 13, 2023

Tokenisation is something I can do; my biggest stumbling block is not knowing the structure of the data I have to provide. If I provide nothing, it throws an error saying that "input" is missing, so that is currently the best hint I have to work with. It would be super nice if it threw a detailed error about the shape of the data it expected to be fed.

@AnastasiaTuchina
Contributor

AnastasiaTuchina commented Dec 13, 2023

We don't have shape analyzers in KInference, but I can suggest using the Netron app. It shows the correct input names and shapes when possible.
