
Feature Request: Transition ocrs' ONNX Runtime to a Production-Ready Alternative, i.e. ORT #166

Open
diversable opened this issue Feb 23, 2025 · 2 comments

@diversable

First off, thanks for all the work you've put into this tool!

May I humbly suggest, though, that to help get the binary tool into a production-ready state as quickly as possible, you replace the RTen ONNX runtime backend with a more production-ready alternative that is already widely used in industry.

What I would recommend is checking out ORT ('ONNX Runtime').

See: https://github.com/pykeio/ort

Just to give you a quick sense of what I mean by 'ORT is production-ready', it is being used at:

  • Twitter / X
  • in Google's Magika library, which is used in Gmail
  • ORT is used in the Wasmtime WASI runtime, which powers its WASI-NN API (used by multiple companies, e.g. Fermyon's 'Spin' WASI FaaS platform and the CNCF's wasmCloud project)
  • as part of SurrealDB
    ...

While creating an ONNX runtime is a fantastic learning experience, for right now, ocrs might benefit from the production-ready ONNX runtime support that you'd get from a solution like ort.
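To give a concrete sense of the surface area involved, a minimal inference sketch with the ort crate might look roughly like the following. This is a hypothetical sketch, not ocrs code: the model path, the input/output tensor names, and the NCHW shape are all placeholders, and exact method names vary between ort 2.x releases.

```rust
// Hypothetical sketch of running an ONNX model with the `ort` crate (2.x-style API).
// "model.onnx", the tensor names "input"/"output", and the shape are placeholders.
use ort::session::Session;
use ort::value::Tensor;

fn main() -> ort::Result<()> {
    // Build a session; ort links against Microsoft's ONNX Runtime under the hood.
    let mut session = Session::builder()?.commit_from_file("model.onnx")?;

    // Feed a dummy f32 NCHW input and run inference.
    let input = Tensor::from_array(([1usize, 3, 64, 64], vec![0.0f32; 1 * 3 * 64 * 64]))?;
    let outputs = session.run(ort::inputs!["input" => input])?;

    // Extract the named output as an f32 tensor and inspect it.
    let output = outputs["output"].try_extract_tensor::<f32>()?;
    println!("output: {:?}", output);
    Ok(())
}
```

The appeal of this route is that model loading, execution providers, and kernel coverage are delegated to the upstream ONNX Runtime, at the cost of linking a native library.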

Just a suggestion...

@robertknight
Owner

robertknight commented Feb 23, 2025

"Production ready" is a somewhat fuzzy phrase. I think any discussion needs to focus on the concrete things that matter for particular users.

ort / ONNX Runtime does have some obvious advantages:

  • It is popular, and thus more of a known quantity
  • The core C++ implementation is maintained by Microsoft
  • It has backends for various hardware accelerators

RTen's advantage is that it is fully implemented in Rust, with no native binaries involved. Being pure Rust avoids some of the challenges that come with bundling native code, and RTen can build for any environment for which a Rust compiler is available, including WebAssembly. The binary size is also smaller (~3.4MB for the compiled rten CLI vs. ~70.5MB for libonnxruntime.a on my system).

From a personal perspective, I should also be transparent that working on an ML runtime is a large part of what motivates me to work on this in my spare time.

@robertknight
Owner

Regarding WebAssembly specifically, even though ocrs has a WASM build, performance is significantly worse than native due to a) the difficulty of setting up multi-threading in that environment (this applies to the browser, not WASI environments where a std::thread implementation is available) and b) a weaker set of available SIMD operations.

The best solution for that environment would probably be to target WebNN or wasi-nn directly. That would provide the performance of the underlying libraries while avoiding shipping a large runtime as part of the ocrs WASM binary.
