inferi is a set of Rust libraries for local LLM and vision-model inference on the GPU. Shaders are written in Rust via rust-gpu, targeting Vulkan/WebGPU through wgpu. It runs natively on desktop and mobile, and in the browser.
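For context, rust-gpu shaders are ordinary `no_std` Rust annotated with entry-point attributes from the `spirv-std` crate, compiled to SPIR-V and dispatched through wgpu. A minimal compute-shader sketch (hypothetical, not taken from inferi's codebase) might look like:

```rust
// Hypothetical rust-gpu compute shader, not from inferi.
// Compiled to SPIR-V with cargo-gpu, then dispatched via wgpu.
#![no_std]
use spirv_std::spirv;
use spirv_std::glam::UVec3;

#[spirv(compute(threads(64)))]
pub fn scale(
    #[spirv(global_invocation_id)] id: UVec3,
    #[spirv(storage_buffer, descriptor_set = 0, binding = 0)] data: &mut [f32],
) {
    let i = id.x as usize;
    if i < data.len() {
        data[i] *= 2.0; // element-wise scale on the GPU
    }
}
```

Because the shader is plain Rust, the same kernel code can be unit-tested on the CPU and shared with host-side crates.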
Warning: inferi is under active development and may be missing features. Contributions welcome!
Install cargo-gpu (required to compile shaders):

```shell
cargo install --git https://github.com/Rust-GPU/cargo-gpu cargo-gpu
cargo gpu install
```

Run the chat app natively:

```shell
cd crates/inferi-chat
dx run --release --features desktop
```

Or run the CLI version:

```shell
cd crates/inferi-chat
cargo run --release --features desktop -- --headless --inspect '/path/to/model.gguf'
```