Looking for an instant start? Check out the drop-in snippets:

- Chat with the local model
- Chat with the cloud model
- Chat with the local model while maintaining conversation history
- Process multiple requests with a local model using shared context
- Summarize text with the local model
- Detect text features with the local model
- Structured output

Otherwise, check the detailed integration guide.
Mirai lets you add high-performance AI right into your app with zero latency, full data privacy, and no inference costs. You don’t need an ML team or weeks of setup anymore. One developer can get it all running in minutes. To achieve this, we offer the following key products:
uzu

A Rust inference engine built to run AI with hardware specifics in mind.

uzu-swift

Prebuilt bindings to run uzu from Swift.

uzu-ts

Prebuilt bindings to run uzu from TypeScript.

lalamo

A set of tools to optimize and convert models for on-device use.

CLI

A command-line tool to chat with models and serve them as a local API.

Platform

A console where you can create your organization and grab an API key for SDK use.

Where to start?

If you just need a drop-in solution to run one of the supported models, choose the binding framework that matches your application stack. Check our guide for step-by-step integration and an overview of the key concepts.
Check out the overview of the lalamo models toolkit.
Check out the overview of the uzu inference engine.

FAQ

The uzu and lalamo libraries are fully open source under the MIT license and completely free to use.
Binding libraries are provided as prebuilt frameworks with Platform integration and require an API key to run. They include extra features such as automatic model downloads and speculative decoding.
Currently, only Apple Silicon (iOS/macOS) devices are supported.
The full list of supported models is available here.