The uzu inference engine lets you run LLMs for tasks such as text generation, information retrieval, classification, and summarization. In this guide, we’ll walk through the full process of integrating an LLM into your app.

Components

Let’s start with a closer look at the entities we’ll be working with later on (a sketch of how they fit together follows the list):
- Engine: the main entry point to the SDK. Use it to refresh the registry, download models, and create inference sessions.
- Model list: the models available for your device. Use it to pick the model you want to run.
- Preset: a prebuilt configuration for an inference session. Choosing the right preset for your task can yield significant performance improvements.
- Session: the main entity for interacting with the model. It keeps the selected model’s weights in memory and lets you send requests to the LLM.
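
To make these roles concrete, here is a minimal Swift sketch of how the four entities might fit together. Every identifier in it (`UzuEngine`, `availableModels`, `SessionConfig`, the `.summarization` preset, `createSession`) is an assumption made for illustration rather than the SDK’s confirmed API, so check the names against the SDK reference.

```swift
import Uzu  // assumed module name, not confirmed by the SDK

// Hypothetical sketch of how the four entities relate; all names are illustrative.
func pickModelAndOpenSession() async throws {
    // Engine: the entry point used to refresh the registry, download models,
    // and create inference sessions.
    let engine = UzuEngine(apiKey: "<YOUR_API_KEY>")

    // Model list: the models available for this device; pick one to run.
    let models = try await engine.availableModels()
    guard let model = models.first else { return }

    // Preset: a prebuilt session configuration matched to the task at hand.
    let config = SessionConfig(preset: .summarization)

    // Session: keeps the model's weights in memory and accepts requests.
    let session = try engine.createSession(model: model, config: config)
    _ = session  // requests are sent through the session from here on
}
```

Whatever the exact names, the ownership chain is the key point: the engine produces everything else, and the session is the only object you talk to at generation time.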

Interaction diagram

Integration

The full integration process consists of the following steps (an end-to-end sketch follows the list):
1. Get an API key
2. Connect and configure the SDK
3. Choose and download a model
4. Choose an inference session configuration
5. Create an inference session and run a model
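
Before we dive into the details, here is a hedged end-to-end sketch of the whole flow. As above, all names are placeholders chosen for illustration (including the `UZU_API_KEY` environment variable, which is hypothetical); the comments map each block to the numbered steps.

```swift
import Foundation
import Uzu  // assumed module name, not confirmed by the SDK

// Hypothetical end-to-end flow; the comments map each block to the steps above.
func runModelEndToEnd() async throws {
    // Steps 1-2: obtain an API key and configure the SDK with it
    // (UZU_API_KEY is a hypothetical environment variable for this sketch).
    let apiKey = ProcessInfo.processInfo.environment["UZU_API_KEY"] ?? "<YOUR_API_KEY>"
    let engine = UzuEngine(apiKey: apiKey)

    // Step 3: refresh the registry, pick a model, and download it.
    try await engine.refreshRegistry()
    guard let model = try await engine.availableModels().first else { return }
    try await engine.download(model)

    // Step 4: choose an inference session configuration via a preset.
    let config = SessionConfig(preset: .general)

    // Step 5: create a session (this loads the weights) and run the model.
    let session = try engine.createSession(model: model, config: config)
    let output = try session.run(prompt: "Classify the sentiment: 'I love this app!'")
    print(output)
}
```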

Let’s take a detailed look at each step.