Components
Let’s start with a closer look at the entities we’ll be working with later on:
Engine
The main entry point to the SDK. Use it to refresh the registry, download models, and create inference sessions.
Registry
A list of available models for your device. Use it to pick the model you want to run.
Preset
A prebuilt configuration for an inference session. Choosing the right preset for your task can significantly improve performance.
Session
The main entity for interacting with the model. It keeps the selected model’s weights in memory and lets you send requests to the LLM.
Interaction diagram
Integration
The full integration process consists of the following steps:
1. Get an API key
2. Connect and configure the SDK
3. Choose and download a model
4. Choose an inference session configuration
5. Create an inference session and run a model
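To make the order of the steps concrete, here is a minimal Python sketch of the flow using in-memory stand-ins. It is not the SDK's actual API: every class, method, and value below (`Engine`, `refresh_registry`, `download`, `create_session`, `Session.run`, the `Preset` members, the model names and sizes) is a hypothetical placeholder chosen to mirror the components described above, so check the SDK reference for the real names and signatures.

```python
from dataclasses import dataclass
from enum import Enum


class Preset(Enum):
    """Hypothetical prebuilt session configurations."""
    GENERAL_CHAT = "general_chat"
    SUMMARIZATION = "summarization"


@dataclass(frozen=True)
class ModelInfo:
    """One registry entry (hypothetical fields)."""
    identifier: str
    size_mb: int


class Session:
    """Stand-in for an inference session bound to one model and one preset."""

    def __init__(self, model: ModelInfo, preset: Preset) -> None:
        self.model = model
        self.preset = preset

    def run(self, prompt: str) -> str:
        # A real session would run the LLM; this stub only echoes its inputs.
        return f"[{self.model.identifier}/{self.preset.value}] {prompt}"


class Engine:
    """Stand-in for the SDK entry point (assumed interface, not the real one)."""

    def __init__(self, api_key: str) -> None:
        if not api_key:
            raise ValueError("an API key is required")
        self._api_key = api_key
        self.registry: list[ModelInfo] = []
        self._downloaded: set[str] = set()

    def refresh_registry(self) -> list[ModelInfo]:
        # A real engine would fetch the models available for this device.
        self.registry = [ModelInfo("tiny-demo", 400), ModelInfo("small-demo", 1200)]
        return self.registry

    def download(self, model: ModelInfo) -> None:
        if model not in self.registry:
            raise KeyError(f"unknown model: {model.identifier}")
        self._downloaded.add(model.identifier)

    def create_session(self, model: ModelInfo, preset: Preset) -> Session:
        if model.identifier not in self._downloaded:
            raise RuntimeError("download the model before creating a session")
        return Session(model, preset)


def run_integration(api_key: str, prompt: str) -> str:
    engine = Engine(api_key)                         # steps 1-2: API key, configure the SDK
    registry = engine.refresh_registry()             # step 3: choose a model...
    model = min(registry, key=lambda m: m.size_mb)   # (here: simply the smallest one)
    engine.download(model)                           # ...and download it
    preset = Preset.GENERAL_CHAT                     # step 4: choose a session configuration
    session = engine.create_session(model, preset)   # step 5: create the session...
    return session.run(prompt)                       # ...and run the model
```

Calling `run_integration("demo-key", "Hello")` walks through all five steps in order. The stub refuses to create a session for a model that has not been downloaded, which reflects the ordering of steps 3 and 5 rather than any documented SDK behavior.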