CLI

You can run uzu in a CLI mode:

cargo run --release -p cli -- help
Usage: uzu_cli [COMMAND]

Commands:
  run    Run a model with the specified path
  serve  Start a server with the specified model path
  help   Print this message or the help of the given subcommand(s)

Run

Run command allows you to send messages to the model in interactive mode:

cargo run --release -p cli -- run $MODEL_PATH

Serve

Serve command starts a local server with an OpenAI-like API:

cargo run --release -p cli -- serve $MODEL_PATH

Request example:

curl --location 'http://127.0.0.1:8000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant"
        },
        {
            "role": "user",
            "content": "Tell about London"
        }
    ],
    "max_completion_tokens": 256
}'

Response example:

{
    "id": "421a6b4b-be08-478c-b443-5cc7cef46f5c",
    "model": "Llama-3.2-1B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "London! The vibrant, historic, and cosmopolitan capital of England. London is a city that seamlessly blends tradition with modernity, offering a unique blend of culture, entertainment, and excitement. Here's a comprehensive overview:\n\n**History and Architecture**\n\nLondon has a rich history dating back to the Roman Empire. The city has been ruled by various empires, including the Romans, Anglo-Saxons, and Normans. The city's iconic landmarks, such as Buckingham Palace, the Tower of London, and Big Ben, are testaments to its storied past. The city's architecture is a mix of medieval, Gothic, and Victorian styles, with many buildings dating back to the 13th century.\n\n**Must-Visit Attractions**\n\n1. **Buckingham Palace**: The official residence of the British monarch, Buckingham Palace is a must-visit attraction. Watch the Changing of the Guard ceremony, which takes place daily at 11:30 am from April to July and on alternate days the rest of the year.\n2. **The British Museum**: One of the world's greatest museums, the British Museum houses a vast collection of artifacts from ancient civilizations, including the Rosetta Stone, the Elgin Marbles, and the mummies in the Ancient Egypt gallery.\n3"
            },
            "finish_reason": "length"
        }
    ],
    "stats": {
        "prefill_stats": {
            "duration": 0.282355625,
            "suffix_length": 8,
            "tokens_count": 1,
            "tokens_per_second": 3.541633002707136,
            "model_run": {
                "count": 6,
                "average_duration": 0.047014152499999996
            },
            "run": null
        },
        "generate_stats": {
            "duration": 7.3130893389999985,
            "suffix_length": 1,
            "tokens_count": 255,
            "tokens_per_second": 34.868984662898846,
            "model_run": {
                "count": 255,
                "average_duration": 0.028573350874509806
            },
            "run": {
                "count": 255,
                "average_duration": 0.02867878172156862
            }
        },
        "total_stats": {
            "duration": 7.6461285839999995,
            "tokens_count_input": 42,
            "tokens_count_output": 256
        }
    }
}