iOS app crashes
Each model can consume a significant amount of RAM, and iOS terminates apps that exceed their memory limit, so keep only one session loaded at a time. For iOS apps, we recommend adding the Increased Memory Limit entitlement (com.apple.developer.kernel.increased-memory-limit), which lets your app use a higher memory limit on supported devices.
It takes a long time to get the first token
The Session’s init method loads the model weights into memory. A common practice is to keep one loaded session instance for the entire application lifecycle. After loading, you can run the same session multiple times. Don’t load a new session for each request.
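One way to follow this advice is to create the session lazily behind a single shared owner. The sketch below is illustrative only: `Session` here is a hypothetical stand-in for the SDK's type (the real initializer and inference method may have different names and signatures), and `SessionProvider`, `sharedSession(modelPath:)`, and `loadCount` are names invented for this example.

```swift
import Foundation

// Hypothetical stand-in for the SDK's session type. The real initializer
// signature and inference method may differ.
final class Session {
    let modelPath: String

    init(modelPath: String) {
        // In the real SDK, this is the slow step that loads weights into memory.
        self.modelPath = modelPath
    }

    func run(prompt: String) -> String {
        return "response to: \(prompt)"
    }
}

// Owns the single Session for the app and creates it on first use.
// The lock ensures that concurrent first calls load the model only once.
final class SessionProvider: @unchecked Sendable {
    static let shared = SessionProvider()

    private let lock = NSLock()
    private var session: Session?
    // For illustration only: counts how many times the weights were loaded.
    private(set) var loadCount = 0

    // Returns the already-loaded session if there is one. A later call with a
    // different path still returns the existing session, so only one model
    // is ever held in memory.
    func sharedSession(modelPath: String) -> Session {
        lock.lock()
        defer { lock.unlock() }
        if let existing = session {
            return existing
        }
        let created = Session(modelPath: modelPath)
        session = created
        loadCount += 1
        return created
    }
}
```

Every request would then call `SessionProvider.shared.sharedSession(modelPath:)` and run its prompt on the returned instance, so the loading cost is paid once, at first use or at app launch if you call it there.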