On-Device
iOS SDK
Deploy models natively on iPhone and iPad
Android SDK
Deploy models natively on Android devices
llama.cpp
CPU-first inference with cross-platform support
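A minimal sketch of CPU inference through the llama-cpp-python bindings; the GGUF path is a placeholder for any local checkpoint:

```python
# Sketch using llama-cpp-python (pip install llama-cpp-python).
# model_path is a placeholder; any local GGUF checkpoint works.
from llama_cpp import Llama

llm = Llama(model_path="./model-q4_k_m.gguf", n_ctx=2048)  # runs on CPU by default
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantization in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```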
MLX
Optimized inference on Apple Silicon
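A minimal sketch with the mlx-lm package (Apple Silicon only); the repo id is illustrative:

```python
# Sketch using mlx-lm (pip install mlx-lm); requires an Apple Silicon Mac.
# The Hugging Face repo id is illustrative; any MLX-format model works.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
print(generate(model, tokenizer, prompt="What is MLX?", max_tokens=64))
```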
ONNX
Cross-platform inference with ONNX Runtime
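A minimal sketch with ONNX Runtime; the model file and tensor shape are placeholders for whatever the exported graph expects:

```python
# Sketch using onnxruntime (pip install onnxruntime).
# "model.onnx" and the input shape are placeholders for the exported graph.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name                  # read the graph's input name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)   # example image-shaped input
outputs = sess.run(None, {input_name: x})
print(outputs[0].shape)
```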
Ollama
Easy local deployment and model management
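A minimal sketch with the official ollama Python client; it assumes the Ollama server is running and the model tag (illustrative here) has been pulled:

```python
# Sketch using the ollama client (pip install ollama); assumes `ollama serve`
# is running and the model was fetched with `ollama pull llama3.2`.
import ollama

response = ollama.chat(
    model="llama3.2",  # illustrative tag
    messages=[{"role": "user", "content": "Why run models locally?"}],
)
print(response["message"]["content"])
```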
GPU Inference
Transformers
Flexible inference with Hugging Face Transformers
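A minimal sketch with the pipeline API; the model id is illustrative, and device_map="auto" (which needs the accelerate package) places weights on a GPU when one is available:

```python
# Sketch using the Hugging Face pipeline API; the model id is illustrative.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",
    device_map="auto",  # needs `pip install accelerate`; falls back to CPU
)
out = pipe("The key idea behind attention is", max_new_tokens=40)
print(out[0]["generated_text"])
```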
vLLM
High-throughput production serving
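A minimal sketch of offline batched generation; the model id is illustrative, and the same engine can be exposed as an OpenAI-compatible server with `vllm serve <model>`:

```python
# Sketch of offline batched generation with vLLM; the model id is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Summarize continuous batching in one line."], params)
print(outputs[0].outputs[0].text)
```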
SGLang
Structured generation and fast serving
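A minimal sketch of SGLang's frontend DSL with a regex-constrained generation; it assumes a server was started separately (e.g. `python -m sglang.launch_server --model-path <model> --port 30000`):

```python
# Sketch of SGLang's frontend DSL; assumes a locally launched SGLang server.
import sglang as sgl

@sgl.function
def rate(s, text):
    s += sgl.user(f"Rate this feedback from 1 to 5: {text}")
    # The regex constrains decoding to a single digit (structured generation).
    s += sgl.assistant(sgl.gen("score", regex=r"[1-5]", max_tokens=2))

sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
state = rate.run(text="Great latency, easy setup.")
print(state["score"])
```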
Modal
Serverless GPU deployment
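A minimal sketch of a serverless GPU function; the app name, GPU type, and model are illustrative, and the file runs with `modal run app.py`:

```python
# Sketch of a Modal GPU function; app name, GPU type, and model are illustrative.
import modal

app = modal.App("inference-sketch")
image = modal.Image.debian_slim().pip_install("transformers", "torch")

@app.function(gpu="A10G", image=image)
def generate(prompt: str) -> str:
    from transformers import pipeline
    pipe = pipeline("text-generation", model="distilgpt2", device=0)
    return pipe(prompt, max_new_tokens=32)[0]["generated_text"]

@app.local_entrypoint()
def main():
    print(generate.remote("Serverless GPUs let you"))
```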
Baseten
Production model inference platform
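A minimal sketch of invoking a model already deployed on Baseten over REST; the model id, API key, and payload schema are all placeholders, and the exact endpoint for a deployment is shown in the Baseten dashboard:

```python
# Sketch of calling a deployed Baseten model; the model id, API key,
# and payload schema are placeholders for your own deployment's values.
import requests

resp = requests.post(
    "https://model-<MODEL_ID>.api.baseten.co/production/predict",
    headers={"Authorization": "Api-Key <BASETEN_API_KEY>"},
    json={"prompt": "Hello from production"},
)
print(resp.json())
```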
Fal
Fast inference API platform
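A minimal sketch with the fal-client package; it assumes the FAL_KEY environment variable is set, and the application id and result fields are illustrative:

```python
# Sketch using fal-client (pip install fal-client); assumes FAL_KEY is set.
# The application id and result fields are illustrative.
import fal_client

result = fal_client.subscribe(
    "fal-ai/flux/dev",
    arguments={"prompt": "a lighthouse at dawn"},
)
print(result["images"][0]["url"])
```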
Tools
Model Bundling Services
Package and distribute optimized model bundles for edge deployment