Image Captioning

Detailed descriptions, alt-text generation, and visual summarization.

OCR & Extraction

Text recognition, form parsing, and document digitization.

Visual Reasoning

Scene understanding, spatial relations, and visual Q&A.

Continuous Capture

Always-on activity recognition and scene monitoring on-device.

LFM2.5 Models (latest release)

LFM2.5-VL builds on LFM2-VL with extended reinforcement learning training for higher performance, while keeping the same architecture and deployment footprint.

LFM2.5-VL-1.6B

1.6B · Recommended

Best vision model for most use cases. Fast and accurate.

LFM2 Models

LFM2-VL-3B

3B

Highest-capacity multimodal model with enhanced visual reasoning.

LFM2-VL-1.6B

1.6B · Deprecated

Use the new LFM2.5-VL-1.6B checkpoint instead.

LFM2-VL-450M

450M · Fastest

Compact multimodal model for edge deployment and fast inference.
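
As a rough illustration of how the model list above could drive checkpoint selection, the sketch below encodes the four entries and picks the largest non-deprecated model that fits a parameter budget. The `VisionModel` class, the `CATALOG` list, the `pick_model` helper, and the "largest model within budget" rule are all hypothetical and not part of any Liquid AI library; only the model names, sizes, and the deprecation note come from the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VisionModel:
    """Hypothetical record for one catalog entry."""
    name: str
    params_b: float                    # parameter count, in billions
    deprecated: bool = False
    replacement: Optional[str] = None  # suggested successor, if deprecated


# Entries transcribed from the model list; the structure itself is an assumption.
CATALOG = [
    VisionModel("LFM2.5-VL-1.6B", 1.6),
    VisionModel("LFM2-VL-3B", 3.0),
    VisionModel("LFM2-VL-1.6B", 1.6, deprecated=True, replacement="LFM2.5-VL-1.6B"),
    VisionModel("LFM2-VL-450M", 0.45),
]


def pick_model(max_params_b: float) -> VisionModel:
    """Return the largest non-deprecated model within the parameter budget."""
    candidates = [
        m for m in CATALOG
        if not m.deprecated and m.params_b <= max_params_b
    ]
    if not candidates:
        raise ValueError(f"no model fits a budget of {max_params_b}B parameters")
    return max(candidates, key=lambda m: m.params_b)


if __name__ == "__main__":
    for budget in (0.5, 2.0, 4.0):
        print(budget, "->", pick_model(budget).name)
```

Parameter count is only a proxy for memory and latency; a real deployment would also weigh quantization, runtime, and target hardware.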

Examples

Explore practical implementations using vision models:

Image Understanding with Vision Language Models

Analyze images, answer visual questions, and generate descriptions using LFM2-VL-1.6B on Android with Jetpack Compose and Coil.

Platform: Android · Uses: LFM2-VL-1.6B

Invoice Extractor Tool

Extract structured payment data from invoice PDFs using LFM2.5-VL-1.6B with file monitoring and 100% local processing.

Platform: Desktop · Uses: LFM2.5-VL-1.6B

Real-Time Video Captioning

Generate video captions directly in-browser using LFM2.5-VL-1.6B with WebGPU acceleration and ONNX Runtime Web.

Platform: Web · Uses: LFM2.5-VL-1.6B

Fine-tune for Car Maker Identification

Learn to fine-tune LFM2-VL models (450M, 1.6B, 3B) for image classification with LoRA, structured generation, and evaluation pipelines.

Platform: Desktop · Uses: LFM2-VL-450M, LFM2-VL-1.6B, LFM2-VL-3B