LongCat Flash AI

Meituan's open-source 560B parameter LLM with blazing speed

LongCat Flash AI is Meituan's open-source large language model featuring 560B parameters and an innovative Mixture-of-Experts (MoE) architecture. It delivers inference speeds of over 100 tokens/second at a cost as low as $0.7 per million output tokens. The model excels in agentic capabilities, complex reasoning, and real-world applications while maintaining enterprise-grade reliability and scalability.
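
Since the model is open source, the weights can be pulled directly from Hugging Face. Below is a minimal sketch of loading the chat checkpoint with the `transformers` library; the repository ID `meituan-longcat/LongCat-Flash-Chat` and the `trust_remote_code=True` flag are assumptions (large custom MoE checkpoints typically ship their own modeling code), so check the official model card for the exact identifiers and recommended settings.

```python
# Minimal sketch: loading LongCat Flash from Hugging Face with transformers.
# The repo ID and trust_remote_code usage are assumptions; consult the official
# model card for exact identifiers and recommended inference settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meituan-longcat/LongCat-Flash-Chat"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the checkpoint's native dtype
    device_map="auto",       # shard the 560B MoE across available GPUs
    trust_remote_code=True,  # custom MoE architecture ships its own code
)

messages = [{"role": "user", "content": "Explain Mixture-of-Experts in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that even with only 18.6B-31.3B parameters active per token, all 560B parameters must stay resident in memory, so local inference still requires a multi-GPU node.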


Features

 - 560B Parameter MoE Architecture - Dynamically activates 18.6B-31.3B parameters per token for optimal efficiency
 - Blazing Fast Inference - Over 100 tokens/second on H800 GPUs with minimal latency
 - Cost-Effective Performance - Just $0.7 per million output tokens, making it highly competitive
 - Advanced Agentic Capabilities - Excels in tool use, multi-step reasoning, and complex environment interaction
 - Zero-Computation Experts - Smart resource allocation that saves computation for simple tokens (see the routing sketch after this list)
 - Shortcut-Connected MoE - Innovative architecture that reduces communication bottlenecks
 - Enterprise-Grade Reliability - 98.48% uptime with automatic fault detection and recovery
 - Open-Source Accessibility - Available on Hugging Face, GitHub, and web demo for community use
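
The dynamic 18.6B-31.3B activation and the zero-computation experts are two sides of the same idea: the router can send an easy token to an identity "expert" that does no work, so compute is only spent where it matters. The toy PyTorch layer below illustrates that routing pattern; it is an illustrative sketch, not LongCat's actual implementation, and the expert counts, hidden sizes, and top-k value are invented for the example.

```python
# Illustrative toy sketch of top-k MoE routing with zero-computation experts.
# Not LongCat's code: expert counts, hidden sizes, and k are invented here.
import torch
import torch.nn as nn


class ZeroComputationExpert(nn.Module):
    """Identity 'expert': the token passes through untouched at ~zero FLOPs."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x


class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_ffn_experts=6, n_zero_experts=2, k=2):
        super().__init__()
        # One routing pool holding real FFN experts plus zero-computation experts.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                           nn.Linear(4 * d_model, d_model))
             for _ in range(n_ffn_experts)]
            + [ZeroComputationExpert() for _ in range(n_zero_experts)]
        )
        self.router = nn.Linear(d_model, len(self.experts))
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model); the router scores every expert per token.
        weights = torch.softmax(self.router(x), dim=-1)
        top_w, top_idx = weights.topk(self.k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize the top-k

        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    # Tokens routed to a zero-computation expert skip the FFN entirely.
                    out[mask] += top_w[mask, slot:slot + 1] * expert(x[mask])
        return out


tokens = torch.randn(8, 64)         # 8 tokens with d_model = 64
print(ToyMoELayer()(tokens).shape)  # torch.Size([8, 64])
```

In a real deployment the per-expert loop would be replaced by batched grouped matrix multiplications, but the control flow above is enough to show why tokens routed to zero-computation experts cost almost nothing.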
