
State-of-the-Art
AI Models
Access a comprehensive library of cutting-edge AI models for LLMs, image, video, and audio generation, all through our high-performance inference API.

Qwen
embedding
Qwen3-Embedding-8B
Release on: Jun 6, 2025
Qwen3-Embedding-8B is the latest proprietary model in the Qwen3 Embedding series, specifically designed for text embedding and ranking tasks. Built upon the dense foundational models of the Qwen3 series, this 8B parameter model supports context lengths up to 32K and can generate embeddings with dimensions up to 4096. The model inherits exceptional multilingual capabilities supporting over 100 languages, along with long-text understanding and reasoning skills. It ranks No.1 on the MTEB multilingual leaderboard (as of June 5, 2025, score 70.58) and demonstrates state-of-the-art performance across various tasks including text retrieval, code retrieval, text classification, clustering, and bitext mining. The model offers flexible vector dimensions (32 to 4096) and instruction-aware capabilities for enhanced performance in specific tasks and scenarios...
Total Context:
33K
Max output:
Input:
$
0.04
/ M Tokens
Output:
$
/ M Tokens

Qwen
embedding
Qwen3-Embedding-4B
Release on: Jun 6, 2025
Qwen3-Embedding-4B is the latest proprietary model in the Qwen3 Embedding series, specifically designed for text embedding and ranking tasks. Built upon the dense foundational models of the Qwen3 series, this 4B parameter model supports context lengths up to 32K and can generate embeddings with dimensions up to 2560. The model inherits exceptional multilingual capabilities supporting over 100 languages, along with long-text understanding and reasoning skills. It achieves excellent performance on the MTEB multilingual leaderboard (score 69.45) and demonstrates outstanding results across various tasks including text retrieval, code retrieval, text classification, clustering, and bitext mining. The model offers flexible vector dimensions (32 to 2560) and instruction-aware capabilities for enhanced performance in specific tasks and scenarios, providing an optimal balance between efficiency and effectiveness...
Total Context:
33K
Max output:
Input:
$
0.02
/ M Tokens
Output:
$
/ M Tokens

Qwen
embedding
Qwen3-Embedding-0.6B
Release on: Jun 6, 2025
Qwen3-Embedding-0.6B is the latest proprietary model in the Qwen3 Embedding series, specifically designed for text embedding and ranking tasks. Built upon the dense foundational models of the Qwen3 series, this 0.6B parameter model supports context lengths up to 32K and can generate embeddings with dimensions up to 1024. The model inherits exceptional multilingual capabilities supporting over 100 languages, along with long-text understanding and reasoning skills. It achieves strong performance on the MTEB multilingual leaderboard (score 64.33) and demonstrates excellent results across various tasks including text retrieval, code retrieval, text classification, clustering, and bitext mining. The model offers flexible vector dimensions (32 to 1024) and instruction-aware capabilities for enhanced performance in specific tasks and scenarios, making it an ideal choice for applications prioritizing both efficiency and effectiveness...
Total Context:
33K
Max output:
Input:
$
0.01
/ M Tokens
Output:
$
/ M Tokens