🎉 LongCat-2.0 is now available on SiliconFlow. Try it today.

Cutting-Edge AI Model Library

Run inference on more than 200 cutting-edge AI models through a single API and deploy in seconds.

Tencent

Text Generation

Hunyuan-A13B-Instruct

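Release date: 2025/06/30

Hunyuan-A13B-Instruct activates only 13B of its 80B parameters, yet it matches much larger LLMs on mainstream benchmarks. It offers hybrid reasoning: each call can switch between a low-latency "fast" mode and a high-precision "slow" mode. A native 256K-token context lets it process book-length documents without degradation. Its agent skills are tuned for leading results on BFCL-v3, τ-Bench, and C3-Bench, which makes it a strong backbone for autonomous assistants. Grouped Query Attention and multi-format quantization give it memory-efficient, GPU-friendly inference that is ready for real-world deployment. It also provides multilingual support and robust safety alignment for enterprise applications....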

Total Context: 131K
Max output: 131K
Input type: text
Input price: 0.14 / M Tokens
Output price: 0.57 / M Tokens
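
The per-million-token rates listed for a model can be turned into a rough per-request estimate. The following is a minimal sketch, not an official SiliconFlow calculator: the function name `estimate_cost` and the example token counts are hypothetical, and because the listing does not state a currency, the result is expressed in whatever unit the listed prices use.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: str, output_price: str) -> Decimal:
    """Estimate the cost of one request from per-million-token rates.

    Prices are passed as strings so Decimal keeps them exact.
    The result is in the same (unstated) unit as the listed prices.
    """
    cost_in = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(input_price)
    cost_out = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(output_price)
    return cost_in + cost_out


# Rates listed for Hunyuan-A13B-Instruct: 0.14 (input) and 0.57 (output) per M tokens.
# The token counts below are illustrative assumptions, not measurements.
cost = estimate_cost(100_000, 2_000, "0.14", "0.57")
print(cost)  # 0.01514
```

The same function applies to the other listed models by substituting their input and output rates.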

Tencent

Text Generation

Hy3

Release date: 2026/06/26

Built for real-world business scenarios, Hy3 features a 295B/21B active MoE architecture, native 256K context support, and three reasoning modes. It enhances coding, long-form comprehension, multi-turn dialogue, and agentic task execution, balancing reliability, efficiency, and cost across both high-frequency interactions and complex workflows....

Total Context: 262K
Max output: 262K
Input type: text
Input price: 0.0 / M Tokens
Output price: 0.0 / M Tokens

Tencent

Text Generation

Hy3-preview

Release date: 2026/04/07

Hy3 preview is a 295B-parameter Mixture-of-Experts (MoE) language model from Tencent Hunyuan, built for production-grade agent workloads. With only 21B parameters activated per token and native 256K context support, it handles complex tasks like cross-file code refactoring, long-document analysis, and multi-step tool use, rather than just generating fluent dialogue. Hy3 scores near state-of-the-art on SWE-bench Verified and advanced STEM benchmarks, while offering three inference modes (no_think, think_low, think_high) to dynamically trade off latency and reasoning depth. Its sparse activation architecture delivers competitive intelligence at a significantly lower token cost....

Total Context: 262K
Max output: 262K
Input type: text
Input price: 0.066 / M Tokens
Output price: 0.26