
Company News: MinIO adds petabyte-scale MemKV cache for Nvidia GPU inference

Certifications
China: Beijing Qianxing Jietong Technology Co., Ltd. (Certified)
Customer Reviews
The sales staff at Beijing Qianxing Jietong Technology Co., Ltd. are very professional and patient, and they provide quotations quickly. Product quality and packaging are also very good, and our cooperation has been very smooth.

—— Festfing DV LLC

When I was urgently looking for Intel CPUs and Toshiba SSDs, Sandy at Beijing Qianxing Jietong Technology Co., Ltd. was a great help and got me the products I needed right away. I really appreciate her.

—— Kitty Yuan

Sandy at Beijing Qianxing Jietong Technology Co., Ltd. is a very attentive salesperson who catches configuration mistakes for me when I buy servers. The engineers are also very professional and complete the testing process quickly.

—— Strelkin Mikhail Vladimirovich

Working with Beijing Qianxing Jietong has been very satisfying. Product quality is excellent and delivery dates are always met. The sales team is professional and patient, and answers all of our questions carefully. We sincerely appreciate their support and look forward to a long-term partnership. Highly recommended!

—— Ahmad Nabid

Quality: a great experience with this supplier. The MikroTik RB3011 was pre-owned but in very good condition, and everything works perfectly. Communication was fast and smooth, and my concerns were resolved immediately. A reliable supplier; highly recommended.

—— Geran Colecio


MinIO has developed a petabyte-scale MemKV caching system tailored for Nvidia GPUs, deployed on top of its AIStor object storage platform.

GPU clusters running inference require high-bandwidth memory (HBM) to store context, vectorized tokens and intermediate key-value (KV) pairs. Once GPU HBM is saturated, data cascades down to CPU DRAM and NVMe SSDs, managed by Nvidia BlueField-4 (BF4) DPUs. When these tiers reach capacity, MinIO AIStor acts as the final storage backup. Nvidia’s STX architecture governs this multi-layer cache hierarchy, and MemKV complies with the standard to deliver persistent, shared context across GPU clusters at superior scale.
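The cascading behavior of this hierarchy can be sketched in a few lines of Python. This is an illustrative model only, not MinIO code; the tier names follow the article, but the capacities and the LRU policy are assumptions for illustration:

```python
from collections import OrderedDict

class Tier:
    """One level of the cache hierarchy, with LRU eviction."""
    def __init__(self, name, capacity_blocks):
        self.name = name
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()          # oldest entry first

    def get(self, key):
        if key in self.blocks:
            self.blocks.move_to_end(key)     # mark as recently used
            return self.blocks[key]
        return None

    def put(self, key, value):
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            return self.blocks.popitem(last=False)   # evict the LRU block
        return None

class TieredKVCache:
    """Blocks evicted from a fast tier cascade into the next tier down."""
    def __init__(self):
        self.tiers = [Tier("HBM", 2), Tier("DRAM", 4),
                      Tier("NVMe", 8), Tier("ObjectStore", 1 << 30)]

    def insert(self, key, value):
        evicted, level = self.tiers[0].put(key, value), 1
        while evicted is not None and level < len(self.tiers):
            evicted = self.tiers[level].put(*evicted)
            level += 1

    def lookup(self, key):
        for tier in self.tiers:
            value = tier.get(key)
            if value is not None:
                self.insert(key, value)      # promote the hit back into HBM
                return tier.name, value
        return None, None                    # total miss: recompute context
```

A miss at every tier is exactly the "context loss" case the article describes: the cluster must recompute the KV pairs from scratch.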



AB Periasamy, MinIO co-founder and co-CEO, commented: “The industry has been papering over context loss for years because, at small scale, you may absorb the recompute tax. At today’s high GPU density for hyperscalers and neoclouds, this is no longer viable.

Recomputing generated context wastes power; for clusters with thousands of GPUs, it creates fundamental structural inefficiency. Large-scale inference requires purpose-built infrastructure, and MemKV is designed specifically for this data path.”

For the first time, MinIO enables shared context pools for entire GPU clusters at microsecond-level latency that matches inference workflows, avoiding the millisecond delays of conventional external storage. Without sufficient cache tiers, GPUs waste resources repeatedly recomputing context.

In a 128-GPU deployment with 128K-token context length, MemKV improved time-to-first-token under production loads and boosted GPU utilization from 50% to over 90%, generating an estimated $2 million annual compute cost saving.
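A quick sanity check on those numbers: the 128-GPU count and the 50% to 90% utilization figures come from the deployment above, while the $/GPU-hour rate below is an assumed cloud price for an inference-class GPU, used only for illustration:

```python
gpus = 128
hours_per_year = 24 * 365                 # 8,760 hours
rate_per_gpu_hour = 4.50                  # assumed $/GPU-hour (illustrative)

def idle_spend(utilization):
    """Annual spend on idle GPU time at a given utilization."""
    return gpus * hours_per_year * rate_per_gpu_hour * (1 - utilization)

saving = idle_spend(0.50) - idle_spend(0.90)
print(f"${saving:,.0f} per year")         # roughly $2.0M at this assumed rate
```

At that assumed rate the reduction in idle GPU time alone lands close to the article's $2 million annual estimate.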

Purpose-built for Nvidia STX architecture, MemKV supports Nvidia Dynamo and NIXL caching tools. It delivers petabytes of shared context memory at SSD-level costs, decoupling cache scaling from GPU compute resources. Its core features are listed below:
  • Native BF4 STX support: Runs as an ARM64 binary within STX infrastructure, embedded in the storage layer rather than on separate x86 storage servers.
  • End-to-end RDMA transport: Transfers KV cache between GPU memory and NVMe via RDMA, bypassing conventional file and object storage protocols.
  • GPU-optimized block size: Uses 2–16 MB blocks for GPU throughput demands, instead of legacy 4 KB storage blocks.
  • Wire-speed performance: Optimized for Nvidia Spectrum-X Ethernet and PCIe Gen6 to maximize physical fabric throughput.
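To see why the 2–16 MB block size matters, compare request counts for moving a single large KV cache. The 10 GiB cache size and the 8 MB block choice below are illustrative assumptions, not figures from the announcement:

```python
MiB = 1024 * 1024
kv_cache_bytes = 10 * 1024 * MiB                  # assume a 10 GiB KV cache to move

legacy_requests = kv_cache_bytes // (4 * 1024)    # legacy 4 KB storage blocks
memkv_requests = kv_cache_bytes // (8 * MiB)      # 8 MB blocks, mid-range of 2-16 MB

print(legacy_requests, memkv_requests)            # 2621440 vs 1280 requests
print(legacy_requests // memkv_requests)          # 2048x fewer requests
```

Fewer, larger transfers keep the RDMA fabric streaming data instead of bottlenecking on per-request overheads.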


MemKV directly transfers data from NVMe SSDs to AI pipelines over RDMA, eliminating HTTP overhead, file system translation and intermediate storage servers.



MinIO categorizes rival context memory solutions into two types: non-sharable local NVMe (G3) and general-purpose shared storage (G4). It positions MemKV as a purpose-built G3.5 tier, distinguishing itself from generic storage products.

The firm emphasizes that legacy vendors’ G3.5 offerings still retain redundant protocol nodes, metadata services and file translation layers. These layers ensure durability and consistency for training data and model weights, yet they are unnecessary for ephemeral, recomputable KV cache optimized for 2–16 MB data blocks.

Hardware RAID vendor GRAID and storage firm WEKA also provide STX-compatible KV cache solutions. A broad range of storage vendors support Nvidia STX, including Cloudian, Dell, DDN, Everpure, Hammerspace, Hitachi Vantara, HPE, Lightbits/ScaleFlux, NetApp, Nutanix, Peak:AIO, Pliops and VAST Data.

Beijing Qianxing Jietong Technology Co., Ltd.
Sandy Yang/Global Strategy Director
WhatsApp / WeChat: +86 13426366826
Email: yangyd@qianxingdata.com
Website: www.qianxingdata.com/www.storagesserver.com
Business Focus:
ICT Product Distribution/System Integration & Services/Infrastructure Solutions
With 20+ years of IT distribution experience, we partner with leading global brands to deliver reliable products and professional services.
“Using Technology to Build an Intelligent World.” Your Trusted ICT Product Service Provider!
Published: 2026-05-14 13:46:14