
Data Platform for AI/ML WekaIO Announces Marked Results from Testing with Microsoft Research

The Weka File System helps chief data officers and data scientists derive the full benefit from their IT infrastructure by fully utilizing storage performance and NVIDIA GPUs

WekaIO™ (Weka), the fastest-growing data platform for artificial intelligence/machine learning (AI/ML), life sciences research, enterprise technical computing, and high-performance data analytics (HPDA), today announced the results of testing conducted with Microsoft Research, which showed that the Weka File System (WekaFS™) produced some of the highest aggregate NVIDIA® Magnum IO GPUDirect® Storage throughput of any storage system tested to date. Weka solves the storage challenges common to I/O-intensive workloads such as AI, delivering high bandwidth, low latency, and single-namespace visibility across the entire data pipeline so that chief data officers (CDOs), data scientists, and data engineers can accelerate MLOps and shorten time-to-value.

The tests were conducted at Microsoft Research using a single NVIDIA® DGX-2™ server* connected to a WekaFS cluster over an NVIDIA Mellanox InfiniBand switch. Working with WekaIO and NVIDIA specialists, the Microsoft Research engineers achieved one of the highest levels of throughput yet measured to the server’s 16 NVIDIA V100 Tensor Core GPUs using NVIDIA GPUDirect Storage (GDS). The result was verified by running the NVIDIA gdsio utility for more than 10 minutes and confirming sustained performance over that duration.
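To give a sense of how such a sustained run might look, the sketch below uses Python’s subprocess module to launch NVIDIA’s gdsio utility against a file on a WekaFS mount for a ten-minute GPUDirect Storage read test. The install path, mount point, file name, and sizes are assumptions for illustration only; this is not the exact command used at Microsoft Research, and flag behavior can vary between GDS releases.

```python
import subprocess

# Illustrative sketch only -- not the configuration used in the announced tests.
# The gdsio install path, WekaFS mount point, and test file are assumptions,
# and the file is assumed to have been pre-populated (e.g., by a prior write run).
cmd = [
    "/usr/local/cuda/gds/tools/gdsio",   # typical gdsio install location (assumed)
    "-f", "/mnt/weka/gdsio-testfile",    # test file on the WekaFS mount (assumed)
    "-d", "0",                           # target GPU index
    "-w", "16",                          # worker threads
    "-s", "500G",                        # data set size
    "-i", "1M",                          # per-I/O transfer size
    "-x", "0",                           # transfer mode 0 = GPUDirect Storage path
    "-I", "0",                           # I/O type 0 = sequential read
    "-T", "600",                         # run for 600 s to show sustained throughput
]
subprocess.run(cmd, check=True)          # gdsio reports aggregate throughput on completion
```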

WekaFS is the world’s fastest and most scalable POSIX-compliant parallel file system, designed to transcend the limitations of legacy file systems built on local storage, NFS, or block storage, making it ideal for data-intensive AI and high-performance computing (HPC) workloads. WekaFS is a clean-sheet design that integrates NVMe-based flash storage for the performance tier with GPU servers, object storage, and ultra-low-latency interconnect fabrics in an NVMe-over-Fabrics architecture, creating an extremely high-performance scale-out storage system. WekaFS performance scales linearly as servers are added to the storage cluster, allowing the infrastructure to grow with the increasing demands of the business.
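As a concrete illustration of what the POSIX-compliant namespace means for GPU applications, the minimal sketch below reads a file on a WekaFS mount directly into GPU memory over the cuFile/GPUDirect Storage path using the open-source RAPIDS KvikIO bindings. KvikIO, the mount point, and the file name are not part of Weka’s announcement; they are assumptions chosen only to show the application-side view.

```python
import cupy
import kvikio

# Minimal sketch, assuming WekaFS is mounted at /mnt/weka (hypothetical path):
# read file data straight into a GPU buffer via the cuFile/GPUDirect Storage path.
path = "/mnt/weka/dataset/shard-0000.bin"          # file on the WekaFS mount (assumed)
buf = cupy.empty(256 * 1024 * 1024, dtype="u1")    # 256 MiB destination buffer on the GPU

with kvikio.CuFile(path, "r") as f:                # open the file for GDS reads
    nbytes = f.read(buf)                           # DMA from storage into GPU memory

print(f"read {nbytes} bytes directly into GPU memory")
```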

“The results from the Microsoft Research lab are outstanding, and we are pleased the team can utilize the benefits of their compute acceleration technology using Weka to achieve their business goals,” said Ken Grohe, president and chief revenue officer at WekaIO. “There are three critical components needed to achieve successful results like these from the Microsoft Research lab: compute acceleration technology such as GPUs, high speed networking, and a modern parallel file system like WekaFS. By combining WekaFS and NVIDIA GPUDirect Storage (GDS), customers can accelerate AI/ML initiatives to dramatically accelerate their time-to-value and time-to-market. Our mission is to continue to fill the critical storage gap in IT infrastructure that facilitates agile, accelerated data centers.”

“Tests were run on a system that has WekaFS deployed in conjunction with multiple NVIDIA DGX-2 servers in a staging environment and allowed us to achieve the highest throughput of any storage solution that has been tested to date,” said Jim Jernigan, principal R&D systems engineer at Microsoft Corp. “We were impressed with the performance metrics we were able to achieve from the GDS and WekaFS solution.”

For more information on WekaFS, visit https://www.weka.io/parallel-file-system/. To request a free trial of WekaFS, go to Get Started with Weka.

Additional resources:

  • Blogs
    • Microsoft Research Customer Use Case: WekaIO and NVIDIA GPUDirect Storage Results with NVIDIA DGX-2 Servers
    • How GPUDirect Storage Accelerates Big Data Analytics
    • Weka AI and NVIDIA Accelerate AI Data Pipelines
  • NVIDIA GPUDirect Storage Webinar (replay available)
  • WekaFS: 10 Reasons to Deploy the WekaFS Parallel File System
  • Video: Visualizing 150TB of Data using GPUDirect Storage and the Weka File System

*The NVIDIA DGX-2 server used a non-standard configuration, with single-port NICs replaced by dual-port NICs.
