BE Networks Blog

BE Networks and NVIDIA Partner to Automate GPU Farms with Spectrum-X Ethernet

Today we are proud to announce our strategic partnership with NVIDIA to bring a new level of automation and operational efficiency to GPU farm networking. By integrating BE’s Verity software platform with NVIDIA’s Spectrum-X Ethernet switches and Cumulus Linux, the two companies are enabling operators to stand up and manage high-performance Ethernet fabrics with precision and reliability, from day-zero provisioning through daily operations.

Meeting the Demands of Modern GPU Farms

Over the last decade, the growth of GPU-accelerated computing has reshaped data centers. Training large-scale AI models and running inference pipelines require clusters with thousands of GPUs operating at peak efficiency. The compute side of the equation has advanced rapidly, but the network fabric tying these GPUs together has often been the bottleneck.

Traditional manual provisioning methods or fragmented automation approaches cannot keep pace with the performance requirements or the scale of GPU farms built on Spectrum-X Ethernet. Operators need deterministic outcomes, predictable performance, and a way to reduce operational risk. The partnership between BE Networks and NVIDIA directly addresses these needs.

“Spectrum-X Ethernet was built to deliver lossless Ethernet optimized for AI workloads. By integrating with BE Verity, we are making it easier than ever for enterprise customers to deploy and operate these fabrics without the need for a large, on-site, and highly skilled NetOps team.”

Amit Katz, VP Networking, NVIDIA

The Role of Spectrum-X Ethernet and Cumulus Linux

NVIDIA Spectrum-X Ethernet switches have set the standard for ultra-low latency and lossless networking in GPU farms. They provide predictable throughput, congestion control, and optimized data paths for AI workloads. When combined with Cumulus Linux, operators gain the flexibility of a modern network operating system while retaining the deterministic forwarding behavior required for AI clusters.

However, operating these fabrics at scale is not trivial. A typical GPU farm can include hundreds of leaf and spine switches, with requirements for precise queue management, ECN configuration, and traffic engineering. Any misstep in provisioning or ongoing configuration can create congestion hotspots that cripple GPU efficiency. This is where BE Networks’ Verity software steps in.

BE Verity: Automation Built for Ethernet Fabrics

Verity was built by engineers who have spent decades designing, deploying, and troubleshooting large-scale Ethernet networks. The platform automates the entire lifecycle of the fabric. From zero-touch provisioning of new Spectrum-X Ethernet switches to day-two operations such as monitoring, upgrades, and policy changes, Verity eliminates manual steps and reduces human error.

Key features of Verity include:

  • ZTP-based onboarding: New switches can be racked, powered on, and automatically provisioned with the correct Cumulus Linux image, configurations, and policies.
  • Deterministic fabric provisioning: Verity enforces topology-aware configuration, ensuring that every link, port, and queue is set up correctly.
  • Closed-loop validation: The software constantly validates that the running state of the fabric matches the intended state, closing the gap between design and operations.
  • Integrated telemetry: Through Satori, our integrated observability platform, real-time fabric health monitoring gives engineers visibility into latency, buffer utilization, and congestion events.
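
To make the closed-loop validation idea concrete, here is a minimal Python sketch of comparing intended state against running state and reporting drift. It is an illustration only and not Verity's implementation or API: the `PortState` fields, the `find_drift` function, and the sample port names and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortState:
    """Hypothetical per-port settings; the field names are illustrative only."""
    mtu: int
    ecn_enabled: bool
    admin_up: bool


def find_drift(intended: dict[str, PortState],
               running: dict[str, PortState]) -> dict[str, str]:
    """Return a map of port name -> description of how it differs from intent."""
    drift = {}
    for port, want in intended.items():
        have = running.get(port)
        if have is None:
            drift[port] = "missing from running state"
        elif have != want:
            drift[port] = f"expected {want}, found {have}"
    # Ports that exist on the device but were never part of the design.
    for port in running.keys() - intended.keys():
        drift[port] = "present in running state but not intended"
    return drift


# Sample data (made up for this sketch).
intended = {
    "swp1": PortState(mtu=9216, ecn_enabled=True, admin_up=True),
    "swp2": PortState(mtu=9216, ecn_enabled=True, admin_up=True),
}
running = {
    "swp1": PortState(mtu=9216, ecn_enabled=True, admin_up=True),
    "swp2": PortState(mtu=1500, ecn_enabled=True, admin_up=True),
    "swp3": PortState(mtu=9216, ecn_enabled=False, admin_up=True),
}

report = find_drift(intended, running)
for port in sorted(report):
    print(port, "->", report[port])
```

In a real closed loop, a check like this would run continuously against state collected from the switches, and any drift would trigger an alert or a remediation step rather than a print statement.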

Experience Verity Now with NVIDIA Air

Using NVIDIA Air, customers can easily learn how to use Verity as well as perform complex network simulations including digital twins, automation workflows, config rendering, and hardware evaluation.

Why Automation Matters for AI Networking

In GPU farms, networking is not just plumbing. It is a performance multiplier. A congestion event on a single switch can slow down a training job that spans thousands of GPUs, wasting compute cycles that cost millions of dollars per week. The economics of AI make automation essential.

Manual CLI-based provisioning does not scale. Scripting can help but often introduces complexity and fragility. What is required is a purpose-built automation platform that understands the nuances of Ethernet fabrics and is tightly integrated with the underlying hardware and operating system. Verity was designed for exactly this role.

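Defining network infrastructure as code lets you integrate it into CI/CD pipelines. This brings modern DevOps practices to the networking space. You can run tests, perform compliance checks, and automatically deploy changes in a controlled and auditable way. Operations become much easier to scale without sacrificing quality or control.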
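
As a rough illustration of what a compliance check in such a pipeline could look like, the Python sketch below validates a fabric definition against two simple policies before any change is deployed. Everything in it is an assumption made for the example: the `FABRIC` structure, the `check_compliance` function, and the policy values (at least two uplinks per leaf, an MTU of 9216) are hypothetical and do not describe Verity's data model.

```python
# Hypothetical fabric definition, as it might be loaded from a file in a repository.
FABRIC = {
    "leaf01": {"role": "leaf", "uplinks": ["spine01", "spine02"], "mtu": 9216},
    "leaf02": {"role": "leaf", "uplinks": ["spine01"], "mtu": 9216},
    "spine01": {"role": "spine", "uplinks": [], "mtu": 9216},
    "spine02": {"role": "spine", "uplinks": [], "mtu": 1500},
}


def check_compliance(fabric, min_uplinks=2, required_mtu=9216):
    """Return a list of policy violations; an empty list means the definition passes."""
    violations = []
    for name, switch in sorted(fabric.items()):
        # Policy 1 (assumed): every switch uses the same fabric-wide MTU.
        if switch["mtu"] != required_mtu:
            violations.append(f"{name}: mtu {switch['mtu']} != {required_mtu}")
        # Policy 2 (assumed): every leaf has redundant uplinks.
        if switch["role"] == "leaf" and len(switch["uplinks"]) < min_uplinks:
            violations.append(
                f"{name}: {len(switch['uplinks'])} uplink(s), "
                f"policy requires at least {min_uplinks}"
            )
        # Sanity check: uplinks must point at switches that exist in the definition.
        for peer in switch["uplinks"]:
            if peer not in fabric:
                violations.append(f"{name}: uplink to unknown switch {peer}")
    return violations


violations = check_compliance(FABRIC)
for violation in violations:
    print("FAIL", violation)
print("compliant" if not violations else f"{len(violations)} violation(s) found")
```

A CI job would typically exit with a non-zero status when the list is non-empty, so the change is blocked before it reaches the fabric.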

A Partnership Built on Real Engineering

The collaboration between NVIDIA and BE Networks is more than a marketing announcement. Engineers from both companies have worked side by side to ensure that Verity can take full advantage of the capabilities of Spectrum-X Ethernet and Cumulus Linux.

“GPU farms are only as good as the network that connects them. With Verity automating Spectrum-X Ethernet fabrics, operators can finally trust that their Ethernet network will keep pace with the enormous demands of AI. This partnership delivers predictable networking at AI scale.”
Amir Elbaz, Founder & CEO, BE Networks

Through joint testing in large-scale GPU farm environments, Verity has demonstrated that it can automate fabrics spanning thousands of ports while maintaining the deterministic performance Spectrum-X Ethernet is known for. Operators can now deploy AI clusters faster, with less risk, and with greater confidence in long-term stability.

Operational Impact for Customers

As GPU farms continue to scale, the need for automation will only grow. BE Networks and NVIDIA are committed to expanding the capabilities of Verity and Spectrum-X Ethernet to handle larger fabrics, more complex topologies, and tighter integration with AI frameworks.

Future roadmap items include enhanced telemetry integration with NVIDIA’s AI software stack, as well as expanded support for hybrid topologies that combine Ethernet and InfiniBand. The partnership is designed to evolve with customer needs and the rapid pace of AI innovation.

Conclusion

The partnership between BE Networks and NVIDIA represents a significant milestone for AI networking. By combining Spectrum-X Ethernet switches, Cumulus Linux, and BE’s Verity automation software, operators of GPU farms can now provision and manage their fabrics with unprecedented speed, reliability, and confidence.

For network engineers who have spent years managing complex data center fabrics, this collaboration feels like a turning point. Instead of patchwork scripts and manual troubleshooting, we now have an integrated, battle-tested solution that reflects the way networks should be built and operated.

The future of AI will be defined not only by GPUs but by the networks that connect them. With NVIDIA and BE Networks working together, that future just became more predictable, more efficient, and more attainable.

Josh Saul

VP of Product Marketing

Josh Saul has pioneered open source network solutions for more than 25 years. As an architect, he built core networks for GE, Pfizer and NBC Universal. As an engineer at Cisco, Josh advised customers in the Fortune 100 financial sector and evangelized new technologies to customers. More recently, Josh led marketing and product teams at VMware (acquired by Broadcom), Cumulus Networks (acquired by NVIDIA), and Apstra (acquired by Juniper).
