In this edition of Voices of the Industry, Bob Shine, VP of Marketing and Product Management at Telescent, explores how the rising costs of optics in networks can be managed by dynamic resource allocation at the fiber layer.
As optical networks have evolved to higher data rates to handle the astronomical increases in traffic, the percentage of network costs associated with optical components has grown significantly. Finding ways to economically address this increasing percentage of optics costs has become a focus for many network operators and suppliers. Dynamic resource allocation at the fiber layer represents a way to improve optical network economics and can be implemented today using large-scale robotic fiber cross connects.
The increasing share of optical component costs in networks is driven by the need for networks to support higher port densities and faster speeds. This in turn requires higher-speed optics, as seen in the transition from 40G to 100G and now to 400G. Moore’s law has driven down the cost/bit for switches and routers through advanced silicon ASIC development. However, while the cost/bit for pluggable optics has also decreased, it has not decreased nearly as fast as the corresponding silicon component costs.
Based on the above economic trends, Cisco stated that for 10G networks optics represented about 10% of the total hardware cost of a data center network. In contrast, optics will represent over 50% of the total hardware costs for a 400G network. [Optics in the Data Center: Powering Ever-Increasing Capacity Demands – Cisco Blogs]
As you can imagine, this cost challenge occurs in a wide variety of areas: basically any situation where compute and storage components are combined with high-speed transmission requirements. These include long-haul and now 5G networks, data center networks and even machine learning resources within data centers. Backbone optical networks were often the first areas of the network to migrate to higher data rates due to the tremendous amount of traffic transmitted over limited and expensive long-haul fibers. The densification of 5G networks and the front-haul requirements to the radio towers increase the optical network cost for 5G. Within data centers, scaling with growth while minimizing capital expense requires efficient ways to add capacity, which dynamic fiber management can provide. Even applications such as machine learning, which are very compute intensive, will see an increasing share of optical component costs due to the need to transfer ever-growing parameter sets between learning iterations. The following sections will demonstrate how all these cases can benefit from dynamic resource allocation at the optical layer.
The Benefit of Dynamic Fiber Cross Connects for Long-Haul Networks
For core optical networks, a group at AT&T showed significant cost savings and improved robustness through the use of software-defined networking (SDN) control and dynamic fiber cross connects (DFCC) to provide joint optimization of the IP and optical layer. In essence, the SDN and DFCC allow the disaggregation of a failed link, allowing working components of the failed link to be reused in a re-engineered route. For a central office with two large core routers, rather than taking both a router port and transponder offline if one component fails or needs a software reboot, a dynamic fiber cross connect allows the working component to be recombined with a port in the second router. Even without a failure, the DFCC can be used to recombine router / transponder pairs to create more efficient routing. The addition of just the automated fiber cross connect in this use case created an estimated saving of 5% across the entire network stack.
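The recombination idea described above can be illustrated with a minimal Python sketch. It is not the AT&T implementation. The `Link` type, the `recombine` function and the component names are hypothetical, and the sketch assumes the cross connect can join any surviving router port to any surviving transponder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One IP link: a router port wired to a transponder (hypothetical model)."""
    router_port: str
    transponder: str


def recombine(links, failed):
    """Split links into intact ones and new pairings of surviving components.

    `failed` is a set of component names that are down. A link with one
    failed half frees its working half, which the cross connect can then
    pair with a working half freed from another broken link.
    """
    intact, spare_ports, spare_transponders = [], [], []
    for link in links:
        port_ok = link.router_port not in failed
        transponder_ok = link.transponder not in failed
        if port_ok and transponder_ok:
            intact.append(link)
        elif port_ok:
            spare_ports.append(link.router_port)
        elif transponder_ok:
            spare_transponders.append(link.transponder)
    # Pair spares one-to-one; any unmatched spare simply stays idle.
    recovered = [Link(p, t) for p, t in zip(spare_ports, spare_transponders)]
    return intact, recovered


# Illustrative data only: router 1 loses a port, router 2 loses a transponder.
links = [Link("router1/port1", "transponder-A"),
         Link("router2/port1", "transponder-B")]
intact, recovered = recombine(links, failed={"router1/port1", "transponder-B"})
print(recovered)
```

Without the cross connect both links would stay down. With it, the working port on the second router and the working transponder from the first link form one usable link.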
The Benefit of Dynamic Fiber Cross Connects while Scaling Data Center Capacity
Moving inside the data center, although construction must be completed before commissioning, data centers are typically filled slowly with servers and switches. This allows capacity to match the growing demand while avoiding idle capacity. The initial moderate-scale network will need to be expanded while the network is carrying live traffic. While other network topologies have been investigated to allow fine-grained expansion, it is possible to expand a traditional Clos network using a novel topology-driven methodology, as demonstrated by work done at Google. Live expansion requires maintaining sufficient network throughput during the entire course of the expansion to avoid congestion. This is done through multiple automated stages where a limited subset of network elements is disconnected and added. Since capacity is removed during these rewirings, each stage should be completed as fast as possible. To simplify the reconfiguration, all the server blocks and all the spine blocks are connected through a group of patch panels, creating a DCN topology that can be created and modified simply by moving fiber jumpers on the patch panel. A robotic fiber cross connect system can greatly simplify the management of this rewiring during upgrades.
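The staging constraint described above, taking only a limited subset of connections offline at a time, can be sketched in a few lines of Python. This is a simplification and not the topology-driven method from the Google work: it assumes every jumper carries equal capacity, and the `plan_stages` function and jumper names are hypothetical.

```python
def plan_stages(jumpers, min_live):
    """Group jumpers into rewiring stages.

    At most len(jumpers) - min_live jumpers are disconnected in any one
    stage, so at least `min_live` jumpers keep carrying traffic throughout.
    """
    max_down = len(jumpers) - min_live
    if max_down < 1:
        raise ValueError("min_live leaves no jumper free to rewire")
    return [jumpers[i:i + max_down] for i in range(0, len(jumpers), max_down)]


# Illustrative data only: 8 jumpers, at least 6 must stay live at all times.
jumpers = [f"jumper-{i}" for i in range(8)]
stages = plan_stages(jumpers, min_live=6)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: rewire {stage}")
```

A tighter throughput requirement (a larger `min_live`) yields more, smaller stages, which is why the time to complete each stage matters.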
The Benefit of Dynamic Fiber Cross Connects for Machine Learning Resources
Finally, inside data centers machine learning is becoming an increasingly large share of the compute capability. While the use case is dominated by the development of advanced GPU and TPU processing capability, the very large and growing training sets create a heavy demand for communication bandwidth between GPUs. The fastest ML training platforms from NVIDIA offer 1.2 Tbps of communication bandwidth between GPUs. But while conventional data center workloads have unpredictable behavior with short flows dominating the workload, machine learning workloads are predictable, sparse and consist mostly of large transfers. By taking advantage of these features, a dynamic fiber cross connect that adjusts the communication bandwidth between GPUs based on the training model can significantly improve performance.
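Because the traffic pattern of a training job is predictable, the available fiber links could in principle be divided among GPU pairs in proportion to expected demand. The Python sketch below shows one simple way to do this, using largest-remainder rounding. The `allocate_links` function, the GPU names and the demand figures are hypothetical and do not describe any vendor's scheduler.

```python
def allocate_links(demand, total_links):
    """Split `total_links` fiber links across GPU pairs in proportion to demand.

    `demand` maps a (gpu, gpu) pair to its expected traffic in any consistent
    unit. Fractional shares are rounded down, then leftover links go to the
    pairs with the largest fractional remainders.
    """
    total_demand = sum(demand.values())
    if total_demand <= 0:
        raise ValueError("demand must contain positive traffic")
    shares = {pair: total_links * d / total_demand for pair, d in demand.items()}
    allocation = {pair: int(share) for pair, share in shares.items()}
    leftover = total_links - sum(allocation.values())
    by_remainder = sorted(demand, key=lambda p: shares[p] - allocation[p],
                          reverse=True)
    for pair in by_remainder[:leftover]:
        allocation[pair] += 1
    return allocation


# Illustrative data only: relative traffic between three GPUs for one model.
demand = {("gpu0", "gpu1"): 60, ("gpu0", "gpu2"): 30, ("gpu1", "gpu2"): 10}
allocation = allocate_links(demand, total_links=8)
print(allocation)
```

When a different training model changes the demand figures, rerunning the allocation gives the new set of cross connects to configure.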
An example of a dynamic fiber cross connect that addresses the use cases above is a Network Topology Manager (NTM). When selecting an NTM, look for one that uses a robot to remotely configure and reconfigure cross connects in minutes. Ideally, you’ll also want to select a pay-as-you-grow NTM model that can scale the number of ports while still preserving the ability for any-to-any connectivity. You’ll also want a system with low-loss and latched connections, and one that’s fully field maintainable without interrupting traffic. Finally, your NTM should be NEBS Level 3 certified to ensure the dependability that customers demand.
It is certain that data traffic will continue to grow, driving the need for increased bandwidth at all levels of the data network. While controlling costs will require advances in a number of areas, adding dynamic resource management at the fiber layer is one way to more efficiently address the rising share of optics costs in the network.
This article was written by Bob Shine, VP of Marketing and Product Management, Telescent. Telescent’s Network Topology Manager (NTM) uses a robot to remotely configure and reconfigure cross connects in minutes. It provides dynamic control at the fiber layer, is NEBS certified and available to meet your needs today. To learn more, contact Telescent today.