ALTO J. Ros-Giralt Internet-Draft S. Yellamraju Intended status: Informational Qualcomm Expires: 27 March 2023 Q. Wu Huawei L.M. Contreras Telefonica R. Yang Yale University K. Gao Sichuan University 23 September 2022 Supporting Bottleneck Structure Graphs in ALTO: Use Cases and Requirements draft-giraltyellamraju-alto-bsg-requirements-03 Abstract This document proposes an extension to the base Application-Layer Traffic Optimization (ALTO) protocol to support bottleneck structures as an efficient representation of the state of a network. Bottleneck structures are efficient computational graphs that allow network operators and application service providers to optimize application performance in a variety of communication problems including routing, flow control, flow scheduling, bandwidth prediction, and network slicing, among others. This document introduces a new abstraction called Bottleneck Structure Graph (BSG) and the necessary requirements to integrate it into the ALTO standard. About This Document This note is to be removed before publishing as an RFC. The latest revision of this draft can be found at https://giralt.github.io/draft-ietf-alto-gradient-graph/draft- giraltyellamraju-alto-bsg-requirements.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft- giraltyellamraju-alto-bsg-requirements/. Discussion of this document takes place on the WG Working Group mailing list (mailto:alto@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/alto/. Source for this draft and an issue tracker can be found at https://github.com/giralt/draft-ietf-alto-gradient-graph. Ros-Giralt, et al. Expires 27 March 2023 [Page 1] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on 27 March 2023. Copyright Notice Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conventions and Definitions . . . . . . . . . . . . . . . . . 4 3. Brief Introduction to Bottleneck Structures . . . . . . . . . 4 3.1. Example of Bottleneck Structure . . . . . . . . . . . . . 4 3.2. Propagation Properties . . . . . . . . . . . . . . . . . 7 3.3. Forces of Interaction among Flows and Links . . . . . . . 7 3.4. Ripple Effects in a Communication Network . . . . . . . . 8 3.5. Not all Bottleneck Links Are Born Equal . . . . . . . . . 8 3.6. Quantifying the Ripple Effects . . . . . . . . . . . . . 9 3.7. Types of Bottleneck Structures . . . . . . . . . . . . . 11 3.8. Computing Optimized Network Reconfigurations . . . . . . 12 3.9. Types of Network Reconfigurations . . . . . . . . . . . . 12 4. ALTO Bottleneck Structure Service Use Cases . . . . . . . . . 15 4.1. Application Rate Limiting for CDN and Edge Cloud Applications . . . . . . . . . . . . . . . . . . . . . . 15 Ros-Giralt, et al. Expires 27 March 2023 [Page 2] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 4.2. Time-bound Constrained Flow Acceleration for Large Data Set Transfers . . . . . . . . . . . . . . . . . . . . . . . . 15 4.3. Application Performance Optimization Through AI Modeling . . . . . . . . . . . . . . . . . . . . . . . . 18 4.4. Optimized Joint Routing and Congestion Control . . . . . 19 4.5. Service Placement for Edge Computing . . . . . . . . . . 20 4.6. Training Neural Networks and AI Inference for Edge Clouds, Data Centers and Planet-Scale Networks . . . . . . . . . 21 4.7. 5G Network Slicing . . . . . . . . . . . . . . . . . . . 23 5. Example: Application Layer Traffic Optimization using Bottleneck Structures . . . . . . . . . . . . . . . . . . 23 6. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 28 6.1. Requirement 1: Bottleneck Structure Graph (BSG) Abstraction . . . . . . . . . . . . . . . . . . . . . . . 28 6.2. Requirement 2: Information Received from the Network . . 29 6.3. Requirement 3: Information Passed to the Application . . 30 6.4. Requirement 4: Features Needed to Support the Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 30 7. Security Considerations . . . . . . . . . . . . . . . . . . . 31 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 9.1. Normative References . . . . . . . . . . . . . . . . . . 33 9.2. Informative References . . . . . . . . . . . . . . . . . 33 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 35 1. Introduction Bottleneck structures have been recently introduced in [G2-SIGCOMM] and [G2-SIGMETRICS] as efficient computational graphs that embed information about the topology, routing and flow information of a network. These computational graphs allow network operators and application service providers to compute network derivatives that can be used to make traffic optimization decisions. For instance, using the bottleneck structure of a network, a real-time communication (RTC) application can efficiently infer the multi-hop end-to-end available bandwidth, and use that information to tune the encoder's transmission rate and optimize the user's Quality of Experience (QoE). Bottleneck structures can be used by the application to address a wide variety of communication optimization problems, including routing, flow control, flow scheduling, bandwidth prediction, and network slicing, among others. Ros-Giralt, et al. Expires 27 March 2023 [Page 3] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 This document introduces a new abstraction called Bottleneck Structure Graph (BSG) and the necessary requirements to integrate it into the existing ALTO services (Network Map, Cost Map, Entity Property Map and Endpoint Cost Map) exposing the properties of the bottleneck structure to help optimize application performance. Use cases are also introduced to motivate the relevancy of bottleneck structures in the context of the ALTO standard and support the description of the integration requirements. 2. Conventions and Definitions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. 3. Brief Introduction to Bottleneck Structures [G2-SIGMETRICS] and [G2-SIGCOMM] introduce a new mathematical framework to optimize network performance called the Quantitative Theory of Bottleneck Structures (QTBS). The core building block of QTBS is a computational graph called _bottleneck structure_, which allows to qualify and quantify the forces of interactions that flows and bottleneck links exert on each other. QTBS builds the bottleneck structure by assuming that flows in a network aim at maximizing throughput while ensuring fairness. This general principle holds for all the relevant congestion control algorithms used in IP networks (e.g., TCP Cubic, BBR, QUIC, etc.). 3.1. Example of Bottleneck Structure Consider as an example the following network configuration: Ros-Giralt, et al. Expires 27 March 2023 [Page 4] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 f3 f6 f1 | | | | | | +--+-+-+---+ | | | | | | | | +---+--- l1 | | | | c1=25 | | | | +--+-+-----+ | | | | +----- f2 | | | +--+-+-+---+ | | | | | ----+--+ | | | l2 | | | | c2=50 f4 ----+--+ | | | | | | | +--+-+-+---+ | | | | | +----- | | +--+-+-----+ | | | | ----+--+ +-----+---- l3 | | c3=100 ----+----+ | | | | +----+-----+ | | +----+-----+ | | | l4 f5 ---+----+ | c4=75 | | | | +----------+ Figure 1: Network configuration example. Each link li is represented by a squared box and has a capacity ci. For instance, link l1 is represented by the top most squared box and has a capacity of c1=25 units of bandwidth. In addition, each flow is represented by a line that passes through the set of links it traverses. For instance, flow f6 traverses links l1, l2 and l3. Ros-Giralt, et al. Expires 27 March 2023 [Page 5] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 The bottleneck structure of this network corresponds to the following digraph (see [G2-TREP] for details on how a bottleneck structure is computed): +------+ +------+ +------+ | | | | | | | f1 <--> l1 <---------------> f6 | | | | | +------------+ | +------+ +--^---+ | +---+--+ | | | | | | +--v---+ | | | | | | | f3 | | | | | | | +--+---+ | | | | | | | | +--v---+ | +------+ +--v---+ +------+ | <--+ | | | | | | | l2 <---->| f4 +---> l3 | | l4 | | | | | | | | | +--^---+ +------+ +--^---+ +---^--+ | | | +--v---+ | +------+ | | | | | | | | f2 | +-> f5 <-+ | | | | +------+ +------+ Figure 2: Bottleneck structure of the network in Figure 1. The bottleneck structure is interpreted as follows: * Links and flows are represented by vertices in the graph. * There is a directed edge from a link l to a flow f if and only if flow f is bottlenecked at link l. * There is a directed edge from a flow f to a link l if and only if flow f traverses link l. For instance, in Figure 2 we have that flow f3 is bottlenecked at link l1 (since there is a directed edge from l1 to f3) and it traverses links l1 and l2 (since there is a directed edge from f3 to l1 and from f3 to l2). Note that when a flow is bottlenecked at a link, then the edge connecting them in the bottleneck structure must necessarily be bidirectional. This is because a flow that is Ros-Giralt, et al. Expires 27 March 2023 [Page 6] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 bottlenecked at a link, must necessarily traverse that link too. Indeed, in Figure 2 we can see that all the directed edges from a link to a flow, are in fact bidirectional edges. This is important to ensure that bottleneck structures correctly model how perturbations in a network propagate, as we explain in the next section. 3.2. Propagation Properties Under the assumption of max-min fairness [GALLAGER], QTBS demonstrates the following two properties [G2-SIGMETRICS]: * *Property 1. Flow perturbation*. An infinitesimal change in the transmission rate of a flow f will have an effect on the transmission rate of a flow f' if and only if the bottleneck structure has a directed path from flow f to flow f'. * *Property 2. Link perturbation*. An infinitesimal change in the capacity of a link l will have an effect on the transmission rate of a flow f' if and only if the bottleneck structure has a directed path from link l to flow f. The above two properties qualitatively relate to the classic question in chaos theory: Can the flap of a butterfly's wings in Brazil set off a tornado in Texas? [LORENZ] Obviously a butterfly alone cannot create a tornado, but every element is interconnected in a distributed system, and even the flap of a butterfly's wings in Brazil will have an effect in Texas. Bottleneck structures are graphs that characterize and quantify such type of effects in a communication network. In particular, a bottleneck structure reveals how a perturbation propagates through the network, describing which flows will be affected and by what magnitude. 3.3. Forces of Interaction among Flows and Links Bottleneck structures are powerful computational graphs because they are able to capture the forces of interaction that flows and bottleneck links exert on each other. These forces of interaction are in general non-intuitive, even for a small simple network configuration like the one in Figure 1. For instance, from Property 2, the bottleneck structure reveals that a small variation in the capacity of link l2 (e.g., in a wireless network, a variation in the capacity of a link could be due to a change in the signal to noise ratio of the communication channel) will propagate through the network and have an impact on the transmission rate of flows f2, f4 and f5 (since from Property 2, the bottleneck structure has a directed path from link l2 to each of these flows). However, such a perturbation will have no effect on the transmission rate of flows Ros-Giralt, et al. Expires 27 March 2023 [Page 7] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 f1, f3 and f6 (since there is no path from l2 to any of these other flows). Similarly, from property 1, a small perturbation on the rate of flow f4 (e.g., this could be due to the effect of a traffic shaper altering the transmission rate of flow f4), will have an impact on the rate of flows f2 and f5, but it will have no effect on the rate of flows f1, f3 and f6. 3.4. Ripple Effects in a Communication Network As another example, given the network in Figure 1, it is also not intuitive to foresee that flows f1 and f5 are related to each other by the forces of interaction inherent to the communication network, even though they do not traverse any common link. Specifically, flow f1 traverses link l1, while flow f5 traverses links l3 and l4. In between both flows, there is an additional hop (link l2) further separating them. Despite not being directly connected, the bottleneck structure reveals that a small perturbation on the performance of flow f1 (i.e., a change in its transmission rate), will create a ripple effect that will reach flow f5, affecting its transmission rate. In particular, the perturbation on flow f1 will propagate through the bottleneck structure and reach flow f5 via the following two paths: f1 -> l1 -> f3 -> l2 -> f4 -> l3 -> f5 f1 -> l1 -> f6 -> l3 -> f5 It is also not intuitive to see that the reverse is not true. That is, a small perturbation on flow f5, will have no effect on flow f1, since the bottleneck structure has no direct path from vertex f5 to vertex f1. In [G2-SIGMETRICS], extensive empirical validation of these results is presented for a variety congestion-controlled IP networks. 3.5. Not all Bottleneck Links Are Born Equal Bottleneck structures also reveal that not all bottleneck links have the same relevancy. In Figure 2, links at the top of the graph have a higher impact on the overall performance of the network than links at the bottom. For instance, consider link l1. A variation on its capacity will create a ripple effect that will impact the performance of all the flows in the network, since all flow vertices are reachable from vertex l1 according to the bottleneck structure. In contrast, link l3 has a much smaller impact on the overall performance of the network, since a variation of its capacity will affect flow f5, but have no impact on any of the other flows. This information is valuable to network operators and application service providers as it can be used to make informed network optimization Ros-Giralt, et al. Expires 27 March 2023 [Page 8] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 decisions. For instance, in edge computing, an operator could choose to place a containerized service (e.g., for extended reality, XR) on compute nodes that would yield communication paths traversing bottleneck links with lower impact on the overall performance of the network (See the use case in Section 4.5 for more details). Similarly, in network slicing (or capacity planning in general), operators could choose to allocate more bandwidth on those links that are more influential (i.e., those links that are at lower levels in the bottleneck structure) according to the expected traffic pattern in the network slice. Overall, bottleneck structures provide a mechanism to rank bottleneck links according to their impact on the overall performance of the network. This information can be used in a variety of optimization problems, such as traffic engineering, routing, capacity planning, or resilience analysis, among others. 3.6. Quantifying the Ripple Effects Bottleneck structures not only allow network operators to reason about the qualitative nature of the forces that flows and links exert on each other, but they also provide a mathematical framework to quantify such forces. In particular, the Quantitative Theory of Bottleneck Structures (QTBS) introduced in [G2-TREP] provides a mathematical framework that uses bottleneck structures as efficient computational graphs to quantify the impact that perturbations in a network have on all of its flows. One of the core building blocks of the QTBS framework is the concept of _link and flow equations_, which mathematically characterize how a perturbation in a network propagates through each of the link and flow vertices in the bottleneck structure. (See [G2-TREP] for an exact mathematical formulation.) Because quantifying the effect of a perturbation on a system is nothing more than computing a derivative of the system's performance with respect to the parameter that's been perturbed, bottleneck structures can be used as efficient and scalable computational graphs to calculate flow and link derivatives in a communication network. Consider for instance the computation of the following derivative: dF()/dri where F() represents the total throughput of the network (the sum of all flows' throughput) and ri is the transmission rate of flow fi. For example, this expression can be used to compute the effect of traffic shaping a flow (slightly reducing its rate) on the total throughput of the network. Computing this derivative using a Ros-Giralt, et al. Expires 27 March 2023 [Page 9] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 traditional calculus approach is both very complex and costly, since it requires modeling the congestion control algorithm in the function F(), for which there is no closed form solution. Using bottleneck structures, however, the computation of this derivative is both simple and inexpensive. It is simple because it can be done by applying an infinitesimal change to the rate of flow fi and then using the link and flow equations to measure how this perturbation propagates through the bottleneck structure [G2-TREP], [G2-SIGCOMM]. It is also very efficient because the computation is performed by applying delta calculations on the bottleneck structure, without involving links and flows that are not affected by the perturbation. For instance, in Figure 1, the computation of dF()/df4 only requires recomputing the transmission rates of flows f2 and f5, without the need to recompute the rates of f1, f3 and f6, since these other flows are not affected by the perturbation. In practice, QTBS provides a methodology to compute network derivatives two or three orders of magnitude faster than general purpose methods such as liner programming [G2-SC]. We finish this brief introduction to QTBS by stating the monotonic bandwidth allocation property that all bottleneck structures satisfy: * *Property 3. Monotonic bandwidth allocation (MBA).* Let si be the transmission rate of the flows bottlenecked at link li. Then, for any path in the bottleneck structure of the form l1 -> f1 -> l2 -> f2 -> (...) -> ln -> fn we have that s1 < s2 < (...) < sn The MBA property is relevant in that it states that bottlenecks located at higher levels in the bottleneck structure will have more bandwidth available than those located at lower levels. For instance, this property indicates that an application requiring high bandwidth should route its traffic through paths that involve links at higher levels in the bottleneck structure. We will be using the MBA property to reason about application performance in some of the examples described in this document. Ros-Giralt, et al. Expires 27 March 2023 [Page 10] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 3.7. Types of Bottleneck Structures While QTBS introduces a core definition of bottleneck structure (see Section 3.1), there exist multiple types of bottleneck structures that can be computed depending on the level of granularity and information desired by the operator. Next, we introduce three types of bottleneck structures that will be used in this document and that are suitable to optimize application performance in the context of the ALTO standard: * *Flow gradient graph (FGG)*. This type of bottleneck structure corresponds to the base definition introduced in Section 3.1. The FGG has the finest level of granularity, including a vertex in each graph for each link and flow in the network. Therefore, an FGG can be relatively large (e.g, with millions of vertices). * *Path gradient graph (PGG)*. One technique to reduce the size of the bottleneck structure without affecting its accuracy is to collapse all the vertices of the flows that follow the same path into a single vertex called a _path vertex_. The resulting bottleneck structure is called the path gradient graph (PGG). A PGG usually has 2 or 3 orders of magnitude less vertices than the FGG. * *QoS-Path gradient graph (Q-PGG)*. Some networks assign different types of traffic to different QoS classes. A Q-PGG can model QoS by collapsing all the vertices of the flows that follow the same path and have the same QoS class into a single vertex called a _Q- path vertex_. A Q-PGG is slightly larger than a PGG (with about |Q| times more vertices, where |Q| is the number of QoS classes supported by the network) but still significantly smaller than the FGG. For most of the applications, it is recommended to use a PGG, or a Q-PGG if the network supports QoS classes, since these bottleneck structures are significantly smaller and faster to process than an FGG, and it is often the case that the operator does not need to know flow-level information in order to make proper application performance optimization decisions. Note also that the PGG and the Q-PGG provide the additional security advantage of hiding flow-level information from the graph. This can be important to operators that are sensitive about security and privacy. Ros-Giralt, et al. Expires 27 March 2023 [Page 11] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 3.8. Computing Optimized Network Reconfigurations A central element to the theory of bottleneck structures is the ability to efficiently compute derivatives on a network. Derivatives are a core building block of the optimization framework, as they reveal the directions (gradients) in the feasible set that can help bring the network to a higher level of performance. In this document, we will refer to these directions in the feasible set as _network reconfigurations_, since that's what they effectively are in the physical world. For instance, an example of network reconfiguration can be the action of rate limiting a flow carrying XR traffic to match the available bandwidth along its path with the goal to improve its QoE. Another example of network reconfiguration is the action of rerouting traffic through a new path in order to accelerate the transfer of a large backup data set between two cloud data centers. A third example can be the deployment of a new network slice in a 5G network in order to ensure the QoS of a V2X service. In each of these actions, the network configuration is moved along a direction (a gradient, if the change maximally improves the performance objective) within the feasible set of possible configurations. While derivatives describe how the performance of a network changes when a very small (infinitesimal) change is applied to its configuration, network reconfigurations can accept changes to the network that are arbitrarily large. For instance, traffic shaping a set of flows to reduce their rates by 10 Mbps is a network reconfiguration that is not infinitesimal. We note that bottleneck structures can also be used to compute optimized network reconfigurations consisting of non-infinitesimal changes in the network. This can be done by first computing derivatives using the bottleneck structure to find a direction (gradient) in the feasible set, and then reconfiguring the network by following that direction. This process can be repeated iteratively until a final optimized reconfiguration is achieved. (See for example [G2-SIGCOMM] and [G2-TREP] for examples of algorithms using this technique.) In the next section, we summarize some of the network reconfigurations that can be optimized by using bottleneck structures. 3.9. Types of Network Reconfigurations The following is a list of some of the network reconfigurations that can be efficiently computed and optimized using bottleneck structures: Ros-Giralt, et al. Expires 27 March 2023 [Page 12] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 * *Flow routing*. Both the operation of routing a new flow or rerouting an existing flow on a network can be modeled as a perturbation, whose impact can be efficiently measured using bottleneck structures. In particular, QTBS can be used to resolve the joint congestion control and routing optimization problem for individual flows (see Section 3.1 in [G2-TREP]). * *Traffic shaping*. Traffic shaping a flow corresponds to the action of taking a derivative with respect to the rate of the flow. Bottleneck structures can be used by network operators and application service providers to compute such perturbations. For instance, to accelerate a large scale data transfer, an application can use bottleneck structures to identify optimal traffic shaping configurations (see Section 3.3 in [G2-TREP]). * *Bandwidth enforcement*. In high-performance networks that target close to 100% link utilization such as Google's B4 network [B4-SIGCOMM], a centralized SDN controller is used collect the state of the network and compute an optimized multipath bandwidth allocation vector. The solution is then deployed at the edge of the network using a technique known as bandwidth enforcement [BE-SIGCOMM]. By using bottleneck structures to efficiently compute changes in the bandwidth allocated to each flow path, operators can efficiently derive improved bandwidth allocation vectors. * *Flow scheduling*. When a flow initiates transmitting data on a network, it uses bandwidth along its path, creating a ripple effect that impacts the performance of other flows in the network. Similarly, the termination of a flow frees bandwidth along its path, creating another perturbation that propagates through the network. Bottleneck structures can efficiently model and compute the effect of flow arrival and departure in a communication network by using simple delta calculations according to the link and flow equations (see Section 3.6 and [G2-SIGCOMM]). This information can be used by applications that need to perform bulk data transfer to decide when to schedule a flow. More in particular, it can be used to enhance the ALTO Cost Calendar service [RFC8896]. Ros-Giralt, et al. Expires 27 March 2023 [Page 13] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 * *Service placement*. Deploying application services in a network requires deciding the location of the compute and storage resources needed to run the service. For instance, in edge computing, an extended reality (XR) server could be deployed at the distributed unit (DU), the central unit (CU), the mobile core (MC) or the central cloud [PETERSON]. Bottleneck structures can be used to measure the effect of placing a service on each of the candidate locations, helping the application service provider to make optimized decisions. * *Multi-job scheduling*. Running a job on a network implies a number of flows will be initiated and terminated throughout the execution of the job. the ripple effects generated from the execution of a job can also be measured using bottleneck structures. This can be used to decide when to optimally launch one or more jobs. For instance, in a data center, bottleneck structure analysis can help the application decide how to optimally schedule multiple AI training or inference jobs that are sharing the same interconnect [G2-SIGCOMM]. * *Link capacity upgrades*. In capacity planning, operators often have a fixed budget and need to decide how to optimally add capacity to a network in order to maximize its performance. The effect of a link upgrade operation can be computed as a derivative with respect to a change (an increase) in the capacity of a link. Through the processing of historical flow information from the network (e.g., NetFlow logs), bottleneck structures can efficiently compute the effect of each link upgrade and identify those that yield maximal performance. * *Path shortcuts*. Operators in wide area networks need to decide whether a communication path should be set up as purely optical (bypassing layer 3 routing) or undergo an optical-to-electrical- to-optical (OEO) conversion at certain routers in order to perform layer 3 routing [SH-SIGCOMM]. The trade-off is one of cost- efficiency versus better routing control of the network. Bottleneck structures can be used to search for paths that are optimally suitable for being offloaded to a purely optical path. These are also known in the literature as path shortcuts [SH-SIGCOMM]. Ros-Giralt, et al. Expires 27 March 2023 [Page 14] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 4. ALTO Bottleneck Structure Service Use Cases Applications of bottleneck structure analysis expand through a broad class of optimization problems that include traffic engineering, routing, flow scheduling, resiliency analysis, network slicing, service level agreement (SLA) management, network design and capacity planning, to name only a few. In this section, we briefly describe some of the use cases that relate to the objectives of the IETF ALTO Standard. 4.1. Application Rate Limiting for CDN and Edge Cloud Applications In applications such as CDN, XR or gaming, it is important to throttle the transmission rate of flows to match the true available capacity along their communication path. Transmitting at a lower rate than the available bandwidth leads to lower quality of experience (QoE). Transmitting at a higher rate increases packet losses, which wastes network resources and also leads to a lower QoE. Estimating the available bandwidth for a flow is complex because it depends on multiple factors including the network topology, the routing configuration and the set of dynamic flows using the network resources. Bottleneck structures capture in a single digraph these three factors, creating a model that allows to estimate the performance of each flow. See for instance Sections 3.1 and 3.2 in [G2-TREP] for examples on how bottleneck structures can be used to estimate the available bandwidth of an application. An ALTO server could help the application service provider obtain the available bandwidth on a given path by exposing the bottleneck structure of the network. With this information alone, the provider could directly obtain the available bandwidth. Alternatively, the application service could query the ALTO server by passing the path for which the available bandwidth needs to be computed, and the ALTO server could return this value without the need to share the complete bottleneck structure. 4.2. Time-bound Constrained Flow Acceleration for Large Data Set Transfers Bulk data transfer is an important application to both commercial and government supported networks. For instance, Google's B4 network supports large-scale data push synchronizing state across multiple global data centers [B4-SIGCOMM]. Another common use case is found in science networks, where massive data sets such as those originated from the Large Hadron Collider at CERN, Switzerland, need to be shared with scientific labs around the world. In this section, we show how bottleneck structures can be used to reconfigure a network Ros-Giralt, et al. Expires 27 March 2023 [Page 15] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 towards accelerating a given data transfer with the goal to meet a certain time constraint. To illustrate this use case, we will assume the simple bottleneck structure shown in Figure 3. +------+ | | | l1 <-----------+ | | | +--^---+ | | -1 | +1 | | +--v---+ | | | | | f1 | | | | | +------+ | | +------+ +--v---+ | | +1 | | | l2 <--------+ f2 | | | | | +--^---+ +--+---+ | -1 | +1 | | +--v---+ +--v---+ | | | | | f3 | | l3 | | | | | +--+---+ +--^---+ | -1 | -1 | | +--v---+ +--v---+ | | -1 | | | l4 <--------+ f4 | | | | | +--^---+ +------+ | +2 | +--v---+ | | | f5 | | | +------+ Figure 3: Reducing the rate of flow f1 maximally accelerates flow f5. Ros-Giralt, et al. Expires 27 March 2023 [Page 16] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 Suppose our goal is to accelerate flow f5. To achieve this objective, we will also assume that we are allowed to traffic shape (reduce) the rate of any of the other flows. Effectively, for each flow fi different than f5, we need to compute the following derivative -dr5/d_ri and then pick the maximum value. Note that in the above expression, we take the left-derivative (d_), since a traffic shaper reduces the rate of a flow. We also negate the derivative (-d), since we are interested in a positive impact induced on flow f5 when reducing the rate of another flow fi. Such a calculation can be efficiently performed using the bottleneck structure. As an example, Figure 3 illustrates how the value of (- dr5/d_r1) is computed. First, we reduce the rate of flow f1 by 1 unit. This perturbation propagates through the bottleneck structure reaching flow f5 via two paths: l1 -> f2 -> l2 -> f3 -> l4 -> f5 l1 -> f2 -> l3 -> f4 -> l4 -> f5 Using the link and flow equations (Section 3.6), each path simply flips the sign of the perturbation every time a link vertex is traversed. (The reason why the sign is flipped at each link vertex is explained by the link and flow equations that dictate how perturbations propagate through the bottleneck structure. Further mathematical descriptions to explain this effect are outside the scope of this document. For detailed mathematical derivations and additional examples, please see [G2-TREP]). When reaching vertex f5, we find that each path contributes 1 unit of bandwidth. Thus we have: -dr5/d_r1 = 1 + 1 = 2 In fact, it can be seen that this derivative is maximal. That is, traffic shaping any other flow would yield a smaller increase in the rate of f5. Thus, an operator can conclude that traffic shaping flow f1 yields an optimal strategy to maximally accelerate the rate of flow f5. Note also that in this case, there is a positive multiplier effect, since reducing flow f1's rate by 1 unit, leads to an increase on flow f5's rate by more than 1 unit. This is known as a power gradient [G2-SIGCOMM]. Ros-Giralt, et al. Expires 27 March 2023 [Page 17] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 While left outside the scope of this document, bottleneck structures can also be used to efficiently compute the value of the optimal traffic shaper (i.e., in our example, to find by how much we should traffic shape flow f1) and to quantify the impact on the flow being accelerated. This information can also be used by the application to estimate the flow's completion time. An ALTO server could help the application service provider identify an optimized traffic shaping strategy by exposing the bottleneck structure of the network. With this information alone, the provider could efficiently compute an optimized set of traffic shapers. Alternatively, the application service could query the ALTO server by passing the set of flows that are allowed to be traffic shaped and the flow that needs to be accelerated, and the ALTO server could return the set of recommended traffic shapers. 4.3. Application Performance Optimization Through AI Modeling A relevant and emerging area in the field of application performance is AI-based network modeling. Several global initiatives are been undertaken to apply AI to the field of understanding and predicting network performance. For instance, OpenNetLab [BE-ONL] provides a distributed networking platform with many collaborative nodes (universities and companies) and common benchmarking datasets for researchers to collect real networking data and train their AI models for various networking environments, including the Internet, cloud, and wireless networks. There also exist global benchmarks and challenges to foster innovation in this field, such as the ACM MMSys Challenge [MMSYS], which focuses on novel AI-based bandwidth estimation algorithms to enable superior overall QoE on a global production testbed built for real-time communications (RTC) of video and audio. Modeling communication networks using purely AI frameworks such as deep learning is challenging as it requires very large production data sets that often times are not available. They also require many years of intense, global collaborative R&D. To address these challenges, it has been seen that hybrid models built by combining AI with a "physics" model of the network can outperform purely AI models, as they may require less training data to achieve the same or better performance. For instance, the top two winners of the ACM MMSys 2021 Bandwidth Prediction Challenge [MMSYS] were based on hybrid models. Because bottleneck structures provide a "physics" model of the network that can both qualify and quantify the forces of interactions among flows and links, they can be used in combination with AI to enable better performance than purely AI-based models. For instance, Ros-Giralt, et al. Expires 27 March 2023 [Page 18] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 this area is being discussed in the IETF ALTO WG (e.g., [BE-ONL]) as a potential use case in the ALTO Standard to help optimize the performance of RTC applications. In particular, a key building block to optimize the QoE performance of RTC applications is the bandwidth estimation module. This module runs on the endpoint of a real-time video application and aims at dynamically adapting the video bitrate to stay within the available network capacity. A limitation in the current algorithms, however, is their lack of network state visibility. This requires the algorithms to rely entirely on local indicators such as packet loss or latency, which leads to poor training and inference performance. Information provided by the bottleneck structure (which includes topological, routing and flow information of the network in a single digraph) exposed via the ALTO service could help unlock a richer set of machine learning algorithms to optimize application performance. An ALTO server could help the application service provider implement AI-assisted prediction algorithms by exposing the bottleneck structure of the network. Alternatively, ALTO could implement an AI- assisted prediction module with the help of bottleneck structures. The application would then query the ALTO server to obtain the predicted value. 4.4. Optimized Joint Routing and Congestion Control In traditional IP networks, the problems of flow routing and congestion control are separately resolved by following a two-step process: first, a routing protocol is used to determine the path between any two nodes in a network; then, flows are routed according to such paths and their transmission rates are regulated using a congestion control algorithm. This layered and disjoint approach is known to be scalable but suboptimal because the routing algorithm identifies paths without taking into account the flow transmission rates assigned by the congestion control algorithm. Suppose that an application is trying to launch a new flow between two endpoints with the goal to maximize the available bandwidth. One can be tempted to think that, to identify the path with maximal available bandwidth, it suffices to look at the current state of the network and find the least congested path offering the highest capacity. This approach, however, is naive since it does not take into account the fact that the placement of the new flow onto the network will itself create a perturbation in the network, potentially making the chosen path suboptimal or, even more troublesome, negatively affecting the performance of other priority flows. Ros-Giralt, et al. Expires 27 March 2023 [Page 19] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 The goal of the joint routing and congestion control problem between two given endpoints E1 and E2 consists in finding the path from E1 to E2 that will yield the highest throughput _after_ the flow is placed on the network (i.e., taking into account the effect of placing the flow). The solution to this problem is introduced in [G2-TREP] by employing a strategy that combines the strengths of both the Dijkstra algorithm and the insights revealed by the bottleneck structure. The algorithm can both compute the optimal path and measure the overall network- wide impact of deploying the new flow on the path. It also enables a framework to identify new good-performing paths that have a limited negative impact on the rest of the flows in the network. This allows network and application providers to identify paths that can both provide good performance to the newly added application flow while preserving the performance of the existing high-priority flows. An ALTO server could help the application service provider optimize the path selection decision by exposing the bottleneck structure of the network. With this information alone, the provider could efficiently compute the optimal path (e.g., using the algorithm introduced in [G2-TREP]). Alternatively, the application service could query the ALTO server by passing the information of the two endpoints that need to be connected, and the ALTO server could return a list of the top-N paths with the highest throughput and their expected performance. 4.5. Service Placement for Edge Computing Determining the proper location to deploy an application service in the edge cloud is critical to ensure a good quality of experience (QoE) for its users. Yet the service placement problem is known to be NP-Hard [JSP-INFOCOM], requiring heuristics to compute good (albeit suboptimal) solutions. In [G2-SIGCOMM], it is shown that Bottleneck structures can also be used as highly scalable network simulators to evaluate the performance of a network reconfiguration such as the placement of a new service on a edge cloud. In particular, bottleneck structures can very efficiently (1) compute the performance of each flow in the network and (2) quantify the effects of the arrival (departure) of new (existing) flows to (from) the network. This allows to simulate the full transmission of an application traffic pattern very efficiently, three or more orders of magnitude faster than traditional packet simulators. Network and application providers can use this capability in two ways: Ros-Giralt, et al. Expires 27 March 2023 [Page 20] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 * Given a set of possible placement strategies, bottleneck structures can be used to simulate them in real time, helping the operator select the one that provides the best performance while guaranteeing the service level agreements (SLAs) of the other existing applications. * Despite the server placement problem being intractable, bottleneck structures provide a framework to identify good candidate solutions. In particular, by capturing the topology, routing, and flow information in a single computational graph, they can be used to efficiently explore directions in the feasible set that yield incrementally better performance. By moving in these incremental directions, the placement algorithm can progress within the enormous feasible set towards the optimal solution. An ALTO server could help the application service provider optimize the placement decision by exposing the bottleneck structure of the network. With this information alone, the provider could compute the effect of placing the service in one location versus another. Alternatively, the application service could query the ALTO server by passing the information of the possible locations where it can be placed, and the ALTO server could return an ordered list of the locations and their expected performance. 4.6. Training Neural Networks and AI Inference for Edge Clouds, Data Centers and Planet-Scale Networks Neural network training and inference using distributed computing systems are the subject of intense research and one of the leading target applications in today's communication networks. [TOPOOPT-MIT] [FLEXFLOW-STFORD] [SINGULARITY-MSFT]. To illustrate this use case, we will focus our discussion on three types of networks: edge clouds, data centers and planet-scale networks. 5G and Edge Clouds enable for the first time the ability to provide intelligence at the edge of the network. This capability is disruptive in that humans and machines will have access to unprecedented compute power to perform AI inference in real time. For instance, using augmented reality (AR), humans will be able to make better informed decisions as they navigate through an environment by leveraging AI-inference on video and audio signals captured in real time from their user equipments (UEs). Similarly, machines such as vehicles or factory robots will be able to use AI inference to optimize their actions. Two resources are needed to perform inference: (1) Input data from the environment (e.g., image and audio signals captured from a video camera) and (2) compute (typically in the form of GPUs and CPUs). Ros-Giralt, et al. Expires 27 March 2023 [Page 21] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 The input data needs to be transmitted from the location where it is captured (e.g., a micro-camera running on a human's glasses) to the location where it is to be processed for inference. The transmission of the input data requires communication resources, whereas the inference process requires computing resources. Since computing resources in the edge cloud (Figure 4) are distributed across the user equipment (UE), the radio unit (RU), the distributed unit (DU) and the central unit (CU) [PETERSON], the problem of efficiently performing AI-inference is one of optimizing the trade-off communication-compute as follows: compute (communication) power is more scarce (abundant) if the inference is performed closer to the UE, and more abundant (scarce) if performed closer to the CU. For instance, if an AR application running on a UE needs to perform an inference task at a time when the communication path from the RU to the DU is highly congested, then it will have an incentive to perform such a task directly in the UE or in the RU. If instead the network offers an uncongested path to the DU and the CU, it will have an incentive to run the inference task on these other nodes since they offer more compute power. +------+ +------+ +------+ +------+ | | | | | | | | | UE +-------+ RU +-------+ DU +-------+ CU + | | | | | | | | +------+ Air +------+ +------+ +------+ Interface Fronthaul Backhaul Figure 4: An AI-inference application in the edge cloud needs to place the inference task on a compute node location (UE, RU, DU or CU) that will perform well from both a compute and a communication standpoint. Using ALTO path vector [I-D.ietf-alto-path-vector] and performance metrics [I-D.ietf-alto-performance-metrics] features, the application could retrieve the amount of compute resources located in the RU, DU and CU. By extending ALTO to support bottleneck structures, the application would also be able to estimate in real-time the available bandwidth for the paths UE-RU, UE-RU-DU, and UE-RU-DU-CU. Further, using bottleneck structure methods described in [G2-SIGCOMM], the application would be able to estimate the time to complete the inference task for each of the four possible scenarios (running in the UE, the RU, the DU or, or the CU) and choose the configuration with the fastest execution. Similar joint compute-communication optimization problems appear when performing neural network training in large-scale data centers. Large-scale data centers with millions of compute nodes are used to train gigantic neural networks (with potentially trillions of Ros-Giralt, et al. Expires 27 March 2023 [Page 22] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 parameters). Such a massive task needs to be broken down into smaller subtasks that are then executed on the nodes. Once again, compute and communication need to be jointly optimized (see [TOPOOPT-MIT] and [FLEXFLOW-STFORD]) in order to ensure regions in the network don't become bottlenecks. By exposing bottleneck structure information using ALTO, the AI-training application can make better subtask placement decisions that avoid potential network bottlenecks. Finally, AI-training using planet-scale networks generalizes the same joint compute and communication problem to an Internet level [SINGULARITY-MSFT], with the need to implement a global scheduler that is responsible for placing workloads onto clusters of globally- distributed compute nodes. Here too enabling better network state visibility using ALTO and bottleneck structure graphs could help the scheduler make better task placement decisions. 4.7. 5G Network Slicing Bottleneck structures can also be used by network operators and application service providers to compute optimized network slices. In a simplified form, given a network topology consisting of a set of routers interconnected via links each with a given capacity, the problem of computing a network slice consists in identifying a subset of the routers, links, and a fraction of the capacity in each link, that can cover a certain demand of traffic generated by a given application service. The slice provides a virtual cut of the network, enabling isolation and ensuring a fix amount of resources to the application service. Examples of application services include a vehicle network service, the Internet of Things, an industrial logistical service, or the metaverse, to name a few. [G2-SIGCOMM] shows how bottleneck structures can be used to compute optimized bandwidth allocations in data centers to optimally meet a certain traffic demand. Similar frameworks can be used to compute network slices in a virtualized networking environment. 5. Example: Application Layer Traffic Optimization using Bottleneck Structures In this section we provide an example illustrating how bottleneck structures can be used to optimize application performance. This example will then be referenced in Section 6 to discuss and introduce the necessary requirements to integrate the BSG service into the ALTO standard. It is worth noticing that, as shown in Section 4, bottleneck structures have numerous applications. This section provides a complete example for just one of the use cases. In particular, the focus of the next example is on the joint routing and Ros-Giralt, et al. Expires 27 March 2023 [Page 23] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 congestion control use case Section 4.4. Figure 5 provides a view of Google's B4 network as presented in [B4-SIGCOMM], providing connectivity to 12 data centers distributed across the world (two in Asia, six in America and four in Europe). +-----+ +-----+ +-----+ +-----+ +------+ +------+ | | | | | | | | | | | | | DC1 +---+ DC3 +--+ DC4 +--+ DC7 +---+ DC11 +----+ DC12 | | | | | | | | | | | | | +---+-+ +--+--+ +--+-++ +----++ +-+--+-+ +----+-+ | | | | | | | | | | +------+ | +----+ | | | +--------+ | | | | | | | | | | | | | | | | | | | | +------|-+ +----|-+ | | +-------|--+ | | | | | | | | | +---+-+ +--+--+ ++---++ ++---++ +-+---++ +-+---+ | | | | | | | | | | | | | DC2 +---+ DC5 +--+ DC6 +--+ DC8 +---+ DC10 +----+ DC9 | | | | | | | | | | | | | +-----+ +-----+ +-----+ +-----+ +------+ +-----+ |-----| |-----------------------| |-----------------| Asia America Europe Figure 5: Google's B4 network introduced in [B4-SIGCOMM]. The 12 data centeres are connected via a total of 19 links, labeled l1, l2, ... l19. Table 1 presents the pair of data centers that each link is connected to. Ros-Giralt, et al. Expires 27 March 2023 [Page 24] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 +======+=======================+======+=======================+ | Link | Adjacent data centers | Link | Adjacent data centers | +======+=======================+======+=======================+ | l1 | DC1, DC2 | l11 | DC10, DC12 | +------+-----------------------+------+-----------------------+ | l2 | DC1, DC3 | l12 | DC4, DC5 | +------+-----------------------+------+-----------------------+ | l3 | DC3, DC4 | l13 | DC5, DC6 | +------+-----------------------+------+-----------------------+ | l4 | DC2, DC5 | l14 | DC11, DC12 | +------+-----------------------+------+-----------------------+ | l5 | DC3, DC6 | l15 | DC4, DC7 | +------+-----------------------+------+-----------------------+ | l6 | DC6, DC7 | l16 | DC4, DC8 | +------+-----------------------+------+-----------------------+ | l7 | DC7, DC8 | l17 | DC7, DC8 | +------+-----------------------+------+-----------------------+ | l8 | DC8, DC10 | l18 | DC9, DC11 | +------+-----------------------+------+-----------------------+ | l9 | DC9, DC10 | l19 | DC10, DC11 | +------+-----------------------+------+-----------------------+ | l10 | DC7, DC11 | | | +------+-----------------------+------+-----------------------+ Table 1: Link connectivity (adjacency matrix) in the B4 network. For the sake of illustration, we will assume a simple configuration consisting of a pair of flows (one for each direction) connecting every data center in the US with every data center in Europe, with all flows routed along a shortest path from source to destination. Since there are six data centers in the US and four in Europe, this configuration has a total of 48 flows. All links are assumed to have a capacity of 10 Gbps except for the transatlantic links (DC7-DC11 and DC8-DC10), which are configured at 25 Gbps. The next Figure presents the bottleneck structure of Google's B4 network with the assumed flow configuration. Please see Section 3.1 for a description of how to interpret the bottleneck structure. (See also [G2-SC], [G2-TREP] for details on the algorithm used to compute the bottleneck structure.) Ros-Giralt, et al. Expires 27 March 2023 [Page 25] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 +--------+ +--------+ | | | | | l15 | | l7 | | | | | +-^----^-+ +--^--^--+ | | | | | | | +--------------------------+ | | | | | +---------|------------------+ | | | | | +-v------+ +--v----+ +---v---+ +---v---+ | | | | | | | | | f1 | | f2 | | f3 | | f10 | (...) | | | | | | | | +-+-+-+--+ +-+---+-+ +--+--+-+ +-------+ | | | | | | | | | +---------|---|-------------|--|--------------+ | | | | | | | | | +---------+ +-----------+ | | | | | | | | | | | +-|---------+ +----------|-+ +---------+ | | | | | | | | | | | | | | | +-v---v--+ +-v----v-+ +-v----+ +-v----v-+ | | | | | | | | | l8 | | l10 | | l5 | | l3 | | | | | | | | | +-^---^--+ +-^---^--+ +------+ +--------+ | | | | | | | +-----------------------+ | | | | | +---------|---------------+ | | | | | +-v----+ +---v---+ +---v---+ +---v---+ | | | | | | | | | f6 | | f9 | | f23 | | f18 | (...) | | | | | | | | +------+ +-------+ +-------+ +-------+ Figure 6: Bottleneck structure of Google's B4 network example. For the sake of compactness, Figure 6 only includes the bottleneck links and a subset of the flow vertices that are part of the complete bottleneck structure. In particular, out of the 19 links that are part of B4, six links (l15, l7, l8, l10, l5, l3) are bottlenecks. The bottleneck structure graph shows the existence of two bottleneck levels in our configuration example: Ros-Giralt, et al. Expires 27 March 2023 [Page 26] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 * The first level at the top of the bottleneck structure includes links l15 and l7. Flows f1, f2, f3, f10, etc. are bottlenecked at this level. * The second level of the bottleneck structure includes links l8, l10, l5 and l3. Flows f6, f9, f23, f18, etc. are bottlenecked at this level. From the MBA Property (Property 3), we know that flows bottlenecked by a link at level 2 will enjoy higher available bandwidth than flows bottlenecked at level 2. For instance, consider the following directed path in the bottleneck structure: l15 -> f1 -> l8 -> f6 Using the MBA property, we have that since l15 precedes l8, it must be that s15 < s8, where s15 is the rate of flow f1 bottlenecked at l15 and s8 is the rate of flow f6 bottlenecked at l8. Suppose now that an application needs to place a new flow on Google's B4 network to transfer a large data set from data center 11 (DC11) to data center 4 (DC4). The application needs to select and configure a path from DC11 to DC4 (for instance, this could be done by using SR [RFC8402]). The shortest path DC11 -> l10 -> DC7 -> l15 -> DC4 is often used as the default option. Doing so, however, implies that the flow will be bottlenecked at link l15 at the upper level of the bottleneck structure, leading to a lower transmission rate. If instead we choose the non-shortest path DC11 -> l19 -> DC10 -> l8 -> DC8 -> l16 -> DC4, now the flow will be bottlenecked at link l8 (at the lower level of the bottleneck structure), leading to a higher transmission rate. Using QTBS, we can also numerically compute the transmission rate of the flow on each of the two path options. (See Section 3.1 in [G2-TREP] for a detailed description of how to compute the transmission rate assigned to the flow on each of these paths.) In particular, we obtain that when the application chooses the shortest path (bottlenecked at level 1 of the bottleneck structure), it gets a transmission rate of 1.429 Gbps. If instead the application chooses the slightly longer path (bottlenecked at level 2 of the bottleneck structure), then it gets a transmission rate of 2.5 Gbps, an increase of 74.95% with respect to the shortest path solution. [G2-TREP] introduces also a very efficient routing algorithm that uses the bottleneck structure to find the maximal throughput path for a flow in O(V+E*log(V)) steps, where V is the number of routers and E is the number of links in the network. Ros-Giralt, et al. Expires 27 March 2023 [Page 27] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 Overall, this example illustrates that, equipped with knowledge about the bottleneck structure of a network, an application can make better informed decisions on how to route a flow. In the next sections, we will use this example to support a discussion on the requirements for integrating the Bottleneck Structure Graph (BSG) service into the ALTO standard. 6. Requirements This section provides a discussion on the necessary requirements to integrate the BSG service into the ALTO standard. 6.1. Requirement 1: Bottleneck Structure Graph (BSG) Abstraction The first base requirement consists in extending the ALTO server with the capability to compute bottleneck structures. For instance, with this capability, given the network configuration in Figure 5, the ALTO server would be able to compute the bottleneck structure shown in Figure 6: * Requirement 1A (R1A). The ALTO server MUST compute the bottleneck structure graph to allow applications optimize their performance using the BSG service. We note that the alternative, which would consists in ALTO simply providing the necessary information for applications to compute their own bottleneck structures, would not scale due to the following issues: * Suppose that 1 ALTO server is providing support to N ALTO clients. Then, requiring each application to compute the bottleneck structure would imply performing N identical and redundant computations. By computing the bottleneck structure in the ALTO server, a one-time computation can be leveraged by all N clients. We also note that [G2-SC] demonstrates that bottleneck structures can be efficiently computed in real time by the server even for large scale networks. * A production-ready high-speed implementation of QTBS is relatively sophisticated. Requiring the applications to implement their own QTBS optimization library would impose unnecessary and (perhaps more importantly) out-of-scope requirements to the application, which should focus on providing its service rather than implementing a network optimization math library. The next requirement focuses on the type of bottleneck structure an ALTO server must compute: Ros-Giralt, et al. Expires 27 March 2023 [Page 28] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 * Requirement 1B (R1B). The ALTO server MUST at least support the computation of one bottleneck structure type from Section 3.7. Depending on the network information available (e.g., presence of QoS class information), the ALTO server MAY support all the three bottleneck structure types, in which case the ALTO client MAY be able to choose the bottleneck structure type for retrieval. Also, it is RECOMMENDED that the ALTO server supports the computation of the path gradient graph (PGG) as the default bottleneck structure implementation for retrieval by the ALTO clients. 6.2. Requirement 2: Information Received from the Network To compute a bottleneck structure, two pieces of information are required: * Topology Object (T). The T Object is a data structure that includes: (1) A Topology Graph (V, E), where V is the set of routers and E is the set of links connecting the routers in the network. (2) A Capacity Dictionary (C), a key-value table mapping each link with its capacity (in bps). * Flow Dictionary (F). The F Dictionary is a key-value table mapping every flow with the set of links it traverses. As shown in [G2-TREP], the above information is enough to compute the bottleneck structure. In fact, with only the F and C dictionaries, one can compute the bottleneck structure. The topology graph (V, E) is needed to perform optimal routing computations (for instance, to find new paths in the network that yield higher throughput, as illustrated in Section 5). The above discussion leads to the following requirement: * Requirement 2A (R2A). The ALTO server MUST collect information about (1) the set of routers and links in a network, (2) the capacity of each link and (3) the set of links traversed by each flow. Ros-Giralt, et al. Expires 27 March 2023 [Page 29] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 Information about the set of routers, links and link capacity is typically available from protocols and technologies such as SNMP, BGP-LS, SDN, or domain specific topology logs. This information is enough to construct the T Object. Information about the set of links traversed by each flow can be obtained from protocols such as NetFlow, sFlow, IPFIX, etc. See [FLOWDIR] and [G2-SC] for examples of how requirement R2A is implemented in real-world production networks. 6.3. Requirement 3: Information Passed to the Application The following requirement is necessary so that applications can optimize their performance using bottleneck structure information: * Requirement 3A (R3A). The ALTO client MUST be able to query the ALTO server to obtain the current bottleneck structure of the network, represented as a digraph. In addition, the current ALTO services can be extended with additional information obtained from the bottleneck structure: * Requirement 3B (R3B). One or more ALTO services (the Network Map, the Cost Map, the Entity Property Map or the Endpoint Cost Map) MUST support reporting to ALTO clients additional network state information derived from the bottleneck structure to the ALTO client. For example, the ALTO Cost Map Service can be extended with a new cost metric that corresponds to the estimated available bandwidth between two endpoints according to the bottleneck structure model. 6.4. Requirement 4: Features Needed to Support the Use Cases Bottleneck structures offer a rich framework to optimize application performance for a variety of use cases. In addition to the base requirement to construct the bottleneck structure graph (see R1A), in this section we enumerate additional capabilities that must be supported by the ALTO BSG service to ensure it is effective in helping applications optimize their performance for each of the supported use cases. * Requirement 4A (R4A). The ALTO BSG service MUST be able to compute the effect of network reconfigurations using bottleneck structure analysis and according to the types described in Section 3.9. Ros-Giralt, et al. Expires 27 March 2023 [Page 30] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 For example, an extended reality (XR) application might need to choose where to place a containerized instance of the XR service among a set of possible server racks located in various edge cloud locations. The application would query the ALTO BSG service to obtain the projected performance results of placing the new service instance on each possible location, allowing it to select the one that would yield the highest performance. The following requirement is necessary to ensure that the information provided by the BSG service is not stale: Requirement 4B (R4B). The BSG service MUST be able to update the bottleneck structure graph in near-real time, at least once a minute or less. In [G2-SC] it is shown that bottleneck structures can be computed in a fraction of a session for a production US wide network with about 100 routers and 500 links. Thus, the above requirement should be achievable with a good implementation of the bottleneck structure algorithm [G2-TREP]. 7. Security Considerations Future versions of this document may extend the base ALTO protocol, so the Security Considerations [RFC7285] of the base ALTO protocol fully apply when this proposed extension is provided by an ALTO server. The Bottleneck Structure Graph extension requires additional scrutiny on three security considerations discussed in the base protocol: Confidentiality of ALTO information (Section 15.3 of [RFC7285]), potential undesirable guidance from authenticated ALTO information (Section 15.2 of [RFC7285]), and availability of ALTO service (Section 15.5 of [RFC7285]). For confidentiality of ALTO information, a network operator should be aware that this extension may introduce a new risk: As the Bottleneck Structure information may reveal more fine-grained internal network structures than the base protocol, an attacker may identify the bottleneck link and start a distributed denial-of-service (DDoS) attack involving minimal flows to conduct in-network congestion. Given the potential risk of leaking sensitive information, the BSG extension is mainly applicable in scenarios where: * The properties of the Bottleneck Structure Graph do not impose security risks to the ALTO service provider, e.g., by not carrying sensitive information. Ros-Giralt, et al. Expires 27 March 2023 [Page 31] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 * The ALTO server and client have established a reliable trust relationship, for example, operated in the same administrative domain, or managed by business partners with legal contracts and proper authentication and privacy protocols. * The ALTO server implements protection mechanisms to reduce information exposure or obfuscate the real information. Implementations involving reduction or obfuscation of the Bottleneck Structure information SHOULD consider reduction/ obfuscation mechanisms that can preserve the integrity of ALTO information, for example, by using minimal feasible region compression algorithms [NOVA] or obfuscation protocols RESA [MERCATOR]. We note that these obfuscation methods are experimental and their practical applicability to the generic capability provided by this extension is not fully assessed. We note that for operators that are sensitive about disclosing flow- level information (even if it is anonymized), then they SHOULD consider using the Path Gradient Graph (PGG) or the QoS-Path Gradient Graph (Q-PGG) since these objects provide the additional security advantage of hiding flow-level information from the graph. For undesirable guidance, the ALTO server must be aware that, if information reduction/obfuscation methods are implemented, they may lead to potential misleading information from Authenticated ALTO Information. In such cases, the Protection Strategies described in Section 15.2.2 of [RFC7285] MUST be considered. For availability of ALTO service, an ALTO server should be cognizant that using Bottleneck Structures might have a new risk: frequently querying the BSG service might consume intolerable amounts of computation and storage on the server side. For example, if an ALTO server implementation dynamically computes the Bottleneck Structure for each request, the BSG service may become an entry point for denial-of-service attacks on the availability of an ALTO server. To mitigate this risk, an ALTO server may consider using optimizations such as precomputation-and-projection mechanisms [MERCATOR] to reduce the overhead for processing each query. An ALTO server may also protect itself from malicious clients by monitoring the behaviors of clients and stopping serving clients with suspicious behaviors (e.g., sending requests at a high frequency). 8. IANA Considerations Future versions of this document may register new entries to the ALTO Cost Metric Registry, the ALTO Cost Mode Registry, the ALTO Domain Entity Type Registry and the ALTO Entity Property Type Registry. Ros-Giralt, et al. Expires 27 March 2023 [Page 32] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 9. References 9.1. Normative References [I-D.ietf-alto-path-vector] Gao, K., Lee, Y., Randriamasy, S., Yang, Y. R., and J. Zhang, "An ALTO Extension: Path Vector", Work in Progress, Internet-Draft, draft-ietf-alto-path-vector-25, 20 March 2022, . [I-D.ietf-alto-performance-metrics] Wu, Q., Yang, Y. R., Lee, Y., Dhody, D., Randriamasy, S., and L. M. Contreras, "ALTO Performance Cost Metrics", Work in Progress, Internet-Draft, draft-ietf-alto-performance- metrics-28, 21 March 2022, . [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, . [RFC7285] Alimi, R., Ed., Penno, R., Ed., Yang, Y., Ed., Kiesel, S., Previdi, S., Roome, W., Shalunov, S., and R. Woundy, "Application-Layer Traffic Optimization (ALTO) Protocol", RFC 7285, DOI 10.17487/RFC7285, September 2014, . [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, . [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, July 2018, . [RFC8896] Randriamasy, S., Yang, R., Wu, Q., Deng, L., and N. Schwan, "Application-Layer Traffic Optimization (ALTO) Cost Calendar", RFC 8896, DOI 10.17487/RFC8896, November 2020, . 9.2. Informative References Ros-Giralt, et al. Expires 27 March 2023 [Page 33] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 [B4-SIGCOMM] Jain et al, S., "B4: Experience with a Globally-Deployed Software Defined WAN", ACM SIGCOMM , 2013. [BE-ONL] "Bandwidth Estimation on OpenNetLab", IETF Plenary 112, IETF ALTO WG , 2021, . [BE-SIGCOMM] Kumar et al, A., "BwE: Flexible, Hierarchical Bandwidth Allocation for WAN Distributed Computing", ACM SIGCOMM , 2015. [FLEXFLOW-STFORD] Jia et al, Z., "Beyond Data And Model Parallelism For Deep Neural Networks", n.d., . [FLOWDIR] Pujol, E., Poese, I., Zerwas, J., Smaragdakis, G., and A. Feldmann, "Steering Hyper-Giants' Traffic at Scale", ACM CoNEXT , 2019. [G2-SC] Amsel, N., Ros-Giralt, J., Yellamraju, S., von Hoffe, B., and R. Lethin, "Computing Bottleneck Structures at Scale for High-Precision Network Performance Analysis", IEEE International Workshop on Innovating the Network for Data Intensive Science (INDIS), Supercomputing , 2020. [G2-SIGCOMM] Ros-Giralt, J., Amsel, N., Yellamraju, S., Ezick, J., Lethin, R., Jiang, Y., Feng, A., Tassiulas, L., Wu, Z., and K. Bergman, "Designing data center networks using bottleneck structures", ACM SIGCOMM , 2021. [G2-SIGMETRICS] Ros-Giralt, J., Bohara, A., Yellamraju, S., Langston, H., Lethin, R., Jiang, Y., Tassiulas, L., Li, J., Tan, Y., and M. Veeraraghavan, "On the Bottleneck Structure of Congestion-Controlled Networks", ACM SIGMETRICS , 2020. [G2-TREP] Ros-Giralt, J., Amsel, N., Yellamraju, S., Ezick, J., Lethin, R., Jiang, Y., Feng, A., Tassiulas, L., Wu, Z., and K. Bergman, "A Quantitative Theory of Bottleneck Structures for Data Networks", Reservoir Labs (Qualcomm) Technical Report , 2021. [GALLAGER] Gallager, R. and D. Bertsekas, "Data Networks", 1992. Ros-Giralt, et al. Expires 27 March 2023 [Page 34] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 [JSP-INFOCOM] Poularakis et al, D., "Joint Service Placement and Request Routing in Multi-cell Mobile Edge Computing Networks", n.d.. [LORENZ] Lorenz, E., "Does the flap of a butterfly's wings in Brazil set off a tornado in Texas?", American Association for the Advancement of Science, 139th Meeting , 1972. [MERCATOR] Xiang, Q., Zhang, J., Wang, X., Guok, C., Le, F., MacAuley, J., Newman, H., and Y. Yang, "Toward Fine- Grained, Privacy-Preserving, Efficient Multi-Domain Network Resource Discovery", IEEE/ACM IEEE/ACM IEEE Journal on Selected Areas of Communication 37(8): 1924-1940, 2019, . [MMSYS] "Bandwidth Estimation for Real-Time Communications", 2021, . [NOVA] Gao, K., Xiang, Q., Wang, X., Yang, Y., and J. Bi, "An objective-driven on-demand network abstraction for adaptive applications", IEEE/ACM IEEE/ACM Transactions on Networking (TON) Vol 27, no. 2 (2019): 805-818., 2019, . [PETERSON] Peterson, L. and O. Sunay, "5G Mobile Networks: A Systems Approach", Open Networking Foundation , 2020. [SH-SIGCOMM] Singh et al, R., "Cost-effective capacity provisioning in wide area networks with Shoofly", ACM SIGCOMM , 2021. [SINGULARITY-MSFT] Shukla et al, D., "Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads", n.d., . [TOPOOPT-MIT] Wang et al, W., "TOPOOPT: Optimizing the Network Topology for Distributed DNN Training", n.d., . Authors' Addresses Jordi Ros-Giralt Qualcomm Email: jros@qti.qualcomm.com Ros-Giralt, et al. Expires 27 March 2023 [Page 35] Internet-Draft Supporting Bottleneck Structure Graphs i September 2022 Sruthi Yellamraju Qualcomm Email: yellamra@qti.qualcomm.com Qin Wu Huawei Email: bill.wu@huawei.com Luis Miguel Contreras Telefonica Email: luismiguel.contrerasmurillo@telefonica.com Richard Yang Yale University Email: yry@cs.yale.edu Kai Gao Sichuan University Email: kaigao@scu.edu.cn Ros-Giralt, et al. Expires 27 March 2023 [Page 36]