Internet-Draft | RISAV | October 2022 |
Xu, et al. | Expires 14 April 2023 | [Page] |
This document presents RISAV, a protocol for establishing and using IPsec security between Autonomous Systems (ASes) using the RPKI identity system. In this protocol, the originating AS adds authenticating information to each outgoing packet at its Border Routers (ASBRs), and the receiving AS verifies and strips this information at its ASBRs. Packets that fail validation are dropped by the ASBR. RISAV achieves Source Address Validation among all participating ASes.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 April 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Source address spoofing was identified as a problem years ago in [RFC2827], and [RFC5210] proposed a Source Address Validation Architecture (SAVA) to alleviate such concerns. SAVA classifies solutions into three layers: Access Network, Intra-AS, and Inter-AS. The Inter-AS layer concerns SAV at AS boundaries. Developing an inter-AS source address validation approach is more challenging because different ASes, operated by different ISPs, run their own policies independently, so the ASes must collaborate to verify source addresses. Inter-AS SAV is more effective than Access Network or Intra-AS SAV because it is more cost-effective. However, despite years of effort, deployment of inter-AS source address validation remains limited. An important reason is the difficulty of balancing clear security benefits under partial deployment against scalability in large-scale deployment. uRPF [RFC5635] [RFC8704], for example, is a routing-based scheme that filters traffic with spoofed source addresses; it may fail to deliver security benefits because of the dynamic nature of routing or incomplete information caused by partial deployment.¶
This document provides an RPKI- [RFC6480] and IPsec-based [RFC4301] inter-AS approach to source address validation (RISAV). RISAV is a cryptography-based SAV mechanism that reduces source address spoofing. RPKI provides the binding between AS numbers (ASNs) and IP prefixes. IKEv2 is used by two ASes to negotiate a Security Association (SA), which contains the algorithm, secret key generation material, IPsec packet type, and so forth. IPsec is designed to secure the Internet at the IP layer. It introduces two protocols: AH (Authentication Header) [RFC4302], which provides authenticity of the whole packet, including the source address, and ESP (IP Encapsulating Security Payload) [RFC4303], which encrypts the packet's payload.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Commonly used terms in this document are described below.¶
ACS: AS Contact Server, the logical representative of one AS, responsible for delivering session keys and other information to the ASBRs.¶
Contact IP: The IP address of the ACS.¶
ASBR: AS Border Router, a router at the boundary of an AS.¶
SAV: Source Address Validation, which verifies the source address of an IP packet and guarantees that the source address is valid.¶
The goal of this section is to provide a high-level description of what RISAV is and how it works.¶
RISAV is a cryptography-based inter-AS source address validation approach that guarantees security benefits even at partial deployment. It aims to ensure that IP datagrams carry valid source addresses, while offering anti-spoofing and anti-replay protection, lightweight and efficient operation, and incentives for incremental deployment. To this end, RISAV adds a tag to each packet at the source AS Border Router (ASBR) proving that the packet has a valid source address, and verifies and removes this tag at the destination ASBR. The tag is carried in the Integrity Check Value (ICV) field of IPsec AH/ESP.¶
RISAV uses IKEv2 to negotiate an IPsec security association (SA) between any two ASes. RPKI provides the binding relationship between AS numbers, IP ranges, contact IPs, and public keys. After negotiation, all packets between these ASes are secured by use of a modified AH header or a standard ESP payload.¶
Before deploying RISAV, each AS designates a contact IP as its representative. When negotiating or consulting with an AS, the peer MUST first communicate with this contact IP. The AS MUST publish exactly one contact IP for each supported address family (i.e. IPv4 and/or IPv6) in the RPKI database.¶
A typical workflow of RISAV is shown in Figure 1.¶
The functions of the control plane of RISAV include:¶
These functions are achieved in two steps. First, each participating AS publishes a Signed Object [RFC6488] in its RPKI Repository containing a RISAVAnnouncement:¶
RISAVAnnouncement ::= SEQUENCE {
    version [0] INTEGER DEFAULT 0,
    asID ASID,
    contactIP ipAddress,
    testing Boolean }¶
When a participating AS discovers another participating AS (via its regular sync of the RPKI database), it initiates an IKEv2 handshake between its own contact IP and the other AS's contact IP. This handshake MUST include an IKE_AUTH exchange that authenticates both ASes with their RPKI ROA certificates.¶
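The discovery step can be illustrated with a minimal Python sketch. It is not part of the protocol: the class and function names are hypothetical, the announcements are assumed to have already been validated and decoded from the RPKI repository, and pairing contact IPs by address family is an assumption drawn from the one-contact-IP-per-family rule above.¶

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import Union


@dataclass(frozen=True)
class RISAVAnnouncement:
    """In-memory mirror of the ASN.1 fields (hypothetical representation)."""
    as_id: int
    contact_ip: Union[IPv4Address, IPv6Address]
    testing: bool
    version: int = 0


def handshake_targets(local_as, announcements):
    """Return (local contact IP, peer AS, peer contact IP) triples to dial."""
    by_key = {}
    for ann in announcements:
        key = (ann.as_id, ann.contact_ip.version)
        # Exactly one contact IP is allowed per AS and address family.
        if key in by_key:
            raise ValueError(f"AS{ann.as_id}: duplicate IPv{key[1]} contact IP")
        by_key[key] = ann
    targets = []
    for (as_id, family), ann in sorted(by_key.items()):
        local = by_key.get((local_as, family))
        # Assumption: handshakes pair contact IPs of the same address family.
        if as_id != local_as and local is not None:
            targets.append((local.contact_ip, as_id, ann.contact_ip))
    return targets


ANNOUNCEMENTS = [
    RISAVAnnouncement(64500, ip_address("192.0.2.1"), testing=False),
    RISAVAnnouncement(64500, ip_address("2001:db8::1"), testing=False),
    RISAVAnnouncement(64501, ip_address("198.51.100.1"), testing=True),
    RISAVAnnouncement(64502, ip_address("2001:db8:1::1"), testing=False),
]
```

Calling handshake_targets(64500, ANNOUNCEMENTS) yields one IPv4 handshake towards AS 64501 and one IPv6 handshake towards AS 64502.¶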
Once this handshake is complete, each AS MUST activate RISAV on all outgoing packets, and SHOULD drop all non-RISAV traffic from the other AS after a reasonable grace period (e.g. 60 seconds).¶
The "testing" field indicates whether this contact IP is potentially unreliable. When this field is set to true, other ASes MUST fall back to ordinary operation if IKE negotiation fails. Otherwise, the contact IP is presumed to be fully reliable, and other ASes SHOULD drop all non-RISAV traffic from this AS if IKE negotiation fails (see Section 6.1.2).¶
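The activation and fallback rules above can be summarized as a small decision function. This is only a sketch: the names PeerPolicy and peer_policy are hypothetical, the 60-second default is the example grace period from the text, and the SHOULD-level behaviors are modeled as if always followed.¶

```python
from typing import NamedTuple


class PeerPolicy(NamedTuple):
    send_risav: bool      # apply RISAV to packets sent to the peer AS
    drop_non_risav: bool  # drop packets from the peer AS that lack RISAV


def peer_policy(handshake_ok, testing,
                seconds_since_handshake=0.0, grace_period=60.0):
    """Policy towards one peer AS after an IKEv2 attempt."""
    if handshake_ok:
        # MUST send RISAV; SHOULD drop non-RISAV traffic after the grace period.
        return PeerPolicy(True, seconds_since_handshake >= grace_period)
    if testing:
        # Peer marked as testing: MUST fall back to ordinary operation.
        return PeerPolicy(False, False)
    # Peer presumed reliable: SHOULD drop its non-RISAV traffic.
    return PeerPolicy(False, True)
```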
For more information about RPKI, one can refer to [RFC6480].¶
To disable RISAV, a participating AS MUST perform the following steps in order:¶
RISAVAnnouncement from the RPKI Repository.¶
Conversely, if any AS no longer publishes a RISAVAnnouncement, other ASes MUST immediately stop sending RISAV to that AS, but MUST NOT delete any negotiated Tunnel Mode SAs for at least 24 hours, in order to continue to process encrypted incoming traffic.¶
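The withdrawal rule can be sketched as per-peer state with a retention timer. The class and method names are hypothetical, and timestamps are assumed to be seconds from a monotonic clock.¶

```python
from dataclasses import dataclass
from typing import Optional

RETENTION_SECONDS = 24 * 60 * 60  # "at least 24 hours"


@dataclass
class PeerState:
    sending_risav: bool = True
    withdrawn_at: Optional[float] = None

    def on_announcement_withdrawn(self, now: float) -> None:
        # Stop sending RISAV immediately, but keep the SAs for now.
        self.sending_risav = False
        if self.withdrawn_at is None:
            self.withdrawn_at = now

    def may_delete_tunnel_sas(self, now: float) -> bool:
        # Tunnel Mode SAs stay usable for inbound traffic during retention.
        return (self.withdrawn_at is not None
                and now - self.withdrawn_at >= RETENTION_SECONDS)
```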
TODO: Discuss changes to the contact IP, check if there are any race conditions between activation and deactivation, IKEv2 handshakes in progress, SA expiration, etc.¶
Each SA has its own expiration time, and IKE has a keepalive mechanism. In the abnormal case, i.e. the connection fails after the IKE handshake is established, the SA remains in effect until its lifetime expires or the IKE keepalive fails. In the normal case, i.e. the connection is actively torn down, the SA expires and RISAV is disabled immediately.¶
OPEN QUESTION: Does IKEv2 have an authenticated permanent rejection option that would help here?¶
All the ASBRs of a participating AS are REQUIRED to enable RISAV. The destination ASBR uses the SPI to uniquely locate the SA when processing the AH header in RISAV.¶
As defined in [RFC4301], the Security Association Database (SAD) stores all the SAs. When AH is supported, each SAD entry includes an authentication algorithm and the corresponding key. The authentication algorithm could be HMAC-MD5, HMAC-SHA-1, or others.¶
When a packet arrives at the source ASBR, the ASBR first checks its destination address. If the destination address is within the protection range of RISAV, the ASBR next checks the source address. If the source address belongs to the AS in which the ASBR is located, the packet needs to be modified for RISAV.¶
The modification that is applied depends on whether IPsec "transport mode" or "tunnel mode" is active. This is determined by the presence or absence of the USE_TRANSPORT_MODE notification in the IKEv2 handshake. RISAV implementations MUST support transport mode, and MAY support tunnel mode.¶
OPEN QUESTION: How do peers express a preference or requirement for transport or tunnel mode?¶
When a packet arrives at the destination ASBR, the ASBR checks the destination address and the source address. If the destination address belongs to the AS in which the destination ASBR is located and the source address belongs to an AS with which this AS has a RISAV SA, the packet is subject to RISAV processing.¶
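The source-side and destination-side checks can be sketched as two predicates over prefix lists. The function names and the example prefixes are hypothetical, and a real ASBR would use a longest-prefix-match table rather than a linear scan.¶

```python
from ipaddress import ip_address, ip_network


def _covered(addr, prefixes):
    """True if addr falls inside any of the given prefixes."""
    return any(ip_address(addr) in net for net in prefixes)


def needs_risav_outbound(src, dst, local_prefixes, protected_prefixes):
    """Source ASBR: check the destination first, then the source."""
    if not _covered(dst, protected_prefixes):
        return False
    return _covered(src, local_prefixes)


def subject_to_risav_inbound(src, dst, local_prefixes, sa_peer_prefixes):
    """Destination ASBR: local destination and a source AS with a RISAV SA."""
    return _covered(dst, local_prefixes) and _covered(src, sa_peer_prefixes)


# Hypothetical example: our AS owns 192.0.2.0/24, and a peer AS with a
# negotiated RISAV SA owns 198.51.100.0/24.
LOCAL = [ip_network("192.0.2.0/24")]
PEERS = [ip_network("198.51.100.0/24")]
```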
To avoid DoS attacks, participating ASes MUST drop any outgoing packet to the contact IP of another AS. Only the AS operator's systems (i.e. the ACS and ASBRs) are permitted to send packets to the contact IPs of other ASes. ASBRs MAY drop inbound packets to the contact IP from non-participating ASes.¶
To avoid conflict with other uses of IPsec, RISAV defines its own variant of the IPsec Authentication Header (AH). The RISAV-AH header format is shown in Figure 2.¶
This format is identical to the standard IPsec AH except that the Sequence Number is omitted, because RISAV is presumed to be a "multi-sender SA" for which anti-replay defense is not supported (see [RFC4302], Section 2.5). This change saves 8 octets when the ICV is 16, 24, or 32 octets. For a 16-octet ICV (most common), RISAV-AH adds 24 octets to each packet.¶
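The octet counts can be checked with a short calculation. The sketch assumes the AH field sizes of [RFC4302] (Next Header, Payload Len, Reserved, and SPI totalling 8 octets, plus a 4-octet Sequence Number in standard AH) and the IPv6 rule that the header length is a multiple of 8 octets; the function name ah_length is hypothetical.¶

```python
def ah_length(icv_len, with_sequence_number, align=8):
    """AH header length in octets, padded up to the alignment boundary."""
    fixed = 8 + (4 if with_sequence_number else 0)
    unpadded = fixed + icv_len
    return -(-unpadded // align) * align  # round up to a multiple of align


# Octets saved by omitting the Sequence Number, per ICV length.
SAVINGS = {icv: ah_length(icv, True) - ah_length(icv, False)
           for icv in (16, 24, 32)}
```

With a 16-octet ICV, standard AH occupies 12 + 16 = 28 octets, padded to 32, while RISAV-AH occupies 8 + 16 = 24 octets with no padding, which is where the 8-octet saving comes from. Under the 4-octet alignment used for IPv4 (align=4), the same calculation gives a 4-octet saving.¶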
The RISAV-AH header is only for AS-to-AS communication. ASes MUST strip off all RISAV-AH headers for packets whose destination is inside the AS, even if the AS is not currently inspecting the ICV values.¶
In transport mode, each AS's SA Database (SAD) is indexed by SPI and counterpart AS, regardless of the source and destination IPs.¶
In tunnel mode, a RISAV sender ASBR wraps each outgoing packet in an ESP payload. Each ASBR uses its own source address, and sets the destination address to the contact IP of the destination AS.¶
The contact IP decrypts all IPsec traffic to recover the original packets, which are forwarded to the correct destination. After decryption, the receiving AS MUST check that the source IP and destination IP are in the same AS as the outer source and destination, respectively.¶
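The post-decryption check can be sketched as follows. The prefix table, its layout as (prefix, AS number) pairs, and the function names are assumptions for illustration; in practice the mapping would come from validated RPKI data.¶

```python
from ipaddress import ip_address, ip_network


def origin_as(addr, prefix_table):
    """Longest-prefix match of addr against (network, asn) pairs."""
    addr = ip_address(addr)
    matches = [(net.prefixlen, asn) for net, asn in prefix_table if addr in net]
    return max(matches)[1] if matches else None


def tunnel_addresses_consistent(outer_src, outer_dst,
                                inner_src, inner_dst, prefix_table):
    """Inner source/destination must share an AS with the outer ones."""
    src_as = origin_as(outer_src, prefix_table)
    dst_as = origin_as(outer_dst, prefix_table)
    return (src_as is not None and dst_as is not None
            and origin_as(inner_src, prefix_table) == src_as
            and origin_as(inner_dst, prefix_table) == dst_as)


# Hypothetical table using documentation prefixes and private-use ASNs.
TABLE = [
    (ip_network("2001:db8:a::/48"), 64500),
    (ip_network("2001:db8:b::/48"), 64501),
]
```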
In Tunnel mode, each ASBR maintains its own copy of the SA Database (SAD). The SAD is indexed by SPI and counterpart AS, except for the replay defense window, which is additionally scoped to the source IP. If a valid ESP packet is received from an unknown IP address, the receiving AS SHOULD allocate a new replay defense window, subject to resource constraints. This allows replay defense to work as usual. (If the contact IP is implemented as an ECMP cluster, effective replay defense may require consistent hashing.)¶
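A minimal sketch of source-scoped replay defense is shown below. It assumes a 64-packet sliding window in the style of [RFC4303], that accept() is only called for packets whose ICV has already been verified, and that packets are rejected when no further window can be allocated; the class names and the max_windows limit are hypothetical.¶

```python
class ReplayWindow:
    """Sliding anti-replay window over ESP sequence numbers."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0  # highest sequence number accepted so far
        self.bitmap = 0   # bit i set: (highest - i) has been seen

    def accept(self, seq):
        if seq < 1:
            return False
        if seq > self.highest:
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size or self.bitmap & (1 << offset):
            return False  # too old, or already seen
        self.bitmap |= 1 << offset
        return True


class TunnelSA:
    """One SAD entry (looked up by SPI and counterpart AS elsewhere)."""

    def __init__(self, max_windows=1024):
        self.windows = {}  # source IP -> ReplayWindow
        self.max_windows = max_windows

    def accept(self, src_ip, seq):
        window = self.windows.get(src_ip)
        if window is None:
            # New sender ASBR: allocate a window, subject to resource limits.
            if len(self.windows) >= self.max_windows:
                return False  # assumption: reject when no window is available
            window = self.windows[src_ip] = ReplayWindow()
        return window.accept(seq)
```

Because each sender ASBR numbers its packets independently, the same sequence number from two different source IPs is accepted once for each source.¶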
Tunnel mode imposes a space overhead of 73 octets in IPv6.¶
PROBLEM: ESP doesn't protect the source IP, so a packet could be replayed by changing the source IP. Can we negotiate an extension to ESP that covers the IP header? Or could we always send from the contact IP and encode the ASBR ID in the low bits of the SPI?¶
OPEN QUESTION: Do we need multiple contact IPs per AS, to support fragmented ASes?¶
This section presents potential additions to the design.¶
TODO: Remove this section once we have consensus on whether these extensions are worthwhile.¶
Standard IPsec AH authenticates the entire immutable part of a packet, so it spends considerable time locating and processing the immutable fields. RISAV, however, only needs a few immutable fields to authenticate the packet, which reduces the cost dramatically.¶
Because authenticating the whole packet imposes a heavy computational burden, we could define an IKE parameter to negotiate a header-only variant of transport mode that only authenticates the IP source address, IP destination address, etc.¶
This would likely result in a 10-30x decrease in cryptographic cost compared to standard IPsec. However, it would also offer no SAV defense against any attacker who can view legitimate traffic. An attacker who can read a single authenticated packet could simply replace the payload, allowing it to issue an unlimited number of spoofed packets.¶
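The weakness can be made concrete with a small sketch. It assumes, purely for illustration, that the header-only ICV is a truncated HMAC-SHA-256 over the source and destination addresses alone; the function names, key, and addresses are hypothetical.¶

```python
import hashlib
import hmac
from ipaddress import ip_address


def header_only_icv(key, src, dst):
    """ICV over the source and destination addresses only (assumption)."""
    msg = ip_address(src).packed + ip_address(dst).packed
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def verify_header_only(key, src, dst, payload, icv):
    # The payload is deliberately unused: it is not covered by the ICV.
    return hmac.compare_digest(header_only_icv(key, src, dst), icv)


KEY = b"example-sa-key"  # hypothetical SA key
observed_icv = header_only_icv(KEY, "192.0.2.10", "198.51.100.7")

# An attacker who observed one packet reuses its ICV with a new payload.
forged_ok = verify_header_only(
    KEY, "192.0.2.10", "198.51.100.7", b"attacker payload", observed_icv)
```

Here forged_ok is true: the forged packet verifies because nothing in the ICV depends on the payload.¶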
There are two ways for an ACS to generate tags. One is to use a state machine. The state machine performs a state transition each time a timer expires, and the tag is generated as a by-product of the state transition. Before data transmission, the ACSes of the two peer ASes each maintain one state machine pair for each bound. Once the initial state, state transition algorithm, and state transition interval have been negotiated, the state machines run in lockstep from the initial state and therefore generate the same tag at the same time. Because state transitions are triggered by time, the ACSes MUST synchronize their clocks to a common time base, for example using NTP [RFC5905].¶
The tag generation method MUST specify the initial state of the state machine and its length, the identifier of the state machine, the state transition interval, the length of the generated Tag, and the Tag itself. When the SA is transferred, all of these payloads are carried over a secure channel between the ACS and the ASBRs, for instance ESP [RFC4303]. It is RECOMMENDED to transfer the tags rather than the SA, for security and efficiency reasons. The initial state and its length can be carried in the Key Exchange Payload without modification. The state machine identifier is the SPI value, because the SPI value is unique in RISAV. The state transition interval and the length of the generated Tag should be negotiated by the pair of ACSes, which requires allocating one SA attribute. The generated Tag is sent from the ACS to the ASBR over a secure channel, which MAY be, for example, ESP [RFC4303].¶
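A time-driven tag state machine could look like the sketch below. The transition and tag derivation functions (SHA-256 with distinct prefixes) are assumptions chosen for illustration, since this document leaves the state transition algorithm to negotiation; the class name and parameters are hypothetical.¶

```python
import hashlib


class TagStateMachine:
    """Advances one state per interval and derives a tag from the state."""

    def __init__(self, initial_state, start_time, interval, tag_len):
        self.state = initial_state
        self.start_time = start_time
        self.interval = interval
        self.tag_len = tag_len
        self.step = 0

    def tag_at(self, now):
        target = int((now - self.start_time) // self.interval)
        if target < self.step:
            raise ValueError("clock moved backwards")
        while self.step < target:
            # Assumed transition function: hash chain over the state.
            self.state = hashlib.sha256(b"state" + self.state).digest()
            self.step += 1
        # Assumed tag derivation: separate hash of the current state.
        return hashlib.sha256(b"tag" + self.state).digest()[:self.tag_len]


# Two ACSes with the same negotiated parameters and synchronized clocks.
acs_a = TagStateMachine(b"negotiated-initial-state", 1000.0, 10.0, 16)
acs_b = TagStateMachine(b"negotiated-initial-state", 1000.0, 10.0, 16)
```

Both machines return the same tag for any two timestamps that fall within the same interval, which is why the ACSes need a common time base.¶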
The use of IKEv2 between ASes might be fragile, and creates a number of potential race conditions (e.g. if the RPKI database contents change during the handshake). It is also potentially costly to implement, requiring O(N^2) network activity for N participating ASes. If these challenges prove significant, one alternative would be to perform the handshake statically via the RPKI database. For example, static-static ECDH [RFC6278] would allow ASes to agree on shared secrets simply by syncing the RPKI database.¶
Static negotiation makes endpoints nearly stateless, which simplifies the provisioning of ASBRs. However, it requires inventing a novel IPsec negotiation system, so it seems best to try a design using IKEv2 first.¶
In general, RISAV seeks to provide a strong defense against arbitrary active attackers who are external to the source and destination AS. However, different RISAV modes and configurations offer different security properties.¶
In Transport Mode, off-path attackers cannot spoof the source IPs of a participating AS, but any attacker with access to valid traffic can replay it (from anywhere), potentially enabling DoS attacks by replaying expensive traffic (e.g. TCP SYNs, QUIC Initials). ASes that wish to have replay defense, and are willing to pay the extra data-plane costs, should prefer tunnel mode.¶
An on-path attacker between two participating ASes could attempt to defeat RISAV by blocking IKEv2 handshakes to the Contact IP of a target AS. If the AS initiating the handshake falls back to non-RISAV behavior after a handshake failure, this enables the attacker to remove all RISAV protection.¶
This vulnerable behavior is required when the "testing" flag is set, but is otherwise discouraged.¶
RISAV provides significant security benefits even if it is only deployed by a fraction of all ASes. This is particularly clear in the context of reflection attacks. If two networks implement RISAV, no one in any other network can trigger a reflection attack between these two networks. Thus, if X% of ASes (selected at random) implement RISAV, participating ASes should see an X% reduction in reflection attack traffic volume.¶
The multipath problem requires that an AS be logically presented as a single entity, meaning that all ASBRs of the AS act like one ASBR. Otherwise, different source ASBRs would add different IPsec ICV values to packets, a packet might not arrive at the destination ASBR that the source ASBR expected, and the ICV check could fail. The ACS is therefore the entity that represents the AS when negotiating and communicating with peers. The ACS delivers messages, including SAs and generated tags, to the ASBRs, so that all ASBRs in the same AS hold the same processing material and process packets in the same way. Thus, the multipath problem is solved.¶
When RISAV is used in transport mode, there is a risk of confusion between the RISAV AH header and end-to-end AH headers used by applications. This risk is particularly clear during transition periods, when the recipient is not sure whether the sender is using RISAV or not.¶
To avoid any such confusion, RISAV's transport mode uses a specialized RISAV-AH header. (In tunnel mode, no such confusion is possible.)¶
RISAV can optionally cooperate with intra-domain SAV and access-layer SAV, such as [RFC8704] or SAVI [RFC7039]. If intra-domain or access-layer SAV is deployed, a packet is processed and forwarded only after it passes that check.¶
The ACS, represented by a contact IP, must be a high-availability, high-performance service to avoid outages. When an AS chooses to use a logical ACS, it elects one distinguished ASBR as the ACS. The distinguished ASBR acting as the ACS represents the whole AS when communicating with the ACS of a peer AS. This election takes place prior to the IKE negotiation. An ASBR MUST be a BGP speaker before it is elected as the distinguished ASBR.¶
RISAV requires participating ASes to perform symmetric cryptography on every RISAV-protected packet that they originate or terminate. This will require significant additional compute capacity that may not be present on existing networks. However, until most ASes actually implement RISAV, the implementation cost for the few that do is greatly reduced. For example, if 5% of networks implement RISAV, then participating networks will only need to apply RISAV to 5% of their traffic.¶
Thanks to broad interest in optimization of IPsec, very high performance implementations are already available. For example, as of 2021 an IPsec throughput of 1 Terabit per second was achievable using optimized software on a single server [INTEL].¶
TODO: Figure out what to say about MTU, PMTUD, etc. Perhaps an MTU probe is required after setup? Or on an ongoing basis?¶
Because every outer IP header should carry a unicast IP address, NAT-traversal mode is not necessary in inter-AS SAV.¶
If approved, IANA is requested to add the following entry to the Assigned Internet Protocol Numbers registry:¶