Configuring Control Plane Traffic Protection (CPTP)
About the Importance of Hardening of Network Infrastructure
The network infrastructure is often a prime target for malicious attacks because compromising it can inflict the greatest damage on the largest number of devices (in the worst case, bringing down the entire network and, with it, all the attached devices).
Attackers may also target the network to redirect traffic (for example, by causing unintended flooding) and snoop on it, in order to capture clear-text information and discover other weaknesses that can lead to further malicious actions.
Last but not least, network instability, sometimes caused by misconfigurations or failures, may generate an excessive amount of traffic that strains the network infrastructure and further aggravates its stability problems.
It is therefore essential to apply robust control to the traffic that reaches the network devices and to implement appropriate protections against any potentially disruptive traffic.
This section focuses on the hardware-based protection of the network control plane. Complementary mechanisms, such as the hardware-based security and QoS policies described in detail in other topics, can also be applied to the data plane for comprehensive network hardening.
About Control Plane Traffic Protection
Every network device has a management entity (typically a CPU) that handles communication exchanges with other networking devices and processes a portion of the traffic coming from the rest of the network (the so-called data plane). All traffic that is natively directed or purposefully redirected to this management processor is commonly referred to as control plane traffic.
When the volume of one or more classes of control plane traffic becomes abnormal, for example because of an attempted Denial of Service (DoS) attack, the network device needs to take containment action.
Hardware-based queuing and rate limiting are two common techniques employed to implement CPU protections, with different levels of granularity and control depending on the switch model’s hardware capabilities.
About Port-based Control Plane Protection
Certain switch models use an internal Ethernet port to transport a portion of the control plane traffic to the management CPU. This special type of interface is referred to as a rear-facing interface. The models with a rear-facing interface include the following platforms:
- Edgecore: AS7716-32X, AS7316-54XS, AS7712-32X, AS7312-54XS
- Pluribus switches: F9532-C, F9572-V, F9532C-XL-R
- Ericsson: NSU, NRU01, NRU02, NRU03
On these platforms, eight queues are available for control plane traffic segregation and rate limiting, which Netvisor ONE leverages to protect the CPU from anomalous traffic.
By default, mission-critical control plane traffic is split across seven weighted queues based on common network management requirements.
In addition, queue 0 is the default ‘catch-all’ queue that receives all the control plane traffic not specifically segregated into one of the other seven queues.
Each of the eight CPU queues is configured with a default maximum transmission rate suitable to protect the control plane from overloading. If needed, the network administrator can modify the default rate values to match specific design requirements.
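The queue-plus-rate-limit behavior described above can be illustrated with a minimal Python sketch. It is not the Netvisor ONE implementation: the class-to-queue assignment, the rate and burst values, and the names TokenBucket and CpuPortPolicer are all hypothetical, and a token bucket is simply one common way to model a per-queue maximum rate.

```python
from dataclasses import dataclass, field

NUM_QUEUES = 8       # eight CPU queues, as described above
CATCH_ALL_QUEUE = 0  # queue 0 receives everything not explicitly classified

# Hypothetical class-to-queue assignment, for illustration only.
CLASS_TO_QUEUE = {"bgp": 7, "ospf": 6, "lacp": 5, "arp": 2}


@dataclass
class TokenBucket:
    rate_pps: float   # assumed maximum sustained rate, packets per second
    burst: float      # assumed bucket depth, packets
    tokens: float = field(init=False)
    last_seen: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = self.burst  # start with a full bucket

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last_seen)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        self.last_seen = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class CpuPortPolicer:
    """One independent rate limiter per CPU queue (values are assumptions)."""

    def __init__(self, rate_pps: float = 100.0, burst: float = 10.0) -> None:
        self.buckets = [TokenBucket(rate_pps, burst) for _ in range(NUM_QUEUES)]

    def admit(self, traffic_class: str, now: float):
        """Return (queue number, True if the packet is within the queue's rate)."""
        queue = CLASS_TO_QUEUE.get(traffic_class, CATCH_ALL_QUEUE)
        return queue, self.buckets[queue].allow(now)
```

In this model, a flood of unclassified packets exhausts only the bucket of queue 0, so a packet of a separately queued class arriving at the same instant is still admitted. That isolation is the point of segregating traffic before rate limiting it.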
About Advanced Control Plane Protection
Pluribus Networks has implemented support for Advanced Control Plane Traffic Protection (CPTP) with Auto-Quarantine. This feature is supported on the CPU inband interface of the Freedom and Edgecore data center platforms, as well as of Dell's Open Networking Switches.
This very granular capability allows the control plane’s processing path to be protected against both misbehaving and malicious devices (compromised end-points, rogue network nodes, etc.) that may start pumping an abnormal amount of control plane traffic. In Pluribus parlance, this is also referred to as CPU hog protection.
Advanced CPTP operates over 43 independent queues (numbered 0 to 42) to provide separation and granular control over different classes of control plane traffic. It can be enabled with the system-settings-modify cpu-class-enable command.
Note: The current default setting is no-cpu-class-enable. Changing it to cpu-class-enable requires a system reboot for the new setting to take effect.
Note: CPTP on the inband interface performs CPU traffic classification and queuing in hardware, so enabling this feature incurs no performance penalty. CPTP queuing supports round-robin scheduling and rate limiting of individual CPU traffic classes, and guarantees a minimum buffer space allocation to each class.
By default, CPU resources are protected by segregating the following types of traffic into separate queues: various standard network control packets, cluster communication messages, fabric updates, regular flooded traffic, packets required for MAC learning, copy-to-cpu packets, analytics, and so on.
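As a rough illustration of why round-robin scheduling keeps one busy class from starving the others, the following Python sketch services a set of queues in turn. It is a simplified software model, not the hardware scheduler; the function name round_robin and the sample queue contents are hypothetical.

```python
from collections import deque


def round_robin(queues):
    """Yield (queue_id, packet) pairs, taking one packet per non-empty queue per pass."""
    pending = {qid: deque(packets) for qid, packets in queues.items()}
    while any(pending.values()):
        for qid in sorted(pending):
            if pending[qid]:
                yield qid, pending[qid].popleft()


# Hypothetical example: queue 0 is busy, queue 1 holds a single packet.
order = list(round_robin({0: ["a", "b", "c"], 1: ["x"]}))
```

Here the single packet in queue 1 is served right after the first packet of queue 0, rather than waiting behind the entire backlog of queue 0.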
In addition, custom traffic classes can be added by configuring user-defined CPU policies (for example, for troubleshooting purposes).
Note: Traffic flows that end up sharing a user-defined CPU queue will compete with each other for bandwidth. It is therefore recommended to configure queue-sharing only for traffic that does not constantly compete for CPU time under identical circumstances. Whenever possible, competing classes should be assigned to different queues.
Advanced CPTP achieves its finest granularity through the auto-quarantine (a.k.a. CPU hog protection) mechanism. You can enable CPU hog protection (through the cpu-class-modify command) for the following protocols (in Pluribus parlance, CPU classes): OSPF, BGP, BFD, LACP, STP, ARP, VRRP, and LLDP.
When auto-quarantine is enabled, the Netvisor ONE software monitors control plane packets arriving at the CPU on a per-source-device basis. Traffic from a source device that is deemed to be consuming too much bandwidth (according to a user-configurable rate-limit value) is redirected to a dedicated per-protocol quarantine queue by installing a hardware policy entry.
At the same time, a syslog alert is displayed and the offending source device’s subsequent activity is monitored. The source leaves the quarantine state automatically only after its traffic has stayed below the acceptable limit for a pre-configured timeout; a corresponding syslog message is then displayed. You can view these messages using the log-event-show event-type system command.
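The quarantine entry and exit conditions described above can be sketched as a small state machine in Python. This is an illustrative model under stated assumptions, not the Netvisor ONE logic: the HogMonitor class, the idea of feeding it periodic rate samples, and the log strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SourceState:
    quarantined: bool = False
    calm_since: Optional[float] = None  # when traffic first dropped below the limit


class HogMonitor:
    """Tracks quarantine state per (source device, CPU class)."""

    def __init__(self, rate_limit_pps: float, release_timeout_s: float) -> None:
        self.rate_limit_pps = rate_limit_pps        # user-configurable limit
        self.release_timeout_s = release_timeout_s  # pre-configured timeout
        self.state: Dict[Tuple[str, str], SourceState] = {}
        self.log: List[str] = []                    # stands in for syslog

    def observe(self, source: str, cpu_class: str, measured_pps: float, now: float) -> bool:
        """Feed one rate sample; return True if the source is quarantined afterwards."""
        st = self.state.setdefault((source, cpu_class), SourceState())
        over_limit = measured_pps > self.rate_limit_pps
        if not st.quarantined:
            if over_limit:
                st.quarantined = True
                st.calm_since = None
                self.log.append(f"quarantined {source} ({cpu_class})")
        elif over_limit:
            st.calm_since = None  # still misbehaving: restart the calm period
        elif st.calm_since is None:
            st.calm_since = now   # first sample back under the limit
        elif now - st.calm_since >= self.release_timeout_s:
            st.quarantined = False
            st.calm_since = None
            self.log.append(f"released {source} ({cpu_class})")
        return st.quarantined
```

Note that in this sketch a single burst above the limit during the calm period resets the timer, which models the requirement that traffic stay below the limit for the whole timeout before the source is released.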
Only certain system-defined protocol queues support hog protection. For these CPU classes, you can either enable CPU hog protection or select the enable-and-drop option. In the latter case, all traffic of that protocol from the quarantined source is dropped at ingress.
Because hardware resources are limited, you can also specify a threshold for the maximum number of acceptable CPU hog violators per port. When this threshold is reached, that class’s auto-quarantine hardware policy becomes per switch port (that is, less granular) instead of per port and per offender.
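The fallback from per-offender to per-port policies can be modeled with a short Python sketch. It assumes, as one reading of the text, that the switch to a port-wide entry happens as soon as the number of offenders on a port reaches the threshold; the QuarantineTable class and its method names are hypothetical.

```python
from typing import Dict, Set, Tuple


class QuarantineTable:
    """Models hardware policy entries per (port, CPU class)."""

    def __init__(self, max_violators_per_port: int) -> None:
        self.max_violators = max_violators_per_port
        self.offenders: Dict[Tuple[int, str], Set[str]] = {}
        self.port_wide: Set[Tuple[int, str]] = set()

    def add_offender(self, port: int, cpu_class: str, source: str) -> str:
        """Record an offender and return the granularity now in effect."""
        key = (port, cpu_class)
        if key in self.port_wide:
            return "per-port"
        sources = self.offenders.setdefault(key, set())
        sources.add(source)
        if len(sources) >= self.max_violators:
            # Threshold reached: replace the per-offender entries with one port-wide entry.
            self.port_wide.add(key)
            del self.offenders[key]
            return "per-port"
        return "per-offender"

    def entry_count(self) -> int:
        """Number of policy entries this model would consume."""
        return len(self.port_wide) + sum(len(s) for s in self.offenders.values())
```

With a threshold of three, for example, the third offender on a port collapses that port's entries for the class into a single one, while other ports keep their per-offender entries. The trade-off is lower hardware usage at the cost of quarantining all traffic of that class on the port.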