Dataplane

Introduction

The dataplane is an implementation of a Forwarding Information Base (FIB): it receives forwarding-related state from the rest of the system, including state for forwarding-related features. That state is then used to program a software forwarding pipeline, a hardware forwarding pipeline, or both, depending on the forwarding functionality and the platform. The dataplane uses the DPDK to drive Network Interface Card (NIC)-like hardware and the FAL to drive switching silicon. The two can be combined to form a switch-silicon-based platform with a high-speed punt path for features that aren't supported in silicon.

Overall Architecture

The dataplane interacts with the rest of DANOS via vplaned (sometimes referred to as "the controller") and the route broker. It has three major functional pieces: the "main thread", which executes control processing from the external interfaces; the pipeline, which performs high-speed forwarding in software; and the FAL, which manages hardware switching silicon.

 

Control Processing

The control processing portion of the dataplane (also known as the "main thread") is responsible for handling the interaction with vplaned and the route broker. It processes messages for features that are proxied via vplaned and handles route updates sent from the route broker. Features can register handlers with the "main thread" to receive messages destined for them. The FAL handlers, which dispatch calls to FAL plugins, are also called from the "main thread" as part of feature dispatch.
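As a rough illustration of this registration model, the sketch below shows a feature registering a handler that the main thread dispatches to when a matching message arrives. The identifiers used here (struct feature_msg, register_msg_handler, dispatch_msg) are hypothetical and do not name the real dataplane API; the sketch only shows the shape of the flow.

    /* Minimal sketch only: the identifiers here (feature_msg,
     * register_msg_handler, dispatch_msg) are hypothetical and do not
     * name the real dataplane API.  It shows the shape of the flow:
     * the main thread owns the external sockets and dispatches each
     * decoded message to whichever feature registered a handler. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    struct feature_msg { const char *topic; const char *payload; };

    typedef bool (*msg_handler_fn)(const struct feature_msg *msg);

    #define MAX_HANDLERS 32
    static struct { const char *topic; msg_handler_fn fn; } handlers[MAX_HANDLERS];
    static unsigned int num_handlers;

    /* Called by features at start-up, on the main thread. */
    static int register_msg_handler(const char *topic, msg_handler_fn fn)
    {
        if (num_handlers == MAX_HANDLERS)
            return -1;
        handlers[num_handlers].topic = topic;
        handlers[num_handlers].fn = fn;
        num_handlers++;
        return 0;
    }

    /* Called by the main thread for each message received from vplaned. */
    static bool dispatch_msg(const struct feature_msg *msg)
    {
        for (unsigned int i = 0; i < num_handlers; i++)
            if (strcmp(handlers[i].topic, msg->topic) == 0)
                return handlers[i].fn(msg);
        return false;                 /* no feature claimed this message */
    }

    /* Example feature handler: decode the payload, update the feature's
     * forwarding state and, where relevant, call into the FAL. */
    static bool my_feature_handler(const struct feature_msg *msg)
    {
        printf("my-feature: %s\n", msg->payload);
        return true;
    }

    int main(void)
    {
        register_msg_handler("my-feature", my_feature_handler);

        struct feature_msg msg = { "my-feature", "example payload" };
        return dispatch_msg(&msg) ? 0 : 1;
    }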

Software forwarding infrastructure

The dataplane is built on top of the DPDK, enabling fast, scalable packet processing in software. The DPDK is used as the general hardware abstraction layer in the dataplane. The dataplane also relies extensively on Userspace RCU to provide scalable, lock-free data structures when data must be shared between cores.
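The sketch below (not dataplane code) shows the Userspace RCU pattern this relies on: forwarding lcores read a shared pointer with no locks inside an RCU read-side critical section, while the main thread publishes a replacement and frees the old version only after a grace period. The struct fwd_cfg and its field are invented for the example; build with -lurcu.

    /* Minimal sketch (not dataplane code) of the Userspace RCU pattern:
     * readers take no locks, writers publish a new version and free the
     * old one only after a grace period.  Build with: cc file.c -lurcu */
    #include <urcu.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct fwd_cfg {
        unsigned int ttl_min;        /* invented field for the example */
    };

    static struct fwd_cfg *cfg;      /* shared, RCU-protected pointer */

    /* Forwarding-lcore side: lock-free read. */
    static unsigned int read_ttl_min(void)
    {
        rcu_read_lock();
        unsigned int val = rcu_dereference(cfg)->ttl_min;
        rcu_read_unlock();
        return val;
    }

    /* Main-thread side: publish a new config and reclaim the old one. */
    static void update_ttl_min(unsigned int ttl_min)
    {
        struct fwd_cfg *new_cfg = calloc(1, sizeof(*new_cfg));

        new_cfg->ttl_min = ttl_min;
        struct fwd_cfg *old_cfg = rcu_xchg_pointer(&cfg, new_cfg);
        synchronize_rcu();           /* wait until no reader can see old_cfg */
        free(old_cfg);
    }

    int main(void)
    {
        rcu_register_thread();
        cfg = calloc(1, sizeof(*cfg));

        update_ttl_min(2);
        printf("ttl_min = %u\n", read_ttl_min());

        rcu_unregister_thread();
        return 0;
    }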

In addition to the "main thread", there are forwarding threads assigned to the lcores on which the dataplane is configured to run. Exactly one forwarding thread runs per forwarding lcore. If multiple queues are assigned to an lcore, they are processed in a round-robin manner in the following order:

  1. All RX queues, processing 32 packets at a time (run-to-completion) through the software forwarding pipeline, with the RX queues ordered by time of assignment in the simple case.

  2. Crypto receive/transmit

  3. All TX queues, processing a batch at a time with the TX queues ordered by time of assignment in the simple case.

This follows the DPDK architecture, the underlying intention being to avoid the large overhead associated with context switching between threads.
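A simplified skeleton of such a forwarding-lcore loop is sketched below. It is not the dataplane's actual main loop: EAL initialisation, crypto processing, TX-ring handling and error paths are omitted, and pipeline_input() and txq_drain() are hypothetical stand-ins for the real pipeline entry point and TX drain logic.

    /* Simplified skeleton, not the dataplane's actual main loop: EAL
     * initialisation, crypto processing and error handling are omitted,
     * and pipeline_input()/txq_drain() are hypothetical stand-ins. */
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define RX_BURST 32

    struct rxq { uint16_t port; uint16_t queue; };
    struct txq { uint16_t port; uint16_t queue; };

    /* Entry point into the software forwarding pipeline (elsewhere);
     * consumes each mbuf (forwards, queues for TX or frees it). */
    void pipeline_input(uint16_t port, struct rte_mbuf *m);

    /* Drain this TX queue's software ring onto the NIC (elsewhere);
     * only needed for non-directpath ports. */
    void txq_drain(const struct txq *t);

    void forwarding_loop(const struct rxq *rxqs, unsigned int nrx,
                         const struct txq *txqs, unsigned int ntx,
                         const volatile int *running)
    {
        struct rte_mbuf *pkts[RX_BURST];

        while (*running) {
            /* 1. All assigned RX queues: up to 32 packets per burst,
             *    run to completion through the pipeline. */
            for (unsigned int i = 0; i < nrx; i++) {
                uint16_t n = rte_eth_rx_burst(rxqs[i].port, rxqs[i].queue,
                                              pkts, RX_BURST);
                for (uint16_t j = 0; j < n; j++)
                    pipeline_input(rxqs[i].port, pkts[j]);
            }

            /* 2. Crypto receive/transmit would be serviced here. */

            /* 3. All assigned TX queues. */
            for (unsigned int i = 0; i < ntx; i++)
                txq_drain(&txqs[i]);
        }
    }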

For performance, the dataplane does no locking on the per-queue RX and TX data structures, while still allowing multiple RX or TX queues for the same port to be serviced simultaneously. The dataplane therefore assigns each RX queue to a single lcore (a logical core may consist of a physical core with two logical CPUs, i.e. HyperThreads), and likewise assigns each TX queue to a single lcore.

Packet forwarding may occur in one of two ways: directpath forwarding or non-directpath forwarding. Where directpath forwarding can be used for an interface, the TX processing assignment can be avoided and each lcore can use a TX queue for the interface derived from the ID of the lcore on which the packet was received. Where it has been determined that non-directpath forwarding must be used for an interface, the RX queues are assigned to one or more lcores, and dequeuing from the TX packet ring and enqueuing onto the TX NIC queue is also assigned to an lcore. To determine whether a port uses directpath or non-directpath forwarding, the following logic is applied.
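The sketch below illustrates only the directpath transmit case described above; the directpath/non-directpath selection logic is not shown here. The lcore_to_txq[] mapping is an invented name standing in for however the per-lcore TX queue is derived from the lcore ID.

    /* Sketch of the directpath transmit case only; the per-port choice
     * between directpath and non-directpath is made elsewhere.  The
     * lcore_to_txq[] mapping is an invented name for the per-lcore TX
     * queue derived from the lcore ID. */
    #include <rte_ethdev.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>

    extern uint16_t lcore_to_txq[RTE_MAX_LCORE];

    void directpath_tx(uint16_t port, struct rte_mbuf **pkts, uint16_t n)
    {
        /* This lcore owns its TX queue on the port, so no locking and
         * no intermediate TX packet ring is needed. */
        uint16_t queue = lcore_to_txq[rte_lcore_id()];
        uint16_t sent = rte_eth_tx_burst(port, queue, pkts, n);

        /* Packets the NIC queue could not accept are dropped here;
         * the real dataplane accounts for such drops. */
        for (uint16_t i = sent; i < n; i++)
            rte_pktmbuf_free(pkts[i]);
    }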

 

Pipeline

The processing pipeline is a design used to encapsulate processing stages into discrete processing blocks with standard, defined inputs and outputs. This allows both run-time and compile-time definition of the processing path for packet flow. The current behavior is a run-to-completion model (for performance), but with reconfigurable stages. Pipeline stages may be written using the Pipeline API.

There are three types of packet processing nodes within the dataplane:

  1. Nodes: Mandatory processing blocks

  2. Features: Optional processing blocks defined at compile time

  3. Dynamic Features: Optional processing blocks defined at run time. The reason all features are not dynamic features is that there is a slight performance advantage for features (over dynamic features).

The dataplane supports a compile-time-defined, fixed (fused) pipeline with dynamic feature support.

Each node in the pipeline may reference a set of next nodes, one of which is chosen as a packet traverses it, based on packet or other state context. The set of next nodes is fixed at the time the node is registered. To provide flexibility and extensibility, nodes can be declared as feature points, which allows features to be registered against them and perform processing. Features are implemented using nodes and so have all the power and flexibility of the main graph invocation (e.g. they could themselves contain feature points). Whether or not to invoke a feature is determined by the feature point node's semantics. For example, a feature point associated with an interface may allow features to be enabled/disabled on a per-interface basis, whereas a feature point with global semantics may only allow features to be enabled/disabled globally.
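As a conceptual sketch of this model (this is not the dataplane's actual Pipeline API; see the source tree for the real types and registration macros), a node can be thought of as a handler plus a fixed array of named next nodes, with the handler's return value selecting which next node a packet takes. The node names used here are illustrative.

    /* Conceptual sketch only; not the dataplane's Pipeline API.  A node
     * is a handler plus a fixed set of named next nodes, and the
     * handler's return value selects the next node for the packet. */
    #include <stdint.h>

    struct rte_mbuf;                     /* packet, from the DPDK */

    struct pl_packet {
        struct rte_mbuf *mbuf;
        uint32_t in_ifindex;
    };

    /* Returns an index into the node's next[] array. */
    typedef unsigned int (*pl_proc_fn)(struct pl_packet *pkt);

    struct pl_node {
        const char *name;
        pl_proc_fn process;
        unsigned int num_next;
        const char *next[8];             /* fixed at registration time */
    };

    enum ipv4_validate_next { V4_ACCEPT, V4_DROP };

    static unsigned int ipv4_validate_process(struct pl_packet *pkt)
    {
        /* Header checks would go here; a feature point node would also
         * decide here whether to invoke any enabled features. */
        (void)pkt;
        return V4_ACCEPT;
    }

    const struct pl_node ipv4_validate_node = {
        .name     = "ipv4-validate",
        .process  = ipv4_validate_process,
        .num_next = 2,
        .next     = { [V4_ACCEPT] = "ipv4-route-lookup",
                      [V4_DROP]   = "term-drop" },
    };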

FAL

The FAL is the integration point for hardware switch devices. It provides a generic set of APIs that the dataplane uses to program the hardware switch. Vendor-specific code is written in FAL plugins, which are loaded dynamically at runtime.

The two key design principles of the FAL are:

  1. Keep the platform dependencies in the FAL plugin as much as possible.

  2. Keep state out of the FAL plugin as much as possible, to keep the FAL plugin as simple as possible.

The FAL aims to abstract the platform as much as possible so that the application can be independent of platform specifics. A FAL plugin implementation would ideally place as much of the platform specifics as possible in data files rather than in code, so that new platforms can be integrated quickly; however, a new piece of platform functionality that interacts with the switch chip may require new code and new platform-data parsing to make use of it.
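The loading mechanism can be pictured as in the sketch below, which is illustrative rather than the dataplane's actual FAL plugin ABI: the plugin path, the fal_plugin_get_ops symbol and the fal_plugin_ops structure are all invented for the example. The pattern is simply that vendor code lives in a shared object which the dataplane opens at runtime and calls through a table of operations.

    /* Illustrative sketch, not the dataplane's actual FAL plugin ABI:
     * the plugin path, the fal_plugin_get_ops symbol and the
     * fal_plugin_ops structure are invented for the example.
     * Build with: cc file.c -ldl */
    #include <dlfcn.h>
    #include <stdio.h>

    struct fal_plugin_ops {
        int (*init)(void);
        int (*new_route)(const void *route, unsigned int len);
    };

    typedef const struct fal_plugin_ops *(*fal_get_ops_fn)(void);

    static const struct fal_plugin_ops *fal_load_plugin(const char *path)
    {
        void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL) {
            fprintf(stderr, "FAL: %s\n", dlerror());
            return NULL;
        }

        /* Resolve the well-known entry point exported by the plugin. */
        fal_get_ops_fn get_ops =
            (fal_get_ops_fn)dlsym(handle, "fal_plugin_get_ops");
        if (get_ops == NULL) {
            fprintf(stderr, "FAL: %s\n", dlerror());
            dlclose(handle);
            return NULL;
        }
        return get_ops();
    }

    int main(void)
    {
        const struct fal_plugin_ops *ops =
            fal_load_plugin("/usr/lib/fal-plugin.so");  /* illustrative path */

        if (ops != NULL && ops->init() == 0)
            printf("FAL plugin initialised\n");
        return 0;
    }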

Shadow interfaces

The dataplane allows packets to be punted to the Linux kernel for processing via the shadow device infrastructure. Kernel-originated packets are injected into the appropriate part of the pipeline via these interfaces. There are three reasons to punt a packet to the kernel for processing:

  1. The packet is destined for the router and needs to be handled by a control plane process.

  2. The packet is destined to an interface that is managed by the kernel.

    1. One may blacklist interfaces from the dataplane when doing development.

  3. The packet needs to be processed by a feature not implemented by the dataplane.

    1. This is rare in the current version of DANOS, but the infrastructure for it still exists and was important in past iterations of the software.

These punt devices are based on the TAP/TUN driver in the Linux kernel. All shadow device packets are handled by a single thread running on lcore 0.
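For reference, the sketch below shows the underlying Linux TAP mechanism on which such punt devices are built. It is a minimal standalone example, not the dataplane's shadow-interface code, and the interface name used is illustrative: packets written to the file descriptor are delivered to the kernel as if received on the named interface, and packets the kernel transmits on that interface can be read back for injection into the pipeline.

    /* Minimal standalone example of the Linux TAP mechanism; not the
     * dataplane's shadow-interface code, and the interface name below
     * is illustrative.  Requires CAP_NET_ADMIN. */
    #include <fcntl.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    static int tap_open(const char *name)
    {
        struct ifreq ifr;
        int fd = open("/dev/net/tun", O_RDWR);

        if (fd < 0) {
            perror("open /dev/net/tun");
            return -1;
        }

        memset(&ifr, 0, sizeof(ifr));
        ifr.ifr_flags = IFF_TAP | IFF_NO_PI;     /* raw ethernet frames */
        strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

        if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
            perror("TUNSETIFF");
            close(fd);
            return -1;
        }
        /* read() returns packets the kernel sent on this interface;
         * write() punts packets to the kernel as if received on it. */
        return fd;
    }

    int main(void)
    {
        int fd = tap_open("dp0shadow");          /* illustrative name */

        if (fd < 0)
            return 1;
        printf("tap device ready on fd %d\n", fd);
        close(fd);
        return 0;
    }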

Default Packet Flow

Out of the box, DANOS has a preconfigured packet flow for the dataplane. The following diagrams lay out the high-level blocks for this packet processing.

Related Source Code

https://github.com/danos/vyatta-dataplane