In-network computing

In-network computing runs selected logic inside the network data path instead of pushing every decision to the host CPU. Teams use programmable switches and network-attached accelerators such as SmartNICs, DPUs, and IPUs to apply focused processing at very high packet rates.

Storage and platform leaders care about one outcome: less infrastructure overhead on the host during heavy traffic. That matters most when clusters rely on NVMe/TCP, serve Kubernetes Storage at scale, and need Software-defined Block Storage to keep latency steady for multiple tenants.

Why in-network computing becomes a priority in enterprise clusters

Platform teams hit a wall when infrastructure work starts to dominate the CPU budget. East-west traffic grows, encryption becomes the default, microsegmentation expands, and observability adds ever more metrics and traces. Those tasks compete with databases, streaming jobs, and storage services on the same nodes.

In-network computing helps by shifting selected enforcement and measurement into the data path. Teams usually see the biggest gains in p95 and p99 latency because fewer shared bottlenecks remain on the host during bursts.

🚀 Apply In-network Computing to NVMe/TCP Storage in Kubernetes
Use Simplyblock™ to run Software-defined Block Storage with multi-tenancy and QoS on standard Ethernet.
👉 See NVMe/TCP Kubernetes Storage with Simplyblock →

How in-network computing is implemented in real deployments

Most in-network computing deployments combine a programmable data plane with a control plane that configures it. Many teams use P4 to define packet behavior, then compile it for a target such as a programmable switch ASIC or an accelerated NIC pipeline.

The best results come from keeping the logic tight and predictable. Data-plane code fits tasks like classification, counters, metering, filtering, and fast-path checks. Complex state, large memory needs, or frequent rule changes usually belong on hosts or in dedicated services.
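
As a mental model, the match-action pattern behind P4-style pipelines can be sketched in a few lines of Python. Everything below is hypothetical and only emulates the shape of the logic: exact-match lookup, bounded per-entry state, counters, no loops. Real deployments compile equivalent logic to a switch ASIC or NIC pipeline rather than running it on a host.

```python
from dataclasses import dataclass

# Hypothetical sketch of a P4-style match-action stage, emulated in Python.
# The point is the shape of data-plane logic: lookup, bounded state, counters.

@dataclass
class FlowEntry:
    action: str       # e.g. "allow", "drop", "mark"
    packets: int = 0  # per-entry counters, as a data-plane counter would keep
    bytes: int = 0

class MatchActionTable:
    def __init__(self, default_action: str = "drop"):
        self.entries: dict[tuple, FlowEntry] = {}
        self.default_action = default_action

    def add_rule(self, key: tuple, action: str) -> None:
        # The control plane installs rules; the data path only reads them.
        self.entries[key] = FlowEntry(action)

    def process(self, key: tuple, pkt_len: int) -> str:
        entry = self.entries.get(key)
        if entry is None:
            return self.default_action
        entry.packets += 1
        entry.bytes += pkt_len
        return entry.action

# Key layout is an assumption: (src_ip, dst_ip, dst_port, protocol).
table = MatchActionTable()
table.add_rule(("10.0.0.5", "10.0.1.9", 4420, "tcp"), "allow")  # NVMe/TCP flow
print(table.process(("10.0.0.5", "10.0.1.9", 4420, "tcp"), 1500))  # -> allow
print(table.process(("10.0.0.7", "10.0.1.9", 80, "tcp"), 600))     # -> drop
```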

NVMe/TCP data paths and Ethernet fabrics

NVMe/TCP runs NVMe over standard TCP/IP on Ethernet. Teams use it for routable fabrics and as a SAN alternative without locking into an RDMA-only network. This lines up with in-network computing because much of the cost sits in the network and security layers around the storage flow, not in the NVMe media itself.

At high IOPS, hosts burn CPU on TCP processing, encryption, policy checks, and telemetry. When the network enforces parts of policy and collects parts of telemetry closer to the wire, hosts keep more headroom during bursts, and shared clusters hold steadier tail latency.
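
A quick way to see this pressure is to watch how the kernel-side share of CPU moves during a traffic burst. The following is a minimal, Linux-only sketch that samples /proc/stat and reports how time splits between user space, kernel, and interrupt/softirq handling (where much of TCP processing lands). It is illustrative, not a profiling tool.

```python
import time

# Sample /proc/stat twice and report CPU time shares over the interval.
# Field layout follows proc(5): user nice system idle iowait irq softirq ...

def cpu_times():
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    user, nice, system, idle, iowait, irq, softirq = map(int, fields[:7])
    return {"user": user + nice, "system": system,
            "softirq": irq + softirq, "idle": idle + iowait}

def sample(interval: float = 1.0):
    a = cpu_times()
    time.sleep(interval)
    b = cpu_times()
    delta = {k: b[k] - a[k] for k in a}
    total = sum(delta.values()) or 1
    return {k: 100.0 * v / total for k, v in delta.items()}

if __name__ == "__main__":
    shares = sample()
    print(f"user {shares['user']:.1f}%  system {shares['system']:.1f}%  "
          f"softirq {shares['softirq']:.1f}%  idle {shares['idle']:.1f}%")
```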

[Infographic: In-network computing]

Kubernetes Storage architectures that benefit from data-path offload

In-network computing does not replace storage services. In Kubernetes Storage, you still need Software-defined Block Storage to deliver volumes, snapshots, resiliency, multi-tenancy, and QoS. Kubernetes also still relies on CSI workflows for provisioning and lifecycle control.
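
From the application side, that CSI workflow is just a PersistentVolumeClaim referencing a StorageClass backed by a CSI driver. Below is a minimal sketch using the official Kubernetes Python client; the StorageClass name "sdb-nvme-tcp" is a placeholder, not a real driver or product name.

```python
from kubernetes import client, config

# Sketch of CSI-based provisioning from the consumer side: a PVC that
# references a CSI-backed StorageClass. Names and sizes are illustrative.

config.load_kube_config()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="db-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="sdb-nvme-tcp",  # placeholder CSI-backed class
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)
```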

Hyper-converged layouts run apps and storage on the same nodes, so CPU pressure shows up fast during spikes. Disaggregated layouts split storage nodes from app nodes, which makes the network path and isolation boundaries more important. Offload helps in both models when it cuts the host-side platform tax that would otherwise compete with scheduling and storage I/O. 

Where computation runs in storage-heavy Kubernetes clusters

The table below compares common placements for enforcement and measurement functions that impact NVMe/TCP flows, and how each tends to affect Kubernetes Storage consistency when Software-defined Block Storage is serving multiple tenants.

| Placement choice | Typical functions that fit | Typical impact on storage-heavy clusters |
| --- | --- | --- |
| Programmable switch pipeline | Classification, counters, basic filtering, metering | Reduces noisy traffic patterns early, limits hotspot amplification |
| Accelerated NIC, DPU, or IPU | Per-host shaping, inline policy hooks, telemetry, crypto assist | Preserves host CPU for apps and storage services, steadier p99 under load |
| Host CPU and kernel | Full flexibility, broad feature coverage | Higher contention during bursts, more tail-latency risk |

Performance and CPU efficiency in Software-defined Block Storage

Data-path offload works best when the storage layer also minimizes overhead. Storage teams aim for fewer copies, fewer context switches, and stable CPU use at high queue depth because those factors directly shape tail latency.
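
Tail latency is the number to watch here. A short sketch, assuming you have per-I/O completion latencies from a benchmark run, shows how p95 and p99 separate from the mean once a few slow outliers appear:

```python
import statistics

# Derive p95/p99 from per-I/O completion latencies (microseconds).
# Tail percentiles, not the mean, are what storage SLOs usually track.

latencies_us = [82, 85, 90, 88, 79, 410, 86, 91, 84, 87, 95, 980,
                83, 89, 92, 86, 88, 90, 85, 87]  # synthetic samples

percentiles = statistics.quantiles(latencies_us, n=100)
p95, p99 = percentiles[94], percentiles[98]
print(f"mean {statistics.mean(latencies_us):.0f} us, "
      f"p95 {p95:.0f} us, p99 {p99:.0f} us")
```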

Shared clusters raise the bar further. In-network computing can reduce host contention by moving selected policy and telemetry into the data path, but multi-tenancy and QoS still need firm controls when Software-defined Block Storage serves multiple tenants over NVMe/TCP. 
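
One common storage-side control is a per-tenant token bucket that caps IOPS so a noisy neighbor cannot starve the rest. The sketch below only illustrates the mechanism; actual software-defined storage enforces limits inside the I/O path with its own policies, and the tenant names and rates are made up.

```python
import time

# Illustrative per-tenant QoS: a token bucket that admits or defers I/Os.

class TokenBucket:
    def __init__(self, rate_iops: float, burst: float):
        self.rate = rate_iops       # tokens (I/Os) refilled per second
        self.capacity = burst       # maximum burst size
        self.tokens = burst
        self.last = time.monotonic()

    def admit(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # I/O proceeds
        return False      # I/O is queued or rejected by policy

# One bucket per tenant; the limits here are placeholders.
tenants = {"tenant-a": TokenBucket(rate_iops=10_000, burst=2_000),
           "tenant-b": TokenBucket(rate_iops=2_000, burst=500)}
print(tenants["tenant-a"].admit())  # True while tokens remain
```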

Simplyblock™ and in-network computing for NVMe/TCP Kubernetes Storage

Simplyblock™ fits the in-network computing model because the offload layer and the storage layer solve different problems. In-network controls can handle traffic policy, telemetry, and enforcement closer to the wire, which reduces host pressure during storage-heavy periods. Simplyblock then delivers Software-defined Block Storage for Kubernetes Storage over NVMe/TCP, with multi-tenancy and QoS to limit noisy-neighbor impact.

Teams get the best outcomes when they pair in-network data-path controls with storage-side QoS. That mix protects CPU headroom and keeps performance boundaries enforceable.

These glossary terms are commonly reviewed alongside In-network computing when planning offload-heavy networking and storage data paths in Kubernetes Storage.

Tail Latency
Storage Latency
NVMe Latency
NVMe over RoCE

Questions and Answers

How does in-network computing improve data center performance?

In-network computing processes data directly within the network hardware, reducing latency and offloading tasks from CPUs. When combined with DPUs, it enables faster storage access, secure isolation, and efficient traffic handling at scale.

What are the benefits of in-network computing in Kubernetes environments?

By executing tasks like encryption, filtering, and NVMe-oF offload inside the network, in-network computing reduces pod-to-storage latency and network congestion. This results in better performance and improved scalability for stateful workloads.

Is in-network computing better than host-based processing for storage I/O?

For the right functions, yes. Offloading parts of the I/O path into the network reduces CPU bottlenecks and lets data move in parallel with host processing. This is particularly useful for software-defined storage where high throughput and low latency are critical.

Which hardware enables in-network computing in cloud infrastructure?

Technologies like DPUs, IPUs, and programmable switches from vendors like NVIDIA (BlueField) and Intel (E2200 IPU) enable in-network computing. These devices execute functions such as storage translation, firewalls, and telemetry directly in the data path.

How does in-network computing support secure multi-tenancy in the cloud?

In-network computing allows hardware-enforced isolation and processing of security policies, reducing risks of data leakage. This enhances multi-tenant security while maintaining performance for distributed applications across tenants.