TCP BBR - Exploring TCP congestion control

Photo by Kalen Emsley on Unsplash


Bottleneck Bandwidth and Round-trip propagation time (BBR) is a TCP congestion control algorithm developed at Google in 2016. Up until recently, the Internet has primarily used loss-based congestion control, relying only on indications of lost packets as the signal to slow down the sending rate. This worked decently well, but the networks have changed. We have much more bandwidth than ever before; The Internet is generally more reliable now, and we see new things such as bufferbloat that impact latency. BBR tackles this with a ground-up rewrite of congestion control, and it uses latency, instead of lost packets as a primary factor to determine the sending rate.


Why is BBR better?

There are a lot of details I’ve omitted, and it gets complicated pretty quickly, but the important thing to know is that with BBR, you can get significantly better throughput and reduced latency. The throughput improvements are especially noticeable on long haul paths such as Transatlantic file transfers, especially when there’s minor packet loss. The improved latency is mostly seen on the last mile path, which is often impacted by Bufferbloat (4 seconds ping times, anyone?). Since BBR attempts not to fill the buffers, it tends to be better in avoiding buffer bloat.

Photo by Zakaria Zayane on Unsplash

let’s take BBR for a spin!

BBR has been in the Linux kernel since version 4.9 and can be enabled with a simple sysctl command. In my tests, I’m using two Ubuntu machines and Iperf3 to generate TCP traffic. The two servers are located in the same data center; I’m using two servers type: t1.small, which come with a 2.5Gbps NIC.

tc qdisc replace dev enp0s20f0 root netem latency 70ms
root@compute-000:~# ping
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=61 time=140 ms
64 bytes from icmp_seq=2 ttl=61 time=140 ms
64 bytes from icmp_seq=3 ttl=61 time=140 ms
sysctl -w net.ipv4.tcp_congestion_control=cubic
sysctl -w net.ipv4.tcp_congestion_control=bbr

The effect of packet loss on throughput

We’re going to repeat the same test as above, but with the addition of a minor amount of packet loss. With the command below, I’m introducing 1,5% packet loss on the server (sender) side only.

tc qdisc replace dev enp0s20f0 root netem loss 1.5% latency 70ms
Throughput Test results with various congestion control algorithms

TCP socket statistics

As you’re exploring tuning TCP performance, make sure to use socket statistics, or ss, like below. This tool displays a ton of socket information, including the TCP flow control algorithm used, the round trip time per TCP session as well as the calculated bandwidth and actual delivery rate between the two peers.

root@compute-000:~# ss -tniState          Recv-Q            Send-Q                                 Local Address:Port                                  Peer Address:PortESTAB 0 9172816 [::ffff:]:5201 [::ffff:]:37482
bbr wscale:8,8 rto:344 rtt:141.401/0.073 ato:40 mss:1448 pmtu:1500 rcvmss:536 advmss:1448 cwnd:3502 ssthresh:4368 bytes_acked:149233776 bytes_received:37 segs_out:110460 segs_in:4312 data_segs_out:110459 data_segs_in:1 bbr:(bw:354.1Mbps,mrtt:140,pacing_gain:1,cwnd_gain:2) send 286.9Mbps lastsnd:8 lastrcv:11008 pacing_rate 366.8Mbps delivery_rate 133.9Mbps busy:11008ms rwnd_limited:4828ms(43.9%) unacked:4345 retrans:7/3030 lost:7 sacked:1197 reordering:300 rcv_space:28960 rcv_ssthresh:28960 notsent:2881360 minrtt:140

When to use BBR

Both Cubic and BBR perform well for these longer latency links when there is no packet loss, and BBR really shines under (moderate) packet loss. Why is that important? You could argue why you would want to design for these packet loss situations. For that, let’s think about a situation where you have multiple data centers around the world, and you rely on transit to connect the various data centers (possibly using your own Overlay VPN). You likely have a steady stream of data between the various data centers, think of logs files, ever-changing configuration or preference files, database synchronization, backups, etc. All major Transit providers at times suffer from packet loss due to various reasons. If you have a few dozen of these globally distributed data centers, depending on your Transit providers and the locations of your POPs you can expect packet loss incidents between a set of data centers several times a week. In situations like this BBR will shine and help you maintain your SLO’s.

Downsides of BBR

It sounds great right, just execute this one sysctl command, and you get much better throughput resulting in your users to get a better experience. Why would you not do this? Well, BBR has received some criticism due to its tendency to consume all available bandwidth and pushing out other TCP streams that use say Cubic or different congestion algorithms. This is something to be mindful of when testing BBR in your environment. BBRv2 is supposed to resolve some of these challenges.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store