Discussion:
[Cerowrt-devel] beating the drum for BQL
Dave Taht
2018-08-23 00:49:08 UTC
I had a chance to give a talk at broadcom recently, slides here:

http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf

(there's a fun slide on "carmageddon", and I am finding the
"tcp_square_wave" test *works* on EE types)

I was very happy to see BQL support in all of broadcom's *ethernet*
drivers, but along the way I noticed how many other drivers still
lacked it in the current kernel. Notably I figured a few DSL devices
should have it by now, and no doubt a few other device types.

I/we really should have beat the bql drum harder over the last 6
years. It's the basic start to all the debloating.
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Mikael Abrahamsson
2018-08-23 05:47:40 UTC
Post by Dave Taht
I/we really should have beat the bql drum harder over the last 6 years.
It's the basic start to all the debloating.
It only helps with kernel based forwarding. A lot of devices don't even
use this, especially as speeds go up. They use packet accelerators so the
kernel never sees the packets after initial flow setup.

So you need to get the people developing that silicon to get with the
program.
--
Mikael Abrahamsson email: ***@swm.pp.se
Sebastian Moeller
2018-08-23 13:06:46 UTC
Hi Mikael,
Post by Dave Taht
I/we really should have beat the bql drum harder over the last 6 years. It's the basic start to all the debloating.
Post by Mikael Abrahamsson
It only helps with kernel based forwarding. A lot of devices don't even use this, especially as speeds go up. They use packet accelerators so the kernel never sees the packets after initial flow setup.
So you need to get the people developing that silicon to get with the program.
Or we could convince customers to stop buying toy routers that only work under a severely limited set of circumstances and opt for devices that pack enough CPU punch to be actually adequate for modern internet speed tiers, no? Now if the packet accelerator is just there to help save energy for typical cases but the CPU is powerful enough to handle a modern line with all the bells and whistles customers expect, then I am all for it, but if the packet accelerator is just there to paper over an anemic CPU...

Best Regards
Sebastian

P.S.: I ignore issues of cost for ISP-supplied CPE devices here, since a) most ISPs I know actually charge rent for these devices now, and b) they should have enough volume to drive down prices even for devices that are sufficiently fast for at least the 300 - 500 Mbps class of combined up- and download....
Mikael Abrahamsson
2018-08-23 10:51:42 UTC
http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf
Thanks for sharing, this is really useful, raising awareness where it matters. Quite a bit of content... :)
https://community.ubnt.com/t5/EdgeRouter-Beta/BQL-support/m-p/2466657
https://community.ubnt.com/t5/EdgeMAX-Beta-Blog/New-EdgeRouter-firmware-2-0-0-alpha-2-has-been-released/ba-p/2414938
My only experience with these devices is the Edgerouter 3/5/X, and they
have very low performance if you disable offloads (which you need to do to
enable AQM) and run everything in CPU, around 100 megabit/s of
uni-directional traffic.

Do they have other platforms where this would actually matter?
--
Mikael Abrahamsson email: ***@swm.pp.se
Dave Taht
2018-08-24 17:30:22 UTC
Post by Dave Taht
http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf
Thanks for sharing, this is really useful, raising awareness where it matters. Quite a bit of content... :)
https://community.ubnt.com/t5/EdgeRouter-Beta/BQL-support/m-p/2466657
This started a discussion, and no, so far it looks like there’s no BQL support in the upcoming 2.0 release.
For my own benefit, re-reading the original patch series comment (https://lwn.net/Articles/469652/) makes it sound like BQL is useful even without AQM (original benchmarks were done with straight pfifo_fast). I didn’t realize this, actually. If anything incorrect about BQL was said in this discussion, correct us, please… :)
Yes, BQL is very useful even with pfifo_fast. Without BQL I doubt the
internet would be scaling as it is today in the DC, or on the smaller
hosts and devices that support it. It's in mvneta, it's in
ar71xx, with documented results there that I could dig up. (Though
things like TSQ are helping, and mask the problem on simple tests.) The
experiment I documented on the slides that kicked off this thread, and
the other experiment on the systemd bug, easily show the benefit on
hosts forwarding packets (be they from local applications, coming from
various sources like docker containers, etc.), and anyone can show what
goes wrong if you disable BQL nowadays, basically restoring linux-3.3
behavior, with a very simple test:

# Needs root; sets a very large BQL floor on every TX queue of your_device.
for I in /sys/class/net/your_device/queues/tx*/byte_queue_limits/limit_min
do
    echo 10000000 > "$I"
done

so long as you run enough kinds of flows that don't engage TSQ.
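
For anyone who wants to run that test and then put things back, here is a minimal sketch that saves the current limit_min values, overrides them, and restores them afterwards. The function names and the SYSFS_NET variable are made up for illustration (they are not from the original post); the sysfs paths are the same ones used in the loop above.

```shell
#!/bin/sh
# Sketch: save, override, and restore BQL limit_min for every TX queue of a device.
# SYSFS_NET is a hypothetical override so the functions can be pointed at a
# scratch directory; on a real system it defaults to /sys/class/net (needs root).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# bql_save DEVICE: print "path value" for each TX queue's limit_min.
bql_save() {
    for f in "$SYSFS_NET/$1"/queues/tx*/byte_queue_limits/limit_min; do
        [ -e "$f" ] || continue
        printf '%s %s\n' "$f" "$(cat "$f")"
    done
}

# bql_set_min DEVICE BYTES: write the same limit_min to every TX queue.
bql_set_min() {
    for f in "$SYSFS_NET/$1"/queues/tx*/byte_queue_limits/limit_min; do
        [ -e "$f" ] || continue
        echo "$2" > "$f"
    done
}

# bql_restore: read "path value" lines (as printed by bql_save) on stdin
# and write each value back to its path.
bql_restore() {
    while read -r f v; do
        [ -n "$f" ] || continue
        echo "$v" > "$f"
    done
}
```

Typical use would be something like: saved=$(bql_save eth0); bql_set_min eth0 10000000; run the latency test; then printf '%s\n' "$saved" | bql_restore. The device name eth0 is only a placeholder.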

However, in the edgerouter w/offloads case all that part of the stack
has been short circuited into the offload engine. I don't know how
much buffering is in there on the new firmware, I'd done a few tests
on it in the old days, showing it to be around 10ms at gigE but even
that memory is kind of vague (the easy test here is slam two ports
into one), and for all I know the new firmware is worse, without going
back to track this new release. (I do have a few edgerouters but they
are all in production)
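
To put the numbers in this thread on the same scale, here is a small sketch that converts between buffer size and queueing delay at a given line rate. It assumes the raw line rate with no framing overhead, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: convert between buffer size and queueing delay at a given line rate.
# Integer arithmetic only; rates in Mbit/s, delays in ms, sizes in bytes.
# Assumption: raw line rate, ignoring Ethernet framing overhead.

# buffer_bytes RATE_MBIT DELAY_MS: bytes that take DELAY_MS to drain.
buffer_bytes() {
    echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}

# drain_ms BYTES RATE_MBIT: ms needed to drain BYTES at RATE_MBIT.
drain_ms() {
    echo $(( $1 * 8 / ($2 * 1000) ))
}

buffer_bytes 1000 10      # ~10 ms at gigE: prints 1250000
drain_ms 10000000 1000    # the 10000000-byte limit_min at gigE: prints 80
```

So roughly 10 ms of buffering at gigE corresponds to about 1.25 MB held in the offload engine, and the 10000000-byte limit_min used in the test above allows up to about 80 ms of queue in the driver at the same rate.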

There was also a paper on BQL a few years back that I can dig up....
Post by Dave Taht
Pete
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Dave Taht
2018-08-24 18:43:01 UTC
Post by Dave Taht
There was also a paper on BQL a few years back that I can dig up....
The only academic analysis of BQL I knew of was this: "bufferbloat
systemic analysis": http://200 dot 131 dot 219 dot
61/publications/2014/its2014_bb.pdf - note that bufferbloat.net's
filters don't let me post numeric URLs, and you can find the paywalled
versions by searching for that title on Google Scholar. Or on Sci-Hub.

I found that again by re-reading my preso to SIGCOMM 2014, "The value
of repeatable experiments and negative results"
(https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf),
which, in addition to providing some valuable history and links to
the bufferbloat, fq, and aqm efforts, is really one of my best rants
*ever* aimed at the academic research and publication process.

I enjoyed writing that, and giving the preso *a lot*. For some reason
or another sigcomm has not invited me back. :)
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619