Discussion:
[Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Frits Riep
2014-05-20 22:11:50 UTC
Permalink
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform beyond the
current Netgear. However, as good as some of the proposed platforms may be
for developing and for doing all of the new capabilities of CeroWRT, I would
also like to propose that there be some focus on reaching a wider and
less sophisticated audience to help broaden the awareness and make control
of bufferbloat more available and easier to attain for more users.



· It appears there is a desire to merge the code into an upcoming
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide users
with a much easier to install firmware release. I'd like to be able to
download luci-qos-scripts and sqm-scripts and have basic bufferbloat control
on a much greater variety of devices and to many more users.
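For concreteness, a minimal sketch of what that could look like on a stock
OpenWRT build, assuming the packages land in the feeds under these names (the
LuCI front-end package name and the config path are assumptions on my part,
not something already settled):

    opkg update
    opkg install sqm-scripts luci-app-sqm   # front-end package name assumed
    # enter download/upload rates (kbit/s) and the WAN interface on the LuCI
    # SQM page, or edit /etc/config/sqm directly, then:
    /etc/init.d/sqm enable
    /etc/init.d/sqm start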
Dave Taht
2014-05-20 23:14:03 UTC
Permalink
Post by Frits Riep
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform beyond the
current Netgear. However, as good as some of the proposed platforms may be
for developing and for doing all of the new capabilities of CeroWRT, I would
also like to propose that there be some focus on reaching a wider and
less sophisticated audience to help broaden the awareness and make control
of bufferbloat more available and easier to attain for more users.
I agree that reaching more users is important. I disagree we need to reach them
Post by Frits Riep
· It appears there is a desire to merge the code into an upcoming
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide users
with a much easier to install firmware release. I’d like to be able to
download luci-qos-scripts and sqm-scripts and have basic bufferbloat control
on a much greater variety of devices and to many more users.
Frits Riep
2014-05-21 11:42:47 UTC
Permalink
Thanks Dave for your responses. Based on this, it is very good that qos-scripts is available now through openwrt, and as I experienced, it provides a huge advantage for most users. I would agree prioritizing ping is in and of itself not the key goal, but based on what I've read so far, fq-codel provides dramatically better responsiveness for any interactive application such as web-browsing, voip, or gaming, so qos-scripts would be advantageous for users like your mom if she were in an environment where she had a slow and shared internet connection. Is that a valid interpretation? I am interested in further understanding the differences based on the brief description you provide. It is true that few devices provide DSCP marking, but if the latency is controlled for all traffic, latency sensitive traffic benefits tremendously even without prioritizing by l7 (layer 7 ?). Is this interpretation also valid?

Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it were set up for her, or if it could be incorporated into a consumer router with automatically determining speed parameters, she would benefit totally from the performance improvement. So the technology ultimately needs to be taken mainstream, and yes that is a huge task.

Frits

-----Original Message-----
From: Dave Taht [mailto:***@gmail.com]
Sent: Tuesday, May 20, 2014 7:14 PM
To: Frits Riep
Cc: cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Post by Frits Riep
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform
beyond the current Netgear. However, as good as some of the proposed
platforms may be for developing and for doing all of the new
capabilities of CeroWRT, I would also like to propose that there
be some focus on reaching a wider and less sophisticated audience to
help broaden the awareness and make control of bufferbloat more available and easier to attain for more users.
· It appears there is a desire to merge the code into an upcoming
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide
users with a much easier to install firmware release. I’d like to be
able to download luci-qos-scripts and sqm-scripts and have basic
bufferbloat control on a much greater variety of devices and to many
more users.
d***@reed.com
2014-05-21 14:51:33 UTC
Permalink
Besides deployment in cerowrt and openwrt, what would really have high leverage is that the techniques developed in cerowrt's exploration (including fq_codel) get deployed where they should be deployed: in the access network systems: CMTS's, DSLAM's, Enterprise boundary gear, etc. from the major players.

Cerowrt's fundamental focus has been proving that the techniques really, really work at scale.

However, the fundamental "bloat-induced" experiences are actually occurring due to bloat at points where "fast meets slow". Cerowrt can't really fix the problem in the download direction (currently not so bad because of high download speeds relative to upload speeds in the US) - that's in the CMTS's and DSLAM's.

What's depressing to me is that the IETF community spends more time trying to convince themselves that bloat is only a theoretical problem, never encountered in the field. In fact, every lab I've worked at (including the startup accelerator where some of my current company work) has had the network managers complaining to me that a single heavy FTP I'm running causes all of the other users in the site to experience terrible web performance. But when they call Cisco or F5 or whomever, they get told "there's nothing to do but buy complicated flow-based traffic management boxes to stick in line with the traffic (so they can "slow me down").

Bloat is the most common invisible elephant on the Internet. Just fixing a few access points is a start, but even if we fix all the access points so that uploads interfere less, there's still more impact this one thing can have.

So, by all means get this stuff into mainstream, but it's time to start pushing on the access network technology companies (and there are now open switches from Cumulus and even Arista to hack)
Post by Frits Riep
Thanks Dave for your responses. Based on this, it is very good that qos-scripts
is available now through openwrt, and as I experienced, it provides a huge
advantage for most users. I would agree prioritizing ping is in and of itself not
the key goal, but based on what I've read so far, fq-codel provides dramatically
better responsiveness for any interactive application such as web-browsing, voip,
or gaming, so qos-scripts would be advantageous for users like your mom if she
were in an environment where she had a slow and shared internet connection. Is
that a valid interpretation? I am interested in further understanding the
differences based on the brief description you provide. It is true that few
devices provide DSCP marking, but if the latency is controlled for all traffic,
latency sensitive traffic benefits tremendously even without prioritizing by l7
(layer 7 ?). Is this interpretation also valid?
Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it
were set up for her, or if it could be incorporated into a consumer router with
automatically determining speed parameters, she would benefit totally from the
performance improvement. So the technology ultimately needs to be taken
mainstream, and yes that is a huge task.
Frits
-----Original Message-----
Sent: Tuesday, May 20, 2014 7:14 PM
To: Frits Riep
Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat
control for consideration.
Post by Frits Riep
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform
beyond the current Netgear. However, as good as some of the proposed
platforms may be for developing and for doing all of the new
capabilities of CeroWRT, I would also like to propose that there
be some focus on reaching a wider and less sophisticated audience to
help broaden the awareness and make control of bufferbloat more available and
easier to attain for more users.
I agree that reaching more users is important. I disagree we need to reach them
Post by Frits Riep
· It appears there is a desire to merge the code into an
upcoming
Post by Frits Riep
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide
users with a much easier to install firmware release. I’d like to be
able to download luci-qos-scripts and sqm-scripts and have basic
bufferbloat control on a much greater variety of devices and to many more users.
Dave Taht
2014-05-21 15:19:55 UTC
Permalink
Post by d***@reed.com
Besides deployment in cerowrt and openwrt, what would really have high
leverage is that the techniques developed in cerowrt's exploration
(including fq_codel) get deployed where they should be deployed: in the
access network systems: CMTS's, DSLAM's, Enterprise boundary gear, etc. from
the major players.
+10.
Post by d***@reed.com
Cerowrt's fundamental focus has been proving that the techniques really,
really work at scale.
That they even work on a processor designed in 1990! :)

I also have hoped that along the way we've shown what techniques don't
work...
Post by d***@reed.com
However, the fundamental "bloat-induced" experiences are actually occurring
due to bloat at points where "fast meets slow". Cerowrt can't really fix
the problem in the download direction (currently not so bad because of high
download speeds relative to upload speeds in the US) - that's in the CMTS's
and DSLAM's.
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did it,
of course....
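For reference, a rough sketch of the kind of inbound shaping being discussed,
along the lines of what the sqm-scripts set up; the interface names and the
48mbit rate (set a bit below the provisioned downstream) are placeholders,
and real setups add more detail:

    # Redirect ingress traffic from the WAN interface through an IFB device,
    # rate-limit it with HTB, and put fq_codel underneath.
    ip link add ifb0 type ifb
    ip link set ifb0 up
    tc qdisc add dev eth0 handle ffff: ingress
    tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 \
        action mirred egress redirect dev ifb0
    tc qdisc add dev ifb0 root handle 1: htb default 10
    tc class add dev ifb0 parent 1: classid 1:10 htb rate 48mbit
    tc qdisc add dev ifb0 parent 1:10 fq_codel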
Post by d***@reed.com
What's depressing to me is that the IETF community spends more time trying
to convince themselves that bloat is only a theoretical problem, never
encountered in the field. In fact, every lab I've worked at (including the
It isn't all the IETF. Certainly google gets it and has made huge strides.
reduced RTT = money.

My own frustration comes from papers that are testing this stuff at 4mbit
or lower and not seeing the results we get above that, on everything.

https://plus.google.com/u/0/107942175615993706558/posts/AbeHRY9vzLR

ns2 and ns3 could use some improvements...
Post by d***@reed.com
startup accelerator where some of my current company work) has had the
network managers complaining to me that a single heavy FTP I'm running
causes all of the other users in the site to experience terrible web
performance. But when they call Cisco or F5 or whomever, they get told
"there's nothing to do but buy complicated flow-based traffic management
boxes to stick in line with the traffic (so they can "slow me down").
It is sad that F5, in particular, doesn't have a sane solution. Their whole
approach is to have a "load-balancer" and fq_codel is a load-balancer to
end all load balancers.

I do note nobody I know has ported BQL or fq_codel to bsd (codel is in bsd now)
Post by d***@reed.com
Bloat is the most common invisible elephant on the Internet. Just fixing a
+10.
Post by d***@reed.com
few access points is a start, but even if we fix all the access points so
that uploads interfere less, there's still more impact this one thing can
have.
I was scared silly at the implications 2 years back; I am more sanguine
now.
Post by d***@reed.com
So, by all means get this stuff into mainstream, but it's time to start
pushing on the access network technology companies (and there are now open
switches from Cumulus and even Arista to hack)
Oh, cool! I keep waiting for my parallella to show up so I could start
fiddling with ethernet in the fpga....
Post by d***@reed.com
Thanks Dave for your responses. Based on this, it is very good that
qos-scripts
is available now through openwrt, and as I experienced, it provides a huge
advantage for most users. I would agree prioritizing ping is in and of
itself not
the key goal, but based on what I've read so far, fq-codel provides dramatically
better responsiveness for any interactive application such as
web-browsing, voip,
or gaming, so qos-scripts would be advantageous for users like your mom if she
were in an environment where she had a slow and shared internet
connection. Is
that a valid interpretation? I am interested in further understanding the
differences based on the brief description you provide. It is true that
few
devices provide DSCP marking, but if the latency is controlled for all traffic,
latency sensitive traffic benefits tremendously even without prioritizing by l7
(layer 7 ?). Is this interpretation also valid?
Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it
were set up for her, or if it could be incorporated into a consumer router with
automatically determining speed parameters, she would benefit totally from the
performance improvement. So the technology ultimately needs to be taken
mainstream, and yes that is a huge task.
Frits
-----Original Message-----
Sent: Tuesday, May 20, 2014 7:14 PM
To: Frits Riep
Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat
control for consideration.
Post by Frits Riep
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform
beyond the current Netgear. However, as good as some of the proposed
platforms may be for developing and for doing all of the new
capabilities of CeroWRT, I would also like to propose that there
be some focus on reaching a wider and less sophisticated audience to
help broaden the awareness and make control of bufferbloat more available and
easier to attain for more users.
I agree that reaching more users is important. I disagree we need to reach them
Post by Frits Riep
· It appears there is a desire to merge the code into an
upcoming
Post by Frits Riep
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide
users with a much easier to install firmware release. I’d like to be
able to download luci-qos-scripts and sqm-scripts and have basic
bufferbloat control on a much greater variety of devices and to many
more users.
d***@reed.com
2014-05-21 16:03:08 UTC
Permalink
Post by Dave Taht
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did it,
of course....
There is an advantage for the head-ends doing it, to the extent that each edge device has no clarity about what is happening with all the other cpe that are sharing that head-end. When there is bloat in the head-end, even if all cpe's sharing an upward path are shaping themselves to the "up to" speed the provider sells, they can go into serious congestion if the head-end queues can grow to 1 second or more of sustained queueing delay. My understanding is that head-end queues have more than that. They certainly do in LTE access networks.
Dave Taht
2014-05-21 16:30:12 UTC
Permalink
Post by d***@reed.com
Post by Dave Taht
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did it,
of course....
There is an advantage for the head-ends doing it, to the extent that each
edge device has no clarity about what is happening with all the other cpe
that are sharing that head-end. When there is bloat in the head-end even if
all cpe's sharing an upward path are shaping themselves to the "up to" speed
the provider sells, they can go into serious congestion if the head-end
queues can grow to 1 second or more of sustained queueing delay. My
understanding is that head-end queues have more than that. They certainly
do in LTE access networks.
Compelling argument! I agree it would be best for the devices that have the
most information about the network to manage themselves better.

It is deeply ironic to me that I'm arguing for an e2e approach on fixing
the problem in the field, with you!

http://en.wikipedia.org/wiki/End-to-end_principle
--
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
d***@reed.com
2014-05-21 17:55:07 UTC
Permalink
The end-to-end argument against putting functionality in the network is a modularity principle, as you know. The exception is when there is a function that you want to provide that is not strictly end-to-end.

Congestion is one of them - there is a fundamental issue with congestion that it happens because of collective actions among independent actors.

So if you want to achieve the goals of the modularity principle, you need to find either a) the minimal sensing and response you can put in the network that allows the independent actors to "cooperate", or b) require the independent actors to discover and communicate amongst each other individually.

Any solution that tries to satisfy the modularity principle has the property that it provides sufficient information, in a sufficiently timely manner, for the independent actors to respond "cooperatively" to resolve the issue (by reducing their transmission volume in some - presumably approximately fair - way).

Sufficiently timely is bounded by the "draining time" of a switch's outbound link's queue. For practical applications of the Internet today, the "draining time" should never exceed about 30-50 msec. at the outbound link's rate. However, the optimal normal depth of the queue should be no larger than the size needed to keep the outbound link continuously busy at its peak rate, whatever that is (for a shared WiFi access point the peak rate is highly variable as you know).
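To put a number on that (my arithmetic, not from the post above): at a 10 Mbit/s outbound link, a 50 msec draining-time bound corresponds to a standing queue of at most

\[
Q_{\max} = R \cdot T_{\text{drain}} = \frac{10 \times 10^{6}\ \text{bit/s} \times 0.05\ \text{s}}{8\ \text{bit/byte}} \approx 62{,}500\ \text{bytes} \approx 41\ \text{full-size (1500-byte) packets.}
\]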

This suggests that the minimal function the network must provide to the endpoints is the packet's "instantaneous" contribution to the draining time of the most degraded link on the path.

Given this information, a pair of endpoints know what to do. If it is a receiver-managed windowed protocol like TCP, the window needs to be adjusted to minimize the contribution to the "draining time" of the currently bottlenecked node, to stop pipelined flows from its sender as quickly as possible.

In that case, cooperative behavior is implicit. The bottleneck switch needs only to inform all independent flows of their contribution, and with an appropriate control loop on each node, approximate fairness can result.

And this is the most general approach. Switches have no idea of the "meaning" of the flows, so beyond timely and accurate reporting, they can't make useful decisions about fixing congestion.

Note that this all is an argument about architectural principles and the essence of the congestion problem.

I could quibble about whether fq_codel is the simplest or best choice for the minimal functionality an "internetwork" could provide. But it's pretty nice and simple. Not clear it works for a decentralized protocol like WiFi as a link - but something like it would seem to be the right thing.
Post by Dave Taht
Post by d***@reed.com
Post by Dave Taht
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did
it,
Post by d***@reed.com
Post by Dave Taht
of course....
There is an advantage for the head-ends doing it, to the extent that each
edge device has no clarity about what is happening with all the other cpe
that are sharing that head-end. When there is bloat in the head-end even if
all cpe's sharing an upward path are shaping themselves to the "up to" speed
the provider sells, they can go into serious congestion if the head-end
queues can grow to 1 second or more of sustained queueing delay. My
understanding is that head-end queues have more than that. They certainly
do in LTE access networks.
Compelling argument! I agree it would be best for the devices that have the
most information about the network to manage themselves better.
It is deeply ironic to me that I'm arguing for an e2e approach on fixing
the problem in the field, with you!
http://en.wikipedia.org/wiki/End-to-end_principle
--
Dave Täht
https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Jim Gettys
2014-05-21 17:47:06 UTC
Permalink
Post by d***@reed.com
Post by Dave Taht
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did
it,
Post by Dave Taht
of course....
There is an advantage for the head-ends doing it, to the extent that each
edge device has no clarity about what is happening with all the other cpe
that are sharing that head-end. When there is bloat in the head-end even if
all cpe's sharing an upward path are shaping themselves to the "up to"
speed the provider sells, they can go into serious congestion if the
head-end queues can grow to 1 second or more of sustained queueing delay.
My understanding is that head-end queues have more than that. They
certainly do in LTE access networks.
I have measured 200ms on a 28Mbps LTE quadrant to a single station. This
was using the simplest possible test on an idle cell. Easy to see how that
can grow to the second range.

Similarly, Dave Taht and I took data recently that showed a large
downstream buffer at the CMTS end (line card?), IIRC, it was something like
.25 megabyte, using a UDP flooding tool.

As always, there may be multiple different buffers lurking in these complex
devices, which may only come into play when different parts of them
"bottleneck", just as we found many different buffering locations inside of
Linux. In fact, some of these devices include Linux boxes (though I do not
know if they are on the packet forwarding path or not).

Bandwidth shaping downstream of those bottlenecks can help, but only to a
degree, and I believe primarily for "well behaved" long lived elephant
flows. Offload engines on servers and ack coalescing in various equipment
limit the degree of help; transient behavior, such as opening a bunch of TCP
connections simultaneously to download the elements of a web page, is, I
believe, likely to put large bursts of packets into these queues, causing
transient poor latency. I think we'll get a bit
of help out of the packet pacing code that recently went into Linux (for
well behaved servers) as it deploys. Thanks to Eric Dumazet for that work!
Ironically, servers get updated much more frequently than these middle
boxes, as far as I can tell.

Somehow we gotta get the bottlenecks in these devices (broadband &
cellular) to behave better.
- Jim
Post by d***@reed.com
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
Dave Taht
2014-05-21 17:53:29 UTC
Permalink
Post by Jim Gettys
Post by d***@reed.com
Post by Dave Taht
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did it,
of course....
There is an advantage for the head-ends doing it, to the extent that each
edge device has no clarity about what is happening with all the other cpe
that are sharing that head-end. When there is bloat in the head-end even if
all cpe's sharing an upward path are shaping themselves to the "up to" speed
the provider sells, they can go into serious congestion if the head-end
queues can grow to 1 second or more of sustained queueing delay. My
understanding is that head-end queues have more than that. They certainly
do in LTE access networks.
I have measured 200ms on a 28Mbps LTE quadrant to a single station. This
was using the simplest possible test on an idle cell. Easy to see how that
can grow to the second range.
Similarly, Dave Taht and I took data recently that showed a large downstream
buffer at the CMTS end (line card?), IIRC, it was something like .25
megabyte, using a UDP flooding tool.
No, it was twice that. The udpburst tool is coming along nicely, but still
needs some analytics against the departure rate to get it right.
Post by Jim Gettys
As always, there may be multiple different buffers lurking in these complex
devices, which may only come into play when different parts of them
"bottleneck", just as we found many different buffering locations inside of
Linux. In fact, some of these devices include Linux boxes (though I do not
know if they are on the packet forwarding path or not).
Bandwidth shaping downstream of those bottlenecks can help, but only to a
degree, and I believe primarily for "well behaved" long lived elephant
flows. Offload engines on servers and ack coalescing in various equipment
limit the degree of help; transient behavior, such as opening a bunch of TCP
connections simultaneously to download the elements of a web page, is, I
believe, likely to put large bursts of packets into these queues, causing
transient poor latency. I think we'll get a bit
of help out of the packet pacing code that recently went into Linux (for
well behaved servers) as it deploys. Thanks to Eric Dumazet for that work!
Ironically, servers get updated much more frequently than these middle
boxes, as far as I can tell.
Somehow we gotta get the bottlenecks in these devices (broadband & cellular)
to behave better.
Or we can take a break, and write books about how we learned to relax and
stop worrying about the bloat.
Post by Jim Gettys
- Jim
Post by d***@reed.com
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
d***@reed.com
2014-05-21 17:56:37 UTC
Permalink
Post by Dave Taht
Or we can take a break, and write books about how we learned to relax and
stop worrying about the bloat.
Leading to waistline bloat?
Jim Gettys
2014-05-21 17:57:57 UTC
Permalink
Post by d***@reed.com
Post by Dave Taht
Or we can take a break, and write books about how we learned to relax and
stop worrying about the bloat.
Leading to waistline bloat?
We resemble that remark already....
Dave Taht
2014-05-21 18:31:39 UTC
Permalink
Post by Jim Gettys
Post by d***@reed.com
Post by Dave Taht
Or we can take a break, and write books about how we learned to relax and
stop worrying about the bloat.
Leading to waistline bloat?
We resemble that remark already....
I put on 35 pounds since starting to work on this.
--
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Dave Taht
2014-05-21 15:07:37 UTC
Permalink
Post by Frits Riep
Thanks Dave for your responses. Based on this, it is very good that qos-scripts is available now through openwrt, and as I experienced, it provides a huge advantage for most users.
I should point out that another issue with deploying fq_codel widely
is that it requires an accurate
measurement (currently) of the provider's bandwidth.

My hope/expectation is that more ISPs that
provide CPE will ship something that is configured correctly by
default, following in free.fr's footsteps,
and trying to beat the cable industry to the punch, now that the core
code is debugged and documented, creating an out-of-box win.
Post by Frits Riep
I would agree prioritizing ping is in and of itself not the key goal, but based on what I've read so far, fq-codel provides dramatically better responsiveness for any interactive application such as web-browsing, voip, or gaming, so qos-scripts would be advantageous for users like your mom if she were in an environment where she had a slow and shared internet connection. Is that a valid interpretation?
Sure. My mom has a fast, non-shared internet connection. Her biggest
problem is she hasn't
got off of windows despite my brother's decade of attempts to move her
to osx.... :)

Markets where this stuff seriously applies as a rate limiter + qos system
today are small to medium business, cybercafes, shared workspaces,
geek-zones, and so on. It also applies on ethernet and in cases where
you want to artificially have a rate limit like:

http://pieknywidok.blogspot.com/2014/05/10g-1g.html

We're ~5 years ahead of the curve here at cerowrt-central. Tools "just
work" for any sysadmin with chops. Commercial products are in the
pipeline.

While it takes time to build it into a product, I'd kind of expect
barracuda and ubnt to add fq_codel
to their products fairly soon, and for at least one switch vendor to follow.

It's in shorewall, ipfire, streamboost, everything downstream from openwrt,
linux mainline (and thus every linux distro) already. I know of a
couple cloud providers that are running
sch_fq and fq_codel already.

One thing I'm a little frustrated about is that I'd expected sch_fq
to replace pfifo_fast by default
on more linux distros by now. It's a single sysctl...
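For reference, that sysctl (available since sch_fq and the default_qdisc knob
landed in Linux 3.12; the sysctl.d file name below is just an example):

    # make sch_fq the default qdisc for newly created interfaces
    sysctl -w net.core.default_qdisc=fq
    # persist across reboots
    echo 'net.core.default_qdisc=fq' > /etc/sysctl.d/90-default-qdisc.conf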
Post by Frits Riep
I am interested in further understanding the differences based on the brief description you provide. It is true that few devices provide DSCP marking, but if the latency is controlled for all traffic, latency sensitive traffic benefits tremendously even without prioritizing by l7 (layer 7 ?). Is this interpretation also valid?
Very, very true. Most of the need for prioritization goes away
entirely, due to the "sparse" vs "full" (or fast vs slow) queue
concept in fq_codel. In most circumstances things like voip just cut
through other traffic like butter. Videoconferencing is vastly
improved, also.

However, on very, very slow links (<3mbit), nothing helps enough. It's
not just the qos system that needs to be tuned, but that modern TCPs
and the web are optimized for much faster links and have features that
hurt at low speeds. (what helps most is installing adblock plus!).
Torrent is something of a special case - I find it totally bearable at
20mbit/4mbit without classification - but unbearable at 8/1.

I'm pretty satisfied we have the core algorithms and theory in place,
now, to build edge devices that work much better at 3mbit to 200mbit,
at least, possibly 10gbit or higher.
Post by Frits Riep
Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it were set up for her, or if it could be incorporated into a consumer router with automatically determining speed parameters,
That automatic speedtest thing turns out to be hard.
Post by Frits Riep
she would benefit totally from the performance improvement.
Meh. She needs to get off of windows.
Post by Frits Riep
So the technology ultimately needs to be taken mainstream, and yes that is a huge task.
Yep. If we hadn't given away everything perhaps there would be a
business model to fund that - streamboost is trying that route.

My hope was that the technology would simply be so compelling that vendors
would be falling over themselves to answer the customer complaints.
But few have yet tied "bufferbloat" to the problems gamers and small
businesses are having with their internet uplinks, and more
education and demonstration seems necessary.

There is a huge backlog of potential demand for a better dslam, in
particular, as well as better firewalls and cablemodems. I don't have
a lot of hope for the two CMTS vendors to move to improve things
anytime soon.
Post by Frits Riep
Frits
-----Original Message-----
Sent: Tuesday, May 20, 2014 7:14 PM
To: Frits Riep
Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Post by Frits Riep
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform
beyond the current Netgear. However, as good as some of the proposed
platforms may be for developing and for doing all of the new
capabilities of CeroWRT, I would also like to propose that there
be some focus on reaching a wider and less sophisticated audience to
help broaden the awareness and make control of bufferbloat more available and easier to attain for more users.
· It appears there is a desire to merge the code into an upcoming
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide
users with a much easier to install firmware release. I’d like to be
able to download luci-qos-scripts and sqm-scripts and have basic
bufferbloat control on a much greater variety of devices and to many
more users.
Michael Richardson
2014-05-21 16:50:46 UTC
Permalink
Post by Dave Taht
I should point out that another issue with deploying fq_codel widely
is that it requires an accurate
measurement (currently) of the providers bandwidth.
I've been thinking about ways to do this over PPP(oE) links if one controls
both ends --- many third party internet access ISPs terminate the PPP
on their equipment, rather than the telco's, so it should be possible
to avoid all the L2 issues.

My ISP now offers fiber-to-the-neighbourhood, 50Mb/s down, 10 up.
(vs 7/640 that I have now). They are offering me an
http://smartrg.com/products/products/sr505n/

which they suggest I run in bridge (layer-2) mode. I'm trying to figure out
what is inside, as it has the DSL interface right on it. I didn't know
of this device before.
Post by Dave Taht
My hope/expectation is that more ISPs that
provide CPE will ship something that is configured correctly by
default, following in free.fr's footsteps,
and trying to beat the cable industry to the punch, now that the core
code is debugged and documented, creating an out-of-box win.
Agreed.

--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] ***@sandelman.ca http://www.sandelman.ca/ | ruby on rails [
David Lang
2014-05-21 17:58:57 UTC
Permalink
Post by Frits Riep
Thanks Dave for your responses. Based on this, it is very good that
qos-scripts is available now through openwrt, and as I experienced, it
provides a huge advantage for most users.
I should point out that another issue with deploying fq_codel widely is that
it requires an accurate measurement (currently) of the providers bandwidth.
does it need this accurate measurement for sending or for the receiving pacing?

David Lang
R.
2014-05-24 14:12:56 UTC
Permalink
I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth.
Pardon my noobiness, but is there a technical obstacle that prevents
the creation of a user-triggered function on the router side that
measures the provider's bandwidth?

Function, when (luci-gui?) triggered, would:

1. Ensure that internet connectivity is present.
2. Disconnect all clients.
3. Engage in DL and UL on a dedicated web server, measure stats and
straight up use them in fq_codel -- or suggest them in appropriate
QoS-gui user-boxes.

Further, this function could be auto-scheduled or made enabled on
router boot up.

I must be missing something important which prevents this. What is it?
Sebastian Moeller
2014-05-24 17:31:47 UTC
Permalink
Hi R, hi List,
Post by R.
I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth.
Pardon my noobiness, but is there a technical obstacle that prevents
the creation of a user-triggered function on the router side that
measures the provider's bandwidth?
1. Ensure that internet connectivity is present.
2. Disconnect all clients.
3. Engage in DL and UL on a dedicated web server, measure stats and
straight up use them in fq_codel -- or suggest them in appropriate
QoS-gui user-boxes.
Further, this function could be auto-scheduled or made enabled on
router boot up.
I must be missing something important which prevents this. What is it?
Well, I see a couple of challenges that need to be overcome before this could work.

In your step 3 you touch on the issue of measuring the current stats; and somehow that is trickier than one would think:

1) what to measure precisely, a "dedicated web server" sounds like a great idea, but who is dedicating it and where is it located relative to the link under test?
Rich Brown has made a nice script to measure current throughput and give an estimate on the effect of link saturation on latency (see betterspeedtest.sh from https://github.com/richb-hanover/CeroWrtScripts), but using this from Germany gives:
2014-05-24 15:44:47 Testing against demo.tohojo.dk with 5 simultaneous sessions while pinging gstatic.com (60 seconds in each direction)
Download: 12.06 Mbps
Upload: 1.99 Mbps
against a server in Europe, but:
Download: 10.42 Mbps
Upload: 1.85 Mbps
against a server on the east side of the USA. So the router would need to select a close-by server. Sites like speedtest.net offer this kind of server selection by proximity but do not have a very reliable way to load the link and do not measure the effect of link saturation on the latency… but the whole idea is to find the highest bandwidth that does not cause an indecent increase of latency under load. (Also speed tests are quite stereotypic in observable behavior and length so some ISPs special case these to look good; but that is a different kettle of fish…)
Note that there is also the question of where one would like to measure the link speed; for example for DSL there is the link to the DSLAM, the link from the DSLAM to the next network node, sometimes a PPP link to a remote BRAS system (that might throttle the traffic). All of these can be the bottleneck of the ISP connection (depending on circumstances). My take is that one would like to look at the link between modem and DSLAM as the bottleneck, but opinions differ (and then there is cable with its shared first segment...).

2) Some links have quite peculiar properties that are hard to deduce from quick speed tests. For example ATM based ADSL links (this includes all ADSL1, ADSL2 and to my knowledge all existing ADSL2+ links) will show a packet-size dependent link speed. In short ATM uses an integer number of 48 byte cells to transport each packet, so worst case it adds 47 bytes of padding to a small packet, which can effectively double the size of the packet on the wire, or, stated differently, halve the link speed for packets of that size. (Note that thanks to the work of Jesper Brouer and Russell Stuart the linux kernel can take care of that issue for you, but you need to tell the kernel explicitly.)
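For reference, one way to tell the kernel explicitly is tc's "stab" size table; a rough sketch, where the interface name, rate and the 40-byte overhead are placeholders that depend on the actual ADSL encapsulation:

    # Tell the shaper that each packet occupies an integer number of 53-byte
    # ATM cells (48 bytes of payload each) plus per-packet overhead, so the
    # configured rate is honoured on the wire.
    tc qdisc add dev pppoe-wan root handle 1: stab linklayer atm overhead 40 \
        htb default 10
    tc class add dev pppoe-wan parent 1: classid 1:10 htb rate 1900kbit
    tc qdisc add dev pppoe-wan parent 1:10 fq_codel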

3) many links actually do not have a constant wire speed available. For docsis (basically cable) the local segment is shared between many users and transmit timeslots are shared between requestors, giving effectively slower links during peak hours. For DSL a resync between DSLAM and modem can (significantly) change the negotiated speed; something cerowrt does not get any notice of…

I guess buffer bloat mitigation needs to move into the modems and DSLAMs to get rid of the bandwidth guessing game. For cable at least the modems are getting better (thanks to PIE being part of the docsis 3.1? standard), but for DSL I do not think there is any generic solution on the horizon…


Best Regards
Sebastian
Post by R.
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
David P. Reed
2014-05-24 19:05:46 UTC
Permalink
Depends on the type of the provider. Most providers now have shared paths to the backbone among users and give a peak rate up and down for brief periods that they will not sustain... In fact they usually penalize use of the peak rate by reducing the rate after that.

So at what point they create bloat in their access net is hard to determine. And it depends on your neighbors' behavior as well.

The number you want is the bloatedness of your path through the access provider.

This is measurable by sending small probes back and forth to a measurement server... Measuring instantaneous latency in each direction and combining that information with one's recent history in a non trivial calculation.

Note that that measurement does not directly produce provider speeds that can be input to the shapers used in codel. But it does produce a queue size that can.
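A crude illustration of that probe idea (my sketch, not anything David described in detail): the host and URL are placeholders, it measures round-trip rather than per-direction latency, and the awk field assumes GNU iputils ping output.

    #!/bin/sh
    HOST=probe.example.net                    # placeholder measurement server
    URL=http://probe.example.net/100MB.bin    # placeholder bulk download
    idle=$(ping -c 10 "$HOST" | awk -F/ 'END { print $5 }')    # avg idle RTT, ms
    curl -s -o /dev/null "$URL" &              # saturate the downstream path
    loaded=$(ping -c 10 "$HOST" | awk -F/ 'END { print $5 }')  # avg RTT under load
    wait
    echo "idle RTT ${idle} ms, loaded RTT ${loaded} ms; the difference is queueing delay"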

So it's a plausible way to proceed as long as the operators refuse to fix their gear to manage the actual link that is problematic.

Personally I'd suggest that the gear makers' feet be held to the fire... by not "fixing" it by an inferior fix at the home router. Keep the pressure on them at IETF and among their customers.
Post by Dave Taht
I should point out that another issue with deploying fq_codel widely
is that it requires an accurate measurement (currently) of the
providers bandwidth.
Pardon my noobiness, but is there a technical obstacle that prevents
the creation of a user-triggered function on the router side that
measures the provider's bandwidth?
1. Ensure that internet connectivity is present.
2. Disconnect all clients.
3. Engage in DL and UL on a dedicated web server, measure stats and
straight up use them in fq_codel -- or suggest them in appropriate
QoS-gui user-boxes.
Further, this function could be auto-scheduled or made enabled on
router boot up.
I must be missing something important which prevents this. What is it?
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
-- Sent from my Android device with K-@ Mail. Please excuse my brevity.
R.
2014-05-24 14:03:18 UTC
Permalink
I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth.
Pardon my noobiness, but is there a technical obstacle that prevents
the creation of a user-triggered function on the router side that
measures the provider's bandwidth?

Function, when (luci-gui?) triggered, would:

1. Ensure that internet connectivity is present.
2. Disconnect all clients.
3. Engage in DL and UL on a dedicated web server, measure stats and
straight up use them in fq_codel -- or suggest them in appropriate
QoS-gui user-boxes.

Further, this function could be auto-scheduled or made enabled on
router boot up.

I must be missing something important which prevents this. What is it?
V***@vt.edu
2014-07-25 18:37:34 UTC
Permalink
Post by R.
Further, this function could be auto-scheduled or made enabled on
router boot up.
Yeah, if such a thing worked, it would be good.

(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)
Post by R.
I must be missing something important which prevents this. What is it?
There are a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code. The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM)), and "server is slashdotted" (which
is a bit harder to deal with). Remember that there are some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely. No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).

And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.

Dealing with non-obvious error conditions is also a challenge - a router
may only boot once every few months. And if you happen to be booting just
as a BGP routing flap is causing your traffic to take a vastly suboptimal
path, you may end up encoding a vastly inaccurate setting and have it stuck
there, causing suckage for non-obvious reasons for the non-technical, so you
really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....
David Lang
2014-07-25 21:03:38 UTC
Permalink
Post by V***@vt.edu
Post by R.
Further, this function could be auto-scheduled or made enabled on
router boot up.
Yeah, if such a thing worked, it would be good.
(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)
Post by R.
I must be missing something important which prevents this. What is it?
There are a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code. The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM)), and "server is slashdotted" (which
is a bit harder to deal with). Remember that there are some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely. No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).
And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.
Dealing with non-obvious error conditions is also a challenge - a router
may only boot once every few months. And if you happen to be booting just
as a BGP routing flap is causing your traffic to take a vastly
suboptimal
path, you may end up encoding a vastly inaccurate setting and have it stuck
there, causing suckage for non-obvious reasons for the non-technical, so you
really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....
have the router record its finding, and then repeat the test
periodically, recording its finding as well. If the new finding is
substantially different from the prior ones, schedule a retest 'soon'
(or default to the prior setting if it's bad enough); otherwise, if
there aren't many samples, schedule a test 'soon', and if there are a lot of
samples, schedule a test in a while.
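A rough sketch of that heuristic (mine, not anything from this thread); measure_bandwidth is a hypothetical helper that prints the measured rate in kbit/s:

    #!/bin/sh
    state=/tmp/last_bw
    last=$(cat "$state" 2>/dev/null || echo 0)
    new=$(measure_bandwidth)                  # hypothetical measurement helper
    echo "$new" > "$state"
    # absolute change between this sample and the previous one
    if [ "$new" -gt "$last" ]; then delta=$((new - last)); else delta=$((last - new)); fi
    if [ "$last" -eq 0 ] || [ $((delta * 100 / last)) -gt 20 ]; then
        next=30      # first sample or a >20% swing: retest soon (minutes)
    else
        next=1440    # stable: back off to roughly daily
    fi
    echo "schedule next bandwidth test in $next minutes"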

However, I think the big question is how much tuning is required.

If a connection with BQL and fq_codel is 90% as good as a tuned setup,
default to untuned unless the user explicitly hits a button to measure
(and then a second button to accept the measurement).

If BQL and fq_codel by default are 70% as good as a tuned setup,
there's more space to argue that all setups must be tuned, but then the
question is how do they fare against an old, non-BQL, non-fq-codel setup?
If they are considerably better, it may still be worthwhile.

David Lang
Sebastian Moeller
2014-07-26 11:30:08 UTC
Permalink
Hi David,
Post by V***@vt.edu
Post by R.
Further, this function could be auto-scheduled or made enabled on
router boot up.
Yeah, if such a thing worked, it would be good.
(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)
Post by R.
I must be missing something important which prevents this. What is it?
There are a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code. The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM)), and "server is slashdotted" (which
is a bit harder to deal with). Remember that there are some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely. No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).
And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.
Dealing with non-obvious error conditions is also a challenge - a router
may only boot once every few months. And if you happen to be booting just
as a BGP routing flap is causing your traffic to take a vastly suboptimal
path, you may end up encoding a vastly inaccurate setting and have it stuck
there, causing suckage for non-obvious reasons for the non-technical, so you
really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....
have the router record its finding, and then repeat the test periodically, recording its finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough); otherwise, if there aren't many samples, schedule a test 'soon', and if there are a lot of samples, schedule a test in a while.
Yeah, keeping some history to “predict” when to measure next sounds clever.
However, I think the big question is how much tuning is required.
I assume in most cases you need to measure the home router's bandwidth rarely (say on DSL only after a re-sync with the DSLAM), but you need to measure the bandwidth early, as only then can you properly shape the downlink. And we need to know the link's capacity to use traffic shaping so that BQL and fq_codel in the router have control over the bottleneck queue… An equivalent of BQL and fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need, because then BQL and fq_codel on the router would be all that is required. But that does not seem like it is happening anytime soon, so we still need to work around the limitations in the equipment for a long time to come, I fear.
If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement).
If BQL and fq_codel by default are 70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how do they fare against an old, non-BQL, non-fq-codel setup? If they are considerably better, it may still be worthwhile.
Best Regards
Sebastian
David Lang
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
David Lang
2014-07-26 20:39:59 UTC
Permalink
Post by Sebastian Moeller
Hi David,
Post by V***@vt.edu
Post by R.
Further, this function could be auto-scheduled or made enabled on
router boot up.
Yeah, if such a thing worked, it would be good.
(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)
Post by R.
I must be missing something important which prevents this. What is it?
There are a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code. The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM)), and "server is slashdotted" (which
is a bit harder to deal with). Remember that there are some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely. No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).
And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.
Dealing with non-obvious error conditions is also a challenge - a router
may only boot once every few months. And if you happen to be booting just
as a BGP routing flap is causing your traffic to take a vastly suboptimal
path, you may end up encoding a vastly inaccurate setting and have it stuck
there, causing suckage for non-obvious reasons for the non-technical, so you
really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....
have the router record its finding, and then repeat the test periodically, recording its finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough); otherwise, if there aren't many samples, schedule a test 'soon', and if there are a lot of samples, schedule a test in a while.
Yeah, keeping some history to “predict” when to measure next sounds clever.
However, I think the big question is how much tuning is required.
I assume in most cases you need to measure the home router's bandwidth rarely
(say on DSL only after a re-sync with the DSLAM), but you need to measure the
bandwidth early, as only then can you properly shape the downlink. And we need
to know the link's capacity to use traffic shaping so that BQL and fq_codel in
the router have control over the bottleneck queue… An equivalent of BQL and
fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need,
because then BQL and fq_codel on the router would be all that is required. But
that does not seem like it is happening anytime soon, so we still need to
work around the limitations in the equipment for a long time to come, I fear.
by how much tuning is required, I wasn't meaning how frequently to tune, but how
close default settings can come to the performance of an expertly tuned setup.

Ideally the tuning takes into account the characteristics of the hardware of the
link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN
tagging, ethernet with jumbo packet support for example), then you have overhead
from the encapsulation that you would ideally take into account when tuning
things.

the question I'm talking about below is how much do you lose compared to the
ideal if you ignore this sort of thing and just assume that the wire is dumb and
puts the bits on it as you send them? By dumb I mean don't even allow for
inter-packet gaps, don't measure the bandwidth, don't try to pace inbound
connections by the timing of your acks, etc. Just run BQL and fq_codel and start
the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and
shrink them based on long-term passive observation of the sender.

If you end up only losing 5-10% of your overall network performance by ignoring
the details of the wire, then we should ignore them by default.

If however, not measuring anything first results in significantly worse
performance than a tuned setup, then we need to figure out how to do the
measurements needed for tuning.

Some people seem to have fallen into the "perfect is the enemy of good enough"
trap on this topic. They are so fixated on getting the absolute best performance
out of a link that they are forgetting how bad the status-quo is right now.

If you look at the graph that Dave Taht put on page 6 of his slide deck
http://snapon.lab.bufferbloat.net/~d/Presos/CaseForComprehensiveQueueManagement/assets/player/KeynoteDHTMLPlayer.html#5
it's important to realize that even the worst of the BQL+fq_codel graphs is
worlds better than the default setting. While it would be nice to get to the
green trace on the left, even getting to the middle traces instead of the black
trace on the right would be a huge win for the public.

David Lang
Post by Sebastian Moeller
If a connection with BQL and fq_codel is 90% as good as a tuned setup,
default to untuned unless the user explicitly hits a button to measure (and
then a second button to accept the measurement).
If BQL and fq_codel by default are 70% as good as a tuned setup, there's
more space to argue that all setups must be tuned, but then the question is
how do they fare against an old, non-BQL, non-fq-codel setup? If they are
considerably better, it may still be worthwhile.
Sebastian Moeller
2014-07-26 21:25:35 UTC
Permalink
Hi David,
Post by Sebastian Moeller
Hi David,
Post by V***@vt.edu
Post by R.
Further, this function could be auto-scheduled or made enabled on
router boot up.
Yeah, if such a thing worked, it would be good.
(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)
Post by R.
I must be missing something important which prevents this. What is it?
There's a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code. The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavalailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM), and "server is slashdotted" (whci
is a bit harder to deal with. Remember that there's some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely. No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).
And if you're in Izbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.
Dealing with non-obvious error conditions is also a challenge - a router
may only boot once every few months. And if you happen to be booting just
as a BGP routing flap is causing your traffic to take a vastly suboptimal
path, you may end up encoding a vastly inaccurate setting and have it stuck
there, causing suckage for non-obvious reasons for the non-technical, so you
really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....
have the router record it's finding, and then repeat the test periodically, recording it's finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough), otherwise, if there aren't many samples, schedule a test 'soon' if there are a lot of samples, schedule a test in a while.
Yeah, keeping some history to “predict” when to measure next sounds clever.
However, I think the big question is how much the tuning is required.
I assume in most cases you need to measure the home-routers bandwidth rarely (say on DSL only after a re-sync with the DSLAM), but you need to measure the bandwidth early as only then you can properly shape the downlink. And we need to know the link’s capacity to use traffic shaping so that BQL and fq_codel in the router have control over the bottleneck queue… An equivalent of BQL and fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need, because then BQL and fq_codel on the router would be all that is required. But that does not seem like it is happening anytime soon, so we still need to workaround the limitations in the equipment fr a long time to come, I fear.
by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
Good question.
Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per-packet overhead, into account); the broken lines show the same system with just the link layer adjustments and per-packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to a 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring latency with ping and UDP probes). As you can see from the plot, just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost an 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
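For anyone wondering why the link layer adjustment matters this much, here is a rough back-of-the-envelope sketch (my own illustration, not part of the measurement above) of how ATM/AAL5 cell framing inflates the on-the-wire size of a packet; the 40 bytes of per-packet overhead is only a placeholder, the real value depends on the ISP's encapsulation:

import math

def atm_wire_bytes(ip_bytes, overhead=40):
    # Payload plus per-packet overhead is carried in 48-byte ATM cell payloads,
    # and every 48-byte cell costs 53 bytes on the wire, so even large packets
    # pay at least 53/48 (~10.4%) extra and small packets pay far more.
    cells = math.ceil((ip_bytes + overhead) / 48)
    return cells * 53

for size in (1500, 500, 100, 64):
    wire = atm_wire_bytes(size)
    print(f"{size:5d} B IP packet -> {wire:5d} B on the wire "
          f"({100 * (wire - size) / size:.0f}% overhead)")

So a shaper that is set to 95% of the sync rate but counts only the IP bytes still lets the modem's own buffer fill up, which is exactly what the broken lines show.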
If you end up only loosing 5-10% of your overall network performance by ignoring the details of the wire, then we should ignore them by default.
If however, not measuring anything first results in significantly worse performance than a tuned setup, then we need to figure out how to do the measurements needed for tuning.
Agreed.
Some people seem to have fallen into the "perfect is the enemy of good enough" trap on this topic. They are so fixated on getting the absolute best performance out of a link that they are forgetting how bad the status-quo is right now.
If you look at the graph that Dave Taht put on page 6 of his slide deck http://snapon.lab.bufferbloat.net/~d/Presos/CaseForComprehensiveQueueManagement/assets/player/KeynoteDHTMLPlayer.html#5 it's important to realize that even the worst of the BQL+fq_codel graphs is worlds better than the default setting, while it would be nice to get to the green trace on the left, even getting to the middle traces instead of the black trace on the right would be a huge win for the public.
Just to note: in the plot above the connection to the DSL modem was always mediated by fq_codel and BQL, and since shaping was used BQL would not come into effect…

Best Regards
Sebastian
David Lang
Post by Sebastian Moeller
If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement)
If BQL and fw_codel by default are M70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how to they fare against a old, non-BQL, non-fq-codel setup? if they are considerably better, it may still be worthwhile.
David Lang
2014-07-26 21:45:59 UTC
Permalink
Post by Sebastian Moeller
Post by David Lang
by how much tuning is required, I wasn't meaning how frequently to tune, but
how close default settings can come to the performance of a expertly tuned
setup.
Good question.
Post by David Lang
Ideally the tuning takes into account the characteristics of the hardware of
the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN,
VLAN tagging, ethernet with jumbo packet support for example), then you have
overhead from the encapsulation that you would ideally take into account when
tuning things.
the question I'm talking about below is how much do you loose compared to the
idea if you ignore this sort of thing and just assume that the wire is dumb
and puts the bits on them as you send them? By dumb I mean don't even allow
for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound
connections by the timing of your acks, etc. Just run BQL and fq_codel and
start the BQL sizes based on the wire speed of your link (Gig-E on the 3800)
and shrink them based on long-term passive observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at
home. The solid lines in the attached plot show the results for proper shaping
with SQM (shaping to 95% of del link rates of downstream and upstream while
taking the link layer properties, that is ATM encapsulation and per packet
overhead into account) the broken lines show the same system with just the
link layer adjustments and per packet overhead adjustments disabled, but still
shaping to 95% of link rate (this is roughly equivalent to 15% underestimation
of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams
up, 4 tcp steams down while measuring latency with ping and UDP probes). As
you can see from the plot just getting the link layer encapsulation wrong
destroys latency under load badly. The host is ~52ms RTT away, and with
fq_codel the ping time per leg is just increased one codel target of 5ms each
resulting in an modest latency increase of ~10ms with proper shaping for a
total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost
double), so RTT increases by ~43ms. Also note how the extremes for the broken
lines are much worse than for the solid lines. In short I would estimate that
a slight misjudgment (15%) results in almost 80% increase of latency under
load. In other words getting the rates right matters a lot. (I should also
note that in my setup there is a secondary router that limits RTT to max
300ms, otherwise the broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat version?
(without any traffic shaping)

I agree that going from 65ms to 95ms seems significant, but if the stock version
goes up above 1000ms, then I think we are talking about things that are
'close'

assuming that latency under load without the improvements gets >1000ms

fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000

fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19

slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05

rather than looking at how much worse it is than the ideal, look at how much
closer it is to the ideal than to the bloated version.

David Lang
David Lang
2014-07-26 22:24:10 UTC
Permalink
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
by how much tuning is required, I wasn't meaning how frequently to tune,
but how close default settings can come to the performance of a expertly
tuned setup.
Good question.
Post by David Lang
Ideally the tuning takes into account the characteristics of the hardware
of the link layer. If it's IP encapsulated in something else (ATM, PPPoE,
VPN, VLAN tagging, ethernet with jumbo packet support for example), then
you have overhead from the encapsulation that you would ideally take into
account when tuning things.
the question I'm talking about below is how much do you loose compared to
the idea if you ignore this sort of thing and just assume that the wire is
dumb and puts the bits on them as you send them? By dumb I mean don't even
allow for inter-packet gaps, don't measure the bandwidth, don't try to
pace inbound connections by the timing of your acks, etc. Just run BQL and
fq_codel and start the BQL sizes based on the wire speed of your link
(Gig-E on the 3800) and shrink them based on long-term passive observation
of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at
home. The solid lines in the attached plot show the results for proper
shaping with SQM (shaping to 95% of del link rates of downstream and
upstream while taking the link layer properties, that is ATM encapsulation
and per packet overhead into account) the broken lines show the same system
with just the link layer adjustments and per packet overhead adjustments
disabled, but still shaping to 95% of link rate (this is roughly equivalent
to 15% underestimation of the packet size). The actual theist is
netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring
latency with ping and UDP probes). As you can see from the plot just
getting the link layer encapsulation wrong destroys latency under load
badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg
is just increased one codel target of 5ms each resulting in an modest
latency increase of ~10ms with proper shaping for a total of ~65ms, with
improper shaping RTTs increase to ~95ms (they almost double), so RTT
increases by ~43ms. Also note how the extremes for the broken lines are
much worse than for the solid lines. In short I would estimate that a
slight misjudgment (15%) results in almost 80% increase of latency under
load. In other words getting the rates right matters a lot. (I should also
note that in my setup there is a secondary router that limits RTT to max
300ms, otherwise the broken lines might look even worse...)
is this with BQL/fq_codel in both directions or only in one direction?

David Lang
Post by David Lang
what is the latency like without BQL and codel? the pre-bufferbloat version?
(without any traffic shaping)
I agree that going from 65ms to 95ms seems significant, but if the stock
version goes into up above 1000ms, then I think we are talking about things
that are 'close'
assuming that latency under load without the improvents got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
rather than looking at how much worse it is than the ideal, look at how much
closer it is to the ideal than to the bloated version.
David Lang
Sebastian Moeller
2014-07-27 09:50:05 UTC
Permalink
Hi David,
Post by David Lang
Post by Sebastian Moeller
by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
Good question.
Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of del link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead into account) the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased one codel target of 5ms each resulting in an modest latency increase of ~10ms with proper shaping for a total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
is this with BQL/fq_codel in both directions or only in one direction?
So by shaping to below line rate the bottleneck is actually happening inside cerowrt, and there I run BQL (which does not matter since, due to shaping, the NIC's buffer does not fill up anyway) and fq_codel in both directions.

Best Regards
Sebastian
Post by David Lang
David Lang
what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
I agree that going from 65ms to 95ms seems significant, but if the stock version goes into up above 1000ms, then I think we are talking about things that are 'close'
assuming that latency under load without the improvents got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version.
David Lang
Sebastian Moeller
2014-07-26 22:39:23 UTC
Permalink
Hi David,
Post by Sebastian Moeller
by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
Good question.
Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of del link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead into account) the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased one codel target of 5ms each resulting in an modest latency increase of ~10ms with proper shaping for a total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line; with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP-supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
I agree that going from 65ms to 95ms seems significant, but if the stock version goes into up above 1000ms, then I think we are talking about things that are ‘close'
Well, if we include outliers (and we should, as enough outliers will quickly degrade the FPS and voip suitability of an otherwise responsive system), stock and improper shaping are in the >1000ms worst-case range, while proper SQM bounds this to 100ms.
assuming that latency under load without the improvents got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
The sign seems off as fast < slow? I like this best ;)
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
But fast < slow and hence this ratio should be <0?
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
and this >0?
rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version.
David Lang
David Lang
2014-07-26 22:53:37 UTC
Permalink
Post by Sebastian Moeller
Hi David,
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
by how much tuning is required, I wasn't meaning how frequently to tune,
but how close default settings can come to the performance of a expertly
tuned setup.
Good question.
Post by David Lang
Ideally the tuning takes into account the characteristics of the hardware
of the link layer. If it's IP encapsulated in something else (ATM, PPPoE,
VPN, VLAN tagging, ethernet with jumbo packet support for example), then
you have overhead from the encapsulation that you would ideally take into
account when tuning things.
the question I'm talking about below is how much do you loose compared to
the idea if you ignore this sort of thing and just assume that the wire is
dumb and puts the bits on them as you send them? By dumb I mean don't even
allow for inter-packet gaps, don't measure the bandwidth, don't try to pace
inbound connections by the timing of your acks, etc. Just run BQL and
fq_codel and start the BQL sizes based on the wire speed of your link
(Gig-E on the 3800) and shrink them based on long-term passive observation
of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at
home. The solid lines in the attached plot show the results for proper
shaping with SQM (shaping to 95% of del link rates of downstream and
upstream while taking the link layer properties, that is ATM encapsulation
and per packet overhead into account) the broken lines show the same system
with just the link layer adjustments and per packet overhead adjustments
disabled, but still shaping to 95% of link rate (this is roughly equivalent
to 15% underestimation of the packet size). The actual theist is
netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring
latency with ping and UDP probes). As you can see from the plot just getting
the link layer encapsulation wrong destroys latency under load badly. The
host is ~52ms RTT away, and with fq_codel the ping time per leg is just
increased one codel target of 5ms each resulting in an modest latency
increase of ~10ms with proper shaping for a total of ~65ms, with improper
shaping RTTs increase to ~95ms (they almost double), so RTT increases by
~43ms. Also note how the extremes for the broken lines are much worse than
for the solid lines. In short I would estimate that a slight misjudgment
(15%) results in almost 80% increase of latency under load. In other words
getting the rates right matters a lot. (I should also note that in my setup
there is a secondary router that limits RTT to max 300ms, otherwise the
broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat version?
(without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken
line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings
delayed for > 1000ms, just as with the broken line, with proper shaping even
extreme pings stay < 100ms). But as I said before I need to run through my ISP
supplied primary router (not just a dumb modem) that also tries to bound the
latencies under load to some degree. Actually I just repeated the test
connected directly to the primary router and get the same ~95ms average ping
time with frequent extremes > 1000ms, so it looks like just getting the
shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely

you have

debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?

and are you measuring the latency impact when uploading or downloading?

I think a lot of people would be happy with 95ms average pings on a loaded
connection, even with occasional outliers. It's far better than sustained
multi-second ping times which is what I've seen with stock setups.

but if no estimate is this bad, how bad is it if you use as your estimate the
'rated' speed of your DSL (i.e. what the ISP claims they are providing you)
instead of the fully accurate speed that includes accounting for ATM
encapsulation?

It's also worth figuring out if this problem would remain in place if you didn't
have to go through the ISP router and were running fq_codel on that router. As
long as fixing bufferbloat involves esoteric measurements and tuning, it's not
going to be solved, but if it could be solved by people flashing openwrt onto
their DSL router and then using the defaults, it could gain traction fairly
quickly.
Post by Sebastian Moeller
Post by David Lang
I agree that going from 65ms to 95ms seems significant, but if the stock
version goes into up above 1000ms, then I think we are talking about things
that are ‘close'
Well if we include outliers (and we should as enough outliers will
degrade the FPS and voip suitability of an otherwise responsive system
quickly) stock and improper shaping are in the >1000ms worst case range, while
proper SQM bounds this to 100ms.
Post by David Lang
assuming that latency under load without the improvents got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
The sign seems off as fast < slow? I like this best ;)
yep, I reversed fast/slow in all of these
Post by Sebastian Moeller
Post by David Lang
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
But Fast < Slow and hence this ration should be <0?
1 not 0, but yes, this is really slow/fast
Post by Sebastian Moeller
Post by David Lang
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
and this >0?
and this is really fast/slow

David Lang
Sebastian Moeller
2014-07-26 23:39:08 UTC
Permalink
Hi David,
Post by David Lang
Post by Sebastian Moeller
Hi David,
Post by Sebastian Moeller
by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
Good question.
Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of del link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead into account) the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased one codel target of 5ms each resulting in an modest latency increase of ~10ms with proper shaping for a total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line, with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely
you have
debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
Well more like:

Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that shapes the traffic -> ISP router -> ADSL -> internet -> server

I assume that Dave debloated these servers well, but it should not really matter as the problem is the buffers on both ends of the bottleneck ADSL link.
Post by David Lang
and are you measuring the latency impact when uploading or downloading?
No, I measure the latency impact of saturating both up- and downlink, pretty much the worst-case scenario.
Post by David Lang
I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers.
No, that is too low an aim; this still is not usable for real-time applications. We should aim for base RTT plus 10ms. (For very slow links we need to cut some slack, but for > 3Mbps 10ms should be achievable.)
Post by David Lang
It's far better than sustained multi-second ping times which is what I've seen with stock setups.
True, compared to multiple seconds even <1000ms would be a really great improvement, but it is still not enough.
Post by David Lang
but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation?
Well, ~95ms with outliers > 1000ms, just as bad as no estimate. I shaped 5% below rated speed as reported by the DSL modem, so disabling the ATM link layer adjustments (as shown in the broken lines in the plot) basically increased the effective shaped rate by ~13%, to effectively 107% of line rate; your proposal would be line rate and no link layer adjustments, or effectively 110% of line rate. I do not feel like repeating this experiment right now, as I think the data so far shows that even with less misjudgment the bloat effect is fully visible. Not accounting for ATM framing carries a ~10% cost in link speed, as ATM packet size on the wire increases by >= ~10%.
Post by David Lang
It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were runing fq_codel on that router.
If the DSL modem were debloated, at least on upstream no shaping would be required any more; but that does not fix the need for downstream shaping (and bandwidth estimation) until the head-end gear is debloated…
Post by David Lang
As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flahing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly.
But as there are only very few DSL modems with open sources (especially of the DSL chips), this is just as esoteric ;) Really, if equipment manufacturers could be convinced to take these issues seriously and actually fix their gear that would be best. But this does not look like it is happening on the fast track. (Even DOCSIS developer CableLabs punted on requiring codel or fq_codel in DOCSIS modems, since they think that the required timestamps are too "expensive" on the device class they want to use for modems. They opted for PIE, much better than what we have right now but far away from my goal of a 10ms latency increase under load...)
Post by David Lang
Post by Sebastian Moeller
I agree that going from 65ms to 95ms seems significant, but if the stock version goes into up above 1000ms, then I think we are talking about things that are ‘close'
Well if we include outliers (and we should as enough outliers will degrade the FPS and voip suitability of an otherwise responsive system quickly) stock and improper shaping are in the >1000ms worst case range, while proper SQM bounds this to 100ms.
assuming that latency under load without the improvents got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
The sign seems off as fast < slow? I like this best ;)
yep, I reversed fast/slow in all of these
Post by Sebastian Moeller
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
But Fast < Slow and hence this ration should be <0?
1 not 0, but yes, this is really slow/fast
Post by Sebastian Moeller
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
and this >0?
and this is really fast/slow
What about taking the latency difference and rescaling it with a reference time, like say the time a photon would take to travel once around the equator, or along the earth's diameter?

Best Regards
Sebastian
Post by David Lang
David Lang
David Lang
2014-07-27 00:49:37 UTC
Permalink
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Hi David,
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
by how much tuning is required, I wasn't meaning how frequently to tune,
but how close default settings can come to the performance of a expertly
tuned setup.
Good question.
Post by David Lang
Ideally the tuning takes into account the characteristics of the hardware
of the link layer. If it's IP encapsulated in something else (ATM, PPPoE,
VPN, VLAN tagging, ethernet with jumbo packet support for example), then
you have overhead from the encapsulation that you would ideally take into
account when tuning things.
the question I'm talking about below is how much do you loose compared to
the idea if you ignore this sort of thing and just assume that the wire
is dumb and puts the bits on them as you send them? By dumb I mean don't
even allow for inter-packet gaps, don't measure the bandwidth, don't try
to pace inbound connections by the timing of your acks, etc. Just run BQL
and fq_codel and start the BQL sizes based on the wire speed of your link
(Gig-E on the 3800) and shrink them based on long-term passive
observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at
home. The solid lines in the attached plot show the results for proper
shaping with SQM (shaping to 95% of del link rates of downstream and
upstream while taking the link layer properties, that is ATM encapsulation
and per packet overhead into account) the broken lines show the same
system with just the link layer adjustments and per packet overhead
adjustments disabled, but still shaping to 95% of link rate (this is
roughly equivalent to 15% underestimation of the packet size). The actual
theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while
measuring latency with ping and UDP probes). As you can see from the plot
just getting the link layer encapsulation wrong destroys latency under
load badly. The host is ~52ms RTT away, and with fq_codel the ping time
per leg is just increased one codel target of 5ms each resulting in an
modest latency increase of ~10ms with proper shaping for a total of ~65ms,
with improper shaping RTTs increase to ~95ms (they almost double), so RTT
increases by ~43ms. Also note how the extremes for the broken lines are
much worse than for the solid lines. In short I would estimate that a
slight misjudgment (15%) results in almost 80% increase of latency under
load. In other words getting the rates right matters a lot. (I should also
note that in my setup there is a secondary router that limits RTT to max
300ms, otherwise the broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat
version? (without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken
line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings
delayed for > 1000ms, just as with the broken line, with proper shaping even
extreme pings stay < 100ms). But as I said before I need to run through my
ISP supplied primary router (not just a dumb modem) that also tries to bound
the latencies under load to some degree. Actually I just repeated the test
connected directly to the primary router and get the same ~95ms average ping
time with frequent extremes > 1000ms, so it looks like just getting the
shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely
you have
debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that
shapes the traffic -> ISP router -> ADSL -> internet -> server
I assume that Dave debated these servers well, but it should not really matter
as the problem are the buffers on both ends of the bottleneck ADSL link.
right, I was forgetting that unless you are the bottleneck, you aren't buffering
anything and so debloating makes no difference. In a case like yours where you
can't debloat the actual bottleneck, the best that you can do is to artificially
become the bottleneck by shaping the traffic. But on the download side it's much
harder.

What are we aiming for? something that will show the problem clearly so that
fixes can be put in the right place? or a work-around to use in the meantime?

I think both need to be pursued, but we need to be clear on what is being done
for each one.

If having BQL+fq_codel with defaults would solve the problem if it was on the
right routers, we need to show that.

Then, because we can't get the fixes on the right routers and need to
work-around the problem by artificially becoming the bottleneck, we need to show
that the 95% that we shape to is throwing away 5% of your capacity and make that
clear to the users.

otherwise we will risk getting to the point where it will never get fixed
because the ISPs will look at their routers and say that bufferbloat can't
possibly be a problem as they never have large queues (because we are doing the
workarounds).
Post by Sebastian Moeller
Post by David Lang
and are you measuring the latency impact when uploading or downloading?
No I measure the impact of latency of saturating both up- and downlink,
pretty much the worst case scenario.
I think we need to test this in each direction independently.

Cerowrt can do a pretty good job of keeping the uplink from being saturated, but
it can't do a lot for the downlink.
Post by Sebastian Moeller
Post by David Lang
I think a lot of people would be happy with 95ms average pings on a loaded
connection, even with occasional outliers.
No that is too low an aim, this still is not useable for real time
applications, we should aim for base RTT plus 10ms. (For very slow links we
need to cut some slack but for > 3Mbps 10ms should be achievable )
perfect is the enemy of good enough.

There's achievable if every router is tuned to exactly the right conditions and
there's achievable for coarse settings that can be widely deployed. Get the
second out while continuing to work on making the first easier.

Residential connections only come in a smallish number of sizes; it shouldn't be
too hard to do a few probes and guess which size is in use, then set the
bandwidth to 90% of that standard size and you should be pretty good without
further tuning.
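A very rough sketch of that probing idea (entirely illustrative; the tier table and the 90% margin are placeholder assumptions, and as the reply below points out, ATM links would still defeat it):

# Hypothetical table of common residential downstream tiers in Mbit/s; a real
# implementation would need a properly researched list per region/ISP.
STANDARD_TIERS_MBPS = [1, 2, 3, 6, 8, 12, 16, 25, 50, 100]

def guess_shaper_rate(measured_mbps, margin=0.90):
    """Snap a crude downlink probe to the nearest standard tier and shape to
    90% of it (both the tier list and the margin are placeholders)."""
    tier = min(STANDARD_TIERS_MBPS, key=lambda t: abs(t - measured_mbps))
    return tier * margin

# example: a probe measuring ~15.1 Mbit/s is treated as a 16 Mbit/s tier
print(guess_shaper_rate(15.1))   # -> 14.4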
Post by Sebastian Moeller
Post by David Lang
It's far better than sustained multi-second ping times which is what I've
seen with stock setups.
True, but compared to multi seconds even <1000ms would be a really great
improvement, but also not enough.
Post by David Lang
but if no estimate is this bad, how bad is it if you use as your estimate the
'rated' speed of your DSL (i.e. what the ISP claims they are providing you)
instead of the fully accurate speed that includes accounting for ATM
encapsulation?
Well ~95ms with outliers > 1000ms, just as bad as no estimate. I shaped
5% below rated speed as reported by the DSL modem, so disabling the ATM link
layer adjustments (as shown in the broken lines in the plot), basically
increased the effective shaped rate by ~13% or to effectively 107% of line
rate, your proposal would be line rate and no link layer adjustments or
effectively 110% of line rate; I do not feel like repeating this experiment
right now as I think the data so far shows that even with less misjudgment the
bloat effect is fully visible ) Not accounting for ATM framing carries a ~10%
cost in link speed, as ATM packet size on the wire increases by >= ~10%.
so what if you shape to 90% of rated speed (no allowance for ATM vs other
transports)?
Post by Sebastian Moeller
Post by David Lang
It's also worth figuring out if this problem would remain in place if you
didn't have to go through the ISP router and were runing fq_codel on that
router.
If the DSL modem would be debloated at least on upstream no shaping
would be required any more; but that does not fix the need for downstream
shaping (and bandwidth estimation) until the head end gear is debloated..
right, I was forgetting this earlier.
Post by Sebastian Moeller
Post by David Lang
As long as fixing bufferbloat involves esoteric measurements and tuning, it's
not going to be solved, but if it could be solved by people flahing openwrt
onto their DSL router and then using the defaults, it could gain traction
fairly quickly.
But as there are only very few DSL modems with open sources (especially
of the DSL chips) this just as esoteric ;) Really if equipment manufactures
could be convinced to take these issues seriously and actually fix their gear
that would be best. But this does not look like it is happening on the fast
track. (Even DOCSIS developer cable labs punted on requiring codel or fq_codel
in DOCSIS modems since the think that the required timestamps are to
“expensive” on the device class they want to use for modems. They opted for
PIE, much better than what we have right now but far away from my latency
under load increase of 10ms...)
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
I agree that going from 65ms to 95ms seems significant, but if the stock
version goes into up above 1000ms, then I think we are talking about things
that are ‘close'
Well if we include outliers (and we should as enough outliers will
degrade the FPS and voip suitability of an otherwise responsive system
quickly) stock and improper shaping are in the >1000ms worst case range,
while proper SQM bounds this to 100ms.
Post by David Lang
assuming that latency under load without the improvents got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
The sign seems off as fast < slow? I like this best ;)
yep, I reversed fast/slow in all of these
Post by Sebastian Moeller
Post by David Lang
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
But Fast < Slow and hence this ration should be <0?
1 not 0, but yes, this is really slow/fast
Post by Sebastian Moeller
Post by David Lang
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
and this >0?
and this is really fast/slow
What about taking the latency difference an re;aging it with a reference
time, like say the time a photon would take to travel once around the equator,
or the earth’s diamater?
how about latency difference scaled by the time to send one 1500 byte packet at
the measured throughput?

This would factor out the data rate and would not be affected by long distance
links.
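As an illustration of that metric (my own arithmetic, not from this thread; the example numbers are made up):

def normalized_latency_increase(loaded_rtt_ms, unloaded_rtt_ms,
                                throughput_mbps, packet_bytes=1500):
    # Express the added delay under load in units of the time needed to
    # serialize one full-size packet at the measured throughput, which
    # factors out both the raw data rate and the path length.
    serialization_ms = packet_bytes * 8 / (throughput_mbps * 1000.0)
    return (loaded_rtt_ms - unloaded_rtt_ms) / serialization_ms

# example: a 3 Mbit/s link (4 ms per 1500-byte packet) whose RTT rises from
# 52 ms unloaded to 65 ms loaded scores (65 - 52) / 4 = 3.25 "packet times"
print(normalized_latency_increase(65, 52, 3.0))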

David Lang
Sebastian Moeller
2014-07-27 11:17:16 UTC
Permalink
Hi David,
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Hi David,
Post by Sebastian Moeller
by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
Good question.
Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks I just did a quick experiment with my ADSL2+ koine at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of del link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead into account) the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased one codel target of 5ms each resulting in an modest latency increase of ~10ms with proper shaping for a total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line, with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely
you have
debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that shapes the traffic -> ISP router -> ADSL -> internet -> server
I assume that Dave debated these servers well, but it should not really matter as the problem are the buffers on both ends of the bottleneck ADSL link.
right, I was forgetting that unless you are the bottleneck, you aren't buffering anything and so debloating makes no difference. In a case like yours where you can't debloat the actual bottleneck, the best that you can do is to artificially become the bottleneck by shaping the traffic. but on the download side it's much harder.
Actually, all RRUL plots that Dave collected show that ingress shaping does work quite well on average. It will fail under a severe DOS, but let's face it, those can only be mitigated by the ISP anyway…
What are we aiming for? something that will show the problem clearly so that fixes can be put in the right place? or a work-around to use in the meantime?
Mmmh, I aim for decent internet connections for home users like myself. It would be great if ISPs could use their leverage on equipment manufacturers to implement the current state-of-the-art solution in broadband gear; realistically, even if this started today we would still face a long transition time, so I am all for putting the smarts into home routers. At least the end user has enough incentive to put in the (small amount of) work required to mitigate bad buffer management...
I think both need to be pursued, but we need to be clear on what is being done for each one.
I have no connections into telcos, ISPs, or OEMs, so all I can help with is getting the "work-around" into good shape and ready for deployment. Arguably convincing ISPs might be more important.
If having BQL+fq_codel with defaults would solve the problem if it was on the right routers, we need to show that.
I think Dave has pretty much shown this. Note though that it is rather traffic shaping plus fq_codel; BQL would be needed in the DSL drivers on both sides of the link.
Then, because we can't get the fixes on the right routers and need to work-around the problem by artificially becoming the bottleneck, we need to show that the 95% that we shape to is throwing away 5% of your capacity and make that clear to the users.
I think if you google for "router qos" you will find plenty of pages already describing the rationale and the bandwidth sacrifice required, so that might already be public knowledge.
otherwise we will risk getting to the point where it will never get fixed because the ISPs will look at their routers and say that bufferbloat can't possibly be a problem as they never have large queues (because we are doing the workarounds.
Honestly, for an ISP the best solution is us shaping our connections, as that reduces the worst-case bandwidth use per user and might allow higher oversubscription. We need to find economic incentives for ISPs to implement BQL equivalents in the broadband gear. In theory it should give a competitive advantage to be able to advertise better gaming/voip suitability, but many users really have no real choice of ISP. I could imagine that with the big push away from circuit-switched telephony to voip even for carriers, ISPs might get more interested in improving VOIP resilience and usability under load...
Post by Sebastian Moeller
Post by David Lang
and are you measuring the latency impact when uploading or downloading?
No I measure the impact of latency of saturating both up- and downlink, pretty much the worst case scenario.
I think we need to test this in each direction independently.
Rich Brown has made a nice script to test that, betterspeedtest.sh at https://github.com/richb-hanover/CeroWrtScripts
For figuring out the required shaping point it is easier to work on both "legs" independently, but to assess worst-case behavior I think both directions need to be saturated.
There is a pretty good description of a quick bufferloat test on http://www.bufferbloat.net/projects/cerowrt/wiki/Quick_Test_for_Bufferbloat
Cerowrt can do a pretty good job of keeping the uplink from being saturated, but it can't do a lot for the downlink.
Well, except it does. Downlink shaping is less reliable than uplink shaping, but most traffic sources, TCP or UDP, need to deal with the variable bandwidth of the internet anyway and implement some congestion control that treats packet loss as a congestion signal. So downlink shaping mostly works okay (even though I think Dave recommends shaping the downlink more aggressively than to 95% of link rate).
Post by Sebastian Moeller
Post by David Lang
I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers.
No, that is aiming too low; it still is not usable for real-time applications. We should aim for base RTT plus 10ms. (For very slow links we need to cut some slack, but for >3Mbps, 10ms should be achievable.)
perfect is the enemy of good enough.
Sure, but really, according to http://www.hh.se/download/18.70cf2e49129168da015800094780/7_7_delay.pdf we only have a 400ms budget for acceptable VoIP (I would love real psychophysics papers for that instead of Cisco marketing material), or 200ms one-way delay. With ~170ms RTT to the west coast (from a university wired network, so no ADSL delay involved), almost half of the budget is used up in a way that cannot be fixed easily. (It takes 66ms for light to travel half the earth's circumference, or 132ms RTT; assuming c(fiber) = 0.7*c(vacuum), it is rather 95ms one-way or 190ms RTT.) With ~100ms RTT from each end there is barely enough time left for data processing and transcoding.
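To make that budget concrete, a quick back-of-the-envelope version in Python (a sketch only; the 400ms total and the 0.7c fiber factor are the assumptions above, and the circumference is rounded):

# Rough propagation-delay budget for VoIP, using the numbers discussed above.
EARTH_CIRCUMFERENCE_KM = 40_000   # approximate
C_VACUUM_KM_S = 300_000           # speed of light in vacuum, km/s
FIBER_FACTOR = 0.7                # assumed propagation speed in fiber relative to vacuum

half_way_km = EARTH_CIRCUMFERENCE_KM / 2
one_way_vacuum_ms = half_way_km / C_VACUUM_KM_S * 1000
one_way_fiber_ms = half_way_km / (C_VACUUM_KM_S * FIBER_FACTOR) * 1000

print(f"one-way, vacuum: {one_way_vacuum_ms:.0f} ms (RTT {2 * one_way_vacuum_ms:.0f} ms)")
print(f"one-way, fiber:  {one_way_fiber_ms:.0f} ms (RTT {2 * one_way_fiber_ms:.0f} ms)")
print(f"left of a 400 ms RTT budget after the fiber RTT: {400 - 2 * one_way_fiber_ms:.0f} ms")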
There's “achievable if every router is tuned to exactly the right conditions” and there's “achievable with coarse settings that can be widely deployed”. Get the second out while continuing to work on making the first easier.
Okay, that part is easy: if you massively overshape, latency will be great, but bandwidth is compromised...
residential connections only come in a smallish number of sizes,
Except that with, say, DSL there is often a wide corridor of allowed sync speeds; e.g. the 50Mbps down / 10Mbps up VDSL2 package of DT will actually synchronize anywhere in a corridor of 50 to 27Mbps down and 10 to 5.5Mbps up (numbers are approximately right). That is almost a factor of 2, too much for a one-size-fits-all approach (say 90% of advertised speed).
it shouldn't be too hard to do a few probes and guess which size is in use, then set the bandwidth to 90% of that standard size and you should be pretty good without further tuning.
No. With ATM carriers (ADSL, some VDSL) the encapsulation overhead ranges from ~10% to >50% depending on packet size, so to get the bottleneck queue reliably under our control we would need to shape to ~50% of link speed, obviously a very hard sell. (And it is not easy to figure out whether the bottleneck link uses ATM or not, so there is no one size fits all.) We currently have no easy and quick way of detecting ATM link layers from cerowrt...
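To illustrate why the overhead is so packet-size dependent, a small sketch of the ATM/AAL5 cell math (the 48-in-53 cell split and the 8-byte AAL5 trailer are standard; the 40 bytes of per-packet encapsulation is only an assumed PPPoE/LLC-style example, the real value depends on the link configuration):

import math

CELL_PAYLOAD, CELL_SIZE, AAL5_TRAILER = 48, 53, 8

def atm_wire_bytes(ip_bytes, encap_overhead):
    """Bytes on the wire for one IP packet over ATM/AAL5 (padded to whole cells)."""
    payload = ip_bytes + encap_overhead + AAL5_TRAILER
    return math.ceil(payload / CELL_PAYLOAD) * CELL_SIZE

for encap in (0, 40):  # 0 = pure cell tax; 40 ~ assumed PPPoE/LLC-style per-packet overhead
    for size in (64, 200, 1500):
        wire = atm_wire_bytes(size, encap)
        print(f"encap={encap:2d}  {size:4d}B packet -> {wire:4d}B on the wire "
              f"({(wire / size - 1) * 100:3.0f}% overhead)")

Small packets (think VoIP) get hit hardest, which is why no single percentage works for all traffic mixes.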
Post by Sebastian Moeller
Post by David Lang
It's far better than sustained multi-second ping times which is what I've seen with stock setups.
True, but compared to multiple seconds even <1000ms would be a really great improvement; it is still not enough, though.
Post by David Lang
but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation?
Well, ~95ms with outliers >1000ms, so just as bad as no estimate. I shaped 5% below the rated speed as reported by the DSL modem, but with the ATM link-layer adjustments disabled (shown as the broken lines in the plot); that basically increased the effective shaped rate by ~13%, to effectively 107% of line rate. Your proposal would be line rate with no link-layer adjustments, or effectively 110% of line rate; I do not feel like repeating this experiment right now, as I think the data so far shows that even with less misjudgment the bloat effect is fully visible. (Not accounting for ATM framing alone carries a ~10% cost in link speed, as the ATM packet size on the wire increases by >= ~10%.)
so what if you shape to 90% of rated speed (no allowance for ATM vs other transports)?
I have not done that, but the typical recommendation for shaping ADSL links without taking the link-layer peculiarities into account is 85% (which should work for large packets, but can easily melt down with lots of smallish packets, like VoIP calls). I repeat: there is no simple one-size-fits-all shaping that will solve the bufferbloat issue for most home users in an acceptable fashion. (And I am not talking perfect here, it simply is not good enough.) Note that 90% will just account for the 48-in-53 ATM transport cost; it will not take the increased per-packet header into account.
Post by Sebastian Moeller
Post by David Lang
It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were running fq_codel on that router.
If the DSL modem were debloated, at least on upstream no shaping would be required any more; but that does not fix the need for downstream shaping (and bandwidth estimation) until the head-end gear is debloated...
right, I was forgetting this earlier.
Post by Sebastian Moeller
Post by David Lang
As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flashing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly.
But as there are only very few DSL modems with open sources (especially for the DSL chips), this is just as esoteric ;) Really, if equipment manufacturers could be convinced to take these issues seriously and actually fix their gear, that would be best. But this does not look like it is happening on the fast track. (Even CableLabs, the DOCSIS developers, punted on requiring codel or fq_codel in DOCSIS modems since they think the required timestamps are too “expensive” on the device class they want to use for modems. They opted for PIE, much better than what we have right now but far away from my target of a 10ms latency-under-load increase...)
Post by David Lang
Post by Sebastian Moeller
I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are ‘close'
Well, if we include outliers (and we should, as enough outliers will quickly degrade the FPS and VoIP suitability of an otherwise responsive system), stock and improper shaping are in the >1000ms worst-case range, while proper SQM bounds this to 100ms.
assuming that latency under load without the improvements gets >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
The sign seems off as fast < slow? I like this best ;)
yep, I reversed fast/slow in all of these
Post by Sebastian Moeller
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
But fast < slow, and hence this ratio should be <0?
1 not 0, but yes, this is really slow/fast
Post by Sebastian Moeller
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
and this >0?
and this is really fast/slow
What about taking the latency difference and relating it to a reference time, like say the time a photon would take to travel once around the equator, or along the earth's diameter?
how about latency difference scaled by the time to send one 1500 byte packet at the measured throughput?
So you propose latency difference / time to send one full packet at the measured speed
Not sure: think of two debloated setups, one fast, one slow. For the slow link we get 10ms/long, for the fast link we get 10ms/short; so assuming both keep the 10ms average latency increase, why should the two links show different bloat measures?
I really think the raw latency difference is what we should convince the users to look at. All one-number measures are going to be too simplistic, but at least for the difference you can easily estimate the effect on RTTs for relevant traffic...
This would factor out the data rate and would not be affected by long distance links.
I am not convinced that people on a slow link can afford latency increases any better than people on a fast link; I actually think it is the other way round. During the tuning process your measure might be helpful to find a good tradeoff between bandwidth and latency increase, though.
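For concreteness, a small sketch of the candidate one-number measures discussed above (the RTTs and rates are made-up inputs, purely illustrative):

def bloat_measures(rtt_idle_ms, rtt_loaded_ms, rate_mbps, mtu_bytes=1500):
    """Return several candidate one-number bloat measures for comparison."""
    diff_ms = rtt_loaded_ms - rtt_idle_ms                          # raw latency difference
    ratio = rtt_loaded_ms / rtt_idle_ms                            # loaded/idle ratio
    serialization_ms = mtu_bytes * 8 / (rate_mbps * 1e6) * 1000    # time to send one MTU-sized packet
    return diff_ms, ratio, diff_ms / serialization_ms

# Two hypothetical debloated links, both keeping the latency increase at ~10 ms:
for name, idle, loaded, rate in (("slow 3 Mbps", 40, 50, 3), ("fast 100 Mbps", 20, 30, 100)):
    d, r, s = bloat_measures(idle, loaded, rate)
    print(f"{name:14s} diff={d:4.1f} ms  loaded/idle={r:4.2f}  diff/packet-time={s:7.1f}")

The same 10ms difference turns into very different numbers once it is scaled by the packet serialization time, which is the point about fast versus slow links above.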
Best Regards
Sebastian
David Lang
Michael Richardson
2014-08-01 04:21:40 UTC
Permalink
On symmetric links, particularly PPP ones, one can use the LCP layer to do
echo requests to the first layer-3 device. This can be used to measure RTT
and through some math, the bandwidth.
On asymmetric links, my instinct is that if you can measure the downlink
speed through another mechanism, that one might be able to subtract, but I
can't think exactly how right now.
I'm thinking that one can observe the downlink speed by observing packet
arrival times/sizes for awhile --- the calculation might be too low if the
sender is congested otherwise, but the average should go up slowly.
At first, this means that subtracting the downlink bandwidth from the uplink
bandwidth will, I think, result in too high an uplink speed, which will
result in rate limiting to a too high value, which is bad.
But, is there something wrong with my notion?
My other notion is that the LCP packets could be time stamped by the PPP(oE)
gateway, and this would solve the asymmetry. This would take an IETF action
to make standard and a decade to get deployed, but it might be a clearly
measurable marketing win for ISPs.
--
Michael Richardson
-on the road-
Sebastian Moeller
2014-08-01 18:28:56 UTC
Permalink
HI Michael,
Post by Michael Richardson
On symmetric links, particularly PPP ones, one can use the LCP layer to do
echo requests to the first layer-3 device. This can be used to measure RTT
and through some math, the bandwidth.
Sure.
Post by Michael Richardson
On assymetric links, my instinct is that if you can measure the downlink
speed through another mechanism, that one might be able to subtract, but I
can't think exactly how right now.
I'm thinking that one can observe the downlink speed by observing packet
arrival times/sizes for awhile --- the calculation might be too low if the
sender is congested otherwise, but the average should go up slowly.
If you go this route, I would rather look at the minimum delay between incoming packets as a function of the size of the second packet.
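A sketch of that idea (assuming you can send back-to-back probe pairs with a varying second-packet size and record the arrival-time gaps; the sample data here is made up, there is no real probing code):

from collections import defaultdict

def estimate_downlink_bps(samples):
    """samples: iterable of (second_packet_bytes, interarrival_seconds) pairs.

    Keep the minimum gap per packet size (the least-queued samples), then fit
    gap = size / bandwidth + constant with a least-squares line; the slope is
    seconds per byte, i.e. 1 / bandwidth."""
    best = defaultdict(lambda: float("inf"))
    for size, gap in samples:
        best[size] = min(best[size], gap)
    xs, ys = zip(*sorted(best.items()))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    return 8 / slope  # bytes per second -> bits per second

# Fake measurements for a ~16 Mbit/s link (2e6 bytes/s) with some queueing noise:
fake = [(s, s / 2e6 + noise) for s in (200, 600, 1000, 1400) for noise in (0.0, 0.0004, 0.001)]
print(f"estimated downlink: {estimate_downlink_bps(fake) / 1e6:.1f} Mbit/s")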
Post by Michael Richardson
At first, this means that subtracting the downlink bandwidth from the uplink
bandwidth will, I think, result in too high an uplink speed, which will
result in rate limiting to a too high value, which is bad.
But given all the uncertainties, finding the proper shaping bandwidths is an iterative process right now anyway, just one that is best started with a decent initial guess. My thinking is that with a binary search I would want to see decent latency under load already after the first reduction...
Post by Michael Richardson
But, if there something wrong with my notion?
My other notion is that the LCP packets could be time stamped by the PPP(oE)
gateway, and this would solve the asymmetry.
If both devices were time-synchronized to a close enough delta, that would be great. Initial testing with ICMP timestamp requests makes me doubt the quality of synchronization (at least right now).
Post by Michael Richardson
This would take an IETF action
to make standard and a decade to get deployed, but it might be a clearly
measureable marketing win for ISPs.
But if the “grown-ups” can be made to act, wouldn't we rather see nice end-user-queryable SNMP information about the current up- and downlink rates (and at which protocol level, e.g. 2400Kbps down, 1103Kbps up on an ATM carrier)? (For all I know the DSLAMs/BRASes might already support this.)
Best Regards
Sebastian
Post by Michael Richardson
--
Michael Richardson
-on the road-
Wes Felter
2014-07-25 20:48:37 UTC
Permalink
The Netgear stock firmware measures bandwidth on every boot or link up
(not sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth? For
link bandwidth it seems like you can solve a lot of problems by
measuring to the first hop router. Does the packet pair technique work
on TDMA link layers like DOCSIS?
--
Wes Felter
David Lang
2014-07-25 20:57:44 UTC
Permalink
The Netgear stock firmware measures bandwidth on every boot or link up (not
sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth? For link
bandwidth it seems like you can solve a lot of problems by measuring to the
first hop router. Does the packet pair technique work on TDMA link layers
like DOCSIS?
The trouble is that to measure bandwidth, you have to be able to send and
receive a lot of traffic. Unless the router you are connecting to is running
some sort of service to support that, you can't just test that link, you have to
connect to something beyond that.
David Lang
Sebastian Moeller
2014-07-26 11:18:34 UTC
Permalink
Hi David,
The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packer pair technique work on TDMA link layers like DOCSIS?
The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
Well, that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network, sending two packets back to back should give you the bandwidth (packet size / arrival-time difference of the two packets), or send two packets of different size (which needs synchronized clocks; then bandwidth = difference of packet sizes / difference of transfer times).
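A toy version of both packet-pair variants (a sketch; the timestamps below are made up, real measurements would of course come off the wire):

def packet_pair_bps(size_bytes, t_arrive_first, t_arrive_second):
    """Back-to-back pair: the gap between arrivals is the serialization
    time of the second packet on the bottleneck link."""
    return size_bytes * 8 / (t_arrive_second - t_arrive_first)

def size_pair_bps(size_small, size_large, transfer_small, transfer_large):
    """Two packets of different size: the extra transfer time of the larger
    packet is the serialization time of the extra bytes (needs synced clocks)."""
    return (size_large - size_small) * 8 / (transfer_large - transfer_small)

# A 1500-byte pair arriving 1.2 ms apart -> ~10 Mbit/s bottleneck
print(packet_pair_bps(1500, 0.0000, 0.0012) / 1e6, "Mbit/s")
# 200 B takes 2.16 ms, 1500 B takes 3.20 ms end to end -> ~10 Mbit/s
print(size_pair_bps(200, 1500, 0.00216, 0.00320) / 1e6, "Mbit/s")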
unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow you to measure RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment were guaranteed to use NTP for decent clock synchronization and to respond to ICMP timestamp messages with timestamp replies, measuring bandwidth might be “cheap” enough to keep running in the background, though.
Since this looks too simple, there must be a simple reason why it would fail. (It would be nice if ping packets with timestamps had required the echo server to also store its incoming timestamp in the echo, but I digress.)
I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and as a signal to throttle the downstream link…
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding meltdowns from synchronized measurement streams…
Best Regards
Sebastian
David Lang
David Lang
2014-07-26 20:21:47 UTC
Permalink
Post by Sebastian Moeller
Hi David,
The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packer pair technique work on TDMA link layers like DOCSIS?
The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
Well that is what you typically do, but you can get away with less
measurement traffic: in an ideal quiescent network sending two packets back to
back should give you the bandwidth (packet size / incoming time difference of
both packets), or send two packets of different size (needs synchronized
clocks, then difference of packet sizes / difference of transfer times).
Except that your ideal network doesn't exist in the real world. You are never
going to have the entire network quiescent, the router you are going to be
talking to is always going to have other things going on, which can affect its
timing.
Post by Sebastian Moeller
unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
As you say, anything that requires symmetrical traffic (like ICMP) isn't going to
work, and routers do not currently offer any service that will.
you also can't count on time being synced properly. Top-tier companies have
trouble doing that in their dedicated datacenters; depending on it for this sort
of testing is a non-starter
Post by Sebastian Moeller
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
Well, let's talk about what we would like to have on the router
As I see it, we want to have two services
1. a service you send a small amount of data to and it responds by sending you a
large amount of data (preferably with the most accurate timestamps it has and
the TTL of the packets it received)
2. a service you send a large amount of data to and it responds by sending you
small responses, telling you how much data it has received (with a timestamp and
what the TTL of the packets it received were)
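A minimal sketch of what service #1 could look like (purely illustrative; UDP, the port number, burst size and packet format are all assumptions, not an agreed protocol):

import socket, struct, time

PORT = 9999            # hypothetical port for the measurement service
BURST_PACKETS = 20     # response packets sent back-to-back per request
PAYLOAD_BYTES = 1400   # size of each response packet

def serve_once(sock):
    """Service #1: receive a small request, reply with a timestamped burst."""
    request, addr = sock.recvfrom(64)
    for seq in range(BURST_PACKETS):
        # 8-byte sequence number + 8-byte send timestamp, padded out to PAYLOAD_BYTES
        header = struct.pack("!Qd", seq, time.time())
        sock.sendto(header.ljust(PAYLOAD_BYTES, b"\0"), addr)

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("0.0.0.0", PORT))
    while True:
        serve_once(s)

The requester would then time the arrival gaps of the burst to estimate the downlink rate, much like the packet-pair idea discussed earlier.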
questions:
A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start so it would need substantially more traffic to
flow to reach steady-state.
anything else has the possibility of taking a different path through the
router/switch software and so the performance may not be the same.
B. How much data is needed to be statistically accurate?
Too many things can happen for 1-2 packets to tell you the answer. The systems
on both ends are multi-tasking, and at high speeds, scheduling jitter will throw
off your calculations with too few packets.
C. How can this be prevented from being used for DoS attacks, either against the
thing running the service or against someone else via a reflected attack if it's
a forgeable protocol (i.e. UDP)
One thought I have is to require a high TTL on the packets for the services to
respond to them. That way any abuse of the service would have to take place from
very close on the network.
Ideally these services would only respond to senders that are directly
connected, but until these services are deployed and enabled by default, there
is going to need to be the ability to 'jump over' old equipment. This need
will probably never go away completely.
Other requirements or restrictions?
David Lang
Sebastian Moeller
2014-07-26 20:54:39 UTC
Permalink
Hi David,
Post by Sebastian Moeller
Hi David,
The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packer pair technique work on TDMA link layers like DOCSIS?
The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times).
Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, the router you are going to be talking to is always going to have other things going on, which can affect it's timing.
Sure, only two packets are required per measurement; I guess I would calculate the average and a confidence interval over several of these (potentially with a moving window) to get a handle on the variability. I have done some RTT measurements on an ADSL link and can say that realistically one needs on the order of hundreds of data points per packet size. This sounds awful, but at least it does not require saturating the link and hence works without dedicated receivers on the other end...
Post by Sebastian Moeller
unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
As you say, anything that requires symmetrical traffic (like ICMP isn't going to work, and routers do not currently offer any service that will.
Well I think the gargoyle idea is feasible given that there is a reference implementation out in the wild ;).
you also can't count on time being synced properly.
Quick testing today drove that message home (ICMP time requests showing receive times before originating times, quite sobering). Naive me had thought that NTP would guarantee <1ms deviation from reference time, but I just learned it is rather in the low-ms to 100ms range, so basically useless for one-way delay measurements to close hosts….
Top Tier companies have trouble doing that in their dedicated datacenters, depending on it for this sort of testing is a non-starter
Agreed.
Post by Sebastian Moeller
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
Well, let's talk about what we would like to have on the router
As I see it, we want to have two services
1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferrably with the most accurate timestamps it has and the TTL of the packets it received)
2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were)
A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start so it would need substantially more traffic to flow to reach steady-state.
anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same.
You think UDP would not work out?
B. How much data is needed to be statistically accurate?
Too many things can happen for 1-2 packets to tell you the answer. The systems on both ends are multi-tasking, and at high speeds, scheduling jitter will throw off your calculations with too few packets.
Yeah, but you can (to steal an idea from Rick Jones' netperf) just keep measuring until the confidence interval around the mean of the data falls below a set magnitude. But for the purpose of traffic shaping you do not need the exact link bandwidth anyway, just a close enough proxy to start the search for a decent set point from a reasonable position. I think the actual shaping rates need to be iteratively optimized.
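A sketch of that stopping rule (assuming roughly normal samples and a fixed z-value instead of a proper t-table, so treat it as illustrative only):

import math, random

def measure_until_confident(sample_fn, rel_ci=0.05, z=1.96, min_n=10, max_n=1000):
    """Keep sampling until the ~95% confidence interval around the mean is
    within +/- rel_ci of the mean (or max_n samples have been taken)."""
    samples = []
    while len(samples) < max_n:
        samples.append(sample_fn())
        n = len(samples)
        if n < min_n:
            continue
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        half_width = z * math.sqrt(var / n)
        if half_width <= rel_ci * mean:
            break
    return mean, half_width, n

# Fake bandwidth probe: ~10 Mbit/s with measurement noise
probe = lambda: random.gauss(10e6, 1.5e6)
mean, hw, n = measure_until_confident(probe)
print(f"{mean / 1e6:.1f} +/- {hw / 1e6:.1f} Mbit/s after {n} samples")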
C. How can this be prevented from being used for DoS attacks, either against the thing running the service or against someone else via a reflected attack if it's a forgable protocol (i.e. UDP)
Well, if it only requires a sparse packet stream it is not going to be too useful for DoS attacks,
One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
Other requirements or restrictions?
I think the measurement should be fast and continuous…
Best Regards
Sebastian
David Lang
David Lang
2014-07-26 21:14:01 UTC
Permalink
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
The trouble is that to measure bandwidth, you have to be able to send and
receive a lot of traffic.
Well that is what you typically do, but you can get away with less
measurement traffic: in an ideal quiescent network sending two packets back
to back should give you the bandwidth (packet size / incoming time
difference of both packets), or send two packets of different size (needs
synchronized clocks, then difference of packet sizes / difference of
transfer times).
Except that your ideal network doesn't exist in the real world. You are never
going to have the entire network quiescent, the router you are going to be
talking to is always going to have other things going on, which can affect
it's timing.
Sure, the two packets a required per measurement, guess I would
calculate the average and confidence interval over several of these
(potentially by a moving window) to get a handle on the variability. I have
done some RTT measurements on a ADSL link and can say that realistically one
needs in the hundreds data points per packet size. This sounds awe full, but
at least it does not require to saturate the link and hence works without
dedicated receivers on the other end...
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
As you say, anything that requires symmetrical traffic (like ICMP isn't going to work, and routers do not currently offer any service that will.
Well I think the gargoyle idea is feasible given that there is a
reference implementation out in the wild ;).
I'm not worried about an implementation existing as much as the question of if
it's on the routers/switches by default, and if it isn't, is the service simple
enough to be able to avoid causing load on these devices and to avoid having any
security vulnerabilities (or DDoS potential)
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the
machines anyway, like ping. That way the “load” of all the leaf nodes of the
internet continuously measuring their bandwidth could be handled in a
distributed fashion avoiding melt-downs by synchronized measurement streams…
Well, let's talk about what we would like to have on the router
As I see it, we want to have two services
1. a service you send a small amount of data to and it responds by sending
you a large amount of data (preferrably with the most accurate timestamps it
has and the TTL of the packets it received)
2. a service you send a large amount of data to and it responds by sending
you small responses, telling you how much data it has received (with a
timestamp and what the TTL of the packets it received were)
A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start so it would need substantially more traffic
to flow to reach steady-state.
anything else has the possibility of taking a different path through the
router/switch software and so the performance may not be the same.
You thing UDP would not work out?
I don't trust that UDP would go through the same codepaths and delays as TCP
even fq_codel handles TCP differently
so if we measure with UDP, does it really reflect the 'real world' of TCP?
Post by David Lang
B. How much data is needed to be statistically accurate?
Too many things can happen for 1-2 packets to tell you the answer. The
systems on both ends are multi-tasking, and at high speeds, scheduling jitter
will throw off your calculations with too few packets.
Yeah, but you can (to steal an I idea from Rick Jones netperf) just keep
measuring until the confidence interval around the mean of the data falls
below a set magnitude. But for the purpose of traffic shaping you do not need
the exact link bandwidth anyway just a close enough proxy to start the search
for a decent set point from a reasonable position. I think that the actual
shaping rates need to be iteratively optimized.
Post by David Lang
C. How can this be prevented from being used for DoS attacks, either against
the thing running the service or against someone else via a reflected attack
if it's a forgable protocol (i.e. UDP)
Well, if it only requires a sparse packet stream it is not going to be
to useful for DOS attacks,
unless it can be requested a lot
Post by David Lang
One thought I have is to require a high TTL on the packets for the services
to respond to them. That way any abuse of the service would have to take
place from very close on the network.
Ideally these services would only respond to senders that are directly
connected, but until these services are deployed and enabled by default,
there is going to be a need to be the ability to 'jump over' old equipment.
This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we
could just ask nicely what the current negotiated bandwidths are ;)
negotiated bandwidth and effective bandwidth are not the same
what if you can't talk to the devices directly connected to the DSL line, but
only to a router one hop on either side?
for example, I can't buy (at least not for anything close to a reasonable price)
a router to run at home that has a DSL port on it, so I will always have some
device between me and the DSL.
If you have a shared media (cable, wireless, etc), the negotiated speed is
meaningless.
In my other location, I have a wireless link that is ethernet to the dish on the
roof, I expect the other end is a similar setup, so I can never see the link
speed directly (not to mention the fact that rain can degrade the effective link
speed)
Post by David Lang
Other requirements or restrictions?
I think the measurement should be fast and continuous…
Fast yes, because we want to impact the network as little as possible
continuous?? I'm not so sure. Do conditions really change that much? And as I
ask in the other thread, how much does it hurt if your estimates are wrong?
for wireless links the conditions are much more variable, but we don't really
know what is going to work well there.
David Lang
Sebastian Moeller
2014-07-26 21:48:42 UTC
Permalink
Hi David,
Post by Sebastian Moeller
The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times).
Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, the router you are going to be talking to is always going to have other things going on, which can affect it's timing.
Sure, the two packets a required per measurement, guess I would calculate the average and confidence interval over several of these (potentially by a moving window) to get a handle on the variability. I have done some RTT measurements on a ADSL link and can say that realistically one needs in the hundreds data points per packet size. This sounds awe full, but at least it does not require to saturate the link and hence works without dedicated receivers on the other end...
Post by Sebastian Moeller
unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
As you say, anything that requires symmetrical traffic (like ICMP isn't going to work, and routers do not currently offer any service that will.
Well I think the gargoyle idea is feasible given that there is a reference implementation out in the wild ;).
I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDos potential)
But with gargoyle the idea is to monitor a sparse ping stream to the closest responding host and interpret a sudden increase in RTT as a sign that the upstream buffers are filling up, using this as a signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
Post by Sebastian Moeller
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
Well, let's talk about what we would like to have on the router
As I see it, we want to have two services
1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferrably with the most accurate timestamps it has and the TTL of the packets it received)
2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were)
A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start so it would need substantially more traffic to flow to reach steady-state.
anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same.
You thing UDP would not work out?
I don't trust that UDP would go through the same codepaths and delays as TCP
Why should a router care
even fq_codel handles TCP differently
Does it? I thought UDP typically reacts differently to fq_codel's dropping strategy, but fq_codel itself does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might simply be wrong here)
so if we measure with UDP, does it really reflect the 'real world' of TCP?
But we care for UDP as well, no?
B. How much data is needed to be statistically accurate?
Too many things can happen for 1-2 packets to tell you the answer. The systems on both ends are multi-tasking, and at high speeds, scheduling jitter will throw off your calculations with too few packets.
Yeah, but you can (to steal an I idea from Rick Jones netperf) just keep measuring until the confidence interval around the mean of the data falls below a set magnitude. But for the purpose of traffic shaping you do not need the exact link bandwidth anyway just a close enough proxy to start the search for a decent set point from a reasonable position. I think that the actual shaping rates need to be iteratively optimized.
C. How can this be prevented from being used for DoS attacks, either against the thing running the service or against someone else via a reflected attack if it's a forgable protocol (i.e. UDP)
Well, if it only requires a sparse packet stream it is not going to be to useful for DOS attacks,
unless it can be requested a lot
Well yes, hence a sparse stream; if we can make sure to always just send sparse streams we will stay in the backwaters of services useful for DoS, I would guess. We just need not to be the low-hanging fruit :).
One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
negotiated bandwidth and effective bandwidth are not the same
what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment that the DSLAM uplink is so congested, because of oversubscription of the DSLAM, that it now constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonably shape for anyway (I would hope that ISPs monitor this situation and remedy it by adding uplink capacity, so this hopefully is just a transient event).
for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
http://wiki.openwrt.org/toh/tp-link/td-w8970 or http://www.traverse.com.au/products ? If you had the DSL modem in the router under cerowrts control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
Not exactly meaningless, it gives you an upper bound...
In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
One more case for measuring the link speed continuously!
Other requirements or restrictions?
I think the measurement should be fast and continuous…
Fast yes, because we want to impact the network as little as possible
continuous?? I'm not so sure. Do conditions really change that much?
You just gave an example above for changing link conditions, by shared media...
And as I ask in the other thread, how much does it hurt if your estimates are wrong?
I think I sent a plot to that regard.
for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
Wireless as in point 2 point links or in wifi?
Best Regards
Sebastian
David Lang
David Lang
2014-07-26 22:23:13 UTC
Permalink
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to
use ICMP packets, but these will only allow to measure RTT not one-way
delays (if you do this on ADSL you will find the RTT dominated by the
typically much slower uplink path). If network equipment would be
guaranteed to use NTP for decent clock synchronization and would respond
to timestamp ICMP messages with timestamp reply measuring bandwidth might
be “cheap” enough to keep running in the background, though.
Since this looks too simple there must be a simple reason why this would
fail. (It would be nice if ping packets with timestamps would have
required the echo server top also store its incoming timestamp in the
echo, but I digress)
I note that gargoyle uses a sparse stream of ping packets to a close
host and uses increases in RTT as proxy for congestion and signal to
throttle down stream link…
As you say, anything that requires symmetrical traffic (like ICMP isn't
going to work, and routers do not currently offer any service that will.
Well I think the gargoyle idea is feasible given that there is a
reference implementation out in the wild ;).
I'm not worried about an implementation existing as much as the question of
if it's on the routers/switches by default, and if it isn't, is the service
simple enough to be able to avoid causing load on these devices and to avoid
having any security vulnerabilities (or DDos potential)
But with gargoyle the idea is to monitor a sparse ping stream to the
closest host responding and interpreting a sudden increase in RTT as a sign
the the upstreams buffers are filling up and using this as signal to throttle
on the home router. My limited experience shows that quite often close hosts
will respond to pings...
that measures latency, but how does it tell you bandwidth unless you are the
only possible thing on the network and you measure what you are receiving?
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
Well, let's talk about what we would like to have on the router
As I see it, we want to have two services
1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferrably with the most accurate timestamps it has and the TTL of the packets it received)
2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were)
A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start so it would need substantially more traffic to flow to reach steady-state.
anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same.
You thing UDP would not work out?
I don't trust that UDP would go through the same codepaths and delays as TCP
Why should a router care
Post by David Lang
even fw_codel handles TCP differently
Does it? I thought UDP typically reacts differently to fq_codels
dropping strategy but fq_codel does not differentiate between protocols (last
time I looked at the code I came to that conclusion, but I am not very fluent
in C so I might be simply wrong here)
with TCP, the system can tell the difference between different connections to
the same system; with UDP it needs to infer this from port numbers. This isn't
as accurate, and so the systems (fq_codel and routers) handle them in a slightly
different way. This does affect the numbers.
Post by Sebastian Moeller
Post by David Lang
so if we measure with UDP, does it really reflect the 'real world' of TCP?
But we care for UDP as well, no?
Yes, but the reality is that the vast majority of traffic is TCP, and that's
what the devices are optimized to handle, so if we measure with UDP we may not
get the same results as if we measure with TCP.
measuring with ICMP is different yet again.
Think of the router ASICs that handle the 'normal' traffic in the ASIC in the
card, but 'unusual' traffic needs to be sent to the core CPU to be processed and
is therefore MUCH slower
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
negotiated bandwith and effective bandwidth are not the same
what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
In my limited experience the typical bottleneck is the DSL line, so if
we shape for that we are fine… Assume for a moment the DSLAM uplink is so
congested because of oversubscription of the DSLAM, that now this constitutes
the bottleneck. Now the available bandwidth for each user depends on the
combined traffic of all users, not a situation we can reasonable shape for
anyway (I would hope that ISPs monitor this situation and would remedy it by
adding uplink capacity, so this hopefully is just a transient event).
for DSL you are correct, it's a point-to-point connection (star network
topology), but we have other technologies used in homes that are shared-media
bus topology networks. This includes cablemodems and wireless links.
Post by Sebastian Moeller
Post by David Lang
for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
http://wiki.openwrt.org/toh/tp-link/td-w8970 or
no 5GHz wireless?
Post by Sebastian Moeller
http://www.traverse.com.au/products ?
I couldn't figure out where to buy one through their site.
Post by Sebastian Moeller
If you had the DSL modem in the router
under cerowrts control you would not need to use a traffic shaper for your
uplink, as you could apply the BQL ideas to the ADSL driver.
Post by David Lang
If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
Not exactly meaningless, if gives you an upper bound...
true, but is an upper bound good enough? How close does the estimate need to be?

and does it matter if both sides are doing fq_codel or is this still in the mode
of trying to control the far side indirectly?
Post by Sebastian Moeller
Post by David Lang
In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
One more case for measuring the link speed continuously!
at what point does the measuring process interfere with the use of the link? or
cause other upstream issues.
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Post by David Lang
Other requirements or restrictions?
I think the measurement should be fast and continuous…
Fast yes, because we want to impact the network as little as possible
continuous?? I'm not so sure. Do conditions really change that much?
You just gave an example above for changing link conditions, by shared media...
but can you really measure fast enough to handle shared media? at some point you
need to give up measuring because by the time you have your measurement it's
obsolete.
If you look at networking with a tight enough timeframe, it's either idle or
100% utilized depending on if a bit is being sent at that instant, however a
plot at that precision is worthless :-)
Post by Sebastian Moeller
Post by David Lang
And as I ask in the other thread, how much does it hurt if your estimates are wrong?
I think I sent a plot to that regard.
yep, our mails are crossing
Post by Sebastian Moeller
Post by David Lang
for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
Wireless as in point 2 point links or in wifi?
both, point-to-point is variable based on weather, trees blowing in the wind,
interference, etc. Wifi has a lot more congestion, so interference dominates
everything else.
David Lang
Sebastian Moeller
2014-07-26 23:08:08 UTC
Permalink
Hi David,
On Jul 27, 2014, at 00:23 , David Lang <***@lang.hm> wrote:
[...]
Post by Sebastian Moeller
I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDos potential)
But with gargoyle the idea is to monitor a sparse ping stream to the closest host responding and interpreting a sudden increase in RTT as a sign the the upstreams buffers are filling up and using this as signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
that measures latency, but how does it tell you bandwidth unless you are the only possible thing on the network and you measure what you are receiving?
So the idea would be to start the ping probe with no traffic and increase the traffic until the ping RTT increases; the usable bandwidth is around the point where the RTTs start to increase.
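A sketch of that ramp-up probe (the load generator, the ping helper and the thresholds are hypothetical placeholders; a real implementation would use real traffic and real RTT measurements):

import random

def probe_usable_bps(send_load, ping_rtt_ms, start_bps=1e6, step=1.25, margin_ms=10):
    """Ramp the offered load up until the ping RTT rises noticeably above the
    unloaded baseline; report the last rate that kept the latency flat."""
    baseline = min(ping_rtt_ms() for _ in range(10))   # idle RTT estimate
    rate, good = start_bps, start_bps
    while True:
        send_load(rate)                                # offer 'rate' bps of traffic
        if ping_rtt_ms() > baseline + margin_ms:       # the queue is building up
            return good
        good = rate
        rate *= step

# Toy stand-ins for the hypothetical helpers: a 20 Mbit/s bottleneck
LINK_BPS = 20e6
offered = 0.0
def send_load(bps):       # pretend to generate 'bps' of traffic
    global offered
    offered = bps
def ping_rtt_ms():        # RTT jumps once the offered load exceeds capacity
    return 20 + random.uniform(0, 2) + (100 if offered > LINK_BPS else 0)

print(f"usable bandwidth ~ {probe_usable_bps(send_load, ping_rtt_ms) / 1e6:.1f} Mbit/s")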
[...]
Post by Sebastian Moeller
even fw_codel handles TCP differently
Does it? I thought UDP typically reacts differently to fq_codels dropping strategy but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might be simply wrong here)
with TCP, the system can tell the difference between different connections to the same system, with UDP it needs to infer this from port numbers, this isn't as accurate and so the systems (fq_codel and routers) handle them in a slightly different way. This does affect the numbers.
But that only affects the hashing into fq_codel bins? From http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c
static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
                                  const struct sk_buff *skb)
{
        struct flow_keys keys;
        unsigned int hash;

        skb_flow_dissect(skb, &keys);
        hash = jhash_3words((__force u32)keys.dst,
                            (__force u32)keys.src ^ keys.ip_proto,
                            (__force u32)keys.ports, q->perturbation);
        return ((u64)hash * q->flows_cnt) >> 32;
}
The way I read this is that it just uses source and destination IP and the ports; all the protocol does is make sure that different-protocol connections to the same src/dst/ports tuple end up in different bins, no? My C is bad so I would not be amazed if my interpretation were wrong, but please show me where?
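A simplified rendering of what that hash does (not the kernel's jhash, just the binning idea: the protocol number only perturbs the hash input, it does not get its own code path):

import zlib

FLOWS_CNT = 1024           # number of flow bins (fq_codel's default is 1024)
PERTURBATION = 0x12345678  # random per-instance value, fixed here for the example

def flow_bin(src_ip, dst_ip, ip_proto, src_port, dst_port):
    """Map a flow tuple to a bin index, analogous to fq_codel_hash()."""
    key = (dst_ip, src_ip ^ ip_proto, (src_port << 16) | dst_port, PERTURBATION)
    h = zlib.crc32(repr(key).encode())   # stand-in for jhash_3words()
    return h % FLOWS_CNT

# TCP (proto 6) and UDP (proto 17) flows between the same hosts/ports land in
# different bins, but only because the protocol perturbs the hash, not because
# either protocol is special-cased.
print(flow_bin(0x0A000001, 0x0A000002, 6, 40000, 80))
print(flow_bin(0x0A000001, 0x0A000002, 17, 40000, 80))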
Post by Sebastian Moeller
so if we measure with UDP, does it really reflect the 'real world' of TCP?
But we care for UDP as well, no?
Yes, but the reality is that the vast majority of traffic is TCP, and that's what the devices are optimized to handle, so if we measure with UDP we may not get the same results as if we measure with TCP.
measuing with ICMP is different yet again.
Yes, I have heard stories like that when I set out on my little project to detect ATM quantization from ping RTTs, but to my joy it looks like ICMP still gives reasonable measurements! Based on that data I would assume UDP to be even less exotic and hence handled even less specially, and hence more like TCP?
Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefor MUCH slower
Except that in my ICMP RTT measurements I still saw quantization steps in accordance with the expected best-case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements...
Post by Sebastian Moeller
Post by Sebastian Moeller
One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
negotiated bandwith and effective bandwidth are not the same
what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment the DSLAM uplink is so congested because of oversubscription of the DSLAM, that now this constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonable shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event).
for DSL you are correct, it's a point-to-point connection (star network topology), but we have other technologies used in homes that are shared-media bus topology networks. This includes cablemodems and wireless links.
Well, yes, I understand, but again you would assume that the cable ISP tries to provision the system so that most users are happy, so congestion is not the rule? Even then I think cable guarantees some minimum rate per user, no? With wireless it is worse, in that RF events outside of the ISP's and end user's control can ruin the day.
Post by Sebastian Moeller
for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
http://wiki.openwrt.org/toh/tp-link/td-w8970 or
no 5GHz wireless?
Could be, but definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router does not need HTB traffic shaping on uplink anymore. I might actually go that route since I really dislike my ISP's primary router, but I digress...
Post by Sebastian Moeller
http://www.traverse.com.au/products ?
I couldn't figure out where to buy one through their site.
Maybe they only sell in AU, I guess I just wanted to be helpful,
Post by Sebastian Moeller
If you had the DSL modem in the router under CeroWRT's control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
Not exactly meaningless, it gives you an upper bound...
true, but is an upper bound good enough? How close does the estimate need to be?
If we end up recommending that people use, say, a binary search to find the best tradeoff (maximizing throughput while keeping the maximum latency increase under load bounded to, say, 10ms) we should have an idea where to start, so a bit too large is fine as a starting point. Traditionally the recommendation was around 85% of link rate, but that never came with a decent justification or data.
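To make that concrete, a minimal Python sketch of such a binary search; set_rate and latency_increase_ms are hypothetical caller-supplied helpers standing in for the shaper configuration and for a loaded-latency test, not existing tools:

def find_shaper_rate(link_rate_kbit, set_rate, latency_increase_ms,
                     target_ms=10.0, tolerance_kbit=100):
    # Binary-search the shaper rate so that latency under load stays bounded.
    lo, hi = 0.0, float(link_rate_kbit)   # a slightly-too-large upper bound is fine
    best = lo
    while hi - lo > tolerance_kbit:
        mid = (lo + hi) / 2
        set_rate(mid)                     # configure the shaper to `mid` kbit/s
        if latency_increase_ms() <= target_ms:
            best, lo = mid, mid           # latency OK: try shaping less aggressively
        else:
            hi = mid                      # latency too high: shape harder
    return best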
and does it matter if both sides are doing fq_codel or is this still in the mode of trying to control the far side indirectly?
Yes, this is only relevant as long as both sides of the bottleneck link are not de-bloated. But it does not look like DSLAMs/CMTSs will change any time soon from the old ways...
Post by Sebastian Moeller
In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
One more case for measuring the link speed continuously!
at what point does the measuring process interfere with the use of the link? or cause other upstream issues.
If my measuring-by-sparse-stream idea works out, the answer to both questions is "not much" ;)
Post by Sebastian Moeller
Post by Sebastian Moeller
Other requirements or restrictions?
I think the measurement should be fast and continuous…
Fast yes, because we want to impact the network as little as possible
continuous?? I'm not so sure. Do conditions really change that much?
You just gave an example above of changing link conditions, with shared media...
but can you really measure fast enough to handle shared media? at some point you need to give up measuring because by the time you have your measurement it's obsolete.
So this is not going to work well on a wifi WLAN with wildly fluctuating rates (see Dave’s upcoming project make-wifi-fast), but for a typical cable node where congestion changes over the day as a function of people being at home it might be fast enough.
If you look at networking with a tight enough timeframe, it's either idle or 100% utilized depending on if a bit is being sent at that instant, however a plot at that precision is worthless :-)
Yes I think a moving average over some time would be required.
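Something as simple as an exponentially weighted moving average would probably do; a minimal sketch (the smoothing factor here is an arbitrary choice):

class RateEstimate:
    # Exponentially weighted moving average of per-interval rate samples.
    def __init__(self, alpha=0.1):        # alpha is an arbitrary smoothing factor
        self.alpha = alpha
        self.value = None

    def update(self, sample_kbit):
        if self.value is None:
            self.value = sample_kbit      # first sample seeds the average
        else:
            self.value += self.alpha * (sample_kbit - self.value)
        return self.value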
Post by Sebastian Moeller
And as I ask in the other thread, how much does it hurt if your estimates are wrong?
I think I sent a plot to that regard.
yep, our mails are crossing
Post by Sebastian Moeller
for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
Wireless as in point-to-point links or in wifi?
both, point-to-point is variable based on weather, trees blowing in the wind, interference, etc. Wifi has a lot more congestion, so interference dominates everything else.
So maybe that is a different kettle of fish then.

Best Regards
Sebastian
David Lang
David Lang
2014-07-27 01:04:41 UTC
Permalink
Post by Sebastian Moeller
[...]
Post by Sebastian Moeller
I'm not worried about an implementation existing as much as the question of whether it's on the routers/switches by default, and if it isn't, whether the service is simple enough to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDoS potential)
But with gargoyle the idea is to monitor a sparse ping stream to the closest host responding, to interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, and to use this as a signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
that measures latency, but how does it tell you bandwidth unless you are the only possible thing on the network and you measure what you are receiving?
So the idea would be to start the ping probe with no traffic and increase the traffic until the ping RTT increases; the usable bandwidth is roughly the rate at which the RTTs start to increase.
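A rough Python sketch of that ramp-up, just the general shape of the idea (send_load_at and ping_rtt_ms are hypothetical helpers standing in for a traffic generator and the sparse ping probe; this is not Gargoyle's actual code):

def estimate_bandwidth(send_load_at, ping_rtt_ms,
                       start_kbit=1000, step_kbit=1000, rtt_margin_ms=10.0):
    # Ramp the offered load until the sparse-ping RTT inflates; the last rate
    # before the inflection is taken as the usable bandwidth.
    baseline = min(ping_rtt_ms() for _ in range(5))    # idle RTT to the near hop
    rate, usable = start_kbit, 0
    while True:
        send_load_at(rate)                             # offer `rate` kbit/s of load
        if ping_rtt_ms() > baseline + rtt_margin_ms:   # buffers filling: bottleneck reached
            return usable
        usable = rate
        rate += step_kbit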
[...]
Post by Sebastian Moeller
even fq_codel handles TCP differently
Does it? I thought UDP typically reacts differently to fq_codel's dropping strategy, but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might simply be wrong here)
with TCP, the system can tell the difference between different connections to the same system; with UDP it needs to infer this from port numbers, which isn't as accurate, and so the systems (fq_codel and routers) handle them in a slightly different way. This does affect the numbers.
Sebastian Moeller
2014-07-27 11:38:33 UTC
Permalink
[...]
Post by Sebastian Moeller
Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefore MUCH slower
Except that in my ICMP RTT measurements I still saw quantization steps in accordance with the expected best-case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements...
yeah, I have to remind myself of the "perfect is the enemy of good enough" frequently as well. I tend to fall into that trap pretty easily, as this discussion has shown :-)
ping is easy to test. As a thought, is the response time of NTP queries any more or less stable?
No idea? How would you test this (any command line to try)? The good thing with ping is that often even the DSLAM responds, keeping external sources of variability (i.e. hops further away in the network) out of the measurement...
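There is no obvious one-liner, but timing an SNTP query takes only a few lines of Python; a sketch, assuming the target actually runs an NTP server (which a DSLAM almost certainly does not):

import socket, time

def ntp_rtt_ms(host, timeout=2.0):
    # Send a minimal SNTP mode-3 (client) request and time the round trip.
    pkt = b'\x1b' + 47 * b'\0'            # LI=0, VN=3, Mode=3
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    t0 = time.monotonic()
    s.sendto(pkt, (host, 123))
    s.recvfrom(512)                       # only the timing matters here
    s.close()
    return (time.monotonic() - t0) * 1000.0

print("%.1f ms" % ntp_rtt_ms("pool.ntp.org"))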
Post by Sebastian Moeller
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
negotiated bandwidth and effective bandwidth are not the same
what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment that the DSLAM uplink is so congested because of oversubscription of the DSLAM that it now constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonably shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event).
for DSL you are correct, it's a point-to-point connection (star network topology), but we have other technologies used in homes that are shared-media bus topology networks. This includes cablemodems and wireless links.
Well, yes, I understand, but you again would assume that the cable ISP tries to provision the system so that most users are happy, so congestion is not the rule? Even then I think cable guarantees some minimum rates per user, no? With wireless it is worse in that RF events outside of the ISP's and end user's control can ruin the day.
guarantee is too strong a word. It depends on how much competition there is.
15 years or so ago I moved from a 3Mb cablemodem to a 128K IDSL line and saw my performance increase significantly.
I used to think exactly the same, but currently I tend to think that the difference is about how well managed a node is, not so much the access technology: with DSL the shared medium is the link connecting the DSLAM to the backbone, and if this is congested it is similar to a busy cable node. In both cases the ISP needs to make sure the shared segment's congestion is well managed. It might be that DSLAMs are typically better managed, as telcos always dealt with interactive (bi-directional) traffic while cable traditionally was a one-directional transport. So I assume both have different traditions about provisioning. I could be off my rocker here ;)
Post by Sebastian Moeller
Post by Sebastian Moeller
Post by David Lang
for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
http://wiki.openwrt.org/toh/tp-link/td-w8970 or
no 5GHz wireless?
Could be, but definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router does not need HTB traffic shaping on uplink anymore. I might actually go that route since I really dislike my ISP's primary router, but I digress...
Post by Sebastian Moeller
http://www.traverse.com.au/products ?
I couldn't figure out where to buy one through their site.
Maybe they only sell in AU, I guess I just wanted to be helpful,
Post by Sebastian Moeller
If you had the DSL modem in the router under CeroWRT's control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
Post by David Lang
If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
Not exactly meaningless, it gives you an upper bound...
true, but is an upper bound good enough? How close does the estimate need to be?
If we end up recommending that people use, say, a binary search to find the best tradeoff (maximizing throughput while keeping the maximum latency increase under load bounded to, say, 10ms) we should have an idea where to start, so a bit too large is fine as a starting point. Traditionally the recommendation was around 85% of link rate, but that never came with a decent justification or data.
well, if we are doing a binary search, having the initial estimate off by a lot isn't actually going to hurt much, we'll still converge very quickly on the right value
Yes, but we still need to solve the question of what infrastructure to test against ;)
Post by Sebastian Moeller
and does it matter if both sides are doing fq_codel or is this still in the mode of trying to control the far side indirectly?
Yes, this is only relevant as long as both sides of the bottleneck link are not de-bloated. But it does not look like DSLAMs/CMTSs will change any time soon from the old ways...
yep, I had been forgetting this.
Post by Sebastian Moeller
Post by Sebastian Moeller
Post by David Lang
In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
One more case for measuring the link speed continuously!
at what point does the measuring process interfere with the use of the link? or cause other upstream issues.
If my measuring-by-sparse-stream idea works out, the answer to both questions is "not much" ;)
Post by Sebastian Moeller
Post by David Lang
Post by Sebastian Moeller
Other requirements or restrictions?
I think the measurement should be fast and continuous…
Fast yes, because we want to impact the network as little as possible
continuous?? I'm not so sure. Do conditions really change that much?
You just gave an example above of changing link conditions, with shared media...
but can you really measure fast enough to handle shared media? at some point you need to give up measuring because by the time you have your measurement it's obsolete.
So this is not going to work well on a wifi WLAN with wildly fluctuating rates (see Dave’s upcoming project make-wifi-fast), but for a typical cable node where congestion changes over the day as a function of people being at home it might be fast enough.
If you look at networking with a tight enough timeframe, it's either idle or 100% utilized depending on if a bit is being sent at that instant, however a plot at that precision is worthless :-)
Yes I think a moving average over some time would be required.
Post by Sebastian Moeller
Post by David Lang
And as I ask in the other thread, how much does it hurt if your estimates are wrong?
I think I sent a plot to that regard.
yep, our mails are crossing
Post by Sebastian Moeller
Post by David Lang
for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
Wireless as in point-to-point links or in wifi?
both, point-to-point is variable based on weather, trees blowing in the wind, interference, etc. Wifi has a lot more congestion, so interference dominates everything else.
So maybe that is a different kettle of fish then.
I think we need to get a simple, repeatable test together and then have people start using it and reporting what they find and the type of connection they are on, otherwise we are speculating from far too little data.
So Rich Brown’s betterspeedtest.sh is a simple test, at least for the crowd of people involved in the bufferbloat discussion right now. I always love to see more data; I would be especially interested to see data from VDSL1 lines and GPON fiber lines…

Best Regards
Sebastian
David Lang
Michael Richardson
2014-08-01 04:51:08 UTC
Permalink
Post by Sebastian Moeller
No idea? How would you test this (any command line to try)? The good
thing with ping is that often even the DSLAM responds, keeping
external sources of variability (i.e. hops further away in the network)
out of the measurement...
With various third-party-internet-access ("TPIA" in Canada), the DSLAM
is operated by the incumbent (monopoly) telco, and the layer-3 first hop
is connected via PPPoE-VLAN or PPP/L2TP. The incumbent telco has significant
incentive to make the backhaul network as congested and bufferbloated as
possible, and to mis-crimp cables so that the DSL resyncs at different speeds
regularly... my incumbent telco's commercial LAN extension salesperson
proudly told me how they never drop packets, even when their links are
congested!!!

The Third Party ISP has a large incentive to deploy equipment that supports
whatever "bandwidth measurement" service we might cook up.
Sebastian Moeller
2014-08-01 18:04:59 UTC
Permalink
Hi MIchael,
Post by Michael Richardson
Post by Sebastian Moeller
No idea? How would you test this (any command line to try). The good
thingg with the ping is that often even the DSLAM responds keeping
external sources (i.e. hops further away in the network) of variability
out of the measurement...
With various third-party-internet-access ("TPIA" in Canada), the DSLAM
is operated by the incumbent (monopoly) telco, and the layer-3 first hop
is connected via PPPoE-VLAN or PPP/L2TP.
So they “own” the copper lines connecting each customer to the DSLAM? And everybody else just rents their DSL service and resells them? Do they really connect to the DSLAM or to the BRAS?
Post by Michael Richardson
The incumbent telco has significant
incentive to make the backhaul network as congested and bufferbloated as
possible, and to mis-crimp cables so that the DSL resyncs at different speeds
regularly…
I think in Germany the incumbent has to either rent out the copper lines to competitors (who can put their own line cards in DSLAMs backed by their own backbone) or rent out “bit-stream” access, that is, the incumbent handles the DSL part on both ends and passes the traffic on either in the next central office or at specific transit points. I always assumed competitors renting these services would get much better guarantees than end-customers, but it seems in Canada the incumbent has found more ways to evade effective regulation.
Post by Michael Richardson
my incumbent telco's commercial LAN extension salesperson
proudly told me how they never drop packets, even when their links are
congested!!!
I really hope this is the opinion of a sales person and not of the network operators who really operate the gear in the “field”. On the other hand, having sufficient buffering in the DSLAM to never have to drop a packet sounds quite manly (and a terrible waste of otherwise fine DRAM chips) ;)
Post by Michael Richardson
The Third Party ISP has a large incentive to deploy equipment that supports
whatever "bandwidth measurement" service we might cook up.
As much as I would like to think otherwise, the only way to get a BMS in the field is if all national regulators require it by law (well, maybe if the ITU would bake it into the next xDSL standard that the DSLAM has to report the current line speeds, say via SNMP, back to all downstream devices asking for it). But I am not holding my breath…

Best Regards
Sebastian
Post by Michael Richardson
--
Michael Richardson
-on the road-
Michael Richardson
2014-08-02 20:17:32 UTC
Permalink
Post by Sebastian Moeller
Post by Sebastian Moeller
No idea? How would you test this (any command line to try)? The good
thing with ping is that often even the DSLAM responds, keeping
external sources of variability (i.e. hops further away in the
network) out of the measurement...
With various third-party-internet-access ("TPIA" in Canada), the DSLAM
is operated by the incumbent (monopoly) telco, and the layer-3 first
hop is connected via PPPoE-VLAN or PPP/L2TP.
So they “own” the copper lines connecting each customer to the DSLAM?
And everybody else just rents their DSL service and resells them? Do
they really connect to the DSLAM or to the BRAS?
correct, the copper continues to be regulated; the incumbent was given a
guaranteed 11-14% profit on that service for the past 75 years...

Third parties get an NNI to the incumbent in a data centre.
1) for bridged ethernet DSL service ("HSA" in Bell Canada land),
each customer shows up to the ISP in a VLAN tag.
2) for PPPoE DSL service, the traffic comes in a specific VLAN, over
IP (RFC1918) via L2TP.

Other parties can put copper in the ground, and in some parts of Canada, this
has occurred. Also worth mentioning that
AlbertaGovernmentTelephone/EdmontonTel/BCTel became "TELUS", and then left
the Stentor/Bell-Canada alliance, so Bell can be the third party in the west,
while Telus is the third party in the centre, and Island/Aliant/NBTel/Sasktel
remain government owned... and they actually do different things as a result.
Post by Sebastian Moeller
I think in Germany the incumbent has to either rent out the copper
lines to competitors (who can put their own line cards in DSLAMs
backed by their own back-bone) or rent “bit-stream” access that is the
incumbent handles the DSL part on both ends and passes the traffic
either in the next central office or at specific transit points. I
always assumed competitors renting these services would get much better
guarantees than end-customers, but it seems in Canada the incumbent has
found more ways to evade effective regulation.
This option exists, but the number of CLECs is large, and the move towards
VDSL2 / Fiber-To-The-Neighbourhood (with much shorter copper options!!) means
that this is impractical.
Post by Sebastian Moeller
my incumbent telco's commercial LAN extension salesperson proudly told
me how they never drop packets, even when their links are congested!!!
I really hope this is the opinion of a sales person and not the
network operators who really operate the gear in the “field”. On the
other hand having sufficient buffering in the DSLAM to never have to
drop a packet sounds quite manly (and a terrible waste of otherwise
fine DRAM chips) ;)
I think much of the buffer is the legacy Nortel Passport 15K that ties much
of the system together...
Post by Sebastian Moeller
The Third Party ISP has a large incentive to deploy equipment that
supports whatever "bandwidth measurement" service we might cook up.
As much as I would like to think otherwise, the only way to get a BMS
in the field is if all national regulators require it by law (well,
maybe if the ITU would bake it into the next xDSL standard that the
DSLAM has to report the current line speeds, say via SNMP, back to all
downstream devices asking for it). But I am not holding my breath…
My position is that if there isn't a technical specification, no regulation
could possibly follow...

--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] ***@sandelman.ca http://www.sandelman.ca/ | ruby on rails [
Michael Richardson
2014-08-01 04:40:18 UTC
Permalink
Post by Sebastian Moeller
Post by David Lang
The trouble is that to measure bandwidth, you have to be able to send
and receive a lot of traffic.
Well that is what you typically do, but you can get away with less
measurement traffic: in an ideal quiescent network sending two packets
back to back should give you the bandwidth (packet size / incoming time
difference of both packets), or send two packets of different size
(needs synchronized clocks, then difference of packet sizes /
difference of transfer times).
Apparently common 802.1ah libraries in most routers can do speed tests at
layer-2 for ethernet doing exactly this. (Apparently, one vendor's code is
in 90% of equipment out there, because some of this stuff involves intimate
knowledge of PHYs and MII buses, and it's not worth anyone's time to write
the code over again vs licensing it...)
Post by Sebastian Moeller
But this still requires some service on the other side. You could try
to use ICMP packets, but these will only allow you to measure the RTT, not
one-way delays (if you do this on ADSL you will find the RTT dominated
by the typically much slower uplink path). If network equipment would
And correct me if I'm wrong, but if you naively divide by two, you wind up
overestimating the uplink speed.
Post by Sebastian Moeller
Post by David Lang
you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the
machines anyway, like ping. That way the “load” of all the leaf nodes
of the internet continuously measuring their bandwidth could be handled
in a distributed fashion avoiding melt-downs by synchronized
measurement streams

sadly, ICMP responses are rate limited, even when they are implemented in the
fast path. PPP's LCP is not, AFAIK.
--
Michael Richardson
-on the road-
Sebastian Moeller
2014-07-26 11:01:02 UTC
Permalink
Hi Wes,
The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth?
I think you want the bandwidth of the usual bottleneck; on DSL that typically is the actual DSL link to the DSLAM (even though the DSLAM is oversubscribed, typically its upstream link is not congested…). I think with DOCSIS it is the same. Realistically, bandwidth measurements are going to be sporadic, so this will only help with pretty constant bottlenecks anyway; there is no use in trying to track, say, the DSLAM congestion that transiently happens during peak use time...
For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router.
And that would be sweet, but with DT’s network the first hop does not respond to ICMP probes, nor to anything else under end-user control; also the bottleneck might actually be in the BRAS, which can be upstream of the DSLAM. What would be great is if all CPE would return the current link rates via SNMP or so… Or if DSLAMs and CMTSs would supply data sinks and sources for easy testing of goodput.
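If a CPE does expose its sync rate over SNMP at all, reading it is trivial; a sketch that shells out to net-snmp's snmpget for the standard IF-MIB ifSpeed object (the host address, community string and ifIndex are assumptions and will differ per device, and many CPE expose no SNMP at all):

import subprocess

def modem_ifspeed_bps(host="192.168.1.1", community="public", if_index=1):
    # IF-MIB::ifSpeed (.1.3.6.1.2.1.2.2.1.5) reports the interface's nominal
    # speed in bit/s; on some modems the DSL interface reports the sync rate.
    out = subprocess.check_output(
        ["snmpget", "-v", "2c", "-c", community, "-Oqv",
         host, ".1.3.6.1.2.1.2.2.1.5.%d" % if_index],
        text=True)
    return int(out.strip())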
Does the packet pair technique work on TDMA link layers like DOCSIS?
Toke and Dave dug up a paper showing that packet pair is not a reliable estimator for link bandwidth. So one could send independent packets of differing size, but then one needs to synchronize the clocks somehow…
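Just to spell out the arithmetic of the two variants, a sketch (not measurement code):

def pair_dispersion_bps(packet_size_bytes, t_arrival_1, t_arrival_2):
    # Back-to-back pair: bottleneck rate ~= packet size / arrival spacing.
    return packet_size_bytes * 8 / (t_arrival_2 - t_arrival_1)

def size_difference_bps(size1_bytes, size2_bytes, owd1_s, owd2_s):
    # Two packets of different size: rate ~= size delta / one-way-delay delta
    # (only meaningful if the clocks are synchronized well enough).
    return (size2_bytes - size1_bytes) * 8 / (owd2_s - owd1_s)

# Example: a 1500-byte pair arriving 1.2 ms apart suggests a ~10 Mbit/s bottleneck.
print(pair_dispersion_bps(1500, 0.0, 0.0012) / 1e6, "Mbit/s")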

Best Regards
Sebastian
--
Wes Felter
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel