Discussion:
Cerowrt 3.3.8-10 is released
Dave Taht
2012-07-10 01:43:26 UTC
Get it at : http://huchra.bufferbloat.net/~cero1/3.3/3.3.8-10

This will be the last release of cerowrt for a while. I think this is
even stabler than 3.3.8-6 was, but we'll (you'll) see.

I will be traveling for most of the next month and unable to do much
bloat-related stuff.

Everything I deeply care about has been pushed into openwrt, anyway.

Cerowrt-3.3.8-10 is stable but forward-looking. It has an outline
towards what a more wifi-bloat-free future would look like. Maybe.

While the code remains experimental (as always) I did spend the last 2
weeks doing a test deployment of 12 (3800, pico 2HP, nano-m5) radios
at a campground, with what is basically in 3.3.8-10. Uptimes are good.
Performance is excellent. Latency is remarkably low....

Did I mention you can get it from :
http://huchra.bufferbloat.net/~cero1/3.3/3.3.8-10

But:

First up are the minuses in this release

- ntp keeps getting restarted due to badly parsing ntpc (#113 strikes again)
I keep being annoyed by this and then getting intimidated by #113
again and failing.

- simple_qos still isn't done, and is ever less simple

Despite much fiddling with various models, with ECN dropping, with buffering,
etc, nothing I would consider worthy of replacing the openwrt QoS
system got done.
Certain things are good in simple_qos - ipv6 and diffserv support - others
are not (gui, flexibility, actual performance)

- dlna
- upnpd
Neither compiled out of the box and I lacked time or tools that use
these to look at them. I had multiple requests for them but I didn't
know they were borked to start with. Apologies to the requestors.

- ECN dropping - after several high level conversations with many
people smarter than me, I decided that dropping ECN packets at a
certain point made sense. So did everyone else. The "certain point"
remains puzzling to all, and rather than continue to waste time on it
in cero I decided to play with models instead, and frankly, hope that
someone else comes up with some sane way to combine ECN and codel
sojourn time.

I note that as a side effect of worrying about ECN (and the cause of
much controversy on the babel list), I arbitrarily marked babel
packets as CS6+ECN, as one means of exploring explosive but
non-dropping behavior in fq_codel + ECN.
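
For reference, a marking like CS6+ECN is just a value of the IP TOS
(IPv6 traffic class) byte: DSCP in the top six bits, ECN in the bottom
two. A minimal Python sketch of how that byte is put together follows.
Which ECT codepoint the babel packets actually carry is not stated
above, so ECT(0) is an assumption here, and the helper names are made
up for illustration.

```python
# DSCP codepoint from RFC 2474; ECN codepoints from RFC 3168.
DSCP_CS6 = 0b110000      # 48

ECN_NOT_ECT = 0b00       # sender is not ECN-capable
ECN_ECT1 = 0b01          # ECN-capable transport, codepoint ECT(1)
ECN_ECT0 = 0b10          # ECN-capable transport, codepoint ECT(0)
ECN_CE = 0b11            # congestion experienced


def tos_byte(dscp, ecn):
    """Hypothetical helper: pack a 6-bit DSCP and 2-bit ECN field into one byte."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP must fit in 6 bits")
    if not 0 <= ecn < 4:
        raise ValueError("ECN must fit in 2 bits")
    return (dscp << 2) | ecn


def split_tos(tos):
    """Hypothetical helper: split a TOS byte back into (dscp, ecn)."""
    return tos >> 2, tos & 0b11


# Assumption: ECT(0) is the ECN codepoint used alongside CS6.
babel_tos = tos_byte(DSCP_CS6, ECN_ECT0)
print(hex(babel_tos))
```

A qdisc that marks rather than drops would rewrite the low two bits of
such a packet to CE and leave the DSCP bits alone.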

Now, on to the plusses in Cerowrt-3.3.8-10

+ fresh openwrt merge
+ gpsd 3.7
+ switch to quagga (thx Denis and Juliusz)
+ babelm available as an option - smoother convergence algo from Juliusz
+ diffserv support (mostly to classify "ants" into the VI queue) (me)
+ hw queue length patches from Felix Fietkau (now in openwrt mainline)

Re: openwrt merge - openwrt still hasn't frozen but it looks close

Re: gpsd - I hope to finally work on the cosmic background bufferbloat
detector some, now that I have some geography to play with.

Re: Quagga

Most of my own excitement this past month has been in seeing quagga
become a routing platform that was not only usable for babel (with
authentication!) but also to interoperate with other protocols like
ra, bgp, ospf, etc. I am delighted to finally make the switch to
quagga-babeld as cerowrt's default routing daemon.

An ipv6 default routing bug may remain in this...

Re: babelm - Features a new, smoother converging babel algorithm.

Work on the original babel continues, but this algo arrived too late
and in the wrong source-base to play with much. It's in ceropackages
and should build for any version of openwrt.

Re: diffserv work

Unlike current Linux wifi, cerowrt wifi obeys the most rational set of
rules for things like EF, CS6, CS7 and ant-like packets I could come
up with. Basically everything except EF got moved out of the VO queue,
and many other markings ended up in the VI queue...
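
The rule described above can be sketched as a lookup from DSCP value
to 802.11e queue. Only "EF stays in VO, everything else leaves VO"
comes from this post; exactly which markings land in VI, and the BK
entry for CS1, are assumptions made for illustration and not
necessarily the table cerowrt uses.

```python
# Standard DSCP codepoint values.
DSCP_CS1 = 8
DSCP_EF = 46
DSCP_CS6 = 48
DSCP_CS7 = 56

# ASSUMPTION: illustrative sets only, not taken from the cerowrt source.
VI_MARKS = {DSCP_CS6, DSCP_CS7}
BK_MARKS = {DSCP_CS1}


def wifi_queue(dscp):
    """Hypothetical classifier: pick a wifi queue (VO/VI/BE/BK) for a DSCP value."""
    if dscp == DSCP_EF:
        return "VO"          # from the post: only EF remains in VO
    if dscp in VI_MARKS:
        return "VI"          # assumed: network-control markings go to VI
    if dscp in BK_MARKS:
        return "BK"          # assumed: CS1 treated as background
    return "BE"              # everything else is best effort


for name, dscp in (("EF", DSCP_EF), ("CS6", DSCP_CS6), ("CS7", DSCP_CS7), ("default", 0)):
    print(name, "->", wifi_queue(dscp))
```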

Re: hw queue reduction

Probably the most interesting of all these changes is the ath9k
hardware qlen support, which gives us a knob to play with deep in the
ath9k wireless driver to control its native buffering. It defaults to
128 buffers per hardware queue.

I cut that down to 2 for VO, and 3 for VI, BE, BK. These are
front-ended by fq_codel running at mildly higher than its default 5ms
target.
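
To get a feel for why the hardware queue depth matters, here is a
back-of-the-envelope Python sketch of the time needed to drain a full
hardware queue. It assumes each buffer holds a single 1500-byte packet
and ignores aggregation, retries and 802.11 framing overhead, all of
which change the real numbers. The rates are arbitrary examples, not
measurements.

```python
PACKET_BYTES = 1500  # ASSUMPTION: one full-size packet per buffer


def drain_time_ms(buffers, rate_mbit, packet_bytes=PACKET_BYTES):
    """Milliseconds needed to transmit `buffers` queued packets at `rate_mbit`."""
    bits = buffers * packet_bytes * 8
    return bits / (rate_mbit * 1_000_000) * 1000.0


# Compare the 128-buffer default with a 3-buffer queue at a few example rates.
for rate in (1, 6.5, 65):
    for qlen in (128, 3):
        delay = drain_time_ms(qlen, rate)
        print(f"{qlen:3d} buffers @ {rate:>4} Mbit/s: {delay:8.2f} ms")
```

Under those assumptions the delay scales linearly with queue depth, so
a deep hardware queue hurts most at marginal transmit rates, where
each buffer takes longest to send.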

I get remarkably low latency results at all (even marginal) transmit
rates, at the expense of a LOT of raw bandwidth in more lab-like
conditions.

I'm in the process of running real-life benchmarks out of the Yurtlab.
I'm not prepared to publish what I've collected thus far; hopefully by
IETF I'll have something pulled together. I am very interested in
seeing how fq_codel reacts to sudden bandwidth changes in wifi outside
of the lab and simulations.

I would encourage those doing their own benchmarks to PLEASE do them
at reasonable distances under difficult (NOT LAB!) conditions, and I
also note things like youtube streaming are good indicators of actual
usability.

However: the original pre-3.3.8-10 behavior can be restored by editing
/usr/sbin/debloat and changing the qlen_whatever variables to 128,
from their current 2,3,3,3.

We are painfully aware of how hard it will be to get good aggregation
AND low latency back into wireless-n, and have begun to document a way
forward here: http://www.bufferbloat.net/projects/cerowrt/wiki/Fq_Codel_on_Wireless

Anyway: Install cerowrt-3.3.8-10 and enjoy.

PS: I will be traveling extensively over the next 60 days.

In Paris July 15-27, then Vancouver, then Seattle Aug 3-5, Linux
Plumbers Aug 28-31, NJ Sept 7-12. Perhaps I will see some of you in
one of those places?

PPS:

Multiple people have thought I was kidding when I said I was living in
a yurt. I'm not kidding.

It's not just a yurt, it's a regular high-tech hut of baba yaga! It's
pleasantly located midway between Santa Cruz and San Jose, and I have
110 acres of mostly-wifi-free space to play in. And it's got a 24/7
pool, with the most advanced wifi on the planet now run to it. It's an
inexpensive place to call a temporary home, better than a shipping
container by far.

In August I mostly plan to do more analysis, and develop more tests
and benchmarks, utilizing the acreage and radios I've emplaced here
(and having a bit of fun), and to continue attempting to fix the
ongoing funding issues, rather than further develop cerowrt. That's
the plan, as I write, anyway.
--
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-6 is out
with fq_codel!"
Jim Reisert AD1C
2012-07-10 21:38:41 UTC
Post by Dave Taht
Get it at : http://huchra.bufferbloat.net/~cero1/3.3/3.3.8-10
This will be the last release of cerowrt for a while. I think this is
even stabler than 3.3.8-6 was, but we'll (you'll) see.
You don't mention Samba in the release notice.

Any guesstimate as to how well this will work in a windows network
where files are shared across the PCs, some on the wired side, some on
the wireless side? I don't want to go through the hassle I did with
the previous released version.

- Jim
--
Jim Reisert AD1C, <***@alum.mit.edu>, http://www.ad1c.us
Maciej Soltysiak
2012-07-11 16:56:35 UTC
Hi Dave,

Thanks for a good, lengthy update. I wish you a pleasant time travelling.

One question on ECN though.
Post by Dave Taht
- ECN dropping - after several high level conversations with many
people smarter than me, I decided that dropping ECN packets at a
certain point made sense. So did everyone else. The "certain point"
remains puzzling to all, and
rather than continue to waste time on it in cero I decided to play
with models instead, and frankly, hope that someone else comes up with
some sane way to combine ECN and codel sojourn time.
So if we're dropping ECN packets and not trying to allow them through to
communicate congestion, am I better off enabling ECN on clients on my network
or not?

Regards,
Maciej
Maciej Soltysiak
2012-07-12 20:45:52 UTC
Hi,

Just flashed WNDR3800 with 3.3.8-10 and I get a weird problem.
After flashing it works fine, until I change radio0 and 1 wifi settings:
set WPA2, set channel, etc.
After save&apply, on a client PC I am able to see the networks on the list
but it's not possible to connect.
I believe the client associates with the AP, but it has an APIPA address.
Happens to normal and guest interfaces too, even if I don't touch guest
interfaces.

I logged in over the wire to see if dhcpd was running and it was, but since
I can't find logs in the usual place I will now fall back to a different
firmware.

Could that be a known issue?

Regards,
Maciej
Dave Taht
2012-07-12 20:55:46 UTC
Might have been the bug Felix fixed in -11

Might also have corrupted the /etc/config/wireless and/or
/etc/config/network file too. If you ended up with a wlan0-1 in there
that would be it...

I too rarely use the gui interface.
Dave Taht
2012-07-12 21:51:24 UTC
I enabled crypto not via on the 2.4ghz interface on the -11 release
and it worked.

I DID have to do a clean reboot tho.

5.x is showing a new problem.
Dave Taht
2012-07-13 00:43:13 UTC
Post by Dave Taht
I enabled crypto not via on the 2.4ghz interface on the -11 release
and it worked.
I DID have to do a clean reboot tho.
5.x is showing a new problem.
Sorry for the flurries of emails, I am literally 95% into a suitcase right now.

The 5ghz issue I was seeing is due to the ubuntu 3.5 kernel I was
using not doing 5ghz at all!

I did manage to connect at 5ghz via another box, and get a dhcp
address, and use crypto. It did take seemingly forever to get a crypto
connection, perhaps that is an entropy problem.

So, treat -10 and -11 with kid gloves and revert to 3.3.8-6 if you have to.

I did some more lab (rather than field) tests on 3.3.8-11, in
conditions where I should be getting 100+Mbit, and would get ~20Mbit
for a single stream download, and a sum of 66Mbit for 4 streams, so
utilization is poor. Originally, I was writing this off to losing
aggregation performance with the qlen_be change from 128 to 3. Now it
appears to be more subtle and fq_codel/wireless framing related.

Latency stayed low though, and that was my primary goal for this
release. Much work lies ahead to get serious bandwidth back into the
system while keeping low latency; however, those that need 50Mbit+ and
actually have wireless nodes that can do that can fiddle all they
like with qlen_be and/or various values for fq_codel's target in
ms....

http://www.bufferbloat.net/projects/cerowrt/wiki/338-11_tests

On this string of tests I'd tripled the BE queue size from 400 to
1200, and at these speeds I would see the BE queue not get higher than
700, so this got me out of the persistently dropping tail state and
into a more codel-ing state: http://www.bufferbloat.net/issues/402

http://www.bufferbloat.net/issues/401 seems to be better but I can
crash the beast at where we are right now if I fully exercise 3 of the
wireless queues (this isn't going to happen to people that aren't
flooding the box on every queue, but it's easy to duplicate as per the
bug)