Discussion:
Problems with DNSsec on Comcast, with Cero 3.10.38-1/DNSmasq 4-26-2014
Jim Gettys
2014-04-28 16:55:11 UTC
Comcast recently lit up IPv6 native dual stack in the Boston area.

The http://test-ipv6.com/ web site complains about DNS problems unless
DNSSEC is disabled; with it enabled, I get various timeouts.

Test with IPv4 DNS record: ok (4.196s)
Test with IPv6 DNS record: ok (0.115s) using ipv6
Test with Dual Stack DNS record: timeout (11.882s)
Test for Dual Stack DNS and large packet: timeout (11.817s)
Test IPv4 without DNS: ok (0.214s) using ipv4
Test IPv6 without DNS: ok (0.204s) using ipv6
Test IPv6 large packet: ok (0.120s) using ipv6
Test if your ISP's DNS server uses IPv6: slow (8.752s)
Find IPv4 Service Provider: timeout (11.968s)
Find IPv6 Service Provider: ok (0.126s) using ipv6 ASN 7922
Test for buggy DNS: undefined (5.003s)

DNS server addresses look reasonable for Comcast.
DNS 1: 75.75.75.75
DNS 2: 75.75.76.76
DNS 1: 2001:558:feed::1
DNS 2: 2001:558:feed::2

Today, the problem seems consistent with turning dnssec on and off on the
router. If enabled, I have problems; if disabled, I get a clean bill of
health out of test-ipv6.com.
- Jim
Dave Taht
2014-04-28 17:03:35 UTC
Post by Jim Gettys
Test with Dual Stack DNS record
timeout (11.882s)
I don't know what this test does. Try a local query over IPv6?

Post by Jim Gettys
DNS server addresses look reasonable for Comcast.
DNS 1: 75.75.75.75
DNS 2: 75.75.76.76
To try to isolate things a little bit, you can turn off fetching the IPv4
DNS servers with

option peerdns '0'

in the wan (ge00) stanza of /etc/config/network

and let the wan6 stanza fetch them.

A packet capture of it working vs not working would be good.

tcpdump -i ge00 -w cap1.cap port 53

Also capture on the local interface.

_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht

NSFW:
https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Dave Taht
2014-04-28 18:37:42 UTC
I have put up a link to two of Jim's captures going to test-ipv6 via Cero,
one with DNSSEC enabled, captured at the local laptop:

http://snapon.lab.bufferbloat.net/~cero2/baddns/

Definitely a lot of missing responses when captured at this end. The local
laptop is using a local dnsmasq forwarder.

It is falling back to trying a recursive lookup on the default domain
(ipv6.test-ipv6.com.home.lan), which does get an NXDOMAIN immediately...
Dave Taht
2014-04-28 18:56:32 UTC
I see A and AAAA requests for "ds.test-ipv6.com" that fail.
Simon Kelley
2014-04-28 19:32:27 UTC
Post by Dave Taht
I see A and AAAA requests for "ds.test-ipv6.com" that fail.
The root of this failure is that DS ds.test-ipv6.com is broken.

<<>> DiG 9.8.1-P1 <<>> @8.8.8.8 ds ds.test-ipv6.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 63751
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ds.test-ipv6.com. IN DS

;; Query time: 1186 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon Apr 28 20:19:34 2014
;; MSG SIZE rcvd: 34

The latest fix I made (when the SERVFAIL reply comes, try the next
possible secure-nonexistent DS record at test-ipv6.com) works sometimes,
but the query above is taking long enough to fail that sometimes the
original requestor has timed out before it gets the answer and tries again.

Neither of the authoritative nameservers for test-ipv6.com returns an
answer to the DS query; they just time out. They do return answers for A
and AAAA queries. That looks broken to me.
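
For anyone who wants to reproduce the "answers A, drops DS" behaviour without dig, here is a minimal sketch in Python using only the standard library. It is an illustration, not what dnsmasq does internally: the function names are made up for this example, it sends a plain query with no EDNS0/DO bit, and it only distinguishes "got a reply with some RCODE" from "timed out".

```python
import socket
import struct

DS_TYPE = 43    # RR type number assigned to DS
IN_CLASS = 1    # class IN


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:  # skip the empty label produced by the root name
            raw = label.encode("ascii")
            out += struct.pack("B", len(raw)) + raw
    return out + b"\x00"


def build_query(name, qtype=DS_TYPE, query_id=0x1234):
    """Build a minimal DNS query: 12-byte header (RD set) plus one question."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, IN_CLASS)
    return header + question


def probe(server, name, timeout=3.0):
    """Send the DS query over UDP; return the RCODE, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None
    if len(reply) < 4:
        return None
    return reply[3] & 0x0F  # low four bits of the flags word hold the RCODE
```

Calling probe("8.8.8.8", "ds.test-ipv6.com") sends the same question as the dig run above; a return value of 2 would correspond to SERVFAIL and None to a silent drop.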

Problems like this have been at the root of most (but not all) of the
DNSSEC failures that have been reported.

Cheers,

Simon.
Aaron Wood
2014-04-28 19:45:28 UTC
This timeout: I'm guessing it comes from older/naive setups that aren't
expecting to support DNSSEC and that, through "over-securing" their setup,
have managed to break the non-existence-proof process?

-Aaron

On Mon, Apr 28, 2014 at 9:32 PM, Simon Kelley <***@thekelleys.org.uk> wrote:

...
Post by Simon Kelley
Neither of authoritative nameservers for test-ipv6.com return answers to
the DS query, they just time out. They do return answers for A and AAAA
queries. That looks broken to me.
Problems like this have been at the root of most (but not all) of the
DNSSEC failures that have been reported.
Phil Pennock
2014-04-28 23:24:59 UTC
Post by Simon Kelley
Post by Dave Taht
I see A and AAAA requests for "ds.test-ipv6.com" that fail.
The root of this failure is that DS ds.test-ipv6.com is broken.
The latest fix I made (when the SERVFAIL reply comes, try the next
possible secure-nonexistent DS record at test-ipv6.com) works sometimes,
but the query above is taking long enough to fail that sometimes the
original requestor has timed out before it gets the answer and tries again.
Er, DS records are authoritative in the parent domain and are equivalent
to glue; they are not expected to exist below the zone cut.

This is why you'll get results from:

$ dig -t ds spodhuis.org @a2.org.afilias-nst.info

but a NOERROR from:

$ dig -t ds spodhuis.org @nsauth.spodhuis.org

An NS query for "ds.test-ipv6.com" gives "test-ipv6.com", so that is the
zone cut. It's therefore in the COM. zone that you should expect to find
any DS records for "test-ipv6.com", and there's no need for a DS for
anything below that unless there's also a zone cut, in which case there's
a DS at the delegation point.

RFC 4033:
----------------------------8< cut here >8------------------------------
3.1. Data Origin Authentication and Data Integrity
[...]
The Delegation Signer (DS) RR type simplifies some of the
administrative tasks involved in signing delegations across
organizational boundaries. The DS RRset resides at a delegation
point in a parent zone and indicates the public key(s) corresponding
to the private key(s) used to self-sign the DNSKEY RRset at the
delegated child zone's apex. The administrator of the child zone, in
turn, uses the private key(s) corresponding to one or more of the
public keys in this DNSKEY RRset to sign the child zone's data. The
typical authentication chain is therefore
DNSKEY->[DS->DNSKEY]*->RRset, where "*" denotes zero or more
DS->DNSKEY subchains. DNSSEC permits more complex authentication
chains, such as additional layers of DNSKEY RRs signing other DNSKEY
RRs within a zone.
----------------------------8< cut here >8------------------------------
Simon Kelley
2014-04-29 13:22:27 UTC
Post by Phil Pennock
Er, DS records are authoritative in the parent domain and are equivalent
to glue; they are not expected to exist below the zone cut.
A NOERROR answer from the authoritative server for test-ipv6.com would
be fine. What actually happens is no answer at all and a timeout (or a
closed TCP connection if TCP is used.)


It's maybe worth expanding on what we're trying to do here. The original
query is "A ds.test-ipv6.com". The answer to that comes back fine, but
there are no RRSIGs proving that that answer is good. Now we have to
distinguish between no signatures because the domain isn't signed, and
no signatures because the answer has come from the Bad Guys.

To do that, we need to find proof (NSEC or NSEC3 records) that a DS
doesn't exist somewhere between ds.test-ipv6.com and the root. Bear in
mind that dnsmasq is a DNS forwarder, not a recursive DNS server, so it
doesn't know where the zone cuts are.

The current strategy is to start at ds.test-ipv6.com and do DS queries.
There are three possible results.

unsigned NOERROR    -> chop the leftmost label off and repeat
DS record           -> definite Bad Guy activity, return BOGUS
signed no DS record -> we expect unsigned original answer, return
                       INSECURE result
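
The three-way decision above can be sketched as a short loop. This is a rough Python model of the strategy as described in this message, not dnsmasq's actual code: the DSAnswer enum, the function name and the query_ds callback are all invented for illustration, and what to do when the walk runs out of labels is not stated in the thread, so the sketch assumes BOGUS there.

```python
from enum import Enum


class DSAnswer(Enum):
    UNSIGNED_NODATA = 1   # NOERROR, no DS, and no signatures on the reply
    DS_PRESENT = 2        # a DS record exists at this name
    SECURE_NODATA = 3     # signed (NSEC/NSEC3) proof that no DS exists


def classify_unsigned_answer(name, query_ds):
    """Walk from `name` towards the root, issuing one DS query per step.

    `query_ds` is a hypothetical callback: it takes a domain name and
    returns a DSAnswer describing the upstream reply.
    """
    labels = name.rstrip(".").split(".")
    while labels:
        answer = query_ds(".".join(labels))
        if answer is DSAnswer.DS_PRESENT:
            return "BOGUS"       # zone is signed, so the data should have been
        if answer is DSAnswer.SECURE_NODATA:
            return "INSECURE"    # proven boundary of the unsigned subtree
        labels = labels[1:]      # unsigned NOERROR: drop the leftmost label
    return "BOGUS"               # assumption: no proof found anywhere
```

Because every step waits on an upstream DS query, a server that silently drops DS queries stalls the whole walk, which matches the timeouts reported at the start of the thread.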


The other alternative approach is to start from the root and add labels,
but that has a problem.

Consider

department.campus.university.edu

where there are zone cuts between university and edu and between
department and campus.

All the zones are signed, so if we look up something under .department,
we expect a signature; if we don't get it, we check

DS .edu gives an answer
DS university.edu gives secure NODATA

secure no DS means that the original unsigned answer should be accepted,
except that it shouldn't. There's no way to distinguish between secure
lack of DS because we've reached an unsigned branch of the tree, and
secure lack of DS because we're not at a zone cut, except if we know
where the zone cuts are, and we don't.


That's why dnsmasq works up from the bottom. The first secure no-DS
answer we find marks the boundary between signed and unsigned tree.

Dnsmasq is acting as a validating stub resolver here. That's a supported
role for DNSSEC, so this must be possible. If it's not then we have a
standards problem.
Post by Phil Pennock
An NS query for "ds.test-ipv6.com" gives "test-ipv6.com", so that is the
zone cut, so it's in the COM. zone that you should expect to find any DS
records for "test-ipv6.com" and there's no need for a DS for anything
below that unless there's also a zone cut, in which case there's a DS at
the delegation point.
Doing NS queries to find zone cuts is a possible solution, but I know of
ISP nameservers that elide the Authority section for "performance".


Simon.
Phil Pennock
2014-04-29 20:57:57 UTC
Post by Simon Kelley
secure no DS means that the original unsigned answer should be accepted,
except that it shouldn't. There's no way to distinguish between secure
lack of DS because we've reached an unsigned branch of the tree, and
secure lack of DS because we're not at a zone cut, except if we know
where the zone cuts are, and we don't.
Fair point.
Post by Simon Kelley
That's why dnsmasq works up from the bottom. The first secure no-DS
answer we find marks the boundary between signed and unsigned tree.
Dnsmasq is acting as a validating stub resolver here. That's a supported
role for DNSSEC, so this must be possible. If it's not then we have a
standards problem.
You have a standards vs reality problem: lots of loadbalancer appliances
suck at DNS and are only just now managing to return errors, instead of
dropping the query (hanging), when queried for AAAA records instead of A
records.

(This has led to no end of pain in the IPv6 world. Happy Eyeballs, and
expectations around improved _client_ behaviour, handle other parts of
the puzzle and tend to require the same concurrency that a client needs
to handle DNS problems, but it's still a distinct issue.)

You're not going to get such loadbalancers responding sanely to a DS
query any time soon, and with the other DNS client software all being
recursors which work fine because they know where zone cuts are, you're
going to be fighting a long hard battle with vendors and sites to get
them to fix their brokenness when "it works for everyone else".

So the standards 100% support what you're doing, but they don't match
common stupidity in deployed (high end, expensive) equipment.

To support DNSSEC in the real world without changing from being a
forwarder, you're going to need new insight. My only thoughts are
around whether or not this might provide impetus for TKEY-based TSIG for
forwarders to establish trust links to recursors elsewhere, in which
case once you have a TSIG key (whether TKEY-derived or OOB manual) then
you might delegate trust to the remote recursor.

Sorry to be the bearer of bad news,
-Phil
Dave Taht
2014-04-30 17:26:21 UTC
Post by Phil Pennock
So the standards 100% support what you're doing, but they don't match
common stupidity in deployed (high end, expensive) equipment.
The only idea I have is to adopt some sort of whitelisting technology,
and simultaneously nag the folk with busted implementations.
Post by Phil Pennock
To support DNSSEC in the real world without changing from being a
forwarder, you're going to need new insight. My only thoughts are
around whether or not this might provide impetus for TKEY-based TSIG for
forwarders to establish trust links to recursors elsewhere, in which
case once you have a TSIG key (whether TKEY-derived or OOB manual) then
you might delegate trust to the remote recursor.
I see there have been a few commits to dnsmasq that address some stuff.
Post by Phil Pennock
Sorry to be the bearer of bad news,
I'm delighted to have got this far.

Is the consensus to not run with negative proofs on at this juncture?
Simon Kelley
2014-05-01 18:37:21 UTC
Post by Dave Taht
Is the consensus to not run with negative proofs on at this juncture?
If you want stuff to just work, turn off negative proofs; if you want to
push the envelope, leave them on and complain to domain-admins.

I had some feeling that something like this might be a problem, hence
the discrete controls.


Cheers,

Simon
Rich Brown
2014-05-01 20:26:35 UTC
Post by Simon Kelley
Post by Dave Taht
On Tue, Apr 29, 2014 at 1:57 PM, Phil Pennock
snip, snip snip...
Post by Simon Kelley
Post by Dave Taht
Is the consensus to not run with negative proofs on at this juncture?
If you want stuff to just work, turn off negative proofs, if you want to
push the envelope, leave them on and complain to domain-admins.
I had some feeling that something like this might be a problem, hence
the discrete controls.
I apologize that I haven't been following this closely, so I'm going to ask a TL;DR question.

Which places in the OpenWrt/CeroWrt GUI (or the config files) do I use to wiggle these levers?

Thanks!

Rich
Dave Taht
2014-05-01 22:27:20 UTC
Post by Rich Brown
Which places in the OpenWrt/CeroWrt GUI (or the config files) do I use to wiggle these levers?
There is no GUI support as yet. Enablement is via /etc/dnsmasq.conf.

I disabled (commented out) the negative proof checks in the 3.10.38-2 release.
Sebastian Moeller
2014-05-02 14:30:27 UTC
Hi List, hi Dave,
Post by Dave Taht
There is no gui support as yet. enablement is via /etc/dnsmasq.conf
I disabled (commented out) the negative proof checks in the 3.10.38-2 release.
So, I installed this just now, and to my amazement it picked up my ISP's DNS servers immediately; unlike with the last two(?) releases, I did not have to resort to Google's DNS servers. So it looks like the Deutsche Telekom setup is not ready for full DNSSEC (at least not when trying to use the DNS server on the primary DT router...).

Best Regards
Sebastian
Simon Kelley
2014-05-01 18:35:12 UTC
Post by Phil Pennock
You're not going to get such loadbalancers responding sanely to a DS
query any time soon, and with the other DNS client software all being
recursors which work fine because they know where zone cuts are, you're
going to be fighting a long hard battle with vendors and sites to get
them to fix their brokenness when "it works for everyone else".
A valid point, but "every leaf system has to be a recursor" is not a
pleasant outcome of widely implementing DNSSEC. I wonder, do the
browser-based validators suffer from this, or are they recursors under
the hood? This is a judgement for integrators, not for me, but if
there's anything widely deployed enough to act as a lever to get this
fixed, it's dnsmasq.
Post by Phil Pennock
So the standards 100% support what you're doing, but they don't match
common stupidity in deployed (high end, expensive) equipment.
To support DNSSEC in the real world without changing from being a
forwarder, you're going to need new insight. My only thoughts are
around whether or not this might provide impetus for TKEY-based TSIG for
forwarders to establish trust links to recursors elsewhere, in which
case once you have a TSIG key (whether TKEY-derived or OOB manual) then
you might delegate trust to the remote recursor.
That's nice, but it needs recursors to play ball too, so it's even
further into the indefinite future than what we have now.
Post by Phil Pennock
Sorry to be the bearer of bad news,
Better to know.


Cheers,

Simon.
James Cloos
2014-05-02 16:40:16 UTC
Permalink
SK> A valid point, but "every leaf system has to be a recursor" is not a
SK> pleasant outcome of widely implementing DNSSEC.
Anders Kaseorg
2014-10-03 09:28:35 UTC
Permalink
I just ran into this timeout behavior myself while testing the new
DNSSEC support in OpenWrt 14.07 (dnsmasq 2.71-4). After staring at the
problem for a few hours, I think there’s something wrong with your
justification.
Post by Simon Kelley
To do that, we need to find proof (NSEC or NSEC3 records) that a DS
doesn't exist somewhere between ds.test-ipv6.com and the root. Bear in
mind that dnsmasq is a DNS forwarder, not a recursive DNS server, so it
doesn't know where the zone cuts are.
The current strategy is to start at ds.test-ipv6.com and do DS queries.
There are three possible results.
unsigned NOERROR -> chop one label off the RHS and repeat
DS record -> definite Bad Guy activity, return BOGUS
signed no DS record -> we expect unsigned original answer, return
INSECURE result.ds.test-ipv6.com
This bottom-up algorithm also seems to have a security problem that’s
just as bad as one with the top-down algorithm that you rejected below.
Consider the same department.campus.university.edu example, where
campus and edu are signed zones, and university is not a zone.

• An attacker forges an evil response for A department, and forges an
unsigned NODATA for DS department.
• dnsmasq chops off one label, and the attacker forges an unsigned
NODATA for DS campus.
• dnsmasq chops off another label, and gets the legitimately signed
NODATA for DS university.
• dnsmasq incorrectly concludes that everything inside university is
expected to be unsigned, and returns the INSECURE evil response.

So if nothing else, the top-down algorithm seems less impractical and
equally insecure. And maybe we can fix it; see below.
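
To make the attack above concrete, here is a toy model of the bottom-up search in Python. It is only a sketch of the three rules quoted earlier, not dnsmasq's actual code: the function name, the constants, and the idea of passing DS answers in as a lookup table are all made up for illustration, and the "honest" table assumes department is a signed zone as in the original example.

```python
# Toy model of the bottom-up DS search (illustrative only, not dnsmasq code).
SIGNED_NODATA = "signed NODATA"      # validated proof that no DS exists here
UNSIGNED_NODATA = "unsigned NODATA"  # reply without signatures (possibly forged)
DS_PRESENT = "DS record"             # a DS record exists at this name


def bottom_up(qname, ds_answer):
    """Start at qname and move towards the root, one label at a time.

    ds_answer(name) is a hypothetical callback giving the outcome of a
    DS query for `name` as one of the three constants above.
    """
    labels = qname.split(".")
    for i in range(len(labels)):
        name = ".".join(labels[i:])
        result = ds_answer(name)
        if result == DS_PRESENT:
            return ("BOGUS", name)      # unsigned answer below a secure delegation
        if result == SIGNED_NODATA:
            return ("INSECURE", name)   # taken as "everything below is unsigned"
        # UNSIGNED_NODATA: move up to the parent name and ask again
    return ("BOGUS", ".")               # assumption: no proof found anywhere


QNAME = "department.campus.university.edu"

# What the DNS would say without an attacker (university is not a zone).
honest = {
    "department.campus.university.edu": DS_PRESENT,
    "campus.university.edu": DS_PRESENT,
    "university.edu": SIGNED_NODATA,
    "edu": DS_PRESENT,
}

# The attacker forges unsigned NODATA for the two lowest names.
attacked = dict(honest)
attacked["department.campus.university.edu"] = UNSIGNED_NODATA
attacked["campus.university.edu"] = UNSIGNED_NODATA

print(bottom_up(QNAME, honest.get))
print(bottom_up(QNAME, attacked.get))
```

In this model the forged replies are never checked; they just push the search up until it meets the genuine signed NODATA at university.edu, which is the step where the evil response gets accepted as INSECURE.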
Post by Simon Kelley
The other alternative approach is to start from the root and add labels,
but that has a problem.
Consider
department.campus.university.edu
where there are zone cuts between university and edu and between
department and campus.
All the zones are signed, so if we look up something under .department,
we expect a signature, if we don't get it, we check
DS .edu gives an answer
DS university.edu gives secure NODATA
secure no DS means that the original unsigned answer should be accepted,
except that it shouldn't. There's no way to distinguish between secure
lack of DS because we've reached an unsigned branch of the tree, and
secure lack of DS because we're not at a zone cut, except if we know
where the zone cuts are, and we don't.
Having just looked through RFC 5155 for clues: isn’t that the purpose of
the NS type bit in the NSEC3 record? In this example, DS university
would give an NSEC3 record with the NS bit clear. That signals that we
should go down a level and query DS campus. In this case we find a
signed DS there. But if we were to find an NSEC3 with the NS bit set,
then we’d know that we’ve really found an unsigned zone and can stop
going down.
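
A rough Python sketch of that top-down walk, under the assumption that each DS query yields either a signed DS, a signed denial whose type bitmap we can inspect for the NS bit, or something we cannot validate. The names and the table-driven `ds_answer` callback are hypothetical, and NSEC3 opt-out is not modelled at all.

```python
# Toy model of a top-down DS search using the NS type bit (illustrative only).
DS_PRESENT = "signed DS"
NODATA_NS_CLEAR = "signed NODATA, NS bit clear"  # name is not a zone cut
NODATA_NS_SET = "signed NODATA, NS bit set"      # delegation without DS
UNSIGNED = "unsigned or unvalidatable reply"


def top_down(qname, ds_answer):
    """Walk from the TLD down towards qname, one label at a time.

    ds_answer(name) is a hypothetical callback giving the outcome of a
    DS query for `name` as one of the four constants above.
    """
    labels = qname.split(".")
    for depth in range(1, len(labels) + 1):
        name = ".".join(labels[-depth:])
        result = ds_answer(name)
        if result == NODATA_NS_SET:
            return ("INSECURE", name)  # proven unsigned delegation: stop here
        if result == UNSIGNED:
            return ("BOGUS", name)     # still inside the signed tree, no proof
        # DS_PRESENT or NODATA_NS_CLEAR: still signed, go down one level
    return ("BOGUS", qname)            # signed all the way: answer needed a signature


QNAME = "department.campus.university.edu"

all_signed = {
    "edu": DS_PRESENT,
    "university.edu": NODATA_NS_CLEAR,   # not a zone, so no NS at this name
    "campus.university.edu": DS_PRESENT,
    "department.campus.university.edu": DS_PRESENT,
}

# Variant where campus really is an unsigned delegation.
insecure_campus = dict(all_signed)
insecure_campus["campus.university.edu"] = NODATA_NS_SET

# Variant where an attacker forges an unsigned reply for DS campus.
forged = dict(all_signed)
forged["campus.university.edu"] = UNSIGNED

print(top_down(QNAME, all_signed.get))
print(top_down(QNAME, insecure_campus.get))
print(top_down(QNAME, forged.get))
```

The point of the sketch is that a signed denial with the NS bit clear no longer ends the search, so the non-zone at university cannot be mistaken for the edge of the signed tree.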

Anders
V***@vt.edu
2014-10-03 17:28:05 UTC
Permalink
This bottom-up algorithm also seems to have a security problem that’s
just as bad as one with the top-down algorithm that you rejected below.
Consider the same department.campus.university.edu example, where
campus and edu are signed zones, and university is not a zone.
This issue is why trust anchors were devised so people could start deploying
DNSSEC before stuff like .COM got signed.
Anders Kaseorg
2014-10-03 21:35:08 UTC
Permalink
This bottom-up algorithm also seems to have a security problem that’s
just as bad as one with the top-down algorithm that you rejected
below. Consider the same department.campus.university.edu example,
where campus and edu are signed zones, and university is not a zone.
This issue is why trust anchors were devised so people could start
deploying DNSSEC before stuff like .COM got signed.
No, you’re misreading. Trust anchors address the case where
campus.university.edu is a signed zone and university.edu is an unsigned
zone. We’re talking about the case where university.edu is not a zone at
all, so that campus.university.edu is served directly from the edu zone.

Obviously this won’t happen at the real edu zone, but real examples exist:
env.state.ma.us, state.ma.us, and us are signed zones, and ma.us is not a
zone.

Anders
Anders Kaseorg
2014-10-04 21:45:46 UTC
Permalink
secure no DS means that the original unsigned answer should be
accepted, except that it shouldn't. There's no way to distinguish
between secure lack of DS because we've reached an unsigned branch of
the tree, and secure lack of DS because we're not at a zone cut,
except if we know where the zone cuts are, and we don't.
Having just looked through RFC 5155 for clues: isn’t that the purpose of
the NS type bit in the NSEC3 record? In this example, DS university
would give an NSEC3 record with the NS bit clear. That signals that we
should go down a level and query DS campus. In this case we find a
signed DS there. But if we were to find an NSEC3 with the NS bit set,
then we’d know that we’ve really found an unsigned zone and can stop
going down.
Aha: and this is exactly the answer given at
http://tools.ietf.org/html/rfc6840#section-4.4 .

Anders
Simon Kelley
2015-01-08 16:34:11 UTC
Permalink
OK, it's taken some time, but with this insight, I've recoded the
relevant stuff to look for the limits of the signed DNS tree from the
DNS root down. That's clearly the correct way to do it, and should
avoid the original problem here, caused by sending DNSSEC queries to
DNSSEC-unaware servers in the unsigned parts of the tree.

This was quite a big change, and it could do with some serious
testing. Available now on the dnsmasq git repo, or as 2.73test3 in a
tarball.

There are other DNSSEC fixes in there too. Check the changelog.


Cheers,

Simon.
Post by Anders Kaseorg
Post by Anders Kaseorg
Post by Simon Kelley
secure no DS means that the original unsigned answer should be
accepted, except that it shouldn't. There's no way to
distinguish between secure lack of DS because we've reached an
unsigned branch of the tree, and secure lack of DS because
we're not at a zone cut, except if we know where the zone cuts
are, and we don't.
Having just looked through RFC 5155 for clues: isn’t that the
purpose of the NS type bit in the NSEC3 record? In this example,
DS university would give an NSEC3 record with the NS bit clear.
That signals that we should go down a level and query DS campus.
In this case we find a signed DS there. But if we were to find
an NSEC3 with the NS bit set, then we’d know that we’ve really
found an unsigned zone and can stop going down.
Aha: and this is exactly the answer given at
http://tools.ietf.org/html/rfc6840#section-4.4 .
Anders
Dave Taht
2015-01-08 17:44:00 UTC
Permalink
Wow, this thread goes back a ways. Is ds.test-ipv6.com still
configured wrong, and does it pass now? It passes for me (but I am
behind a more modern openwrt box right now)

Is there another site that demonstrates this problem?

BTW: For a while there (on comcast), in production, I ran with pure
ipv6 for dns (it reduced ipv4 nat pressure significantly!), but it
hung after a few days and I never got back to it. Were any problems
like this experienced and/or fixed for dnsmasq in the past 8 months or
so?

Anyway... enough incremental fixes have landed all across the board in
openwrt, and the chaos calmer process seems to have settled down
enough, to consider doing an entirely updated cerowrt based on 3.14
and pushing things like dnsmasq further forward...

... but I, personally, am still not in a position to easily build
and test a new dnsmasq package for cerowrt and have no funding or time
for further development based on chaos calmer. Hopefully someone else
in the openwrt or cerowrt world can take up the slack. I see that
several bleeding edge sub-distros of openwrt have also emerged on
their forums...

(Yet.... I will still try to produce a test dnsmasq version from the
cerowrt-3.10 tree but I doubt it would be safe to do an opkg update
for it.)
OK, it's taken some time, but with this insight, I've recoded the
relevant stuff to look for the limits of the signed DNS tree from the
DNS root down. That's clearly the correct way to do it, and should
avoid the original problem here, caused by sending DNSSEC queries to
DNSSEC-unaware servers in the unsigned parts of the tree.
This was quite a big change, and it could do with some serious
testing. Available now on the dnsmasq git repo, or as 2.73test3 in a
tarball.
There are other DNSSEC fixes in there too, Check the changelog.
Cheers,
Simon.
Post by Anders Kaseorg
Post by Anders Kaseorg
Post by Simon Kelley
secure no DS means that the original unsigned answer should be
accepted, except that it shouldn't. There's no way to
distinguish between secure lack of DS because we've reached an
unsigned branch of the tree, and secure lack of DS because
we're not at a zone cut, except if we know where the zone cuts
are, and we don't.
Having just looked through RFC 5155 for clues: isn’t that the
purpose of the NS type bit in the NSEC3 record? In this example,
DS university would give an NSEC3 record with the NS bit clear.
That signals that we should go down a level and query DS campus.
In this case we find a signed DS there. But if we were to find
an NSEC3 with the NS bit set, then we’d know that we’ve really
found an unsigned zone and can stop going down.
Aha: and this is exactly the answer given at
http://tools.ietf.org/html/rfc6840#section-4.4 .
Anders
--
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
Simon Kelley
2015-01-08 18:07:49 UTC
Permalink
Post by Dave Taht
Wow, this thread goes back a ways. Is ds.test-ipv6.com still
configured wrong, and does it pass now? It passes for me (but I am
behind a more modern openwrt box right now)
ds.test-ipv6.com is still showing the same behaviour it was back in
April (!) as far as I can see. Queries to test-ipv6.com (which is what
tripped up Jim in the first place) work fine on the latest dnsmasq
code, forwarding to 8.8.8.8.
Post by Dave Taht
Is there another site that demonstrates this problem?
There were three in Aaron Wood's original posting (subject: Had to
disable dnssec today)

- Bank of America (sso-fi.bankofamerica.com)
- Weather Underground (cdnjs.cloudflare.com)
- Akamai (e3191.dscc.akamaiedge.net.0.1.cn.akamaiedge.net)


All three work for me with the new code. I didn't try old dnsmasq, to
see if the repair was from that or the DNS configuration.
Post by Dave Taht
BTW: For a while there (on comcast), in production, I ran with
pure ipv6 for dns (it reduced ipv4 nat pressure significantly!),
but it hung after a few days and I never got back to it. Were any
problems like this experienced and/or fixed for dnsmasq in the past
8 months or so?
http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=5782649ad95382dd558df97b33b64e854d8789fb

is a possible candidate.
Post by Dave Taht
Anyway... enough incremental fixes have landed all across the board
in openwrt, and the chaos calmer process seems to have settled
down enough, to consider doing an entirely updated cerowrt based on
3.14 and pushing things like dnsmasq further forward...
... but I, personally, am still, not in the position to easily
build and test a new dnsmasq package for cerowrt and have no
funding or time for further development based on chaos calmer.
Hopefully someone else in the openwrt or cerowrt world can take up
the slack. I see that several bleeding edge sub-distros of openwrt
have also emerged on their forums...
(Yet.... I will still try to produce a test dnsmasq version from
the cerowrt-3.10 tree but I doubt it would be safe to do an opkg
update for it.)
There shouldn't be any non backwards-compatible changes in dnsmasq to
bite you. Don't know about other stuff.


Cheers,

Simon.
Post by Dave Taht
On Thu, Jan 8, 2015 at 8:34 AM, Simon Kelley
OK, it's taken some time, but with this insight, I've recoded the
relevant stuff to look for the
limits of the signed DNS tree from the DNS root down. That's
clearly the correct way to do it, and should avoid the original
problem here, caused by sending DNSSEC queries to DNSSEC-unaware
servers in the unsigned parts of the tree.
This was quite a big change, and it could do with some serious
testing. Available now on the dnsmasq git repo, or as 2.73test3 in
a tarball.
There are other DNSSEC fixes in there too, Check the changelog.
Cheers,
Simon.
Post by Anders Kaseorg
Post by Anders Kaseorg
Post by Simon Kelley
secure no DS means that the original unsigned answer
should be accepted, except that it shouldn't. There's no
way to distinguish between secure lack of DS because
we've reached an unsigned branch of the tree, and secure
lack of DS because we're not at a zone cut, except if we
know where the zone cuts are, and we don't.
Having just looked through RFC 5155 for clues: isn’t that
the purpose of the NS type bit in the NSEC3 record? In
this example, DS university would give an NSEC3 record with
the NS bit clear. That signals that we should go down a
level and query DS campus. In this case we find a signed DS
there. But if we were to find an NSEC3 with the NS bit
set, then we’d know that we’ve really found an unsigned
zone and can stop going down.
Aha: and this is exactly the answer given at
http://tools.ietf.org/html/rfc6840#section-4.4 .
Anders
Dave Taht
2015-01-08 19:52:23 UTC
Permalink
OK, I built this latest dnsmasq as a test for cerowrt-3.10-50 users:

login to the router
cd /tmp
wget http://snapon.lab.bufferbloat.net/~cero2/dnsmasq/dnsmasq-full_2.73-3_ar71xx.ipk
opkg install ./dnsmasq-full_2.73-3_ar71xx.ipk
(ignore the warnings about not overwriting several files)

I did a few tests on the formerly problematic dnssec sites and
everything looked kosher. As for the variety of dnssec testing web
sites.... about half seem to be down or misbehaving. Sigh. The ongoing
cost of keeping core internet test tools going strikes again...

In an orgy of self-flagellation, and *only because I have native ipv6*
I also turned off dns queries over ipv4 entirely (this is option
peerdns '0' in /etc/config/network on cerowrt's ge00 config), and
will pound on it a few days/weeks. I send this email prior to actually
trying that, however....
Post by Dave Taht
Wow, this thread goes back a ways. Is ds.test-ipv6.com still
configured wrong, and does it pass now? It passes for me (but I am
behind a more modern openwrt box right now)
ds.test-ipv6.com is still showing the same behaviour it was back in
April (!) as far as I can see.
My bad. The "modern openwrt" I am behind does not have the dnsmasq-full
package installed.
Queries to test-ipv6.com (which is what
tripped up Jim in the first place) work fine on the latest dnsmasq
code, forwarding to 8.8.8.8.
Post by Dave Taht
Is there another site that demonstrates this problem?
There were three in Aaron Wood's original posting (subject: Had to
disable dnssec today)
- Bank of America (sso-fi.bankofamerica.com)
- Weather Underground (cdnjs.cloudflare.com)
- Akamai (e3191.dscc.akamaiedge.net.0.1.cn.akamaiedge.net)
All three work for me with the new code. I didn't try old dnsmasq, to
see if the repair was from that or the DNS configuration.
Post by Dave Taht
BTW: For a while there (on comcast), in production, I ran with
pure ipv6 for dns (it reduced ipv4 nat pressure significantly!),
but it hung after a few days and I never got back to it. Were any
problems like this experienced and/or fixed for dnsmasq in the past
8 months or so?
http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=5782649ad95382dd558df97b33b64e854d8789fb
is a possible candidate.
K.
Post by Dave Taht
Anyway... enough incremental fixes have landed all across the board
in openwrt, and the chaos calmer process seems to have settled
down enough, to consider doing an entirely updated cerowrt based on
3.14 and pushing things like dnsmasq further forward...
... but I, personally, am still, not in the position to easily
build and test a new dnsmasq package for cerowrt and have no
funding or time for further development based on chaos calmer.
Hopefully someone else in the openwrt or cerowrt world can take up
the slack. I see that several bleeding edge sub-distros of openwrt
have also emerged on their forums...
(Yet.... I will still try to produce a test dnsmasq version from
the cerowrt-3.10 tree but I doubt it would be safe to do an opkg
update for it.)
There shouldn't be any non backwards-compatible changes in dnsmasq to
bite you. Don't know about other stuff.
So far so good.
Cheers,
Simon.
Post by Dave Taht
On Thu, Jan 8, 2015 at 8:34 AM, Simon Kelley
OK, it's taken some time, but with this insight, I've recoded the
relevant stuff to look for the
limits of the signed DNS tree from the DNS root down. That's
clearly the correct way to do it, and should avoid the original
problem here, caused by sending DNSSEC queries to DNSSEC-unaware
servers in the unsigned parts of the tree.
This was quite a big change, and it could do with some serious
testing. Available now on the dnsmasq git repo, or as 2.73test3 in
a tarball.
There are other DNSSEC fixes in there too, Check the changelog.
Cheers,
Simon.
Post by Anders Kaseorg
Post by Anders Kaseorg
Post by Simon Kelley
secure no DS means that the original unsigned answer
should be accepted, except that it shouldn't. There's no
way to distinguish between secure lack of DS because
we've reached an unsigned branch of the tree, and secure
lack of DS because we're not at a zone cut, except if we
know where the zone cuts are, and we don't.
Having just looked through RFC 5155 for clues: isn’t that
the purpose of the NS type bit in the NSEC3 record? In
this example, DS university would give an NSEC3 record with
the NS bit clear. That signals that we should go down a
level and query DS campus. In this case we find a signed DS
there. But if we were to find an NSEC3 with the NS bit
set, then we’d know that we’ve really found an unsigned
zone and can stop going down.
Aha: and this is exactly the answer given at
http://tools.ietf.org/html/rfc6840#section-4.4 .
Anders
--
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
Dave Taht
2015-01-09 08:52:13 UTC
Permalink
I was able to lock up this version of dnsmasq twice: 100% cpu usage.
No syscalls were visible from strace during the lockup. The lockups
occurred once almost immediately after boot, and a second time after a
few hours of casual usage, with only ipv6 upstreams, on cero-3.10.50-1.

Furthermore, the only thing that kills it is a kill -9. I will build a
non-stripped version in the morning... (and I do note that I was
testing two things: one, ipv6 upstreams only, and two, dnssec. Prior
to this version I was using both ipv4 and ipv6 upstreams, no issues,
had dnssec on also, usually no issues)

Other suggestions for debugging the causes of a lockup requested (log
all queries?)
Simon Kelley
2015-01-09 15:36:20 UTC
Permalink
A backtrace is the most important starting point. A query log _if_
it's query dependent, but that seems unlikely since it doesn't break
when forwarding to IPv4. An easy way to reproduce would be great :-)

I can do the same tests here, but it's a bit risky, since my IPv6 is
via a SixXS tunnel. If the tunnel goes down, it needs to do DNS
queries to bring it back up.

Cheers,

Simon.
Post by Dave Taht
I was able to lock up this version of dnsmasq twice: 100% cpu
usage. No syscalls were visible from strace during the lockup.
Lockups occurred once on nearly at boot, and the second time, after
a few hours of casual usage, with only ipv6 upstreams, on
cero-3.10.50-1.
furthermore, the only thing that kills it is a kill -9. I will
build a non-stripped version in the morning... (and I do note that
I was testing two things - one ipv6 upstreams only, and two,
dnssec. Prior to this version I was using both ipv4 and ipv6
upstreams, no issues, had dnssec on also, usually no issues)
Other suggestions for debugging the causes of a lockup requested
(log all queries?)
Simon Kelley
2015-01-09 16:49:46 UTC
Permalink
An interesting observation: my IPv6 connectivity is via a SixXS tunnel.

Resolving isc.org through dnsmasq w/DNSSEC to Google's IPv6 DNS
servers times out, because dnsmasq never gets a reply to its
query for the DNSKEY RRset for org. This reply (when signed) is
1600-or-so bytes. Running dnsmasq with --edns-packet-max=1280 makes it
work.

The tunnel MTU is 1280.
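
The arithmetic behind that observation can be sketched in a few lines of Python. This is a simplified model, not a description of what dnsmasq or Google's servers actually did: it assumes plain IPv6 plus UDP headers with no extension headers, assumes a 4096-byte advertised size for the "before" case, and assumes the usual DNS behaviour where a reply larger than the advertised EDNS size comes back truncated and is retried over TCP. The function name and return labels are made up.

```python
# Rough model of how the advertised EDNS size interacts with the path MTU.
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
UDP_HEADER = 8    # bytes


def udp_reply_plan(reply_bytes, edns_max, path_mtu):
    """Classify what happens to a DNS reply of reply_bytes bytes.

    edns_max is the UDP payload size the client advertises;
    path_mtu is the smallest MTU on the path (the tunnel here).
    """
    if reply_bytes > edns_max:
        # Server sends a short truncated reply instead; client retries on TCP.
        return ("tcp-retry", None)
    packet_bytes = reply_bytes + UDP_HEADER + IPV6_HEADER
    if packet_bytes > path_mtu:
        # Needs IPv6 fragmentation, which has to survive the whole path.
        return ("udp-fragmented", packet_bytes)
    return ("udp-single-packet", packet_bytes)


# ~1600-byte DNSKEY reply, large advertised size (assumed 4096), 1280 MTU.
print(udp_reply_plan(1600, 4096, 1280))

# Same reply with --edns-packet-max=1280.
print(udp_reply_plan(1600, 1280, 1280))
```

Under these assumptions the first case needs fragments to cross the tunnel and the second avoids large UDP replies altogether. The same arithmetic also shows that only replies of up to 1280 - 48 = 1232 bytes fit in a single packet on this path, so a limit of 1280 does not rule out fragmentation for every reply size.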


Simon.
Post by Dave Taht
I was able to lock up this version of dnsmasq twice: 100% cpu
usage. No syscalls were visible from strace during the lockup.
Lockups occurred once on nearly at boot, and the second time, after
a few hours of casual usage, with only ipv6 upstreams, on
cero-3.10.50-1.
furthermore, the only thing that kills it is a kill -9. I will
build a non-stripped version in the morning... (and I do note that
I was testing two things - one ipv6 upstreams only, and two,
dnssec. Prior to this version I was using both ipv4 and ipv6
upstreams, no issues, had dnssec on also, usually no issues)
Other suggestions for debugging the causes of a lockup requested
(log all queries?)
Dave Taht
2015-01-09 21:34:49 UTC
Permalink
I strongly suspect an ipv6 fragmentation handling bug in the kernel
version cerowrt uses. I have tons of evidence pointing to that now,
starting with some tests run last year from iwl and also the tests
that Netalyzr was doing. And: I just locked up the box completely
while doing some dnssec stuff.

I will go through the kernel git logs and see what has happened there since 3.10.50.

Turning on the edns-packet-max feature now, however, as I lack time to
poke into this in more detail, and we're supposed to be testing dnssec
as it is....
Simon Kelley
2015-01-10 15:37:07 UTC
Permalink
OK, that's useful, but not good. The last thing DNSSEC/IPv6 needs is
yet another reason why network access which used to work now doesn't.

edns-packet-max=1280 seems to be working fine here. Please let me know
if you find anything more.

Cheers,

Simon.
Post by Dave Taht
I strongly suspect an ipv6 fragmentation handling bug in the
kernel version cerowrt uses. Have tons of evidence pointing to that
now, starting with some tests run last year from iwl and also the
tests that netalyzer was doing. And: I just locked up the box
completely while doing some dnssec stuff.
will go through kernel git logs and see what has happened there since 3.10.50.
Turning on the edns-packet-max feature now, however, as I lack time
to poke into this in more detail, and we're supposed to be testing
dnssec as it is....