DNS model and terminology

The DNS model and terminology used in the RFCs are idiosyncratic, to say the least. It doesn't help, moreover, that the RFCs explain most things in terms of the way that BIND happens to implement them.

For example: A "zone file" is just an implementation detail. It is the way that one particular DNS content server happens to store the DNS content that it serves. It's not necessary to the operation of the DNS that everyone use "zone files". The type of data storage that a content server chooses to use is entirely irrelevant as far as DNS clients are concerned. They have no way of finding out what type of data storage a content server uses. Yet the RFCs explain things in terms of zone files.

The following sections explain the somewhat more straightforward and conventional terminology used by the Internet Utilities, and provide an overview of the model of DNS operation that they aim to support.

The DNS as a table

The Internet DNS is one of the world's largest distributed databases. In concept, the DNS is one single table with four fields:

Name                  Class     Type               Data
a.root-servers.orsc.  Internet  IP4 address        199.166.24.12
nmsu.edu.             Internet  SMTP Relay server  128.123.34.234
                                                   128.123.34.235

The first three fields of the table form the key. Duplicate keys are not allowed. The fourth field contains a set of zero or more data. A key with an empty data field (i.e. a set with zero members) is a distinct case from a key that has no matching record in the table.
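
As a minimal sketch of this table model, the key can be modelled as a tuple and the data field as a set. The names `dns_table` and `lookup`, and the `empty.example.` row, are illustrative assumptions and not part of the Internet Utilities.

```python
# Illustrative model of the DNS table: (name, class, type) -> set of data.
# `dns_table`, `lookup` and the "empty.example." row are hypothetical.
dns_table = {
    ("a.root-servers.orsc.", "Internet", "IP4 address"): {"199.166.24.12"},
    ("nmsu.edu.", "Internet", "SMTP Relay server"): {
        "128.123.34.234",
        "128.123.34.235",
    },
    # A key that exists in the table but whose data set has zero members.
    ("empty.example.", "Internet", "IP4 address"): set(),
}


def lookup(name, rr_class, rr_type):
    """Return the data set for a key, or None when no record matches."""
    return dns_table.get((name, rr_class, rr_type))
```

Because a dictionary cannot hold duplicate keys, the "no duplicate keys" rule holds by construction, and `lookup` returning an empty set is distinguishable from `lookup` returning `None`.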

No single host stores the entire DNS table. Instead, only subsets of the table are ever stored anywhere. Each domain owner provides one or more content DNS servers running on machines connected to the Internet. These servers serve up a table of their own, which includes within it a subset of the DNS table. The DNS table would be formed by taking the tables served up by all of the content servers in the world, extracting the subsets of those tables that are determined by administrative boundaries, and constructing the union of those subsets.

This union is never in fact constructed. Instead, there is a procedure, known as "query resolution", for locating individual pieces of the overall DNS table. This procedure involves descending a path down a tree structure of content servers, starting from one of a set of known pre-defined root content DNS servers.

Records in the table are used for many purposes, the purpose generally dictating the value of the type field used in the key. The type A, for example, is used to map hostnames onto a set of (zero or more) IP version 4 addresses, stored in the data field. Similarly, the type AAAA is used to map hostnames onto a set of (zero or more) IP version 6 addresses.

Servers

The RFCs make the unfortunate unstated assumption that all DNS servers are fundamentally the same. This is an entirely unwarranted assumption. There are in fact several distinct rôles that a DNS server can perform.

DNS servers are categorised into two classes: content servers and proxy servers.

Both sorts of server can use caching. Only proxy servers have their own caches of resource record sets, however. Content servers usually rely on the caching done by the database engine, or the ordinary file caching of the operating system upon which they are running.

The RD and AUTH bits are superfluous

When the rôles of content servers and proxy servers are distinguished from each other, it becomes clear that the RD and AUTH bits in DNS message packets are superfluous. Whether or not the sender of a query will receive a recursive response, and whether or not the data in a response are from an authoritative source, are implicit in, and solely determined by, the type of server that the sender has chosen to communicate with in the first place.

The DNS client library DLL in the Internet Utilities and the back end of the Forwarding Caching Proxy Server always set the RD bit to 1 in queries that they send, even though they always expect to be talking to a proxy server. Similarly, the back end of the Resolving Caching Proxy Server always sets the RD bit to 0 in queries that it sends, even though it always expects to be talking to content servers. They all do this in order to allow for the existence of badly designed software where a single huge program vainly tries to perform both rôles.

The DNS servers in the Internet Utilities ignore the RD bit in queries sent to them. Irrespective of what the client actually specifies, content servers will act as if the RD bit was set to 0 and proxy servers will act as if the RD bit was set to 1. All responses from content servers will have the RD bit set to 0, the RA bit set to 0, and the AUTH bit set to 1; and all responses from proxy servers will have the RD bit set to 1, the RA bit set to 1, and the AUTH bit set to 0.
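
The rule above can be sketched as a function of the server's rôle alone. This is an illustration under assumed names (`ResponseFlags`, `response_flags` and the rôle strings are hypothetical), not the code of the Internet Utilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseFlags:
    rd: int    # "recursion desired" bit echoed in the response
    ra: int    # "recursion available" bit
    auth: int  # "authoritative answer" bit


def response_flags(server_role, query_rd):
    """Return the response flags for a server rôle.

    `query_rd` is accepted only to show that it is deliberately ignored.
    """
    if server_role == "content":
        return ResponseFlags(rd=0, ra=0, auth=1)
    if server_role == "proxy":
        return ResponseFlags(rd=1, ra=1, auth=0)
    raise ValueError(f"unknown server role: {server_role!r}")
```

Whatever value the client supplies for the RD bit, the result depends only on which kind of server was asked.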

Queries and responses

DNS clients communicate with DNS servers using UDP datagrams or TCP connections. In both cases, the server is expected to be listening on port number 53. Most transactions are short, and therefore involve only UDP datagrams. TCP connections are used for transactions whose data would overflow the 512 byte limit (imposed by the RFCs) for DNS datagrams, and for certain specific types of transactions.
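
To make the two transports concrete, here is a small sketch that encodes a query message and frames it for TCP. The function names are hypothetical. The layout used (a 12-byte header, the RD bit at 0x0100 in the flags word, and a two-byte length prefix on DNS/TCP) follows the standard DNS message format; nothing here is taken from the Internet Utilities themselves.

```python
import struct

DNS_PORT = 53    # port on which servers are expected to listen
UDP_LIMIT = 512  # DNS/UDP datagram size limit from the RFCs, in bytes


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(query_id, name, qtype, rd):
    """Build a query with one question; class 1 is the Internet class."""
    flags = 0x0100 if rd else 0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    return header + question


def frame_for_tcp(message):
    """Prefix a message with its two-byte length, as DNS/TCP requires."""
    return struct.pack("!H", len(message)) + message
```

A query such as `build_query(0x1234, "nmsu.edu.", 1, True)` is far below `UDP_LIMIT`, which is why most transactions need only a single UDP datagram in each direction.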

The responses from DNS servers are categorised into two classes: complete responses and partial responses.

Proxy servers always return complete responses to their clients. Fully resolving proxy servers return complete responses because they accumulate information from content servers until they can construct a complete response. Forwarding proxy servers return complete responses for the simple reason that their back ends should always be configured to talk to fully resolving proxy servers and they pass along what they receive.

Content servers return either complete responses or partial responses.

Truncation

When a DNS/UDP response cannot be made to fit within the DNS/UDP datagram size limit imposed by the RFCs (512 bytes), the response is sent with the "truncated" flag in its header set to 1.

The DNS protocol gives no guarantee that DNS/UDP responses that have the "truncated" flag set to 1 will contain complete resource record sets (and, indeed, some other DNS server software, such as BIND, does send partial resource record sets when it truncates responses). Thus the entire contents of a "truncated" response are untrustworthy and are not used by reasonable proxy DNS servers.

Since the resource records in a "truncated" response are not useful, there is no point in including them. (And, moreover, doing so would only needlessly duplicate traffic that is about to be sent via DNS/TCP anyway.) Thus when the "truncated" flag is set to 1 in a response, it is best practice to eliminate all resource records from the response.
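
A minimal sketch of this practice, assuming the question and the resource records have already been encoded to bytes (the function name and its return shape are hypothetical):

```python
UDP_LIMIT = 512   # DNS/UDP datagram size limit from the RFCs, in bytes
HEADER_SIZE = 12  # fixed size of a DNS message header, in bytes


def build_udp_response(question, records):
    """Return (truncated, records_to_send) for a DNS/UDP response.

    `question` is the encoded question section and `records` is a list of
    encoded resource records. If everything fits, all records are sent.
    Otherwise the "truncated" flag is set and no records are sent at all.
    """
    total = HEADER_SIZE + len(question) + sum(len(r) for r in records)
    if total <= UDP_LIMIT:
        return False, list(records)
    return True, []
```

The all-or-nothing choice means that a client never has to guess whether a resource record set in a DNS/UDP response is complete.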

Bailiwicks

Clients do not trust content servers. They only consult them as authorities on particular subsets of the entire domain name namespace.

The trusted subset is determined by the content server that referred the client to the server in the first place. The referral indicates the portion of the namespace that the referred-to content DNS server should be trusted to serve up: its bailiwick.

As such, content servers cannot know what parts of the namespace clients will expect them to be authorities on. (They may be the subject of so-called "lame delegations", for example, where another content server refers clients to them for particular domains without their knowledge or agreement.)

Indeed, content servers rely on this. They provide complete responses for the entire namespace, and expect clients to only be referred to them for a subset of that namespace. In effect, they have a complete DNS database table of their own that covers the entire namespace, and the delegation of authority from other content servers selects one or more groups of rows from that table, a subset, for inclusion into the overall DNS database.

It is the responsibility of DNS clients to filter out the parts of a response from a content server that are outside of the server's bailiwick. This filtration prevents poisoning.

Poisoning

Poisoning is a technique used by the malicious whereby information about parts of the domain name namespace not owned by the administrator of a content DNS server is injected into the responses returned by that server. It relies on the fact that until relatively recently, clients and caching proxy DNS servers were naïve enough to retain that information and make use of it.

For example, the content DNS server for warez.com. might return, in the responses to questions about names in warez.com., extra information about www.ibm.com. giving false A data with a lifetime of several days. If believed by the DNS client, this would effectively redirect software on the client machines that was trying to contact www.ibm.com. to another machine. This machine could run an HTTP server, which both software and users would believe was the HTTP server for IBM.

Poisoning is avoided by ignoring (not caching and not passing along) any part of the response from a content DNS server that is outside of its bailiwick. The warez.com. content DNS server may publish whatever it likes about www.ibm.com., but everyone will ignore it, because the delegation pointing to it gives warez.com. as its bailiwick and www.ibm.com. is outside of that bailiwick.

Content DNS servers that serve multiple domains and that "helpfully" add extraneous information about names in other domains (usually NS record sets) to every response that they send are doing so in vain, as that additional information will be discarded because it is considered to be outside of the server's bailiwick. Such servers are wasting bandwidth, and should be reconfigured to not attach the additional, unrequested, information.

It is important to note that www.ibm.com. is within the bailiwicks of the ibm.com. content DNS servers, the com. content DNS servers, and the . content DNS servers. Any of those three sets of servers can provide information about www.ibm.com. that will be trusted.
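
The bailiwick test and the filtration that prevents poisoning can be sketched as follows. The helper names and the `(name, type, data)` record layout are assumptions made for illustration; the comparison is done label by label so that, for instance, notibm.com. is not mistaken for part of ibm.com.

```python
def labels(name):
    """Split a domain name into lowercase labels; the root yields []."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def in_bailiwick(name, bailiwick):
    """True if `name` is at or below `bailiwick` in the namespace tree."""
    name_labels = labels(name)
    zone_labels = labels(bailiwick)
    if len(zone_labels) > len(name_labels):
        return False
    # The root bailiwick "." has no labels and contains every name.
    return not zone_labels or name_labels[-len(zone_labels):] == zone_labels


def filter_response(records, bailiwick):
    """Keep only records whose owner name lies inside the bailiwick.

    Each record is a hypothetical (name, type, data) tuple.
    """
    return [record for record in records if in_bailiwick(record[0], bailiwick)]
```

With these definitions, `in_bailiwick("www.ibm.com.", "warez.com.")` is false, whereas the same name is inside the bailiwicks `ibm.com.`, `com.` and `.`.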

"Zones"

In the Domain Name System, a "zone" comprises an apex, a point in the DNS namespace tree, and everything below that apex in the tree. Zones are conventionally named after the domain names at their apices.

"Zones" can overlap. Content DNS servers construct the overall DNS database that they serve up by trimming the overlapping portions and constructing a single database that is served to the public. Resolving proxy DNS servers construct the overall, global, DNS database, which never exists as a whole in any single place, by selecting parts of the databases served by the individual content servers and stitching them together subject to bailiwick rules.

Clients

There are two classes of DNS clients:

Client-server interactions

The interactions between clients and servers form a chain:

                                                                ┌──────────┐
                                                              ┌>│  Content │
                                                              │ │   Server │
                                                              │ └──────────┘
                                                              │
 ┌────────┐    ┌────────────┐                  ┌───────────┐  │ ┌──────────┐
 │ Client │<──>│ Forwarding │<─────.......────>│ Resolving │<─┼>│  Content │
 │        │    │    Proxy   │                  │   Proxy   │  │ │   Server │
 └────────┘    └────────────┘                  └───────────┘  │ └──────────┘
                                                              │
                                                              │ ┌──────────┐
                                                              └>│  Content │
                                                                │   Server │
                                                                └──────────┘

The application program using the DNS includes a lightweight client, usually in a library that the application links to. This lightweight client consults a configuration file that lists a fixed set of DNS servers. The lightweight client sends all queries to those servers.

The DNS server that the lightweight clients talk to is not a content server. Lightweight clients expect to be able to query the entire DNS, and no content server stores the entire DNS. Instead, lightweight clients talk to proxy servers. These proxy servers in turn communicate with other servers.

The chain of proxy servers may be arbitrarily long. It comprises zero or more "forwarding" proxy servers and a single "resolving" proxy server at the end of the chain. It is that server at the end of the chain that actually determines where to find the parts of the DNS table that are being requested.

The resolving proxy server is the server that communicates with the content servers. Content servers do not communicate with one another directly (apart from when multiple content servers for a single domain use the old and inefficient method of "zone transfers" to ensure that they all have identical tables). Instead, content servers provide referrals to one another in their answers to the resolving proxy server. The resolving proxy server follows these referrals by switching communication to the referred-to content servers.
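
The way a resolving proxy server follows referrals can be sketched with a toy, in-memory set of content servers. Everything named here (the server names, the table layout, `query_content_server` and `resolve`) is hypothetical, and the sketch leaves out caching, bailiwick filtering and the network entirely.

```python
# Hypothetical content servers: each holds some answers and some referrals.
CONTENT_SERVERS = {
    "root-server": {"referrals": {"com.": "com-server"}, "answers": {}},
    "com-server": {"referrals": {"example.com.": "example-server"}, "answers": {}},
    "example-server": {
        "referrals": {},
        "answers": {("www.example.com.", "A"): {"192.0.2.1"}},
    },
}


def query_content_server(server, name, rr_type):
    """Return ("answer", data_or_None) or ("referral", next_server)."""
    table = CONTENT_SERVERS[server]
    if (name, rr_type) in table["answers"]:
        return ("answer", table["answers"][(name, rr_type)])
    for zone, next_server in table["referrals"].items():
        if name == zone or name.endswith("." + zone):
            return ("referral", next_server)
    return ("answer", None)


def resolve(name, rr_type, root="root-server", max_referrals=16):
    """Descend from a root content server, following referrals.

    Returns the data found (or None) and the list of servers consulted.
    """
    server = root
    path = [server]
    for _ in range(max_referrals):
        reply = query_content_server(server, name, rr_type)
        if reply[0] == "answer":
            return reply[1], path
        server = reply[1]
        path.append(server)
    raise RuntimeError("too many referrals")
```

Calling `resolve("www.example.com.", "A")` consults the three servers in turn. The content servers never contact one another; only the resolving proxy switches between them.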


The Internet Utilities are © Copyright Jonathan de Boyne Pollard. "Moral" rights are asserted.