DNS Explained: A Complete Breakdown of How the Domain Name System Functions
In this blog post, I explain the core concepts of DNS and all the components involved in a DNS query.
Before we start talking about DNS, let us understand why it exists and what problem it solves.
The need for DNS is exactly the same as the need for addressing systems in the real world. If somebody asks where your house is, you give them a human-readable address (door number, apartment name, street, city, state, and so on); you wouldn’t give them a latitude and longitude, right? That works when you are communicating with humans. But what if you are communicating with a machine that understands only latitude and longitude, such as your mobile phone?
In that case you need a translator to convert human-readable addresses into latitudes and longitudes. Map providers such as Google Maps and Mapbox do exactly this: they translate a human-readable address into coordinates, so you can search for a location by its address and your phone can navigate to it using the translated coordinates.
Similarly, in the internet world, humans understand human-readable web addresses like www.scribbledtech.com, while computers understand IP addresses like 162.214.80.37 (IPv4) or 2404:6800:4007:808::200e (IPv6). This human-readable address is known as the Domain Name.
So, as in the real world, we need a translator, and DNS was developed for exactly this translation, i.e. human-readable addresses to IP addresses or the other way around. DNS is massive and highly scalable; it handles billions of resolution requests per day. If you are interested in how it started and how it became as big as it is today, you can read the history section of this article.
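A domain name has three parts: the Top-level Domain (com, org, in, etc.), the Second-level Domain, and the Sub-domain. In www.scribbledtech.com, com is the Top-level Domain, scribbledtech is the Second-level Domain, and www is the Sub-domain.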
Every fully qualified domain name ends with a dot (.), even though we enter the URL in the browser without one. The dot is appended internally when the system makes a DNS query: when we enter www.scribbledtech.com, the name sent in the DNS query is www.scribbledtech.com. (with the trailing dot). So what is this dot? It represents the root of the DNS hierarchy, which is served by the ROOT servers.
The domain name is resolved from right to left, so www.scribbledtech.com. is analyzed and queried in the order . → com → scribbledtech → www. The dot represents the ROOT level, com represents the TLD level, and so on.
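The label order described above can be sketched in a few lines of Python. This is only an illustration; the function name resolution_order is my own and not part of any DNS library.

```python
def resolution_order(name: str) -> list[str]:
    """Return the labels of a domain name in the order they are resolved."""
    # Append the trailing dot (the root) if the caller left it out.
    fqdn = name if name.endswith(".") else name + "."
    # Split into labels, ignoring the empty piece after the trailing dot.
    labels = fqdn.rstrip(".").split(".") if fqdn != "." else []
    # Resolution starts at the root and walks the labels right to left.
    return ["."] + labels[::-1]


print(resolution_order("www.scribbledtech.com"))
# ['.', 'com', 'scribbledtech', 'www']
```

The same result comes back whether or not the caller types the trailing dot, which mirrors how the dot is appended internally.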
Broadly, we can classify DNS servers into two categories:
1. Recursive Name Servers:-
These are the servers that help resolve domain names by querying repeatedly until they get the final answer, i.e. the IP address. These servers do not own any data, but they cache the resolved data to respond faster.
Eg:- Your home router, LAN DNS server, ISP (Internet Service Provider) DNS server, Google DNS servers (8.8.8.8, 8.8.4.4), Cloudflare DNS servers (1.1.1.1, 1.0.0.1). Anybody in the world can run this kind of server; no authorization is required. Many companies therefore run their own internal DNS servers to enforce policies and security.
2. Authoritative Name Servers:-
These are the servers that hold the actual mapping information. Since the Domain Name System is huge in terms of size and traffic, these servers are organized hierarchically. You can visualize this as a tree data structure where the dot (.) is the ROOT level and TLDs like com, org, and net are at the second level.
These two levels are coordinated and regulated globally by the Internet Corporation for Assigned Names and Numbers (ICANN) and the Internet Assigned Numbers Authority (IANA). Each node at these levels is itself a distributed system with many instances across the world. Levels below these are owned by different companies that are authorized by the TLD operators. Most of the time, these are the companies from which you buy domain names for your website, such as GoDaddy, BigRock, and Google (domains.google).
Now let us talk about the actual DNS resolution process in the real world, i.e. what happens when you type a domain name like www.scribbledtech.com in the browser’s URL bar and hit Enter.
- Your browser checks whether the IP mapping is available locally in the hosts file, the browser cache, and the system cache, in that order. If it is available, the browser uses that IP.
- If the mapping is not present locally, the system appends a dot to the end of the name (www.scribbledtech.com.) and sends a DNS resolution request to the DNS server (Recursive Name Server) already configured in the system. Most of the time this will be your home router or company router. The Deep Dive section of this article covers the different options and how this is configured.
- The DNS server checks whether the IP mapping is available in its cache. If it is, it returns the mapping; otherwise it passes the query on to the next DNS server. This can be another Recursive Name Server or one of the ROOT servers of the Authoritative Name Servers, depending on the configuration. Most of the time the router (Recursive Name Server) sends the request to the Internet Service Provider’s DNS server (another Recursive Name Server).
Usually, several intermediate Recursive Name Servers can be involved in the resolution process, depending on your location and how many ISPs the request has to hop through to reach the ROOT server. For simplicity, we will consider the ISP DNS server as the last Recursive Name Server in this use case. Don’t worry, the process is the same in all Recursive Name Servers: check the cache and return the mapping if present, otherwise pass the query on to the next DNS server.
- Suppose the ISP DNS server (the last Recursive Name Server in this example) does not have the mapping in its cache. It then sends a query to a ROOT DNS server (Authoritative Name Server). Unlike Recursive Name Servers, Authoritative Name Servers respond with either the answer or a lead to the answer.
Here a lead means the address of another server that might know the answer. An Authoritative Name Server does not cache responses from other servers, although it might cache data from its own database. Since the ROOT servers keep only the TLD server details, the ROOT server returns the IP addresses of the TLD servers. In this use case, it returns the IP addresses of the com TLD servers.
- After getting the response from the ROOT DNS server, the ISP DNS server makes another query to the TLD DNS servers. These servers return either the IP address of the actual name server that holds the IP mapping, or a reference to a name server’s domain name in the same or a different TLD (org, net, etc.). More on this in the Deep Dive section. For now, we will assume the TLD server returns the IP address of the name server that holds the domain’s IP mapping.
- The ISP DNS server then makes another query to the domain’s name server and receives in the response the IP address where the actual website runs. This mapping is saved in the ISP DNS cache before the response is sent back to the calling server. Similarly, the mapping is saved, with a running TTL, in all the intermediate Recursive Name Servers through which the request passed.
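The steps above can be sketched as a small in-memory simulation. This is a toy model under stated assumptions, not a real resolver: the SERVERS table, the "referral"/"answer" tags, and the function names are all hypothetical, TTLs are ignored, and no network traffic is sent. The IP addresses are the ones that appear in the dig outputs later in this article.

```python
ROOT_SERVER = "198.41.0.4"       # a.root-servers.net
TLD_SERVER = "192.48.79.30"      # j.gtld-servers.net
DOMAIN_SERVER = "162.159.24.72"  # ns1.bluehost.in

# Toy zone data (an assumption for illustration): suffix -> (kind, value).
SERVERS = {
    ROOT_SERVER: {"com.": ("referral", TLD_SERVER)},
    TLD_SERVER: {"scribbledtech.com.": ("referral", DOMAIN_SERVER)},
    DOMAIN_SERVER: {"www.scribbledtech.com.": ("answer", "162.214.80.37")},
}


def ask(server_ip, fqdn):
    """Return the most specific entry the server holds for fqdn, or None."""
    zone = SERVERS[server_ip]
    matches = [s for s in zone if fqdn == s or fqdn.endswith("." + s)]
    return zone[max(matches, key=len)] if matches else None


def resolve(name, cache):
    """Walk ROOT -> TLD -> domain name server; return (ip, servers contacted)."""
    fqdn = name if name.endswith(".") else name + "."
    if fqdn in cache:                  # answered from the resolver's cache
        return cache[fqdn], []
    server, contacted = ROOT_SERVER, []
    while True:
        contacted.append(server)
        reply = ask(server, fqdn)
        if reply is None:
            raise LookupError(f"{server} knows nothing about {fqdn}")
        kind, value = reply
        if kind == "answer":
            cache[fqdn] = value        # remember the mapping for next time
            return value, contacted
        server = value                 # follow the lead to the next server


cache = {}
print(resolve("www.scribbledtech.com", cache))  # contacts three servers
print(resolve("www.scribbledtech.com", cache))  # served from the cache
```

The second call returns an empty list of contacted servers, which is the whole point of caching in the Recursive Name Servers.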
Note:- Most DNS requests are sent over UDP because it is fast and lightweight (a classic DNS message over UDP is limited to 512 bytes unless the EDNS extension is used). Encrypted transports also exist: Google Public DNS, for example, supports DNS over TLS and DNS over HTTPS. You can find more info about this in their official doc.
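To get a feel for how small a DNS request is, here is a minimal sketch that builds the bytes of a standard query using only the struct module. The layout (a 12-byte header followed by the question) follows the classic DNS message format; the function name build_query and the fixed query ID are my own choices for illustration.

```python
import struct


def build_query(name: str, query_id: int = 0x1234, qtype: int = 1) -> bytes:
    """Build a DNS query message for name (qtype 1 = A record)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answer, 0 authority, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte (root).
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE and QCLASS (1 = IN, the internet class).
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_query("www.scribbledtech.com")
print(len(packet))  # 39
```

The whole request is 39 bytes, which fits comfortably in a single UDP datagram. Sending it with a UDP socket to port 53 of a resolver would return a response in the same wire format; I left that part out so the sketch runs without network access.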
DNS Deep Dive
I know that you have many questions about the entire setup, so in this section we will dig deep by asking those questions. You might also want to try a real-world DNS resolution on your own. Don’t worry, I will try to connect all the missing dots in this section.
1. DNS name caching with running TTL
As I mentioned earlier, DNS records get cached in many places. It starts with your browser, where they are stored in the Host Resolver Cache; already-connected keep-alive sockets also let the browser skip a lookup. In Chrome, you can inspect this at chrome://net-internals/#dns
Next is your local system cache, also known as the OS cache. The OS has one more place where we can set the resolution manually, the /etc/hosts file. This is a simple text file with key-value pairs. You can override any host name in the world and assign an IP to it locally.
A typical hosts file looks like the one below. Here localhost is mapped to the loopback IPv4 and IPv6 addresses, and there is a custom entry where I mapped google.com to my local machine. Now on my computer, google.com points to whatever is running on my system.
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
127.0.0.1 google.com # custom entry
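To make the key-value nature of the hosts file concrete, here is a minimal sketch of a parser for the entries shown above. The name parse_hosts is hypothetical, and the sketch reads from a string instead of the real /etc/hosts so that it runs anywhere.

```python
HOSTS_TEXT = """\
##
# Host Database
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
127.0.0.1 google.com # custom entry
"""


def parse_hosts(text):
    """Map each host name to the list of addresses assigned to it."""
    table = {}
    for line in text.splitlines():
        # Everything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the address, the rest are names for it.
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), []).append(address)
    return table


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["google.com"])  # ['127.0.0.1']
print(hosts["localhost"])   # ['127.0.0.1', '::1']
```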
So the order of resolution in the local system is hosts file → browser cache → OS cache. Outside our local system, DNS records are cached in almost all the intermediate Recursive Name Servers (router, local DNS server, ISP DNS server, etc.) before the query reaches the ROOT Authoritative Name Servers. Typically, a DNS query is answered from a Recursive Name Server’s cache before it ever reaches an Authoritative Name Server.
We need to talk about one more important aspect before we wind up this caching section: the running TTL. When you map your domain name to an IP address in a registrar’s portal like GoDaddy or BigRock, you are asked to enter a TTL along with it. The TTL is the maximum time any cache in the world may keep this mapping before refreshing the record. A typical mapping on the registrar’s portal would look like the one below.
The TTL is a running TTL, i.e. when a Recursive Name Server returns the answer from its cache, it sends the remaining TTL instead of the original value. For example, a record with an original TTL of 300 seconds that has been sitting in the cache for 20 seconds is served with a TTL of 280 seconds (300 - 20).
In this way, all the caches in the request chain stay in sync and expire at about the same time. There are caveats, though: because of network delay and the processing delay of different DNS servers, records are sometimes cached a little longer than the TTL configured by the domain owner.
In practice, the TTL is an indicative value, i.e. it indicates the maximum time in the cache. Different DNS servers use different algorithms to keep their data up to date and consistent, so the TTL you see might be quite different from the configured one, but it rarely exceeds it.
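The running TTL is easy to model. Below is a minimal sketch of a cache that hands out the remaining TTL rather than the original one. The class name TtlCache and the injectable clock are my own assumptions; real resolvers are far more involved.

```python
import time


class TtlCache:
    """Cache that serves each record with its remaining (running) TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        """Return (value, remaining_ttl), or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return value, int(remaining)


# A fake clock makes the example deterministic.
now = [1000.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("scribbledtech.com.", "162.214.80.37", ttl=300)
now[0] += 20                            # 20 seconds pass
print(cache.get("scribbledtech.com."))  # ('162.214.80.37', 280)
```

A downstream cache that stores this answer would start from 280 seconds, not 300, which is how the whole chain ends up expiring at about the same time.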
Using a simple dig command you can observe the running TTL, e.g.
📂 ~ ▶︎ dig scribbledtech.com
; <<>> DiG 9.10.6 <<>> scribbledtech.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27035
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;scribbledtech.com. IN A
;; ANSWER SECTION:
scribbledtech.com. 11421 IN A 162.214.80.37
;; Query time: 10 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Wed Jan 11 07:17:07 IST 2023
;; MSG SIZE rcvd: 62
Execute it again
📂 ~ ▶︎ dig scribbledtech.com
; <<>> DiG 9.10.6 <<>> scribbledtech.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1607
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;scribbledtech.com. IN A
;; ANSWER SECTION:
scribbledtech.com. 11233 IN A 162.214.80.37
;; Query time: 3 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Wed Jan 11 07:20:17 IST 2023
;; MSG SIZE rcvd: 62
The second column in the answer section is the TTL, and you can see that the value in the second response (11233) is lower than in the first (11421). The value keeps decreasing until it hits zero, at which point the record is refreshed and the TTL resets.
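If you want to compare TTLs programmatically, the answer lines printed by dig are easy to split into fields. The sketch below parses the two answer lines from the outputs above; the Record type and parse_record function are hypothetical helpers, not part of dig.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    data: str


def parse_record(line: str) -> Record:
    """Split one answer-section line into its five fields."""
    name, ttl, rclass, rtype, data = line.split(None, 4)
    return Record(name, int(ttl), rclass, rtype, data)


first = parse_record("scribbledtech.com. 11421 IN A 162.214.80.37")
second = parse_record("scribbledtech.com. 11233 IN A 162.214.80.37")
print(first.ttl - second.ttl)  # 188
```

The TTL dropped by 188 seconds, which roughly matches the 190 seconds between the two runs shown in the WHEN lines.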
2. How does our computer know the immediate DNS server IP in the first place?
Even to connect to a DNS server, the system needs that server’s IP address. Where is it configured, and how?
On Linux-based systems, the IP address of the immediate DNS server is configured in the /etc/resolv.conf file. By default, most systems automatically fetch the IP of the connected network router and configure it as the immediate DNS server unless you configure it otherwise. You can add more DNS server entries there, and the system will fall back to the next one in the list if the previous one does not respond. A customized resolv.conf file looks like the one below.
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
nameserver 1.1.1.1
nameserver 1.0.0.1
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 192.168.1.1
In the above file, the first four nameserver entries were entered manually and the fifth is the automatically fetched router IP address. Since some DNS server IPs are configured manually, DNS requests bypass the DNS configuration in the router and go directly to the name servers in the list.
Here bypass refers only to the DNS configuration; the request still goes through the router, as it is your gateway to the internet. Most home systems have only the fifth entry in the file, and the router is responsible for finding the DNS server.
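The fallback behaviour described for the nameserver list can be sketched as follows. This is an illustration only: parse_nameservers and first_responding are hypothetical names, the responds callback stands in for a real network probe, and an actual system resolver may limit how many entries it uses.

```python
RESOLV_CONF = """\
# manually configured servers first, router last
nameserver 1.1.1.1
nameserver 1.0.0.1
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 192.168.1.1
"""


def parse_nameservers(text):
    """Return the nameserver addresses in the order they are listed."""
    servers = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def first_responding(servers, responds):
    """Try servers in order and return the first one that responds."""
    for server in servers:
        if responds(server):
            return server
    return None


servers = parse_nameservers(RESOLV_CONF)
# Pretend the two Cloudflare servers are unreachable.
print(first_responding(servers, lambda ip: not ip.startswith("1.")))  # 8.8.8.8
```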
So let’s talk about how the router gets the IP of its DNS server. Most routers automatically fetch the DNS server IP from the ISP, much like our system fetches the IP address of the router. You can also configure custom IP addresses in the router.
Configuring a custom DNS server IP is very common in organizations that run their own local DNS server. A typical router configuration window looks like the one below.
So configuring a custom DNS entry on a local computer makes only that computer use the custom DNS server, whereas configuring it in the router makes all the devices in the LAN use it. Most organizations go with the latter approach.
3. How does the Recursive Name Server know the IP of the first Authoritative Name Server (ROOT Server)?
After possibly passing through a chain of Recursive Name Servers, the query reaches the last Recursive Name Server in the chain, which has to contact the first Authoritative Name Server, i.e. a ROOT server. To do that, it maintains the IP addresses of all the ROOT servers.
How it does so depends on the Recursive Name Server’s implementation. A common approach is to keep a configuration file with the IP addresses of all 13 ROOT servers and to use a technique called priming to refresh that list in case any of those addresses change (more on the 13 ROOT servers in the next section).
This is a topic for another day; if you are interested in reading more, you can refer to the Root File and RFC 8109.
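As a rough sketch of the configuration-file approach, the snippet below parses a short excerpt written in the style of a root hints file into a lookup table. The excerpt is illustrative and covers only two of the 13 servers; parse_root_hints is a hypothetical helper. A real resolver would go on to send a priming query to one of these addresses to fetch the current list.

```python
ROOT_HINTS = """\
; illustrative excerpt in the style of a root hints file
.                     3600000  NS    A.ROOT-SERVERS.NET.
A.ROOT-SERVERS.NET.   3600000  A     198.41.0.4
A.ROOT-SERVERS.NET.   3600000  AAAA  2001:503:ba3e::2:30
.                     3600000  NS    C.ROOT-SERVERS.NET.
C.ROOT-SERVERS.NET.   3600000  A     192.33.4.12
C.ROOT-SERVERS.NET.   3600000  AAAA  2001:500:2::c
"""


def parse_root_hints(text):
    """Map each ROOT server name to its IPv4 and IPv6 addresses."""
    addresses = {}
    for line in text.splitlines():
        # ';' starts a comment in this file format.
        fields = line.split(";", 1)[0].split()
        if len(fields) == 4 and fields[2] in ("A", "AAAA"):
            name, _ttl, _rtype, address = fields
            addresses.setdefault(name.lower(), []).append(address)
    return addresses


hints = parse_root_hints(ROOT_HINTS)
print(hints["c.root-servers.net."])  # ['192.33.4.12', '2001:500:2::c']
```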
4. How are these massive distributed Authoritative Name Servers maintained and regulated?
The overarching organization responsible for the ROOT servers and TLDs is the Internet Corporation for Assigned Names and Numbers (ICANN). It makes the global policies for DNS and has members from all over the world, including companies, governments, etc. ICANN delegates some of its responsibilities to its sub-organization, the Internet Assigned Numbers Authority (IANA).
The ROOT servers are a network of hundreds of servers located in many countries across the world, configured in the DNS ROOT zone as 13 named authorities and managed by 12 different operators. The names of these authorities are labeled alphabetically, a.root-servers.net. through m.root-servers.net., as given below.
HOSTNAME | IP ADDRESSES | OPERATOR |
a.root-servers.net | 198.41.0.4, 2001:503:ba3e::2:30 | Verisign, Inc. |
b.root-servers.net | 199.9.14.201, 2001:500:200::b | University of Southern California, Information Sciences Institute |
c.root-servers.net | 192.33.4.12, 2001:500:2::c | Cogent Communications |
d.root-servers.net | 199.7.91.13, 2001:500:2d::d | University of Maryland |
e.root-servers.net | 192.203.230.10, 2001:500:a8::e | NASA (Ames Research Center) |
f.root-servers.net | 192.5.5.241, 2001:500:2f::f | Internet Systems Consortium, Inc. |
g.root-servers.net | 192.112.36.4, 2001:500:12::d0d | US Department of Defense (NIC) |
h.root-servers.net | 198.97.190.53, 2001:500:1::53 | US Army (Research Lab) |
i.root-servers.net | 192.36.148.17, 2001:7fe::53 | Netnod |
j.root-servers.net | 192.58.128.30, 2001:503:c27::2:30 | Verisign, Inc. |
k.root-servers.net | 193.0.14.129, 2001:7fd::1 | RIPE NCC |
l.root-servers.net | 199.7.83.42, 2001:500:9f::42 | ICANN |
m.root-servers.net | 202.12.27.33, 2001:dc3::35 | WIDE Project |
As I mentioned, each of these root servers has many instances across the world; if you are interested in the list, you can refer to root-servers.org, which gives the details of all the instances of all the root servers. The ROOT server database contains routing info such as the IP addresses for over 1500 TLDs, including Internationalized Domain Names (IDNs).
Now let’s talk about TLDs. As I mentioned earlier, there are more than 1500 TLDs, and you can check the list of publicly available TLDs here. Generally, TLDs are classified into 5 categories, as explained below.
- Generic Top-level Domains (gTLD) :- These contain 3 or more characters and are open for registration by anyone. Some examples of gTLDs are .com, .org, .net, .info, .biz, etc. Apart from .org, .com, and .net, however, other gTLDs have some restrictions, i.e. registrations should meet standards set by the regulators.
- Sponsored Top-level Domains (sTLD) :- These TLDs are proposed and maintained by businesses, government agencies, or other organizations. The final authority to decide who can register under these TLDs also lies with these organizations. Some examples of sTLDs are .edu, .gov, .mil, .museum, .jobs, etc.
- Country Code Top-level Domains (ccTLD) :- These TLDs represent a country. Some examples of ccTLDs are .in, .uk, .us, .fr, .ca, etc. There are more than 300 ccTLDs now and counting. It is very common for big organizations to use ccTLDs for their websites in specific countries, e.g. Amazon uses .in (www.amazon.in) for its business in India.
- Infrastructure Top-Level Domain (ARPA) :- This is a reserved category and currently has only one TLD (.arpa). This domain is dedicated to the Internet Engineering Task Force (IETF) under the guidance of the Internet Architecture Board (IAB). It is not open for the public to register and is managed by IANA directly.
- Test Top-Level Domains (tTLD) :- These TLDs are reserved by the IETF for documentation and testing purposes and do not actually exist in DNS, i.e. they are not part of the ROOT server database.
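The five categories above can be turned into a rough classifier. This is a heuristic sketch, not an official registry lookup: the SPONSORED set contains only the examples listed above, the RESERVED_FOR_TESTING set is my assumption based on the names reserved in RFC 2606, and the two-letter rule covers only ASCII country codes.

```python
SPONSORED = {"edu", "gov", "mil", "museum", "jobs"}              # examples only
RESERVED_FOR_TESTING = {"test", "example", "invalid", "localhost"}  # RFC 2606


def classify_tld(tld: str) -> str:
    """Return a rough category for a top-level domain."""
    tld = tld.lower().lstrip(".")
    if tld == "arpa":
        return "infrastructure"
    if tld in RESERVED_FOR_TESTING:
        return "test"
    if tld in SPONSORED:
        return "sTLD"
    # Two-letter ASCII TLDs are country codes.
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return "ccTLD"
    return "gTLD"


for tld in (".com", ".in", ".edu", ".arpa", ".test"):
    print(tld, classify_tld(tld))
```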
5. How do the TLD servers get the Authoritative Name Servers address of the specific domain name?
To understand this, we should understand how and where we register a domain, and how this information is saved in DNS to make it available globally. Normally, when you want to buy a domain, you go to a website that sells domain names, e.g. GoDaddy, BigRock, etc.
These sites are called Registrars, i.e. organizations accredited to sell domain names to the public. The organizations that maintain the TLDs are called Registries, e.g. VeriSign manages the .com TLD, i.e. it is the only authority that sets the rules for that TLD and its extensions, and it works with the registrars to sell the domain names.
A Registrar and the Authoritative Name Server for a specific domain are different things. The Registrar’s job is to sell the domain and send the Authoritative Name Server information for that domain to the TLD servers. The Authoritative Name Server’s job is to maintain the IP mapping records of that domain.
Most of the time, registrars provide their own Authoritative Name Server and set it as the default. But you can change the name servers from the dashboard and point them to a completely different server. In this case, the Registrar updates the new name server details in the TLD servers, and the TLD servers return the new name server details in DNS query responses.
As a rule, a minimum of two name servers should be configured for each domain name. Like the Recursive Name Servers, and unlike the ROOT and TLD servers, these name servers can be owned by anyone in the world. A typical dashboard entry for the name servers in a registrar’s portal would look like the one below.
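The relationship between the registrar’s dashboard and the TLD servers can be modelled with a small class. This is a toy stand-in under my own assumptions: the Registry class and its methods are hypothetical, and the ns1.example.net and ns2.example.net names are placeholders for a different DNS provider.

```python
class Registry:
    """Toy stand-in for the delegation table kept by a TLD registry."""

    def __init__(self, tld):
        self.tld = tld
        self.delegations = {}  # domain -> list of name server host names

    def set_name_servers(self, domain, name_servers):
        """What a registrar does when you edit name servers in the dashboard."""
        if not domain.endswith("." + self.tld):
            raise ValueError(f"{domain} is not under .{self.tld}")
        if len(set(name_servers)) < 2:
            raise ValueError("at least two name servers are required")
        self.delegations[domain] = list(name_servers)

    def referral(self, query_name):
        """What the TLD server returns for a name under a registered domain."""
        for domain, name_servers in self.delegations.items():
            if query_name == domain or query_name.endswith("." + domain):
                return name_servers
        return None


registry = Registry("com")
registry.set_name_servers("scribbledtech.com", ["ns1.bluehost.in", "ns2.bluehost.in"])
print(registry.referral("www.scribbledtech.com"))

# Pointing the domain at a different (placeholder) DNS provider.
registry.set_name_servers("scribbledtech.com", ["ns1.example.net", "ns2.example.net"])
print(registry.referral("www.scribbledtech.com"))
```

After the second call, every new query that reaches the TLD server is referred to the new name servers, while caches elsewhere keep the old referral until its TTL runs out.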
6. What do the request and response of a DNS query look like?
Now it is time for some hands-on. We can use the normal dig command to mimic the calls.
📂 ~ ▶︎ dig www.scribbledtech.com
; <<>> DiG 9.10.6 <<>> www.scribbledtech.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35514
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.scribbledtech.com. IN A
;; ANSWER SECTION:
www.scribbledtech.com. 14400 IN CNAME scribbledtech.com.
scribbledtech.com. 14400 IN A 162.214.80.37
;; Query time: 485 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Sat Jan 14 08:03:12 IST 2023
;; MSG SIZE rcvd: 80
The response to the query dig www.scribbledtech.com is fully resolved, hence it has an answer section. Most probably the answer came from the cache of one of the Recursive Name Servers, and these details are abstracted away from you.
Let’s leave the Recursive Name Server’s response here, as there is no difference in the response from any of these Recursive Name Servers. We are interested in how the ROOT server responds with the TLD address, how the TLD server responds with the domain’s Authoritative Name Server address, and so on. For that, we need to act as the last Recursive Name Server in the chain, the one that contacts the ROOT servers.
As we already know, this server has the addresses of the ROOT name servers. A dig query from it and the response from a ROOT server look like the one below.
📂 ~ ▶︎ dig @c.root-servers.net www.scribbledtech.com
; <<>> DiG 9.10.6 <<>> @c.root-servers.net www.scribbledtech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9056
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 27
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.scribbledtech.com. IN A
;; AUTHORITY SECTION:
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
;; ADDITIONAL SECTION:
m.gtld-servers.net. 172800 IN A 192.55.83.30
l.gtld-servers.net. 172800 IN A 192.41.162.30
k.gtld-servers.net. 172800 IN A 192.52.178.30
j.gtld-servers.net. 172800 IN A 192.48.79.30
i.gtld-servers.net. 172800 IN A 192.43.172.30
h.gtld-servers.net. 172800 IN A 192.54.112.30
g.gtld-servers.net. 172800 IN A 192.42.93.30
f.gtld-servers.net. 172800 IN A 192.35.51.30
e.gtld-servers.net. 172800 IN A 192.12.94.30
d.gtld-servers.net. 172800 IN A 192.31.80.30
c.gtld-servers.net. 172800 IN A 192.26.92.30
b.gtld-servers.net. 172800 IN A 192.33.14.30
a.gtld-servers.net. 172800 IN A 192.5.6.30
m.gtld-servers.net. 172800 IN AAAA 2001:501:b1f9::30
l.gtld-servers.net. 172800 IN AAAA 2001:500:d937::30
k.gtld-servers.net. 172800 IN AAAA 2001:503:d2d::30
j.gtld-servers.net. 172800 IN AAAA 2001:502:7094::30
i.gtld-servers.net. 172800 IN AAAA 2001:503:39c1::30
h.gtld-servers.net. 172800 IN AAAA 2001:502:8cc::30
g.gtld-servers.net. 172800 IN AAAA 2001:503:eea3::30
f.gtld-servers.net. 172800 IN AAAA 2001:503:d414::30
e.gtld-servers.net. 172800 IN AAAA 2001:502:1ca1::30
d.gtld-servers.net. 172800 IN AAAA 2001:500:856e::30
c.gtld-servers.net. 172800 IN AAAA 2001:503:83eb::30
b.gtld-servers.net. 172800 IN AAAA 2001:503:231d::2:30
a.gtld-servers.net. 172800 IN AAAA 2001:503:a83e::2:30
;; Query time: 139 msec
;; SERVER: 192.33.4.12#53(192.33.4.12)
;; WHEN: Sat Jan 14 08:25:08 IST 2023
;; MSG SIZE rcvd: 849
Oh, this is big. Here we made a DNS query against one of the ROOT servers, c.root-servers.net, and we got this big response.
Let’s talk about some common request and response fields. When we make a DNS query, we are querying for a specific record type. There are many types of DNS records, like A, AAAA, NS, TXT, etc. If we don’t mention a record type, dig defaults to the A record, so in effect our last query was dig @c.root-servers.net www.scribbledtech.com A.
Since the ROOT server does not have the answer, it returns zero answers and 13 NS (Name Server) records for the com TLD in the AUTHORITY SECTION. It also returns the A (IPv4) and AAAA (IPv6) addresses of the TLD servers in the ADDITIONAL SECTION. Since there is no answer in the response, the DNS resolver makes another call to one of the TLD name servers listed in the previous response.
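How a resolver might combine the AUTHORITY and ADDITIONAL sections of such a response can be sketched like this. The tuples are a simplified stand-in for parsed records (owner, type, data) taken from the output above, and next_servers is a hypothetical helper.

```python
def next_servers(authority, additional):
    """Pair each NS host in the authority section with its known addresses."""
    addresses = {}
    for owner, rtype, data in additional:
        if rtype in ("A", "AAAA"):
            addresses.setdefault(owner, []).append(data)
    return {
        host: addresses.get(host, [])
        for _, rtype, host in authority
        if rtype == "NS"
    }


authority = [
    ("com.", "NS", "a.gtld-servers.net."),
    ("com.", "NS", "j.gtld-servers.net."),
]
additional = [
    ("a.gtld-servers.net.", "A", "192.5.6.30"),
    ("j.gtld-servers.net.", "A", "192.48.79.30"),
    ("j.gtld-servers.net.", "AAAA", "2001:502:7094::30"),
]
print(next_servers(authority, additional))
```

When a name server comes back with an empty address list, the resolver has to look up that name server’s own address first before it can continue.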
📂 ~ ▶︎ dig @j.gtld-servers.net www.scribbledtech.com
; <<>> DiG 9.10.6 <<>> @j.gtld-servers.net www.scribbledtech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7950
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.scribbledtech.com. IN A
;; AUTHORITY SECTION:
scribbledtech.com. 172800 IN NS ns1.bluehost.in.
scribbledtech.com. 172800 IN NS ns2.bluehost.in.
;; Query time: 140 msec
;; SERVER: 192.48.79.30#53(192.48.79.30)
;; WHEN: Sat Jan 14 08:46:42 IST 2023
;; MSG SIZE rcvd: 97
Still, there is no answer in the response, but we got the name server details of the domain (scribbledtech.com). These are the name servers we talked about in the last section, i.e. the ones configured in the Registrar’s dashboard. Now the resolver uses one of these name server addresses and makes another call.
📂 ~ ▶︎ dig @ns1.bluehost.in www.scribbledtech.com
; <<>> DiG 9.10.6 <<>> @ns1.bluehost.in www.scribbledtech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57140
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.scribbledtech.com. IN A
;; ANSWER SECTION:
www.scribbledtech.com. 14400 IN CNAME scribbledtech.com.
scribbledtech.com. 14400 IN A 162.214.80.37
;; Query time: 23 msec
;; SERVER: 162.159.24.72#53(162.159.24.72)
;; WHEN: Sat Jan 14 09:02:42 IST 2023
;; MSG SIZE rcvd: 80
This time the resolver got two answers in the response: a CNAME (canonical name) record, i.e. an alias to the actual domain name, and an A record, which holds the actual IP address. Here www.scribbledtech.com is an alias for scribbledtech.com and both point to the same website. Once the resolver gets the answer, it makes no further queries; it just caches the record and sends it back to the caller.
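Following the alias to the final address can be sketched as a small loop over the answer section. The records list mirrors the two answers above as simplified (owner, type, data) tuples; follow_cname and the max_hops guard are my own assumptions.

```python
def follow_cname(name, records, max_hops=8):
    """Follow CNAME links in an answer section until A records are found."""
    current = name
    for _ in range(max_hops):
        addresses = [
            data for owner, rtype, data in records
            if owner == current and rtype == "A"
        ]
        if addresses:
            return current, addresses
        aliases = [
            data for owner, rtype, data in records
            if owner == current and rtype == "CNAME"
        ]
        if not aliases:
            return current, []   # nothing more to follow
        current = aliases[0]     # continue with the canonical name
    raise RuntimeError("CNAME chain too long")


records = [
    ("www.scribbledtech.com.", "CNAME", "scribbledtech.com."),
    ("scribbledtech.com.", "A", "162.214.80.37"),
]
print(follow_cname("www.scribbledtech.com.", records))
# ('scribbledtech.com.', ['162.214.80.37'])
```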
Alright then, by now you should have a fair idea of DNS and how it works. If you are curious about how it got the way it is now, you can continue reading the next section, where I explain the evolution of DNS.
DNS History
Domain name resolution started well before the modern internet was developed. The problem first popped up when the predecessor of the internet, ARPAnet, was developed (1970). Initially, each host was identified by just the combination of an IMP (computer) number and a port number. This was an easy solution; we only needed to remember the IMP number and port to connect to a host.
TCP/IP Host table
This worked well for quite some time while the system was small, but it became complicated and unmaintainable as the system started scaling up. It became difficult to remember the IMP number and port for each and every host in the ARPAnet. That is when the first symbolic-name concept was introduced: the developers of ARPAnet realized that it is much easier to remember a symbolic name than a number.
So each host in the network got its own name, and each site managed a host table that contained the mapping of symbolic names to addresses. To reduce duplication and inconsistencies, this list was centralized and documented using RFCs (Requests for Comments). But each host administrator still maintained their own copy of the host table, and this copy had to be updated each time a new RFC was published.
This process was extremely slow and error-prone, since new updates had to wait for the next RFC, and if an administrator forgot to update their copy, the change never took effect. But this technique is still in use for local domain resolution, where you have a text file on your computer located at /etc/hosts on Unix systems, or a hosts file under the Windows system folder on Windows systems.
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
127.0.0.1 google.com # custom entry
This structure was so flat that anyone could take any name without any formal structure. To avoid such scenarios and to improve consistency, certain rules were put in place regarding how names should be created, such as a hierarchical namespace, the number of namespaces available, etc.
TCP/IP Domain Name System (DNS)
Even though the adoption of the host table improved the efficiency and usability of the internet, it could not meet the requirements of the growing internet, i.e. a central store of domain-to-address mappings regulated by an authority, with structural rules for creating names. This is when the modern DNS was developed. It is based on a hierarchical division of the network into groups and subgroups, with names reflecting this structure; it stores the data in a distributed fashion to facilitate decentralized access; and it includes flexible and extensible mechanisms for name registration and resolution.