dns: add experimental c-ares cache and lookup support #62361
Draft
mcollina wants to merge 1 commit into nodejs:main from
Add --experimental-dns-cache-max-ttl to expose c-ares query cache, and --experimental-dns-lookup-cares to route dns.lookup() through ares_getaddrinfo instead of libuv's blocking getaddrinfo. Combined, these flags enable fast cached async DNS for all HTTP/net connections without code changes. Refs: nodejs#57641
I had AI compile a record of the history of this change.
The Two DNS Resolution Paths in Node.js: A History
The Original Design
Node.js originally used c-ares exclusively for all DNS operations. c-ares was
the natural choice: the standard POSIX
getaddrinfo(3) is synchronous and blocking, fundamentally incompatible with an event-driven runtime. c-ares
provides non-blocking DNS resolution that integrates with event loops.
c-ares was part of Node.js from its earliest days, initially living inside the
libuv source tree before being moved to
deps/cares in 2012.
Why dns.lookup() Was Added
dns.lookup() was introduced because c-ares, as a pure DNS stub resolver,
could not replicate the full behavior of the operating system's name resolution:
- /etc/hosts for static name-to-IP mappings
- NSS configuration (/etc/nsswitch.conf) to determine resolution order (files, dns, mdns)
- localhost handling per the OS
Ben Noordhuis explained this in nodejs/node#49394.
The workaround was to run getaddrinfo(3) on libuv's threadpool (default 4
threads) to simulate async behavior from JavaScript's perspective.
The Current Split
| | dns.lookup() | dns.resolve*() |
| --- | --- | --- |
| Backend | getaddrinfo (threadpool) | c-ares |
| /etc/hosts | honored | ignored |
| dns.setServers() effect | none | applies |
| Default for http/net | yes | no |
net.connect(), http.request(), and all high-level APIs use dns.lookup() by
default. The c-ares resolver (dns.resolve*()) is only used when explicitly
called or when a custom lookup function is provided to an HTTP agent or
socket.
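As an illustration of the custom lookup route mentioned above, here is a minimal sketch of a lookup-compatible function backed by the c-ares resolver. The helper name makeResolveLookup is hypothetical, and the sketch is IPv4-only and skips /etc/hosts, so it is not a drop-in replacement for dns.lookup().

```javascript
const dns = require('node:dns');

// Hypothetical helper (not part of Node.js): builds a function with the
// dns.lookup() calling convention on top of any object that exposes
// resolve4(hostname, callback), such as a dns.Resolver instance.
function makeResolveLookup(resolver = new dns.Resolver()) {
  return function lookup(hostname, options, callback) {
    // dns.lookup() allows the options argument to be omitted.
    if (typeof options === 'function') {
      callback = options;
      options = {};
    }
    options = options || {};
    resolver.resolve4(hostname, (err, addresses) => {
      if (err) return callback(err);
      if (options.all) {
        // With { all: true } callers expect an array of { address, family }.
        callback(null, addresses.map((address) => ({ address, family: 4 })));
      } else {
        // Otherwise they expect a single address plus its family.
        callback(null, addresses[0], 4);
      }
    });
  };
}
```

Passing the result as the lookup option, for example http.get({ host: 'example.org', lookup: makeResolveLookup() }, onResponse), would resolve that request through c-ares instead of the threadpool. Names that exist only in /etc/hosts would fail to resolve under this sketch.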
Known Problems
Threadpool Starvation
The most critical issue, reported since 2012 (nodejs/node-v0.x-archive#2868,
nodejs/node#8436). When DNS servers are slow or unreachable, getaddrinfo(3)
calls saturate libuv's shared threadpool and block unrelated file system I/O
and crypto operations.
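The starvation effect can be sketched with a toy model of a fixed-size worker pool. This is not libuv's scheduler: simulatePool is a hypothetical function that assumes every task is queued at time zero and served in FIFO order, and the durations are made-up numbers.

```javascript
// Toy model: returns the completion time (ms) of each task when `durations`
// are run in FIFO order on `poolSize` workers, all queued at t = 0.
function simulatePool(poolSize, durations) {
  const freeAt = new Array(poolSize).fill(0); // when each worker becomes idle
  return durations.map((duration) => {
    // The next queued task starts on whichever worker frees up first.
    let idx = 0;
    for (let i = 1; i < poolSize; i++) {
      if (freeAt[i] < freeAt[idx]) idx = i;
    }
    freeAt[idx] += duration;
    return freeAt[idx];
  });
}

// Four slow lookups (5000 ms each) queued ahead of a 1 ms file read.
const queue = [5000, 5000, 5000, 5000, 1];
console.log(simulatePool(4, queue)[4]); // 5001: the read waits for a free worker
console.log(simulatePool(5, queue)[4]); // 1: a fifth worker serves it at once
```

In this model the 1 ms read is delayed by five seconds purely because all four workers are occupied by lookups, which is the pattern the linked issues describe.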
No DNS Caching
Node.js performs no DNS caching in
dns.lookup(). Every call goes through getaddrinfo(3) from scratch, delegating caching entirely to the OS. Many production environments (containers, minimal Linux images) have no local DNS
cache daemon, causing unnecessary latency and load on DNS servers.
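To make the idea of a TTL-bounded cache concrete, here is a userland sketch. It is not how the c-ares query cache exposed by this PR works. The names makeCachedResolve4, maxTtl, and now are hypothetical, and the only Node.js API relied on is resolve4() with { ttl: true }.

```javascript
const dns = require('node:dns');

// Hypothetical helper: wraps resolve4() with an in-memory cache whose entries
// expire after the smallest record TTL, capped at `maxTtl` seconds.
function makeCachedResolve4({
  resolver = new dns.Resolver(),
  maxTtl = 60,            // assumed cap, in seconds
  now = () => Date.now(), // injectable clock, handy for testing
} = {}) {
  const cache = new Map(); // hostname -> { addresses, expiresAt }
  return function cachedResolve4(hostname, callback) {
    const hit = cache.get(hostname);
    if (hit && hit.expiresAt > now()) {
      // Kept synchronous for brevity; real code would defer this with
      // process.nextTick() so the callback is always asynchronous.
      return callback(null, hit.addresses);
    }
    // { ttl: true } makes resolve4() yield [{ address, ttl }, ...].
    resolver.resolve4(hostname, { ttl: true }, (err, records) => {
      if (err) return callback(err);
      const ttl = Math.min(maxTtl, ...records.map((r) => r.ttl));
      const addresses = records.map((r) => r.address);
      cache.set(hostname, { addresses, expiresAt: now() + ttl * 1000 });
      callback(null, addresses);
    });
  };
}
```

Capping the TTL bounds how stale an answer can get when a record changes. The sketch does not cache failed lookups or limit the number of entries.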
Behavioral Inconsistency
dns.lookup() and dns.resolve4() can return different results for the same
hostname because they use entirely different resolution mechanisms. This
confuses developers. dns.setServers() has no effect on dns.lookup().
HTTP Timeout Does Not Cover DNS
http.request()'s setTimeout does not include the dns.lookup() phase. A
request with a 10ms timeout can hang for seconds during DNS resolution
(nodejs/node#8436).
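One userland way to bound the DNS phase is to wrap the lookup function in a timer. The wrapper below is a hypothetical sketch, not an existing Node.js API: the name withLookupTimeout and the 'ETIMEOUT' error code are chosen here for illustration only.

```javascript
// Hypothetical helper: wraps a dns.lookup()-style function so the callback
// fires with an error if no answer arrives within `ms` milliseconds.
function withLookupTimeout(lookup, ms) {
  return function timedLookup(hostname, options, callback) {
    if (typeof options === 'function') {
      callback = options;
      options = {};
    }
    let done = false;
    const timer = setTimeout(() => {
      if (done) return;
      done = true;
      const err = new Error(`DNS lookup for ${hostname} exceeded ${ms} ms`);
      err.code = 'ETIMEOUT'; // illustrative code, chosen for this sketch
      callback(err);
    }, ms);
    lookup(hostname, options, (...args) => {
      if (done) return; // the answer arrived after the deadline: drop it
      done = true;
      clearTimeout(timer);
      callback(...args);
    });
  };
}
```

It could be passed as http.get({ host: 'example.org', lookup: withLookupTimeout(dns.lookup, 2000) }, onResponse). The timer only stops the caller from waiting. The underlying getaddrinfo(3) call still occupies its threadpool thread until it returns.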
Failed Unification Attempts
2015: Pure JavaScript DNS Resolver (#1843)
Proposed adding a new pure JS resolver as default, relegating c-ares behind a
--use-old-dns flag. Closed in 2016 because the author could not match the
speed of the existing resolver from JavaScript.
2015: Replace c-ares with node-dns (#1013)
Discussion about replacing c-ares with Tim Fontaine's node-dns module. Ben
Noordhuis was in favor, but Fontaine himself concluded his own module was not
ready for core inclusion, citing technical debt and the impossibility of
faithfully implementing dns.lookup() behavior from userspace.
2017: Tracking DNS Features (#14713)
Tracked growing limitations of c-ares (no AXFR, no DNSSEC at the time, limited
TTL exposure). Considered writing a custom DNS library. Went stale.
2023: Global Flag for Async DNS (#49394)
Proposed a Go-like flag to globally switch dns.lookup() to use the async
c-ares resolver. Ben Noordhuis rejected a global flag as "probably a bad idea"
because "lots of things only work with the system resolver." His
counter-proposal was to improve libuv's threadpool scaling and add a transparent
DNS cache. Neither was implemented. The issue was closed by the stale bot.
What Changed: ares_getaddrinfo (c-ares 1.16.0, 2020)
c-ares 1.16.0 introduced ares_getaddrinfo(), an async equivalent to POSIX
getaddrinfo() with:
- /etc/hosts reading (with caching since c-ares 1.22.0)
- localhost handling per RFC 6761
The original gap between c-ares and the system resolver has significantly
narrowed. The main remaining difference is NSS plugin support (mDNS, LDAP,
NIS), which ares_getaddrinfo does not implement.
Bun uses ares_getaddrinfo as its default dns.lookup() backend on Linux,
demonstrating the approach is viable for a JavaScript runtime.