ipconfig /flushdns, client resolver cache, NIC DNS settings), see Windows Server Network Troubleshooting. For Active Directory DNS health (SRV records, DC locator, _msdcs zone, dcdiag /test:dns), see Active Directory DNS Problems.
Windows Server DNS troubleshooting starts with one question before any command runs: is this the DNS Server role itself, is it Active Directory, or is it the client resolver wearing a DNS costume? Roughly half the wasted time in a DNS incident comes from diagnosing the wrong layer – chasing AD replication when the DNS service is simply stopped, or rebuilding a zone when the actual problem is a dead forwarder three seconds away from a fix.
- Confirm the DNS Server service is running: Get-Service DNS
- Confirm it’s listening on the right interface (DNS console > Properties > Interfaces)
- Test a forwarder directly: Test-NetConnection <forwarder-ip> -Port 53
- Clear the server’s own cache, not the client’s: Clear-DnsServerCache
- Run a recursive test from the server: Resolve-DnsName microsoft.com -Server localhost
- Windows Server DNS troubleshooting follows a fixed order: service running, listening interfaces, zone loaded, record present, forwarder reachable, recursion working – each step rules out a whole class of failure
- Test against the server with nslookup <name> <server-ip> and Resolve-DnsName -Server, never with client resolver tools – those test a different layer entirely
- Event IDs in the 4000-4019 range, 4013 and 4015 in particular, are Active Directory problems wearing a DNS event log – redirect, don’t deep-dive
- The genuinely server-side events are 407/408/410 (listening interfaces), 414 (single-label hostname), 6702 (peer A-record update), and 6522/6525/6534 (zone transfer)
- The single most common SMB/homelab recursion failure is a dead or unreachable forwarder, not a misconfigured zone
The Six-Step Server-Side Triage Workflow
Work this top to bottom. Each step isolates a failure class, and there’s rarely a reason to skip ahead – the steps most operators skip are usually the ones that turn out to be the actual cause.
1. Service running? Get-Service DNS. From any host, nslookup <name> <server-ip> – “Request to server timed out” or “No response from server” points at the service, not the zone.
2. Listening on the right interface? DNS console > server Properties > Interfaces. A service that’s running but not bound to the queried IP behaves identically to a stopped service from the client’s perspective.
3. Zone loaded and not paused? DNS console General tab, or Get-DnsServerZone. A paused zone returns “Query refused” or “Server failure.”
4. Record present? Get-DnsServerResourceRecord -ZoneName corp.local -Name host. Missing records on an otherwise healthy zone usually trace to scavenging.
5. Forwarder reachable? DNS console > Properties > Forwarders, or Test-NetConnection <forwarder> -Port 53. This is the highest-priority check for “external names won’t resolve” symptoms.
6. Recursion actually working? Resolve-DnsName microsoft.com -Server localhost, or the DNS console Monitoring tab’s recursive query test.
In practice, steps 1 through 3 resolve the majority of “DNS is completely down” tickets within minutes. Steps 5 and 6 cover almost everything else once the server itself is confirmed healthy.
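The six steps above can be strung together into a single pass. What follows is a minimal sketch rather than a finished script: the function name Invoke-DnsServerTriage and the default zone, record, and external name are placeholders chosen for illustration, and it assumes an elevated session on the DNS server with the DnsServer module available.

```powershell
# Hypothetical helper: runs the six triage steps in order on the local DNS server.
function Invoke-DnsServerTriage {
    param(
        [string]$Zone = 'corp.local',            # placeholder zone name
        [string]$Record = 'host',                # placeholder record name
        [string]$ExternalName = 'microsoft.com'  # any external name works
    )

    # 1. Service running?
    Get-Service -Name DNS | Select-Object Name, Status | Out-Host

    # 2. Which addresses is the server listening on?
    (Get-DnsServerSetting -All).ListeningIPAddress | Out-Host

    # 3. Zone loaded and not paused?
    Get-DnsServerZone -Name $Zone |
        Select-Object ZoneName, ZoneType, IsPaused, IsShutdown | Out-Host

    # 4. Record present?
    Get-DnsServerResourceRecord -ZoneName $Zone -Name $Record | Out-Host

    # 5. Forwarders reachable on TCP 53?
    foreach ($ip in (Get-DnsServerForwarder).IPAddress) {
        Test-NetConnection -ComputerName $ip.ToString() -Port 53 |
            Select-Object ComputerName, TcpTestSucceeded | Out-Host
    }

    # 6. Recursion working from the server itself?
    Resolve-DnsName -Name $ExternalName -Server localhost -DnsOnly | Out-Host
}

# Example call (commented out so pasting the block only defines the function):
# Invoke-DnsServerTriage -Zone 'corp.local' -Record 'host'
```

Each step writes its own output, so the first step that errors or comes back empty is the one to dig into.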
Testing the DNS Server Directly: nslookup vs Resolve-DnsName
This is the part of Windows Server DNS troubleshooting most people get backwards. Both tools have to point AT the DNS server, not at a client resolver, or they’re testing the wrong layer entirely.
nslookup <name> <server-ip> queries a specific server directly. Reading the response: “Server failure” or “Query refused” means the zone is paused or the server is overloaded; “Request to server timed out” or “No response from server” means the service isn’t running or isn’t listening on that IP. One thing that catches people early: nslookup’s own startup message about not finding a PTR record for the server’s address (“Default servers are not available”) is just a missing reverse-lookup entry for the server itself – it does not mean the server can’t answer queries.
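As a memory aid, the response-reading rules above can be written down as a small lookup. This is only a sketch of the mapping described in this section: the function name Get-NslookupHint is made up for illustration, and the patterns are loose text matches, not an exhaustive parser of nslookup output.

```powershell
# Hypothetical helper: maps an nslookup message to the layer it most likely points at.
function Get-NslookupHint {
    param([Parameter(Mandatory)][string]$Message)

    switch -Regex ($Message) {
        # Checked first: the startup PTR complaint is cosmetic.
        'Default servers are not available' {
            'Cosmetic: missing PTR record for the server itself'; break
        }
        'Server fail|Query refused' {
            'Zone paused or server overloaded: check the zone state'; break
        }
        'timed out|No response from server' {
            'Service not running or not listening on that IP'; break
        }
        default { 'Not covered by this lookup: read the full response' }
    }
}

# Example:
# Get-NslookupHint -Message 'DNS request timed out.'
```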
For walking a broken delegation chain, the interactive nslookup sequence is server <IP>, then set norecursion, then set q=NS, then the FQDN in question – this traces where the NS/A chain actually breaks, root down.
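The same walk can be done one hop at a time from PowerShell. The sketch below assumes Resolve-DnsName’s -NoRecursion switch as the counterpart of set norecursion; the function name Get-DelegationHop and the example FQDN are placeholders, and 198.41.0.4 is a.root-servers.net.

```powershell
# Hypothetical helper: asks one server for the NS records of a name without recursion,
# so the answer shows what that server itself knows or refers to.
function Get-DelegationHop {
    param(
        [Parameter(Mandatory)][string]$Name,
        [Parameter(Mandatory)][string]$Server
    )
    Resolve-DnsName -Name $Name -Type NS -Server $Server -NoRecursion -DnsOnly
}

# Example: start at a root server, then repeat against a name server it refers to.
# Get-DelegationHop -Name 'host.corp.example.com' -Server '198.41.0.4'
```

Repeating the call against each referred name server, hop by hop, shows the point where the referral stops or points at a server that doesn’t answer.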
Resolve-DnsName -Server localhost (or -Server <dns-ip>) is the modern PowerShell equivalent and the one to reach for first on anything Server 2012 R2 and later.
Cache handling deserves its own line because mixing it up is the single most common diagnostic mistake in this category: Clear-DnsServerCache clears the server’s own resolver cache – the right tool here. Clear-DnsClientCache and ipconfig /flushdns clear a client’s cache, which is a different layer entirely and belongs in Windows Server Network Troubleshooting.
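A short sketch of the server-side sequence, flush then re-query the server directly, is below. The function name Test-DnsServerAfterFlush is hypothetical, and the block assumes it runs with rights to manage the DNS Server role on the target.

```powershell
# Hypothetical helper: flush the DNS Server role's cache, then query that server directly.
function Test-DnsServerAfterFlush {
    param(
        [Parameter(Mandatory)][string]$Name,
        [string]$Server = 'localhost'
    )

    # Clears the server's cache, not a client resolver cache.
    Clear-DnsServerCache -ComputerName $Server -Force

    # -DnsOnly restricts resolution to the DNS protocol (no LLMNR or NetBIOS).
    Resolve-DnsName -Name $Name -Server $Server -DnsOnly
}

# Example:
# Test-DnsServerAfterFlush -Name 'microsoft.com'
```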
Quick Event ID Reference
This is a fast lookup, not a full reference. Each Event ID below gets the depth it needs for triage – what it means and what to check next – not a complete explanation.
| Event ID | Layer | Meaning | Action |
|---|---|---|---|
| 407 / 408 | Server | Could not bind/open a socket on an interface | Check Interfaces tab; look for a port 53 conflict or stale IP |
| 410 | Server | Restricted interfaces list invalid – falling back to all interfaces | Common after NIC teaming; dnscmd /resetlistenaddresses, set service to Automatic (Delayed Start) |
| 414 | Server | Server has a single-label hostname (no DNS suffix) | Set the primary DNS suffix; reboot. Harmless on a standalone box |
| 6702 | Server | Server’s own A record update failed on an AD-integrated peer | Check for wrong A records on replication partners; restrict listening on multi-NIC servers |
| 6522 / 6525 / 6534 | Server | Zone transfer refused or failed | See Zone Transfer Failures below |
| 2501 / 2502 | Server | Scavenging cycle completed, with/without deletions | See Scavenging-Related Failures below |
| 4000 / 4004 / 4007 / 4016 | Active Directory | DNS can’t open, load, or enumerate the AD-integrated zone | Redirect – see Active Directory DNS Problems |
| 4013 | Active Directory | DNS waiting on AD DS initial synchronization | Redirect – see Active Directory DNS Problems |
| 4015 | Active Directory | Critical error from Active Directory (RODC, permissions, or orphaned object limit) | Redirect – see Active Directory DNS Problems |
This table is a fast-lookup boundary, not a full per-event reference. It covers what’s needed to triage during an active incident.
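For pulling these events during an incident, the table’s Layer column can be turned into a lookup. This is a sketch under assumptions: the variable and function names are invented for illustration, and the commented example assumes the classic DNS Server event log on the server being checked.

```powershell
# Layer per Event ID, copied from the table above.
$DnsEventLayer = @{
    407  = 'Server'; 408  = 'Server'; 410  = 'Server'; 414  = 'Server'
    6702 = 'Server'; 6522 = 'Server'; 6525 = 'Server'; 6534 = 'Server'
    2501 = 'Server'; 2502 = 'Server'
    4000 = 'Active Directory'; 4004 = 'Active Directory'; 4007 = 'Active Directory'
    4013 = 'Active Directory'; 4015 = 'Active Directory'; 4016 = 'Active Directory'
}

# Hypothetical helper: returns the layer for an Event ID, or 'Unknown' if it is not in the table.
function Get-DnsEventLayer {
    param([Parameter(Mandatory)][int]$Id)
    if ($DnsEventLayer.ContainsKey($Id)) { $DnsEventLayer[$Id] } else { 'Unknown' }
}

# Example (run on the DNS server):
# Get-WinEvent -FilterHashtable @{ LogName = 'DNS Server'; Id = @($DnsEventLayer.Keys) } -MaxEvents 50 |
#     Select-Object TimeCreated, Id, @{ n = 'Layer'; e = { Get-DnsEventLayer -Id $_.Id } }
```

Anything that comes back tagged Active Directory is a redirect, not a reason to keep digging in the DNS Server role.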
Recursion and Forwarder Failures
This is where the bulk of SMB and homelab “DNS is broken” tickets actually live. Internal names resolve fine; anything external times out or comes back slow.
Dead or unreachable forwarder. Windows Server’s default timeouts are tight by design: an 8-second recursion timeout and a 3-second forwarding timeout mean a server can realistically only try about three forwarders before giving up and returning failure to the client. A dead forwarder at the top of the list burns most of that budget before the server ever reaches a working one. Fix: remove unreachable forwarders, confirm the survivors actually answer with Test-NetConnection <ip> -Port 53.
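The “about three forwarders” figure falls out of simple arithmetic. The sketch below assumes forwarders are tried strictly one after another and that each dead one consumes its full forwarding timeout, which is a simplification; the function name Get-ForwarderAttemptBudget is hypothetical.

```powershell
# Hypothetical helper: how many forwarder attempts fit inside the recursion timeout.
function Get-ForwarderAttemptBudget {
    param(
        [int]$RecursionTimeoutSec = 8,   # default cited above
        [int]$ForwardingTimeoutSec = 3   # default cited above
    )
    $full = [int][math]::Floor($RecursionTimeoutSec / $ForwardingTimeoutSec)
    $left = $RecursionTimeoutSec - ($full * $ForwardingTimeoutSec)
    [pscustomobject]@{
        FullAttempts       = $full   # forwarders that get their whole timeout
        SecondsLeftForNext = $left   # what remains for one more, shortened attempt
    }
}

# Example with the values actually configured on the server:
# Get-ForwarderAttemptBudget -RecursionTimeoutSec (Get-DnsServerRecursion).Timeout `
#                            -ForwardingTimeoutSec (Get-DnsServerForwarder).Timeout
```

With the defaults, two forwarders get a full 3 seconds each and a third gets the remaining 2 seconds, which is why a dead entry at the top of the list is so expensive.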
DC-to-DC forwarding. Domain controllers in the same domain shouldn’t forward to each other – they already hold the same zone data, and forwarding between them only adds latency without resolving anything new. Forward to external resolvers only.
DNSSEC plus forwarders. With DNSSEC validation enabled and a forwarder configured, the server can misjudge an unsigned zone as signed and return SERVFAIL instead of a clean answer. If recursion fails specifically on DNSSEC-aware configurations, this combination is worth checking before assuming a forwarder is simply down.
EDNS0 and firewalls. Windows DNS uses EDNS0 for responses larger than the old 512-byte UDP ceiling. A firewall that drops oversized UDP packets produces a pattern where some domains resolve fine and others fail intermittently, which looks like a flaky forwarder but isn’t. dnscmd /config /enableednsprobes 0 is the workaround; fixing the firewall’s UDP handling is the actual resolution.
Clear-DnsServerCache confirms the diagnosis immediately; the durable fix is blocking AAAA recursion through a query resolution policy and addressing the upstream forwarder, not scheduling recurring cache flushes as a workaround.
Scavenging-Related Resolution Failures
Records that simply vanish are almost always a scavenging configuration issue, not corruption or an attack. Full scavenging mechanics, safe rollout sequencing, and recovery from over-aggressive scavenging are covered in DNS Scavenging Windows Server – this section is the fast-triage version for an active incident.
Fastest check: dnscmd /zoneinfo <zone> for current aging settings, and Event IDs 2501/2502 in the DNS Server log to confirm a scavenging cycle ran and what it removed. If a record keeps disappearing on a predictable cycle, the no-refresh and refresh intervals are very likely set below the device’s actual re-registration cadence – operators report this is the most common root cause once the scavenging audit events are actually checked instead of assumed.
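To make the interval comparison concrete, here is a rough sketch. It assumes the usual aging model, where a record’s timestamp can only be refreshed after the no-refresh interval ends and the record becomes eligible for scavenging once no-refresh plus refresh has passed; the function name Get-ScavengingRisk and the 8-day cadence in the example are illustrative only.

```powershell
# Hypothetical helper: coarse risk rating for a device's re-registration cadence
# against a zone's aging intervals.
function Get-ScavengingRisk {
    param(
        [Parameter(Mandatory)][timespan]$NoRefreshInterval,
        [Parameter(Mandatory)][timespan]$RefreshInterval,
        [Parameter(Mandatory)][timespan]$ReRegistrationInterval
    )
    if ($ReRegistrationInterval -gt ($NoRefreshInterval + $RefreshInterval)) {
        'High: the record becomes eligible before the device refreshes it'
    }
    elseif ($ReRegistrationInterval -gt $RefreshInterval) {
        'Possible: a refresh can miss the window in which it is accepted'
    }
    else {
        'Low'
    }
}

# Example with the zone's real settings and an assumed 8-day device cadence:
# $aging = Get-DnsServerZoneAging -Name 'corp.local'
# Get-ScavengingRisk -NoRefreshInterval $aging.NoRefreshInterval `
#                    -RefreshInterval $aging.RefreshInterval `
#                    -ReRegistrationInterval (New-TimeSpan -Days 8)
```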
Zone Transfer Failures
Applies to standard primary/secondary zone pairs. AD-integrated zones replicate through Active Directory replication, not AXFR/IXFR – if both zones in question are AD-integrated, this section doesn’t apply and the issue belongs in Active Directory DNS Problems instead.
Symptoms: a secondary server serving visibly stale records, paired with Event 6534 (“aborted or failed to complete transfer of the zone”) or 6522/6525 (“zone transfer request… refused by the master”).
- Confirm TCP port 53 is reachable from the secondary to the primary: Test-NetConnection <primary> -Port 53
- Compare SOA serial numbers on both servers: Resolve-DnsName <zone> -Type SOA -Server <primary>, and the same against the secondary
- If the primary’s serial isn’t higher than the secondary’s, no transfer will trigger – this is expected behavior, not a bug
- Check the primary’s Zone Transfers tab – “Only to servers listed on the Name Servers tab” is a common silent blocker if the secondary isn’t actually listed there
- If the secondary is a non-Windows server (BIND or similar), confirm “BIND secondaries” is enabled on the primary’s Advanced tab – without it, the primary uses the fast transfer format, which older BIND versions can’t process, so transfers can fail to complete
- Force a manual transfer on the secondary to confirm the fix: dnscmd /zonerefresh <zone>
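The serial comparison in the checklist above can be sketched as follows. The function name Compare-SoaSerial, the zone name, and the IP addresses in the example are placeholders, and the comparison is a plain numeric one that ignores serial-number wraparound.

```powershell
# Hypothetical helper: compares SOA serials from the primary and the secondary.
# Plain numeric comparison; serial wraparound is not handled.
function Compare-SoaSerial {
    param(
        [Parameter(Mandatory)][uint32]$Primary,
        [Parameter(Mandatory)][uint32]$Secondary
    )
    if ($Primary -gt $Secondary) {
        'Primary is ahead: a transfer should trigger'
    }
    elseif ($Primary -eq $Secondary) {
        'In sync: no transfer expected'
    }
    else {
        'Secondary is ahead: no transfer will trigger until the primary serial is raised'
    }
}

# Example (placeholder zone and addresses):
# $p = (Resolve-DnsName -Name 'corp.local' -Type SOA -Server '10.0.0.10' -DnsOnly |
#       Where-Object { $_.Type -eq 'SOA' }).SerialNumber
# $s = (Resolve-DnsName -Name 'corp.local' -Type SOA -Server '10.0.0.11' -DnsOnly |
#       Where-Object { $_.Type -eq 'SOA' }).SerialNumber
# Compare-SoaSerial -Primary $p -Secondary $s
```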
When It’s Not the DNS Server Role
A meaningful share of tickets that arrive as “DNS is broken” turn out to be a different layer entirely. This table exists to redirect fast instead of debugging the wrong system for an hour.
| Symptom | Actual layer | Where it belongs |
|---|---|---|
| Works after ipconfig /flushdns on the client, breaks again later | Client resolver cache | Windows Server Network Troubleshooting |
| DC won’t register SRV records, or dcdiag /test:dns fails | Active Directory / Netlogon | Active Directory DNS Problems |
| Event IDs in the 4000-4019 range, including 4013/4015 | Active Directory | Active Directory DNS Problems |
| AD-integrated zone not replicating to all DCs | Replication scope, not transfer | Windows Server DNS Replication Scope |
| Records dynamically registering on some devices but not others | Dynamic update permissions | DNS Dynamic Update Failed and Windows Server DNS Forwarders |
| Internal and external clients resolve the same name to different IPs | Expected behavior, not a failure | Windows Server Split-Brain DNS |
Verification and Debug Logging
For Windows Server DNS troubleshooting that survives the quick checks above, two tools go deeper without requiring a packet capture.
# Query/recursion/cache counters
dnscmd /statistics
# Temporary debug logging - enable, reproduce the issue, then disable
dnscmd <server> /config /loglevel 0x8101F
# ... reproduce the issue, check %windir%\System32\Dns\Dns.log ...
dnscmd <server> /config /loglevel 0x0
Debug logging is resource-intensive and meant to be temporary – enable it, reproduce the failure, capture the log, then turn it back off. On Server 2016 and later, the DNS Analytical log under Event Viewer > Applications and Services Logs > Microsoft > Windows > DNS-Server is the lower-overhead alternative for production servers where flipping on full debug logging isn’t comfortable mid-incident.
Final Thoughts
Most Windows Server DNS troubleshooting time isn’t spent fixing the actual problem – it’s spent figuring out which layer the problem is even in. The six-step order exists to shortcut that part: confirm the service, the interface, the zone, the record, the forwarder, and recursion, in that sequence, and the actual fix is usually obvious by the time the failing step turns up. The cases that don’t fit cleanly into this workflow are, more often than not, Active Directory or client-resolver issues that only look like a DNS Server problem from the outside.
FAQ
Where do I start with Windows Server DNS troubleshooting during an active incident?
The five-item quick-fix checklist at the top of this article covers the fastest path: service running, listening interfaces, forwarder reachable, server cache cleared, recursive test from the server. Most incidents resolve within those five checks before the full six-step workflow is even needed.
Why does nslookup show an error before I even type a query?
That startup message is nslookup failing to find a reverse-lookup (PTR) record for the DNS server’s own IP address. It’s cosmetic – the server can still answer the actual query that follows. Don’t read it as a sign the server is down.
What’s the fastest check for “external websites won’t load” on a domain controller?
Test the configured forwarders directly with Test-NetConnection <forwarder-ip> -Port 53. A dead or unreachable forwarder is the single most common cause of this exact symptom in SMB and homelab environments.
Should I use Clear-DnsServerCache or Clear-DnsClientCache?
Clear-DnsServerCache when troubleshooting the DNS Server role itself – this is almost always the right one for server-side work. Clear-DnsClientCache (or ipconfig /flushdns) only clears a client’s local resolver cache and won’t change anything the server is doing.
I see Event ID 4013 or 4015 in the DNS Server log – is this a DNS problem?
No. Both indicate the DNS Server role is waiting on or failing to read Active Directory, not a DNS Server role misconfiguration. Treat these as Active Directory issues – see Active Directory DNS Problems.
Records keep disappearing from a zone – is that an attack?
Almost always not. Missing records on an otherwise healthy zone are overwhelmingly a scavenging configuration issue – aging intervals set shorter than the actual device re-registration cadence. Check Event IDs 2501/2502 before assuming anything more dramatic.
Do I need to enable debug logging every time I troubleshoot DNS?
No. The six-step triage workflow and the quick Event ID table resolve most incidents without it. Debug logging is for the cases that survive triage and need packet-level detail – enable it, reproduce, capture, then disable it again.
A DNS incident that takes an hour to diagnose almost always took fifty minutes to figure out which layer was actually broken, and ten minutes to fix it once that was clear.