Hour 8 of 8

Live Reconnaissance Simulators

Real WHOIS · DNS · CT · Wayback · Headers · Robots

~80 min · 6 interactive labs

CEH Objectives ▸ Execute live passive recon against real public targets · Interpret RDAP, DoH, crt.sh, Wayback, header, and robots output · Synthesise findings into a footprint report

Maps to Module 02 · Footprinting and Reconnaissance

OP. GLASSHOUSE — LIVE FIRE

Mission Brief

Theory is over. Run six live OSINT tools against approved public targets (your choice — try iana.org, example.com, github.com) and turn each raw output into one actionable finding for the Glasshouse report.

▸Execute all six recon tools successfully

▸Capture at least one finding per tool

▸Submit a synthesised footprint by end of session

Story · The terminal opens

Your senior consultant slides a laptop across. 'Real APIs. Real targets. Public only — iana.org, example.com, github.com, your own domain. No production scanning. Go.'

Six tools. Six readouts. Six lessons in what hides in plain sight.

Trainer · Core Concepts

How to read RDAP output

Look for: registrar (who controls the domain), creation date (older = more trust + more legacy infra), expiry (impending = social-engineering pretext), nameservers (often reveal hosting/DNS provider), and status flags (clientTransferProhibited is good hygiene).
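A minimal sketch of pulling those fields out of an RDAP domain object. The SAMPLE data is invented for illustration and only mimics the shape of a response from a lookup such as https://rdap.org/domain/example.com; real responses carry many more fields and may omit some of these. Note that RDAP JSON spells status values with spaces ("client transfer prohibited"), while legacy WHOIS shows the EPP form clientTransferProhibited.

```python
# Sketch: summarise the headline fields of an RDAP domain object.
# SAMPLE is made-up data shaped like an RDAP response, not real output.
SAMPLE = {
    "objectClassName": "domain",
    "ldhName": "EXAMPLE.TEST",
    "status": ["client transfer prohibited"],
    "events": [
        {"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2026-08-13T04:00:00Z"},
    ],
    "nameservers": [{"ldhName": "NS1.EXAMPLE.TEST"}, {"ldhName": "NS2.EXAMPLE.TEST"}],
    "entities": [
        {
            "roles": ["registrar"],
            "vcardArray": ["vcard", [["version", {}, "text", "4.0"],
                                     ["fn", {}, "text", "Hypothetical Registrar LLC"]]],
        }
    ],
}


def registrar_name(domain_obj):
    """Return the 'fn' (formatted name) of the first entity with the registrar role."""
    for entity in domain_obj.get("entities", []):
        if "registrar" in entity.get("roles", []):
            vcard = entity.get("vcardArray") or ["vcard", []]
            for prop in vcard[1]:
                if prop[0] == "fn":
                    return prop[3]
    return None


def summarise_rdap(domain_obj):
    """Collect registrar, lifecycle dates, nameservers, and status flags."""
    events = {e.get("eventAction"): e.get("eventDate")
              for e in domain_obj.get("events", [])}
    return {
        "registrar": registrar_name(domain_obj),
        "created": events.get("registration"),
        "expires": events.get("expiration"),
        "nameservers": sorted(ns["ldhName"].lower()
                              for ns in domain_obj.get("nameservers", [])
                              if "ldhName" in ns),
        "status": domain_obj.get("status", []),
    }


print(summarise_rdap(SAMPLE))
```

Feed it the parsed JSON from a real lookup against an approved target and the same five fields become the ownership finding for the report.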

How to read DoH (DNS) output

A/AAAA → live hosts. MX → mail provider. NS → DNS provider. TXT → SPF (sending hosts) and site-verification tokens (which SaaS they use!). TXT at _dmarc.<domain> → DMARC (anti-spoof posture). CAA → which CAs may issue certs.
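A rough sketch of sorting TXT answers into those buckets. It assumes the JSON answer shape returned by Cloudflare's DoH endpoint when queried with an "accept: application/dns-json" header (an "Answer" list of records with numeric "type" and quoted "data"). The sample records and the bucket names are invented for illustration.

```python
from urllib.parse import urlencode

# Cloudflare's JSON DoH endpoint; requests need "accept: application/dns-json".
DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"
TXT = 16  # DNS record type number for TXT


def doh_url(name, rtype):
    """Build the query URL for one name and record type."""
    return DOH_ENDPOINT + "?" + urlencode({"name": name, "type": rtype})


def classify_txt(answers):
    """Sort TXT answers into SPF, DMARC, verification-token, and other buckets."""
    buckets = {"spf": [], "dmarc": [], "verification": [], "other": []}
    for rr in answers:
        if rr.get("type") != TXT:
            continue  # ignore A, MX, and other record types
        text = rr.get("data", "").strip('"')
        lowered = text.lower()
        if lowered.startswith("v=spf1"):
            buckets["spf"].append(text)
        elif lowered.startswith("v=dmarc1"):
            buckets["dmarc"].append(text)
        elif "verification" in lowered:
            buckets["verification"].append(text)
        else:
            buckets["other"].append(text)
    return buckets


# Made-up answers shaped like the "Answer" list in a DoH JSON response.
SAMPLE_ANSWERS = [
    {"name": "example.test", "type": 16, "TTL": 300,
     "data": "\"v=spf1 include:_spf.mailhost.test -all\""},
    {"name": "example.test", "type": 16, "TTL": 300,
     "data": "\"hypothetical-site-verification=abc123\""},
    {"name": "example.test", "type": 1, "TTL": 300, "data": "192.0.2.10"},
]

print(doh_url("_dmarc.example.com", "TXT"))
print(classify_txt(SAMPLE_ANSWERS))
```

The substring match on "verification" is a deliberately crude heuristic; some vendors use token formats that it will miss, so anything in the "other" bucket still deserves a manual look.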

How to read crt.sh

Every publicly trusted cert logged to Certificate Transparency for the domain is searchable here. Look for: unusual subdomains (dev-*, internal-*, *-staging), wildcard certs (broad attack surface), short-lived ACME certs (modern infra), and surprise sibling domains (M&A footprint).
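A small sketch of that triage, assuming the JSON output of a crt.sh search such as https://crt.sh/?q=%25.example.com&output=json, where each row carries a newline-separated "name_value" field. The sample rows and the keyword list are invented for illustration.

```python
import re

# Labels worth a second look; an illustrative list, not an exhaustive one.
INTERESTING = re.compile(r"(^|[.-])(dev|internal|staging|test)([.-]|$)")


def hosts_from_crtsh(rows):
    """Flatten the newline-separated name_value fields into unique hostnames."""
    names = set()
    for row in rows:
        for name in row.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return sorted(names)


def triage(hosts):
    """Split hostnames into wildcard entries and keyword-flagged entries."""
    return {
        "wildcards": [h for h in hosts if h.startswith("*.")],
        "flagged": [h for h in hosts if INTERESTING.search(h)],
    }


# Made-up rows shaped like crt.sh JSON output.
SAMPLE_ROWS = [
    {"name_value": "www.example.com\nexample.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "api-staging.example.com\nWWW.example.com"},
]

hosts = hosts_from_crtsh(SAMPLE_ROWS)
print(hosts)
print(triage(hosts))
```

Deduplicating first matters because the same hostname typically appears on many renewals; the regex only matches whole labels or hyphen-delimited parts, so a host like developer.example.com is not flagged.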

How to read Wayback CDX

First-seen date tells the domain's web history. Look for removed paths (/admin, /portal, /old-app) that may still resolve. 200-status snapshots of pages that now 404 are an OSINT goldmine.
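A sketch of reading CDX results, assuming the JSON form of the Internet Archive CDX API (output=json with fl=timestamp,original,statuscode), which returns a list of rows whose first row is the column header. The sample rows are invented. Checking whether an archived path still resolves today needs a separate live request and is left out here.

```python
from urllib.parse import urlsplit


def summarise_cdx(rows):
    """Return the earliest snapshot timestamp and the paths archived with status 200."""
    if not rows:
        return {"first_seen": None, "archived_200_paths": []}
    header, *records = rows  # first row names the columns
    snaps = [dict(zip(header, record)) for record in records]
    # Timestamps are fixed-width YYYYMMDDhhmmss strings, so min() sorts correctly.
    first_seen = min((s["timestamp"] for s in snaps), default=None)
    paths = sorted({urlsplit(s["original"]).path or "/"
                    for s in snaps if s.get("statuscode") == "200"})
    return {"first_seen": first_seen, "archived_200_paths": paths}


# Made-up rows shaped like CDX JSON output.
SAMPLE_CDX = [
    ["timestamp", "original", "statuscode"],
    ["20100305101500", "http://example.com/admin/", "200"],
    ["20020120142510", "http://example.com/", "200"],
    ["20150101000000", "http://example.com/old-app", "301"],
]

print(summarise_cdx(SAMPLE_CDX))
```

Every path in archived_200_paths is a candidate to compare against the live site: one that was captured with a 200 and now returns 404 is the removed content this section describes.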

How to read HTTP headers

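Server header → stack fingerprint. Score one point for each of HSTS, CSP, X-Frame-Options (XFO), X-Content-Type-Options (XCTO), Referrer-Policy, and Permissions-Policy that is present, for a 6-point security score. <3 = immature. 6 = mature defensive engineering.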
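The 6-point score can be sketched as a presence check over the response headers. This is only an illustration of the scoring rule above: it counts whether each header exists and makes no judgment about whether its value is strong. The sample headers are invented.

```python
# The six headers scored in this module, lower-cased for case-insensitive matching.
SECURITY_HEADERS = (
    "strict-transport-security",   # HSTS
    "content-security-policy",     # CSP
    "x-frame-options",             # XFO
    "x-content-type-options",      # XCTO
    "referrer-policy",
    "permissions-policy",
)


def score_headers(headers):
    """Return (score out of 6, list of missing headers) for a header mapping."""
    present = {name.lower() for name in headers}
    missing = [h for h in SECURITY_HEADERS if h not in present]
    return len(SECURITY_HEADERS) - len(missing), missing


# Made-up response headers for illustration.
SAMPLE_HEADERS = {
    "Server": "nginx",
    "Strict-Transport-Security": "max-age=31536000",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "no-referrer",
}

score, missing = score_headers(SAMPLE_HEADERS)
print(f"{score}/6, missing: {missing}")
```

With the standard library, a header mapping for a live target could come from dict(urllib.request.urlopen(url).headers.items()); header names are matched case-insensitively because servers vary in capitalisation.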

How to read robots.txt

NOT a security control — it's a sign-posted map of what the operator wanted hidden. Every Disallow path is a candidate for manual review. Sitemaps reveal content inventories.
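A minimal sketch of surfacing Disallow paths and Sitemap references from a robots.txt body. It deliberately ignores which User-agent group each rule belongs to, since for recon every listed path is of interest. The sample file is invented.

```python
def parse_robots(text):
    """Collect unique Disallow paths and Sitemap URLs from robots.txt text."""
    disallow, sitemaps = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)  # split once so URLs keep their colons
        field, value = field.strip().lower(), value.strip()
        if not value:
            continue  # an empty Disallow means "allow everything"
        if field == "disallow" and value not in disallow:
            disallow.append(value)
        elif field == "sitemap" and value not in sitemaps:
            sitemaps.append(value)
    return {"disallow": disallow, "sitemaps": sitemaps}


# Made-up robots.txt content for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /old-app/   # legacy
Disallow:

User-agent: examplebot
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

print(parse_robots(SAMPLE_ROBOTS))
```

Each entry in the disallow list becomes a line in the manual-review queue, and each sitemap URL is a pointer to the content inventory.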

Knowledge Map

core · Public target · Approved, non-production
tool · RDAP · Ownership + lifecycle
tool · DoH · DNS over HTTPS
tool · crt.sh · Certificate transparency
tool · Wayback · Web history
tool · Headers · Security posture
tool · robots · Operator's hidden map
out · Footprint report · Synthesis

Micro Labs

SIMULATOR

Lab 22 · Live WHOIS / RDAP

Run an RDAP lookup against a public target. Identify the registrar and creation date.

LIVE SIMULATOR · WHOIS / RDAP

Query RDAP via rdap.org. Approved public targets only.

DEBRIEF ▸ Registrar + creation date establish trust signals and pretext for social engineering. Status flags reveal hygiene.

SIMULATOR

Lab 23 · Live DNS (DoH)

Pull MX or TXT records to read the target's email posture.

LIVE SIMULATOR · DNS over HTTPS

Use Cloudflare DoH. Try TXT on the domain to see SPF + SaaS verification tokens, and TXT on _dmarc.<domain> to see the DMARC policy.

DEBRIEF ▸ SPF tells you which hosts may send mail; DMARC tells you the enforcement posture (none/quarantine/reject). Site-verification TXT records reveal SaaS usage.
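Reading the enforcement posture out of a DMARC record can be sketched as a tag parse. The record strings and the posture wording below are illustrative assumptions; only the tag names (v, p, rua) and the three policy values come from the DMARC format itself.

```python
# Plain-language reading of each policy value; the wording is this sketch's own.
POSTURE = {
    "none": "monitor-only: failing mail is still delivered",
    "quarantine": "partial enforcement: failing mail is treated as suspicious",
    "reject": "full enforcement: failing mail is refused",
}


def parse_dmarc(record):
    """Split a DMARC record into a dict of lower-cased tag names and values."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)  # split once so mailto: URIs survive
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_posture(record):
    """Describe the enforcement posture implied by the record's p= tag."""
    tags = parse_dmarc(record)
    if tags.get("v", "").upper() != "DMARC1":
        return "not a DMARC record"
    return POSTURE.get(tags.get("p", "").lower(), "unknown or missing policy")


# Made-up records for illustration.
print(dmarc_posture("v=DMARC1; p=none; rua=mailto:reports@example.test"))
print(dmarc_posture("v=DMARC1; p=reject"))
```

A p=none result is the finding to write up: the domain publishes DMARC but asks receivers to take no action on mail that fails it.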

SIMULATOR

Lab 24 · CT-log subdomain enumeration

Pull every subdomain that appears on a publicly logged TLS cert for the target via crt.sh.

LIVE SIMULATOR · Certificate Transparency (crt.sh)

Search Certificate Transparency. Look for non-obvious hosts (dev-*, internal-*, *-staging).

DEBRIEF ▸ CT logs are append-only — every cert is permanent intel. Anomalous subdomains often point to shadow IT or M&A footprint.

SIMULATOR

Lab 25 · Wayback Machine history

Pull the snapshot history of a target to find first-seen date and removed content.

LIVE SIMULATOR · Wayback Machine

Use the Internet Archive CDX API.

DEBRIEF ▸ Removed pages still in Wayback often expose forgotten admin portals or legacy stack details. First-seen date is a useful trust signal.

SIMULATOR

Lab 26 · HTTP security header audit

Score the target's defensive web headers out of 6.

LIVE SIMULATOR · HTTP Header Audit

Fetch the homepage. Read HSTS / CSP / XFO / XCTO / Referrer-Policy / Permissions-Policy.

DEBRIEF ▸ <3 = immature posture. 6 = mature defensive engineering. The Server header often reveals stack.

SIMULATOR

Lab 27 · robots.txt + sitemap recon

Parse the target's robots.txt and surface every Disallow path + sitemap reference.

LIVE SIMULATOR · robots.txt + sitemap.xml

Read /robots.txt — Disallow entries are recon hints, not security.

DEBRIEF ▸ Every Disallow is a sign-posted map of what the operator wanted hidden. Sitemaps reveal content inventories.

Knowledge Check

1. An RDAP response shows clientTransferProhibited status. This indicates…

2. A target's DMARC TXT record says 'p=none'. What does that tell an attacker?

3. crt.sh shows a wildcard *.dev.target.com cert issued last week. Most useful follow-up?

4. HTTP security score is 1/6 on a banking site. Single highest-ROI fix?


Challenge · Six-tool footprint sprint

Pick one approved public target. Run all six simulators. Produce a 6-bullet footprint with one finding per category (ownership, DNS, certs, history, headers, robots).

CEH v13 Exam Focus

★★★★★

Frequently tested

·Reading RDAP / WHOIS output fields
·Interpreting DNS record types and DMARC policies
·Using CT logs for subdomain discovery
·HTTP security header taxonomy
·robots.txt as recon (not security)

Memory tricks

·RDAP fields: REG-DATE-NS-STATUS
·DMARC policies: none → quarantine → reject (least to most enforcing)
·6 headers to score: HSTS, CSP, XFO, XCTO, Referrer, Permissions

Common traps

⚠Treating robots.txt as a defence
⚠Assuming privacy-WHOIS hides everything (CT logs still leak hosts)
⚠Reading p=none as 'protected' (it's monitor-only)
⚠Confusing 'HSTS present' with 'HSTS preloaded'

Rapid revision

▸RDAP > legacy WHOIS (structured JSON)
▸DoH = DNS-over-HTTPS (RFC 8484)
▸CT logs = append-only, public, permanent
▸CDX = Wayback's query interface
▸robots.txt = recon goldmine

Interview Prep
