DomainHunter: A Distributed System for Identifying Potentially Malicious Domains
Introduction
In today's digital landscape, phishing attacks remain one of the most prevalent threats to organizations and individuals. Attackers constantly register new domains that mimic legitimate services, often using sophisticated techniques to evade detection. Security teams need efficient tools to identify, analyze, and respond to these threats quickly.
In this post, I'll walk through DomainHunter, a distributed system built on Cloudflare Workers that helps security teams triage potentially malicious domains. We'll explore the architecture, the implementation details, and how the components work together as a domain monitoring pipeline.
System Architecture Overview
DomainHunter consists of four main components:
- Webhook Service: Receives domain alerts from Cloudflare's Brand Protection product and filters unwanted domains.
- Enrichment Service: Enhances domain data with WHOIS information, IP resolution details, and URLscan results.
- LogView Service: Provides a web interface for viewing and searching collected domain data.
- Graph Service: Offers a visualization dashboard for domain threat analytics.

Let's dive into each component to understand how they work together.
The Webhook Service: Processing Domain Alerts
The webhook service is the heart of DomainHunter. It receives notifications from Cloudflare's Brand Protection product about potentially suspicious domains that might be impersonating your brand or services.
This component performs several critical functions:
- Receives webhook notifications about potentially malicious domains
- Refangs the domain for processing (replaces '[.]' with '.')
- Validates and filters out noisy or spammy domains to conserve API resources
Here's a simplified version of the webhook handler:
async function handleWebhook(request, env) {
  // Parse the incoming request
  const alertData = await request.json();
  // Extract and refang the domain (replace every '[.]' with '.')
  let domain, domainDefang;
  if (alertData.domain) {
    // A global regex is needed here: replace('[.]', '.') would only swap the first match
    domain = alertData.domain.replace(/\[\.\]/g, '.');
    console.log(`Extracted Domain (refanged): ${domain}`);
    domainDefang = alertData.domain;
    console.log(`Extracted Domain (defanged): ${domainDefang}`);
  }
  // Get match details
  const match = alertData.match || 'Unknown';
  const matchTime = alertData.matchTime || new Date().toISOString();
  // Filter out noisy domain names (the patterns are placeholders)
  if (domain && /XXX|YYY|ZZZ/i.test(domain)) {
    console.log('Spam domain name detected, skipping processing');
    return new Response('Spam domain detected, skipping processing.', {
      status: 200,
      headers: { 'Content-Type': 'text/plain' },
    });
  }
  // Filter out spammy TLDs (placeholders). List each TLD with its leading dot
  // so that endsWith only matches a whole TLD, not the tail of a longer one.
  const spammyTLDs = ['XXX', 'YYY', 'ZZZ'];
  if (domain && spammyTLDs.some(tld => domain.endsWith(tld))) {
    console.log('Spammy TLD detected, skipping processing');
    return new Response('Spammy TLD detected, skipping processing.', {
      status: 200,
      headers: { 'Content-Type': 'text/plain' },
    });
  }
  // Processing continues...
}
Filtering Logic
The filtering logic is particularly important as it helps prevent noise and conserves your API quotas. The system filters domains based on:
- Known spam patterns in the domain name
- Specific TLDs frequently associated with abuse
- Other custom rules you might want to add based on your threat landscape
This initial triage ensures that only domains worthy of further investigation move to the enrichment phase.
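As a rough sketch of how this triage could be pulled out of the handler and made configurable: the rule values and the names `FILTER_RULES`, `refang`, and `skipReason` below are placeholders I made up for illustration, not the values or helpers used in the real deployment.

```javascript
// Hypothetical rule set: the patterns and TLDs are illustrative placeholders.
const FILTER_RULES = {
  spamPatterns: [/casino/i, /free-gift/i],
  spammyTlds: ['.example', '.invalid'], // leading dot so only whole TLDs match
};

// Replace every defanged dot, not only the first one.
function refang(defanged) {
  return defanged.replace(/\[\.\]/g, '.');
}

// Return a short reason when the domain should be skipped, or null to keep it.
function skipReason(domain, rules = FILTER_RULES) {
  const name = domain.toLowerCase();
  if (rules.spamPatterns.some((pattern) => pattern.test(name))) {
    return 'spam pattern';
  }
  if (rules.spammyTlds.some((tld) => name.endsWith(tld))) {
    return 'spammy TLD';
  }
  return null;
}
```

Returning a reason rather than a boolean keeps the log line and the HTTP response text in one place, and new rule types can be added without touching the handler.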
The Enrichment Service: Gathering Intelligence
One of the most valuable aspects of DomainHunter is its ability to enrich domain data with information from multiple sources. This gives security teams context about a potentially malicious domain. In the example code below, my real domain has been replaced with the placeholder [YOURDOMAIN]. If you deploy the same code (should I eventually release it), substitute your own domain name.
// Fetch Whois information for the domain
// (query values are URL-encoded so unexpected characters cannot alter the request)
const whoisUrl = `https://whois.[YOURDOMAIN]/?domainName=${encodeURIComponent(domain)}`;
const whoisResponse = await fetch(whoisUrl, {
  method: 'GET',
  headers: {
    'Content-Type': 'application/json',
  },
});
const whoisData = await whoisResponse.json();
let registrarName = 'N/A';
let hostNames = [];
if (whoisData && whoisData.registrar) {
  registrarName = whoisData.registrar;
}
if (whoisData && Array.isArray(whoisData.hostNames)) {
  hostNames = whoisData.hostNames;
}
// Fetch IP information if IPs are available (only the first IP is looked up)
let ipData = null;
let org = 'N/A';
let formattedIps = 'N/A';
if (whoisData && Array.isArray(whoisData.ips) && whoisData.ips.length > 0) {
  formattedIps = whoisData.ips.join('\n');
  const ipInfoUrl = `https://ipinfo.[YOURDOMAIN]/?ip=${encodeURIComponent(whoisData.ips[0])}`;
  const ipInfoResponse = await fetch(ipInfoUrl, {
    method: 'GET',
    headers: {
      'Content-Type': 'application/json',
    },
  });
  ipData = await ipInfoResponse.json();
  if (ipData && ipData.org) {
    org = ipData.org;
  }
}
// Fetch URLScan information for the domain
const urlScanUrl = `https://urlscan.[YOURDOMAIN]/?url=${encodeURIComponent(`https://${domain}`)}`;
const urlScanResponse = await fetch(urlScanUrl, {
  method: 'GET',
  headers: {
    'Content-Type': 'application/json',
  },
});
const urlScanData = await urlScanResponse.json();
let resultUrl = 'N/A';
if (urlScanData && urlScanData.result) {
  resultUrl = urlScanData.result;
}
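The snippet above assumes every enrichment call succeeds and returns JSON. One possible hardening, sketched here under my own assumptions, is a small helper that returns null on any failure so the existing 'N/A' defaults apply. The name `fetchJsonOrNull` and the injectable `fetchImpl` parameter are hypothetical, and the example host is a stand-in for your own wrapper endpoint.

```javascript
// Fetch JSON from an enrichment endpoint, returning null instead of throwing
// so that one failing source does not abort processing of the whole alert.
// `fetchImpl` is injectable purely to make the helper easy to test.
async function fetchJsonOrNull(baseUrl, params, fetchImpl = fetch) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value); // searchParams URL-encodes the value
  }
  try {
    const response = await fetchImpl(url.toString(), {
      headers: { Accept: 'application/json' },
    });
    if (!response.ok) {
      console.log(`Enrichment request to ${url.hostname} failed: ${response.status}`);
      return null;
    }
    return await response.json();
  } catch (err) {
    // Network errors and invalid JSON both end up here.
    console.log(`Enrichment request to ${url.hostname} errored: ${err}`);
    return null;
  }
}
```

With a helper like this, the WHOIS lookup would become something like `const whoisData = await fetchJsonOrNull('https://whois.example.com/', { domainName: domain });`, and the existing `if (whoisData && ...)` checks keep working unchanged.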
Understanding the Intelligence APIs
To build a comprehensive threat profile for each domain, DomainHunter leverages three specialized intelligence APIs that work together to provide a multi-dimensional view:
1. WHOIS Intelligence (whois.[YOURDOMAIN])
This API provides critical domain registration information:
- Registrar details (which company the domain was registered through)
- Registration and expiration dates (newly registered domains are often suspicious)
- Name servers that host the domain's DNS records
- IP addresses the domain resolves to
- Ownership information that can help identify patterns across malicious domains
- Historical registration data
This information helps security teams evaluate the legitimacy of a domain by examining its age, who registered it, and what infrastructure it uses. In my specific use case, WhoisXML generously supported my independent security research. I created a Worker wrapper around their WHOIS API to query and extract the exact data points needed for this project.
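Since registration age is one of the more useful signals here, this is a minimal sketch of how it could be computed. It assumes the WHOIS wrapper exposes the creation date as an ISO 8601 string; the helper names and the 30-day threshold are arbitrary choices of mine, not part of the system described in this post.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Return the domain's age in whole days, or null when the date is missing or unparseable.
function domainAgeDays(createdDate, now = new Date()) {
  if (!createdDate) return null;
  const created = new Date(createdDate);
  if (Number.isNaN(created.getTime())) return null;
  return Math.floor((now.getTime() - created.getTime()) / DAY_MS);
}

// Flag domains registered within the last `thresholdDays` days.
function isNewlyRegistered(createdDate, now = new Date(), thresholdDays = 30) {
  const age = domainAgeDays(createdDate, now);
  return age !== null && age >= 0 && age < thresholdDays;
}
```

A flag like this could be added to the Slack message or stored as an extra column so that fresh registrations stand out during review.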
2. IP Intelligence (ipinfo.[YOURDOMAIN])
Once we have the IP addresses from the WHOIS data, this API enriches our understanding of the hosting infrastructure:
- Hosting organization/provider (certain providers are frequently used by threat actors)
- Geographic location of the server
- Autonomous System Number (ASN) information
- Network details (subnet, range, etc.)
- Abuse contact information for the IP
- Additional domains hosted on the same IP (potentially revealing campaign infrastructure)
For this I use data from IPinfo. It helps identify hosting patterns and infrastructure connections that might not be apparent from domain data alone.
3. Content Analysis (urlscan.[YOURDOMAIN])
This API performs active analysis of the website content:
- Initiates a real-time scan of the website
- Captures screenshots of the site (visual evidence)
- Analyzes page content for phishing indicators
- Detects malicious scripts, iframes, and redirects
- Identifies technologies used on the site
- Maps connections to other domains and resources
- Provides a detailed report URL for manual investigation
This active scanning component is crucial for understanding what's actually hosted on the domain and determining if it contains phishing content.
By combining these three intelligence sources, DomainHunter creates a multi-dimensional threat assessment of each domain, providing security teams with the context they need to make informed decisions.
Real-Time Alerts via Slack
After processing the domain, DomainHunter sends a formatted notification to a Slack channel, allowing security teams to quickly assess the threat:
// Note: formattedHostNames is hostNames.join('\n') and must be defined before this point
const slackUrl = env.SLACK_WEBHOOK_URL;
const slackMessage = {
  text: `New Detection: \`${domainDefang}\`
Detection Time: \`${matchTime}\`
Matched this query: \`${match}\`
Domain resolves to: \`\`\`${formattedIps}\`\`\`
Host/CDN: \`${org}\`
Name servers: \`\`\`${formattedHostNames}\`\`\`
Registrar Name: \`${registrarName}\`
URLscan: <${resultUrl}>`
};
const slackResponse = await fetch(slackUrl, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify(slackMessage)
});
The Slack notification provides a concise summary of all the critical information, allowing security teams to make quick assessments.
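If you would rather build the message in a function that can be tested on its own, one option looks roughly like the sketch below. The helper names are hypothetical, and for brevity it uses inline code formatting for every field instead of the multi-line code blocks used in the handler.

```javascript
// Wrap a value in Slack inline-code formatting, falling back to 'N/A' when it is empty.
function inlineCode(value) {
  const text = value === undefined || value === null || value === '' ? 'N/A' : String(value);
  return '`' + text + '`';
}

// Build the Slack payload from a detection record.
// Field names mirror the variables used in the handler.
function buildSlackMessage(detection) {
  const lines = [
    'New Detection: ' + inlineCode(detection.domainDefang),
    'Detection Time: ' + inlineCode(detection.matchTime),
    'Matched this query: ' + inlineCode(detection.match),
    'Domain resolves to: ' + inlineCode(detection.formattedIps),
    'Host/CDN: ' + inlineCode(detection.org),
    'Name servers: ' + inlineCode(detection.formattedHostNames),
    'Registrar Name: ' + inlineCode(detection.registrarName),
  ];
  // Only add the URLscan link when a report URL exists, to avoid posting "<N/A>".
  if (detection.resultUrl && detection.resultUrl !== 'N/A') {
    lines.push('URLscan: <' + detection.resultUrl + '>');
  }
  return { text: lines.join('\n') };
}
```

Keeping the defanged domain in the message is deliberate: it stops Slack from turning a suspicious domain into a clickable link.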

Benefits of Real-Time Alerts
- Rapid response: Security teams can immediately see new detections
- Actionable intel: All critical data points are included
- Direct URLscan link: One-click access to visual evidence
- Seamless workflow: No need to constantly check a dashboard
Persistent Storage with Cloudflare D1
All the collected information is stored in a Cloudflare D1 database for historical analysis and reporting:
// Store data in D1 database
const { d1 } = env;
const formattedHostNames = hostNames.join('\n');
const insertQuery = `INSERT INTO logs (domain, match_time, match, ips, name_servers, registrar, host_cdn, urlscan_result)
  VALUES (?, ?, ?, ?, ?, ?, ?, ?)`;
const stmt = d1.prepare(insertQuery);
const result = await stmt
  .bind(domain, matchTime, match, formattedIps, formattedHostNames, registrarName, org, resultUrl)
  .run();
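The INSERT above presumes that a `logs` table already exists. I haven't shown the real schema, so treat the following as a guess that is merely consistent with the queries in this post: the column names come from the INSERT, the `id` column from the LogView query, and the column types and the `ensureLogsTable` helper are assumptions.

```javascript
// Assumed schema: column names follow the INSERT statement, types are guesses.
// "match" is quoted because MATCH is also an SQL keyword.
const CREATE_LOGS_TABLE = `
  CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL,
    match_time TEXT,
    "match" TEXT,
    ips TEXT,
    name_servers TEXT,
    registrar TEXT,
    host_cdn TEXT,
    urlscan_result TEXT
  )`;

// Create the table if it is missing. `d1` is the D1 binding taken from `env`.
async function ensureLogsTable(d1) {
  return d1.prepare(CREATE_LOGS_TABLE).run();
}
```

In practice you would more likely apply a schema like this once as a migration than run it on every request.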
Why Cloudflare D1?
Using Cloudflare D1 for storage provides several advantages:
- Zero infrastructure management: No database servers to maintain
- Global distribution: Data is stored close to where it's used
- SQL compatibility: Familiar query language for data analysis
- Automatic scaling: Handles high volumes of data without provisioning concerns
- Direct integration: Seamlessly works with Cloudflare Workers

The LogView Service: Searching and Analyzing Domains
The LogView service provides a web interface for viewing and searching all collected domain data. It features:
- Basic authentication for secure access
- A sortable and searchable HTML table
- Pagination for browsing large datasets
- Direct links to URLScan results
- Export functionality for further analysis
async function handleRequest(request, env) {
  // Basic Authentication
  const authHeader = request.headers.get('Authorization');
  // Placeholder credentials; in a real deployment, read these from Worker secrets instead of hard-coding them
  const expectedAuth = 'Basic ' + btoa('XXX:YYY'); // Username: 'XXX', password: 'YYY'
  if (authHeader !== expectedAuth) {
    return new Response('Unauthorized', {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Basic realm="Logs Viewer"',
        'Content-Type': 'text/plain'
      },
    });
  }
  // Query the database for all logs
  const { d1 } = env;
  const stmt = d1.prepare('SELECT * FROM logs ORDER BY id DESC');
  const result = await stmt.all();
  // Generate HTML table with the results
  // Including search, sort, and pagination functionality
  // ...
}
The LogView interface allows security teams to conduct historical analysis, search for patterns, and create reports based on collected data.
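The handler above loads every row and leaves pagination to the generated page. If the table grows large, paging in the query itself is an alternative. This is a sketch under my own assumptions: the `page` query parameter, the page size of 50, and both helper names are hypothetical.

```javascript
const PAGE_SIZE = 50; // arbitrary default

// Parse ?page=N from the request URL, falling back to page 1 for missing or invalid values.
function parsePage(requestUrl) {
  const raw = new URL(requestUrl).searchParams.get('page');
  const page = Number.parseInt(raw ?? '1', 10);
  return Number.isInteger(page) && page > 0 ? page : 1;
}

// Fetch a single page of logs, newest first, instead of the whole table.
async function fetchLogPage(d1, page, pageSize = PAGE_SIZE) {
  const offset = (page - 1) * pageSize;
  const { results } = await d1
    .prepare('SELECT * FROM logs ORDER BY id DESC LIMIT ? OFFSET ?')
    .bind(pageSize, offset)
    .all();
  return results;
}
```

Inside the handler this would be used as `const rows = await fetchLogPage(env.d1, parsePage(request.url));`.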

The Graph Service: Visualizing Domain Threats
The Graph service provides a visualization dashboard using Chart.js to display:
- Top domain registrars used by potential threat actors
- Top hosting providers/CDNs where suspicious domains are hosted
- Top match patterns (security rules that triggered alerts)
- Top TLDs (Top-Level Domains) used in suspicious domains
- Trends over time for all metrics
async function handleRequest(request, env) {
  const { d1 } = env;
  // Query for top registrars
  const registrarStmt = d1.prepare(`
    SELECT registrar, COUNT(*) as count
    FROM logs
    WHERE registrar IS NOT NULL AND registrar != 'N/A'
    GROUP BY registrar
    ORDER BY count DESC
    LIMIT 10
  `);
  const registrarResult = await registrarStmt.all();
  // Similar queries for host_cdn, match, and TLDs
  // ...
  // Format data for Chart.js
  const registrarLabels = registrarResult.results.map(item => item.registrar);
  const registrarData = registrarResult.results.map(item => item.count);
  // Generate HTML with Chart.js visualizations
  // ...
}
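The TLD chart is the one metric that does not map onto a simple GROUP BY, because the TLD has to be cut out of the domain first. One way to do that is in the Worker rather than in SQL, as in this sketch. The `topTlds` helper is hypothetical, and it naively treats everything after the last dot as the TLD, so a domain under `co.uk` is counted as `uk`.

```javascript
// Count the most common TLDs in a list of domains and return Chart.js-style
// parallel arrays of labels and values.
function topTlds(domains, limit = 10) {
  const counts = new Map();
  for (const domain of domains) {
    const name = domain.toLowerCase().replace(/\.$/, ''); // drop a trailing root dot
    const dot = name.lastIndexOf('.');
    if (dot === -1) continue; // not a dotted name, skip it
    const tld = name.slice(dot + 1);
    counts.set(tld, (counts.get(tld) || 0) + 1);
  }
  // Sort by count descending, then alphabetically so ties are stable.
  const top = [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit);
  return {
    labels: top.map(([tld]) => tld),
    data: top.map(([, count]) => count),
  };
}
```

The `labels` and `data` arrays can be passed to Chart.js in the same way as `registrarLabels` and `registrarData` above.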
Benefits of Visualization
The visualization dashboard provides several benefits:
- Pattern recognition: Quickly identify common infrastructure used by attackers
- Trend analysis: See how attack patterns evolve over time
Future Enhancements
There are several ways DomainHunter could be enhanced in the future:
- Machine learning integration: Automatically classify domains based on risk scores
- Threat intelligence feeds: Integrate with external threat feeds for additional context
- Automated takedown workflow: Create a workflow for submitting abuse reports when there is confirmed malicious activity
- Custom alerting rules: Allow security teams to define their own alerting criteria
Conclusion
DomainHunter demonstrates how Cloudflare Workers, combined with D1 and external APIs, can create a powerful system for detecting and responding to potentially malicious domains. The architecture is:
- Distributed: Each component runs as a separate Cloudflare Worker
- Scalable: Workers automatically scale to handle load
- Responsive: Real-time alerts via Slack
- Comprehensive: Enriches domain data from multiple sources
- Analytical: Provides visualization tools for threat analysis
This approach to security tooling leverages the serverless paradigm to create a system that is both efficient and effective for security teams. The combination of automated detection, enrichment, and visualization allows for rapid identification and response to potential phishing threats.
For organizations looking to enhance their security posture against domain-based threats, this architecture provides a solid foundation that can be customized and extended based on specific needs.