Prompted by some colleagues, I was researching domain abuse and domain-squatting scanners. These scanners look for domains that are similar to a given name and could be used for phishing and similar attacks. Examples of such domains targeting my blog ‘ciko.io’ would be ‘ClKO.io’ (replacing the I with an l), ‘wwwciko.io’ (a mistype of www.ciko.io), or ‘ciko-blog.io’.

I was under the impression that such scanners are widely available. The most prominent ones are dnstwist and urlcrazy. Both tools are useful, but they sacrifice completeness for scan speed.

To check whether a domain is registered, the safest way is a whois query against the domain name database. These databases also contain personal information about the domain holders. To prevent misuse, whois queries are rate-limited to a few hundred per day and querying IP. Another, less reliable option is to query DNS servers for the SOA or NS record of a given domain. Registered domains are not required to have an SOA or NS record set, but almost all of them have one by default. The benefit of DNS requests is the much higher scan speed: public DNS servers serve up to 20 requests per second and have no relevant daily limit. A minimal sketch of such a check is shown below.
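The following is a minimal sketch of the DNS-based check described above, using the dnspython package. The resolver address (8.8.8.8) and the example domain are arbitrary choices for illustration, not part of any particular tool, and a real scanner would add retries and rate control.

```python
# Sketch: decide whether a domain looks registered by asking a public DNS
# resolver for its SOA record. Assumptions: dnspython is installed and the
# chosen resolver (8.8.8.8) is reachable.
import dns.exception
import dns.resolver


def looks_registered(domain: str) -> bool:
    resolver = dns.resolver.Resolver()
    resolver.nameservers = ["8.8.8.8"]        # any public resolver works
    try:
        resolver.resolve(domain, "SOA")       # most registered domains have one
        return True
    except dns.resolver.NXDOMAIN:
        return False                          # name does not exist at all
    except dns.resolver.NoAnswer:
        return True                           # name exists, just no SOA at this node
    except dns.exception.Timeout:
        return False                          # inconclusive; a real scanner would retry


if __name__ == "__main__":
    print(looks_registered("ciko.io"))
```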

Both dnstwist and urlcrazy generate domain names by changing and replacing letters in the domain name, as well as iterating over commonly used top-level domains (a small sketch of this kind of permutation follows below). Per keyword they scan between 1,000 and 5,000 domains. Unfortunately, these shallow scans miss a lot of potential domain misuse. Possible phishing domains include not only generic typosquatting variants but also context-sensitive ones. For a bookshop called “tarboris.com”, possible phishing domains could be “TARB0RIS.com” (found by dnstwist), but also “tarboris-books.com” (not found).
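To make the permutation idea concrete, here is a small sketch of the kind of letter-level variants such tools produce. The homoglyph table is deliberately tiny and the patterns are my own simplification, not the actual generation logic of dnstwist or urlcrazy.

```python
# Sketch: generate typo-style domain variants via deletions, adjacent swaps,
# and a few homoglyph substitutions. Real tools ship far larger tables and
# many more permutation strategies.
HOMOGLYPHS = {"o": ["0"], "i": ["l", "1"], "l": ["i", "1"], "e": ["3"]}


def typo_variants(name: str) -> set[str]:
    variants = set()
    # deletions: drop one character at a time
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])
    # transpositions: swap adjacent characters
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # homoglyphs: replace look-alike characters
    for i, ch in enumerate(name):
        for sub in HOMOGLYPHS.get(ch, []):
            variants.add(name[:i] + sub + name[i + 1:])
    variants.discard(name)
    return variants


print(sorted(f"{v}.com" for v in typo_variants("tarboris")))
```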

To tackle these issues I wrote my own scanner, monodon. Monodon sacrifices scan speed for completeness. A regular run spans 10,000 to 1,000,000 domains and takes a few hours, instead of the minutes that dnstwist and urlcrazy need.

Monodon has several methods to generate domain names. Besides the standard character-insertion, deletion, and homoglyph variants, monodon uses Wikipedia to generate topic-related word lists. These ensure far better coverage and surface phishing domains that would otherwise have stayed hidden.
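To illustrate the word-list idea, here is a sketch of how topic-related candidates could be derived from Wikipedia. This is my own simplified illustration, not monodon's actual code; the search term “bookshop” and the combination patterns are assumptions for the example.

```python
# Sketch: build a topic-related word list from Wikipedia search results and
# combine the words with a brand name into candidate domains.
import re
import requests


def topic_words(term: str, limit: int = 20) -> set[str]:
    # MediaWiki search API: returns article titles related to the term
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={"action": "query", "list": "search", "srsearch": term,
                "srlimit": limit, "format": "json"},
        timeout=10,
    )
    titles = (hit["title"] for hit in resp.json()["query"]["search"])
    # split titles into lowercase single words usable in domain names
    words: set[str] = set()
    for title in titles:
        words.update(re.findall(r"[a-z]+", title.lower()))
    return words


def candidate_domains(base: str, tld: str, words: set[str]) -> set[str]:
    # combine the brand name with each topic word, hyphenated and plain
    return {f"{base}-{w}.{tld}" for w in words} | {f"{base}{w}.{tld}" for w in words}


print(sorted(candidate_domains("tarboris", "com", topic_words("bookshop")))[:10])
```

Each generated candidate would then be checked for registration with the DNS-based test shown earlier.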

To try out monodon, follow the usage instructions on GitHub.