Blog & How To Guides | WhoisXML API

Typosquatting Data Feed Blog

Typosquatting Feed data for a DNS firewall

A tremendous number of domain names are registered daily that resemble the legitimate domains of brands or organizations, or whose names imply a relation to a known service or product. Domains containing strings such as “support” or “account-verification”, which suggest a support or verification page, are also common. Initially, many of these are parked, and some are later used in malicious activities: they are sold at an inflated price to the legitimate owner, serve as botnet Command & Control (C&C) servers, or host fake pages in phishing campaigns, where victims type in sensitive data that is then sent to the miscreants.

It may seem trivial to remind users of the World Wide Web that they should be very cautious when following a link received in a message. But it is not: as the Web is used by people of diverse cultures, education, and technical backgrounds, phishing keeps finding victims. To say nothing of botnets: a bot is a hidden piece of software, so it uses DNS queries to find its C&C server without any notification to the user.


Domain Parking and the Typosquatting Feed

In an earlier post, we described the key elements of the domain parking ecosystem and discussed the risks that typically stem from a lack of appropriate regulation in this area. In the present post, we investigate the connection between typosquatting, bulk domain registrations, and domain parking by using WhoisXML API's Typosquatting Data Feed.

The Typosquatting Data Feed takes all second-level domains in all generic Top-Level Domains (TLDs) and some of the country-code TLDs that started to operate on the Internet on a given day. That is, these are newly registered or re-registered domains. It performs a lexical similarity-based clustering in search of groups of domains so that all domains in a group have similar names. Hence, the domain feed provides groups of newly registered domains that have been registered on the same day, are similarly named, and are frequently parts of bulk domain name registrations. 
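The grouping idea can be illustrated with a minimal Python sketch. It is not the feed's actual algorithm, which is not described here: it assumes a simple greedy grouping based on `difflib.SequenceMatcher`, and the function names, the 0.8 threshold, and the sample domains are all hypothetical.

```python
from difflib import SequenceMatcher


def second_level_label(domain: str) -> str:
    """Return the part of the name before the first dot."""
    return domain.lower().split(".", 1)[0]


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """True if two labels are lexically close (the ratio lies in [0, 1])."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def group_similar(domains, threshold: float = 0.8, min_size: int = 2):
    """Greedy grouping: a domain joins the first group that already
    contains a similarly named member, otherwise it starts a new group."""
    groups = []
    for domain in domains:
        label = second_level_label(domain)
        for group in groups:
            if any(similar(label, second_level_label(m), threshold) for m in group):
                group.append(domain)
                break
        else:
            groups.append([domain])
    # Keep only real groups; singletons are not interesting here.
    return [g for g in groups if len(g) >= min_size]


# Hypothetical list of domains registered on the same day.
new_domains = [
    "examplebank-login.com",
    "examplebank-logln.com",
    "examplebank-login.net",
    "gardening-tips.org",
]
groups = group_similar(new_domains)
```

Here the three look-alike names end up in one group, while the unrelated domain is dropped as a singleton. A production system would need a more scalable method than this pairwise comparison.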

We have found that these sets of domains are closely related to many illicit or semi-legal activities on the Internet that deserve attention, including typosquatting, but also phishing, malware activity, etc. In addition, since 1 July 2020 the data have been available in an "enriched" form: part of the WHOIS information and the IP addresses associated with the domains are also provided. We shall see below that this is very useful. So, let us see how the feed relates to domain parking.


Detect Possible Domain Spoofing and Homograph Attacks with Typosquatting Data Feed

Charles Caleb Colton once said that imitation was the sincerest form of flattery. This proverbial expression finds its origins in the 19th century and other historical writings before that. What likely wasn’t foreseen at the time, however, was that certain forms of imitation in the 21st century could give organizations terrible headaches. We are talking about domain spoofing and homograph attacks.

Imitators in our contemporary context can register one or several domain names highly similar to that of an established brand and use these to deceive people and trick them into sharing sensitive information or even transferring funds to fraudulent bank accounts.

Registering copycat domain names of known brands and organizations isn’t the only way to fool victims, though. At the height of coronavirus-themed attacks, the Typosquatting Data Feed proved useful in spotting potentially dangerous footprints containing thousands of domain names with word strings such as “covid” and “coronavirus” combined with “mask,” “vaccine,” “donation,” “lawsuit,” and plenty of others.

In this post, we put the feed’s capabilities to the test to detect spoofed domain names, including Punycode domains, that could be used to abuse employees, customers, and other parties who regularly interact with Lloyds Bank and Apple. We will also show how other sources of intelligence can help learn more about possible impersonators and the infrastructure they use.
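As a rough illustration of how a Punycode look-alike can be recognized, the following Python sketch decodes `xn--` labels with the standard `punycode` codec and flags labels that mix letters from several scripts. It is a simplified heuristic, not the method used in the post: the helper names are hypothetical, and the script is approximated by the first word of each character's Unicode name.

```python
import unicodedata


def decode_label(label: str) -> str:
    """Decode one DNS label; labels without the 'xn--' prefix are returned unchanged."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def scripts_in(label: str) -> set:
    """Approximate the scripts used in a label via the first word of each
    letter's Unicode name (e.g. 'LATIN', 'CYRILLIC')."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in label if ch.isalpha()}


def is_mixed_script(domain: str) -> bool:
    """Flag a domain if any single label mixes letters from several scripts."""
    return any(len(scripts_in(decode_label(part))) > 1 for part in domain.split("."))


# Build an illustrative look-alike: Cyrillic 'а' (U+0430) followed by Latin 'pple'.
spoofed = "\u0430pple.com".encode("idna").decode("ascii")

print(spoofed, is_mixed_script(spoofed))
print("apple.com", is_mixed_script("apple.com"))
```

A name written entirely in one non-Latin script would not be flagged by this check, so it can only complement a comparison against the protected brand names.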


Amplify a blacklist with the Typosquatting Data Feed. A technical blog

The Typosquatting Data Feed lists groups of domains that have been registered on the same day and whose names are similar to each other within the group. A question might be: why buy such data? Here we illustrate the power of the data set through a very efficient application: detecting malicious domains. A short Python script will be presented to show how it works. Then we will illustrate its efficiency by applying it to the PhishTank data feed, demonstrating that it is capable of revealing a tremendous number of additional domains.

Detection of malicious domains is an important and hard task in IT security. It is the major ingredient of protection against phishing, malware, botnet activity, etc. The most reliable approach to the problem is the use of blacklists such as PhishTank or URLhaus, where a community or a specialized group of experts publishes a list of domains or URLs that are confirmed to be malicious. PhishTank, for instance, is community operated: a number of benevolent activists do a great favor to all of us by checking suspicious domains and revealing their phishing activity.

A blacklist of domains is not only useful for direct use in firewalls or spam filters, though. It can also serve as an input for methods that find additional domains strongly related to the blacklisted ones, and which are therefore suspicious. By "amplification" of a blacklist we mean its extension with such a method. With WhoisXML API's recently introduced Typosquatting Data Feed, such an amplification can be easily achieved. Some of the domains in the original blacklist will turn out to be the "tip of the iceberg": we shall find a relevant set of related domains.
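The amplification step can be sketched in a few lines of Python. This is a minimal illustration and not the script from the full post: it assumes the feed has already been parsed into a list of groups (each a list of domain names), and the function name and sample data are hypothetical.

```python
def amplify(blacklist, groups):
    """Return domains that share a feed group with a blacklisted domain
    but are not themselves on the blacklist."""
    known = {d.lower() for d in blacklist}
    suspects = set()
    for group in groups:
        members = {d.lower() for d in group}
        if members & known:  # the group contains a blacklisted domain
            suspects |= members - known
    return sorted(suspects)


# Hypothetical inputs: in practice the groups would be parsed from the feed
# files and the blacklist taken from a source such as PhishTank.
blacklist = ["examplebank-login.com"]
groups = [
    ["examplebank-login.com", "examplebank-logln.com", "examplebank-login.net"],
    ["shop-deals-01.com", "shop-deals-02.com"],
]
print(amplify(blacklist, groups))
```

Only the group containing the blacklisted domain contributes new suspects; the unrelated group is ignored. The resulting names are candidates for further checks rather than confirmed malicious domains.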

Posted on April 16, 2020

Early Typosquatting Detection Made Possible: A Short Illustration in the Financial Sector

To those who keep an eye on trends in IT security threats, notably phishing and typosquatting attacks, the name Wells Fargo is not unfamiliar, not even to those who have no business relation whatsoever with this multinational financial services company. In fact, all financial companies are likely targets for phishing campaigns, and Wells Fargo has had TCPA settlement cases, which are amongst the greatest attractors of these kinds of threats. So, rather unsurprisingly, there has been continuous and significant malicious activity against this company.

Posted on March 18, 2020

The footprint of coronavirus disease in domain name registrations

Cybercriminals use all possibilities which can serve their evil aims. They follow the headlines and react quickly – and they do not have ethical considerations. Even the drama of the coronavirus terrorizing the entire world and causing the deaths of thousands of people is seen as a good ’business’ opportunity to spread out some malware.

IBM X-Force recently reported that the coronavirus went cyber via the Emotet trojan. Rather disgustingly, the miscreants send e-mails to people on behalf of respected health organizations, with attachments that claim to contain information about infection prevention measures. When the victim opens the attachment, it silently installs the trojan on the computer.

Traditional phishers are also on board. A typical case is described by Kaspersky: a coronavirus-related message containing a link to an Outlook-looking page that collects login credentials. All this has attracted a lot of media attention, of course...

Posted on February 20, 2020

TCPA settlements in the crosshairs of typosquatters

The Telephone Consumer Protection Act of 1991 (TCPA), Public Law 102-243, as also explained on its Wikipedia page, "restricts telephone solicitations (i.e., telemarketing & BPO) and the use of automated telephone equipment. The TCPA limits the use of automatic dialing systems, artificial or pre-recorded voice messages, SMS text messages, and fax machines."

Naturally, it has generated a number of court cases, which frequently result in calls for settlement claims. Victims can submit their claim online, either directly, or with the help of a number of lawyers and their companies specializing in helping with such cases. The related web pages attract a lot of visitors, and many of them type in the URL of the case manually - a very attractive situation to do some typosquatting… leaving a footprint of TCPA settlements in the records of WhoisXML API's Typosquatting Data Feed.

Posted on January 29, 2020

Typosquatting Daily Data Feed: the new enabler in the fight against phishing and malware

One result of our research and development is the introduction of the new "typosquatting data feed", an innovative data set based on our long-standing experience with cybersecurity and the Domain Name System. In what follows we will demonstrate how this new resource can be used efficiently in the fight against spam, phishing and malware.

The main idea behind the new data feed is the observation that domain names which were registered on the same day and have similar names have an increased likelihood of being involved in a range of IT scams, including typosquatting attacks, domain name hijacking, and also phishing and malware. So, we have developed a technology for finding these groups of domain names.
