Basic database | Specifications | Typosquatting Data Feed | WhoisXML API

Basic database

Data availability

Subscription type Description
Daily updates Available by 6:00 p.m. UTC on most days. On some days the input data coming from other daily feeds take longer to arrive, and the data are generated 8 hours later.
Weekly updates Weeks start on Sunday. The data files for the previous week become available every Monday at 8 p.m. UTC.
Monthly updates The data files for the previous month are available on the second day of the month at 8 p.m. UTC.
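
The weekly and monthly schedules can be expressed as a small helper, for instance to decide when a download job should first look for a file. This is a minimal sketch, not part of the feed itself: the function names are hypothetical, and it assumes that the argument is the date appearing in the file name (a Sunday for weekly files, the first day of a month for monthly files) and that a monthly file is published on the second day of the month in its name. Daily files are left out because their availability time varies.

```python
from datetime import date, datetime, time, timedelta, timezone

# 8 p.m. UTC, the publication time stated for weekly and monthly files.
PUBLISH_TIME = time(20, 0, tzinfo=timezone.utc)


def weekly_available_at(file_date: date) -> datetime:
    """Expected publication time of the weekly file dated `file_date`."""
    if file_date.weekday() != 6:  # Monday is 0, Sunday is 6
        raise ValueError("weekly file dates are expected to be Sundays")
    # Assumption: the file appears on the Monday right after that Sunday.
    return datetime.combine(file_date + timedelta(days=1), PUBLISH_TIME)


def monthly_available_at(file_date: date) -> datetime:
    """Expected publication time of the monthly file dated `file_date`."""
    if file_date.day != 1:
        raise ValueError("monthly file dates are expected to be the 1st")
    # Assumption: the file appears on the 2nd of the month in its name.
    return datetime.combine(file_date.replace(day=2), PUBLISH_TIME)
```

Both functions return timezone-aware UTC datetimes, so they can be compared directly with `datetime.now(timezone.utc)`.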

Naming convention

Subscription type Description
Daily updates typosquatting.YYYY-MM-DD.daily.full.basic.csv
Weekly updates typosquatting.YYYY-MM-DD.weekly.full.basic.csv, where the date is a Sunday, the last day whose data are included in the file. A week is thus considered to start on the preceding Monday and end on the Sunday in the file name, and the file is published on the Monday after the date in the file name.
Monthly updates typosquatting.YYYY-MM-DD.monthly.full.basic.csv, where the date is the first day of the following month; e.g., data for July 2020 are in the file typosquatting.2020-08-01.monthly.full.basic.csv.


Note that the weekly and monthly data are the concatenation of the respective daily data, with the date added as the first field.
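
The naming rules can be applied programmatically to find the file that covers a given day. The following is a minimal sketch with hypothetical function names; it assumes the weekly file is dated with the Sunday on or after the day in question and the monthly file with the first day of the following month, as described in the table above.

```python
from datetime import date, timedelta


def daily_file_name(day: date) -> str:
    """Name of the daily file for `day`."""
    return f"typosquatting.{day.isoformat()}.daily.full.basic.csv"


def weekly_file_name(day: date) -> str:
    """Name of the weekly file covering `day` (dated with the closing Sunday)."""
    # weekday(): Monday is 0 ... Sunday is 6, so a Sunday maps to itself.
    sunday = day + timedelta(days=(6 - day.weekday()) % 7)
    return f"typosquatting.{sunday.isoformat()}.weekly.full.basic.csv"


def monthly_file_name(day: date) -> str:
    """Name of the monthly file covering `day` (dated the 1st of next month)."""
    if day.month == 12:
        first_of_next = date(day.year + 1, 1, 1)
    else:
        first_of_next = date(day.year, day.month + 1, 1)
    return f"typosquatting.{first_of_next.isoformat()}.monthly.full.basic.csv"
```

For example, `monthly_file_name(date(2020, 7, 15))` yields `typosquatting.2020-08-01.monthly.full.basic.csv`, matching the July 2020 example above.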

CSV structure

The basic data files are comma-separated values (CSV) files without text delimiters. The files use DOS/Windows-style line terminators (CR+LF). The first line is a header line with the field names. Each line has five or six fields depending on the subscription type:

Field Description
date The day when the group was detected (only in weekly and monthly files).
group_number Ordinal number of the group within the given day (in case of daily subscription, within the file).
group_member_number Ordinal number of the domain within the group.
total_no_of_grp_members Number of group members within the group.
domain Domain name.
domain_utf Domain name transcribed to Unicode; only for domains with national (non-English) characters, empty otherwise.
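
A file with this structure can be read with Python's standard csv module. The sketch below is illustrative only: the `Record` type and `read_basic` function are hypothetical names, the presence of the date field is inferred from the number of columns in the header, and the header and data rows in the demonstration are typed in by hand following the field list above.

```python
import csv
import io
from typing import Iterator, NamedTuple, Optional


class Record(NamedTuple):
    date: Optional[str]  # None in daily files, which have no date field
    group_number: int
    group_member_number: int
    total_no_of_grp_members: int
    domain: str
    domain_utf: str  # empty string when the domain is plain ASCII


def read_basic(stream) -> Iterator[Record]:
    """Yield one Record per data line of a basic typosquatting CSV stream."""
    reader = csv.reader(stream)
    header = next(reader)  # the first line holds the field names
    has_date = len(header) == 6  # weekly and monthly files have a leading date
    for row in reader:
        if not row:  # tolerate a blank trailing line
            continue
        day = row[0] if has_date else None
        group, member, total, domain, domain_utf = row[1:] if has_date else row
        yield Record(day, int(group), int(member), int(total), domain, domain_utf)


# Hand-typed demonstration input in the daily layout, with CR+LF terminators.
sample = (
    "group_number,group_member_number,total_no_of_grp_members,domain,domain_utf\r\n"
    "1059,4,5,worldthinkcreativity.info,\r\n"
    "1059,5,5,xn--wrkdthinkcreativity-g5c.net,wırkdthinkcreativity.net\r\n"
)
records = list(read_basic(io.StringIO(sample)))
```

When reading from disk, open the file with `newline=""` so that the csv module handles the CR+LF terminators itself, e.g. `open(path, newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption here.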


E.g., two adjacent groups, No. 1058 and 1059, with 3 and 5 members, respectively, appear in the file as:

...
1058,1,3,slut.bar,
1058,2,3,slut.events,
1058,3,3,slut.red,
1059,1,5,worldthinkcreativity.online,
1059,2,5,worldthinkcreativity.org,
1059,3,5,worldthinkcreativity.com,
1059,4,5,worldthinkcreativity.info,
1059,5,5,xn--wrkdthinkcreativity-g5c.net,wırkdthinkcreativity.net
...

The last domain in the list has a non-English character (a dotless "ı") as its second letter, which is why its last field is not empty. In a weekly or monthly file, the lines of a group look like this:

...
2020-08-17,3,1,9,app1e1d05.com,
2020-08-17,3,2,9,app1e1d09.com,
2020-08-17,3,3,9,app1e1d03.com,
2020-08-17,3,4,9,app1e1d04.com,
2020-08-17,3,5,9,app1e1d02.com,
2020-08-17,3,6,9,app1e1d01.com,
2020-08-17,3,7,9,app1e1d07.com,
2020-08-17,3,8,9,app1e1d08.com,
2020-08-17,3,9,9,app1e1d06.com,
...

Note that in these files a group is uniquely identified by the date together with the ordinal number of the group (the first two fields); the group number alone is unique only within a single day.
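
When processing weekly or monthly data, the rows can therefore be collected into groups keyed by the (date, group number) pair. The following is a minimal sketch with a hypothetical function name; it also uses the total_no_of_grp_members field to flag groups for which fewer or more rows were seen than announced. The last demonstration row is made up to show that the same group number on another day forms a separate group.

```python
from collections import defaultdict


def group_domains(rows):
    """Group rows of (date, group_number, group_member_number, total, domain).

    Returns a dict mapping (date, group_number) to the list of domains, and
    a list of keys whose row count differs from the announced total.
    """
    groups = defaultdict(list)
    expected = {}
    for day, group, _member, total, domain in rows:
        key = (day, group)  # the date and the group number identify a group
        groups[key].append(domain)
        expected[key] = total
    incomplete = [k for k, members in groups.items() if len(members) != expected[k]]
    return dict(groups), incomplete


rows = [
    ("2020-08-17", 3, 1, 9, "app1e1d05.com"),
    ("2020-08-17", 3, 2, 9, "app1e1d09.com"),
    ("2020-08-18", 3, 1, 1, "example.com"),  # made-up row, not from the feed
]
groups, incomplete = group_domains(rows)
```

Here `groups` holds two separate entries although both use group number 3, and `incomplete` lists the 2020-08-17 group because only two of its nine announced members were supplied.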