Researchers analysing the top 100,000 websites globally found that many of them collect user information, including email addresses, and hand it over to third-party trackers without user consent. The research found that 1,844 websites did this when a user visited the site from the EU and that 2,950 websites did this for visitors from the United States. Some of these websites were even found to collect passwords.
To carry out this research, the team used a web crawler that finds and fills email and password fields, while monitoring network traffic for leaks and intercepting script access to the filled input fields. The research was a joint effort between researchers at KU Leuven, Radboud University, and the University of Lausanne.
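The leak-detection step can be illustrated with a minimal sketch. It assumes a leak is flagged whenever the filled-in value, or a common encoding or hash of it, appears in an outgoing request. The function names and the list of encodings are hypothetical and chosen for illustration; they are not taken from the researchers' crawler.

```python
import base64
import hashlib
from urllib.parse import quote


def encoded_variants(value: str) -> dict[str, str]:
    """Return common encodings and hashes of a filled-in form value."""
    raw = value.encode("utf-8")
    return {
        "plain": value,
        "urlencoded": quote(value, safe=""),
        "base64": base64.b64encode(raw).decode("ascii"),
        "md5": hashlib.md5(raw).hexdigest(),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


def find_leaks(value: str, request_url: str, request_body: str = "") -> list[str]:
    """List the encodings of `value` that appear in a request's URL or body."""
    haystack = request_url + "\n" + request_body
    return [
        name
        for name, token in encoded_variants(value).items()
        if token in haystack
    ]


if __name__ == "__main__":
    email = "user@example.com"  # hypothetical address typed by the crawler
    hashed = hashlib.sha256(email.encode("utf-8")).hexdigest()
    # Hypothetical third-party request captured while the form was being filled
    url = "https://tracker.example/collect?e=" + hashed
    print(find_leaks(email, url))
```

A simple substring search like this only catches single-step encodings. Values that are compressed, encrypted or hashed several times over would need a broader search.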
In a follow-up investigation, researchers found that Meta (Facebook) and TikTok collect hashed personal information from web forms even when the user filling the form doesn’t submit it or give consent. Both Meta and TikTok have a ‘Pixel’ that can be placed on sites with a feature called Automatic Advanced Matching, which automatically collects hashed personal identifiers from web forms. These identifiers are then used to target advertisements on the companies’ respective platforms, measure conversions or create custom audiences for advertisers.
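To make "hashed personal identifiers" concrete, here is a minimal sketch. It assumes an identifier is trimmed, lowercased and hashed with SHA-256, a common convention for this kind of matching. The article does not describe the exact steps either pixel uses, and the function name is hypothetical.

```python
import hashlib


def hash_identifier(identifier: str) -> str:
    """Normalize an identifier (assumed: trim and lowercase), then SHA-256 it."""
    normalized = identifier.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The same address always produces the same hash, so the hash can still
    # be matched against an account that was registered with that address.
    first = hash_identifier("  User@Example.com ")
    second = hash_identifier("user@example.com")
    print(first == second)
```

Because the hash is deterministic, it works as a stable identifier: any party that already knows the email address can compute the same value and link the two records.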
“We believe the leaks are due to Facebook’s script interpreting clicks on irrelevant buttons as ‘submit button clicked’ events,” wrote the researchers in a paper detailing the study. They reported the bugs to both Meta and TikTok. According to the researchers, Meta said it had assigned the issue to an engineering team, while TikTok had yet to respond. (TikTok was notified later because the bug in its script was discovered later.)
USAToday, Trello, The Independent, Shopify, Newsweek and AZcentral figured among the top ten websites leaking email addresses of EU users to third-party tracker domains. Business Insider, USAToday, Time, Fox News, Trello, The Verge and WebMD were on the same list for US users. All of the websites leaked email addresses to third-party trackers without the user interacting with the consent dialogue.
The researchers also found over 52 instances of websites incidentally collecting passwords and relaying them to third-party trackers. Russian search engine Yandex was found to be the top third-party collector of these passwords. According to the researchers, these issues were later corrected by the websites and Yandex after they were pointed out to them.
“Based on our findings, users should assume that the personal information they enter into web forms may be collected by trackers, even if the form is never submitted. Considering its scale, intrusiveness and unintended side-effects, the privacy problem we investigate deserves more attention from browser vendors, privacy tool developers, and data protection agencies,” concluded the researchers in their paper.