Database breach exposes data from 235M social media accounts


A database breach has exposed profile data for nearly 235 million users of TikTok, Instagram and YouTube.

The data appears to have been collected through a practice known as web scraping, where a company accesses the web interface of a service and then automatically collects data …

This is different from a hack, which involves breaking into a system to gain access to data that is not meant to be publicly accessible. Web scraping accesses only public data.

For example, an automated system can visit a series of YouTube channels and collect the username, photo and follower count of each channel owner. An entire database of these records becomes a privacy issue, even though the data itself is publicly available.
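To make the idea concrete, here is a minimal sketch of what such an automated collector might look like. It is an illustration only: the HTML snippet, the class names and the `scrape_profile` helper are all invented for this example, and it parses a hard-coded sample page rather than contacting any real service.

```python
from html.parser import HTMLParser

# Hypothetical markup for a public profile page. Real services use
# different, more complex HTML; these class names are invented.
SAMPLE_PAGE = """
<div class="profile">
  <span class="username">example_channel</span>
  <img class="avatar" src="https://example.com/avatar.jpg">
  <span class="followers">12345</span>
</div>
"""


class ProfileParser(HTMLParser):
    """Collects username, photo URL and follower count from the sample markup."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._field = None  # name of the text field currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if tag == "img" and css_class == "avatar":
            self.record["photo"] = attrs.get("src")
        elif tag == "span" and css_class in ("username", "followers"):
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that appears inside a span we are interested in.
        if self._field and data.strip():
            self.record[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_profile(html):
    """Parse one page of HTML and return the extracted profile record."""
    parser = ProfileParser()
    parser.feed(html)
    record = parser.record
    if "followers" in record:
        record["followers"] = int(record["followers"])
    return record


# A scraper repeats this for many pages, building up a database of records.
database = [scrape_profile(page) for page in [SAMPLE_PAGE]]
print(database)
```

Each record on its own is harmless public information. The privacy concern comes from running the same loop across millions of pages and storing the results together.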

Once this data is collected in a database, you would normally expect it to be protected. But TNW reports that a database containing 235M records was found on the web without password protection.

The scraped data comprised four major datasets containing details of millions of users of the named platforms. It contained information such as profile name, full name, profile picture, age, gender and follower statistics […]

Bob Diachenko, lead researcher for security company Comparitech, found three identical copies of the database on August 1. According to Diachenko and the team, the data belonged to a now-defunct company called Deep Social.

When they contacted the company, the request was forwarded to Hong Kong-based firm Social Data, which acknowledged the exposure and closed access to the database. However, Social Data declined to comment on Deep Social.

Comparitech said that each record contained some or all of the following:

  • Profile name
  • Full real name
  • Profile photo
  • Account description
  • Whether the profile belongs to a company or has ads
  • Follower engagement statistics, including:
    • Number of followers
    • Engagement rate
    • Follower growth rate
    • Audience gender
    • Audience age
    • Audience location
    • Likes
  • Last post timestamp
  • Age
  • Gender

In addition, about 20% of the sampled records contained either a telephone number or an email address. As TNW notes, this type of data can be used for spam but also for phishing attempts.

Web scraping is normally prohibited by the terms and conditions of the services in question, but a court in California ruled last year that it is not illegal. That can be a good thing in many cases.

For example, CityMapper is a very popular app that works out the fastest way to get from A to B in a city by pulling live traffic and public transportation data. These days, most public transportation companies make that data available through an API, but in the early days it was only available on the web. Web scraping by early forerunners to CityMapper offered a convenient way to make the data more usable.

Web scraping can still be useful today, if companies post useful data on the web but do not make it available through an API. Price comparison services, for example, often rely on web scraping.

But scraping personal data is another matter, and courts may have to distinguish between the two types of use.


