Credential stuffing | Resolved

Collection #1 credential dump

An 87GB aggregated credential dump posted to the MEGA cloud service exposed 772.9 million unique email addresses and 21.2 million unique passwords, assembled from thousands of prior breaches to fuel credential-stuffing attacks at industrial scale.

Victim: Internet users worldwide (aggregated multi-breach dump)
Records: 772.9M
Users: 772.9M

In January 2019, security researcher Troy Hunt disclosed Collection #1, an aggregated credential dump of more than 87GB spread across over 12,000 files that had been posted to the MEGA cloud-storage service and shared on a popular hacking forum. It was, at the time, the single largest dataset ever loaded into Have I Been Pwned (HIBP).

What happened

Collection #1 was not a single breach. It was a compilation: a curated aggregation assembled from thousands of separate data breaches, some dating back years, deduplicated and formatted for one purpose: credential stuffing at scale. Credential stuffing is an automated attack in which stolen username/password pairs from one service are replayed against many others, exploiting the widespread habit of password reuse.

The dataset contained 2,692,818,238 rows, which deduplicated to 1,160,253,228 unique email-and-password combinations, 772,904,991 unique email addresses, and 21,222,975 unique passwords. The cleartext passwords are what made it especially dangerous: unlike a hashed-password dump, these were immediately usable.
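The relationship between the four figures above (rows, unique pairs, unique emails, unique passwords) can be illustrated with a minimal Python sketch. The `email:password` row format, the case-insensitive email comparison, and the function name `summarize_dump` are assumptions made for illustration; the source does not describe how the counts were actually produced.

```python
from typing import Iterable


def summarize_dump(lines: Iterable[str]) -> dict[str, int]:
    """Count rows, unique pairs, unique emails, and unique passwords."""
    rows = 0
    pairs: set[tuple[str, str]] = set()
    emails: set[str] = set()
    passwords: set[str] = set()
    for line in lines:
        line = line.rstrip("\r\n")
        # Assumed row format "email:password". Split on the first colon only,
        # because a password may itself contain colons.
        email, sep, password = line.partition(":")
        if not sep or not email:
            continue  # skip rows that do not match the assumed format
        rows += 1
        email = email.lower()  # assumption: emails compared case-insensitively
        pairs.add((email, password))
        emails.add(email)
        passwords.add(password)
    return {
        "rows": rows,
        "unique_pairs": len(pairs),
        "unique_emails": len(emails),
        "unique_passwords": len(passwords),
    }


# Invented sample rows, not taken from any real dump.
sample = [
    "alice@example.com:hunter2",
    "Alice@example.com:hunter2",        # same pair once the email is lowercased
    "alice@example.com:correct horse",  # same email, different password
    "bob@example.com:hunter2",          # different email, reused password
    "not-a-valid-row",
]
print(summarize_dump(sample))
```

In-memory sets are only workable for small samples; a dataset with billions of rows would need an on-disk sort or a database, which this sketch does not attempt.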

Scale and novelty

When Hunt loaded the data into HIBP:

  • Roughly 140 million email addresses were previously unknown to the service.
  • About 10.6 million passwords (around half of the 21.2 million unique passwords) were new to the Pwned Passwords corpus.
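As a quick check on those proportions, the following Python sketch computes the share of new material from the figures quoted in this article. The "new" counts are the rounded values from the list above, so the percentages are approximate.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


unique_emails = 772_904_991
unique_passwords = 21_222_975
new_emails = 140_000_000     # "roughly 140 million", a rounded figure
new_passwords = 10_600_000   # "about 10.6 million", a rounded figure

print(f"New email addresses: {share(new_emails, unique_emails)}%")
print(f"New passwords: {share(new_passwords, unique_passwords)}%")
```

On these inputs, roughly 18% of the email addresses and just under 50% of the passwords were new to the service.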

Collection #1 was quickly followed by Collections #2 through #5, a related family of dumps totalling approximately 2.2 billion records and 845GB, circulating in the same forums and torrent ecosystem. Together they represented one of the largest concentrations of breached credentials ever assembled into a single distribution.

Attribution

In February 2019, threat-intelligence firm Recorded Future attributed the underlying compilation to a threat actor operating under the handle "C0rpz", who had created and sold Collection #1 along with an additional ~611 million credentials. The aggregation and resale of breach data is a mature criminal market; Collection #1 was a snapshot of that supply chain rather than the product of a single intrusion.

Impact

Because Collection #1 was an aggregation, there was no single corporate victim and no direct, quantifiable financial loss attributable to one organisation. The harm was diffuse but real: any account whose credentials appeared in the dump and were reused elsewhere was at immediate risk of takeover. Account-takeover fraud, business email compromise, and downstream phishing all draw on exactly this kind of consolidated credential supply.

Why it matters

Collection #1 is the canonical illustration of why password reuse is catastrophic and why credential stuffing is among the most cost-effective attacks available to criminals. It accelerated industry adoption of breached-password screening (against corpora like HIBP's Pwned Passwords), multi-factor authentication, and passkeys. For defenders, it reframed credential exposure as a continuous, compounding condition: old breaches never expire; they are endlessly recombined, redistributed, and replayed.
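Breached-password screening can be sketched as follows, assuming the public Pwned Passwords range endpoint (`https://api.pwnedpasswords.com/range/`), which takes the first five hex characters of a password's SHA-1 hash and returns matching hash suffixes with counts. The function names, the `User-Agent` string, and the injectable `fetch` parameter are illustrative choices, not part of any service described in this article. Only the five-character prefix leaves the machine; the full password and full hash are never sent.

```python
import hashlib
import urllib.request
from typing import Callable

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def fetch_range(prefix: str) -> str:
    """Download the hash suffixes that share a 5-character SHA-1 prefix."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "breached-password-sketch"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_count(password: str, fetch: Callable[[str], str] = fetch_range) -> int:
    """Return how often the password appears in the corpus, or 0 if absent."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    # Each response line is assumed to look like "<35-char suffix>:<count>".
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def is_acceptable(password: str, fetch: Callable[[str], str] = fetch_range) -> bool:
    """Illustrative policy: reject any password seen in a known breach."""
    return pwned_count(password, fetch) == 0
```

A sign-up or password-change handler could call `is_acceptable(candidate)` and ask the user for a different password when it returns `False`. Passing a stub as `fetch` keeps the logic testable without network access.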

Timeline

  1. Component breaches that make up Collection #1 occur across thousands of separate websites and services over several years.

  2. A forum user posts seven links to databases hosted on the MEGA cloud service under the folder name 'Collection #1'.

  3. Security researcher Troy Hunt publicly discloses Collection #1 and loads it into Have I Been Pwned, the largest dataset added to HIBP at that time.

  4. The MEGA folder is removed, but the dataset has already circulated widely on hacking forums.

  5. Follow-on collections (#2 through #5), totalling roughly 2.2 billion records and 845GB, are identified circulating in the same ecosystem.

  6. Recorded Future attributes the original compilation to a threat actor known as 'C0rpz', who created and sold it alongside related dumps.

Sources

  1. troyhunt.com: https://www.troyhunt.com/the-773-million-record-collection-1-data-reach/
  2. en.wikipedia.org: https://en.wikipedia.org/wiki/Collection_No._1
  3. recordedfuture.com: https://www.recordedfuture.com/research/collection-1-data-breach
  4. securityaffairs.com: https://securityaffairs.com/80008/data-breach/collection-1-data-leak.html

Related incidents

Credential stuffing | Contained

Snowflake customer-account credential-stuffing campaign (UNC5537, 2024)

A threat cluster tracked as UNC5537 / ShinyHunters used credentials harvested by infostealer malware to log into ~160 Snowflake customer tenants that lacked MFA. Victims included AT&T, Ticketmaster, Santander, LendingTree, Advance Auto Parts, Neiman Marcus, and Bausch Health. Ticketmaster alone exposed data for ~560 million users.

Victim: Snowflake customer tenants (~160 organisations: AT&T, Ticketmaster, Santander, LendingTree, Advance Auto Parts, Neiman Marcus, Bausch Health, et al.)
Records: 560.0M