Update on 10/20/24 added to the bottom of this article.
Internet Archive's "The Wayback Machine" has suffered a data breach after a threat actor compromised the website and stole a user authentication database containing 31 million unique records.
News of the breach began circulating Wednesday afternoon after visitors to archive.org started seeing a JavaScript alert created by the hacker, stating that the Internet Archive had been breached.
"Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!," reads a JavaScript alert shown on the compromised archive.org site.
The text "HIBP" refers to the Have I Been Pwned data breach notification service created by Troy Hunt, with whom threat actors commonly share stolen data to be added to the service.
Hunt told BleepingComputer that the threat actor shared the Internet Archive's authentication database nine days ago and it is a 6.4GB SQL file named "ia_users.sql." The database contains authentication information for registered members, including their email addresses, screen names, password change timestamps, Bcrypt-hashed passwords, and other internal data.
The most recent timestamp on the stolen records is September 28th, 2024, likely when the database was stolen.
Hunt says there are 31 million unique email addresses in the database, with many subscribed to the HIBP data breach notification service. The data will soon be added to HIBP, allowing users to enter their email and confirm if their data was exposed in this breach.
The data was confirmed to be real after Hunt contacted users listed in the database, including cybersecurity researcher Scott Helme, who permitted BleepingComputer to share his exposed record.
9887370, internetarchive@scotthelme.co.uk,$2a$10$Bho2e2ptPnFRJyJKIn5BiehIDiEwhjfMZFVRM9fRCarKXkemA3PxuScottHelme,2020-06-25,2020-06-25,internetarchive@scotthelme.co.uk,2020-06-25 13:22:52.7608520,\N0\N\N@scotthelme\N\N\N
Helme confirmed that the bcrypt-hashed password in the data record matched the bcrypt-hashed password stored in his password manager. He also confirmed that the timestamp in the database record matched the date when he last changed the password in his password manager.
Hunt says he contacted the Internet Archive three days ago and began a disclosure process, stating that the data would be loaded into the service in 72 hours, but he has not heard back since.
It is not known how the threat actors breached the Internet Archive or whether any other data was stolen.
Earlier today, the Internet Archive suffered a DDoS attack that has now been claimed by the BlackMeta hacktivist group, which says it will conduct additional attacks.
BleepingComputer contacted the Internet Archive with questions about the attack, but no response was immediately available.
Update 10/10/24: Internet Archive founder Brewster Kahle shared an update on X last night, confirming the data breach and stating that the threat actor used a JavaScript library to show the alerts to visitors.
"What we know: DDOS attacked-fended off for now; defacement of our website via JS library; breach of usernames/email/salted-encrypted passwords," reads a first status update tweeted last night.
"What we've done: Disabled the JS library, scrubbing systems, upgrading security."
A second update shared this morning states that DDoS attacks have resumed, taking archive.org and openlibrary.org offline again.
While the Internet Archive is facing both a data breach and DDoS attacks at the same time, it is not believed that the two attacks are connected.
Update 10/20/24: The Internet Archive was breached again, this time with the threat actors gaining access to their Zendesk support email system.
BleepingComputer has published a detailed story on how they breached Internet Archive and stole the member data in this article: Internet Archive breached again through stolen access tokens.
Comments
jackmchue - 1 month ago
But why?
pianotm - 1 month ago
To force them into taking it down. The government and corporations have been after them for years for "copyright infringement". They've been doing damage wherever and whenever they can, but they've never been able to shut them down completely.
alaskandude - 1 month ago
definitely state-sponsored hack.
alaskandude - 1 month ago
seems like to be pro-israeli group
pianotm - 1 month ago
"seems like to be pro-israeli group"
No surprise that it's someone that's fascist pro-government.
Lawrence Abrams - 1 month ago
I don't believe the person behind the data breach did it for any other reason than they can.
pianotm - 1 month ago
That would be in line with Occam's Razor, wouldn't it? But I'm not just going to assume it's a coincidence that after Internet Archive loses in a lawsuit basically declaring them criminals for being a library, hackers take the site down.
Tadirro - 1 month ago
"seems like to be pro-israeli group"
"sn_blackmeta", who claimed the attacks, is a self-declared pro-Palestinian group.
(However, it's as likely that they're just attention-seeking kiddies who had nothing to do with it.)
harrybarracuda - 1 month ago
SN_BlackMeta is a pro-Palestinian activist group. I wouldn't surprised if Iran is behind this.
pianotm - 1 month ago
"SN_BlackMeta is a pro-Palestinian activist group. I wouldn't surprised if Iran is behind this."
You think so? That's interesting. Have you heard their reasoning? They "think" Internet Archive is a US government owned propaganda site. A few minutes of thinking this whole thing through should tell you: Criminal claims to be fighting for Palestine - attacks the one thing that will get the largest amount of people ticked off at him and turn them against their cause - makes sure everybody and his mother knows they're pro-Palestine. Israel/U. S. backed false flag. No doubt. If that guy's pro-Palestine, then I'm Kate Middleton.
Rameynoodles - 1 month ago
You guys are all reading politically into this too much. These hackers just want money and fame. Hacking big companies gives them fame AND usually TONS of usernames/emails and passwords. This user login data is very profitable, because they can sell the data on the black market to other ne'er-do-wells that can further use it to steal things, like money from someone's bank account, or use it to steal their identity.
cs_280zx - 1 month ago
Hacking the archive has the same feel as shooting a medic on the battlefield ..
NiaD - 1 month ago
"Hacking the archive has the same feel as shooting a medic on the battlefield .."
My same second thoughts after the first of, "But, why?"
cearrach - 1 month ago
The archive.org auth backend is also used by openlibrary.org and archive-it.org so technically they're also compromised.
Dragonking1000 - 1 month ago
Is there a way to find out what ip address ddosed them?
Elastoer - 1 month ago
It's a good thing that I used fake information when I created my account there.
alphavault - 1 month ago
Why would they do that? That's like sending four H-bombs to a retirement home. Is this for the lulz or for inherently political reasons?
fargoal - 1 month ago
someone who doesn't want us to go back in time and find embarrassing deleted messages or posts
cs_280zx - 1 month ago
hmm, so what happens to those who sign into web.archive.org with google SSO account?
kryp-tonite - 1 month ago
Your password would never have been stored in this database if you logged in with Google SSO. So password and password change date would not have been exposed, but likely all the other info still was.
Jancis - 1 month ago
Internet archive has been a lifesaver, dunno why anyone would want to attack it. It's pretty low.
Marc06 - 1 month ago
Not to sound skeptical, but it is probable that some big company organized the attack
jwalks - 1 month ago
I never signed up for the Wayback Machine or used it, but apparently pwned I was. LOL
fargoal - 1 month ago
great job. you've just ruined an internet treasure. are you proud of yourself, script kiddie?
IoI_xD - 1 month ago
Four days after Google finishes dropping one of it's best features - caching, a feature which probably costs pennies for it to maintain - and tells its users to just use the web archive, it gets hacked. And then a week later it gets DDOSed. Twice.
I feel bad for IA but I feel less bad for Google who could've potentially put a lot more eyes on the site, including those who would do this. All for the line to go up slightly.
Winston2021 - 1 month ago
"Four days after Google finishes dropping one of it's best features - caching, a feature which probably costs pennies for it to maintain - and tells its users to just use the web archive, it gets hacked. And then a week later it gets DDOSed. Twice."
The Archive's Wayback Machine which fills in to some extent for that former Google service which is fully dropped, by mere coincidence of course, shortly before a major US election is DDoSed, a service which allows, for instance, politically damaging research on the prior positions, statements, and actions by politicians and their political parties which were previously published on the web.
Who controls the past controls the future
Who controls the present controls the past
Tadirro - 1 month ago
"The Archive's Wayback Machine which fills in to some extent for that former Google service which is fully dropped"
I don't think the Wayback Machine and Google Cache have much overlap in terms of the use cases they support. Declaring the Wayback Machine an alternative to its cache was just a lame excuse by Google, they must know that they don't serve the same purpose at all.
Most importantly, Google Cache was focussed on serving a copy that was as up-to-date as possible. The common use case would be a site which is intermittently unavailable, often because some recent event made them overwhelmed with requests. With Google Cache, you could often see the thing that made the site hit the news, because it was a current copy. The Wayback Machine usually takes about a month for a scraped page to show up, which is way too late for current events.
If it shows up at all, because the strength of Google's cache was that it cached pretty much all sites that could be found through Google Search. Very often, you found the site you currently can't access through Google, so the chances of being able to access a cached copy were maybe 99%, while looking for it in the Wayback Machine, you maybe had a 20% chance of finding a version from six months ago.
Not to mention that the cached copy contained the actual content that made the site show up in your Google search, which was priceless in the case of frequently updated pages, or shady sites who serve different content to the Google bot than to human visitors.
Mentioning the Wayback Machine as an alternative was just a cheap cop-out by Google, who apparently felt they no longer wanted the maintenance, traffic costs, and maybe legal uncertainties about serving cached copies. I mean, the guy communicating it said something about the Cache feature no longer being useful because the web has become so completely reliable compared to the time when it was introduced. I don't know which bizarro version of the web this guy is using, but in my reality, the number of cases where pages I want to visit have been deleted, made useless, DoS'ed, or are encountering some other technical difficulties, has been exploding over the past few years. The web is getting less reliable by the day.
tencho - 1 month ago
The Internet Archive is perhaps one of the biggest advocates of open source, sharing knowledge for free, and a community built on user contributions. No profiteering, exploitation, or quashing of free speech. Why any hacking group in their right mind would target the Internet Archive is baffling to say the least. They're on your side.
Blipping - 1 month ago
"Why any hacking group in their right mind would target the Internet Archive is baffling to say the least. They're on your side."
That is simply a lie. They are _not_ on the side of most hacking groups. Nor should they.
By far, most hacking groups/individuals hack for profit, political, war or religious reasons. Or to pat themselves on the back, aka. narcissism. Not for free information.
sammaverick - 1 month ago
It's a real shame because I have often used the Wayback Machine in my research and it has been really useful.
Aminkflfkldkfl - 1 month ago
It's like burning down a very old and important library full of useful information. Why would anyone do this? Maybe they want to be famous, or we just don't know their real reason.
Aminkflfkldkfl - 1 month ago
The data breach culprit likely acted solely to prove their ability, not for other motives.
trayisdagoat - 1 month ago
i am a victim of the attack
Youssef- - 2 weeks ago
me too :(
Licht92 - 1 month ago
Web archive is awesome, but I don't like the way people are using it to upload pirated content like game roms, movies etc over the last few years.
Lawrence Abrams - 3 weeks ago
The Internet Archive was breached again, this time via the Zendesk email support system.
We have a new article explaining how Internet Archive was initially breached:
https://www.bleepingcomputer.com/news/security/internet-archive-breached-again-through-stolen-access-tokens/
Mars4dude - 2 days ago
Librarians are members of a profession which defends freedoms just like the great men and women of our armed forces. An attack on the Internet Archive breaks my heart. While it may appear as if the person, corporation, or government responsible for the attack is getting away with it, know this. I call it God's law, some call it His Word. “What ye sow, so shall ye reap” and I've heard it said “what comes around, goes around”. Eventually, they will suffer the consequences in appropriate measure. It may take months, years, or generations, but it will come back on them.
The internet is a fraction of what it used to be, demanding our personal information and tracking our activity on the internet, using computers, phones and even our cars, and giving back as little as possible. Darn, where can one find a VPN for their car? Everyone's freedoms are at risk as never before. All in the name of SECURITY. The best asset to defend one from the Thought Police: the librarian.