TL;DR: The stealer log data dump known as Alien_Txtbase has been discussed at length, including in an analysis published by Specops Software on March 27, 2025, which you can use as a baseline for comparison with the new data. Before Breach Forums was taken down yet again, a forum member offered a number of new records, totalling about 126m rows. The post did not explicitly label this as more Alien_Txtbase data; however, the files follow the same naming convention and carry the Alien_Txtbase header, consistent with previous releases. Below we analyse the data to gauge how real the threat is and discuss the records therein.
Each file carries the following header advertising the Telegram channel, since the intent of these datasets is to drive paying traffic towards closer-to-real-time log purchases:
|=====================================================================================|
| ___ _ _____ _____ _ _ _______ _____________ ___ _____ _____ |
| / _ \ | | |_ _| ___| \ | | |_ _\ \ / /_ _| ___ \/ _ \ / ___| ___| |
| / /_\ \| | | | | |__ | \| | | | \ V / | | | |_/ / /_\ \\ `--.| |__ |
| | _ || | | | | __|| . ` | | | / \ | | | ___ \ _ | `--. \ __| |
| | | | || |_____| |_| |___| |\ | | | / /^\ \ | | | |_/ / | | |/\__/ / |___ |
| \_| |_/\_____/\___/\____/\_| \_/ \_/ \/ \/ \_/ \____/\_| |_/\____/\____/ |
| |
| JOIN TELEGRAM TXTBASE: |
| JOIN TELEGRAM TXTBASE: SNIP |
| JOIN TELEGRAM TXTBASE: |
| _________________________________________________________________ |
| ▼ BUY PRIVATE SUBSCRIPTION ON OUR 7/24 ONLINE SHOP BOT ▼ |
| |
| |
SNIP
So uh, attribution isn’t overly hard to determine.
The Delta
As we are interested in the passwords and the patterns therein, we split the passwords off the records into their own file. Most records are in the format url:username:password, with some in the format username:password:url; we’ll simply discard the latter for speed of processing.
Since it’s clear that the data is from the same source, we took a delta between the previous Alien_Txtbase dataset (as consumed by HIBP) and these new 126m records. The result is 51,571,780 passwords, or ~51m, that were not in the previous release. This allows us to discuss only the new records.
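A minimal sketch of that split, assuming one colon-separated record per line. The function name `extract_password` and the heuristics (splitting from the right, requiring a dot in the URL field) are illustrative assumptions, not the tooling actually used here.

```python
from typing import Optional


def extract_password(line: str) -> Optional[str]:
    """Return the password from a url:username:password record, else None.

    Assumption: URLs may contain colons (scheme, port), so we split from
    the right. Passwords that themselves contain ':' will be truncated.
    """
    record = line.strip()
    parts = record.rsplit(":", 2)
    if len(parts) != 3:
        return None  # not enough fields to be a usable record
    url, _username, password = parts
    # username:password:https://site splits as [..., "https", "//site"],
    # so a "password" starting with "//" signals the reversed layout.
    if password.startswith("//"):
        return None
    # Crude check that the first field looks like a URL or hostname.
    if "." not in url:
        return None
    return password or None
```

This heuristic is deliberately cheap and it misfires: a reversed record whose username is an email address (for example `alice@example.com:pw:example.org`) passes the dot check, and the trailing domain is returned as if it were a password.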
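As a rough illustration, a delta like this is a set difference. The sketch below assumes both password lists fit in memory and that duplicates within the new data are collapsed; neither assumption is confirmed by the analysis above, and `password_delta` is a hypothetical helper name.

```python
from typing import Iterable, List


def password_delta(new_passwords: Iterable[str],
                   previous_passwords: Iterable[str]) -> List[str]:
    """Return passwords present in the new data but not the previous set.

    Order of first appearance is preserved; duplicates are dropped
    (an assumption made for this sketch).
    """
    previous = set(previous_passwords)
    seen = set()
    delta = []
    for pw in new_passwords:
        if pw in previous or pw in seen:
            continue
        seen.add(pw)
        delta.append(pw)
    return delta
```

At the scale of 126m rows, sorting both files on disk and comparing them line by line would likely be kinder on memory than Python sets.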
The Base Words
Base Word | Count |
---|---|
qq.com | 25129 |
guruku.id | 12939 |
user | 10666 |
gmail.com | 6197 |
alex | 5965 |
admin | 5915 |
aruba.it | 5310 |
ahmed | 4919 |
daniel | 4246 |
david | 3951 |
We can disregard a few of these entries, since they are simply a result of the clobbered formatting of credential lists that get posted on forums such as Breach Forums; they’re always clobbered to shit. Once you disregard those, you see a pretty standard set of base words, where a base word is what remains after the special characters and numbers are stripped off; i.e. a base word of admin could come from a password such as admin123. Nothing exciting here: it’s stealer logs, and people are garbo at generating their own memorable passwords, so you get what you see here.
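A minimal sketch of base word extraction. It assumes the stripping applies only to the leading and trailing non-letter characters of the lowercased password, which would be consistent with entries such as qq.com keeping their interior dot. The `min_length` cutoff and the function names are assumptions for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Runs of non-letters at the start or end of the (lowercased) password.
_EDGE_NON_LETTERS = re.compile(r"^[^a-z]+|[^a-z]+$")


def base_word(password: str) -> str:
    """Lowercase the password and strip non-letters from both ends."""
    return _EDGE_NON_LETTERS.sub("", password.lower())


def top_base_words(passwords: Iterable[str], n: int = 10,
                   min_length: int = 3) -> List[Tuple[str, int]]:
    """Count base words, ignoring ones shorter than min_length."""
    counts = Counter()
    for pw in passwords:
        word = base_word(pw)
        if len(word) >= min_length:
            counts[word] += 1
    return counts.most_common(n)
```

Under this rule `admin123` reduces to `admin`, while an email-like string such as `123456@qq.com` reduces to `qq.com`.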
Password Lengths
Length | Count (and percentage) |
---|---|
10 | 4702923 (9.12%) |
11 | 3612608 (7.01%) |
8 | 3516211 (6.82%) |
9 | 3399076 (6.59%) |
12 | 2746051 (5.32%) |
22 | 2246982 (4.36%) |
23 | 2159239 (4.19%) |
21 | 2142296 (4.15%) |
13 | 2141544 (4.15%) |
20 | 2041611 (3.96%) |
24 | 2012335 (3.9%) |
7 | 1948241 (3.78%) |
Generally, everything longer than 20 characters is a result of the aforementioned clobbered formatting, which rams email addresses into the dataset. So, as is often the case, as long as you’re using sufficiently long passwords and following NIST 800-63B, there is minimal risk of re-use for organizations.
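The length table above can be reproduced with a simple counter. This is a sketch under the assumption that each percentage is the count divided by the total number of passwords in the delta; `length_distribution` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def length_distribution(passwords: Iterable[str]) -> List[Tuple[int, int, float]]:
    """Return (length, count, percentage) rows, most common length first."""
    counts = Counter(len(pw) for pw in passwords)
    total = sum(counts.values())
    return [
        (length, count, round(100 * count / total, 2))
        for length, count in counts.most_common()
    ]
```

As a sanity check on the table, 4,702,923 out of 51,571,780 works out to 9.12%, matching the first row.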
Character Set Distribution
Character Types | Count (and percentage) |
---|---|
loweralphaspecialnum | 11685903 (22.66%) |
loweralphaspecial | 10055146 (19.5%) |
loweralphanum | 7123899 (13.81%) |
numeric | 5905319 (11.45%) |
loweralpha | 5322840 (10.32%) |
mixedalphanum | 2751838 (5.34%) |
mixedalpha | 2273913 (4.41%) |
mixedalphaspecialnum | 2147456 (4.16%) |
mixedalphaspecial | 1203950 (2.33%) |
upperalphanum | 1163896 (2.26%) |
specialnum | 902512 (1.75%) |
upperalpha | 420664 (0.82%) |
upperalphaspecialnum | 304816 (0.59%) |
upperalphaspecial | 155093 (0.3%) |
The complexity distribution isn’t phenomenal. For example, loweralphaspecialnum would be a password such as password123!, meaning no mixed casing. Combined with the distribution of lengths, it’s not an amazing look. This is naturally a side effect of the human tendency to create simple, easy-to-remember passwords, which is why it’s so important to enforce commonly agreed upon password complexity standards and lengths; see NIST 800-63B while it still exists and before NIST gets completely dismantled.
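To make the category names concrete, here is a sketch of a classifier that produces labels in the same style as the table. It assumes ASCII letters and digits define the alpha and num classes and that everything else counts as special; the exact rules of the tool used for the table may differ, and the `special` and `empty` labels are assumed for cases the table does not show.

```python
import string

_LOWER = set(string.ascii_lowercase)
_UPPER = set(string.ascii_uppercase)
_DIGITS = set(string.digits)


def classify_charset(password: str) -> str:
    """Label a password by the character classes it contains."""
    chars = set(password)
    has_lower = bool(chars & _LOWER)
    has_upper = bool(chars & _UPPER)
    has_digit = bool(chars & _DIGITS)
    # Anything that is not an ASCII letter or digit counts as special.
    has_special = bool(chars - _LOWER - _UPPER - _DIGITS)

    if has_lower and has_upper:
        alpha = "mixedalpha"
    elif has_lower:
        alpha = "loweralpha"
    elif has_upper:
        alpha = "upperalpha"
    else:
        alpha = ""

    if alpha:
        return alpha + ("special" if has_special else "") + ("num" if has_digit else "")
    if has_digit and has_special:
        return "specialnum"
    if has_digit:
        return "numeric"
    if has_special:
        return "special"
    return "empty"
```

Note that a clobbered email address such as `someone@example.com` lands in loweralphaspecial under these rules.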
Some Domain Samples
For curiosity’s sake, we’ll take some samples of the impacted domains to highlight which organizations should be concerned about rotating credentials.
Domain | Count |
---|---|
irs.gov | 5378 |
pornhub.com | 52313 |
proton.me | 25004 |
onlyfans.com | 33910 |
ashleymadison.com | 7618 |
Since stealer logs pull their content from the saved credential stores of browsers, the involved domains obviously trend towards, erm, consumer-facing sites. You’ll see odd pockets of government, financial, and so on where a user didn’t have good security posture and saved a work account’s credentials, but generally it’s consumer, which also makes it a little spicy. It does, however, limit most of the impact to reasonably inconsequential personal accounts, which is great news.
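A sketch of how per-domain counts like those in the table could be gathered, assuming they are keyed on the URL field of each record (the text does not say whether URL or email domains were counted). The helper names and the watchlist approach are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit


def hostname_of(url: str) -> Optional[str]:
    """Best-effort hostname extraction; tolerates a missing scheme."""
    if "://" not in url:
        url = "//" + url  # lets urlsplit treat the first part as the host
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return None  # malformed URL, e.g. broken IPv6 brackets
    return host.lower() if host else None


def count_watched_domains(urls: Iterable[str],
                          watched: Iterable[str]) -> Counter:
    """Count URLs whose host is a watched domain or a subdomain of one."""
    watched = list(watched)
    counts = Counter()
    for url in urls:
        host = hostname_of(url)
        if host is None:
            continue
        for domain in watched:
            if host == domain or host.endswith("." + domain):
                counts[domain] += 1
                break
    return counts
```

Matching on an exact host or a dot-prefixed suffix avoids counting lookalikes such as `notirs.gov.example.com` towards `irs.gov`.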
Conclusion
The final drop of Alien_Txtbase Telegram data to Breach Forums follows the same patterns as the big one that hit the media: largely consumer-facing sites, just by virtue of where the data comes from, and thankfully largely inconsequential to corporate environments.
It is worth noting that with the closure of Breach Forums the threat is not gone; it’s simply been cast to the wind, and the data will find other forums to be shared on. More on that later.
An organization or application that’s following NIST 800-63B as it should, enforcing complexity and a 12+ character minimum length (and hopefully checking against a breached password corpus), would simply dodge any re-use of these common passwords. Strong MFA (preferably multiple factors, à la Specops Authentication) should be enabled where possible, and users should be taught never to share a multi-factor code with another user or their service desk.
Same old poorly processed crap scraped from personal machines. Stop using browser password stores, and use a password manager such as Bitwarden instead.