From HN discussion - "This is why I use ad blockers and a pi-hole server" (https://news.ycombinator.com/item?id=22124929)
This is GDPR in action. Wow.
⚠️ The Fediverse has been scraped, again ⚠️
Almost six million posts from 363 instances have been scraped.
"All the posts with public visibility published by users hosted on Mastodon servers [...] which support the English language" have been scraped along with their metadata, and the "policy, the code of conduct and the prohibited contents of each instance".
The dataset is an attempt at creating an open dataset for "research" into algorithms like the ones Facebook uses to identify problematic content, based around users' use of Content Warnings.
The dataset can be found here:
It was created by the University of Milan, Italy, apparently for the 13th International AAAI Conference on Web and Social Media (ICWSM):
The associated publication:
https://aaai.org/ojs/index.php/ICWSM/article/download/3262/3130/ or https://likeable.space/media/30ae595a191923a1ce84a1e0feac6a3cef5b8669f44e15535ea18c7a5594b93a.pdf?name=Mastodon%20Content%20Warnings%3A%20Inappropriate%20Contents%20in%20a%20Microblogging%20Platform.pdf or DM me for a copy.
very noteworthy imo that the selection pressures of youtube algorithms created a neogenre of film about pirated children’s characters performing surreal pranks on each other in a CGI nightmarescape for precisely 12 minutes, whereas spotify’s algorithms instead produced a homogenous wave of mid-tempo funless dirge music. what went wrong with the latter?