it appears that the SEO optimum is "repeat common search terms but in natural language" and so now every google result for things like "fix coil whine" or "toilet not filling" is an FAQ-style post from a domain you've never heard of that reads like:
How to fix coil whine?
What causes coil whine?
Can coil whine cause damage?
the search situation is dire because sure "toilet not filling" doesn't tend to have misinformation, but if any schmuck can register a 4 dollar domain and get gpt2 to blast out some misinfo that the google algo loves and make bank on advertising (also sold by google) then how am I supposed to trust google when I ask it about things that tend to have common misconceptions or falsehoods like pest remedies or medical symptoms?
@SuricrasiaOnline remembering the time when google let you remove certain domains from results for your account. was amazing. it's gone now, of course
@SuricrasiaOnline Yeah, for whatever reason search has gotten disastrous in the last five years and ... well, I have my suspicions why, but
@SuricrasiaOnline I feel like it doesn’t have to be, if it lets the user choose sources they want to use…?
Maybe it could allow rating quality of results for sources to help highlight good/new ones? Idk
@catalina I like how I clicked "random websites" and got the main page for a tilde server populated by people I know
@SuricrasiaOnline archwiki, tvtropes, etymonline, dict.org, something with all the man pages, memory alpha, emojipedia
@SuricrasiaOnline here u go: https://easrng.github.io/no-shit-sherlock/
it's a google custom search engine but i added some js before loading it that makes it so the ad script doesn't load even without adblock
it searches wikipedia, wiktionary, wikimedia commons, wikidata, all the SE sites, github, gitlab, sourcehut, codeberg, archive.org, hathitrust, project gutenberg, unsplash, and pexels for now
@SuricrasiaOnline ok i added a few more sites (MDN, css-tricks, old.reddit, emojipedia, dict.org, etymonline, arch wiki, debian.org)
@easrng @SuricrasiaOnline probably better to use a custom centrality algo (like custom pagerank) centered around a few sites you deem to be the “Ideal Archetype” of the content you want and then use heuristics to adjust the weight of a site. Heuristics could include:
How much Readability strips
How much uBlock Origin filters (Teclis.com does this)
Structured data (FairSearch, Google, Bing, Yandex, and maybe Petal do this)
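A rough sketch of what that "custom pagerank" idea could look like, assuming a tiny hand-written link graph. The random-surfer teleport step only ever jumps back to the trusted seed sites (this is usually called personalized PageRank), and the heuristics are reduced to a single hypothetical per-site `quality` multiplier. The function name, the site names, and the multiplier values are all made up for illustration; none of this is how Teclis or any real engine is known to work.

```python
def personalized_pagerank(links, seeds, quality=None, damping=0.85, iterations=50):
    """Rank sites by how reachable they are from a set of trusted seeds.

    links:   dict mapping a site to the list of sites it links to
    seeds:   set of trusted sites; teleport jumps land only on these
    quality: optional dict of site -> multiplier (hypothetical stand-in
             for heuristics such as "how much an ad blocker filters")
    """
    nodes = set(links) | {t for targets in links.values() for t in targets} | set(seeds)
    # Teleport distribution: uniform over the seeds, zero everywhere else.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        nxt = {n: (1 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = links.get(src, [])
            if targets:
                # Split this site's rank evenly across its outgoing links.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # A site with no outgoing links hands its rank back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[src] * teleport[n]
        rank = nxt

    if quality:
        # Apply the heuristic adjustment after the link analysis.
        rank = {n: r * quality.get(n, 1.0) for n, r in rank.items()}
    return rank


# Toy graph: two spam sites link only to each other, nothing trusted links to them.
links = {
    "wiki": ["blog", "docs"],
    "docs": ["wiki"],
    "blog": ["wiki"],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
ranks = personalized_pagerank(links, seeds={"wiki"})
```

Because rank only ever enters the graph through the seeds, a cluster of spam sites that nothing trusted links to ends up with no rank at all, no matter how much the sites link to each other.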
@SuricrasiaOnline DDG just uses Bing for search data, but the ! searches make it actually useful despite that.
This way your searches don't go through DDG's servers, but directly to the site you're searching. It's just better in both privacy and speed.
@werwolf @jalefkowit @SuricrasiaOnline The trouble with the non-DDG systems is, you've got a limited, non-growing, probably stale set of searches. DDG actually keeps theirs working and up to date, because it's a key feature.
So anytime I hit a site again, I search DDG's bang page, and it's almost always already there.
And about the quantity, you may be right, but after using your browser for some time while adding every page search you happen to use, you'll have most likely covered your needs.
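For anyone curious what "bangs without DDG" amounts to, here is a minimal sketch of resolving them locally, so the query goes straight to the target site. The `BANGS` table, the `resolve` helper, and the particular shortcuts are assumptions for illustration; the search URL patterns are the ones those sites appear to use, but check them before relying on them. The tradeoff mentioned above still applies: this table only grows when you add to it yourself.

```python
from urllib.parse import quote_plus

# Hypothetical local bang table: shortcut -> search URL template.
BANGS = {
    "w": "https://en.wikipedia.org/w/index.php?search={}",
    "aw": "https://wiki.archlinux.org/index.php?search={}",
    "mdn": "https://developer.mozilla.org/en-US/search?q={}",
}
# Fallback when the query contains no known bang.
DEFAULT = "https://duckduckgo.com/?q={}"


def resolve(query):
    """Return the URL to open for a query, honoring the first known !bang."""
    words = query.split()
    for i, word in enumerate(words):
        if word.startswith("!") and word[1:] in BANGS:
            # Drop the bang itself and search the target site for the rest.
            rest = " ".join(words[:i] + words[i + 1:])
            return BANGS[word[1:]].format(quote_plus(rest))
    return DEFAULT.format(quote_plus(query))


url = resolve("!w coil whine")
```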
@SuricrasiaOnline if anything, ddg is *worse* at seospam than google, i just can't stand giving google another bit of data so it's worth the slightly shittier results
@SuricrasiaOnline yeah i wasn't gonna because _i_ use ddg and i get those weirdo results
and now i'm probably going to reply to the next post and say "just use !bangs at that point"
@SuricrasiaOnline i don't think allow-list only for search is a good idea, but allowing users to block certain sites from showing up, or at least rate results as helpful / not helpful, would be pretty useful
@SuricrasiaOnline that's already become kind of a meme of people searching specific sites for good info
@SuricrasiaOnline huh dang. we usually are searching for phrases in quotes or multiple words with each in quotes, maybe that's relevant - like: https://archive.org/details/texts?query=%22Omelette+du+fromage%22&sin=TXT
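If you want to build that kind of quoted-phrase link yourself, a small sketch: the `query` and `sin=TXT` parameters are copied from the URL above, while the `texts_search_url` helper is a made-up name and the meaning of `sin` is not documented here, so treat it as an assumption.

```python
from urllib.parse import urlencode

BASE = "https://archive.org/details/texts"


def texts_search_url(phrase):
    """Build a search link for an exact phrase by wrapping it in double quotes."""
    params = {"query": f'"{phrase}"', "sin": "TXT"}
    # urlencode turns the quotes into %22 and the spaces into +.
    return f"{BASE}?{urlencode(params)}"


url = texts_search_url("Omelette du fromage")
```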
@SuricrasiaOnline lol yeah they don't make it obvious tbh! but once you find it it's great. the homepage should be
https://archive.org/details/texts but also the top right search box should have it as an option for any page in that category. the "books" page might just mean stuff that's "officially" uploaded by an IA employee somewhere, which is a lot but far from all the text/docs/books on there
@SuricrasiaOnline you mean there weren't 17 separate lifestyle blogs that really reviewed products in $PRODUCT_CATEGORY last $($CURRENT_MONTH - 1)?
boomers - say they don't trust the internet but actually do
millennials - say they trust the internet but actually don't
zoomers - they are the internet
@SuricrasiaOnline Well, it's basic knowledge to never ever ask Google medical questions unless you know what to avoid and what to look for.
If you look up cancer treatment in German, 80% of sites are about "alternative medicine", many even trying to tell you NOT to do chemo or radiotherapy but to smoke weed and eat amygdalin (poison) instead. It's ridiculous.