I'm confused about how or why this is a new policy. My memory is that inside Google we were discussing this risk back in 2003, probably earlier. Search quality was on it. I just assumed they'd lost the arms race, or that the parasites' ranking was justified for other reasons that were hard to tease apart. What are they doing new now?
I think often about Mahalo, the sleazy shovel content that was spamming the web back in 2007. Google shut that down somewhat fast, although it did take several years. These days with AI and more aggressive spammers it's a losing battle. The real problem is the financial incentives that make this kind of spamming profitable in the first place.
My tiny little blog gets about 3 requests a week for someone to "pay me to run a guest article". Going rate is $50-$200 and again, my blog is tiny.
The air purifier review site Housefresh dug into why sites like theirs were seeing less traffic back in the spring, and it amounts to a handful of companies buying up popular magazine/blog brands and using them as affiliate farms that cross-post to sites within their networks of brands to boost visibility:
Seriously, they tackled this years ago with the panda update to kill off all the how to and similar seo spam. It's like after around that time they just stopped caring at all and let the best X sites take over.
> I'm confused about how or why this is a new policy.
My best guess is it's because they finally have a real competitor in ChatGPT.
> The real problem is the financial incentives that make this kind of spamming profitable in the first place.
Yeah, but the financial incentives exist on both ends. There's a gross symbiotic relationship between Google and SEO spammers, because Google also owns the ad network the spammers put on their page. If Google puts ad-laden SEO blogspam as the top result and a user clicks it, the user sees a bunch of ads from Google. Everyone wins: Google, the SEO spammers, and advertisers. Well, everyone except the user, but who cares about them?
My guess/hope is that ChatGPT has made someone who actually cares about the quality of search results actually step in and say things have gone too far.
You're totally right about that symbiotic relationship. We were aware of that risk in the early days when AdSense launched, we saw some very innovative and gross exploitation and created some policies to rein it in. But ultimately if Google makes a buck coming and going, they will do that.
Wasn't there a big story last year in the wake of the DOJ antitrust investigation about Google manipulating search quality to boost ad revenue? I can't put my hands on a reference now, in part because Google is so bad at search these days I can't find anything more than a few months old.
Because ChatGPT is dependent on good search when it searches the web? Or because it competes with Google when it provides a good answer without searching? Or what do you mean specifically?
I would say the latter. For software dev questions, my Google searches and Stack Overflow visits have fallen off a cliff since I started paying for ChatGPT.
Ironically, I probably would have paid the same amount to Google for ad-free, old-style (accurate) Google searches, but no, they just wanted to keep cranking that ad dial up every year so that ship has sailed.
At this point, I'm enjoying watching the old guard of search scrambling to find a life jacket.
Stackoverflow visits fell[1] off a cliff since GPT became popular.
Google is getting destroyed by the chatbot workflow because it is no longer the start of a browser session and clickthrus (the things that earn the high sponsored link rates) are falling as more users get their queries answered faster with less effort.
iunno, I used to rank pretty well for things in my country like "$company's tech support number". Unlike every $company, my page had a nice clean URL like whatever.tld/Tech-Support-Number-for-$company and I'd just list their phone numbers with a few paragraphs about how $company is shit. Maybe 50kb total.
Meanwhile $company's page was company.tld/234897234982-029823749823742-2340823492 and 3 pages down was a phone number if your browser didn't choke on the javascript.
For ISP ones, I recommended people print a copy so they can call if they can't get on the internet, which kinda backfired when a major ISP changed their tech support number (!) and it redirected to a toll-free squatter's sex chat line.
Turns out $company really hates it when you call their call (cost) centres.
I had maybe 50 pages for our different oligopolies and averaged $500/month revenue on adsense, so GOOG's cut was $250/month.
Today, for one $company, the first 9 results are different pages from $company.tld, each unhelpful with a phone number in their own way, and they don't run adsense!
> My best guess is it's because they finally have a real competitor in ChatGPT.
Bingo. I always chuckle when people here say Google has lost it, and become incompetent. Well, they all make the mistake of assuming that they’re trying but failing, rather than that it’s deliberate simply due to boring economics.
Now look at how quickly decades-long problems, so big they have an entire cottage industry built around them, suddenly get cleaned up. Incompetence? Nah.
Of course, this does nothing to convince regulators, or even the average HN user, that innovation is harmed by these dominant players. Someone’s gotta think of the poor mega-corps.
You're mostly describing Kagi. They do have AI results but you have to explicitly ask for them. They have a "No AI" image search option as well.
I also like my "Before AI" lens I can click on to search the internet pre-2021. And you can downrank or fully block those garbage spam sites. They even have a "leaderboard" for most blocked/pinned sites you can use to get started.
It would take a benefactor who wants to pay for running it for its own sake and not for profit. As soon as there's a profit motive, enshittification sets in since you're serving whoever pays rather than your users.
But people were complaining about the same issues under Matt Cutts. Also, there has been a ton more money and work chasing the SEO farm game. Now big private equity companies have focused on buying a stable of big brands to do what garage startups used to do.
Because it became an embarrassing news story (https://larslofgren.com/forbes-marketplace/, also mentioned in this article). They would have lazily left it unfixed if everybody weren't laughing at them.
What's hilarious is when people boast about being "in Forbes" like it's the magazine from 20 years ago, and not this parasitic SEO operation that publishes garbage on anything.
I took advantage of this in business school. A lot of my professors considered Forbes a reputable business magazine. It was amazing, I could easily cite a source for just about anything I wanted to say.
Hopefully this is a step in the right direction. Google's search results have gotten so bad - seems like even some of the simplest searches are just packed with AI generated and SEO garbage. I don't even want SearchGPT to take over the search market space because I'm almost sure it will still be garbage. Just bring back the google from 5-10 years ago please :(.
Google search degraded in usefulness before the panda update, when spammers had filled the web with low quality content designed to exploit Google's algorithms. Google improved their search to punish the content farms, and people were happy with that search for many years.
Exactly. In a constantly changing world, you need constantly changing policy to achieve the same outcomes. Even then you probably won't replicate the past universe perfectly.
You can't go back to the way things were. The world moves forward and changes, and we have to adapt to it.
Web search has always been an extremely messy solution to many problems. Think about the premise: type in anything, and somehow it will read your mind, intuit who you are and what you really wanted, find the exact thing amid the morass of the whole web, and then give it to you?
That's impossible. So it uses tricks to make it seem like it worked. It uses information about you to refine results. It uses curated, human-edited search and result heuristics for the most common or difficult search queries. It uses a giant corpus of data, and shows you things that are like what you wanted.
You don't notice that it isn't giving you the best result, because there are so many mediocre-but-acceptable results to look at. And it doesn't have to work perfectly every time, because we can "sift through" results and "refine" our search. Often we are flooded with results that are targeted at us, rather than what we want, because, remember: Google is an advertising company, and the entire Web is now a shopping mall, where either you're being sold-to, or you're just being sold.
You will get results, and they will sort-of seem like what you wanted, so you will just sort of sigh and accept it. Because what other option is there?
There are more intelligent, more accurate, more safe, ways to solve the problems people have, that are not "a search engine". It's time we start implementing them.
> type in anything, and somehow it will read your mind
I think we can go back to the way things were, which had nothing to do with mind reading. In the past, you could type in a word, and google would offer 10 million results, and you could page through each of them. That was very powerful, and google does not do that today.
I don't think you know what you are asking. Do you really want 10 million pages of results, of which 99.999...% are SEO spam for Viagra et al, and on average you will need to browse ~9 million pages of results to find something that's actually "relevant"?
I was in high school 15 years ago and Google absolutely read minds to conclude Briney Spears was not a search for pickles but rather a pop artist. This was significant enough for them to come and talk about it.
> You don't notice that it isn't giving you the best result
That's fine. It's always been fine. I don't need Google to read my mind and fulfill my dreams.
The problem isn't that they're not divinely perfect. The problem is that they used to be good enough, and now they're not.
> There are more intelligent, more accurate, more safe, ways to solve the problems people have, that are not "a search engine". It's time we start implementing them.
What solutions are there that fulfill all the use cases of a search engine, while definitively not being a search engine? An AI chatbot that gives me synopses of the same websites that I was searching for does not count.
>Think about the premise: type in anything, and somehow it will read your mind, intuit who you are and what you really wanted, find the exact thing amid the morass of the whole web, and then give it to you?
I never once asked for anything remotely like this. Maybe you could just show me results for the fucking thing I typed? When I go to the library, the Dewey decimal system doesn't rearrange itself based on all the metadata the library has on me and people fitting my demographic criteria, it just shows me what I fucking searched for.
Not sure why you’re being downvoted, this is a pointed analysis of why crawl-based search is insufficient for an Internet of our current scale. There is no corporate-curated algorithm that is up to the task, especially when the primary purpose is to profit from advertising.
Google is remarkably effective at handling the scale. It doesn't seem up for handling the sheer army dedicated to misleading it. Especially now that they've been given tools for automating crap generation.
Ironically, Google itself was a key developer of that tech.
If there is any solution it would seem to involve removing the incentive to merely look at your page. That problem seems remarkably stubborn.
>There is no corporate-curated algorithm that is up to the task, especially when the primary purpose is to profit from advertising.
I think this is the root cause of the problem. Google can easily put a big dent in this problem by allowing users to create their own importable/exportable filters and support the dissemination of something like "EasyList for search results." But that kills their golden goose of advertising influence.
It would indeed be crowd-sourced, but with a core set of maintainers. Wouldn't be all that different from EasyList or Steven Black's HOSTS file. They basically take in merge requests from the community and serve as an initial filter against garbage. [1]
And unlike Amazon reviews or YouTube comments, anyone can fork it if they think they can maintain it better.
[1] "The filter lists are currently maintained by four authors, Fanboy, MonztA, Khrin, Yuki2718 and PiQuark6046, who are ably assisted by an ample forum community." https://easylist.to/
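A minimal sketch of how such a shareable filter list might be applied to a page of results, in the spirit of EasyList. The list format (`block` or `lower` followed by a domain), the function names, and the sample domains are all made up for illustration; this is not an existing Google or EasyList feature.

```python
from urllib.parse import urlparse

# Hypothetical list format: one rule per line, "<action> <domain>".
# Lines starting with "!" are treated as comments, as in EasyList.
SAMPLE_LIST = """\
! community-maintained example list (made up)
block spam-farm.example
lower bigbrand.example
"""

def parse_rules(text):
    """Return {domain: action} parsed from the hypothetical list format."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        action, domain = line.split(None, 1)
        rules[domain.lower()] = action
    return rules

def matching_action(host, rules):
    """Match the host itself or any parent domain against the rules."""
    parts = host.lower().split(".")
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in rules:
            return rules[candidate]
    return None

def apply_rules(urls, rules):
    """Drop blocked results and move lowered ones to the end."""
    kept, lowered = [], []
    for url in urls:
        action = matching_action(urlparse(url).hostname or "", rules)
        if action == "block":
            continue
        (lowered if action == "lower" else kept).append(url)
    return kept + lowered
```

Because the list is just text, forking it is a matter of copying the file and editing the rules, which is the property the EasyList comparison relies on.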
There are loads more hits like the above and they are nearly all wrong. The RPI distribution is based on Debian Linux but has a few differences. Between those two versions of Debian, RPi changed things in /boot quite dramatically and failing to do that, you will end up with a weird chimera - I created several of these beasts until I fixed them: https://blog.scheib.me/2024/04/14/upgrade-raspberry-bullseye...
In this case it may actually be a blog matching the template of the AI clones! However, they do all look very similar.
Google does perfectly on the latter search. It returns a relevant blog post written by an actual human, and a bunch of forum threads about that exact upgrade path.
I was searching for a uniquely named company by exact name (think: verizon), and it was 80% of the way down the results page. Google knew exactly what I wanted to see and flooded my screen with alternatives who had paid them.
I tried Kagi but just didn't see notably better results than other search engines. Maybe if I spent more time on the power user tools, or if Kagi offered more of a trial period I would have, but adding yet another monthly subscription is a high bar for me and what I saw didn't clear it.
These days my default assumption is that any SAAS product will get worse and more expensive over time, so it has to be pretty good to justify reworking my online habits around, given that I don't know how long I'll keep using it. Hopefully Kagi will be the exception to that rule, but I wouldn't bet on it.
That subscription fee is just too big of an obstacle in a time when everything has a subscription and is still often degrading in quality. Seems like an unsolvable chicken and egg scenario though, since relying on advertising to make it free would just result in the same issues as everything else.
It's quite literally this. It costs more than free and people don't want that. We're poor and poorer and everyone is overburdened by subscriptions for everything. I get that HN is in a rich bubble but most folks can't afford rent, food, and a search engine.
> costs more than free and people don't want that. We're poor and poorer and everyone is overburdened by subscriptions for everything
But that’s also the answer on preference. Google is good enough for most people. For everyone else, there can be a paid premium layer. Similar to news, this might be the equilibrium, not an anomaly.
I still search Google and other search engines from the command line. There is no "AI" garbage in the results. The way HN commenters refer to Google search in this thread, one might conclude it is not possible anymore to search the web without a popular browser running Javascript (which is a prerequisite for this "AI" stuff). That conclusion would be incorrect. It is still possible; I am still doing it every day.
3. Perform text processing on the response body (I create my own SERP instead of using Google's)
Personally, I use multiple programs, some I wrote myself in C, to perform these individual steps, connected by UNIX pipes and the shortest, simplest possible Bourne shell scripting
However, there are countless ways to perform these steps in a wide variety of programming languages; there is no need for UNIX or shell scripting, it is purely personal preference
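For illustration only, here is a sketch of what step 3 could look like in Python rather than in C programs joined by shell pipes. It assumes the response body is HTML that some earlier step already fetched, and it pulls out absolute links with their anchor text to build a plain-text results list. The sample HTML and the `LinkExtractor` and `plain_serp` names are made up; real result pages use different markup that changes over time.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for absolute http(s) links."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(("http://", "https://")):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the anchor text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None

def plain_serp(body):
    """Render the extracted links as a numbered plain-text list."""
    parser = LinkExtractor()
    parser.feed(body)
    parser.close()
    return "\n".join(
        f"{i}. {text}\n   {href}"
        for i, (href, text) in enumerate(parser.links, 1)
    )

# Made-up response body standing in for a fetched results page.
SAMPLE_BODY = """
<div><a href="https://blog.example/post"><h3>A human-written post</h3></a></div>
<div><a href="/settings">Settings</a></div>
<div><a href="https://forum.example/t/1">Forum <b>thread</b></a></div>
"""

print(plain_serp(SAMPLE_BODY))
```

Relative links such as `/settings` are skipped on the assumption that they are site navigation rather than results.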
As others noted, this has been an issue for years. What prompted Google to act now, using a manual override that was supposedly not feasible in the era of algorithms? Did the viral articles by Lars provide ammunition for teams within Google?
Why doesn't google just manually block/derank all these massive content spam sites like forbes and business insider? Actually solving the problem, even though it's not some neat tech solution. This is like watching doctors theorize on how to save a bleeding out patient who dies while they are talking.
An interesting intellectual exercise is to think about how a search engine could provide the best possible answer (from a user satisfaction perspective) to a query like "best CBD gummies".
A lot of people have a significant financial incentive to win at that search query.
What would the perfect top search result for that look like?
It would probably be an article written by professional writers in a trustworthy publication with a strong ethics policy, provably followed over the years, concerning whether they accept payment for promoting specific products in supposedly impartial reviews.
If you can figure out how to algorithmically detect that kind of content you could build a pretty great search engine!
Since "the best" doesn't exist, just like there is no magical professional that has unique insight into the mind of the user making the search, a search engine could become pretty great by simply not taking decades to remove scams like the one described in the article from the top of search results
There are many criteria for "best" that are acceptable to many people, e.g. lowest price, proven high quality ingredients, efficacy, etc. "Website with high reputation that happens to be running ads for company XYZ" is usually not how people define "best".
I think I'd be pretty happy if Consumer Reports was on the top for queries like these (if they had the relevant data, of course). I think they follow your criteria pretty closely.
Are we assuming that this search engine is only used by a few nerds, or is the idea to build something that remains good even if it gets popular enough that webmasters have strong financial incentives to game it like they currently do with Google Search? Because the latter sounds like a much, much harder problem, and in particular like it probably requires huge financial resources in order to win the ongoing cat-and-mouse game, if that's even possible.
I think maybe a query for the best gummies would be based on reviews from users, but I guess that's the point. Having something understand what one means by best is hard.
Hm, I think that Amazon shows that just user votes might not be sufficient - e.g. because users can be paid off to give 5 star reviews, which bias the results.
Google should vibe out others as well.
If I search now "Best CBD Gummies", the first few results are:
vice.com
independent.co.uk
healthline.com
observer.com
How is Forbes worse than any of those shallow comparison pages?
Kagi does let the user adjust the rankings of these sites if they don't want them coming up. While it would be nice to have this done proactively for the link farms, at least the user does have some control.
Forbes did make it on the blocked and lowered leaderboards.
Was probably a whole company right? Pretty good argument that Forbes the traditional media property and Forbes the seo giant are 2 different things: https://larslofgren.com/forbes-marketplace/
It was funny watching the warrior whatever site back in the day when Panda came along. Love when these people get their horrible business models kneecapped.
Now let's make corporate stock manipulation illegal again and ban corporate stock buybacks. Talk about a purely manipulative business strategy.
They are nothing but direct stock manipulation that was 'legalized' at the same time that executive compensation was moved from salary to... stock, so that you end up with a quasi-legal (stock manipulation by executives is supposed to be illegal) corrupt incentives system.
-Stock buybacks are not manipulation, they’re simply a way to return cash to shareholders and then the shareholder decides when to incur tax liability. A company is well within its rights to issue additional shares or buy back and destroy shares at their discretion. It’s functionally equivalent to a dividend without a taxable event.
-Corporate boards award stock grants to executives because they want management to be aligned with shareholders. I think executive compensation is excessive, but stock grants do align management and shareholders.
Dilution is immoral and unfair to investors. If a company wants to raise money they should have to sell shares they own, not print more and sell those.
'Stock manipulation is cool, especially when you change executives pay structure to be based purely on said manipulation. Totally creates healthy incentives not perverse ones.'
Sorry, buy backs are not stock manipulation. Let's step back from emotions and political skew. A company is able to take their capital and deploy it how they see fit. This can include purchasing percentage ownership of their company back from stockholders. Whether or not you agree doesn't make it manipulation in the general sense. It's just a way for a company to use their money.
One area where Google search is terribly broken is porn.
If you search for some specific term "$foo", nearly every result is just 'search site $bar for "$foo"', taking you to the site's search page, regardless of whether $foo is actually found on the site.
Had an old roommate who moved here after getting married to be closer to family. Weirdly, the name never at any point came up so I think everyone is just kind of resigned to the fact that they live in a place called Cumming
I assume the threat to their business posed by OpenAI (and others) is what is getting them to start addressing these long standing issues. I'm glad they're doing it, but upset that they let users suffer with sub-par results for so many years.
Good point. I just asked ChatGPT for the best CBD gummies, then asked what sources it used for the list. This was the first thing it said…
> Consumer reviews from trusted websites like Healthline, CNET, and Forbes Health that provide in-depth reviews and rankings based on effectiveness, ingredients, and customer feedback.
So the LLMs are now giving us affiliate link garbage, but we can’t easily see that was the source, and the affiliate links don’t even work. Everyone loses in this scenario.
Google has reams of company reviews, both those they've scraped and those they've solicited from the public. How hard could it be for them to downrank sites that advertise companies with relatively bad reviews, and uprank sites that advertise companies with relatively good reviews?
They could even scale the downranking so that the higher your site's reputation, the more it gets downranked if you're advertising poorly-reviewed companies. That would ding Forbes more than it would ding Joe's Little Blog, and prevent highly ranking sites (like Forbes) from having a monopoly on some search results.
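One way to read that proposal as a formula, purely as a sketch: the penalty grows with both the site's reputation and how far the advertised companies' reviews fall below some cutoff. The 0-to-1 reputation scale, the 1-to-5 review scale, the 3.0 cutoff, the `strength` knob, and every name below are assumptions, not anything Google is known to do.

```python
def adjusted_score(base_score, reputation, advertised_ratings,
                   cutoff=3.0, strength=0.5):
    """
    base_score:         the site's original ranking score
    reputation:         assumed 0..1 measure of how trusted the site is
    advertised_ratings: average review ratings (1..5) of advertised companies
    """
    if not advertised_ratings:
        return base_score
    avg = sum(advertised_ratings) / len(advertised_ratings)
    # With the default cutoff of 3.0 this lies in -1..1:
    # negative for poorly reviewed companies, positive for well reviewed ones.
    quality = (avg - cutoff) / 2.0
    if quality < 0:
        # Downranking scales with reputation: trusted sites lose more.
        factor = 1.0 + strength * quality * reputation
    else:
        # Upranking does not depend on reputation in this sketch.
        factor = 1.0 + strength * quality
    return base_score * factor

# Same poorly reviewed advertisers, very different reputations.
big_site = adjusted_score(100, reputation=0.9, advertised_ratings=[1.5, 2.0])
small_blog = adjusted_score(100, reputation=0.1, advertised_ratings=[1.5, 2.0])
print(big_site, small_blog)
```

With equal starting scores, the high-reputation site ends up below the small blog, which is the effect described above.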
The line between what you can get away with in SEO and what you can't get away with is making Google look stupid.
That page gives a good hint at how opaque these paid placements can be to an outsider like Google. Really tough to prevent too much collateral damage when going after bad actors like Forbes. Glad Google is working on it though.
Google is starting to get in the way too much. Recently I searched for German football clubs' "fan rivalries" and Google refused that and only gave me results for "fan friendships".
I just tried this (German football fan rivalries) and only got one thing with friendly and it was a reddit post. The AI response on top was on topic too.
It sounds like they're just deranking the blog spam posts. In my opinion, Google should derank the whole domain/brand. If you purposely put your name on garbage, we should put your name in the trash.
My feeling as a Google user since the beginning is that the search engine doesn't matter anymore in terms of quality. That is why I wonder how Google supposedly discovers their own "bugs" so late.