It’s been said that Google is on “code red” due to the threat of LLMs [1].
The issue isn’t just that people won't be clicking through as many links and ads anymore, although that’s certainly a problem.
It's that LLMs reduce the importance of search ranking, Google’s secret sauce.
Why Google is dominant
Today’s search process involves people clicking through a list of results, sorted roughly by decreasing relevance.
Since people are lazy and impatient, they’ll only click through a few. The top 5 (or 10) results are key, and Google wins because it returns great results there.
LLMs are speed readers
Since LLMs are automated, they can “read” through search results at a much faster rate than human readers.
Let’s say that instead of the ~10 results a human reads, an LLM can quickly read through 100.
We can then create an LLM search product (sketched in code below) that:
Runs the user’s search query (e.g. via API) and feeds the top 100 results to the LLM
Uses the LLM to generate an answer for the user, by summarizing relevant results
Returns source citations for humans to individually check
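Here’s a minimal sketch of that flow in Python. The helpers run_search and ask_llm are hypothetical stand-ins for whichever search API and LLM completion API you’d actually use; nothing here is a specific vendor’s interface.

```python
# Minimal sketch of the (search + LLM) product described above.
# `run_search` and `ask_llm` are hypothetical placeholders, not real library calls.

def run_search(query: str, n_results: int = 100) -> list[dict]:
    """Return up to n_results of {'url': ..., 'snippet': ...} from a search API."""
    raise NotImplementedError("wire up your search provider here")

def ask_llm(prompt: str) -> str:
    """Return an LLM completion for the given prompt."""
    raise NotImplementedError("wire up your LLM provider here")

def answer_with_citations(query: str) -> str:
    results = run_search(query, n_results=100)       # step 1: fetch the top 100 results
    context = "\n\n".join(
        f"[{i + 1}] {r['url']}\n{r['snippet']}" for i, r in enumerate(results)
    )
    prompt = (
        "Answer the question using only the sources below. "
        "Cite sources by their [number].\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return ask_llm(prompt)                           # steps 2-3: summarize + cite sources
```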
LLMs will “flatten” search rankings
Assuming the LLM can make good use of a longer context window, the ranking quality within those 100 results matters less to the end user of a (search + LLM) experience.
Why is this? For a “good” search engine like Google, much of the valuable information is concentrated in the top 10 results, roughly what a human will read. Competitors with less sophisticated ranking struggle to capture all of it in their respective top 10s, but it’s significantly easier for them to capture it within their top 100, the expanded window an LLM can read.
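To make that concrete, here’s a toy calculation with made-up rank positions for 20 relevant documents. The specific numbers are invented purely for illustration; the point is that the gap at 10 results is large while the gap at 100 nearly vanishes.

```python
# Toy illustration of the "flattening" argument, with made-up numbers.
# Suppose 20 documents are truly relevant to a query, and each engine
# ranks them at the (hypothetical) positions below.
good_engine_ranks = [1, 2, 3, 4, 5, 6, 8, 9, 12, 15, 18, 22, 27, 33, 40, 51, 60, 72, 85, 96]
weak_engine_ranks = [2, 5, 9, 14, 19, 25, 31, 38, 44, 50, 57, 63, 70, 76, 81, 87, 90, 93, 97, 99]

def recall_at(k: int, ranks: list[int]) -> float:
    """Fraction of relevant documents appearing in the top k results."""
    return sum(r <= k for r in ranks) / len(ranks)

for k in (10, 100):
    print(f"recall@{k}: good={recall_at(k, good_engine_ranks):.2f}, "
          f"weak={recall_at(k, weak_engine_ranks):.2f}")
# A human reading ~10 results sees a large gap (0.40 vs 0.15 here);
# an LLM reading 100 results sees essentially none (1.00 vs 1.00).
```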
As LLMs are added as an intermediate layer that synthesizes search results for users, the quality gap between search products will flatten.
Today, Google is much better than Bing or DuckDuckGo. In the future, the gap between (Google + LLM) and (Bing + LLM) or (DuckDuckGo + LLM) will likely be smaller. While Google is actively thinking about LLMs & search (see Bard), it will be forced to compete on a more level playing field going forward.
Using longer context windows effectively
For search to play out as described above, LLMs need to be able to use all parts of the context window equally well. Today’s LLMs are significantly better at using information at the very beginning or end of prompts, and underperform in the middle [2][3]. Since this would translate into a bias toward highly-ranked (e.g. 1-10) or lowly-ranked (e.g. 90-100) search results, it’s something that needs fixing.
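Until models improve, one possible workaround (a sketch under the assumption that results arrive sorted by rank, not a vetted fix) is to reorder the retrieved results so the best ones sit at the edges of the prompt, where today’s LLMs attend most reliably, and the worst ones fall in the middle.

```python
# Sketch of an edge-first ordering: interleave results so ranks 1, 3, 5, ...
# open the context and ranks 2, 4, 6, ... close it, leaving the lowest-ranked
# results in the middle of the prompt.

def edge_ordering(results: list) -> list:
    """Place the highest-ranked results at both edges of the prompt."""
    front, back = [], []
    for i, r in enumerate(results):          # results assumed sorted best-first
        (front if i % 2 == 0 else back).append(r)
    return front + back[::-1]

print(edge_ordering(list(range(1, 11))))
# [1, 3, 5, 7, 9, 10, 8, 6, 4, 2]  -> worst-ranked results end up in the middle
```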
The final obstacle is cost
Some companies like Perplexity.ai are already making good headway in applying LLMs to search, but their business model is based on paid subscribers [4].
It’s certainly possible a subscription-based model will end up doing really well. For an LLM-based search engine to become as popular as Google, however, we’d also need to consider the cost (and perhaps latency) required to support a free, ads-based model.
Widespread adoption of non-subscription LLM search doesn’t seem years off - although a light initial version of LLM-based search costs 10x as much as standard keyword search [5], companies like Google can absorb the hit.
The full impact for search probably won’t be realized until costs drop further - at which point we could either run more powerful LLMs, or do multiple LLM runs per search query to further improve quality. With OpenAI’s inference costs dropping by 90%+ in a year [6], as well as smaller & cheaper models like Llama & Mistral 7B demonstrating progress, the timeline might be shorter than you think.
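As a back-of-the-envelope illustration of that timeline, using only the figures cited above (a ~10x launch cost and a ~90% yearly decline in inference cost) plus the simplifying assumption that the entire overhead is LLM inference:

```python
# Rough, illustrative arithmetic only - the inputs come from the footnoted
# figures above, and per-query costs are in arbitrary units.
keyword_cost = 1.0        # baseline keyword search cost
llm_search_cost = 10.0    # ~10x at launch (footnote 5)
annual_drop = 0.90        # ~90% yearly decline in inference cost (footnote 6)

years = 0
while llm_search_cost > keyword_cost:
    llm_search_cost *= (1 - annual_drop)
    years += 1

print(f"rough parity with keyword search after ~{years} year(s)")
# rough parity with keyword search after ~1 year(s)
```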