How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews
Riley Grossman, Songjiang Liu, Michael K. Chen, Mike Smith, Cristian Borcea, Yi Chen
Problem
This preprint examines how generative AI changes web search relative to traditional search engines, a question that remains poorly understood. The authors compare how information is retrieved and presented by Google Search, Google’s AI Overview (AIO), and Gemini Flash 2.5. The study is motivated by the growing integration of generative AI into search, which raises questions about user experience and content visibility.
Method
The authors introduce a public benchmark dataset of 11,500 user queries and use it to compare the results produced by Google Search, AIO, and Gemini Flash 2.5. Key metrics include how often an AIO is generated, the Jaccard similarity between the sets of sources each system retrieves, and the consistency of results across repeated runs of the same query. Statistical measures quantify the differences in source retrieval and the effect of query modifications on AIO outputs.
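The Jaccard similarity mentioned above is the size of the intersection of two sets divided by the size of their union. The following is a minimal sketch of how it could be applied to two lists of retrieved sources. It is not the authors' code: the function names, the example URLs, and the choice to compare at the domain level (rather than by full URL) are assumptions made for illustration.

```python
from urllib.parse import urlparse


def normalize_domain(url: str) -> str:
    """Reduce a URL to its lowercase host, dropping a leading 'www.'.

    Comparing by domain is an assumption of this sketch; the study may
    compare sources at a different granularity.
    """
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


# Hypothetical sources returned for one query by two systems.
search_sources = [
    "https://www.example.org/a",
    "https://example.edu/b",
    "https://example.com/c",
]
aio_sources = [
    "https://example.com/d",
    "https://video.example/e",
]

search_domains = [normalize_domain(u) for u in search_sources]
aio_domains = [normalize_domain(u) for u in aio_sources]

# One shared domain out of four distinct domains overall.
print(jaccard(search_domains, aio_domains))  # prints 0.25
```

A value near 0 means the two systems cite almost entirely different sources, and a value of 1 means they cite exactly the same set.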
Results
AIOs are generated for 51.5% of the analyzed queries and often appear above the traditional organic search results. The average Jaccard similarity between the sources retrieved by the different search engines is below 0.2, indicating that they select substantially different sources. Google Search predominantly retrieves content from established institutional websites, while the generative search engines favor Google-owned content. Websites that block Google’s AI crawler are significantly underrepresented in AIO results, even though their content remains accessible. AIO outputs are also flagged as inconsistent, both across repeated runs of an identical query and after minor query edits.
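One way to put a number on the consistency concern is to average the Jaccard similarity over every pair of repeated runs of the same query. The sketch below assumes that measure. The summary does not say which consistency statistic the authors use, so the function name `mean_pairwise_jaccard` and the example domains are hypothetical.

```python
from itertools import combinations


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def mean_pairwise_jaccard(runs) -> float:
    """Average Jaccard similarity over all unordered pairs of runs.

    `runs` holds one collection of cited sources per repeated run of
    the same query. A score of 1.0 means every run cited the same set.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure consistency")
    scores = [jaccard(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)


# Hypothetical sources cited across three runs of one query.
runs = [
    {"a.example", "b.example"},
    {"a.example", "b.example"},
    {"a.example", "c.example"},
]

# Pair scores are 1, 1/3, and 1/3, so the mean is 5/9.
print(round(mean_pairwise_jaccard(runs), 3))  # prints 0.556
```

The same function could be applied to the sources returned for a query and for a lightly edited version of it, which would give a comparable score for sensitivity to minor query edits.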
Limitations
The authors acknowledge that their study is limited to a specific set of user queries and may not generalize across all search contexts. They do not address potential biases in the dataset or the representativeness of the queries. Furthermore, the implications of the findings on user behavior and long-term trends in search engine optimization are not fully explored. The study also does not consider the impact of evolving generative AI models beyond the current versions analyzed.
Why it matters
The findings suggest that generative AI could disrupt traditional SEO practices, requiring content publishers to adopt new strategies to stay visible in search results. The authors advocate for revenue frameworks that support a sustainable ecosystem for both content creators and generative search providers. These dynamics matter to stakeholders across the digital content landscape because they may shape how information is accessed and consumed.
Source: arXiv:2604.27790
URL: https://arxiv.org/abs/2604.27790v1