MCP vs RAG vs NLWeb vs HTML: a comparison of the effectiveness and efficiency of different agent interfaces to the web


Steiner, Aaron ; Peeters, Ralph ; Bizer, Christian



DOI: https://doi.org/10.1145/3774904.3792893
URL: https://dl.acm.org/doi/10.1145/3774904.3792893
URN: urn:nbn:de:bsz:180-madoc-721302
Document Type: Conference or workshop publication
Year of publication Online: 2026
Date: 12 April 2026
Book title: WWW '26 : Proceedings of the ACM Web Conference 2026
Page range: 8493-8496
Conference title: WWW '26: The ACM Web Conference 2026
Location of the conference venue: Dubai, United Arab Emirates
Date of the conference: 13-17 April 2026
Editors: Hacid, Hakim ; Maarek, Yoelle
Place of publication: New York, NY, USA
Publisher: Association for Computing Machinery
ISBN: 979-8-4007-2307-0
Publication language: English
Institution: School of Business Informatics and Mathematics > Information Systems V: Web-based Systems (Bizer 2012-)
Pre-existing license: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 004 Computer science, internet
Individual keywords (German): web agents, llm agents, rag, mcp, nlweb, electronic commerce
Abstract: LLM-based agents are increasingly used to automate web tasks such as product search, offer comparison, and order placement. Current research explores different interfaces through which these agents interact with websites, including traditional HTML browsing, retrieval-augmented generation (RAG) over pre-crawled content, communication via Web APIs using the Model Context Protocol (MCP), and natural-language querying through the NLWeb interface. Yet no systematic comparison of the effectiveness and efficiency of these interfaces on identical challenging task sets exists. To address this gap, we introduce a testbed consisting of four simulated e-shops, each offering its products via HTML, MCP, and NLWeb interfaces. For each interface (HTML, RAG, MCP, and NLWeb), we develop specialized agents that perform the same sets of tasks, ranging from simple product searches and price comparisons to complex queries for complementary or substitute products and checkout processes. We evaluate the agents using GPT-5 and GPT-5-mini. Our evaluation shows that RAG, MCP, and NLWeb agents outperform HTML browsing agents by 11 percentage points in task completion while requiring 2–5 times fewer tokens on search-oriented tasks. The GPT-5 RAG agent achieves the highest task completion rate (0.79) while maintaining moderate token consumption.
Additional information: The conference was postponed to June/July




This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.

This publication has so far appeared only online.



