mcp-proto-okn: Natural-language access to open scientific knowledge graphs through the Model Context Protocol
Peter W. Rose, Benjamin M. Good, Amanda M. Saravia-Butler, Charlotte A. Nelson, James P. Balhoff, Yaphet Kebede
- Published
- May 28, 2026 — 17:37 UTC
Problem
This paper presents a preprint addressing the gap in accessibility and usability of scientific knowledge graphs for non-expert users, particularly in the biomedical domain. Traditional methods for querying and integrating knowledge graphs often require specialized knowledge of query languages like SPARQL, which limits their use to domain experts. The authors propose the Model Context Protocol (MCP) server, mcp-proto-okn, to facilitate natural language interactions with these knowledge graphs, thereby democratizing access to scientific data.
Method
The core technical contribution is the implementation of the mcp-proto-okn server, which leverages the FastMCP framework to enable natural language processing capabilities for knowledge graph interaction. Key features include:
- Graph Routing: Directs queries to the appropriate knowledge graph based on context.
- Schema Inspection: Allows users to explore the structure of the knowledge graph.
- SPARQL Execution: Executes SPARQL queries generated from natural language inputs.
- Ontology Expansion: Enhances the knowledge graph with additional ontological information.
- Multi-Graph Querying: Supports querying across multiple knowledge graphs simultaneously.
- Transcript Generation: Produces human-readable transcripts of interactions for user reference.
The server is implemented in Python, and the authors provide comprehensive documentation, client configuration instructions, and example analysis transcripts on their GitHub repository.
Results
While specific quantitative results are not provided in the abstract, the authors claim that mcp-proto-okn significantly lowers the barrier for cross-domain knowledge graph analysis. The effectiveness of the server is implied through its design, which aims to facilitate user engagement with complex data structures without requiring deep technical expertise. The authors likely benchmark the server against traditional querying methods, although specific baseline comparisons and performance metrics are not detailed in the abstract.
Limitations
The authors acknowledge that the mcp-proto-okn server may not fully capture the nuances of complex queries that require advanced SPARQL knowledge. Additionally, the reliance on natural language processing introduces challenges related to ambiguity and context understanding, which may affect query accuracy. The paper does not discuss scalability issues or performance benchmarks under high-load scenarios, which are critical for real-world applications. Furthermore, the server’s effectiveness in diverse scientific domains beyond biomedicine remains untested.
Why it matters
The development of mcp-proto-okn has significant implications for the accessibility of scientific knowledge. By enabling natural language access to complex knowledge graphs, the server can empower a broader range of users, including researchers, clinicians, and educators, to leverage scientific data without requiring extensive technical training. This democratization of data access can foster interdisciplinary collaboration and innovation, potentially accelerating discoveries in biomedical research and beyond. The work sets a precedent for future research in natural language interfaces for knowledge graph interaction, paving the way for more intuitive AI-driven tools in scientific inquiry.
Authors: Peter W. Rose, Benjamin M. Good, Amanda M. Saravia-Butler, Charlotte A. Nelson, James P. Balhoff, Yaphet Kebede, Patricia L. Whetzel, Christopher Bizon et al.
Source: arXiv:2605.30283
URL: https://arxiv.org/abs/2605.30283v1
By Turing Wire editorial staff · May 28, 2026 · Editorial standards →
Source: arXiv cs.AI