
SharedRequest: Privacy-Preserving Model-Agnostic Inference for Large Language Models

Peihua Mai, Xuanrong Gao, Youlong Ding, Xianglong Du, Wei Liu, Yan Pang

Published
Jun 3, 2026 — 15:23 UTC

Problem
The paper addresses user prompt privacy when querying public large language models (LLMs) such as ChatGPT. Existing privacy-preserving methods often compromise either utility or efficiency, and they typically require model-specific adaptations, which limits their applicability across architectures. This preprint seeks to fill that gap with a model-agnostic approach that requires no modification of the underlying model.

Method
The authors propose SharedRequest, a framework that reformulates privacy protection at the batch level. Its core idea is to obscure sensitive information by mixing original prompts with noisy variants, while grouping semantically equivalent instructions so that they can share a request. The batching approach amortizes inference cost across multiple queries while aiming to minimize the impact on response quality. The method is independent of model architecture: it requires neither access to model parameters nor structural changes. The authors detail the implementation of the framework, although training compute requirements are not disclosed.
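To make the batch-level idea concrete, here is a minimal Python sketch of grouping queries by instruction and mixing real inputs with noisy variants. It is an illustration under stated assumptions, not the authors' implementation: the function names (normalize, noisy_variant, build_batches), the word-replacement noise, and the exact-match grouping are all hypothetical stand-ins, since this summary does not describe how SharedRequest generates noise or judges semantic equivalence.

```python
import random
from collections import defaultdict


def normalize(instruction: str) -> str:
    # Hypothetical stand-in for semantic matching: instructions count as
    # equivalent if they are equal after lowercasing and collapsing whitespace.
    return " ".join(instruction.lower().split())


def noisy_variant(text: str, vocab: list[str], rng: random.Random, p: float = 0.5) -> str:
    # Assumed noise model: swap each word for a random vocabulary word with
    # probability p. This carries no formal privacy guarantee.
    return " ".join(rng.choice(vocab) if rng.random() < p else word
                    for word in text.split())


def build_batches(queries, vocab, n_decoys=2, seed=0):
    """Group (instruction, private_input) pairs and hide real inputs among decoys."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for instruction, private_input in queries:
        groups[normalize(instruction)].append(private_input)

    batches = []
    for instruction, inputs in groups.items():
        # Tag every entry so the client can later pick out the real answers.
        entries = [(text, True) for text in inputs]
        for text in inputs:
            entries += [(noisy_variant(text, vocab, rng), False)
                        for _ in range(n_decoys)]
        rng.shuffle(entries)

        # The instruction appears once; all inputs share it in a single request.
        lines = [instruction]
        lines += [f"{i + 1}. {text}" for i, (text, _) in enumerate(entries)]
        batches.append({
            "prompt": "\n".join(lines),
            # Kept on the client side only; never sent to the model.
            "real_positions": [i for i, (_, real) in enumerate(entries) if real],
            "size": len(entries),
        })
    return batches
```

In this sketch, three queries that use two distinct instructions produce two requests rather than three, and the provider sees each real input alongside its decoys without being told which is which. The client uses real_positions to read back only the answers it needs.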

Results
Empirical evaluations show that SharedRequest achieves over 20% higher utility than existing differential privacy baselines. The shared-prompt mechanism also reduces query costs by up to 5× relative to non-batched inference. Together, these results indicate gains in both the efficiency and the effectiveness of privacy-preserving inference for LLMs.

Limitations
The authors acknowledge that while SharedRequest improves utility and efficiency, it may still be susceptible to certain types of privacy attacks that exploit the noise introduced in the prompts. Furthermore, the performance gains are contingent on the quality of the noise variants generated, which may vary based on the specific application context. The paper does not address the potential computational overhead introduced by the noise generation process or the implications of varying batch sizes on inference quality.

Why it matters
This work matters for deploying LLMs in privacy-sensitive applications because it offers a scalable way to enhance user privacy without sacrificing performance. Because SharedRequest is model-agnostic, it can be applied across different LLM architectures and integrated into existing systems. That is important for building trust in AI applications, particularly in sectors where data privacy is paramount, such as healthcare and finance. The findings and methods in the paper, which is available on arXiv, are also relevant to future research in privacy-preserving machine learning.

Turing Wire

By Turing Wire editorial staff · Jun 3, 2026

Source: arXiv cs.AI