high · experimental · Linux · AI/ML · T1565.001
LLM Service Replacing Retrieval Corpus Files
Detects LLM service processes writing to knowledge base, corpus, or RAG document index directories. Replacement of the retrieval corpus is a direct mechanism for injecting misinformation into RAG-grounded LLM responses.
Updated Jan 15, 2025 · Detection Engineering Team
llm · misinformation · linux · corpus-replace · owasp-llm09
Problem Statement
The retrieval corpus is the knowledge source that grounds RAG model responses. Replacing or modifying corpus files allows an attacker to inject false facts that the model will confidently cite as retrieved evidence.
Sample Logs
{"timestamp":"2025-01-15T04:55:20Z","computer_name":"llm-host-01","user":"llm_svc","image":"/opt/llm/app/corpus_updater.py","target_filename":"/opt/llm/rag/knowledge/company_policy.txt","event_type":"file_modify"}
Required Fields
image
target_filename
user
computer_name
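As a rough illustration of how these fields might combine into a match condition, the Python sketch below checks that an event carries all four required fields, that the writing process lives under an LLM service path, and that the target file sits under a corpus-like directory. The path prefix, the directory markers, and the function name `is_corpus_write` are assumptions made for this example and are not part of the rule itself; only the `file_modify` event type is taken from the sample log.

```python
import json
from pathlib import PurePosixPath

# Fields the rule needs in every event.
REQUIRED_FIELDS = ("image", "target_filename", "user", "computer_name")

# Assumption: LLM service code lives under this prefix (taken from the sample log).
LLM_IMAGE_PREFIXES = ("/opt/llm/",)

# Assumption: directory names that indicate a retrieval corpus; tune per deployment.
CORPUS_DIR_MARKERS = {"knowledge", "corpus", "rag", "index"}

# Only the event type seen in the sample log; extend if the sensor emits others.
WRITE_EVENT_TYPES = {"file_modify"}


def is_corpus_write(event: dict) -> bool:
    """Return True when an LLM service process writes under a corpus directory."""
    # Skip events that lack any required field.
    if any(not event.get(field) for field in REQUIRED_FIELDS):
        return False
    if event.get("event_type") not in WRITE_EVENT_TYPES:
        return False
    if not event["image"].startswith(LLM_IMAGE_PREFIXES):
        return False
    # Compare each parent directory name against the corpus markers.
    parent_dirs = PurePosixPath(event["target_filename"]).parent.parts
    return any(part.lower() in CORPUS_DIR_MARKERS for part in parent_dirs)


sample = json.loads(
    '{"timestamp":"2025-01-15T04:55:20Z","computer_name":"llm-host-01",'
    '"user":"llm_svc","image":"/opt/llm/app/corpus_updater.py",'
    '"target_filename":"/opt/llm/rag/knowledge/company_policy.txt",'
    '"event_type":"file_modify"}'
)
print(is_corpus_write(sample))
```

Matching on directory names rather than full paths keeps the sketch portable across hosts, at the cost of more noise if unrelated directories share those names.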
False Positives
- Approved knowledge base update pipelines that refresh RAG document stores
Tuning Guidance
Corpus updates should follow a controlled pipeline with content validation and review. Alert on any write that occurs outside an approved deployment window or that is made by an account other than the approved service account.
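One possible way to express this tuning is a filter applied after the rule matches. The sketch below suppresses an alert only when the write comes from an approved account and falls inside the approved window. The account name `kb_deploy`, the 01:00 to 03:00 UTC window, and the function name `should_alert` are hypothetical values chosen for illustration; substitute whatever the local deployment process defines.

```python
from datetime import datetime, time, timezone

# Hypothetical: the only account allowed to update the corpus.
APPROVED_ACCOUNTS = {"kb_deploy"}

# Hypothetical: approved deployment window in UTC, start inclusive, end exclusive.
APPROVED_WINDOW = (time(1, 0), time(3, 0))


def should_alert(event: dict) -> bool:
    """Alert unless the write is inside the window AND by an approved account."""
    # Normalize the trailing "Z" so fromisoformat works on older Python versions.
    ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    ts_utc = ts.astimezone(timezone.utc)
    start, end = APPROVED_WINDOW
    in_window = start <= ts_utc.time() < end
    approved_account = event["user"] in APPROVED_ACCOUNTS
    return not (in_window and approved_account)


# The sample log: 04:55 UTC, written by llm_svc, so both conditions fail.
print(should_alert({"timestamp": "2025-01-15T04:55:20Z", "user": "llm_svc"}))
```

Requiring both conditions is the stricter reading of the guidance: an approved account writing outside the window, or an unapproved account writing inside it, still raises an alert.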