[TOC]
RAG
RAG的应用背景:
-
提高回答准确率
“increase LLM accuracy”
-
减少幻觉
“reducing LLM hallucination”
-
无需训练的实时信息更新和专业知识增强
“handling training-free up-to-date information and niche information”
但是对于RAG性能来说,检索的precision要比recall重要的多1
KGs+RAG
Motivation:
-
query-focused summarization问题RAG回答不了2
“RAG does not consider broader contextual semantics, e.g., correlations among documents within corpora. (query-focused summarization)”
-
无效召回对RAG性能损害大
“RAG is a means of LLM optimization, and as such, RAG document selection must be precise, not general.”
而KGs是一种可能能够实现non-noisy RAG的方法[7, 8, 9]
“KGs may provide concise and understandable sources of subject matter expert knowledge[6] and non-noisy RAG” [7, 8, 9]
Graph RAG research的两个方面:
-
如何(准确,高效,trade-off)利用 KGs 做生成增强
-
如何(准确,高效,trade-off)从 document corpora 中构建 KGs(这是一种更general的方法1)
Challenge:Graph RAG的一个基本定性是:穷举不可行。
“At the core of Graph RAG, research is the problem of relevant sub-graph identification from larger KGs, which is computationally infeasible (NP-hard) if done exhaustively.”
GraphRAG
主要还是把图分成了community,对每一个community保存了摘要用于总结。