[Question]: Gaining better control over the insert process? #1828
WilliamDiakite started this conversation in General
Replies: 1 comment
-
Answering the second part of my question: one can insert a custom knowledge base using |
-
Do you need to ask a question?
Your Question
The `rag.insert(...)` method takes a text document (or a collection of documents) as input. From there, LightRAG uses an LLM to extract various information (entities, relationships, etc.). This extraction is driven by a rather complex prompt, which outputs formatted data that LightRAG later parses.

My first question concerns the format of the output: why this particular default format (the one described in `prompt.py`)? Why not ask the LLM to output something like `json` or even `xml`, which can be easily handled by machines and humans alike? Not to mention that LLMs are trained on such formats (or languages, in the case of XML). I couldn't find any reason for this design choice in the paper.

The follow-up question addresses the possibility of interacting with the insert task by providing `rag.insert(...)` with formatted data rather than plain text (using a schema specified by LightRAG). This way, data preparation could be handled outside LightRAG, allowing easier testing and better control (moreover, having an explicit schema for the extracted data would make it more comfortable to modify the LLM prompt). Is there an approach that already allows that kind of interaction?

Additional Context
No response
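
To make the question concrete, here is a minimal sketch of what an explicit, JSON-based schema for extracted data could look like, along with a validator that could run outside LightRAG before any insert. Everything in it is hypothetical: the `Entity`, `Relationship`, and `Extraction` classes, their field names, and `parse_extraction` are illustrative assumptions, not LightRAG's actual format or API.

```python
import json
from dataclasses import dataclass, field


# Hypothetical schema: these names and fields are illustrative only and do
# not reflect the format LightRAG defines in prompt.py.
@dataclass
class Entity:
    name: str
    type: str
    description: str = ""


@dataclass
class Relationship:
    source: str
    target: str
    description: str = ""


@dataclass
class Extraction:
    entities: list[Entity] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)


def parse_extraction(raw: str) -> Extraction:
    """Parse a JSON string and reject relationships whose endpoints are undeclared."""
    data = json.loads(raw)
    entities = [Entity(**e) for e in data.get("entities", [])]
    relationships = [Relationship(**r) for r in data.get("relationships", [])]

    # Every relationship endpoint must match the name of a declared entity.
    known = {e.name for e in entities}
    for rel in relationships:
        for endpoint in (rel.source, rel.target):
            if endpoint not in known:
                raise ValueError(f"relationship refers to unknown entity: {endpoint!r}")
    return Extraction(entities, relationships)


# Example payload, as an LLM (or a hand-written fixture) might produce it.
SAMPLE = """
{
  "entities": [
    {"name": "LightRAG", "type": "software", "description": "RAG framework"},
    {"name": "LLM", "type": "concept", "description": "Large language model"}
  ],
  "relationships": [
    {"source": "LightRAG", "target": "LLM", "description": "uses for extraction"}
  ]
}
"""

if __name__ == "__main__":
    extraction = parse_extraction(SAMPLE)
    print(len(extraction.entities), "entities,", len(extraction.relationships), "relationships")
```

With a schema like this, extraction output could be tested in isolation with hand-written fixtures, which is the kind of control over the insert step the question is after.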