Replies: 5 comments 2 replies
You can save the original chunk and label with your question-answer pairs in the embedding database by adding them to the metadata dictionary of each Document object before the embedding is created. Dify's pipeline (for example, in the QAIndexProcessor and DatasetDocumentStore) passes this metadata through to the vector database, where it is stored as a JSON object alongside the embedding and the text chunk. This works with backends like PGVector, which has a dedicated meta column for arbitrary metadata.

Here's a simplified example:

```python
# Assumed import path inside Dify's api package; adjust if your version differs.
from core.rag.models.document import Document

doc = Document(
    page_content=question_text,
    metadata={
        "original_chunk": original_chunk_text,
        "label": your_label,
        "answer": answer_text,
        # ...any other fields you want
    },
)
# Add doc to your document store and proceed with embedding as usual
```

When you call the vector creation method, this metadata is persisted and can be retrieved with the embedding. You can also filter or query on these metadata fields later. For more details, see the code references for QAIndexProcessor, DatasetDocumentStore, and PGVector.

If you're using the API, note that as of v1.1.0 you may need to update the metadata after document creation, since saving metadata directly during creation isn't always supported yet. There are also feature requests to make segment-level metadata easier to use in future versions.
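To make the metadata pattern above concrete without depending on Dify itself, here is a minimal, self-contained sketch. `QADocument`, `build_qa_documents`, and `filter_by_label` are hypothetical names invented for this illustration; they are not part of Dify, and the real pipeline may structure things differently. The sketch only shows the idea of carrying the original chunk and label with each question-answer pair and filtering on them afterwards.

```python
import json
from dataclasses import dataclass, field


@dataclass
class QADocument:
    """Hypothetical stand-in for a document object; not Dify's own class."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_qa_documents(pairs, original_chunk, label):
    """Attach the source chunk and label to every question-answer pair."""
    return [
        QADocument(
            page_content=question,
            metadata={
                "original_chunk": original_chunk,
                "label": label,
                "answer": answer,
            },
        )
        for question, answer in pairs
    ]


def filter_by_label(docs, label):
    """Keep only the documents whose metadata carries the given label."""
    return [d for d in docs if d.metadata.get("label") == label]


chunk = "Dify stores embeddings together with a JSON metadata object."
docs = build_qa_documents(
    [
        ("Where is metadata stored?", "Alongside the embedding."),
        ("What format is it?", "A JSON object."),
    ],
    original_chunk=chunk,
    label="storage",
)

# The metadata dict is the part that would be serialized as JSON next to the vector.
serialized = json.dumps(docs[0].metadata, sort_keys=True)
matches = filter_by_label(docs, "storage")
```

Because each pair carries its own metadata dict, every question maps back to the chunk it was generated from, which is the per-pair mapping asked about in this thread. Whether a given Dify version exposes that per-segment metadata through its public API is a separate question.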
Can I use an API such as "https://api.dify.ai/v1/datasets/{dataset_id}/documents/create-by-qa" to save this information manually? Could you give me a guideline on how to call this kind of API to insert question-answer pairs together with their metadata? My other question: if I can insert this metadata into the knowledge embedding database, what steps do I need to take to use it in a chat pipeline application?
Which API can I use to add a new question-answer pair together with its metadata in a single call?
From your reply, it seems that individual question-answer pairs cannot have their own metadata, and that only a whole document (a PDF or a DOCX) can. But I want every question-answer pair to map to a chunk. Does Dify support this?
Self Checks
1. Is this request related to a challenge you're experiencing? Tell me about your story.
When I save question-answer pairs into the embedding database, how can I save the original chunk and label as metadata at the same time?
2. Additional context or comments
No response