Missing custom metadata "file_url" when retrieving document in Dify #13626
Hi @Hories0106! I'm Dosu and I'm helping the ragflow team.
This is a known limitation. RAGFlow's code does include custom metadata fields (like your
The likely cause: When building the metadata cache for retrieval responses, RAGFlow uses a default pagination of
To verify this is your issue, check how many documents are in your knowledge base and where your document falls in the list. If it's beyond position 30-128, this is likely the cause.
Possible workarounds:
There are also other related Dify integration issues being tracked, including #13335 regarding missing
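The pagination explanation above can be illustrated with a minimal sketch. This is not RAGFlow's code: the function names, the `meta_fields` key, and the page size of 10 are all hypothetical, chosen only to show how a cache built from a single page of documents would silently drop metadata for documents further down the list.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def build_first_page_cache(documents: List[Document], page_size: int) -> Dict[str, dict]:
    """Hypothetical faulty cache: only the first page of documents is read."""
    first_page = documents[:page_size]
    return {doc["id"]: doc.get("meta_fields", {}) for doc in first_page}


def build_full_cache(documents: List[Document], page_size: int) -> Dict[str, dict]:
    """Hypothetical fix: walk every page until the list is exhausted."""
    cache: Dict[str, dict] = {}
    for start in range(0, len(documents), page_size):
        for doc in documents[start:start + page_size]:
            cache[doc["id"]] = doc.get("meta_fields", {})
    return cache


# 25 made-up documents, each carrying a custom file_url field.
documents = [
    {"id": f"doc-{i}", "meta_fields": {"file_url": f"/files/{i}.pdf"}}
    for i in range(25)
]

truncated = build_first_page_cache(documents, page_size=10)
full = build_full_cache(documents, page_size=10)

print("doc-20" in truncated)       # document beyond the first page
print(full["doc-20"]["file_url"])  # metadata is found once all pages are read
```

If this is what happens in practice, a document near the top of the list would return its custom fields while one further down would not, which matches the position check suggested above.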
@dosu
Body:
I'm using RAGFlow v0.24.0. In my knowledge base, I added a custom metadata field named file_url to a specific file (screenshot attached showing that I have added this metadata). However, when I retrieve this document through Dify, the returned metadata does not include the file_url field. Below is the metadata I received from Dify:
json
"metadata": {
    "_source": "knowledge",
    "data_source_type": "external",
    "dataset_id": "9b1529ac-4beb-4067-8795-114fd7b03a3a",
    "dataset_name": "人事总务(26/3/10)",
    "doc_metadata": {
        "dataset_id": "9b1529ac-4beb-4067-8795-114fd7b03a3a",
        "dataset_name": "人事总务(26/3/10)",
        "doc_id": "3a26cc001d1d11f1866c6d65ecb9f55a",
        "score": 0.3046044653646284,
        "title": "E-Mail管理规定(中).pdf"
    },
    "document_id": "E-Mail管理规定(中).pdf",
    "document_name": "E-Mail管理规定(中).pdf",
    "position": 3,
    "retriever_from": "workflow",
    "score": 0.3046044653646284
},
"title": "E-Mail管理规定(中).pdf"
As you can see, the custom field file_url is missing.
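To rule out the field being nested somewhere unexpected, the payload can be searched recursively. The sketch below is a generic helper, not part of Dify or RAGFlow; `find_key` is a hypothetical name, and `record` is a trimmed copy of the metadata shown above.

```python
from typing import Any, List


def find_key(payload: Any, key: str, path: str = "") -> List[str]:
    """Return the dotted paths at which `key` appears in nested dicts and lists."""
    hits: List[str] = []
    if isinstance(payload, dict):
        for name, value in payload.items():
            child = f"{path}.{name}" if path else name
            if name == key:
                hits.append(child)
            hits.extend(find_key(value, key, child))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_key(item, key, f"{path}[{index}]"))
    return hits


# Trimmed copy of the retrieval result from this post.
record = {
    "metadata": {
        "_source": "knowledge",
        "data_source_type": "external",
        "doc_metadata": {
            "doc_id": "3a26cc001d1d11f1866c6d65ecb9f55a",
            "score": 0.3046044653646284,
            "title": "E-Mail管理规定(中).pdf",
        },
        "position": 3,
        "retriever_from": "workflow",
    },
    "title": "E-Mail管理规定(中).pdf",
}

print(find_key(record, "file_url"))  # no paths: the field is absent at every level
print(find_key(record, "doc_id"))    # sanity check on a field that does exist
```

An empty result for `file_url` alongside a hit for `doc_id` confirms the field is missing from the whole payload rather than just from the top-level `metadata` object.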
Steps to reproduce:
1. In RAGFlow v0.24.0, add a file to a knowledge base.
2. Add a custom metadata field file_url with some value (e.g., a local file path) to that file (see screenshot).
3. Use Dify to retrieve this document (e.g., via a retrieval workflow).
4. Observe the returned metadata: the file_url field is not present.
Expected behavior: The custom metadata file_url should be included in the metadata returned when the document is retrieved.
Actual behavior: The file_url field is absent.
Environment:
RAGFlow version: v0.24.0
Dify version: (unknown, but the issue seems to be on RAGFlow side)
Deployment: (e.g., local Docker, cloud – not specified)
Additional context: Screenshot attached showing the metadata added in RAGFlow. Please let me know if you need any further information.

