Closed as not planned
Labels
epic (larger tracking issue encompassing multiple smaller issues), stale
Description
Add support for ingesting and processing various document types (Markdown, PDF, DOCX, etc.) into formats compatible with SDG workflows.
Key Features:
- InstructLab Schema: Define an InstructLab schema to standardize input formats for SDG and RAG.
- Docling Integration: Use Docling to convert document formats (PDF, DOCX, HTML) into a JSON-compatible schema.
- Document Chunking Command: Develop `ilab document format` for chunking and formatting documents as per the SDG schema.
- Simplified Git Workflows: Introduce a script to handle Git repo setup, structure, and file organization for knowledge documents.
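
To make the chunking feature more concrete, here is a rough sketch of what the chunking step could produce for a Markdown input. It uses only the Python standard library. The function name `chunk_markdown`, the `max_words` parameter, and the record fields (`source`, `heading`, `text`) are all assumptions made for illustration; they are not the InstructLab schema, which this issue proposes to define, and they are not the behavior of any existing `ilab` command.

```python
import json
import re


def chunk_markdown(text, source, max_words=50):
    """Split Markdown into heading-scoped chunks of at most max_words words.

    The record layout (source/heading/text) is hypothetical, not the SDG schema.
    """
    chunks = []
    heading = ""
    paragraph = []

    def flush():
        # Emit the buffered paragraph as one or more word-limited chunks.
        words = " ".join(paragraph).split()
        for start in range(0, len(words), max_words):
            chunks.append({
                "source": source,
                "heading": heading,
                "text": " ".join(words[start:start + max_words]),
            })
        paragraph.clear()

    for line in text.splitlines():
        match = re.match(r"#{1,6}\s+(.*)", line)
        if match:
            # A new heading closes the current paragraph and sets the context.
            flush()
            heading = match.group(1).strip()
        elif line.strip():
            paragraph.append(line.strip())
        else:
            # A blank line ends the current paragraph.
            flush()
    flush()
    return chunks


if __name__ == "__main__":
    doc = "# Intro\nHello world.\n\n## Details\none two three four five"
    print(json.dumps(chunk_markdown(doc, "guide.md", max_words=3), indent=2))
```

Keeping the nearest heading on every chunk is one way to preserve context once a document is split. A real implementation would likely chunk the converter's structured output rather than raw Markdown text.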