Open
Labels
type: enhancement
Description
🚀 The feature
I use docTR as an OCR pre-processing step before sending the text to an LLM for data extraction. However, a lot of information is encoded in the font style: important items are often bold, red, or double underlined. Since it is already possible to combine networks, I was wondering whether you could train or add a network that estimates font styles.
Motivation, pitch
Passing style-annotated text to an LLM increases the accuracy of the task the LLM has to perform, because it gets context about what is important in a line or block.
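To illustrate, here is a minimal sketch of the kind of annotated text I would like to pass to the LLM. Everything in it is an assumption for illustration: the `StyledWord` class, its `bold`, `underline`, and `color` fields, and the `annotate_line` helper are hypothetical and, as far as I know, not part of docTR's current output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StyledWord:
    """Hypothetical OCR word carrying font-style flags (assumed, not a docTR class)."""
    text: str
    bold: bool = False
    underline: bool = False
    color: Optional[str] = None  # e.g. "red"; None means default text color


def annotate_line(words: List[StyledWord]) -> str:
    """Render one line of words as lightweight markup that an LLM can read."""
    parts = []
    for word in words:
        token = word.text
        if word.bold:
            token = f"**{token}**"
        if word.underline:
            token = f"<u>{token}</u>"
        if word.color:
            token = f'<span color="{word.color}">{token}</span>'
        parts.append(token)
    return " ".join(parts)


line = [
    StyledWord("Payment"),
    StyledWord("due:"),
    StyledWord("2024-01-31", bold=True, color="red"),
]
print(annotate_line(line))
```

With markup like this in the prompt, the LLM could see directly which words the document author emphasized instead of having to guess from plain text.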
Alternatives
Currently there is no alternative other than hoping the LLM can figure it out.
Additional context
No response