Enhancing Quality of Optical Character Recognition for Financial Document Processing – Bc. Dávid Meluš
Bc. Dávid Meluš
Master's thesis
Enhancing Quality of Optical Character Recognition for Financial Document Processing
Abstract:
Optical character recognition (OCR) plays a key role in document processing. Typical document processing begins with the OCR process itself, which affects subsequent tasks such as entity recognition or information extraction from documents. This thesis deals with specializing general-purpose OCR models to specific domains, with a focus on Czech invoices. This thesis deals with the implementation …
Abstract:
Optical Character Recognition (OCR) plays a crucial role in document processing, impacting downstream tasks like Named-Entity Recognition and Information Retrieval. This thesis investigates the fine-tuning of general-purpose OCR models to specific domains, focusing on Czech invoice documents. This thesis involves implementing and evaluating techniques within the invoice processing pipeline that comes …
Language used: English
Date on which the thesis was submitted / produced: 15. 12. 2023
Identifier:
https://is.muni.cz/th/o88qt/
Thesis defence
- Date of defence: 8. 2. 2024
- Supervisor: Mgr. Michal Štefánik
- Reader: Mgr. Jiří Polcar
Citation record
ISO 690-compliant citation record:
MELUŠ, Dávid. Enhancing Quality of Optical Character Recognition for Financial Document Processing. Online. Master's thesis. Brno: Masaryk University, Faculty of Informatics. 2023. Available from: https://theses.cz/id/avkzwl/.
Full text of thesis
- Published in Theses: publicly accessible
Institution archiving the thesis and making it accessible: Masaryk University, Faculty of Informatics
Master programme / field: Artificial intelligence and data processing / Bioinformatics and systems biology
Theses on a related topic
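- Receipt database with OCR scan (Petr Janík)
- Evolutionary and Neural Approaches in OCR Error Correction (Dung Quoc Nguyen)
- Využití OCR technologií v oblasti zrakově handicapovaných [Use of OCR technologies for the visually impaired] (David Bernard)
- Evaluation of off-the-shelf OCR technologies (Martin Tomaschek)
- Strojové zpracování faktur metodou OCR a jeho integrace do CRM systému Atollon [Machine processing of invoices using OCR and its integration into the Atollon CRM system] (Marián Čamák)
- OCR historických dokumentů [OCR of historical documents] (Martin Mejzlík)
- Sada dobrých praktik pro automatizaci testů pomocí Robot Framework a technologie OCR [A set of good practices for test automation using Robot Framework and OCR technology] (Richard Bruna)
- Prožitky účastníků OCR Gladiator Race [Experiences of OCR Gladiator Race participants] (Ester KOPECKÁ)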