Bc. Šárka Ščavnická

Master's thesis

Multimodal Document Understanding through Visual Question Answering

Abstract:
Document processing services are becoming increasingly popular across many industries, which has led to a growing body of research on the use of artificial intelligence in document processing, a field known as Document Intelligence. This thesis focuses on answering questions about documents and their visual aspects, known for short as DVQA (document visual question answering). It is a subfield …
Abstract:
Applications of document processing are becoming increasingly popular across multiple industries, resulting in a growing body of research on the use of artificial intelligence in document processing, known as Document Intelligence. This thesis focuses on Document Visual Question Answering (DVQA), a subtask of Document Intelligence that is gaining attention for its universality. …
 
 
Language used: English
Date on which the thesis was submitted / produced: 15. 12. 2023

Thesis defence

  • Date of defence: 8. 2. 2024
  • Supervisor: Mgr. Michal Štefánik
  • Reader: Edoardo Signoroni

Contents of on-line thesis archive
Published in Theses:
  • to the world (publicly accessible)
Other ways of accessing the text
Institution archiving the thesis and making it accessible: Masarykova univerzita, Fakulta informatiky