Bc. Jan Brichta

Bachelor's thesis

Corpora from reddit.com texts

Abstract:
The purpose of this thesis is to develop tools for processing data from the reddit.com website into text corpora and to demonstrate analysis of that data with Sketch Engine. As a result, 10 corpora were created from the dataset, covering the period from 2005 to 2023.
 
 
Language used: English
Date on which the thesis was submitted / produced: 23. 5. 2024

Thesis defence

  • Date of defence: 28. 6. 2024
  • Supervisor: RNDr. Vít Suchomel, Ph.D.
  • Reader: RNDr. Ondřej Herman

Published in Theses:
  • to the world
Other ways of accessing the text
Institution archiving the thesis and making it accessible: Masarykova univerzita, Fakulta informatiky

Masaryk University

Faculty of Informatics

Bachelor programme / field:
Informatics / Informatics