Exploring Semantic Homogeneity in Unlabeled Data Clustering Using Large Language Models – Bashar FARES
Bashar FARES
Master's thesis
Abstract:
This thesis investigates the topical clustering of unlabeled scientific text using several pre-trained large language models. The primary focus is on grouping the publications in the database of Deggendorf Institute of Technology (DIT) according to their main topics.
Language used: English
Date on which the thesis was submitted / produced: 8 February 2024
Thesis defence
- Supervisor: Prof. Dr. Andreas Fischer
Citation record
ISO 690-compliant citation record:
FARES, Bashar. Exploring Semantic Homogeneity in Unlabeled Data Clustering Using Large Language Models. Online. Master's thesis. České Budějovice: University of South Bohemia in České Budějovice, Faculty of Science. 2024. Available from: https://theses.cz/id/zn85fp/.
Recommended form for citing the thesis as a source:
FARES, Bashar. Exploring Semantic Homogeneity in Unlabeled Data Clustering Using Large Language Models. České Budějovice, 2024. Master's thesis (Mgr.). University of South Bohemia in České Budějovice, Faculty of Science.
Full text of thesis
Published in Theses: accessible to the public
Institution archiving the thesis and making it accessible: University of South Bohemia in České Budějovice (Jihočeská univerzita v Českých Budějovicích), Faculty of Science (Přírodovědecká fakulta)
Master programme / field: Artificial Intelligence and Data Science / Artificial Intelligence and Data Science
Theses on a related topic
- Large Language Models (LLMs): Examining the quality of generated text with task specific data (Michal Caninec)
- Large Language Models as a tool for generating high-level features for text documents (Vojtěch Balek)
- Developing a Cybersecurity Domain Chatbot based on an Open Source Large Language Model (Shahrukh Azhar AHSAN)
- Think Twice Before You Answer: Mitigating Biases of Question Answering Models (Lukáš Mikula)
Thesis files
- Posted by: Bulánová, L.
- Uploaded/Created: 9/2/2024