Developing a Cybersecurity Domain Chatbot based on an Open Source Large Language Model – Shahrukh Azhar AHSAN
Shahrukh Azhar AHSAN
Master's thesis
Developing a Cybersecurity Domain Chatbot based on an Open Source Large Language Model
Abstract:
The objective of this research is to determine the effectiveness of fine-tuned open-source LLMs in the domain of cybersecurity. Specifically, the study evaluates how effective fine-tuning is for LLMs to learn and provide accurate information about recently reported software vulnerabilities. The LLMs used in this study were Falcon-7B and Llama-2-7b-chat-hf. A custom dataset of 19,135 question-answer …
Language used: English
Date on which the thesis was submitted / produced: 20. 8. 2024
Thesis defence
- Supervisor: prof. Dr. Michael Heigl
Citation record
ISO 690-compliant citation record:
AHSAN, Shahrukh Azhar. Developing a Cybersecurity Domain Chatbot based on an Open Source Large Language Model. Online. Master's thesis. České Budějovice: University of South Bohemia in České Budějovice, Faculty of Science. 2024. Available from: https://theses.cz/id/k1oezu/.
Recommended form for citing this thesis as a source:
AHSAN, Shahrukh Azhar. Developing a Cybersecurity Domain Chatbot based on an Open Source Large Language Model. České Budějovice, 2024. Master's thesis (Mgr.). JIHOČESKÁ UNIVERZITA V ČESKÝCH BUDĚJOVICÍCH. Přírodovědecká fakulta
Full text of thesis
Contents of on-line thesis archive
Published in Theses: accessible to everyone
Other ways of accessing the text
Institution archiving the thesis and making it accessible: JIHOČESKÁ UNIVERZITA V ČESKÝCH BUDĚJOVICÍCH, Přírodovědecká fakulta
UNIVERSITY OF SOUTH BOHEMIA IN ČESKÉ BUDĚJOVICE
Faculty of Science
Master programme / field:
Artificial Intelligence and Data Science / Artificial Intelligence and Data Science
Theses on a related topic
- Practical use of natural language processing in education technology
  Dominik Hartinger
- Application of Natural language processing to enhance qualitative research used for marketing
  Poj Nuangniyom Netsiri
- Scalability of Semantic Analysis in Natural Language Processing
  Radim Řehůřek
- Exploring Semantic Homogeneity in Unlabeled Data Clustering Using Large Language Models
  Bashar FARES
- Large Language Models (LLMs): Examining the quality of generated text with task specific data
  Michal Caninec
- Large Language Models as a tool for generating high-level features for text documents
  Vojtěch Balek
- Think Twice Before You Answer: Mitigating Biases of Question Answering Models
  Lukáš Mikula
- Risk Assessment Model for Open Source Software Projects in GitHub
  Samuel Macko