bachelor's thesis
Ilija Tavchioski (Author), Marko Robnik Šikonja (Mentor), Senja Pollak (Co-mentor)

Abstract

The rapid technological advances of the past two decades have drastically affected our society's behavior, culture, and lifestyle. Social media, one of the many products of this development, have become an essential part of our lives as a tool for communication and expression. They have become a popular way for people to share information with the community, such as their thoughts and feelings on various matters in their lives. This can be observed especially in people with mental health issues, such as depression, who often prefer to express their strong feelings or ask for advice on social media. Social media posts therefore also offer a possibility for automatically detecting signs of depression. In this thesis, we present a solution to this problem using natural language processing methods on two data sets composed of posts from the social platforms Reddit and Twitter. We propose using large pre-trained language models, such as BERT, together with transfer learning between the two data sets. We further improve the results by creating ensembles of several combinations of transformer-based models pre-trained on different domains.

Keywords

natural language processing;transformers;depression detection;social networks;computer and information science;diploma thesis;

Data

Language: English
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [I. Tavchioski]
UDC: 004.8:81'322.2(043.2)
COBISS: 123092995

Other data

Secondary language: Slovenian
Secondary title: Detekcija depresije na družbenih omrežjih z metodami obdelave naravnega jezika (English: Depression detection on social networks using natural language processing methods)
Secondary abstract (translated from Slovenian): Rapid technological development over the past two decades has strongly influenced society's behavior, culture, and lifestyle. Social media, one of the results of this phenomenon, have become a key part of our lives and an important tool for communication and expression. They have become a popular way for people to share information with the community, for example their thoughts and questions about various matters in their lives. This can also be observed in people with mental health problems, such as depression, who often use social media to express their emotions or to ask for advice. Posts on social networks can also serve for the automatic detection of signs of depression. In this thesis, we present possible solutions to this task using natural language processing methods. We use two different data sets composed of posts from the social platforms Reddit and Twitter. We propose the use of large pre-trained language models, such as the BERT model, and the use of transfer learning between the data sets. We further improve the results by using ensembles of various combinations of transformer-based models pre-trained on different domains.
Secondary keywords (translated from Slovenian): transformers;depression detection;social networks;computer and information science;university study programme;diploma theses;Natural language processing (computer science);Depression (psychology);Computational linguistics;Computer science;University and higher education theses;
Type (COBISS): Bachelor thesis/paper
Study programme: 1000468
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. of Ljubljana, Faculty of Computer and Information Science
Pages: 35
ID: 16448525