Master's thesis
Boris Radovič (Author), Veljko Pejović (Mentor)

Abstract

Federated learning (FL) is a distributed machine learning paradigm in which a model is trained collectively using the data available on multiple devices, without those devices ever exposing their raw data. This concept marks a significant stride towards decentralized AI. A challenge arises, however, when the data are not independently and identically distributed (non-IID): heterogeneity among the devices' datasets can hinder training convergence and degrade the predictive quality of the model being trained. Clustering is among the many techniques recently proposed to address these difficulties. Established clustering methods require each device to possess a labelled dataset in order to be assigned to a cluster, which limits their applicability. In this thesis, we introduce a comprehensive framework and a suite of algorithms designed to cluster devices that lack a labelled dataset. Through experimentation, we demonstrate that our proposed algorithms yield results comparable to current state-of-the-art methods. An additional advantage of our approach is its ability to cluster devices that did not participate in the training stage, including devices that lack a labelled dataset or whose computational capabilities are limited.
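The federated training loop described in the abstract can be sketched as a minimal federated-averaging round: each client updates the model on its own local data, and the server aggregates only the returned weights (weighted by local dataset size), never the raw data. The toy one-parameter least-squares model and all function names below are illustrative assumptions, not the thesis's actual implementation.

```python
def local_update(w, data, lr=0.1):
    """One pass of gradient descent on a one-parameter model y = w * x.

    Each (x, y) pair contributes the squared-error gradient 2 * x * (w * x - y).
    """
    for x, y in data:
        w -= lr * 2 * x * (w * x - y)
    return w

def fedavg_round(global_w, client_datasets):
    """One federated round: every client trains locally on its own data,
    then the server averages the returned weights, weighted by dataset size.
    No raw (x, y) pairs ever reach the server."""
    updates = [local_update(global_w, data) for data in client_datasets]
    sizes = [len(data) for data in client_datasets]
    return sum(w * n for w, n in zip(updates, sizes)) / sum(sizes)

# Two clients whose local data both follow y = 2x; repeated rounds
# move the shared global weight towards 2.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(0.5, 1.0), (1.5, 3.0)]]
w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients)
print(round(w, 2))  # converges to 2.0
```

Under non-IID data, the two clients' optima would differ, and the averaged model would sit between them; this is exactly the degradation that clustering the clients into more homogeneous groups aims to mitigate.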

Keywords

machine learning;deep learning;federated learning;computer science;master's thesis;

Data

Language: English
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [B. Radovič]
UDC: 004.8(043.2)
COBISS: 164990467

Other data

Secondary language: Slovenian
Secondary title: Gručenje odjemalcev za izboljšanje zveznega učenja na heterogenih podatkih
Secondary abstract: Federated learning (FL) is an approach in which a set of devices collaborates to train a machine learning model. The participating devices do not exchange raw data, so the process preserves the security and privacy of users' data. Among the problems FL currently faces is the training of models when the data on the participating devices are distributed unevenly. In the presence of data heterogeneity, the predictive quality of the trained model deteriorates, and in the worst cases the model may even diverge. Clustering of devices is among the established approaches that attempt to mitigate the negative consequences of data heterogeneity. Current clustering methods in FL require the devices to have a labelled dataset, and this assumption limits the applicability of such approaches. In this master's thesis we therefore present a comprehensive framework and a set of algorithms that enable the clustering of devices that lack a labelled dataset. The experiments conducted in the thesis show that the proposed algorithms give results comparable to those achieved by established clustering methods in FL. Compared to existing methods, the developed algorithms additionally enable the clustering of devices that did not participate in training due to a lack of labelled data or due to limited computational capabilities.
Secondary keywords: federated learning;master's theses;Deep learning (machine learning);Computer science;University and higher-education theses;
Type (COBISS): Master's thesis/paper
Study programme: 1000471
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: X, 87 pp.
ID: 19888176