master's thesis
Jakob Šalej (Author), Mira Trebar (Mentor)

Abstract

The Internet of Things (IoT) consists of resource-constrained devices and sensors connected to the network. These devices send large amounts of data to cloud servers, where it is stored and processed. Relying on the cloud has disadvantages, namely high network usage, privacy concerns and slower data processing. These downsides can be mitigated with a new paradigm: edge computing. The new category of smart AIoT (Artificial Intelligence + IoT) devices is capable of learning from and responding to new scenarios instantly by using machine learning (ML) methods locally, on-device. The goal of this master's thesis is to analyse the possibilities of local data processing on resource-constrained edge devices. An experimental analysis of the classification and runtime performance of ML algorithms is provided. Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Artificial Neural Network (ANN) are evaluated on the DS2OS traffic traces dataset in the anomaly detection domain. Additionally, Stochastic Gradient Descent (SGD) is used for incremental learning. Algorithm performance is measured on two devices: a Raspberry Pi 4 Model B (RPi 4) serves as a reference edge device, while an XPS 13 laptop provides a performance baseline. Two implementations, mode Ds and mode D, differ in how the data is split into training and test sets. Two methods for reducing dataset size are presented: randomly sampled smaller datasets and datasets with a reduced majority class. Classification results on the initial and reduced training sets show that SVM, DT and RF perform best. Performance analysis shows that DT achieves the fastest training and inference times on the RPi 4. By using datasets with a reduced majority class, the RPi 4 is able to match the runtime performance of the XPS 13 with only a small decrease in classification accuracy.
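The idea of a dataset with a reduced majority class can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `reduce_majority_class`, the `keep_fraction` parameter, and the toy labels "normal" and "anomaly" are all assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def reduce_majority_class(rows, labels, keep_fraction, seed=0):
    """Keep a random fraction of the most frequent class and all other samples.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]

    # Indices of every sample that belongs to the majority class.
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]

    # Number of majority samples to keep (at least one).
    n_keep = max(1, round(majority_count * keep_fraction))

    # A seeded generator makes the sampling reproducible.
    rng = random.Random(seed)
    kept_majority = set(rng.sample(majority_idx, n_keep))

    # Keep minority samples unconditionally, majority samples only if sampled.
    selected = [
        i for i, y in enumerate(labels)
        if y != majority_label or i in kept_majority
    ]
    return [rows[i] for i in selected], [labels[i] for i in selected]


# Toy data (assumed): 90 "normal" samples and 10 "anomaly" samples.
labels = ["normal"] * 90 + ["anomaly"] * 10
rows = [[i] for i in range(100)]

small_rows, small_labels = reduce_majority_class(rows, labels, keep_fraction=0.2)
print(Counter(small_labels))
```

With fewer majority-class rows the training set shrinks considerably while every minority (anomaly) sample is preserved, which is the kind of trade-off between training time and accuracy that the abstract describes.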

Keywords

Internet of Things;machine learning;sensor data;edge computing;anomaly detection;computer science;computer and information science;master's thesis;

Data

Language: English
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [J. Šalej]
UDC: 004.738.5:004.8(043.2)
COBISS: 87405571
Views: 218
Downloads: 37

Other data

Secondary language: Slovenian
Secondary title: AIoT in robna obdelava senzorskih podatkov
Secondary abstract: Internet stvari (IoT) sestavljajo naprave in senzorji z omejenimi viri. Imajo dostop do omrežja in pošiljajo velike količine podatkov v strežnike v oblaku, kjer se podatki shranjujejo in obdelujejo. Zanašanje na oblačne storitve ima svoje pomanjkljivosti: obremenjenost omrežja, pomanjkljiva zasebnost in počasnejša obdelava podatkov. Nova paradigma robne obdelave podatkov lahko te težave odpravi. Kategorija pametnih naprav AIoT (AI + IoT) je z lokalno uporabo metod strojnega učenja na sami napravi zmožna učenja in prilagajanja novim situacijam. Glavni cilj magistrske naloge je analiza zmožnosti lokalne obdelave podatkov na robnih napravah z omejenimi viri. Izvedena je eksperimentalna analiza klasifikacijske točnosti in časovne učinkovitosti izvajanja algoritmov strojnega učenja. Metode strojnega učenja LR (ang. Logistic Regression), SVM (ang. Support Vector Machine), DT (ang. Decision Tree), RF (ang. Random Forest) in ANN (ang. Artificial Neural Network) so primerjane na področju detekcije anomalij z uporabo podatkovne zbirke DS2OS. V analizo je vključena tudi metoda SGD (ang. Stochastic Gradient Descent) v domeni inkrementalnega učenja. Uporabljeni sta dve testni napravi: Raspberry Pi 4 (model B) in prenosni računalnik XPS 13. Implementirana sta dva načina za določanje učnih in testnih množic, poimenovana način Ds in način D. Predstavljena sta tudi dva načina za določanje manjših učnih množic: naključno izbrane manjše učne množice in učne množice z zmanjšanim večinskim razredom. Rezultati klasifikacije na začetnih učnih množicah in zmanjšanih učnih množicah pokažejo, da se pri razvrščanju najbolje odrežejo metode SVM, DT in RF. Analiza hitrosti izvajanja pokaže, da metoda DT doseže najhitrejše čase učenja in predikcije na napravi RPi 4. Z določanjem učnih množic z zmanjšanim večinskim razredom se delovanje RPi 4 pohitri na nivo naprave XPS 13, ob tem pa pride le do manjše izgube klasifikacijske točnosti.
Secondary keywords: senzorski podatki;robna obdelava;detekcija anomalij;računalništvo in informatika;magisteriji;Internet stvari;Strojno učenje;Računalništvo;Univerzitetna in visokošolska dela;
Type (COBISS): Master's thesis/paper
Study programme: 1000471
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: X, 102 pp.
ID: 14010128