Master's thesis
Jan Pavlin (Author), Matjaž Kukar (Mentor)

Abstract

In the last decade, machine learning models have come into practical use in virtually every area of our lives. They are present as recommender systems, predictive models, risk assessment systems, and decision support systems. Even fields where decisions must be made by experts (a medical doctor in healthcare, a business analyst in economics) increasingly rely on machine learning models when making decisions. In such fields it is crucial that the expert or decision maker understands well how the model arrived at its prediction. So-called interpretable models present an opportunity here: they achieve somewhat lower predictive accuracy, but the user can see from the model's prediction which attributes contributed how much to its decision, so no additional explanation is needed. In this work we focus on the interpretable machine learning model RiskSLIM. The result of learning is a simple scoring system in which the user sums points and obtains the probability of class membership for a given example. We investigated how different characteristics of the data affect the model's training time and its performance, and we checked whether RiskSLIM can find complex dependencies among attributes and how well it performs on datasets that are not monotone. The algorithm is limited to binary datasets, which are very rare in practice. Because attributes with continuous numeric values must be discretized into binary ones, we explored several discretization approaches and compared them with respect to the predictive accuracy of the resulting models. As a predictive-accuracy baseline for RiskSLIM we used the widely adopted XGBoost algorithm, currently one of the better methods for learning on tabular data. We also tried to extend the algorithm so that it could predict multiclass problems; to this end we implemented several extensions of the model and compared them, considering both each extension's predictive accuracy and its interpretability. In the final part we examined the CPLEX library, which the algorithm uses during training to solve a mixed-integer nonlinear program. Because the library is commercial, we tried to replace it with an open-source alternative: we identified the CPLEX functionalities that the model uses and the sections of code where the two interact, and within those sections we checked whether alternative libraries offer the same functionality and could therefore take its place. We conclude that RiskSLIM is an interpretable alternative to well-known machine learning methods. The price of good interpretability is lower predictive accuracy, longer training time (on data with more attributes), and the restriction to two-class datasets. We addressed the restriction to binary learning attributes with discretization and binarization of the data, and we tried to lift the restriction to a two-class target in several ways. The supervised discretization introduced by Fayyad and Irani proved to be the best discretization, and among the multiclass extensions the one based on a decision directed acyclic graph performed best. When replacing CPLEX we found that none of the tested alternatives provides all the functionality RiskSLIM uses, so the replacement is not possible without more substantial changes to the program's code.
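To make the scoring-system idea concrete, a RiskSLIM-style risk score has the following general form (a sketch of the model family; the admissible point range is a tunable constraint rather than a value fixed by the thesis):

$$\hat{p}(y = 1 \mid x) \;=\; \frac{1}{1 + \exp\bigl(-(\lambda_0 + \sum_{j=1}^{d} \lambda_j x_j)\bigr)}, \qquad \lambda_j \in \mathbb{Z}, \ \text{e.g. } \lambda_j \in \{-5, \dots, 5\},$$

where each $x_j \in \{0, 1\}$ is a binary attribute and $\lambda_j$ is its integer point value. The user adds up the points of the attributes that are present and reads off the risk; for a hypothetical scorer with intercept $\lambda_0 = -3$ and two active attributes worth 2 and 1 point, the total score is 0 and the predicted risk is $1/(1 + e^{0}) = 0.5$.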
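Because RiskSLIM accepts only binary attributes, every continuous attribute must first be cut into intervals and expanded into 0/1 indicator columns. Below is a minimal preprocessing sketch in Python that uses unsupervised quantile binning from scikit-learn as a stand-in for the supervised Fayyad-Irani discretization evaluated in the thesis; the attribute name and parameters are illustrative only.

import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# A continuous attribute (e.g. age), 200 examples drawn for illustration.
rng = np.random.default_rng(0)
age = rng.uniform(20, 80, size=(200, 1))

# Cut into 4 quantile bins and one-hot encode them, yielding four binary
# indicator columns that a binary-only learner such as RiskSLIM can use.
binner = KBinsDiscretizer(n_bins=4, encode="onehot-dense", strategy="quantile")
age_binary = binner.fit_transform(age)   # shape (200, 4), values in {0, 1}

print(binner.bin_edges_[0])              # the learned cut points
print(age_binary[:3])

A supervised method such as the Fayyad-Irani MDL discretization would instead place the cut points where they best separate the two target classes.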
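The multiclass extension that performed best in the thesis arranges pairwise binary models into a decision directed acyclic graph (DDAG). The sketch below illustrates that idea with scikit-learn's LogisticRegression standing in for a trained RiskSLIM scorer; the function names and structure are illustrative, not the thesis implementation.

from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_pairwise(X, y):
    # Train one binary model per pair of classes (a, b), with a < b.
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = np.isin(y, [a, b])
        models[(a, b)] = LogisticRegression().fit(X[mask], y[mask] == b)
    return models

def predict_ddag(models, classes, x):
    # Walk the DAG: each pairwise model eliminates one candidate class.
    remaining = sorted(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if models[(a, b)].predict(x.reshape(1, -1))[0]:
            remaining.pop(0)     # the model voted for b, so drop a
        else:
            remaining.pop()      # the model voted for a, so drop b
    return remaining[0]

# Usage: models = fit_pairwise(X, y); label = predict_ddag(models, np.unique(y), X[0])

For k classes this requires k*(k-1)/2 binary models, but only k-1 of them are evaluated per prediction, and each node of the graph remains as interpretable as a single two-class scorer.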

Keywords

interpretable learning;RiskSLIM;scoring models;master's theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [J. Pavlin]
UDC: 004.8(043.2)
COBISS: 138636291

Other data

Secondary language: English
Secondary title: Evaluation and generalization of the interpretable machine learning approach based on RiskSLIM
Secondary abstract: In the last decade, machine learning models have become increasingly used in all areas of our lives. They are present in the form of recommendation systems, predictive models, risk assessment systems, and decision-support systems. Even in areas where decisions are made by field experts (doctors in medicine or business analysts in economics), machine learning models are increasingly used. In such areas, however, it is crucial that the expert or decision maker has a good understanding of how the model arrived at its prediction. Interpretable machine learning models present a big opportunity here: they achieve lower predictive accuracy, but users can identify which attributes were used and how they affect the prediction, so no additional explanation of the model is required. In this work, we evaluated the performance of the interpretable machine learning model RiskSLIM. The result of its learning is a simple scoring system in which the user adds up points from each attribute to obtain a risk estimate. We explored how different characteristics of a dataset affect the learning time and other aspects of the model's performance. Additionally, we examined whether RiskSLIM is capable of identifying complex relations between learning attributes and how it performs on non-monotone datasets. The algorithm is restricted to binary datasets, which are very rare in the real world. Since all numerical attributes have to be discretized and binarized to obtain binary attributes, we studied several types of discretization, implemented them, and compared their performance. To benchmark the predictive accuracy of RiskSLIM, we used the well-known XGBoost algorithm, currently one of the best methods for machine learning on tabular data. We also tried to extend the algorithm so that it could predict multiclass problems: we devised several extensions, implemented them, and compared them with one another, considering both predictive accuracy and interpretability. In the last part, we analyzed the CPLEX library that RiskSLIM uses to solve a mixed-integer nonlinear program. Since the library requires a paid licence (except for academic purposes), we tried to replace it with an open-source one: we identified the CPLEX functions that RiskSLIM uses and checked whether open-source libraries offer the same functionality at those points of interaction. We conclude that RiskSLIM is an interpretable alternative to more popular machine learning approaches. The trade-offs for more interpretable predictions are lower predictive accuracy, longer learning time (especially when the dataset has more than 20 attributes), and the limitation to binary datasets. To bypass this limitation, we implemented several discretization methods and tested several ways of extending the algorithm to multiclass problems. The best discretization technique turned out to be the supervised method of Fayyad and Irani, and the best multiclass extension was the one based on a decision directed acyclic graph (DDAG). When trying to replace CPLEX, we found that none of the alternatives supports all the functionality RiskSLIM requires, so replacing it is not possible without more extensive changes to the RiskSLIM code.
Secondary keywords: machine learning;interpretable machine learning;RiskSLIM;scoring systems;computer science;computer and information science;master's degree;university and higher education theses;
Type (COBISS): Master's thesis/paper
Study programme: 1000471
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: 85
ID: 17351928