magistrsko delo
Abstract
Algoritmi za odkrivanje enačb, ki uporabljajo verjetnostne gramatike, delujejo tako, da najprej vzorčijo strukture izrazov iz gramatike in nato na podlagi teh poiščejo enačbe, ki se najbolj prilegajo vhodnim podatkom. Strukture izrazov vzorčijo na podlagi verjetnosti, ki jih določa verjetnostna gramatika. Problem, ki ga srečamo pri tem je, da želimo tvoriti samo končne strukture in želimo imeti ustrezno verjetnostno porazdelitev na množici vseh možnih končnih struktur izrazov, ki jih tvori gramatika. Na srečo lahko v ta namen na verjetnostne gramatike gledamo kot na večtipske procese razvejanja. Za te obstaja izrek, ki pod določenimi pogoji pove, kdaj lahko ustrezno porazdelitev definiramo in kdaj ne. Poleg tega v magistrskem delu razvijem empirično okolje, ki omogoča uporabo omenjenih algoritmov za odkrivanje enačb v celoštevilskih zaporedjih iz Spletne enciklopedije celoštevilskih zaporedij (OEIS). Uporabo okolja ilustriram na odkrivanju enačb za štirinajst izbranih zaporedij iz OEIS.
Keywords
odkrivanje enačb;simbolna regresija;strojno učenje;verjetnostne kontekstno-neodvisne gramatike;večtipski procesi razvejanja;
Data
Language: |
Slovenian |
Year of publishing: |
2021 |
Typology: |
2.09 - Master's Thesis |
Organization: |
UL FMF - Faculty of Mathematics and Physics |
Publisher: |
[B. Gec] |
UDC: |
519.21:004 |
COBISS: |
79059971
|
Views: |
1223 |
Downloads: |
100 |
Average score: |
0 (0 votes) |
Metadata: |
|
Other data
Secondary language: |
English |
Secondary title: |
Equation discovery for integer sequences with probabilistic grammars |
Secondary abstract: |
Equation discovery algorithms that are based on probabilistic grammars sample arithmetic expressions from the grammar that are then fitted to the input data, to become equations that describe that data. The arithmetical expressions are generated according to the probabilities encoded in the probabilistic grammar. The problem we encounter in this approach is that we consider only finite expressions and we try to define the corresponding probabilistic distribution on the space of the candidate finite expressions. Fortunately, probabilistic grammars can be seen as multitype branching processes. I present and partly prove a theorem that holds for multitype branching processes that tells us whether the grammar properly define the corresponding distribution or not. Furthermore, in this master thesis I design an empirical framework for applying the aforementioned algorithm to the task of discovery of equations that hold for integer sequences from The On-Line Encyclopedia of Integer Sequences (OEIS). I illustrate the use of the framework on discovery of equations for fourteen selected sequences from OEIS. |
Secondary keywords: |
equation discovery;symbolic regression;machine learning;probabilistic context-free grammars;multitype branching processes; |
Type (COBISS): |
Master's thesis/paper |
Study programme: |
0 |
Thesis comment: |
Univ. v Ljubljani, Fak. za matematiko in fiziko, Oddelek za matematiko, Matematika - 2. stopnja |
Pages: |
VII, 52 str. |
ID: |
13600429 |