doctoral dissertation
Emil Polajnar (Author), Aleš Žiberna (Mentor), Mihael Perman (Co-mentor)

Abstract

Canonical correlation methods form a family of statistical methods for analysing the association between two sets of variables. The standard solution procedure is based on solving an eigenvalue problem. A canonical solution consists of a pair of canonical variables and the corresponding canonical correlation. Canonical solutions are mutually uncorrelated and are ordered by decreasing canonical correlation. Classical canonical correlation analysis examines the linear association between two sets of variables, while extensions of the basic method allow other kinds of analysis as well. We discuss two extensions in detail: canonical correlation analysis with non-negativity constraints and kernel canonical correlation analysis with non-negativity constraints. The former allows the analysis of linear association and the latter the analysis of non-linear association. The standard procedure for solving both non-negativity-constrained problems is limited to computing the first canonical solution. Because the procedure has exponential time complexity, even problems with a few dozen variables are practically unsolvable in reasonable time. This doctoral dissertation presents an alternative approach based on the method of alternating least squares and on regularization. Together, these make it possible to find the first and the subsequent canonical solutions in reasonable time for problems with tens of thousands of variables. The proposed alternative approach for solving problems with non-negativity constraints was written down as algorithms, which we implemented in the Python programming language. We also successfully tested the proposed algorithms on data from the international TIMSS study and on cross-language information retrieval.
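
As a rough illustration of the idea named in the abstract (alternating least squares combined with regularization under non-negativity constraints), the following minimal Python sketch computes only a first canonical pair. It is not the dissertation's algorithm: the function names `nonneg_ridge` and `nonneg_cca_first_pair`, the ridge penalty `lam`, the coordinate-descent inner solver, and the fixed iteration counts are all assumptions chosen for this example.

```python
import numpy as np

def nonneg_ridge(A, t, lam, sweeps=50):
    """Coordinate descent for min_w ||A w - t||^2 + lam * ||w||^2 with w >= 0.

    Hypothetical helper for this sketch, not taken from the dissertation.
    """
    G = A.T @ A + lam * np.eye(A.shape[1])  # regularized Gram matrix
    c = A.T @ t
    w = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for j in range(len(w)):
            # Unconstrained optimum for coordinate j with the others fixed,
            # then clipped at zero to enforce non-negativity.
            r = c[j] - G[j] @ w + G[j, j] * w[j]
            w[j] = max(0.0, r / G[j, j])
    return w

def _unit(x):
    """Scale x to unit Euclidean norm; fail loudly if x is the zero vector."""
    n = np.linalg.norm(x)
    if n == 0.0:
        raise ValueError("no non-negative direction with positive correlation found")
    return x / n

def nonneg_cca_first_pair(X, Y, lam=1e-3, iters=50, seed=0):
    """Alternate two non-negative ridge regressions to get one pair of weights.

    Assumed setup: rows are observations, columns are variables.
    Returns weight vectors a, b (both >= 0) and the correlation of X a and Y b.
    """
    X = X - X.mean(axis=0)  # center both sets of variables
    Y = Y - Y.mean(axis=0)
    b = np.random.default_rng(seed).random(Y.shape[1])  # non-negative start
    for _ in range(iters):
        a = nonneg_ridge(X, _unit(Y @ b), lam)  # fix b, update a
        b = nonneg_ridge(Y, _unit(X @ a), lam)  # fix a, update b
    # The columns are centered, so the cosine of the two canonical
    # variables equals their Pearson correlation.
    rho = float(_unit(X @ a) @ _unit(Y @ b))
    return a, b, rho
```

Calling `nonneg_cca_first_pair(X, Y)` on two data matrices with the same number of rows returns the two weight vectors and their correlation. Subsequent canonical solutions and the kernel variant mentioned in the abstract are outside the scope of this sketch.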

Keywords

kernel methods;restricted canonical correlation analysis;cross-language information retrieval;alternating least-squares method;regularization;Social science research;Statistical methods;Methodology;Doctoral dissertations;

Data

Language: Slovenian
Year of publishing:
Typology: 2.08 - Doctoral Dissertation
Organization: UL FDV - Faculty of Social Sciences
Publisher: [E. Polajnar]
UDC: 519.22:303(043.3)
COBISS: 29041411

Other data

Secondary language: English
Secondary title: Regularization of restricted canonical correlation analysis
Secondary abstract: Canonical correlation methods are a family of statistical methods for analysing the correlation between two sets of variables. The standard technique for solving canonical correlation analysis problems is based on an eigenvalue problem. A canonical solution consists of a pair of canonical variables and the corresponding canonical correlation. The first pair of canonical variables has the largest canonical correlation, the second pair has the second largest, and so on. The original canonical correlation analysis was developed to examine linear relationships between two sets of variables. To increase the flexibility of the original method, several extensions of canonical correlation analysis have been proposed. Two extensions are discussed in some detail: restricted canonical correlation analysis and restricted kernel canonical correlation analysis. The former examines linear relationships and the latter non-linear relationships. The standard technique for solving the two restricted problems is limited to the first pair of canonical variables. The search process has exponential time complexity, so even problems with a few dozen variables cannot be solved in feasible time. In this doctoral dissertation we propose an alternative technique for solving the two restricted problems, based on alternating least squares and regularization. Combining the two, we were able to solve the two restricted problems with tens of thousands of variables in feasible time. The proposed alternative technique was implemented as several algorithms in Python. The algorithms were successfully applied to the analysis of TIMSS international assessment data and to the problem of cross-language information retrieval.
Secondary keywords: alternating least-squares;cross-language information retrieval;kernel methods;restricted canonical correlation analysis;regularization;Social science research;Statistical methods;Methodology;Doctoral dissertations;
Type (COBISS): Doctoral dissertation
Thesis comment: University of Ljubljana, Faculty of Social Sciences
Pages: 167 pp.
ID: 12035039