Number of hits: 33
Video and other learning materials
Tags: humanities;linguistics;lexicography;social sciences;society;computer science
With the rise of digital media in recent decades, many language-related discussions have found a home on various fora and social media such as Facebook, where users can participate in a shared-interest group to discuss language use, problems and resources. The posts in these groups are formulated b ...
Year: 2018 Source: videolectures.net
Video and other learning materials
Tags: humanities;linguistics
Automatic collocation extraction relies primarily on computing statistical co-occurrences of words in a text corpus, but not all candidates extracted this way are suitable. To define what counts as a legitimate statistical collocation on the one hand and a lexicographically relevant collocation on the other, we prepared a training ...
Year: 2018 Source: videolectures.net
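As a rough illustration of what "statistical co-occurrence" can mean for collocation candidates, the sketch below scores adjacent word pairs with pointwise mutual information (PMI). The abstract does not name the measure actually used, so PMI, the function name `bigram_pmi`, and the toy token list are all assumptions made for this example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper: PMI is one common co-occurrence measure,
    not necessarily the one used in the work described above.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy input, invented for illustration only.
scores = bigram_pmi("a b a b c".split())
```

A high score only marks a pair as a statistical candidate; whether it is relevant for a dictionary is a separate judgement, which is the distinction the abstract draws.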
Video and other learning materials
Tags: humanities;linguistics
Year: 2018 Source: videolectures.net
Research data
Tags: computer-mediated communication;tokenisation;word normalisation;tagging;lemmatisation;manual annotation;TEI
Janes-Tag is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is meant as a gold-standard training and testing dataset for tokenisation, sentence segmentation, word normalisation, morphosyntactic tagging and lemmatisation of non-standard Slovene. As the corpus has bee ...
Year: 2016 Source: CLARIN.si
Research data
Tags: computer-mediated communication;tokenisation;word normalisation;tagging;lemmatisation;manual annotation;TEI
Janes-Tag is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is meant as a gold-standard training and testing dataset for tokenisation, sentence segmentation, word normalisation, morphosyntactic tagging and lemmatisation of non-standard Slovene. As the corpus has bee ...
Year: 2016 Source: CLARIN.si
Research data
Tags: computer-mediated communication;tokenisation;word normalisation;manual annotation;TEI
Janes-Norm is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is meant as a gold-standard training and testing dataset for tokenisation, sentence segmentation and word normalisation of non-standard Slovene. The corpus is also automatically annotated with morphosyntac ...
Year: 2016 Source: CLARIN.si
Research data
Tags: computer-mediated communication;tokenisation;word normalisation;manual annotation;TEI
Janes-Norm is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is meant as a gold-standard training and testing dataset for tokenisation, sentence segmentation and word normalisation of non-standard Slovene. As the corpus has been carefully manually annotated, it is a ...
Year: 2016 Source: CLARIN.si
Research data
Tags: spoken corpus;frequency list;n-grams;characters
Frequency lists of character-level n-grams were extracted from the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040) using the LIST corpus extraction tool (http://hdl.handle.net/11356/1227). The lists contain 1-5-gram combinations of characters occurring in the corpus along with th ...
Year: 2019 Source: CLARIN.si
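To make the idea of a character-level n-gram frequency list concrete, here is a minimal sketch that counts 1- to 5-character sequences in a string and reports absolute and relative frequencies. It is not the LIST tool: the function name `char_ngram_frequencies`, the choice to count spaces as ordinary characters, and the exact output columns are assumptions for illustration.

```python
from collections import Counter


def char_ngram_frequencies(text, n_min=1, n_max=5):
    """Return {n: [(ngram, absolute, relative), ...]}, most frequent first.

    Hypothetical helper; every character, including whitespace,
    is treated as part of an n-gram.
    """
    result = {}
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the text.
        counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
        total = sum(counts.values())
        result[n] = [(gram, c, c / total) for gram, c in counts.most_common()]
    return result


# Toy input, invented for illustration only.
lists = char_ngram_frequencies("banana")
```

Relative frequency here is the share of an n-gram among all n-grams of the same length, which is one plausible normalisation among several.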
Research data
Tags: frequency list;spoken corpus;words;lemmas;normalized forms
Frequency lists of words were extracted from the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040) using the LIST corpus extraction tool (http://hdl.handle.net/11356/1227). The lists contain all words occurring in the corpus along with their absolute and relative frequencies, perce ...
Year: 2019 Source: CLARIN.si
Research data
Tags: spoken corpus;word parts;initial part of the word;final part of the word;morphology
Frequency lists of words split into word parts were extracted from the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040) using the LIST corpus extraction tool (http://hdl.handle.net/11356/1227). The lists contain all lemmas, lower-case word forms or normalized word forms occurring ...
Year: 2019 Source: CLARIN.si