Abstract
This article presents KOST, a written learner corpus of Slovenian as a foreign language, with an emphasis on its position among existing corpora of this type built for other target languages. Given the similar sociolinguistic position of the target language, KOST can be compared with just under a tenth of the more than 190 learner corpora. With its design, its current size of almost 835,000 words, its partially annotated language errors, and free access to the data, KOST is fully comparable to these corpora and, as such, a useful resource for a wide range of research.
Keywords
Slovenian;Slovenian as a second language;learner corpus;error classification;language corpora;KOST;
Data
Language: |
Slovenian |
Year of publishing: |
2022 |
Typology: |
1.16 - Independent Scientific Component Part or a Chapter in a Monograph |
Organization: |
UL FF - Faculty of Arts |
UDC: |
811.163.6'243'322 |
COBISS: |
131032579 |
Other data
Secondary language: |
English |
Secondary abstract: |
This article presents the written Slovenian learner corpus KOST, focusing on its position among other learner corpora for other target languages. In terms of the sociolinguistic position of the target language, KOST can be compared with approximately one-tenth of more than 190 learner corpora. With its design, current size of almost 835,000 words, partially tagged language errors, and free access to data, KOST is fully comparable to these corpora and thus a useful resource for various forms of language research. |
Secondary keywords: |
Slovene;Slovene as a second language;learner corpus;error classification;language corpora;KOST; |
Pages: |
pp. 323-334 |
DOI: |
10.4312/Obdobja.41.323-334 |
ID: |
19833947 |