Number of hits: 1
Research data
Tags: web corpus
The Finnish web corpus fiWaC was built by crawling the .fi top-level domain in 2015 for both Finnish and English documents. The corpus was naively tokenised (split on spaces), near-deduplicated at the paragraph level, and paragraph-shuffled. Each paragraph contains metadata on the URL and language identificat ...
Year: 2016 Source: CLARIN.si
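
The three processing steps named in the description (space-based tokenisation, paragraph-level near-deduplication, paragraph shuffling) can be illustrated with a minimal sketch. This is not the fiWaC pipeline itself: the record does not say which near-duplicate method was used, so the normalised-hash fingerprint below is a stand-in assumption, and the names `tokenise`, `fingerprint`, `build_corpus`, and the `seed` parameter are hypothetical.

```python
import hashlib
import random
import re


def tokenise(paragraph: str) -> list[str]:
    """Naive tokenisation: split on whitespace only, punctuation stays attached."""
    return paragraph.split()


def fingerprint(paragraph: str) -> str:
    """Assumed near-duplicate key: lowercase the text and collapse every run of
    punctuation, digits, and whitespace into one space before hashing.
    A stand-in for whatever method the corpus builders actually used."""
    normalised = re.sub(r"[\W\d_]+", " ", paragraph.lower()).strip()
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def build_corpus(paragraphs: list[str], seed: int = 0) -> list[str]:
    """Drop paragraphs whose fingerprint was already seen, tokenise the rest,
    then shuffle paragraph order with a fixed seed for reproducibility."""
    seen: set[str] = set()
    kept: list[str] = []
    for paragraph in paragraphs:
        key = fingerprint(paragraph)
        if key in seen:
            continue  # near-duplicate of an earlier paragraph
        seen.add(key)
        kept.append(" ".join(tokenise(paragraph)))
    random.Random(seed).shuffle(kept)
    return kept
```

Shuffling at the paragraph level means the original documents cannot be reassembled from the corpus, which is why per-paragraph metadata such as the URL matters for anyone who needs to trace a paragraph back to its source.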