Number of hits: 5
Video and other learning materials
Tags: computer science;text mining;cryptography and security
Ghostwriting has become students' most popular way to avoid writing boring essays, or the best way to earn money easily by writing on behalf of another student. This paper presents several markers indicating the presence of potential ghostwriters. The proposed methodology suggests various inspection techniques, ...
Year: 2011 Source: videolectures.net
Video and other learning materials
Tags: computer science;data mining;artificial intelligence
One of the crucial challenges of statistical machine translation is the lexical consistency of manually translated words and multiword expressions (MWEs) with multiple occurrences in the source language. In this paper, we present the degree of translation inconsistency and we introduce the ...
Year: 2014 Source: videolectures.net
Research data
Tags: lemmatisation;language model
The model for lemmatisation of standard Macedonian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the 1984 training corpus (to be published). The estimated F1 of the lemma annotations is ~99.1.
Year: 2020 Source: CLARIN.si
Research data
Tags: lemmatisation;inflection;part-of-speech tagging;multilingual
The MULTEXT-East morphosyntactic lexicons have a simple structure, where each line is a lexical entry with three tab-separated fields: (1) the word-form, the inflected form of the word; (2) the lemma, the base-form of the word; (3) the MSD, the morphosyntactic description of the word-form, i.e., its ...
Year: 2010 Source: CLARIN.si
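The three-field, tab-separated layout described in the entry above could be read with a few lines of Python. This is only a minimal sketch: the names `LexEntry` and `parse_lexicon` are hypothetical, and the two sample lines are made up for illustration rather than taken from the actual lexicons.

```python
from typing import Iterable, Iterator, NamedTuple


class LexEntry(NamedTuple):
    """One lexical entry: word-form, lemma, and MSD (hypothetical type name)."""
    word_form: str
    lemma: str
    msd: str


def parse_lexicon(lines: Iterable[str]) -> Iterator[LexEntry]:
    """Yield one LexEntry per non-empty line of three tab-separated fields."""
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 tab-separated fields, got {len(fields)}"
            )
        yield LexEntry(*fields)


# Made-up sample lines, for illustration only (not real lexicon content).
sample = ["walked\twalk\tVmis\n", "houses\thouse\tNcnp\n"]
entries = list(parse_lexicon(sample))
```

To read a real lexicon, the open file object can be passed in place of `sample`, since the function accepts any iterable of lines.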
Research data
Tags: parallel corpus;part-of-speech tagging;multilingual;Slavic languages;manual annotation;TEI
The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel, sentence-aligned corpus contains the novel in the English original (about 100,000 words in length) and its translations into a number of languages. This version of the corpus contains the li ...
Year: 2010 Source: CLARIN.si