Abstract

This paper describes TextProc, a natural language processing framework. It first describes the framework's software architecture, which consists of several parts, each covered in detail. Natural language processing capabilities are implemented as software plug-ins, and plug-ins can be combined into processes that perform a practical natural language processing function. Several practical TextProc processes are briefly described, including part-of-speech tagging and named entity tagging, among others. One of these processes performs plagiarism detection on texts in the Slovenian language and is explained in detail. It is used in the digital library of the University of Maribor, and the integration of the digital library with TextProc is also briefly described. The paper closes with some ideas for future development.

Keywords

natural language processing; text processing; text mining; Slovenian language; plagiarism detection

Details

Language: English
Year of publication:
Typology: 1.01 - Original scientific article
Organization: UM FERI - Fakulteta za elektrotehniko, računalništvo in informatiko (Faculty of Electrical Engineering and Computer Science)
UDC: 004.777
COBISS: 14856982
ISSN: 2074-1316
Views: 2098
Downloads: 68

Other data

Secondary language: English
Secondary keywords: procesiranje naravnih jezikov; tekstovno procesiranje; detekcija plagiatov; slovenski jezik
URN: URN:SI:UM:
Pages: 293-300
Volume: 5
Issue: 3
Publication date: 2011
ID: 8718519