Data collection and parallel corpus compilation for machine translation of subtitles

Abstract

This paper describes the data collection and parallel corpus compilation activities carried out in the FP7 EU-funded SUMAT project. The project aims to develop an online subtitle translation service for nine European languages combined into 14 different language pairs. The collected corpora provide bilingual and monolingual training data for statistical machine translation engines, which will semi-automate the subtitle translation processes of subtitling companies on a large scale.

Keywords

parallel multilingual corpora;statistical machine translation;subtitle translation service;

Details

Language: English
Year of publication:
Typology: 1.08 - Published Scientific Conference Contribution
Organization: UM FERI - Faculty of Electrical Engineering and Computer Science
UDC: 004.8
COBISS: 16027926
Views: 1420
Downloads: 53

Other data

Secondary language: Unknown
URN: URN:SI:UM:
Type of work (COBISS): Not categorized
Pages: pp. 21-28
Keywords (UDC): science and knowledge;organization;computer science;information;documentation;librarianship;institutions;publications;prolegomena;fundamentals of knowledge and culture;propaedeutics;computer science and technology;computing;data processing;artificial intelligence;
ID: 1439062