Towards Automatic Subtitling: Assessing the Quality of Old and New Resources

Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial a...

Bibliographic Details
Main Authors: Alina Karakanta, Matteo Negri, Marco Turchi
Format: Article
Language: English
Published: Accademia University Press 2020-06-01
Series: IJCoL
Online Access: https://journals.openedition.org/ijcol/649
collection DOAJ
description Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial and temporal constraints, which greatly increase the post-processing effort required to restore the NMT output to a proper subtitle format. In our previous work (Karakanta, Negri, and Turchi 2019), we identified several missing elements in the corpora available for training NMT systems specifically tailored for subtitling. In this work, we compare the previously studied corpora with MuST-Cinema, a corpus enabling end-to-end speech to subtitles translation, in terms of the conformity to the constraints of: 1) length and reading speed; and 2) proper line breaks. We show that MuST-Cinema conforms to these constraints and discuss the recent progress the corpus has facilitated in end-to-end speech to subtitles translation.
id doaj-art-6b10c8ee0b0d491f8bfd87846813f31f
institution Kabale University
issn 2499-4553
spelling doaj-art-6b10c8ee0b0d491f8bfd87846813f31f; 2024-12-09T13:21:52Z; eng; Accademia University Press; IJCoL; ISSN 2499-4553; 2020-06-01; vol. 6, no. 1, pp. 63-76; DOI 10.4000/ijcol.649; https://journals.openedition.org/ijcol/649