Towards Automatic Subtitling: Assessing the Quality of Old and New Resources
Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial a...
| Main Authors: | Alina Karakanta, Matteo Negri, Marco Turchi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Accademia University Press, 2020-06-01 |
| Series: | IJCoL |
| Online Access: | https://journals.openedition.org/ijcol/649 |
| _version_ | 1846134098552160256 |
|---|---|
| author | Alina Karakanta, Matteo Negri, Marco Turchi |
| collection | DOAJ |
| description | Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial and temporal constraints, which greatly increase the post-processing effort required to restore the NMT output to a proper subtitle format. In our previous work (Karakanta, Negri, and Turchi 2019), we identified several missing elements in the corpora available for training NMT systems specifically tailored for subtitling. In this work, we compare the previously studied corpora with MuST-Cinema, a corpus enabling end-to-end speech to subtitles translation, in terms of the conformity to the constraints of: 1) length and reading speed; and 2) proper line breaks. We show that MuST-Cinema conforms to these constraints and discuss the recent progress the corpus has facilitated in end-to-end speech to subtitles translation. |
| format | Article |
| id | doaj-art-6b10c8ee0b0d491f8bfd87846813f31f |
| institution | Kabale University |
| issn | 2499-4553 |
| language | English |
| publishDate | 2020-06-01 |
| publisher | Accademia University Press |
| record_format | Article |
| series | IJCoL |
| spelling | doaj-art-6b10c8ee0b0d491f8bfd87846813f31f; 2024-12-09T13:21:52Z; eng; Accademia University Press; IJCoL; ISSN 2499-4553; 2020-06-01; vol. 6, no. 1, pp. 63-76; DOI 10.4000/ijcol.649 |
| title | Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
| url | https://journals.openedition.org/ijcol/649 |