De très grands corpus pour l’étude diachronique du français : annotations, informations métalinguistiques et paratextes

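Corpus linguistics and very large equipped corpora have been used for several decades for the diachronic study of the French language. They have made it possible to refine our knowledge of its evolution, and to bring to light phenomena that had not been studied until then. However, their constitution and their search functionalities suffer from certain blind spots originating in the annotation procedure they used and the way they consider philological, paleographic and paratextual information. This contribution reviews these difficulties and makes several suggestions for the development of procedures of encoding and analysis.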

Bibliographic Details
Main Author: Mathieu Goux
Format: Article
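Language: fra
Published: Humanistica 2024-06-01
Series: Humanités Numériques
Subjects: encoding; natural language processing; corpus building; annotation; linguistics and language sciences
Online Access: https://journals.openedition.org/revuehn/3930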
_version_ 1841549893094604800
author Mathieu Goux
collection DOAJ
description Corpus linguistics and very large equipped corpora have been used for several decades for the diachronic study of the French language. They have made it possible to refine our knowledge of its evolution, and to bring to light phenomena that had not been studied until then. However, their constitution and their search functionalities suffer from certain blind spots originating in the annotation procedure they used and the way they consider philological, paleographic and paratextual information. This contribution reviews these difficulties and makes several suggestions for the development of procedures of encoding and analysis.
format Article
id doaj-art-e32e292443894e8eb39d59f1d797fa7b
institution Kabale University
issn 2736-2337
language fra
publishDate 2024-06-01
publisher Humanistica
record_format Article
series Humanités Numériques
spelling doaj-art-e32e292443894e8eb39d59f1d797fa7b | 2025-01-10T12:52:19Z | fra | Humanistica | Humanités Numériques | 2736-2337 | 2024-06-01 | 9 | 10.4000/11wmv
title De très grands corpus pour l’étude diachronique du français : annotations, informations métalinguistiques et paratextes
topic encoding
natural language processing
corpus building
annotation
linguistics and language sciences
url https://journals.openedition.org/revuehn/3930