Preprocessing Greek Papyri for Linguistic Annotation

Vierros, M K & Henriksson, E I 2017, 'Preprocessing Greek Papyri for Linguistic Annotation', Journal of Data Mining and Digital Humanities, vol. Special Issue on Computer-Aided Processing of Intertextuality in Ancient Languages, no. June 8 2017.

Title: Preprocessing Greek Papyri for Linguistic Annotation
Author: Vierros, Marja Kaisa; Henriksson, Erik Ilmari
Contributor organization: Department of World Cultures 2010-2017
Date: 2017-06-08
Language: eng
Number of pages: 15
Belongs to series: Journal of Data Mining and Digital Humanities
ISSN: 2416-5999
Abstract: Greek documentary papyri form an important direct source for Ancient Greek. It has been exploited surprisingly little in Greek linguistics due to a lack of good tools for searching linguistic structures. This article presents a new tool and digital platform, “Sematia”, which enables transforming the digital texts available in TEI EpiDoc XML format to a format which can be morphologically and syntactically annotated (treebanked), and where the user can add new metadata concerning the text type, writer and handwriting of each act of writing. An important aspect in this process is to take into account the original surviving writing vs. the standardization of language and supplements made by the editors. This is performed by creating two different layers of the same text. The platform is in its early development phase. Ongoing and future developments, such as tagging linguistic variation phenomena as well as queries performed within Sematia, are discussed at the end of the article.
Subject: 6121 Languages; linguistic annotation; TEI EpiDoc XML
Peer reviewed: Yes
Rights: cc_by
Usage restriction: openAccess
Self-archived version: publishedVersion
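
The two-layer idea described in the abstract (the original surviving writing versus the editors' standardized and supplemented text) can be illustrated with a minimal Python sketch. This is not Sematia's actual implementation, which this record does not describe. It assumes the common EpiDoc encoding in which editors wrap a regularization in `<choice><reg>…</reg><orig>…</orig></choice>` and mark restored text with `<supplied>`. The function names `collect` and `two_layers`, the `SKIP` table, and the sample fragment are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: which elements each layer leaves out. The "original" layer
# drops editorial regularizations and supplements; the "standard" layer
# drops the original spelling where a regularization exists.
SKIP = {
    "original": {"reg", "supplied"},
    "standard": {"orig"},
}


def collect(elem, layer):
    """Return the text inside elem for the given layer (tail excluded)."""
    local_name = elem.tag.split("}")[-1]  # strip the XML namespace, if any
    if local_name in SKIP[layer]:
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(collect(child, layer))
        # Tail text follows the child and belongs to the parent, so it is
        # kept even when the child itself is skipped.
        parts.append(child.tail or "")
    return "".join(parts)


def two_layers(xml_string):
    """Build both layers of one EpiDoc fragment, whitespace-normalized."""
    root = ET.fromstring(xml_string)
    return {layer: " ".join(collect(root, layer).split()) for layer in SKIP}


# Hypothetical fragment, not taken from a real papyrus edition.
SAMPLE = (
    '<ab xmlns="http://www.tei-c.org/ns/1.0">'
    'χαίρειν <choice><reg>ὑμῖν</reg><orig>ἡμῖν</orig></choice>'
    ' <supplied reason="lost">καὶ</supplied> ἔρρωσο'
    '</ab>'
)

if __name__ == "__main__":
    for name, text in two_layers(SAMPLE).items():
        print(name, "->", text)
```

In this sketch the original layer simply omits supplied text; a real pipeline would more likely record where the gap is so that later annotation knows something is missing.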

Files in this item

pdf.pdf (PDF, 650.7 Kb)
