Corpona – The Pythonic Way of Processing Corpora

Title: Corpona – The Pythonic Way of Processing Corpora
Author: Alnajjar, Khalid; Hämäläinen, Mika
Date: 2021
URI: http://hdl.handle.net/10138/327863
Abstract: Every NLP researcher has to work with different XML- or JSON-encoded files, which often means writing code that serves one very specific purpose. Corpona is meant to streamline any workflow that involves XML- and JSON-based corpora by offering easy, reusable functionality. The current functionality covers easy parsing of and access to XML files, easy access to sub-items in a nested JSON structure, and visualization of complex data structures. Corpona is fully open source and is available on GitHub and Zenodo.
Subject: XML data; corpus processing; open source
Rights: CC BY 4.0
https://creativecommons.org/licenses/by/4.0/deed.fi
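
The kind of task the abstract describes, reaching a sub-item deep inside a nested JSON structure and pulling values out of an XML file, can be sketched with the Python standard library alone. This is a minimal illustration and not Corpona's API: the helper `get_nested`, its slash-separated path syntax, and the sample corpus data are all hypothetical and were made up for this example.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any


def get_nested(data: Any, path: str, default: Any = None) -> Any:
    """Hypothetical helper: follow a slash-separated path through dicts and lists."""
    current = data
    for key in path.split("/"):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdecimal() and int(key) < len(current):
            current = current[int(key)]
        else:
            # Any missing key or out-of-range index falls back to the default.
            return default
    return current


# Made-up JSON corpus: one document, one sentence, two tokens.
corpus = json.loads('{"doc": {"sentences": [{"tokens": ["Hello", "world"]}]}}')
second_token = get_nested(corpus, "doc/sentences/0/tokens/1")

# Made-up XML corpus: collect the lemma attribute of every <w> element.
xml_root = ET.fromstring('<corpus><s id="1"><w lemma="be">is</w></s></corpus>')
lemmas = [w.get("lemma") for w in xml_root.iter("w")]
```

Returning a default instead of raising keeps lookups short when corpus entries are irregular, which is the sort of one-off code the abstract says researchers end up rewriting.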


Files in this item

Files Size Format
3_Alnajjar_Hama ... tilingual_Facilitation.pdf 208.1Kb PDF

Except where otherwise noted, this item's license is described as CC BY 4.0.