Corpona – The Pythonic Way of Processing Corpora

dc.contributor.author Alnajjar, Khalid
dc.contributor.author Hämäläinen, Mika
dc.date.accessioned 2021-03-11T15:16:00Z
dc.date.available 2021-03-11T15:16:00Z
dc.date.issued 2021
dc.identifier.uri http://hdl.handle.net/10138/327863
dc.description.abstract Every NLP researcher has to work with different XML- or JSON-encoded files. This often involves writing code that serves a very specific purpose. Corpona is meant to streamline any workflow that involves XML- and JSON-based corpora by offering easy and reusable functionalities. The current functionalities relate to easy parsing of and access to XML files, easy access to sub-items in a nested JSON structure, and visualization of a complex data structure. Corpona is fully open-source and is available on GitHub and Zenodo.
dc.language.iso en
dc.rights CC BY 4.0
dc.rights.uri https://creativecommons.org/licenses/by/4.0/deed.fi
dc.subject XML data
dc.subject corpus processing
dc.subject open source
dc.title Corpona – The Pythonic Way of Processing Corpora
dc.type Book Article
dc.identifier.doi https://doi.org/10.31885/9789515150257.3
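
The abstract mentions access to sub-items in a nested JSON structure and easy access to XML content. The sketch below illustrates the kind of task being described, using only the Python standard library. It is not Corpona's API: the helper name get_nested and both sample inputs are hypothetical and were invented for this illustration.

```python
import json
import xml.etree.ElementTree as ET


def get_nested(data, path, default=None):
    """Follow a sequence of keys/indices into nested dicts and lists.

    Hypothetical helper for illustration; not part of Corpona.
    Returns `default` as soon as any step of the path is missing.
    """
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Hypothetical JSON corpus entry, invented for this example.
record = json.loads('{"doc": {"sentences": [{"tokens": ["Hello", "world"]}]}}')
first_token = get_nested(record, ["doc", "sentences", 0, "tokens", 0])
missing = get_nested(record, ["doc", "missing"], default="n/a")

# Hypothetical XML corpus fragment, invented for this example.
root = ET.fromstring('<corpus><s id="1"><w lemma="be">is</w></s></corpus>')
lemmas = [w.get("lemma") for w in root.iter("w")]
```

Writing such helpers by hand for each corpus is the single-purpose code the abstract refers to; a reusable library is meant to replace it.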

Files in this item

Files Size Format View
3_Alnajjar_Hama ... tilingual_Facilitation.pdf 208.1Kb PDF View/Open

Except where otherwise noted, this item's license is described as CC BY 4.0