Fast in-memory XPath search using compressed indexes

Show simple item record Arroyuelo, Diego Claude, Francisco Maneth, Sebastian Mäkinen, Veli Navarro, Gonzalo Nguyen, Kim Sirén, Jouni Välimäki, Niko 2010-11-25T15:35:00Z 2010-11-25T15:35:00Z 2010
dc.identifier.citation Arroyuelo , D , Claude , F , Maneth , S , Mäkinen , V , Navarro , G , Nguyen , K , Sirén , J & Välimäki , N 2010 , Fast in-memory XPath search using compressed indexes . in ICDE 2010 : 26th IEEE International Conference on Data Engineering . IEEE Computer Society , pp. 417-428 , International Conference on Data Engineering , Long Beach, California , United States , 01/03/2010 .
dc.identifier.citation conference
dc.identifier.other PURE: 2115734
dc.identifier.other PURE UUID: b29f1534-3830-4cc4-9516-f69aeaf9ca3a
dc.identifier.other WOS: 000286933100045
dc.identifier.other Scopus: 77952795119
dc.identifier.other ORCID: /0000-0003-4454-1493/work/28882766
dc.description.abstract A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can be efficiently implemented using a compressed self-index of the document's text nodes. Most queries, however, contain some parts querying the text of the document, plus some parts querying the tree structure. It is therefore a challenge to choose an appropriate evaluation order for a given query, which optimally leverages the execution speeds of the text and tree indexes. Here the SXSI system is introduced. It stores the tree structure of an XML document using a bit array of opening and closing brackets plus a sequence of labels, and stores the text nodes of the document using a global compressed self-index. On top of these indexes sits an XPath query engine that is based on tree automata. The engine uses fast counting queries of the text index in order to dynamically determine whether to evaluate top-down or bottom-up with respect to the tree structure. The resulting system has several advantages over existing systems: (1) on pure tree queries (without text search) such as the XPathMark queries, the SXSI system performs on par or better than the fastest known systems MonetDB and Qizx, (2) on queries that use text search, SXSI outperforms the existing systems by 1-3 orders of magnitude (depending on the size of the result set), and (3) with respect to memory consumption, SXSI outperforms all other systems for counting-only queries. en
dc.format.extent 12
dc.language.iso eng
dc.publisher IEEE Computer Society
dc.relation.ispartof ICDE 2010
dc.relation.isversionof 978-1-4244-5446-4
dc.rights.uri info:eu-repo/semantics/openAccess
dc.subject 113 Computer and information sciences
dc.title Fast in-memory XPath search using compressed indexes en
dc.type Conference contribution
dc.contributor.organization Department of Computer Science
dc.contributor.organization Helsinki Graduate School in Computer Science and Engineering (Hecse)
dc.contributor.organization Genome-scale Algorithmics research group / Veli Mäkinen
dc.contributor.organization Bioinformatics
dc.description.reviewstatus Peer reviewed
dc.rights.accesslevel openAccess
dc.type.version acceptedVersion

Files in this item

Total number of downloads: Loading...

Files Size Format View
icde2010.pdf 293.8Kb PDF View/Open

This item appears in the following Collection(s)

Show simple item record