Browsing by Subject "dependency syntax"

Sort by: Order: Results:

Now showing items 1-3 of 3
  • Yli-Jyrä, Anssi Mikael (The Association for Computational Linguistics, 2017)
    A recently proposed encoding for non- crossing digraphs can be used to imple- ment generic inference over families of these digraphs and to carry out first-order factored dependency parsing. It is now shown that the recent proposal can be substantially streamlined without information loss. The improved encoding is less dependent on hierarchical processing and it gives rise to a high-coverage bounded-depth approximation of the space of non- crossing digraphs. This subset is presented elegantly by a finite-state machine that recognises an infinite set of encoded graphs. The set includes more than 99.99% of the 0.6 million noncrossing graphs obtained from the UDv2 treebanks through planarisation. Rather than taking the low probability of the residual as a flat rate, it can be modelled with a joint probability distribution that is factorised into two underlying stochastic processes – the sentence length distribution and the related conditional distribution for deep nesting. This model points out that deep nesting in the streamlined code requires extreme sentence lengths. High depth is categorically out in common sentence lengths but emerges slowly at infrequent lengths that prompt further inquiry.
  • Yli-Jyrä, Anssi Mikael (Springer-Verlag, 2012)
    Arc contractions in syntactic dependency graphs can be used to decide which graphs are trees. The paper observes that these contractions can be expressed with weighted finite-state transducers (weighted FST) that operate on string-encoded trees. The observation gives rise to a finite-state parsing algorithm that computes the parse forest and extracts the best parses from it. The algorithm is customizable to functional and bilexical dependency parsing, and it can be extended to non-projective parsing via a multi-planar encoding with prior results on high recall. Our experiments support an analysis of projective parsing according to which the worst-case time complexity of the algorithm is quadratic to the sentence length, and linear to the overlapping arcs and the number of functional categories of the arcs. The results suggest several interesting directions towards efficient and highprecision dependency parsing that takes advantage of the flexibility and the demonstrated ambiguity-packing capacity of such a parser.
  • Yli-Jyrä, Anssi (The Association for Computational Linguistics, 2019)
    Proceedings of the International Conference on Finite-State Methods and Natural Language Processing
    Transition-based parsing of natural language uses transition systems to build directed annotation graphs (digraphs) for sentences. In this paper, we define, for an arbitrary ordered digraph, a unique decomposition and a corresponding linear encoding that are associated bijectively with each other via a new transition system. These results give us an efficient and succinct representation for digraphs and sets of digraphs. Based on the system and our analysis of its syntactic properties, we give structural bounds under which the set of encoded digraphs is restricted and becomes a context-free or a regular string language. The context-free restriction is essentially a superset of the encodings used previously to characterise properties of noncrossing digraphs and to solve maximal subgraphs problems. The regular restriction with a tight bound is shown to capture the Universal Dependencies v2.4 treebanks in linguistics.