This document summarizes research on improving transition-based dependency parsing using a bootstrapping technique. It presents the parsing algorithm, experimental setup evaluating models on English and Czech data, and results showing the bootstrapping approach achieves near state-of-the-art performance while maintaining linear parsing time complexity. It also introduces an open-source parser implementation called ClearParser.
5. Bohnet: the best graph-based system for CoNLL’09 (the overall rank is given in parentheses).
7. ‘Nivre’ indicates Nivre’s swap algorithm, which showed expected linear-time complexity for non-projective parsing (Nivre, 2009); we used the implementation from MaltParser.
9. L is a dependency label, and i, j, k are indices of their corresponding word tokens.
10. The initial state is ([0], [ ], [1, …, n], E); 0 corresponds to the root node.
11. The final state is (λ1, λ2, [ ], E); the algorithm terminates when all tokens in β are consumed.
12. Left-Pop_L and Left-Arc_L are performed when wj is the head of wi with a dependency label L. : Left-Pop removes wi from λ1, assuming that the token is no longer needed. : Left-Arc keeps wi so it can be the head of some token wk (j < k ≤ n) in β.
14. Shift is performed when: : DT – λ1 is empty. : NT – there is no token in λ1 that is either the head or a dependent of wj.
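The state and the transitions described above can be sketched in Python as follows. This is a minimal illustration, not ClearParser's actual code: the names `State`, `initial_state`, `left_pop`, `left_arc`, and `shift` are hypothetical. Moving wi into λ2 on Left-Arc and merging λ2 back into λ1 on Shift follow the usual list-based formulation and are assumptions here, since the slides do not spell them out. The sketch covers only the mechanics of each transition, not the decision of which one to apply.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    lambda1: list  # processed tokens that may still take part in arcs
    lambda2: list  # tokens set aside while w_j is being processed
    beta: list     # unprocessed tokens; beta[0] is w_j
    arcs: set = field(default_factory=set)  # E, as (head, label, dependent)


def initial_state(n):
    """Initial state ([0], [], [1, ..., n], E); 0 is the root node."""
    return State([0], [], list(range(1, n + 1)))


def is_final(state):
    """The parse terminates once every token in beta is consumed."""
    return not state.beta


def left_pop(state, label):
    """w_j becomes the head of w_i; w_i is dropped from lambda1 for good."""
    i = state.lambda1.pop()
    j = state.beta[0]
    state.arcs.add((j, label, i))


def left_arc(state, label):
    """w_j becomes the head of w_i; w_i is kept (assumption: moved to
    lambda2) so it can still head a later token in beta."""
    i = state.lambda1.pop()
    j = state.beta[0]
    state.arcs.add((j, label, i))
    state.lambda2.insert(0, i)


def shift(state):
    """Assumption: lambda2 is merged back into lambda1, then w_j is
    moved from beta onto the end of lambda1."""
    state.lambda1.extend(state.lambda2)
    state.lambda2.clear()
    state.lambda1.append(state.beta.pop(0))
```

The only difference between the two left transitions in this sketch is whether wi remains reachable afterwards, which is what keeps the search space small when Left-Pop applies.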
17. Parse history can be used as features. : Parsing complexity is still preserved. Can non-projective dependency parsing be any faster?
18. # of non-projective dependencies <<< # of projective dependencies. : Perform projective parsing for most cases and non-projective parsing only when it is needed.
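To make the projective/non-projective distinction concrete, the sketch below counts non-projective arcs in a single tree. It uses the standard definition: an arc from head h to dependent d is projective when every token strictly between them is a descendant of h. The function names and the `heads` encoding (`heads[d]` is the head of token d, index 0 is the root and its entry is unused) are assumptions made for illustration, and the input is assumed to be a well-formed tree with no cycles.

```python
def is_projective_arc(heads, d):
    """Return True if the arc heads[d] -> d is projective."""
    h = heads[d]
    lo, hi = min(h, d), max(h, d)
    for k in range(lo + 1, hi):
        # Walk up from k until we reach h or the root (index 0).
        a = k
        while a != h and a != 0:
            a = heads[a]
        if a != h:
            return False  # k lies inside the arc but is not under h
    return True


def count_non_projective(heads):
    """Count the non-projective arcs in a tree over tokens 1..n."""
    return sum(
        1 for d in range(1, len(heads)) if not is_projective_arc(heads, d)
    )
```

Running such a count over a treebank is one way to check how rare non-projective dependencies are in a given language before choosing a parsing strategy.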
19. Choi and Nicolov, 2009. : Added a non-deterministic Shift transition to Nivre’s list-based non-projective algorithm. : Reduced the search space. : Achieved linear-time parsing speed in practice.
28. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
29. Stop the procedure when the parsing accuracy of the current cross-validation is lower than the one from the previous iteration.
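The stopping rule above can be sketched as a small loop. This is an illustration under stated assumptions rather than the authors' implementation: `train_step` and `cv_accuracy` are hypothetical callables standing in for one bootstrapping iteration and one cross-validation run, `max_iters` is an added safety cap, and returning the model from the last iteration before the drop is an assumption about what "stop" should yield.

```python
def bootstrap(train_step, cv_accuracy, initial_model, max_iters=20):
    """Repeat bootstrapping until cross-validation accuracy drops.

    train_step(model) -> new model       (hypothetical callable)
    cv_accuracy(model) -> float          (hypothetical callable)
    Returns (best_model, accuracy_history).
    """
    best_model = initial_model
    prev_acc = cv_accuracy(initial_model)
    history = [prev_acc]
    model = initial_model
    for _ in range(max_iters):
        model = train_step(model)
        acc = cv_accuracy(model)
        history.append(acc)
        if acc < prev_acc:
            # Current iteration is worse than the previous one: stop and
            # keep the previous model (assumption).
            break
        best_model, prev_acc = model, acc
    return best_model, history
```

Note that the rule compares each iteration only with its immediate predecessor, so the loop stops at the first drop even if a later iteration might have recovered.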