The coordination task
The aim of this task is to improve the accuracy of the C&C parser on coordination constructions.
Bibliography
A bibliography about coordination disambiguation can be found here.
Examples of coordination
An early corpus can be accessed from this page. It is limited to sentences of 30 words or fewer that contain 'and' or 'or', and was obtained from 5000 pages of Wikipedia data. Each CCG parse is collapsed onto one line, followed by the original sentence. There is also a version showing coordinations only, in which I have started marking the sentences whose 'and'/'or' parse is incorrect.
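The selection criterion for the corpus can be sketched roughly as follows. This is only an illustration, not the script actually used: the function name is_candidate, whitespace tokenisation and the treatment of punctuation are all assumptions.

```python
import re

MAX_WORDS = 30                 # length limit stated above
COORDINATORS = {"and", "or"}   # coordinators of interest


def is_candidate(sentence, max_words=MAX_WORDS):
    """Return True if the sentence is short enough and contains 'and' or 'or'.

    Assumption: words are counted by splitting on whitespace; the real
    corpus may have used a proper tokeniser.
    """
    if len(sentence.split()) > max_words:
        return False
    # Compare whole lower-cased alphabetic words, so that "sand" or
    # "order" do not count as coordinators.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(words & COORDINATORS)
```

For instance, is_candidate("Cats and dogs sleep.") holds, while a sentence such as "Sandy sleeps." is rejected because 'and' only occurs inside another word.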
Work so far
A feature-based implementation has been produced. The features considered are:
- similarity of coordinates according to a WordNet edge-counting measure
- similarity of n-grams to which the coordinates belong (based on parts of speech)
- distance of the second coordinate to the coordination
- distance between the two coordinates
- original rank
This implementation improves F-score by roughly three points, both on Wikipedia data and on DepBank data.
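To make the feature list more concrete, here is a minimal sketch of how such features might be computed for one candidate coordination. It is not the actual implementation: the toy hypernym table stands in for WordNet, coordinates are represented by the index of a single head word, distances are measured in tokens, the part-of-speech n-gram is a window of one tag on either side, and all names (TOY_HYPERNYMS, edge_count_similarity, pos_ngram_similarity, extract_features) are hypothetical.

```python
from collections import deque

# Hypothetical toy taxonomy (child -> parent); stands in for WordNet.
TOY_HYPERNYMS = {
    "dog": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal",
    "mammal": "animal", "car": "vehicle",
}


def edge_count_similarity(a, b, hypernyms=TOY_HYPERNYMS):
    """Edge-counting similarity: 1 / (1 + shortest path length), 0.0 if unconnected."""
    if a == b:
        return 1.0
    graph = {}
    for child, parent in hypernyms.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    seen, queue = {a}, deque([(a, 0)])
    while queue:  # breadth-first search finds the shortest path
        node, dist = queue.popleft()
        if node == b:
            return 1.0 / (1 + dist)
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return 0.0


def pos_ngram_similarity(tags, i, j, width=1):
    """Fraction of matching tags in the POS windows around positions i and j."""
    def window(k):
        return [tags[p] if 0 <= p < len(tags) else "<pad>"
                for p in range(k - width, k + width + 1)]
    left, right = window(i), window(j)
    return sum(x == y for x, y in zip(left, right)) / len(left)


def extract_features(tokens, tags, first, conj, second, rank):
    """Build the feature dictionary for one candidate analysis.

    first/second: head-word indices of the two coordinates (assumption);
    conj: index of 'and'/'or'; rank: the parser's original rank.
    """
    return {
        "wordnet_similarity": edge_count_similarity(tokens[first], tokens[second]),
        "pos_ngram_similarity": pos_ngram_similarity(tags, first, second),
        "second_to_coordination": abs(second - conj),
        "coordinate_distance": abs(second - first),
        "original_rank": rank,
    }


tokens = ["the", "dog", "and", "the", "cat", "slept"]
tags = ["DT", "NN", "CC", "DT", "NN", "VBD"]
features = extract_features(tokens, tags, first=1, conj=2, second=4, rank=0)
```

In this example 'dog' and 'cat' are four edges apart in the toy taxonomy, so the similarity is 1/5; a reranker would combine such values with the original rank to choose among candidate parses.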