Frequently Asked Questions
It takes an awfully long time to parse one simple sentence. Why?
The parser has to load its model before it can parse anything, and loading the model is very time-consuming. There are two ways around this: give the parser a file containing many sentences, so that the model is loaded only once, or run the parser in server mode.
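As a rough illustration of the first option, the sketch below collects sentences into a single input file, one per line. The function name write_batch and the file name are hypothetical and not part of the parser distribution; how the file is then passed to the parser depends on your installation.

```python
from pathlib import Path


def write_batch(sentences, path):
    """Write non-empty sentences to `path`, one per line.

    Every line, including the last, is terminated by a newline.
    Returns the number of sentences written.
    (Hypothetical helper, not part of the parser distribution.)
    """
    lines = [s.strip() for s in sentences if s.strip()]
    text = "".join(line + "\n" for line in lines)
    Path(path).write_text(text, encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    # Sentences are assumed to be tokenised already.
    count = write_batch(["The cat sat on the mat .", "Dogs bark ."], "batch.txt")
    print(f"wrote {count} sentences to batch.txt")
```

Parsing batch.txt in one run means the cost of loading the model is paid once for the whole file instead of once per sentence.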
I don't get a parse for this simple sentence which is in the input file. What's wrong?
Check whether the input file has a newline after the sentence you want to parse. The parser expects one sentence per line, with every line, including the last, terminated by a newline.
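A minimal sketch of that check, assuming the input is an ordinary file on disk: it appends a newline only when the last byte is not already one. The function name ensure_trailing_newline is hypothetical.

```python
def ensure_trailing_newline(path):
    """Append a newline to `path` if its last byte is not one.

    Returns True if the file was modified, False otherwise.
    (Hypothetical helper, not part of the parser distribution.)
    """
    with open(path, "rb+") as f:
        f.seek(0, 2)            # jump to the end of the file
        if f.tell() == 0:
            return False        # empty file: nothing to terminate
        f.seek(-1, 2)           # step back to the last byte
        if f.read(1) == b"\n":
            return False        # already terminated
        f.seek(0, 2)            # reposition at the end before writing
        f.write(b"\n")
        return True
```

Running it on an input file before parsing is harmless: a file that already ends with a newline is left untouched.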
I get the following error when trying to run bin/parser:
parser:could not open model configuration file for reading:../models/parser//config
This is a bug in version 1.00 of the models. To fix it, create an empty configuration file in the parser model directory:
touch models/parser/config
Can the parser output more than one analysis?
No. The current version of the parser is only able to output the highest scoring analysis. This may change in future versions.
Does the parser do any tokenisation or sentence boundary detection?
Currently no. The input fed to the parser must therefore consist of one sentence per line, tokenised according to the Penn Treebank standard (details of which are on the Penn Treebank website).
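As a quick sanity check before parsing, the sketch below flags tokens that still have a comma, semicolon, colon, question mark or exclamation mark attached, which Penn Treebank tokenisation would split off. This is only a rough heuristic under that assumption, not a tokeniser: it deliberately ignores full stops (abbreviations such as "Mr." keep theirs) and does not cover clitics or quotes. The name suspicious_tokens is hypothetical.

```python
import re

# A word character immediately followed by token-final punctuation that
# Penn Treebank tokenisation would normally separate into its own token.
ATTACHED = re.compile(r"\w[,;:?!]$")


def suspicious_tokens(line):
    """Return the whitespace-separated tokens that look untokenised.

    (Hypothetical heuristic helper, not part of the parser distribution.)
    """
    return [tok for tok in line.split() if ATTACHED.search(tok)]


if __name__ == "__main__":
    for sentence in ["Hello, world !", "Hello , world !"]:
        print(repr(sentence), suspicious_tokens(sentence))
```

An empty result does not prove that a line is correctly tokenised; a non-empty one is a strong hint that the line needs tokenising before it is given to the parser.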
Can the parser produce more than one output format (e.g. CCG dependencies and Grammatical Relations) at the same time?
Currently no.
Where can I find out what the various CCG categories mean?
The CCG pages contain links to a number of publications on CCG, including tutorial papers: http://groups.inf.ed.ac.uk/ccg/publications.html
What do the various features on the S categories mean, such as [dcl]?
Look at Julia Hockenmaier's thesis, available from the above web page.