Changes between Version 9 and Version 10 of Biomedical

Author:
laura (IP: 128.232.10.86)
Timestamp:
09/16/09 00:06:15 (2 months ago)
Comment:

--

  • Biomedical

    v9 v10  
    3131 
    3232Supertagger model trained on CCGbank (Wall Street Journal) plus ten copies of 1000 !MedLine sentences (the first 1000 of GENIA) manually tagged with CCG lexical categories. 
     33 
     34To use this model with the parser or the candc executable: 
     35 
     36{{{ 
     37% bin/parser --super models/pos_genia 
     38}}} 
     39 
     40{{{ 
     41% bin/candc --super models/pos_genia 
     42}}} 
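
The two invocations above differ only in the executable name, so a small wrapper can check that the model directory exists before anything is launched. The following is a minimal sketch, not part of the C&C distribution: the function name `super_cmd` is hypothetical, and it assumes the `bin/` and `models/pos_genia` paths shown above are relative to the current directory. It only prints the command line and does not run the tools.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the C&C tools): print the command
# line for a given executable, refusing if the model directory is missing.
super_cmd() {
    tool="$1"                           # "parser" or "candc", as in the commands above
    model_dir="${2:-models/pos_genia}"  # default model path taken from the text above
    if [ ! -d "$model_dir" ]; then
        echo "model directory not found: $model_dir" >&2
        return 1
    fi
    # Only the --super option shown above is used; no other flags are assumed.
    echo "bin/$tool --super $model_dir"
}
```

Calling `super_cmd candc` from the installation directory prints the second command above, or an error on standard error if `models/pos_genia` is not present.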
     43 
     44Note that there is no separate parser model for biomedical parsing: the parser was adapted by retraining the POS tagger and supertagger only, taking advantage of the lexicalized nature of CCG parsing. However, to increase coverage, it is recommended that you use the settings (DESCRIBE). 
     45 
     46=== Markedup File === 
     47 
     48Use the most up-to-date markedup file: the biomedical pipeline uses a small number of CCG categories that are not present in CCGbank and hence were not in version 1.00 of the markedup file. 
     49 
     50Moreover, you may want the version of the markedup file that produces grammatical relations (GRs) in Stanford Dependency Format, though this isn't necessary unless your application particularly requires Stanford dependencies. 
     51 
     52Use the markedup file with this command: 
     53 
     54If you're using the Stanford version, it's particularly important that you use the post-processing script. 
     55 
     56=== Post-Processing Script === 
     57 
     58There is a post-processing script for DepBank-style dependencies. 
     59 
     60However, the post-processing script for Stanford dependencies is even more important.