Apertium

Read an input stream and store NPs and anaphora

Write a program that reads an input stream in Apertium stream format and stores the noun phrases (NPs) and pronominal anaphora it finds. The program should keep a window of variable size, measured in sentences.
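A minimal sketch of the reading step, in Python, might look like the following. It assumes lexical units of the form `^surface/lemma<tag1><tag2>$` (or `^lemma<tag1><tag2>$`), takes only the first analysis, treats a unit carrying the `<sent>` tag as the end of a sentence, and ignores escaped characters. The function names (`parse_lu`, `sentences`, `windowed`) are hypothetical and not part of Apertium.

```python
import re
from collections import deque

# One lexical unit: the text between ^ and $ (escapes are ignored in this sketch).
LU_RE = re.compile(r"\^(.*?)\$", re.DOTALL)
TAG_RE = re.compile(r"<([^<>]+)>")


def parse_lu(body):
    """Split 'surface/lemma<t1><t2>' or 'lemma<t1><t2>' into (lemma, tags)."""
    parts = body.split("/")
    # When a surface form is present, the first analysis is the second field.
    analysis = parts[1] if len(parts) > 1 else parts[0]
    lemma = analysis.split("<", 1)[0]
    tags = TAG_RE.findall(analysis)
    return lemma, tags


def sentences(stream_text):
    """Yield sentences as lists of (lemma, tags), ending at a <sent> unit."""
    current = []
    for match in LU_RE.finditer(stream_text):
        lemma, tags = parse_lu(match.group(1))
        current.append((lemma, tags))
        if "sent" in tags:  # assumption: <sent> marks a sentence boundary
            yield current
            current = []
    if current:  # trailing material without a final <sent>
        yield current


def windowed(stream_text, size):
    """Yield the last `size` sentences after each new sentence is read."""
    window = deque(maxlen=size)  # old sentences fall out automatically
    for sentence in sentences(stream_text):
        window.append(sentence)
        yield list(window)
```

A `deque` with `maxlen` is used here so that the window size can be changed with a single parameter; any other bounded buffer would do.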

The program should read an XML file describing the structure of NPs. It should look something like:

<chunker>
<section-def-cats>
<def-cat n="det">
  <cat-item tags="det.*"/>
</def-cat>
<def-cat n="adj">
  <cat-item tags="adj.*"/>
</def-cat>
<def-cat n="nom">
  <cat-item tags="n.*"/>
</def-cat>
<def-cat n="prop">
  <cat-item tags="np.*"/>
</def-cat>
<def-cat n="pers">
  <cat-item tags="prn.pers.*"/>
</def-cat>
</section-def-cats>
<section-def-markables>
<markable c="this is a comment">
<pattern>
  <pattern-item n="det"/>
  <pattern-item n="adj"/>
  <pattern-item n="nom" head="yes"/>
</pattern>
</markable>
<markable c="this is a comment">
<pattern>
  <pattern-item n="prop" head="yes"/>
</pattern>
</markable>
</section-def-markables>
<section-def-anaphora>
<anaphor c="this is a comment">
<pattern>
  <pattern-item n="det"/>
</pattern>
</anaphor>
</section-def-anaphora>
</chunker>
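As an illustration, a file like this could be loaded and applied with Python's standard `xml.etree.ElementTree` module, roughly as sketched below. The sketch uses a shortened copy of the configuration and assumes the head marker is written as `head="yes"` so that the file is well-formed XML. It also assumes that a trailing `*` in `tags` matches zero or more remaining tags, which is a simplification and may differ from how Apertium's own matchers treat it. The names `tags_match`, `load_config` and `find_markables` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened configuration, for illustration only.
CONFIG = """
<chunker>
  <section-def-cats>
    <def-cat n="det"><cat-item tags="det.*"/></def-cat>
    <def-cat n="adj"><cat-item tags="adj.*"/></def-cat>
    <def-cat n="nom"><cat-item tags="n.*"/></def-cat>
  </section-def-cats>
  <section-def-markables>
    <markable c="det adj noun">
      <pattern>
        <pattern-item n="det"/>
        <pattern-item n="adj"/>
        <pattern-item n="nom" head="yes"/>
      </pattern>
    </markable>
  </section-def-markables>
</chunker>
"""


def tags_match(spec, tags):
    """Compare a dotted spec such as 'n.*' with a list of tags."""
    want = spec.split(".")
    if want[-1] == "*":  # assumption: '*' matches zero or more further tags
        prefix = want[:-1]
        return tags[:len(prefix)] == prefix
    return tags == want


def load_config(xml_text):
    """Return ({category: [tag specs]}, [[category, ...] per markable])."""
    root = ET.fromstring(xml_text)
    cats = {
        cat.get("n"): [item.get("tags") for item in cat.findall("cat-item")]
        for cat in root.findall("./section-def-cats/def-cat")
    }
    patterns = [
        [item.get("n") for item in m.findall("./pattern/pattern-item")]
        for m in root.findall("./section-def-markables/markable")
    ]
    return cats, patterns


def find_markables(sentence, cats, patterns):
    """Return every run of (lemma, tags) units that matches a pattern."""
    found = []
    for start in range(len(sentence)):
        for pattern in patterns:
            window = sentence[start:start + len(pattern)]
            if len(window) == len(pattern) and all(
                any(tags_match(spec, tags) for spec in cats.get(name, []))
                for name, (_, tags) in zip(pattern, window)
            ):
                found.append(window)
    return found
```

The same lookup could be repeated for `section-def-anaphora` to collect the anaphora.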

After N sentences, it should print a summary of the NPs and the anaphora it has found, together with the tags shared between each anaphor and each NP (the intersection of their tag sets).
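One possible shape for the summary, assuming NPs and anaphora have already been collected as `(lemma, tags)` pairs, is sketched below. The function name `summarise` and the output layout are hypothetical; the task does not prescribe a format.

```python
def summarise(nps, anaphora):
    """Build a text summary with the tags shared by each anaphor/NP pair."""
    lines = [f"NPs found: {len(nps)}", f"Anaphora found: {len(anaphora)}"]
    for a_lemma, a_tags in anaphora:
        for np_lemma, np_tags in nps:
            # Set intersection gives the tags both units carry.
            shared = sorted(set(a_tags) & set(np_tags))
            lines.append(f"{a_lemma} ~ {np_lemma}: {' '.join(shared) or '(none)'}")
    return "\n".join(lines)


if __name__ == "__main__":
    nps = [("dog", ["n", "sg"]), ("cat", ["n", "pl"])]
    anaphora = [("prpers", ["prn", "subj", "p3", "nt", "sg"])]
    print(summarise(nps, anaphora))
```

Shared tags such as number or gender are the kind of agreement information that could later be used to pair an anaphor with a candidate NP.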

Task tags

  • python
  • c++
  • anaphora

Students who completed this task

Ryan A. Chi

Task type

  • Code

2017