MXTERMINATOR is a Java (JDK 1.1) implementation of the sentence boundary detector described in:

Jeffrey C. Reynar and Adwait Ratnaparkhi. A Maximum Entropy Approach to Identifying Sentence Boundaries. In Proceedings of the Fifth Conference on Applied Natural Language Processing, March 31-April 3, 1997. Washington, D.C.

USERS MUST ABIDE BY THE LICENSE INCLUDED WITH THIS DISTRIBUTION.

MXTERMINATOR is copyright (c) 1997 Adwait Ratnaparkhi

INSTRUCTIONS FOR USE

To use:

  1. Edit your CLASSPATH variable to include the file mxpost.jar.
  2. Type:

    mxterminator projectdir < textfile

    where textfile contains raw text and projectdir is a project directory that holds a trained model.

    The project directory eos.project contains a model trained on about 1 million words of Wall Street Journal text.
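
The two steps above might look like this in a Bourne-style shell. This is only a sketch: the install location MXPOST_HOME, the assumption that eos.project sits next to mxpost.jar, and the input file sample.txt are illustrative and not part of the distribution. The detector is run only if the mxterminator script is found on your PATH.

```shell
# Hypothetical install location; change it to wherever mxpost.jar lives.
MXPOST_HOME="${MXPOST_HOME:-/usr/local/mxpost}"

# Step 1: put mxpost.jar on the CLASSPATH, keeping any existing entries.
CLASSPATH="$MXPOST_HOME/mxpost.jar${CLASSPATH:+:$CLASSPATH}"
export CLASSPATH

# Hypothetical input file: raw text with two sentences on a single line.
printf '%s\n' 'Mr. Smith paid $3.5 million. The deal closed on Friday.' > sample.txt

# Step 2: run the detector, but only if the script is available.
# Assumes eos.project is stored under MXPOST_HOME.
if command -v mxterminator >/dev/null 2>&1; then
    mxterminator "$MXPOST_HOME/eos.project" < sample.txt
fi
```

The input is deliberately full of periods that do not end a sentence ("Mr.", "$3.5"), which is the kind of ambiguity the detector is meant to resolve.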

To train a new model:

  1. Edit your CLASSPATH variable to include the file mxpost.jar.
  2. Create an empty project directory.
  3. Type:

    trainmxterminator projectdir traindata

    where projectdir is the newly created project directory and traindata is a text file containing one sentence per line.
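
As a rough illustration of the training steps, the sketch below creates a project directory and a tiny training file with one sentence per line. The names my.project and train.txt and the three sentences are made up for the example; a usable model would need far more data. It assumes CLASSPATH already includes mxpost.jar, and training is run only if the trainmxterminator script is found on your PATH.

```shell
# Hypothetical names, chosen only for this example.
PROJECT_DIR="my.project"
TRAIN_FILE="train.txt"

# Step 2: create the project directory (it should start out empty).
mkdir -p "$PROJECT_DIR"

# Training data: exactly one sentence per line, no blank lines.
cat > "$TRAIN_FILE" <<'EOF'
Mr. Smith paid $3.5 million for the company.
The deal closed on Friday.
Analysts at Acme Corp. expect more offers.
EOF

# Quick check: the line count should equal the number of sentences.
grep -c '' "$TRAIN_FILE"

# Step 3: train the model, but only if the script is available.
if command -v trainmxterminator >/dev/null 2>&1; then
    trainmxterminator "$PROJECT_DIR" "$TRAIN_FILE"
fi
```

Once training finishes, the new project directory can be passed to mxterminator in place of eos.project.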