Natural Language Processing with Java Cookbook

How to do it...

The necessary steps include the following:

  1. Add the following imports to your project:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.util.Span;
  2. Next, add the declaration for the sample text as a static variable:
private static String text = 
"We will start with a simple sentence. However, is it "
+ "possible for a sentence to end with a question "
+ "mark? Obviously that is possible! Another "
+ "complication is the use of a number such as 56.32 "
+ "or ellipses such as ... Ellipses may be found ... "
+ "within a sentence! Of course, we may also find the "
+ "use of abbreviations such as Mr. Smith or "
+ "Dr. Jones.";
  3. Add the following try block to your main method:
try (InputStream inputStream = new FileInputStream(
new File("en-sent.bin"))) {
...
} catch (FileNotFoundException ex) {
// Handle exceptions
} catch (IOException ex) {
// Handle exceptions
}
  4. Insert this next sequence into the try block. This will instantiate the model, perform sentence detection, and then display the results:
SentenceModel sentenceModel = new SentenceModel(inputStream);
SentenceDetectorME sentenceDetector =
new SentenceDetectorME(sentenceModel);
String[] sentences = sentenceDetector.sentDetect(text);
for (String sentence : sentences) {
System.out.println("[" + sentence + "]");
}
  5. Execute your program. You will get the following output:
[We will start with a simple sentence.]
[However, is it possible for a sentence to end with a question mark?]
[Obviously that is possible!]
[Another complication is the use of a number such as 56.32 or ellipses such as ... Ellipses may be found ... within a sentence!]
[Of course, we may also find the use of abbreviations such as Mr. Smith or Dr. Jones.]
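
If the program prints nothing at all, a likely cause is that en-sent.bin could not be opened: the catch blocks in the try block above are left as placeholders, so a missing model file fails silently. The following is a minimal sketch, using only the standard library, of how the file could be checked before it is handed to SentenceModel. The class name ModelFileCheck, the method checkModelFile, and the status strings are hypothetical names chosen for illustration; they are not part of OpenNLP.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

public class ModelFileCheck {

    // Hypothetical helper, not part of OpenNLP: opens the file the same way
    // the recipe does and returns a short status message instead of failing silently.
    static String checkModelFile(String fileName) {
        File file = new File(fileName);
        try (InputStream inputStream = new FileInputStream(file)) {
            // Reading one byte confirms the stream is usable; -1 means the file is empty.
            int firstByte = inputStream.read();
            return firstByte == -1 ? "EMPTY" : "OK";
        } catch (FileNotFoundException ex) {
            // Thrown when the file is missing, is a directory, or cannot be opened for reading.
            return "NOT_FOUND: " + file.getAbsolutePath();
        } catch (IOException ex) {
            // Any other read failure.
            return "IO_ERROR: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumes the model is expected in the working directory, as in the recipe.
        System.out.println(checkModelFile("en-sent.bin"));
    }
}
```

Printing the absolute path in the NOT_FOUND case shows which directory the relative name en-sent.bin was resolved against, which is often the quickest way to see where the model file needs to be placed.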