Package org.egothor.stemmer
Class StemmerDictionaryParser
java.lang.Object
org.egothor.stemmer.StemmerDictionaryParser
Parser of line-oriented stemmer dictionary files.
Each non-empty logical line consists of a stem followed by zero or more known word variants separated by whitespace. The first token is interpreted as the canonical stem, and every following token on the same line is interpreted as a variant belonging to that stem.
Input lines are normalized to lower case using Locale.ROOT. Leading and trailing whitespace is ignored.
The parser supports line remarks and trailing remarks. The remark markers # and // terminate the logical content of the line, and the remainder of that line is ignored.
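The line format described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class DictionaryLineSketch and its tokenize method are hypothetical names introduced only for illustration, and the order of the steps (remark removal first, then lower-casing and trimming) is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of the documented line format; not the library's code. */
public final class DictionaryLineSketch {

    private DictionaryLineSketch() {
    }

    /**
     * Returns the tokens of one logical line: the stem first, then its variants.
     * An empty list means the line carried no content (blank or remark only).
     */
    public static List<String> tokenize(String rawLine) {
        // Cut the line at the earliest remark marker, "#" or "//", if one is present.
        int hash = rawLine.indexOf('#');
        int slashes = rawLine.indexOf("//");
        int cut = hash < 0 ? slashes : (slashes < 0 ? hash : Math.min(hash, slashes));
        String content = cut < 0 ? rawLine : rawLine.substring(0, cut);

        // Normalize to lower case with Locale.ROOT and drop surrounding whitespace.
        content = content.toLowerCase(Locale.ROOT).trim();
        if (content.isEmpty()) {
            return List.of();
        }

        // Tokens are separated by runs of whitespace.
        return Arrays.asList(content.split("\\s+"));
    }

    public static void main(String[] args) {
        String[] sample = {
            "# a full-line remark",
            "Run running RUNS ran   # trailing remark",
            "",
            "walk // a stem with no variants"
        };
        for (String raw : sample) {
            List<String> tokens = tokenize(raw);
            if (!tokens.isEmpty()) {
                String stem = tokens.get(0);
                List<String> variants = tokens.subList(1, tokens.size());
                System.out.println(stem + " <- " + variants);
            }
        }
    }
}
```

In this sketch the first token of each non-empty result plays the role of the canonical stem, and the remaining tokens play the role of its variants.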
This class is intentionally stateless and allocation-light so it can be used both by runtime loading and by offline compilation tooling.
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | StemmerDictionaryParser.EntryHandler | Callback receiving one parsed dictionary line.
static final record | StemmerDictionaryParser.ParseStatistics | Immutable parsing statistics.
Method Summary
Modifier and Type | Method | Description
static StemmerDictionaryParser.ParseStatistics | parse(Reader reader, String sourceDescription, StemmerDictionaryParser.EntryHandler entryHandler) | Parses a dictionary from a reader.
static StemmerDictionaryParser.ParseStatistics | parse(String fileName, StemmerDictionaryParser.EntryHandler entryHandler) | Parses a dictionary file from a path string.
static StemmerDictionaryParser.ParseStatistics | parse(Path path, StemmerDictionaryParser.EntryHandler entryHandler) | Parses a dictionary file from a filesystem path.
Method Details
parse
public static StemmerDictionaryParser.ParseStatistics parse(Path path, StemmerDictionaryParser.EntryHandler entryHandler) throws IOException
Parses a dictionary file from a filesystem path.
Parameters:
path - dictionary file path
entryHandler - handler receiving parsed entries
Returns:
parsing statistics
Throws:
NullPointerException - if any argument is null
IOException - if reading fails
parse
public static StemmerDictionaryParser.ParseStatistics parse(String fileName, StemmerDictionaryParser.EntryHandler entryHandler) throws IOException
Parses a dictionary file from a path string.
Parameters:
fileName - dictionary file name or path string
entryHandler - handler receiving parsed entries
Returns:
parsing statistics
Throws:
NullPointerException - if any argument is null
IOException - if reading fails
parse
public static StemmerDictionaryParser.ParseStatistics parse(Reader reader, String sourceDescription, StemmerDictionaryParser.EntryHandler entryHandler) throws IOException
Parses a dictionary from a reader.
Parameters:
reader - source reader
sourceDescription - logical source description for diagnostics
entryHandler - handler receiving parsed entries
Returns:
parsing statistics
Throws:
NullPointerException - if any argument is null
IOException - if reading or handler processing fails
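The contract of the reader-based overload (null checks, a source description used for diagnostics, and handler failures surfacing as IOException) can be sketched with stand-in types. This page does not show the members of EntryHandler or ParseStatistics, so the sketch below does not call the real API: ReaderParseSketch, LineHandler, and onLine are hypothetical names, the int return value stands in for the statistics object, and the way the source description is prefixed to error messages is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Objects;

/** Hypothetical stand-in mirroring the documented contract; not the library's code. */
public final class ReaderParseSketch {

    /** Hypothetical callback; the real EntryHandler method is not shown on this page. */
    @FunctionalInterface
    public interface LineHandler {
        void onLine(String stem, List<String> variants) throws IOException;
    }

    private ReaderParseSketch() {
    }

    /** Parses all lines from the reader and returns the number of entries reported. */
    public static int parse(Reader reader, String sourceDescription, LineHandler handler)
            throws IOException {
        // Any null argument is rejected up front with a NullPointerException.
        Objects.requireNonNull(reader, "reader");
        Objects.requireNonNull(sourceDescription, "sourceDescription");
        Objects.requireNonNull(handler, "handler");

        BufferedReader lines = new BufferedReader(reader);
        int entryCount = 0;
        int lineNumber = 0;
        String raw;
        while ((raw = lines.readLine()) != null) {
            lineNumber++;
            String content = stripRemark(raw).toLowerCase(Locale.ROOT).trim();
            if (content.isEmpty()) {
                continue; // blank or remark-only line
            }
            String[] tokens = content.split("\\s+");
            List<String> variants = Arrays.asList(tokens).subList(1, tokens.length);
            try {
                handler.onLine(tokens[0], variants);
            } catch (IOException e) {
                // Assumption: diagnostics name the logical source and the line number.
                throw new IOException(
                        sourceDescription + ":" + lineNumber + ": " + e.getMessage(), e);
            }
            entryCount++;
        }
        return entryCount;
    }

    /** Removes everything from the earliest "#" or "//" marker to the end of the line. */
    private static String stripRemark(String line) {
        int hash = line.indexOf('#');
        int slashes = line.indexOf("//");
        int cut = hash < 0 ? slashes : (slashes < 0 ? hash : Math.min(hash, slashes));
        return cut < 0 ? line : line.substring(0, cut);
    }
}
```

With this stand-in, an in-memory dictionary can be fed through a java.io.StringReader, for example ReaderParseSketch.parse(new StringReader("run running runs\n"), "inline sample", (stem, variants) -> System.out.println(stem + " " + variants)). A reader-based entry point of this kind makes it possible to parse content that does not live in a file, such as classpath resources or test fixtures.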