Class StemmerPatchTrieLoader

java.lang.Object
org.egothor.stemmer.StemmerPatchTrieLoader

public final class StemmerPatchTrieLoader extends Object
Loader of patch-command tries from bundled stemmer dictionaries.

Each dictionary is line-oriented. The first token on a line is interpreted as the stem, and all following tokens are treated as known variants of that stem.

For each line, the loader inserts:

  • the stem itself mapped to the canonical no-op patch command PatchCommandEncoder.NOOP_PATCH, when requested by the caller
  • every distinct variant mapped to the patch command transforming that variant to the stem
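
The per-line insertion rule above can be illustrated with a small, JDK-only sketch. It is not the loader's implementation: `LineEntriesSketch`, `entriesFor`, and the `NOOP` label are hypothetical names, and the patch is shown as a readable placeholder string rather than a real encoded patch command.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class LineEntriesSketch {

    /** Placeholder label only; the real value is PatchCommandEncoder.NOOP_PATCH. */
    static final String NOOP_LABEL = "NOOP";

    /**
     * Lists the key/patch pairs implied by one dictionary line.
     * The patch text is a human-readable stand-in, not an encoded command.
     */
    public static List<String> entriesFor(String stem, List<String> variants, boolean storeOriginal) {
        List<String> entries = new ArrayList<>();
        if (storeOriginal) {
            // The stem maps to the no-op patch only when the caller asks for it.
            entries.add(stem + " => " + NOOP_LABEL);
        }
        // LinkedHashSet drops repeated variants while keeping their first-seen order.
        for (String variant : new LinkedHashSet<>(variants)) {
            entries.add(variant + " => patch(" + variant + " -> " + stem + ")");
        }
        return entries;
    }

    public static void main(String[] args) {
        entriesFor("run", List.of("running", "runs", "running"), true)
                .forEach(System.out::println);
    }
}
```

With `storeOriginal` set to `false`, the same call would list only the two distinct variants.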

Parsing is delegated to StemmerDictionaryParser, which also supports line remarks introduced by # or //.
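
As a rough illustration of the line format, the sketch below strips a remark and splits the rest into a stem and its variants. It does not reproduce StemmerDictionaryParser: the class and method names are hypothetical, and it assumes that tokens are whitespace-separated and that a remark may start anywhere on the line.

```java
import java.util.Arrays;
import java.util.List;

final class DictionaryLineSketch {

    /** Cuts the line at the first '#' or "//", whichever comes first, and trims the rest. */
    public static String stripRemark(String line) {
        int cut = line.length();
        int hash = line.indexOf('#');
        int slashes = line.indexOf("//");
        if (hash >= 0) {
            cut = Math.min(cut, hash);
        }
        if (slashes >= 0) {
            cut = Math.min(cut, slashes);
        }
        return line.substring(0, cut).trim();
    }

    /** Returns the tokens of a line: index 0 is the stem, the rest are variants. */
    public static List<String> tokens(String line) {
        String content = stripRemark(line);
        if (content.isEmpty()) {
            // Blank lines and remark-only lines carry no entries.
            return List.of();
        }
        return Arrays.asList(content.split("\\s+"));
    }

    public static void main(String[] args) {
        List<String> parsed = tokens("run running runs # common verb forms");
        System.out.println("stem = " + parsed.get(0));
        System.out.println("variants = " + parsed.subList(1, parsed.size()));
    }
}
```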

  • Method Details

    • load

      public static FrequencyTrie<String> load(StemmerPatchTrieLoader.Language language, boolean storeOriginal, ReductionSettings reductionSettings) throws IOException
      Loads a bundled dictionary using explicit reduction settings.
      Parameters:
      language - bundled language dictionary
      storeOriginal - whether the stem itself should be inserted using the canonical no-op patch command
      reductionSettings - reduction settings to apply when compiling the trie
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if any argument is null
      IOException - if the dictionary cannot be found or read
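
The "cannot be found" case for a bundled dictionary can be pictured as a classpath lookup that turns a missing resource into an IOException. This is a sketch under assumptions, not the loader's code: where the bundled dictionaries live and how a Language maps to a resource name are not documented here, so `openBundled` and its `resourcePath` parameter are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;

final class BundledResourceSketch {

    /**
     * Opens a classpath resource, reporting a missing resource as an IOException
     * instead of returning null.
     */
    public static InputStream openBundled(String resourcePath) throws IOException {
        Objects.requireNonNull(resourcePath, "resourcePath");
        InputStream in = BundledResourceSketch.class.getClassLoader().getResourceAsStream(resourcePath);
        if (in == null) {
            // getResourceAsStream signals "not found" with null, so convert it explicitly.
            throw new IOException("Bundled dictionary not found: " + resourcePath);
        }
        return in;
    }

    public static void main(String[] args) {
        // The resource name below is made up for the demonstration.
        try (InputStream in = openBundled("no/such/dictionary.txt")) {
            System.out.println("opened");
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```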
    • load

      public static FrequencyTrie<String> load(StemmerPatchTrieLoader.Language language, boolean storeOriginal, ReductionMode reductionMode) throws IOException
      Loads a bundled dictionary using default settings for the supplied reduction mode.
      Parameters:
      language - bundled language dictionary
      storeOriginal - whether the stem itself should be inserted using the canonical no-op patch command
      reductionMode - reduction mode whose default settings are used
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if any argument is null
      IOException - if the dictionary cannot be found or read
    • load

      public static FrequencyTrie<String> load(Path path, boolean storeOriginal, ReductionSettings reductionSettings) throws IOException
      Loads a dictionary from a filesystem path using explicit reduction settings.
      Parameters:
      path - path to the dictionary file
      storeOriginal - whether the stem itself should be inserted using the canonical no-op patch command
      reductionSettings - reduction settings to apply when compiling the trie
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if any argument is null
      IOException - if the file cannot be opened or read
    • load

      public static FrequencyTrie<String> load(Path path, boolean storeOriginal, ReductionMode reductionMode) throws IOException
      Loads a dictionary from a filesystem path using default settings for the supplied reduction mode.
      Parameters:
      path - path to the dictionary file
      storeOriginal - whether the stem itself should be inserted using the canonical no-op patch command
      reductionMode - reduction mode whose default settings are used
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if any argument is null
      IOException - if the file cannot be opened or read
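
For the filesystem variants, the documented exceptions match what plain java.nio file access produces. The sketch below only reads the lines of a dictionary file. It is not the loader itself: `readLines` is a hypothetical helper, and UTF-8 is an assumed encoding that this page does not state.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class PathDictionarySketch {

    /** Reads every line of the file; a missing or unreadable file surfaces as an IOException. */
    public static List<String> readLines(Path path) throws IOException {
        Objects.requireNonNull(path, "path");
        List<String> lines = new ArrayList<>();
        // UTF-8 is an assumption made for this sketch.
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("dictionary", ".txt");
        try {
            Files.write(file, List.of("run running runs", "walk walked"), StandardCharsets.UTF_8);
            System.out.println(readLines(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```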
    • load

      public static FrequencyTrie<String> load(String fileName, boolean storeOriginal, ReductionSettings reductionSettings) throws IOException
      Loads a dictionary from a filesystem path string using explicit reduction settings.
      Parameters:
      fileName - file name or path string
      storeOriginal - whether the stem itself should be inserted using the canonical no-op patch command
      reductionSettings - reduction settings to apply when compiling the trie
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if any argument is null
      IOException - if the file cannot be opened or read
    • load

      public static FrequencyTrie<String> load(String fileName, boolean storeOriginal, ReductionMode reductionMode) throws IOException
      Loads a dictionary from a filesystem path string using default settings for the supplied reduction mode.
      Parameters:
      fileName - file name or path string
      storeOriginal - whether the stem itself should be inserted using the canonical no-op patch command
      reductionMode - reduction mode whose default settings are used
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if any argument is null
      IOException - if the file cannot be opened or read
    • loadBinary

      public static FrequencyTrie<String> loadBinary(Path path) throws IOException
      Loads a GZip-compressed binary patch-command trie from a filesystem path.
      Parameters:
      path - path to the compressed binary trie file
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if path is null
      IOException - if the file cannot be opened, decompressed, or read
    • loadBinary

      public static FrequencyTrie<String> loadBinary(String fileName) throws IOException
      Loads a GZip-compressed binary patch-command trie from a filesystem path string.
      Parameters:
      fileName - file name or path string
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if fileName is null
      IOException - if the file cannot be opened, decompressed, or read
    • loadBinary

      public static FrequencyTrie<String> loadBinary(InputStream inputStream) throws IOException
      Loads a GZip-compressed binary patch-command trie from an input stream.
      Parameters:
      inputStream - source input stream
      Returns:
      compiled patch-command trie
      Throws:
      NullPointerException - if inputStream is null
      IOException - if the stream cannot be decompressed or read
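
The "cannot be decompressed" case can be demonstrated with the JDK's GZIP streams alone. The sketch below says nothing about the binary trie layout, which is not described here; `GzipStreamSketch` and its methods are hypothetical helpers that move raw bytes only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipStreamSketch {

    /** Compresses raw bytes into GZIP format in memory. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        }
        return buffer.toByteArray();
    }

    /**
     * Decompresses everything from the stream. Input that is not GZIP data
     * fails with an IOException. In this sketch, closing the source stream
     * is left to the caller.
     */
    public static byte[] readGzip(InputStream inputStream) throws IOException {
        Objects.requireNonNull(inputStream, "inputStream");
        GZIPInputStream in = new GZIPInputStream(inputStream);
        return in.readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("payload".getBytes(StandardCharsets.UTF_8));
        byte[] restored = readGzip(new ByteArrayInputStream(compressed));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
        try {
            readGzip(new ByteArrayInputStream("not gzip".getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getClass().getSimpleName());
        }
    }
}
```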
    • saveBinary

      public static void saveBinary(FrequencyTrie<String> trie, Path path) throws IOException
      Saves a compiled patch-command trie as a GZip-compressed binary file.
      Parameters:
      trie - compiled trie
      path - target file
      Throws:
      NullPointerException - if any argument is null
      IOException - if writing fails
    • saveBinary

      public static void saveBinary(FrequencyTrie<String> trie, String fileName) throws IOException
      Saves a compiled patch-command trie as a GZip-compressed binary file at the location given by a path string.
      Parameters:
      trie - compiled trie
      fileName - target file name or path string
      Throws:
      NullPointerException - if any argument is null
      IOException - if writing fails
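
A save-then-load round trip through a GZIP-compressed file can be sketched with JDK classes only. Everything here is illustrative: the byte payload stands in for a serialized trie, `GzipFileSketch` is a hypothetical helper, and having the String overload delegate through `Path.of` is an assumption about how such overloads are commonly written, not a statement about this class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipFileSketch {

    /** Writes the payload to the target file in GZIP format. */
    public static void save(byte[] payload, Path path) throws IOException {
        Objects.requireNonNull(payload, "payload");
        Objects.requireNonNull(path, "path");
        try (OutputStream file = Files.newOutputStream(path);
             OutputStream out = new GZIPOutputStream(file)) {
            out.write(payload);
        }
    }

    /** String variant: converts the name to a Path and delegates (assumed pattern). */
    public static void save(byte[] payload, String fileName) throws IOException {
        Objects.requireNonNull(fileName, "fileName");
        save(payload, Path.of(fileName));
    }

    /** Reads the file back and returns the decompressed bytes. */
    public static byte[] load(Path path) throws IOException {
        Objects.requireNonNull(path, "path");
        try (InputStream file = Files.newInputStream(path);
             InputStream in = new GZIPInputStream(file)) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("trie", ".bin.gz");
        try {
            save("stand-in for trie bytes".getBytes(StandardCharsets.UTF_8), target.toString());
            System.out.println(new String(load(target), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```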