Class StemmerKnowledgeExperiment

java.lang.Object
org.egothor.stemmer.StemmerKnowledgeExperiment

public final class StemmerKnowledgeExperiment extends Object
Evaluates how stemming quality degrades when the compiled trie is built from only a deterministic subset of the available dictionary knowledge.

The experiment operates on whole dictionary entries. For a chosen knowledge percentage, each parsed dictionary line is deterministically included in or excluded from the training subset using a seeded SplittableRandom. The resulting subset is compiled into a FrequencyTrie, while the evaluation is performed against all word forms from the original dictionary.
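The deterministic selection described above can be illustrated with a minimal sketch. This is not the library's actual code: the class KnowledgeSubsetSketch, the method select, the one-draw-per-entry rule, and the sample entries are all assumptions made for illustration. Only the use of a seeded SplittableRandom comes from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SplittableRandom;

/** Illustrative sketch only; hypothetical, not the library's selection code. */
final class KnowledgeSubsetSketch {

    private KnowledgeSubsetSketch() {
    }

    /**
     * Keeps each entry with probability knowledgePercent / 100, using a
     * seeded generator so identical inputs always yield the same subset.
     */
    static List<String> select(List<String> entries, int knowledgePercent, long seed) {
        if (knowledgePercent < 0 || knowledgePercent > 100) {
            throw new IllegalArgumentException("knowledgePercent must be in [0, 100]");
        }
        SplittableRandom random = new SplittableRandom(seed);
        List<String> subset = new ArrayList<>();
        for (String entry : entries) {
            // One draw per entry, taken in input order, makes the decision reproducible.
            if (random.nextInt(100) < knowledgePercent) {
                subset.add(entry);
            }
        }
        return subset;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder entries; the real dictionary line format is not shown here.
        List<String> entries = List.of("entry-1", "entry-2", "entry-3", "entry-4");
        System.out.println(select(entries, 50, 42L));
    }
}
```

Because the generator is seeded and consulted once per entry in a fixed order, repeating the call with the same entries, percentage, and seed reproduces the same subset. The subset is selected per entry, so its size only approximates the requested percentage.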

Two lookup APIs are evaluated:

  • Field Details

    • MINIMUM_KNOWLEDGE_PERCENT

      public static final int MINIMUM_KNOWLEDGE_PERCENT
      Minimum supported knowledge percentage.
    • MAXIMUM_KNOWLEDGE_PERCENT

      public static final int MAXIMUM_KNOWLEDGE_PERCENT
      Maximum supported knowledge percentage.
    • KNOWLEDGE_PERCENT_STEP

      public static final int KNOWLEDGE_PERCENT_STEP
      Step between adjacent evaluated knowledge percentages.
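One plausible reading of the three constants is that they define an inclusive sweep of evaluated percentages. The sketch below shows that reading only; the class KnowledgeSweepSketch, the method percentages, and the numeric values passed in main are hypothetical placeholders, since the actual constant values are not given here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; hypothetical, not part of the library. */
final class KnowledgeSweepSketch {

    private KnowledgeSweepSketch() {
    }

    /** Lists min, min + step, ... up to and including max when the step lands on it. */
    static List<Integer> percentages(int min, int max, int step) {
        if (step <= 0 || min > max) {
            throw new IllegalArgumentException("require step > 0 and min <= max");
        }
        List<Integer> result = new ArrayList<>();
        for (int percent = min; percent <= max; percent += step) {
            result.add(percent);
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder bounds and step, NOT the real constant values.
        System.out.println(percentages(10, 100, 10));
    }
}
```

In real code the arguments would presumably be MINIMUM_KNOWLEDGE_PERCENT, MAXIMUM_KNOWLEDGE_PERCENT, and KNOWLEDGE_PERCENT_STEP rather than literals.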
  • Constructor Details

    • StemmerKnowledgeExperiment

      public StemmerKnowledgeExperiment()
      Creates a new experiment harness.
  • Method Details