
Programmatic Usage

This document is the entry point to Radixor's programmatic API.

Radixor follows a clear lifecycle:

  1. acquire a compiled stemmer,
  2. query it for patch commands,
  3. apply those commands to produce stems,
  4. reopen and extend the compiled structure when needed.

Conceptual model

Radixor is dictionary-driven, but runtime stemming does not scan raw dictionary files. A source dictionary is parsed as a sequence of canonical stems and their known variants. Each variant is converted into a compact patch command that transforms the variant into the stem; the stem itself may optionally be stored with a canonical no-op patch. These patch commands are collected in a mutable trie, which is then reduced into a compiled read-only structure that stores ordered values and their counts at addressed nodes.
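To make the idea of a patch command concrete, here is a minimal Java sketch. It is a toy illustration, not Radixor's actual encoding: the class `ToyPatch`, its methods, and the `N:suffix` text format are hypothetical, and the real format is whatever `PatchCommandEncoder` defines.

```java
/** Toy illustration only; this is NOT Radixor's patch format or API. */
public final class ToyPatch {

    /**
     * Encodes the edit "drop N trailing characters, then append a suffix"
     * as the string "N:suffix" (a hypothetical format chosen for readability).
     */
    public static String encode(String variant, String stem) {
        int common = 0;
        int limit = Math.min(variant.length(), stem.length());
        // Find the length of the prefix shared by the variant and the stem.
        while (common < limit && variant.charAt(common) == stem.charAt(common)) {
            common++;
        }
        int strip = variant.length() - common;
        return strip + ":" + stem.substring(common);
    }

    /** Applies a patch produced by encode() to any word. */
    public static String apply(String word, String patch) {
        int sep = patch.indexOf(':');
        int strip = Integer.parseInt(patch.substring(0, sep));
        if (strip > word.length()) {
            return word; // the patch does not fit this word; leave it unchanged
        }
        return word.substring(0, word.length() - strip) + patch.substring(sep + 1);
    }

    public static void main(String[] args) {
        String patch = encode("studies", "study");
        System.out.println(patch);                    // 3:y
        System.out.println(apply("studies", patch));  // study
        System.out.println(apply("copies", patch));   // copy
        System.out.println(encode("run", "run"));     // 0:  (a no-op patch)
    }
}
```

Note that the patch derived from one variant ("studies") also maps an unrelated word with the same ending ("copies") to its stem, and that a stem paired with itself yields a no-op patch.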

Two consequences matter for developers:

  • the quality and coverage of stemming behavior depend on dictionary richness,
  • runtime usage is based on compiled patch-command lookup rather than on direct dictionary traversal.

This is why Radixor can generalize beyond explicitly listed forms and why compiled artifacts are well suited for deployment.
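The generalization effect can be sketched with a deliberately simplified model. In the Java example below, a map keyed by word endings stands in for the compiled trie, and each ending keeps a count per patch. Everything here is an assumption made for illustration: `ToySuffixStemmer`, the `N:suffix` patch format, and the longest-ending lookup rule are hypothetical and do not describe how `FrequencyTrie` addresses its nodes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: a suffix-keyed map stands in for Radixor's compiled trie. */
public final class ToySuffixStemmer {

    // word ending -> (patch -> how often that patch was seen for the ending)
    private final Map<String, Map<String, Integer>> patchesBySuffix = new HashMap<>();

    /** Hypothetical patch format "N:suffix": drop N trailing chars, append suffix. */
    static String encode(String variant, String stem) {
        int common = 0;
        int limit = Math.min(variant.length(), stem.length());
        while (common < limit && variant.charAt(common) == stem.charAt(common)) {
            common++;
        }
        return (variant.length() - common) + ":" + stem.substring(common);
    }

    static String apply(String word, String patch) {
        int sep = patch.indexOf(':');
        int strip = Integer.parseInt(patch.substring(0, sep));
        if (strip > word.length()) {
            return word; // patch does not fit; leave the word unchanged
        }
        return word.substring(0, word.length() - strip) + patch.substring(sep + 1);
    }

    /** Records the variant's patch under every ending of the variant. */
    public void learn(String variant, String stem) {
        String patch = encode(variant, stem);
        for (int i = 0; i < variant.length(); i++) {
            patchesBySuffix
                .computeIfAbsent(variant.substring(i), k -> new HashMap<>())
                .merge(patch, 1, Integer::sum);
        }
    }

    /** Uses the longest known ending and its most frequent patch. */
    public String stem(String word) {
        for (int i = 0; i < word.length(); i++) {
            Map<String, Integer> counts = patchesBySuffix.get(word.substring(i));
            if (counts == null) {
                continue;
            }
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                int count = e.getValue();
                // Ties are broken lexicographically so results are deterministic.
                if (count > bestCount || (count == bestCount && e.getKey().compareTo(best) < 0)) {
                    best = e.getKey();
                    bestCount = count;
                }
            }
            return apply(word, best);
        }
        return word; // no known ending: return the input unchanged
    }

    public static void main(String[] args) {
        ToySuffixStemmer stemmer = new ToySuffixStemmer();
        stemmer.learn("studies", "study");
        stemmer.learn("copies", "copy");
        System.out.println(stemmer.stem("flies")); // fly (never listed in the dictionary)
        System.out.println(stemmer.stem("table")); // table (no known ending)
    }
}
```

The word "flies" was never listed, yet it resolves to "fly" because it shares the ending "ies" with listed variants. A richer dictionary supplies more endings and more reliable counts, which is the sense in which coverage depends on dictionary richness.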

Documentation map

The programmatic API is easier to understand when split by developer task.

Core types

The main types involved in programmatic usage are:

  • FrequencyTrie.Builder<V> for mutable construction and extension,
  • FrequencyTrie<V> for the compiled read-only trie,
  • PatchCommandEncoder for creating and applying patch commands,
  • StemmerPatchTrieLoader for loading bundled or textual dictionaries,
  • StemmerPatchTrieBinaryIO for reading and writing compressed binary artifacts,
  • FrequencyTrieBuilders for reconstructing a mutable builder from a compiled trie,
  • ReductionMode and ReductionSettings for controlling compilation semantics.

For most developers, the best order is:

  1. Loading and Building Stemmers
  2. Querying and Ambiguity Handling
  3. Extending and Persisting Compiled Tries

Next steps