Class PatchCommandEncoder

java.lang.Object
org.egothor.stemmer.PatchCommandEncoder

public final class PatchCommandEncoder extends Object
Encodes a compact patch command that transforms one word form into another and applies such commands back to source words.

The generated patch command follows the historical Egothor convention: instructions are serialized so that they are applied from the end of the source word toward its beginning. This keeps the command stream compact and matches the behavior expected by existing stemming data.

The encoder computes a minimum-cost edit script using weighted insert, delete, replace, and match transitions. The resulting trace is then serialized into the compact patch language.
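
The cost model described above can be illustrated with a small, self-contained sketch. It is not this class's implementation: the class name `WeightedEditScriptSketch`, the `Op` labels, and the tie-breaking order (diagonal first, then delete, then insert) are assumptions made for illustration. It fills a weighted edit-distance table and then walks back from the bottom-right cell, so the operations come out last character first, in the same direction the patch convention uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a weighted edit-distance table with an end-to-start trace. */
final class WeightedEditScriptSketch {

    /** Hypothetical operation labels used only by this sketch. */
    enum Op { MATCH, REPLACE, INSERT, DELETE }

    private final int insertCost;
    private final int deleteCost;
    private final int replaceCost;
    private final int matchCost;

    WeightedEditScriptSketch(int insertCost, int deleteCost, int replaceCost, int matchCost) {
        this.insertCost = insertCost;
        this.deleteCost = deleteCost;
        this.replaceCost = replaceCost;
        this.matchCost = matchCost;
    }

    /** Cell [i][j] holds the cheapest cost of turning source[0..i) into target[0..j). */
    private int[][] table(String source, String target) {
        int n = source.length();
        int m = target.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) d[i][0] = d[i - 1][0] + deleteCost;
        for (int j = 1; j <= m; j++) d[0][j] = d[0][j - 1] + insertCost;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                boolean same = source.charAt(i - 1) == target.charAt(j - 1);
                int diagonal = d[i - 1][j - 1] + (same ? matchCost : replaceCost);
                int deletion = d[i - 1][j] + deleteCost;
                int insertion = d[i][j - 1] + insertCost;
                d[i][j] = Math.min(diagonal, Math.min(deletion, insertion));
            }
        }
        return d;
    }

    /** Minimum total cost of transforming source into target. */
    int cost(String source, String target) {
        return table(source, target)[source.length()][target.length()];
    }

    /** Recovers one minimum-cost script, listed from the end of the word toward its start. */
    List<Op> traceFromEnd(String source, String target) {
        int[][] d = table(source, target);
        List<Op> ops = new ArrayList<>();
        int i = source.length();
        int j = target.length();
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0) {
                boolean same = source.charAt(i - 1) == target.charAt(j - 1);
                int step = same ? matchCost : replaceCost;
                if (d[i][j] == d[i - 1][j - 1] + step) {
                    ops.add(same ? Op.MATCH : Op.REPLACE);
                    i--;
                    j--;
                    continue;
                }
            }
            if (i > 0 && d[i][j] == d[i - 1][j] + deleteCost) {
                ops.add(Op.DELETE);
                i--;
            } else {
                ops.add(Op.INSERT);
                j--;
            }
        }
        return ops;
    }
}
```

With costs (1, 1, 1, 0), tracing "walked" to "walk" in this sketch yields DELETE, DELETE, followed by four MATCH steps. Raising the replace cost above insert + delete makes the sketch prefer a delete/insert pair over a replacement.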

This class is stateful: it reuses internal dynamic-programming matrices across invocations to reduce allocation pressure during repeated use, so instances are not suitable for unsynchronized concurrent access. The encode(String, String) method is synchronized, so a shared instance can still be used safely when needed, at the cost of serializing concurrent calls.
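
The trade-off between buffer reuse and thread safety can be sketched as follows. This is a minimal illustration, not the class's actual internals: `ReusableMatrixSketch`, its fields, and the fixed unit costs are all hypothetical. The matrix is kept between calls and only reallocated when a longer word pair arrives, and the `synchronized` modifier is what keeps two threads from overwriting each other's cells.

```java
/** Illustrative only: a reused DP buffer guarded by a synchronized method. */
final class ReusableMatrixSketch {

    // Shared mutable state: kept between calls and grown on demand.
    // Without synchronization, concurrent calls would corrupt these cells.
    private int[][] matrix = new int[0][0];
    private int rowCapacity;
    private int colCapacity;

    /** Unit-cost edit distance, computed inside the reused matrix. */
    synchronized int distance(String source, String target) {
        int rows = source.length() + 1;
        int cols = target.length() + 1;
        ensureCapacity(rows, cols);

        // Every cell read below is written first, so stale values
        // left over from an earlier call are never observed.
        for (int i = 0; i < rows; i++) matrix[i][0] = i;
        for (int j = 0; j < cols; j++) matrix[0][j] = j;
        for (int i = 1; i < rows; i++) {
            for (int j = 1; j < cols; j++) {
                boolean same = source.charAt(i - 1) == target.charAt(j - 1);
                int diagonal = matrix[i - 1][j - 1] + (same ? 0 : 1);
                int deletion = matrix[i - 1][j] + 1;
                int insertion = matrix[i][j - 1] + 1;
                matrix[i][j] = Math.min(diagonal, Math.min(deletion, insertion));
            }
        }
        return matrix[rows - 1][cols - 1];
    }

    /** Reallocates only when the current buffer is too small in either dimension. */
    private void ensureCapacity(int rows, int cols) {
        if (rows > rowCapacity || cols > colCapacity) {
            rowCapacity = Math.max(rows, rowCapacity);
            colCapacity = Math.max(cols, colCapacity);
            matrix = new int[rowCapacity][colCapacity];
        }
    }
}
```

When contention matters, giving each thread its own instance avoids the lock entirely; sharing one instance is simpler but serializes the calls.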

  • Constructor Summary

    Constructors
    Constructor
    Description
    PatchCommandEncoder()
    Creates an encoder with the traditional Egothor cost model: insert = 1, delete = 1, replace = 1, match = 0.
    PatchCommandEncoder(int insertCost, int deleteCost, int replaceCost, int matchCost)
    Creates an encoder with explicit operation costs.
  • Method Summary

    Modifier and Type
    Method
    Description
    static String
    apply(String source, String patchCommand)
    Applies a compact patch command to the supplied source word.
    String
    encode(String source, String target)
    Produces a compact patch command that transforms source into target.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • PatchCommandEncoder

      public PatchCommandEncoder()
      Creates an encoder with the traditional Egothor cost model: insert = 1, delete = 1, replace = 1, match = 0.
    • PatchCommandEncoder

      public PatchCommandEncoder(int insertCost, int deleteCost, int replaceCost, int matchCost)
      Creates an encoder with explicit operation costs.
      Parameters:
      insertCost - cost of inserting one character
      deleteCost - cost of deleting one character
      replaceCost - cost of replacing one character
      matchCost - cost of keeping one equal character unchanged
  • Method Details

    • encode

      public String encode(String source, String target)
      Produces a compact patch command that transforms source into target.
      Parameters:
      source - source word form
      target - target word form
      Returns:
      compact patch command, or null when either argument is null
    • apply

      public static String apply(String source, String patchCommand)
      Applies a compact patch command to the supplied source word.

      This method operates directly on serialized opcodes rather than mapping them to another representation. That keeps the hot path small and avoids unnecessary indirection during patch application.

      For compatibility with historical behavior, a malformed patch command that triggers an index failure causes the original source word to be returned unchanged.

      Parameters:
      source - original source word
      patchCommand - compact patch command
      Returns:
      transformed word, or null when source is null
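
The end-to-start application and the fallback on malformed input can be illustrated with a toy patch language. Everything in this sketch is an assumption: the opcodes ('-' skip, 'D' delete, 'R' replace, 'I' insert), the count encoding ('a' = 1, 'b' = 2, ...), and the class name `ToyPatchApplier` are loosely modeled on the classic Egothor notation, and the serialized format actually used by this class is not specified here. Treating an unknown opcode as malformed is likewise this sketch's own choice.

```java
/** Illustrative only: applies a hypothetical two-character-per-instruction patch. */
final class ToyPatchApplier {

    static String apply(String source, String patch) {
        if (source == null) {
            return null;
        }
        if (patch == null || patch.isEmpty()) {
            return source;
        }
        StringBuilder word = new StringBuilder(source);
        int pos = word.length() - 1; // cursor starts on the last character
        try {
            for (int k = 0; k + 1 < patch.length(); k += 2) {
                char opcode = patch.charAt(k);
                char arg = patch.charAt(k + 1);
                int count = arg - 'a' + 1; // assumed count encoding: 'a' = 1
                switch (opcode) {
                    case '-': // skip `count` characters toward the start
                        pos -= count - 1;
                        break;
                    case 'R': // replace the character under the cursor
                        word.setCharAt(pos, arg);
                        break;
                    case 'D': { // delete `count` characters ending at the cursor
                        int end = pos + 1;
                        pos -= count - 1;
                        word.delete(pos, end);
                        break;
                    }
                    case 'I': // insert `arg` just after the cursor
                        pos++;
                        word.insert(pos, arg);
                        break;
                    default: // unknown opcode: treated as malformed in this sketch
                        return source;
                }
                pos--; // every instruction moves the cursor one step toward the start
            }
        } catch (IndexOutOfBoundsException e) {
            // Index failure on malformed input: hand back the untouched source.
            return source;
        }
        return word.toString();
    }
}
```

In this toy notation, `apply("walked", "Db")` removes the last two characters, and `apply("go", "Dz")` asks for more deletions than the word can supply, so the source comes back unchanged.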