Class FrequencyTrie<V>
- Type Parameters:
V - value type
Immutable compiled trie that maps String keys to one or more values with
frequency tracking.
A key may be associated with multiple values. Each value keeps the number of
times it was inserted during the build phase. The method get(String)
returns the locally most frequent value stored at the terminal node of the
supplied key, while getAll(String) returns all locally stored values
ordered by descending frequency.
If multiple values have the same local frequency, their ordering is deterministic. The preferred value is selected by the following tie-breaking rules, in order:
- the shorter String representation wins, based on value.toString()
- if the lengths are equal, the lexicographically lower String representation wins
- if the textual representations are still equal, first-seen insertion order remains stable
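The ordering rules above can be sketched as a comparator chain. This is a minimal illustration, not the class's actual implementation: the Counted pair and the TieBreakSketch class are hypothetical names introduced here, and the sketch relies on List.sort being stable to preserve first-seen order for entries that compare as equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TieBreakSketch {

    /** Hypothetical value/count pair, used only for this illustration. */
    static final class Counted {
        final Object value;
        final int count;

        Counted(Object value, int count) {
            this.value = value;
            this.count = count;
        }
    }

    /** Descending count, then shorter toString(), then lexicographically lower toString(). */
    static final Comparator<Counted> ORDER = Comparator
            .comparingInt((Counted c) -> c.count).reversed()
            .thenComparingInt(c -> c.value.toString().length())
            .thenComparing(c -> c.value.toString());

    /** Sorts a copy of the first-seen list; the stable sort keeps first-seen order for full ties. */
    static List<String> ordered(List<Counted> firstSeen) {
        List<Counted> copy = new ArrayList<>(firstSeen);
        copy.sort(ORDER);
        List<String> result = new ArrayList<>();
        for (Counted c : copy) {
            result.add(c.value.toString());
        }
        return result;
    }

    /** Example input listed in first-seen order. */
    static List<String> demo() {
        return ordered(Arrays.asList(
                new Counted("walk", 2),
                new Counted("run", 2),
                new Counted("go", 2),
                new Counted("be", 2),
                new Counted("jump", 5)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [jump, be, go, run, walk]
    }
}
```

Here "jump" comes first because of its higher count; the remaining values share a count of 2 and are ordered by length and then lexicographically.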
Values may be stored at any trie node, including internal nodes and leaf nodes. Therefore, reduction and canonicalization always operate on both the node-local terminal values and the structure of all descendant edges.
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
- static final class: Builder of FrequencyTrie.
- static interface FrequencyTrie.ValueStreamCodec<V>: Codec used to persist values stored in the trie.
Method Summary
Methods (Modifier and Type, Method, Description)
- V get(String key): Returns the most frequent value stored at the node addressed by the supplied key.
- V[] getAll(String key): Returns all values stored at the node addressed by the supplied key, ordered by descending frequency.
- List<ValueCount<V>> getEntries(String key): Returns all values stored at the node addressed by the supplied key together with their occurrence counts, ordered by the same rules as getAll(String).
- static <V> FrequencyTrie<V> readFrom(InputStream inputStream, IntFunction<V[]> arrayFactory, FrequencyTrie.ValueStreamCodec<V> valueCodec): Reads a compiled trie from the supplied input stream.
- int size(): Returns the number of canonical compiled nodes reachable from the root.
- void writeTo(OutputStream outputStream, FrequencyTrie.ValueStreamCodec<V> valueCodec): Writes this compiled trie to the supplied output stream.
-
Method Details
-
get
Returns the most frequent value stored at the node addressed by the supplied key.
If multiple values have the same local frequency, the returned value is selected deterministically by shorter toString() value first, then by lexicographically lower toString(), and finally by stable first-seen order.
- Parameters:
key - key to resolve
- Returns:
most frequent value, or null if the key does not exist or no value is stored at the addressed node
- Throws:
NullPointerException - if key is null
-
getAll
Returns all values stored at the node addressed by the supplied key, ordered by descending frequency.
If multiple values have the same local frequency, the ordering is deterministic by shorter toString() value first, then by lexicographically lower toString(), and finally by stable first-seen order.
The returned array is a defensive copy.
- Parameters:
key - key to resolve
- Returns:
all values stored at the addressed node, ordered by descending frequency; returns an empty array if the key does not exist or no value is stored at the addressed node
- Throws:
NullPointerException - if key is null
-
getEntries
Returns all values stored at the node addressed by the supplied key together with their occurrence counts, ordered by the same rules as getAll(String).
The returned list is aligned with the arrays returned by getAll(String) and the internal compiled count representation.
The returned list is immutable.
In reduction modes that merge semantically equivalent subtrees, the returned counts may be aggregated across multiple original build-time nodes that were reduced into the same canonical compiled node.
- Parameters:
key - key to resolve
- Returns:
immutable ordered list of value-count entries; returns an empty list if the key does not exist or no value is stored at the addressed node
- Throws:
NullPointerException - if key is null
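To illustrate what aggregated counts could look like, the sketch below merges the value counts of two build-time nodes into one. It assumes that aggregation means summing per-value counts, which the description above suggests but does not state outright; CountMergeSketch and its methods are hypothetical names, not part of this class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CountMergeSketch {

    /** Sums per-value counts of two nodes (assumed aggregation rule); first-seen order is kept. */
    static Map<String, Integer> merge(Map<String, Integer> first, Map<String, Integer> second) {
        Map<String, Integer> merged = new LinkedHashMap<>(first);
        for (Map.Entry<String, Integer> entry : second.entrySet()) {
            merged.merge(entry.getKey(), entry.getValue(), Integer::sum);
        }
        return merged;
    }

    /** Two hypothetical build-time nodes reduced into one canonical node. */
    static Map<String, Integer> demo() {
        Map<String, Integer> nodeA = new LinkedHashMap<>();
        nodeA.put("a", 2);
        nodeA.put("b", 1);
        Map<String, Integer> nodeB = new LinkedHashMap<>();
        nodeB.put("a", 1);
        nodeB.put("c", 4);
        return merge(nodeA, nodeB);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // {a=3, b=1, c=4}
    }
}
```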
-
writeTo
public void writeTo(OutputStream outputStream, FrequencyTrie.ValueStreamCodec<V> valueCodec) throws IOException
Writes this compiled trie to the supplied output stream.
The binary format is versioned and preserves canonical shared compiled nodes, so the serialized representation remains compact even for tries reduced by subtree merging.
The supplied codec is responsible for persisting individual values of type V.
- Parameters:
outputStream - target output stream
valueCodec - codec used to write values
- Throws:
NullPointerException - if any argument is null
IOException - if writing fails
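The division of labour between a versioned container format and a per-value codec can be sketched as follows. The ValueCodec interface, its two methods, and the FORMAT_VERSION constant are assumptions made for this illustration; the actual methods of FrequencyTrie.ValueStreamCodec and the real binary layout are not shown on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CodecSketch {

    /** Hypothetical stand-in for a value codec; not the real ValueStreamCodec signature. */
    interface ValueCodec<V> {
        void write(DataOutputStream out, V value) throws IOException;

        V read(DataInputStream in) throws IOException;
    }

    /** Codec for String values based on DataOutputStream.writeUTF / DataInputStream.readUTF. */
    static final ValueCodec<String> STRING_CODEC = new ValueCodec<String>() {
        @Override
        public void write(DataOutputStream out, String value) throws IOException {
            out.writeUTF(value);
        }

        @Override
        public String read(DataInputStream in) throws IOException {
            return in.readUTF();
        }
    };

    /** Assumed version marker written ahead of the payload. */
    static final int FORMAT_VERSION = 1;

    static <V> byte[] writeValues(List<V> values, ValueCodec<V> codec) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(FORMAT_VERSION);
            out.writeInt(values.size());
            for (V value : values) {
                codec.write(out, value); // the codec alone decides how a value is encoded
            }
        }
        return bytes.toByteArray();
    }

    static <V> List<V> readValues(byte[] data, ValueCodec<V> codec) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readInt();
            if (version != FORMAT_VERSION) {
                throw new IOException("Unsupported format version: " + version);
            }
            int count = in.readInt();
            if (count < 0) {
                throw new IOException("Invalid value count: " + count);
            }
            List<V> values = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                values.add(codec.read(in));
            }
            return values;
        }
    }

    /** Round trip through the sketch format. */
    static List<String> demo() {
        try {
            return readValues(writeValues(Arrays.asList("a", "ab"), STRING_CODEC), STRING_CODEC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Reading must use a codec with the same encoding as the one used for writing; otherwise the stream positions no longer line up.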
-
readFrom
public static <V> FrequencyTrie<V> readFrom(InputStream inputStream, IntFunction<V[]> arrayFactory, FrequencyTrie.ValueStreamCodec<V> valueCodec) throws IOException
Reads a compiled trie from the supplied input stream.
The caller must provide the same value codec semantics that were used during persistence as well as the array factory required for typed result arrays.
- Type Parameters:
V - value type
- Parameters:
inputStream - source input stream
arrayFactory - factory used to create typed arrays
valueCodec - codec used to read values
- Returns:
deserialized compiled trie
- Throws:
NullPointerException - if any argument is null
IOException - if reading fails or the binary format is invalid
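An array factory is needed because Java cannot create a V[] directly from a type parameter (new V[n] does not compile). The sketch below shows how an IntFunction<V[]> such as String[]::new yields a correctly typed array; ArrayFactorySketch and toTypedArray are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

public class ArrayFactorySketch {

    /** Copies values into an array whose runtime type is supplied by the factory. */
    static <V> V[] toTypedArray(List<V> values, IntFunction<V[]> arrayFactory) {
        V[] result = arrayFactory.apply(values.size());
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // String[]::new is an IntFunction<String[]> that allocates an array of the requested length.
        String[] typed = toTypedArray(Arrays.asList("x", "y"), String[]::new);
        System.out.println(typed.getClass().getSimpleName() + " " + Arrays.toString(typed));
    }
}
```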
-
size
public int size()
Returns the number of canonical compiled nodes reachable from the root.
The returned value reflects the size of the final reduced immutable trie, not the number of mutable build-time nodes inserted before reduction. Shared canonical subtrees are counted only once.
- Returns:
number of canonical compiled nodes in this trie
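Counting shared subtrees only once amounts to counting distinct nodes reachable from the root. The sketch below does this with an identity-based visited set over a hypothetical Node type; it illustrates the counting rule only and is not this class's internal representation.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class SizeSketch {

    /** Hypothetical compiled node: only outgoing edges are modelled. */
    static final class Node {
        final Map<Character, Node> children = new LinkedHashMap<>();
    }

    /** Counts distinct node instances reachable from the root; a shared node is visited once. */
    static int countReachable(Node root) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> stack = new ArrayDeque<>();
        seen.add(root);
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            for (Node child : node.children.values()) {
                if (seen.add(child)) { // add returns false for an already visited node
                    stack.push(child);
                }
            }
        }
        return seen.size();
    }

    /** Two branches that point at the same two-node suffix subtree. */
    static int demo() {
        Node shared = new Node();
        shared.children.put('s', new Node());
        Node left = new Node();
        Node right = new Node();
        left.children.put('x', shared);
        right.children.put('x', shared);
        Node root = new Node();
        root.children.put('a', left);
        root.children.put('b', right);
        return countReachable(root);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 5: root, left, right, shared, and shared's child
    }
}
```

Without sharing, the same keys would need seven nodes, because the two-node suffix would be duplicated under each branch.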
-