All Classes and Interfaces

Class
Description
Default implementation of all NGramLanguageModel functionality except AbstractArrayEncodedNgramLanguageModel.getLogProb(int[], int, int).
Default implementation of all ContextEncodedNgramLanguageModel functionality except AbstractContextEncodedNgramLanguageModel.getLogProb(long, int, int, LmContextInfo), AbstractContextEncodedNgramLanguageModel.getOffsetForNgram(int[], int, int), and …
 
 
Contains some limited shared functionality between Custom[type]Maps.
 
 
 
 
Just a fancy-pants comment.
Fields annotated with this annotation will have their memory usage added to the memory usage map returned by countApproximateMemoryUsage.
 
A parser for ARPA LM files.
Callback that is called for each n-gram in the collection.
This class wraps ArrayEncodedNgramLanguageModel with a cache.
A direct-mapped cache.
 
Top-level interface for an n-gram language model which accepts n-grams in an array-of-integers encoding.
 
Language model implementation which uses Kneser-Ney-style backoff computation.
 
Wraps a portion of a long[] array with iterator-like functionality over a stream of bits.
 
List which returns special boundary symbols when get() is called outside the range of the list.
 
 
 
 
Computes the log probability of a list of files.
Stores some configuration options, with useful defaults.
This class wraps ContextEncodedNgramLanguageModel with a cache.
 
 
Interface for language models which expose the internal context-encoding for more efficient queries.
 
Simple class for returning context offsets.
 
Language model implementation which uses Kneser-Ney-style backoff computation.
A map from objects to doubles.
 
An array with a custom word "width" in bits.
Reader callback which adds n-grams to an NgramMap.
Reads in n-gram count collections in the format that the Google n-grams Web1T corpus comes in.
 
Indexer<E extends Comparable<E>>
Maintains a two-way map between a set of objects and contiguous integers from 0 to the number of objects.
Some IO utility functions.
Utilities for dealing with Iterators.
Wraps a two-level iteration scenario in an iterator.
Wraps a base iterator with a transformation function.
Stores type and token counts necessary for estimating a Kneser-Ney language model.
Warning: type counts are stored internally as 32-bit ints.
Class for producing a Kneser-Ney language model in ARPA format from raw text.
Class for producing a Kneser-Ney language model in ARPA format from raw text.
 
Callback that is called for each n-gram in the collection.
This class contains a number of static methods for reading/writing/estimating n-gram language models.
Basic logging singleton class.
Convenience class for stringing together loggers.
Logging interface.
Default logging goes nowhere.
Logs to System.out and System.err.
 
 
Open address hash map with linear probing.
 
 
Open address hash map with linear probing.
Estimates a Kneser-Ney language model from raw text, and writes the language model out in ARPA-format.
Given a language model in ARPA format, builds a binary representation of the language model and writes it to disk.
Given a directory in Google n-grams format, builds a binary representation of a stupid-backoff language model and writes it to disk.
Like MakeLmBinaryFromGoogle, except it only writes the NgramMap portion of the LM, meaning the binary does not contain the vocabulary.
Experimental class for reading Moses phrase tables and storing them efficiently in memory using a trie.
 
 
Class for representing phrase tables efficiently in memory.
Taken/modified from http://d3s.mff.cuni.cz/~holub/sw/javamurmurhash/MurmurHash.java
Wraps an NgramMap as an Iterable, so it is easy to iterate over the n-grams and associated values.
Base interface for an n-gram language model, which exposes only inefficient convenience methods.
 
 
 
Reader callback which adds n-grams to an NgramMap.
Wraps an NgramMap as a Java Map, with n-grams of all orders mixed together.
Callback that is called for each n-gram in the collection.
Wraps an NgramMap as an Iterable, so it is easy to iterate over the n-grams of a particular order.
Wraps an NgramMap as a Java Map, but only n-grams of a particular order.
A generic-typed pair of objects.
 
Stores type and token counts necessary for estimating a Kneser-Ney language model.
 
 
 
 
 
Implementation of a WordIndexer in which words are represented as strings.
 
Language model implementation which uses stupid backoff (Brants et al., 2007) computation.
Class for reading raw text files.
Provides a map from objects to non-negative integers.
 
 
Manages storage of arbitrary values in an NgramMap
 
Enumerates words in the vocabulary of a language model.
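
Several entries above refer to stupid backoff (Brants et al., 2007). As a rough illustration of that scoring rule only, the sketch below counts n-grams from tokenized text and scores an n-gram by relative frequency, backing off to a shorter context with a fixed factor of 0.4 when the full n-gram is unseen. The class StupidBackoffSketch and its methods addSentence and score are hypothetical names invented for this example; they are not part of this library's API and do not reflect how its classes store or encode n-grams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the documented library.
final class StupidBackoffSketch {
    // Backoff factor suggested by Brants et al. (2007).
    static final double ALPHA = 0.4;

    private final Map<List<String>, Long> counts = new HashMap<>();
    private long totalTokens = 0;

    /** Counts every n-gram of order 1..maxOrder in one tokenized sentence. */
    public void addSentence(List<String> tokens, int maxOrder) {
        for (int end = 1; end <= tokens.size(); end++) {
            for (int order = 1; order <= maxOrder && order <= end; order++) {
                // Copy the view so the map key does not depend on the caller's list.
                List<String> ngram = new ArrayList<>(tokens.subList(end - order, end));
                counts.merge(ngram, 1L, Long::sum);
            }
        }
        totalTokens += tokens.size();
    }

    /** Returns the stupid-backoff score (not a normalized probability) of the last word given the preceding words. */
    public double score(List<String> ngram) {
        if (ngram.isEmpty()) {
            throw new IllegalArgumentException("n-gram must contain at least one word");
        }
        double penalty = 1.0;
        List<String> current = ngram;
        while (current.size() > 1) {
            long full = counts.getOrDefault(current, 0L);
            if (full > 0) {
                // Seen n-gram: relative frequency with respect to its context.
                long context = counts.getOrDefault(current.subList(0, current.size() - 1), 0L);
                return penalty * full / context;
            }
            // Unseen n-gram: drop the oldest context word and apply the penalty.
            penalty *= ALPHA;
            current = current.subList(1, current.size());
        }
        if (totalTokens == 0) {
            return 0.0;
        }
        // Unigram base case: relative frequency over all counted tokens.
        return penalty * counts.getOrDefault(current, 0L) / (double) totalTokens;
    }

    public static void main(String[] args) {
        StupidBackoffSketch lm = new StupidBackoffSketch();
        lm.addSentence(Arrays.asList("a", "b", "a", "c"), 2);
        System.out.println(lm.score(Arrays.asList("a", "b"))); // seen bigram
        System.out.println(lm.score(Arrays.asList("b", "c"))); // unseen bigram, backs off to "c"
    }
}
```

The scores are deliberately left unnormalized, which is what distinguishes stupid backoff from Kneser-Ney-style backoff: no discounting or backoff weights need to be estimated, only raw counts.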