it.unimi.dsi.mg4j.io
Class InputBitStream

java.lang.Object
  extended byit.unimi.dsi.mg4j.io.InputBitStream

public class InputBitStream
extends Object

Bit-level input stream.

IMPORTANT: In MG4J 0.6, this class has been completely rewritten. It is much faster, but there are also incompabilities. Mainly, unary representations are one-terminated sequences of zeroes (instead of zero-terminated sequences of ones), and position(long) sets a bit, not a byte offset.

This class wraps any InputStream so that you can treat it as bit stream. Constructors and methods closely resemble those of InputStream. Data can be read from such a stream in several ways: reading an integer or long in fixed-width, unary, γ, δ, ζ and Golomb coding, or reading a number of bits that will be stored in a vector of bytes. There is limited support for mark(int)/reset() operations.

This class can also wrap a byte array; this is much more lightweight than wrapping a FastByteArrayInputStream wrapping the array. Overflowing the array will cause an EOFException.

Note that when reading using a vector of bytes bits are read in the stream format (see OutputBitStream): the first bit is bit 7 of the first byte, the eighth bit is bit 0 of the first byte, the ninth bit is bit 7 of the second byte and so on. When reading integers using some coding, instead, they are stored in the standard way, that is, in the lower bits.

Additional features:

This class is not synchronised. If multiple threads access an instance of this class concurrently, they must be synchronised externally.

Since:
0.1
Author:
Sebastiano Vigna
See Also:
OutputBitStream, java.io

Field Summary
static int DEFAULT_BUFFER_SIZE
          The default size of the byte buffer in bytes (16Ki).
static int UNGET_BUFFER_SIZE
          The size of the unget(boolean) buffer in bytes.
 
Constructor Summary
protected InputBitStream()
          This (non-public) constructor exists just to provide fake initialisation.
  InputBitStream(byte[] a)
          Creates a new input bit stream wrapping a given byte array.
  InputBitStream(File file)
          Creates a new input bit stream reading from a file.
  InputBitStream(File file, int bufSize)
          Creates a new input bit stream reading from a file.
  InputBitStream(InputStream is)
          Creates a new input bit stream wrapping a given input stream using a buffer of size DEFAULT_BUFFER_SIZE.
  InputBitStream(InputStream is, int bufSize)
          Creates a new input bit stream wrapping a given input stream with a specified buffer size.
  InputBitStream(String name)
          Creates a new input bit stream reading from a file.
  InputBitStream(String name, int bufSize)
          Creates a new input bit stream reading from a file.
 
Method Summary
 void align()
          Aligns the stream.
 long available()
          Returns the number of bits that can be read (or skipped over) from this bit stream without blocking by the next caller of a method.
 void close()
          Closes the bit stream.
 void flush()
          Flushes the bit stream.
 void mark(int readLimit)
          Marks the current position in this input stream.
 boolean markSupported()
          Tests if this stream supports the mark(int) and reset() methods.
 boolean overflow()
          Gets the overflow flag.
 void overflow(boolean overflow)
          Sets the overflow flag.
 boolean pastEOF()
          Checks whether we are past EOF.
 void position(long position)
          Sets this stream bit position, if it is based on a RepositionableStream or on a FileChannel.
 void read(byte[] bits, int len)
          Reads a sequence of bits.
 int readBit()
          Reads a bit.
 long readBits()
          Returns the number of bits read from this bit stream.
 void readBits(long readBits)
          Sets the number of bits read from this bit stream.
 int readDelta()
          Reads a natural number in δ coding.
 int readGamma()
          Reads a natural number in γ coding.
 int readGolomb(int b)
          Reads a natural number in Golomb coding.
 int readGolomb(int b, int log2b)
          Reads a natural number in Golomb coding.
 int readInt(int len)
          Reads a fixed number of bits into an integer.
 long readLong(int len)
          Reads a fixed number of bits into a long.
 long readLongDelta()
          Reads a long natural number in δ coding.
 long readLongGamma()
          Reads a long natural number in γ coding.
 long readLongGolomb(long b)
          Reads a long natural number in Golomb coding.
 long readLongGolomb(long b, int log2b)
          Reads a long natural number in Golomb coding.
 long readLongMinimalBinary(long b)
          Reads a long natural number in a limited range using a minimal binary coding.
 long readLongMinimalBinary(long b, int log2b)
          Reads a long natural number in a limited range using a minimal binary coding.
 long readLongNibble()
          Reads a long natural number in variable-length nibble coding.
 long readLongSkewedGolomb(long b)
          Reads a long natural number in skewed Golomb coding.
 long readLongUnary()
          Reads a long natural number in unary coding.
 long readLongZeta(int k)
          Reads a long natural number in ζ coding.
 int readMinimalBinary(int b)
          Reads a natural number in a limited range using a minimal binary coding.
 int readMinimalBinary(int b, int log2b)
          Reads a natural number in a limited range using a minimal binary coding.
 int readNibble()
          Reads a natural number in variable-length nibble coding.
 int readSkewedGolomb(int b)
          Reads a natural number in skewed Golomb coding.
 int readUnary()
          Reads a natural number in unary coding.
 int readZeta(int k)
          Reads a natural number in ζ coding.
 void reset()
          Repositions this bit stream to the position at the time the mark(int) method was last called.
 long skip(long n)
          Skips the given number of bits.
 void unget(boolean bit)
          Deprecated. As of MG4J 0.6, replaced by ungetBit(boolean).
 void unget(int bit)
          Deprecated. As of MG4J 0.6, replaced by ungetBit(int).
 void ungetBit(boolean bit)
          Ungets a bit.
 void ungetBit(int bit)
          Ungets a bit.
 void ungetInt(int x, int len)
          Ungets an integer.
 void ungetLong(long x, int len)
          Ungets a long.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UNGET_BUFFER_SIZE

public static final int UNGET_BUFFER_SIZE
The size of the unget(boolean) buffer in bytes.

See Also:
Constant Field Values

DEFAULT_BUFFER_SIZE

public static final int DEFAULT_BUFFER_SIZE
The default size of the byte buffer in bytes (16Ki).

See Also:
Constant Field Values
Constructor Detail

InputBitStream

protected InputBitStream()
This (non-public) constructor exists just to provide fake initialisation.


InputBitStream

public InputBitStream(InputStream is)
Creates a new input bit stream wrapping a given input stream using a buffer of size DEFAULT_BUFFER_SIZE.

Parameters:
is - the input stream to wrap.

InputBitStream

public InputBitStream(InputStream is,
                      int bufSize)
Creates a new input bit stream wrapping a given input stream with a specified buffer size.

Parameters:
is - the input stream to wrap.
bufSize - the size in byte of the buffer; it may be 0, denoting no buffering.

InputBitStream

public InputBitStream(byte[] a)
Creates a new input bit stream wrapping a given byte array.

Parameters:
a - the byte array to wrap.

InputBitStream

public InputBitStream(String name,
                      int bufSize)
               throws FileNotFoundException
Creates a new input bit stream reading from a file.

Parameters:
name - the name of the file.
bufSize - the size in byte of the buffer; it may be 0, denoting no buffering.

InputBitStream

public InputBitStream(String name)
               throws FileNotFoundException
Creates a new input bit stream reading from a file.

Parameters:
name - the name of the file.

InputBitStream

public InputBitStream(File file)
               throws FileNotFoundException
Creates a new input bit stream reading from a file.

Parameters:
file - the file.

InputBitStream

public InputBitStream(File file,
                      int bufSize)
               throws FileNotFoundException
Creates a new input bit stream reading from a file.

Parameters:
file - the file.
bufSize - the size in byte of the buffer; it may be 0, denoting no buffering.
Method Detail

flush

public void flush()
Flushes the bit stream. All state information associated to the stream is reset. This includes bytes prefetched from the stream, bits in the bit buffer and unget'd bits.

This method is provided so that users of this class can easily wrap repositionable streams (for instance, file-based streams, which can be repositioned using the underlying FileChannel). It is guaranteed that after calling this method the underlying stream can be repositioned, and that the next read will draw data from the stream.


close

public void close()
           throws IOException
Closes the bit stream. All resources associated to the stream are released.

Throws:
IOException

overflow

public void overflow(boolean overflow)
Sets the overflow flag.

If this flag is true, then after the end of the underlying InputStream we will continue returning zeroes (in particular, no exception will be thrown).

Parameters:
overflow - the new value of the flag.
See Also:
pastEOF()

available

public long available()
               throws IOException
Returns the number of bits that can be read (or skipped over) from this bit stream without blocking by the next caller of a method.

Returns:
the number of bits that can be read from this bit stream without blocking.
Throws:
IOException

overflow

public boolean overflow()
Gets the overflow flag.

Returns:
the current value of the flag.
See Also:
overflow(boolean)

pastEOF

public boolean pastEOF()
Checks whether we are past EOF.

Returns:
true if we are reading zeroes past the EOF.
See Also:
overflow(boolean)

readBits

public long readBits()
Returns the number of bits read from this bit stream.

Note that ungetting bits from this stream will result in the number of read bits to be decreased.

Returns:
the number of bits read so far.

readBits

public void readBits(long readBits)
Sets the number of bits read from this bit stream.

This method is provided so that, for instance, the user can reset via readBits(0) the read-bits count after a flush().

Parameters:
readBits - the new value for the number of bits read so far.

align

public void align()
           throws IOException
Aligns the stream. After a call to this function, the stream is byte aligned. Bits that have been read to align are discarded.

Throws:
IOException

ungetBit

public void ungetBit(boolean bit)
              throws IOException
Ungets a bit. The provided bit will be the next one read from the stream.

Parameters:
bit - a bit.
Throws:
IOException

unget

public void unget(boolean bit)
           throws IOException
Deprecated. As of MG4J 0.6, replaced by ungetBit(boolean).

Ungets a bit.

Parameters:
bit - a bit.
Throws:
IOException

ungetBit

public void ungetBit(int bit)
              throws IOException
Ungets a bit.

Parameters:
bit - a bit.
Throws:
IOException
See Also:
unget(boolean)

unget

public void unget(int bit)
           throws IOException
Deprecated. As of MG4J 0.6, replaced by ungetBit(int).

Ungets a bit.

Parameters:
bit - a bit.
Throws:
IOException

ungetLong

public void ungetLong(long x,
                      int len)
               throws IOException
Ungets a long. The provided long will be the next one read from the stream, if the same length is used. This means that the bit of index len-1 of x will be the next one to be read from the stream.

Parameters:
x - a long.
len - the number of (lower) bits to unget from x.
Throws:
IOException

ungetInt

public void ungetInt(int x,
                     int len)
              throws IOException
Ungets an integer. The provided integer will be the next one read from the stream, if the same length is used. This means that the bit of index len-1 of x will be the next one to be read from the stream.

Parameters:
x - an integer.
len - the number of (lower) bits to unget from x.
Throws:
IOException

read

public void read(byte[] bits,
                 int len)
          throws IOException
Reads a sequence of bits. Bits will be read in the natural way: the first bit is bit 7 of the first byte, the eightth bit is bit 0 of the first byte, the ninth bit is bit 7 of the second byte and so on.

Parameters:
bits - an array of bytes to store the result.
len - the number of bits to read.
Throws:
IOException

readBit

public int readBit()
            throws IOException
Reads a bit.

Returns:
the next bit from the stream.
Throws:
IOException

readInt

public int readInt(int len)
            throws IOException
Reads a fixed number of bits into an integer.

Parameters:
len - a bit length.
Returns:
an integer whose lower len bits are taken from the stream; the rest is zeroed.
Throws:
IOException

readLong

public long readLong(int len)
              throws IOException
Reads a fixed number of bits into a long.

Parameters:
len - a bit length.
Returns:
a long whose lower len bits are taken from the stream; the rest is zeroed.
Throws:
IOException

skip

public long skip(long n)
          throws IOException
Skips the given number of bits.

Parameters:
n - the number of bits to skip.
Returns:
the actual number of skipped bits.
Throws:
IOException

position

public void position(long position)
              throws IOException
Sets this stream bit position, if it is based on a RepositionableStream or on a FileChannel.

Given an underlying stream that implements RepositionableStream or that can provide a FileChannel via the getChannel() method, a call to this method has the same semantics of a flush(), followed by a call to position(position / 8) on the byte stream, followed by a skip(position % 8).

Parameters:
position - the new position expressed as a bit offset.
Throws:
UnsupportedOperationException - if the underlying byte stream does not implement RepositionableStream and if the channel it returns is not a FileChannel.
IOException
See Also:
FileChannel.position(long)

markSupported

public boolean markSupported()
Tests if this stream supports the mark(int) and reset() methods.

This method will just delegate the test to the underlying InputStream.

Returns:
whether this stream supports mark(int)/reset().

mark

public void mark(int readLimit)
          throws IOException
Marks the current position in this input stream. A subsequent call to the reset() method repositions this stream at the last marked position so that subsequent reads re-read the same bits.

This method will just delegate the mark to the underlying InputStream. Moreover, it will throw an exception if you try to mark outsite byte boundaries.

Parameters:
readLimit - the maximum limit of bytes that can be read before the mark position becomes invalid.
Throws:
IOException - if you try to mark outside byte boundaries.

reset

public void reset()
           throws IOException
Repositions this bit stream to the position at the time the mark(int) method was last called.

This method will just flush the stream and delegate the reset to the underlying InputStream.

Throws:
IOException

readUnary

public int readUnary()
              throws IOException
Reads a natural number in unary coding. Note that by unary coding we mean that 1 encodes 0, 01 encodes 1 and so on.

Returns:
the next unary-encoded natural number.
Throws:
IOException

readLongUnary

public long readLongUnary()
                   throws IOException
Reads a long natural number in unary coding. Note that by unary coding we mean that 1 encodes 0, 01 encodes 1 and so on.

Returns:
the next unary-encoded long natural number.
Throws:
IOException

readGamma

public int readGamma()
              throws IOException
Reads a natural number in γ coding.

Returns:
the next γ-encoded natural number.
Throws:
IOException

readLongGamma

public long readLongGamma()
                   throws IOException
Reads a long natural number in γ coding.

Returns:
the next γ-encoded long natural number.
Throws:
IOException

readDelta

public int readDelta()
              throws IOException
Reads a natural number in δ coding.

Returns:
the next δ-encoded natural number.
Throws:
IOException

readLongDelta

public long readLongDelta()
                   throws IOException
Reads a long natural number in δ coding.

Returns:
the next δ-encoded long natural number.
Throws:
IOException

readMinimalBinary

public int readMinimalBinary(int b)
                      throws IOException
Reads a natural number in a limited range using a minimal binary coding.

Parameters:
b - a strict upper bound.
Returns:
the next minimally binary encoded natural number.
Throws:
IllegalArgumentException - if you try to read a negative number or use a nonpositive base.
IOException

readMinimalBinary

public int readMinimalBinary(int b,
                             int log2b)
                      throws IOException
Reads a natural number in a limited range using a minimal binary coding. This method is faster than readMinimalBinary(int) because it does not have to compute log2b.

Parameters:
b - a strict upper bound.
log2b - the floor of the base-2 logarithm of the bound.
Returns:
the next minimally binary encoded natural number.
Throws:
IllegalArgumentException - if you try to read a negative number or use a nonpositive base.
IOException

readLongMinimalBinary

public long readLongMinimalBinary(long b)
                           throws IOException
Reads a long natural number in a limited range using a minimal binary coding.

Parameters:
b - a strict upper bound.
Returns:
the next minimally binary encoded long natural number.
Throws:
IllegalArgumentException - if you try to read a negative number or use a nonpositive base.
IOException

readLongMinimalBinary

public long readLongMinimalBinary(long b,
                                  int log2b)
                           throws IOException
Reads a long natural number in a limited range using a minimal binary coding. This method is faster than readLongMinimalBinary(long) because it does not have to compute log2b.

Parameters:
b - a strict upper bound.
log2b - the floor of the base-2 logarithm of the bound.
Returns:
the next minimally binary encoded long natural number.
Throws:
IllegalArgumentException - if you try to read a negative number or use a nonpositive base.
IOException

readGolomb

public int readGolomb(int b)
               throws IOException
Reads a natural number in Golomb coding.

This method implements also the case in which b is 0: in this case, nothing will be read, and 0 will be returned.

Parameters:
b - the modulus for the coding.
Returns:
the next Golomb-encoded natural number.
Throws:
IllegalArgumentException - if you use a nonpositive modulus.
IOException

readGolomb

public int readGolomb(int b,
                      int log2b)
               throws IOException
Reads a natural number in Golomb coding. This method is faster than readGolomb(int) because it does not have to compute log2b.

This method implements also the case in which b is 0: in this case, nothing will be read, and 0 will be returned.

Parameters:
b - the modulus for the coding.
log2b - the floor of the base-2 logarithm of the coding modulus.
Returns:
the next Golomb-encoded natural number.
Throws:
IllegalArgumentException - if you use a nonpositive modulus.
IOException

readLongGolomb

public long readLongGolomb(long b)
                    throws IOException
Reads a long natural number in Golomb coding.

This method implements also the case in which b is 0: in this case, nothing will be read, and 0 will be returned.

Parameters:
b - the modulus for the coding.
Returns:
the next Golomb-encoded long natural number.
Throws:
IllegalArgumentException - if you use a nonpositive modulus.
IOException

readLongGolomb

public long readLongGolomb(long b,
                           int log2b)
                    throws IOException
Reads a long natural number in Golomb coding. This method is faster than readLongGolomb(long) because it does not have to compute log2b.

This method implements also the case in which b is 0: in this case, nothing will be read, and 0 will be returned.

Parameters:
b - the modulus for the coding.
log2b - the floor of the base-2 logarithm of the coding modulus.
Returns:
the next Golomb-encoded long natural number.
Throws:
IllegalArgumentException - if you use a nonpositive modulus.
IOException

readSkewedGolomb

public int readSkewedGolomb(int b)
                     throws IOException
Reads a natural number in skewed Golomb coding.

This method implements also the case in which b is 0: in this case, nothing will be read, and 0 will be returned.

Parameters:
b - the modulus for the coding.
Returns:
the next skewed Golomb-encoded natural number.
Throws:
IllegalArgumentException - if you use a negative modulus.
IOException

readLongSkewedGolomb

public long readLongSkewedGolomb(long b)
                          throws IOException
Reads a long natural number in skewed Golomb coding.

This method implements also the case in which b is 0: in this case, nothing will be read, and 0 will be returned.

Parameters:
b - the modulus for the coding.
Returns:
the next skewed Golomb-encoded long natural number.
Throws:
IllegalArgumentException - if you use a negative modulus.
IOException

readZeta

public int readZeta(int k)
             throws IOException
Reads a natural number in ζ coding.

Parameters:
k - the shrinking factor.
Returns:
the next ζ-encoded natural number.
Throws:
IllegalArgumentException - if you use a nonpositive shrinking factor.
IOException

readLongZeta

public long readLongZeta(int k)
                  throws IOException
Reads a long natural number in ζ coding.

Parameters:
k - the shrinking factor.
Returns:
the next ζ-encoded long natural number.
Throws:
IllegalArgumentException - if you use a nonpositive shrinking factor.
IOException

readNibble

public int readNibble()
               throws IOException
Reads a natural number in variable-length nibble coding.

Returns:
the next variable-length nibble-encoded natural number.
Throws:
IOException

readLongNibble

public long readLongNibble()
                    throws IOException
Reads a long natural number in variable-length nibble coding.

Returns:
the next variable-length nibble-encoded long natural number.
Throws:
IOException