Class Bzip2Compressor

java.lang.Object
org.apache.hadoop.io.compress.bzip2.Bzip2Compressor
All Implemented Interfaces:
Compressor

public class Bzip2Compressor extends Object implements Compressor
A Compressor based on the popular bzip2 compression algorithm. http://www.bzip2.org/
  • Constructor Summary

    Constructors
    Constructor
    Description
    Bzip2Compressor()
    Creates a new compressor with default values for the compression block size and work factor.
    Bzip2Compressor(int blockSize, int workFactor, int directBufferSize)
    Creates a new compressor using the specified block size, work factor, and direct buffer size.
    Bzip2Compressor(Configuration conf)
    Creates a new compressor, taking settings from the configuration.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    compress(byte[] b, int off, int len)
    Fills specified buffer with compressed data.
    void
    end()
    Closes the compressor and discards any unprocessed input.
    void
    finish()
    When called, indicates that compression should end with the current contents of the input buffer.
    boolean
    finished()
    Returns true if the end of the compressed data output stream has been reached.
    long
    getBytesRead()
    Returns the total number of uncompressed bytes input so far.
    long
    getBytesWritten()
    Returns the total number of compressed bytes output so far.
    static String
    getLibraryName()
    boolean
    needsInput()
    Returns true if the input data buffer is empty and setInput() should be called to provide more input.
    void
    reinit(Configuration conf)
    Prepares the compressor to be used in a new stream with settings defined in the given Configuration.
    void
    reset()
    Resets compressor so that a new set of input data can be processed.
    setDictionary(byte[] b, int off, int len)
    Sets preset dictionary for compression.
    void
    setInput(byte[] b, int off, int len)
    Sets input data for compression.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Bzip2Compressor

      public Bzip2Compressor()
      Creates a new compressor with default values for the compression block size and work factor. Compressed data will be generated in bzip2 format.
    • Bzip2Compressor

      public Bzip2Compressor(Configuration conf)
      Creates a new compressor, taking settings from the configuration.
      Parameters:
      conf - configuration.
    • Bzip2Compressor

      public Bzip2Compressor(int blockSize, int workFactor, int directBufferSize)
      Creates a new compressor using the specified block size, work factor, and direct buffer size. Compressed data will be generated in bzip2 format.
      Parameters:
      blockSize - The block size to be used for compression. This is an integer from 1 through 9, which is multiplied by 100,000 to obtain the actual block size in bytes.
      workFactor - This parameter is a threshold that determines when a fallback algorithm is used for pathological data. It ranges from 0 to 250.
      directBufferSize - Size of the direct buffer to be used.
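
      The documented ranges for blockSize and workFactor can be checked before a compressor is constructed. The following is a minimal sketch, assuming a hypothetical helper class Bzip2ParamsSketch that is not part of Hadoop. It only encodes the ranges and the 100,000-byte multiplier stated above; whether the constructor itself rejects out-of-range values is not covered here.

```java
// Hypothetical helper, not part of Hadoop: encodes the parameter ranges documented above.
final class Bzip2ParamsSketch {
    // Each blockSize unit corresponds to 100,000 bytes.
    static final int BYTES_PER_BLOCK_UNIT = 100_000;

    /** Converts a blockSize setting (1 through 9) to the actual block size in bytes. */
    static int blockSizeBytes(int blockSize) {
        if (blockSize < 1 || blockSize > 9) {
            throw new IllegalArgumentException("blockSize must be 1..9, got " + blockSize);
        }
        return blockSize * BYTES_PER_BLOCK_UNIT;
    }

    /** Returns workFactor unchanged if it lies in the documented range 0..250. */
    static int checkWorkFactor(int workFactor) {
        if (workFactor < 0 || workFactor > 250) {
            throw new IllegalArgumentException("workFactor must be 0..250, got " + workFactor);
        }
        return workFactor;
    }

    public static void main(String[] args) {
        System.out.println(blockSizeBytes(9)); // prints 900000
        System.out.println(blockSizeBytes(1)); // prints 100000
    }
}
```

      A caller could run both checks on user-supplied values and pass them to the three-argument constructor only if neither throws.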
  • Method Details

    • reinit

      public void reinit(Configuration conf)
      Prepares the compressor to be used in a new stream with settings defined in the given Configuration. It will reset the compressor's block size and work factor.
      Specified by:
      reinit in interface Compressor
      Parameters:
      conf - Configuration storing new settings
    • setInput

      public void setInput(byte[] b, int off, int len)
      Description copied from interface: Compressor
      Sets input data for compression. This should be called whenever needsInput() returns true, indicating that more input data is required.
      Specified by:
      setInput in interface Compressor
      Parameters:
      b - Input data
      off - Start offset
      len - Length
    • setDictionary

      public void setDictionary(byte[] b, int off, int len)
      Description copied from interface: Compressor
      Sets preset dictionary for compression. A preset dictionary is used when the history buffer can be predetermined.
      Specified by:
      setDictionary in interface Compressor
      Parameters:
      b - Dictionary data bytes
      off - Start offset
      len - Length
    • needsInput

      public boolean needsInput()
      Description copied from interface: Compressor
      Returns true if the input data buffer is empty and setInput() should be called to provide more input.
      Specified by:
      needsInput in interface Compressor
      Returns:
      true if the input data buffer is empty and setInput() should be called in order to provide more input.
    • finish

      public void finish()
      Description copied from interface: Compressor
      When called, indicates that compression should end with the current contents of the input buffer.
      Specified by:
      finish in interface Compressor
    • finished

      public boolean finished()
      Description copied from interface: Compressor
      Returns true if the end of the compressed data output stream has been reached.
      Specified by:
      finished in interface Compressor
      Returns:
      true if the end of the compressed data output stream has been reached.
    • compress

      public int compress(byte[] b, int off, int len) throws IOException
      Description copied from interface: Compressor
      Fills the specified buffer with compressed data and returns the actual number of bytes of compressed data. A return value of 0 indicates that needsInput() should be called in order to determine if more input data is required.
      Specified by:
      compress in interface Compressor
      Parameters:
      b - Buffer for the compressed data
      off - Start offset of the data
      len - Size of the buffer
      Returns:
      The actual number of bytes of compressed data.
      Throws:
      IOException - raised on errors performing I/O.
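
      The call order implied by setInput, finish, finished, and compress can be sketched with the JDK alone. This example is an assumption-laden stand-in: it uses java.util.zip.Deflater, which is not bzip2 and not part of Hadoop, only because its methods follow the same shape, with deflate playing the role of compress. The class name CompressLoopSketch and the 256-byte buffer are illustrative choices.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Stand-in sketch: Deflater (DEFLATE, not bzip2) mirrors the setInput/finish/finished call order.
final class CompressLoopSketch {

    /** Compresses the whole input: set input, signal the end, then drain until finished. */
    static byte[] compressAll(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input, 0, input.length); // provide the input data
            deflater.finish();                         // no more input will follow
            byte[] buf = new byte[256];
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (!deflater.finished()) {
                int n = deflater.deflate(buf, 0, buf.length); // fills buf with compressed bytes
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    /** Inverse of compressAll, used here only to check the round trip. */
    static byte[] decompressAll(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed, 0, compressed.length);
            byte[] buf = new byte[256];
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (!inflater.finished()) {
                int n = inflater.inflate(buf, 0, buf.length);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or dictionary-dependent stream; stop rather than spin
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }
}
```

      With a Compressor implementation the loop body would call compress(buf, 0, buf.length) instead of deflate, and a return value of 0 before finished() is true would be the cue to check needsInput().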
    • getBytesWritten

      public long getBytesWritten()
      Returns the total number of compressed bytes output so far.
      Specified by:
      getBytesWritten in interface Compressor
      Returns:
      the total (non-negative) number of compressed bytes output so far
    • getBytesRead

      public long getBytesRead()
      Returns the total number of uncompressed bytes input so far.
      Specified by:
      getBytesRead in interface Compressor
      Returns:
      the total (non-negative) number of uncompressed bytes input so far
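
      The two counters can be combined into a compression ratio. A minimal sketch, assuming a hypothetical helper CompressionRatioSketch (not part of Hadoop) and made-up sample values; the ratio is defined here as compressed bytes divided by uncompressed bytes.

```java
// Hypothetical helper, not part of Hadoop: derives a ratio from the two byte counters.
final class CompressionRatioSketch {

    /**
     * Returns bytesWritten / bytesRead, or 0.0 when nothing has been read yet.
     * Both arguments are expected to be non-negative, as the counters are documented to be.
     */
    static double ratio(long bytesRead, long bytesWritten) {
        if (bytesRead < 0 || bytesWritten < 0) {
            throw new IllegalArgumentException("counters must be non-negative");
        }
        if (bytesRead == 0) {
            return 0.0; // avoid division by zero before any input is consumed
        }
        return (double) bytesWritten / (double) bytesRead;
    }

    public static void main(String[] args) {
        // Sample values are illustrative only, not measurements.
        System.out.println(ratio(1000L, 250L)); // prints 0.25
    }
}
```

      In practice the arguments would come from getBytesRead() and getBytesWritten() on the same compressor instance.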
    • reset

      public void reset()
      Description copied from interface: Compressor
      Resets compressor so that a new set of input data can be processed.
      Specified by:
      reset in interface Compressor
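
      Reusing one instance for several independent inputs follows a compress, reset, compress pattern. The sketch below again uses java.util.zip.Deflater as a JDK stand-in with a same-named reset() method; it is not bzip2 and not Hadoop code, and the class name ResetReuseSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Stand-in sketch: one Deflater instance is reused for several inputs via reset().
final class ResetReuseSketch {

    /** Feeds one input to the deflater and drains all compressed output for it. */
    static byte[] drain(Deflater deflater, byte[] input) {
        deflater.setInput(input, 0, input.length);
        deflater.finish();
        byte[] buf = new byte[128];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (!deflater.finished()) {
            int n = deflater.deflate(buf, 0, buf.length);
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Compresses each input independently with a single instance, resetting in between. */
    static byte[][] compressEach(byte[][] inputs) {
        Deflater deflater = new Deflater();
        try {
            byte[][] results = new byte[inputs.length][];
            for (int i = 0; i < inputs.length; i++) {
                results[i] = drain(deflater, inputs[i]);
                deflater.reset(); // make the instance ready for a new set of input data
            }
            return results;
        } finally {
            deflater.end(); // called once, after the last input
        }
    }
}
```

      Calling reset() between inputs avoids allocating a new instance per input, while end() is reserved for when the instance is no longer needed.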
    • end

      public void end()
      Description copied from interface: Compressor
      Closes the compressor and discards any unprocessed input.
      Specified by:
      end in interface Compressor
    • getLibraryName

      public static String getLibraryName()