Class FileSystem.Statistics

java.lang.Object
org.apache.hadoop.fs.FileSystem.Statistics
Enclosing class:
FileSystem

public static final class FileSystem.Statistics extends Object
Tracks statistics about how many reads, writes, and other operations have been performed on a FileSystem. Because there is only one of these objects per FileSystem, many threads will typically write to it, and almost every operation on an open file involves such a write. In contrast, most programs read statistics infrequently, and some never read them at all. The class is therefore optimized for writes: each thread writes to its own thread-local area of memory, which removes contention and allows scaling to a large number of threads. To read statistics, the reader thread totals the contents of all the thread-local data areas.
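The write-optimized design described above can be illustrated with a minimal sketch. This is not Hadoop's implementation: the class name PerThreadCounter, the nested Cell type, and the single bytesRead field are hypothetical and chosen only for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of per-thread counters; not the Hadoop source.
final class PerThreadCounter {
    // One cell per thread. Only the owning thread ever writes to it;
    // volatile makes the latest value visible to reader threads.
    static final class Cell {
        volatile long bytesRead;
    }

    // Every cell ever created, so that a reader can total them.
    private final Set<Cell> allCells = ConcurrentHashMap.newKeySet();

    // Lazily creates and registers the calling thread's cell.
    private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
        Cell cell = new Cell();
        allCells.add(cell);
        return cell;
    });

    // Write path: touches only the caller's own cell, so threads do not contend.
    void incrementBytesRead(long newBytes) {
        Cell cell = local.get();
        cell.bytesRead = cell.bytesRead + newBytes; // single writer per cell
    }

    // Read path: sums every thread's cell. Slower, but expected to be rare.
    long getBytesRead() {
        long total = 0;
        for (Cell cell : allCells) {
            total += cell.bytesRead;
        }
        return total;
    }
}
```

In this sketch the cells of finished threads stay registered forever; a production implementation would need some way to fold them into a shared total and release them.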
  • Constructor Details

    • Statistics

      public Statistics(String scheme)
    • Statistics

      public Statistics(FileSystem.Statistics other)
      Copy constructor.
      Parameters:
      other - The input Statistics object which is cloned.
  • Method Details

    • getThreadStatistics

      public FileSystem.Statistics.StatisticsData getThreadStatistics()
      Get or create the thread-local data associated with the current thread.
      Returns:
      statistics data.
    • incrementBytesRead

      public void incrementBytesRead(long newBytes)
      Increment the bytes read in the statistics.
      Parameters:
      newBytes - the additional bytes read
    • incrementBytesWritten

      public void incrementBytesWritten(long newBytes)
      Increment the bytes written in the statistics.
      Parameters:
      newBytes - the additional bytes written
    • incrementReadOps

      public void incrementReadOps(int count)
      Increment the number of read operations.
      Parameters:
      count - number of read operations
    • incrementLargeReadOps

      public void incrementLargeReadOps(int count)
      Increment the number of large read operations.
      Parameters:
      count - number of large read operations
    • incrementWriteOps

      public void incrementWriteOps(int count)
      Increment the number of write operations.
      Parameters:
      count - number of write operations
    • incrementBytesReadErasureCoded

      public void incrementBytesReadErasureCoded(long newBytes)
      Increment the bytes read on erasure-coded files in the statistics.
      Parameters:
      newBytes - the additional bytes read
    • incrementBytesReadByDistance

      public void incrementBytesReadByDistance(int distance, long newBytes)
      Increment the bytes read by the network distance in the statistics. In the common network topology setup, the distance value should be an even number such as 0, 2, 4, 6. To make it more general, distances are grouped as {1, 2}, {3, 4} and {5 and beyond} for accounting.
      Parameters:
      distance - the network distance
      newBytes - the additional bytes read
    • increaseRemoteReadTime

      public void increaseRemoteReadTime(long durationMS)
      Increment the time taken to read bytes from remote in the statistics.
      Parameters:
      durationMS - time taken in ms to read bytes from remote
    • getBytesRead

      public long getBytesRead()
      Get the total number of bytes read.
      Returns:
      the number of bytes
    • getBytesWritten

      public long getBytesWritten()
      Get the total number of bytes written.
      Returns:
      the number of bytes
    • getReadOps

      public int getReadOps()
      Get the number of file system read operations such as list files.
      Returns:
      number of read operations
    • getLargeReadOps

      public int getLargeReadOps()
      Get the number of large file system read operations such as list files under a large directory.
      Returns:
      number of large read operations
    • getWriteOps

      public int getWriteOps()
      Get the number of file system write operations such as create, append, rename, etc.
      Returns:
      number of write operations
    • getBytesReadByDistance

      public long getBytesReadByDistance(int distance)
      In the common network topology setup, the distance value should be an even number such as 0, 2, 4, 6. To make it more general, distances are grouped as {1, 2}, {3, 4} and {5 and beyond} for accounting. So if the caller asks for the bytes read for distance 2, the function returns the value for group {1, 2}.
      Parameters:
      distance - the network distance
      Returns:
      the total number of bytes read by the network distance
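The distance grouping used by incrementBytesReadByDistance and getBytesReadByDistance can be sketched as follows. The class DistanceBuckets and its method names are hypothetical. The text above does not say how distance 0 is accounted for, so giving it a bucket of its own is an assumption of this sketch. The sketch is also not thread-safe.

```java
// Hypothetical illustration of the distance grouping; not the Hadoop source.
final class DistanceBuckets {
    // Index 0: distance 0 (assumed to have its own bucket),
    // index 1: {1, 2}, index 2: {3, 4}, index 3: {5 and beyond}.
    private final long[] bytesByGroup = new long[4];

    // Maps a network distance to the index of its accounting group.
    static int groupIndex(int distance) {
        if (distance <= 0) {
            return 0; // assumption: distance 0 is tracked separately
        }
        if (distance <= 2) {
            return 1;
        }
        if (distance <= 4) {
            return 2;
        }
        return 3;
    }

    // Adds newBytes to the group that contains the given distance.
    void increment(int distance, long newBytes) {
        bytesByGroup[groupIndex(distance)] += newBytes;
    }

    // Returns the total for the whole group that contains the given distance.
    long get(int distance) {
        return bytesByGroup[groupIndex(distance)];
    }
}
```

With this grouping, bytes recorded at distance 1 and at distance 2 land in the same bucket, so a query for either distance returns their combined total.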
    • getRemoteReadTime

      public long getRemoteReadTime()
      Get total time taken in ms for bytes read from remote.
      Returns:
      time taken in ms for remote bytes read.
    • getData

      public FileSystem.Statistics.StatisticsData getData()
      Get all statistics data. MR or other frameworks can use the method to get all statistics at once.
      Returns:
      the StatisticsData
    • getBytesReadErasureCoded

      public long getBytesReadErasureCoded()
      Get the total number of bytes read on erasure-coded files.
      Returns:
      the number of bytes
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • reset

      public void reset()
      Resets all statistics to 0. In order to reset, we add up all the thread-local statistics data, and set rootData to the negative of that. This may seem like a counterintuitive way to reset the statistics. Why can't we just zero out all the thread-local data? Well, thread-local data can only be modified by the thread that owns it. If we tried to modify the thread-local data from this thread, our modification might get interleaved with a read-modify-write operation done by the thread that owns the data. That would result in our update getting lost. The approach used here avoids this problem because it only ever reads (not writes) the thread-local data. Both reads and writes to rootData are done under the lock, so we're free to modify rootData from any thread that holds the lock.
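The reset strategy described above can be sketched under simplifying assumptions: a single counter, and hypothetical names (ResettableCounter, Cell, rootValue) that stand in for the real thread-local data and rootData. It is an illustration of the idea, not the Hadoop implementation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of "reset by negation"; not the Hadoop source.
final class ResettableCounter {
    // Per-thread cell, written only by the thread that owns it.
    static final class Cell {
        volatile long value;
    }

    private final Set<Cell> cells = ConcurrentHashMap.newKeySet();
    private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
        Cell cell = new Cell();
        cells.add(cell);
        return cell;
    });

    // Stand-in for rootData: read and written only while holding the lock.
    private long rootValue;

    // Write path: updates the caller's own cell without taking the lock.
    void increment(long n) {
        Cell cell = local.get();
        cell.value = cell.value + n;
    }

    // Total = root value plus the sum of all thread-local cells.
    synchronized long get() {
        long total = rootValue;
        for (Cell cell : cells) {
            total += cell.value;
        }
        return total;
    }

    // Reset only reads the cells, then sets the root value to the negative
    // of their sum so that the visible total becomes zero.
    synchronized void reset() {
        long sum = 0;
        for (Cell cell : cells) {
            sum += cell.value;
        }
        rootValue = -sum;
    }
}
```

Because reset() never writes to a cell, it cannot interleave with an owner thread's read-modify-write and cause a lost update; increments that arrive after the reset simply push the total above zero again.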
    • getScheme

      public String getScheme()
      Get the URI scheme associated with this statistics object.
      Returns:
      the scheme associated with this set of statistics