Class OutlierDetector
java.lang.Object
org.apache.hadoop.hdfs.server.datanode.metrics.OutlierDetector
A utility class to help detect resources (nodes/disks) whose aggregate
latency is an outlier within a given set.
We use the median absolute deviation for outlier detection as
described in the following publication:
Leys, C., et al., Detecting outliers: Do not use standard deviation
around the mean, use absolute deviation around the median.
http://dx.doi.org/10.1016/j.jesp.2013.03.013
We augment the above scheme with the following heuristics to be even
more conservative:
1. Skip outlier detection if the sample size is too small.
2. Never flag resources whose aggregate latency is below a low threshold.
3. Never flag resources whose aggregate latency is less than a small
multiple of the median.
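The scheme above can be sketched in a few lines of plain Java. This is a minimal illustration, not the Hadoop implementation: the class name `MadOutlierSketch`, its method names, the three constants, and the choice to combine the heuristics by taking the largest of the three limits are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; all names and constants here are hypothetical. */
class MadOutlierSketch {
  // Assumed values, chosen for illustration.
  static final double MAD_SCALE = 1.4826;        // scaling constant used with MAD for normal data
  static final double DEVIATION_MULTIPLIER = 3;  // flag beyond median + 3 * scaled MAD
  static final double MEDIAN_MULTIPLIER = 3;     // heuristic 3: must also exceed 3 * median

  /** Median of an already sorted, non-empty list. */
  static double median(List<Double> sorted) {
    int n = sorted.size();
    if (n == 0) {
      throw new IllegalArgumentException("empty list");
    }
    return (n % 2 == 1)
        ? sorted.get(n / 2)
        : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
  }

  /** Unscaled median absolute deviation of an already sorted list. */
  static double mad(List<Double> sorted) {
    double med = median(sorted);
    List<Double> deviations = new ArrayList<>();
    for (double v : sorted) {
      deviations.add(Math.abs(v - med));
    }
    Collections.sort(deviations);
    return median(deviations);
  }

  /** Returns the keys whose value exceeds every one of the three limits. */
  static Set<String> findOutliers(Map<String, Double> stats,
                                  long minNumResources, double lowThresholdMs) {
    Set<String> outliers = new HashSet<>();
    if (stats.size() < minNumResources) {
      return outliers;                            // heuristic 1: sample too small
    }
    List<Double> sorted = new ArrayList<>(stats.values());
    Collections.sort(sorted);
    double med = median(sorted);
    double madLimit = med + DEVIATION_MULTIPLIER * MAD_SCALE * mad(sorted);
    double upperLimit = Math.max(lowThresholdMs,  // heuristic 2: low threshold
        Math.max(med * MEDIAN_MULTIPLIER,         // heuristic 3: multiple of median
                 madLimit));                      // MAD-based limit
    for (Map.Entry<String, Double> e : stats.entrySet()) {
      if (e.getValue() > upperLimit) {
        outliers.add(e.getKey());
      }
    }
    return outliers;
  }

  public static void main(String[] args) {
    Map<String, Double> stats = Map.of(
        "disk1", 10.0, "disk2", 11.0, "disk3", 12.0, "disk4", 13.0, "disk5", 100.0);
    System.out.println(findOutliers(stats, 5, 5.0)); // prints [disk5]
  }
}
```

In this example the median is 12 and the unscaled MAD is 1, so the MAD-based limit (about 16.4) is lower than the median-multiple limit (36); the latter decides the outcome, which is how the extra heuristics make the check more conservative.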
Field Summary
static final org.slf4j.Logger LOG
Constructor Summary
OutlierDetector(long minNumResources, long lowThresholdMs)
Method Summary
static Double computeMad(List<Double> sortedValues)
    Compute the Median Absolute Deviation of a sorted list.
static Double computeMedian(List<Double> sortedValues)
    Compute the median of a sorted list.
long getLowThresholdMs()
long getMinOutlierDetectionNodes()
Map<String,org.apache.hadoop.hdfs.server.protocol.OutlierMetrics> getOutlierMetrics(Map<String, Double> stats)
    Return a set of nodes whose latency is much higher than their counterparts.
getOutliers(Map<String, Double> stats)
    Return a set of nodes/disks whose latency is much higher than their counterparts.
void setLowThresholdMs(long thresholdMs)
void setMinNumResources(long minNodes)
Field Details
LOG
public static final org.slf4j.Logger LOG
Constructor Details
OutlierDetector
public OutlierDetector(long minNumResources, long lowThresholdMs)
Method Details
getOutliers
Return a set of nodes/disks whose latency is much higher than their counterparts.
The input is a map of (resource -> aggregate latency) entries. The aggregate may be an arithmetic mean or a percentile, e.g. the 90th percentile. Percentiles are a better choice than median since latency is usually not a normal distribution.
This method allocates O(n) temporary memory and has run time O(n log n), where n = stats.size().
getOutlierMetrics
public Map<String,org.apache.hadoop.hdfs.server.protocol.OutlierMetrics> getOutlierMetrics(Map<String, Double> stats)
Return a set of nodes whose latency is much higher than their counterparts.
The input is a map of (resource -> aggregate latency) entries. The aggregate may be an arithmetic mean or a percentile, e.g. the 90th percentile. Percentiles are a better choice than median since latency is usually not a normal distribution.
Parameters:
stats - map of aggregate latency entries.
Returns:
map of outlier nodes to outlier metrics.
computeMad
public static Double computeMad(List<Double> sortedValues)
Compute the Median Absolute Deviation of a sorted list.
computeMedian
public static Double computeMedian(List<Double> sortedValues)
Compute the median of a sorted list.
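For reference, the two statistics named by computeMedian and computeMad can be written out as follows. This is the textbook formulation, with the values sorted as x_(1) <= ... <= x_(n); averaging the two middle values for an even-sized list is the usual convention and is assumed here rather than taken from this page.

```latex
\[
\operatorname{median}(x) =
\begin{cases}
  x_{(k+1)} & \text{if } n = 2k + 1, \\[4pt]
  \tfrac{1}{2}\bigl(x_{(k)} + x_{(k+1)}\bigr) & \text{if } n = 2k,
\end{cases}
\qquad
\operatorname{MAD}(x) = \operatorname{median}_{i}\,\bigl|\,x_i - \operatorname{median}(x)\,\bigr|
\]
% Worked example: x = (10, 11, 12, 13, 100)
%   median(x)            = 12
%   absolute deviations  = (2, 1, 0, 1, 88), sorted: (0, 1, 1, 2, 88)
%   MAD(x)               = 1
```

Leys et al. multiply the MAD by a constant (1.4826 under an assumption of normally distributed data) before using it as a spread estimate; this page does not say whether computeMad returns the scaled or the unscaled value.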
setMinNumResources
public void setMinNumResources(long minNodes)
getMinOutlierDetectionNodes
public long getMinOutlierDetectionNodes()
setLowThresholdMs
public void setLowThresholdMs(long thresholdMs)
getLowThresholdMs
public long getLowThresholdMs()