Class Sets

java.lang.Object
org.apache.hadoop.util.Sets

@Private public final class Sets extends Object
Static utility methods pertaining to Set instances. This class is Hadoop's internal-use alternative to Guava's Sets utility class. The Javadocs for the majority of the APIs in this class are taken from the Sets class in Guava release 27.0-jre.
  • Method Details

    • newHashSet

      public static <E> HashSet<E> newHashSet()
      Creates a mutable, initially empty HashSet instance.

      Note: if mutability is not required, use ImmutableSet#of() instead. If E is an Enum type, use EnumSet.noneOf(java.lang.Class<E>) instead. Otherwise, strongly consider using a LinkedHashSet instead, at the cost of increased memory footprint, to get deterministic iteration behavior.

      Type Parameters:
      E - Generics Type E.
      Returns:
      a new, empty HashSet
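The iteration-order trade-off mentioned in the note can be seen with plain JDK collections. This is a minimal sketch only; the class and method names (IterationOrderSketch, iterationOrder) are hypothetical and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; it uses only JDK collections, not Hadoop's Sets.
final class IterationOrderSketch {

  /** Adds the names to the target set, then returns them in iteration order. */
  static List<String> iterationOrder(Set<String> target, String... names) {
    for (String name : names) {
      target.add(name);
    }
    return new ArrayList<>(target);
  }

  public static void main(String[] args) {
    // LinkedHashSet iterates in insertion order.
    System.out.println(
        iterationOrder(new LinkedHashSet<String>(), "zeta", "alpha", "mid"));
    // HashSet makes no ordering promise; the order depends on hashing details.
    System.out.println(
        iterationOrder(new HashSet<String>(), "zeta", "alpha", "mid"));
  }
}
```

The LinkedHashSet call always yields zeta, alpha, mid in that order, while the HashSet order should not be relied on.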
    • newTreeSet

      public static <E extends Comparable> TreeSet<E> newTreeSet()
      Creates a mutable, empty TreeSet instance sorted by the natural sort ordering of its elements.

      Note: if mutability is not required, use ImmutableSortedSet#of() instead.

      Type Parameters:
      E - Generics Type E.
      Returns:
      a new, empty TreeSet
    • newHashSet

      @SafeVarargs public static <E> HashSet<E> newHashSet(E... elements)
      Creates a mutable HashSet instance initially containing the given elements.

      Note: if elements are non-null and won't be added or removed after this point, use ImmutableSet#of() or ImmutableSet#copyOf(Object[]) instead. If E is an Enum type, use EnumSet.of(Enum, Enum[]) instead. Otherwise, strongly consider using a LinkedHashSet instead, at the cost of increased memory footprint, to get deterministic iteration behavior.

      This method is just a small convenience, either for newHashSet(Arrays.asList(...)), or for creating an empty set and then calling Collections.addAll(java.util.Collection<? super T>, T...).

      Type Parameters:
      E - Generics Type E.
      Parameters:
      elements - the elements that the set should contain.
      Returns:
      a new HashSet containing those elements (minus duplicates)
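The two equivalent forms named in the description can be written out with the JDK alone. A minimal sketch, assuming nothing beyond java.util; viaAsList and viaAddAll are hypothetical helper names used for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;

// Hypothetical demo class showing the two JDK-only equivalents.
final class VarargsHashSetSketch {

  /** Wraps the array as a List and copies it into a new HashSet. */
  @SafeVarargs
  static <E> HashSet<E> viaAsList(E... elements) {
    return new HashSet<>(Arrays.asList(elements));
  }

  /** Creates an empty HashSet, then bulk-adds the array elements. */
  @SafeVarargs
  static <E> HashSet<E> viaAddAll(E... elements) {
    HashSet<E> set = new HashSet<>();
    Collections.addAll(set, elements);
    return set;
  }

  public static void main(String[] args) {
    HashSet<String> first = viaAsList("a", "b", "a");
    HashSet<String> second = viaAddAll("a", "b", "a");
    System.out.println(first.equals(second)); // prints "true"
    System.out.println(first.size());         // prints "2": the duplicate "a" collapses
  }
}
```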
    • newHashSet

      public static <E> HashSet<E> newHashSet(Iterable<? extends E> elements)
      Creates a mutable HashSet instance containing the given elements. A very thin convenience for creating an empty set then calling Collection.addAll(java.util.Collection<? extends E>) or Iterables#addAll.

      Note: if mutability is not required and the elements are non-null, use ImmutableSet#copyOf(Iterable) instead. (Or, change elements to be a FluentIterable and call elements.toSet().)

      Note: if E is an Enum type, use newEnumSet(Iterable, Class) instead.

      Type Parameters:
      E - Generics Type E.
      Parameters:
      elements - the elements that the set should contain.
      Returns:
      a new HashSet containing those elements (minus duplicates).
    • newTreeSet

      public static <E extends Comparable> TreeSet<E> newTreeSet(Iterable<? extends E> elements)
      Creates a mutable TreeSet instance containing the given elements sorted by their natural ordering.

      Note: if mutability is not required, use ImmutableSortedSet#copyOf(Iterable) instead.

      Note: If elements is a SortedSet with an explicit comparator, this method has different behavior than TreeSet(SortedSet), which returns a TreeSet with that comparator.

      Note for Java 7 and later: this method is now unnecessary and should be treated as deprecated. Instead, use the TreeSet constructor directly, taking advantage of the new "diamond" syntax.

      This method is just a small convenience for creating an empty set and then calling Iterables#addAll. This method is not very useful and will likely be deprecated in the future.

      Type Parameters:
      E - Generics Type E.
      Parameters:
      elements - the elements that the set should contain
      Returns:
      a new TreeSet containing those elements (minus duplicates)
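The comparator caveat above is easy to miss, so here is a small JDK-only sketch of it. The helper naturalCopy is a hypothetical stand-in for a copy that always uses natural ordering; it is an illustration, not Hadoop's code.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; uses only JDK collections.
final class TreeSetOrderingSketch {

  /** Copies the elements into a TreeSet that uses natural ordering. */
  static <E extends Comparable<? super E>> TreeSet<E> naturalCopy(
      Iterable<? extends E> elements) {
    TreeSet<E> set = new TreeSet<>();
    for (E element : elements) {
      set.add(element);
    }
    return set;
  }

  public static void main(String[] args) {
    SortedSet<String> reversed =
        new TreeSet<String>(Comparator.<String>reverseOrder());
    reversed.addAll(Arrays.asList("b", "a", "c"));

    // The TreeSet(SortedSet) constructor inherits the source comparator.
    TreeSet<String> sameComparator = new TreeSet<String>(reversed);
    System.out.println(sameComparator);        // prints "[c, b, a]"

    // A natural-ordering copy ignores the source comparator.
    System.out.println(naturalCopy(reversed)); // prints "[a, b, c]"
  }
}
```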
    • newHashSet

      public static <E> HashSet<E> newHashSet(Iterator<? extends E> elements)
      Creates a mutable HashSet instance containing the given elements. A very thin convenience for creating an empty set and then calling Iterators#addAll.

      Note: if mutability is not required and the elements are non-null, use ImmutableSet#copyOf(Iterator) instead.

      Note: if E is an Enum type, you should create an EnumSet instead.

      Overall, this method is not very useful and will likely be deprecated in the future.

      Type Parameters:
      E - Generics Type E.
      Parameters:
      elements - the elements that the set should contain.
      Returns:
      a new HashSet containing those elements (minus duplicates).
    • newHashSetWithExpectedSize

      public static <E> HashSet<E> newHashSetWithExpectedSize(int expectedSize)
      Returns a new hash set using the smallest initial table size that can hold expectedSize elements without resizing. Note that this is not what HashSet(int) does, but it is what most users want and expect it to do.

      This behavior can't be broadly guaranteed, but has been tested with OpenJDK 1.7 and 1.8.

      Type Parameters:
      E - Generics Type E.
      Parameters:
      expectedSize - the number of elements you expect to add to the returned set
      Returns:
      a new, empty hash set with enough capacity to hold expectedSize elements without resizing
      Throws:
      IllegalArgumentException - if expectedSize is negative
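The difference from HashSet(int) comes from the load factor: the constructor argument is an initial capacity, and with the default load factor of 0.75 a table resizes before it is completely full. The sketch below shows one way to derive a capacity from an expected size under that assumption. The names capacityFor and withExpectedSize are hypothetical, and the exact formula Hadoop uses is not stated on this page.

```java
import java.util.HashSet;

// Hypothetical demo class; assumes HashSet's default load factor of 0.75.
final class ExpectedSizeSketch {

  /** Returns an initial capacity large enough to avoid resizing. */
  static int capacityFor(int expectedSize) {
    if (expectedSize < 0) {
      throw new IllegalArgumentException(
          "expectedSize cannot be negative but was: " + expectedSize);
    }
    if (expectedSize < 3) {
      return expectedSize + 1;
    }
    if (expectedSize < (1 << 30)) {
      // Divide by the load factor, then add one as a safety margin.
      return (int) (expectedSize / 0.75f + 1.0f);
    }
    return Integer.MAX_VALUE;
  }

  /** Creates an empty HashSet sized for the expected element count. */
  static <E> HashSet<E> withExpectedSize(int expectedSize) {
    return new HashSet<>(capacityFor(expectedSize));
  }

  public static void main(String[] args) {
    System.out.println(capacityFor(100)); // prints "134"
    HashSet<String> hosts = withExpectedSize(100);
    System.out.println(hosts.isEmpty());  // prints "true"
  }
}
```

By contrast, new HashSet<>(100) requests a capacity of 100 directly, which at a 0.75 load factor can trigger a resize before 100 elements have been added.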
    • intersection

      public static <E> Set<E> intersection(Set<E> set1, Set<E> set2)
      Returns the intersection of two sets as an unmodifiable set. The returned set contains all elements that are contained by both backing sets.

      Results are undefined if set1 and set2 are sets based on different equivalence relations (as HashSet, TreeSet, and the keySet of an IdentityHashMap all are).

      Type Parameters:
      E - Generics Type E.
      Parameters:
      set1 - the first set.
      set2 - the second set.
      Returns:
      an unmodifiable set containing the elements present in both set1 and set2.
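The warning about different equivalence relations can be made concrete with a JDK-only sketch. The intersection method below is a hypothetical copy-then-retainAll stand-in, not necessarily how Hadoop implements it; with this approach, mixing a case-insensitive TreeSet and a HashSet gives a result that depends on argument order.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; uses only JDK collections.
final class IntersectionSketch {

  /** Copies set1, keeps only elements also in set2, and returns a read-only view. */
  static <E> Set<E> intersection(Set<E> set1, Set<E> set2) {
    Set<E> result = new HashSet<>(set1);
    result.retainAll(set2);
    return Collections.unmodifiableSet(result);
  }

  public static void main(String[] args) {
    Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
    Set<Integer> b = new HashSet<>(Arrays.asList(2, 3, 4));
    System.out.println(intersection(a, b)); // contains 2 and 3

    // Mixed equivalence relations: case-insensitive TreeSet and plain HashSet.
    Set<String> caseInsensitive = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    caseInsensitive.add("a");
    Set<String> caseSensitive = new HashSet<>(Arrays.asList("A"));
    System.out.println(intersection(caseInsensitive, caseSensitive)); // prints "[]"
    System.out.println(intersection(caseSensitive, caseInsensitive)); // prints "[A]"
  }
}
```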
    • union

      public static <E> Set<E> union(Set<E> set1, Set<E> set2)
      Returns the union of two sets as an unmodifiable set. The returned set contains all elements that are contained in either backing set.

      Results are undefined if set1 and set2 are sets based on different equivalence relations (as HashSet, TreeSet, and the Map.keySet() of an IdentityHashMap all are).

      Type Parameters:
      E - Generics Type E.
      Parameters:
      set1 - the first set.
      set2 - the second set.
      Returns:
      an unmodifiable set containing the elements present in set1, set2, or both.
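As a rough illustration of the documented behavior, the sketch below builds a union with JDK classes and shows that the result rejects modification. The union method here is a hypothetical stand-in and makes no claim about Hadoop's internals.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; uses only JDK collections.
final class UnionSketch {

  /** Copies set1, adds everything from set2, and returns a read-only view. */
  static <E> Set<E> union(Set<E> set1, Set<E> set2) {
    Set<E> result = new HashSet<>(set1);
    result.addAll(set2);
    return Collections.unmodifiableSet(result);
  }

  public static void main(String[] args) {
    Set<Integer> a = new HashSet<>(Arrays.asList(1, 2));
    Set<Integer> b = new HashSet<>(Arrays.asList(2, 3));
    Set<Integer> all = union(a, b);
    System.out.println(all.size()); // prints "3": the shared element 2 appears once

    try {
      all.add(4);
    } catch (UnsupportedOperationException e) {
      System.out.println("read-only"); // prints "read-only"
    }
  }
}
```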
    • difference

      public static <E> Set<E> difference(Set<E> set1, Set<E> set2)
      Returns the difference of two sets as an unmodifiable set. The returned set contains all elements that are contained by set1 and not contained by set2.

      Results are undefined if set1 and set2 are sets based on different equivalence relations (as HashSet, TreeSet, and the keySet of an IdentityHashMap all are). Use this method to compute the difference of HashSets. For TreeSets with a strict ordering requirement, the recommended method is differenceInTreeSets(Set, Set).

      Type Parameters:
      E - Generics Type E.
      Parameters:
      set1 - the set whose elements are kept.
      set2 - the set whose elements are excluded.
      Returns:
      an unmodifiable set containing the elements of set1 that are not in set2.
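Unlike intersection and union, difference is not symmetric in its arguments. A minimal JDK-only sketch of that point follows; the difference method here is a hypothetical stand-in for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; uses only JDK collections.
final class DifferenceSketch {

  /** Copies set1, removes everything found in set2, and returns a read-only view. */
  static <E> Set<E> difference(Set<E> set1, Set<E> set2) {
    Set<E> result = new HashSet<>(set1);
    result.removeAll(set2);
    return Collections.unmodifiableSet(result);
  }

  public static void main(String[] args) {
    Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
    Set<Integer> b = new HashSet<>(Arrays.asList(2, 3, 4));
    System.out.println(difference(a, b)); // prints "[1]"
    System.out.println(difference(b, a)); // prints "[4]"
  }
}
```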
    • differenceInTreeSets

      public static <E> Set<E> differenceInTreeSets(Set<E> set1, Set<E> set2)
      Returns the difference of two sets as an unmodifiable set. The returned set contains all elements that are contained by set1 and not contained by set2.

      Results are undefined if set1 and set2 are sets based on different equivalence relations (as HashSet, TreeSet, and the keySet of an IdentityHashMap all are). Use this method to compute the difference of TreeSets. For HashSets, the recommended method is difference(Set, Set).

      Type Parameters:
      E - Generics Type E.
      Parameters:
      set1 - the set whose elements are kept.
      set2 - the set whose elements are excluded.
      Returns:
      an unmodifiable set containing the elements of set1 that are not in set2.
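The practical reason to prefer a TreeSet-based difference is that the result keeps a sorted iteration order. The sketch below illustrates this with JDK classes only, assuming natural ordering of the elements. The name differenceSorted is hypothetical, and whether Hadoop preserves a custom comparator is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; assumes elements are mutually Comparable.
final class SortedDifferenceSketch {

  /** Copies set1 into a TreeSet, removes set2's elements, returns a read-only view. */
  static <E> Set<E> differenceSorted(Set<E> set1, Set<E> set2) {
    Set<E> result = new TreeSet<>(set1);
    result.removeAll(set2);
    return Collections.unmodifiableSet(result);
  }

  public static void main(String[] args) {
    Set<Integer> a = new TreeSet<>(Arrays.asList(5, 1, 3, 2));
    Set<Integer> b = new TreeSet<>(Arrays.asList(3));
    List<Integer> ordered = new ArrayList<>(differenceSorted(a, b));
    System.out.println(ordered); // prints "[1, 2, 5]"
  }
}
```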
    • symmetricDifference

      public static <E> Set<E> symmetricDifference(Set<E> set1, Set<E> set2)
      Returns the symmetric difference of two sets as an unmodifiable set. The returned set contains all elements that are contained in either set1 or set2 but not in both. The iteration order of the returned set is undefined.

      Results are undefined if set1 and set2 are sets based on different equivalence relations (as HashSet, TreeSet, and the keySet of an IdentityHashMap all are).

      Type Parameters:
      E - Generics Type E.
      Parameters:
      set1 - the first set.
      set2 - the second set.
      Returns:
      an unmodifiable set containing the elements present in exactly one of set1 and set2.
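The symmetric difference can be read as the union minus the intersection. The following JDK-only sketch spells that identity out; the method is a hypothetical illustration and may differ from Hadoop's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; uses only JDK collections.
final class SymmetricDifferenceSketch {

  /** Computes (set1 union set2) minus (set1 intersect set2) as a read-only set. */
  static <E> Set<E> symmetricDifference(Set<E> set1, Set<E> set2) {
    Set<E> common = new HashSet<>(set1);
    common.retainAll(set2);
    Set<E> result = new HashSet<>(set1);
    result.addAll(set2);
    result.removeAll(common);
    return Collections.unmodifiableSet(result);
  }

  public static void main(String[] args) {
    Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
    Set<Integer> b = new HashSet<>(Arrays.asList(3, 4));
    // 3 is in both sets, so it is dropped; 1, 2 and 4 remain.
    System.out.println(symmetricDifference(a, b).size()); // prints "3"
  }
}
```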
    • newConcurrentHashSet

      public static <E> Set<E> newConcurrentHashSet()
      Creates a thread-safe set backed by a hash map. The set is backed by a ConcurrentHashMap instance, and thus carries the same concurrency guarantees.

      Unlike HashSet, the returned set does NOT allow null to be used as an element. The set is serializable.

      Type Parameters:
      E - Generics Type.
      Returns:
      a new, empty thread-safe Set
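A set with these properties can be obtained from the JDK by wrapping a ConcurrentHashMap with Collections.newSetFromMap. The sketch below assumes that construction purely for illustration; concurrentSet, fillFromTwoThreads and rejectsNull are hypothetical names.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; uses only JDK classes.
final class ConcurrentSetSketch {

  /** Builds a thread-safe set view over a new ConcurrentHashMap. */
  static <E> Set<E> concurrentSet() {
    return Collections.newSetFromMap(new ConcurrentHashMap<E, Boolean>());
  }

  /** Two threads add overlapping ranges; returns the final element count. */
  static int fillFromTwoThreads() {
    final Set<Integer> set = concurrentSet();
    Thread first = new Thread(() -> {
      for (int i = 0; i < 1000; i++) {
        set.add(i);
      }
    });
    Thread second = new Thread(() -> {
      for (int i = 500; i < 1500; i++) {
        set.add(i);
      }
    });
    first.start();
    second.start();
    try {
      first.join();
      second.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return set.size();
  }

  /** Returns true if adding null fails with a NullPointerException. */
  static boolean rejectsNull() {
    try {
      ConcurrentSetSketch.<String>concurrentSet().add(null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(fillFromTwoThreads()); // prints "1500"
    System.out.println(rejectsNull());        // prints "true"
  }
}
```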