Apache Commons Collections SetUtils

Introduction

In the last post, we learned about the Apache Commons Collections ListUtils class. In this post, we will cover the Apache Commons Collections SetUtils utility class.

Apache Commons Collections SetUtils

The Apache Commons Collections SetUtils class has utility methods and decorators for Set and SortedSet instances.

SetUtils#emptyIfNull

The emptyIfNull method takes a Set as an argument and returns an immutable empty set if the argument is null; otherwise, it returns the argument itself.

In the example below, we create a Set and pass it to the emptyIfNull method. Since the passed Set is not null, we get the same set back.

Set<String> set = Set.of("apple", "orange");
Set<String> result = SetUtils.emptyIfNull(set);
System.out.println(result); //[apple, orange]

If we pass a null argument, then it will return an immutable empty set.

Set<String> set = null;
Set<String> result = SetUtils.emptyIfNull(set);
System.out.println(result); //[]

Since it is immutable, we will get an UnsupportedOperationException if we attempt to mutate the returned set.
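
To see the immutability in action without the library on the classpath, here is a minimal JDK-only sketch. The emptyIfNull helper in it is a hypothetical stand-in that assumes the behaviour described above (an immutable empty set for null, else the argument itself); it is not the SetUtils source.

```java
import java.util.Collections;
import java.util.Set;

class EmptyIfNullSketch {

    // Hypothetical stand-in for SetUtils.emptyIfNull (assumed behaviour, not the library source)
    static <T> Set<T> emptyIfNull(Set<T> set) {
        return set == null ? Collections.<T>emptySet() : set;
    }

    // Returns true if adding to the given set is rejected with UnsupportedOperationException
    static boolean rejectsAdd(Set<String> set) {
        try {
            set.add("apple");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> result = emptyIfNull(null);
        System.out.println(result.isEmpty());   // true
        System.out.println(rejectsAdd(result)); // true
    }
}
```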

Note: Set.of was added in JDK 9. To learn about it and other factory methods added to Collections, refer to the Convenience Factory Methods for Collections post.

SetUtils – emptySet and emptySortedSet

The emptySet method simply returns an empty unmodifiable Set.

Set<String> emptySet = SetUtils.emptySet(); //empty unmodifiable Set
System.out.println(emptySet); //[]

The emptySortedSet method is similar, but it returns an empty (and unmodifiable) SortedSet.

SortedSet<String> emptySortedSet = SetUtils.emptySortedSet(); 
System.out.println(emptySortedSet); //[]

SetUtils#orderedSet

The orderedSet method takes a Set and returns an ordered set backed by the given set. The returned set is ordered because it maintains the insertion order of the elements that are added afterwards.

Internally it uses an additional list (an ArrayList) to maintain the added elements.

In this example, we have a HashSet with two strings (fruit names) and we create an ordered set backed by the fruits set. The initial order will be based on the iteration order of the underlying set. Here, we get the string ‘orange’ before ‘apple’ because that is how this HashSet happens to iterate (the iteration order of a HashSet is unspecified).

Set<String> fruits = new HashSet<>(Set.of("apple", "orange"));
Set<String> orderedSet = SetUtils.orderedSet(fruits);
System.out.println(orderedSet); //[orange, apple]

Next, let us add a new fruit through the ordered set as shown below. After this, we can see that the fruit ‘pear’ is added to both the sets.

orderedSet.add("pear");
System.out.println(orderedSet); //[orange, apple, pear]
System.out.println(fruits); //[orange, apple, pear]

Adding the same fruit name (‘pear’) again will not have any effect as we are adding to a Set (no duplicates allowed in a Set).

orderedSet.add("pear");
System.out.println(orderedSet); //[orange, apple, pear]
System.out.println(fruits); //[orange, apple, pear]

Let us now add a ‘banana’ to the set via the ordered set. From the result, we can see that the ordered set maintains the insertion order as ‘banana’ is after ‘pear’. But in the underlying set (HashSet), the iteration order is different.

orderedSet.add("banana");
//insertion order maintained
System.out.println(orderedSet); //[orange, apple, pear, banana]
System.out.println(fruits); //[orange, banana, apple, pear]

SetUtils#newIdentityHashSet

The newIdentityHashSet method returns a hash-based Set which matches elements based on identity (==) rather than equals(). Hence, this set violates the general Set contract, which is defined in terms of equals(). It is thus recommended not to compare this set to other sets.

Let us create an identity hash set and add a couple of strings into it first.

Set<String> fruits = new HashSet<>(Set.of("apple", "orange"));
Set<String> identityHashSet = SetUtils.newIdentityHashSet();
identityHashSet.addAll(fruits);
System.out.println(identityHashSet); //[orange, apple]

Next, if we add a new object which is equal to an existing object into a normal HashSet, it will not be added again into the HashSet.

fruits.add(new String("apple")); 
System.out.println(fruits); //[orange, apple]

If we do the same on an identity hash set, it will be added again, since the set compares two objects based on identity/reference (==). This is because the two ‘apple’ strings here are different objects (i.e., the two references point to different objects in memory).

identityHashSet.add(new String("apple"));
System.out.println(identityHashSet); //[orange, apple, apple]
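
The same identity-versus-equality behaviour can be reproduced with the JDK alone. The sketch below builds an identity-based set from an IdentityHashMap; this is an assumption about how such a set can be obtained, used here only to illustrate the semantics, and the newIdentitySet helper name is made up for the example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

class IdentitySetSketch {

    // JDK-only identity set; assumed to behave like the set returned by SetUtils.newIdentityHashSet()
    static <E> Set<E> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<E, Boolean>());
    }

    public static void main(String[] args) {
        String first = new String("apple");
        String second = new String("apple"); // equal to 'first', but a different object

        Set<String> identitySet = newIdentitySet();
        identitySet.add(first);

        System.out.println(identitySet.contains(first));  // true  (same reference)
        System.out.println(identitySet.contains(second)); // false (equal, but not identical)

        Set<String> hashSet = new HashSet<>(Set.of("apple"));
        System.out.println(hashSet.contains(second));     // true  (equals-based lookup)
    }
}
```

The lookup with the second string is where the two sets disagree, which is why comparing an identity set with a regular set gives surprising results.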

SetUtils#hashSet

The hashSet method takes a varargs list of elements and returns a HashSet containing the passed elements. A few examples are shown below.

Set<String> hashSet = SetUtils.hashSet();
System.out.println(hashSet); //[]

hashSet = SetUtils.hashSet("a");
System.out.println(hashSet); //[a]

hashSet = SetUtils.hashSet("a", "b", "c", "c");
System.out.println(hashSet); //[a, b, c]

SetUtils#isEqualSet

The isEqualSet method takes two Set instances and checks if they are equal (as per the equals() contract in Set.equals). For the two sets to be equal, the two sets should have the same size and every element of the first set should be present in the second.

Here, we have two strings added to a HashSet and a TreeSet. Even though the iteration order of these two sets is different in this case, the two sets are equal, as they have the same set of elements in both.

Set<String> set1 = new HashSet<>(Set.of("orange", "apple"));
Set<String> set2 = new TreeSet<>(Set.of("orange", "apple"));
System.out.println(set1); //[orange, apple]
System.out.println(set2); //[apple, orange]

System.out.println(set1.equals(set2)); //true
System.out.println(SetUtils.isEqualSet(set1, set2)); //true

SetUtils#hashCodeForSet

The hashCodeForSet method takes a Collection as an argument and generates a hash code using the algorithm specified in Set.hashCode(). Some examples are shown below.

System.out.println(SetUtils.hashCodeForSet(
        List.of("a")
)); //97
System.out.println(SetUtils.hashCodeForSet(
        List.of("a", "b", "c")
)); //294
System.out.println(SetUtils.hashCodeForSet(
        List.of()
)); //0
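
For reference, the algorithm specified in Set.hashCode() is the sum of the hash codes of the elements, with a null element counting as zero. The following JDK-only sketch spells that out; the class and method here are written for illustration and are not the SetUtils source.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class SetHashCodeSketch {

    // Sum of the element hash codes; a null element contributes 0 (per the Set.hashCode() contract)
    static int hashCodeForSet(Collection<?> elements) {
        int hashCode = 0;
        for (Object element : elements) {
            if (element != null) {
                hashCode += element.hashCode();
            }
        }
        return hashCode;
    }

    public static void main(String[] args) {
        System.out.println(hashCodeForSet(List.of("a")));             // 97
        System.out.println(hashCodeForSet(List.of("a", "b", "c")));   // 294 (97 + 98 + 99)
        System.out.println(hashCodeForSet(Arrays.asList("a", null))); // 97
        System.out.println(hashCodeForSet(List.of()));                // 0
    }
}
```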

Finding union, intersection, difference and disjunction between two sets

Let us now look at four methods to find the union, intersection, difference and disjunction between two sets. All these four methods take two Set instances and return an instance of SetUtils.SetView.

SetView is a static nested class, which is a view over the result set. If the underlying set changes, then it will be reflected in the view as well. But this view is unmodifiable (i.e., we cannot mutate the underlying data via this set).

SetUtils#union

The union method takes two Set instances and returns a union of them. In other words, it returns an unmodifiable view, which represents the union of both the Sets, i.e., it has all the elements from both the sets.

In the below example, the result of the union will be the strings “a”, “b” and “c”.

Set<String> set1 = new HashSet<>(Set.of("a", "b"));
Set<String> set2 = new HashSet<>(Set.of("a", "c"));
System.out.println(set1); //[a, b]
System.out.println(set2); //[a, c]
Set<String> union = SetUtils.union(set1, set2);
System.out.println(union); //[a, b, c]

Since the returned Set is a view, it will reflect any changes made to either of the Set instances passed to the union method. Let us say we add a new element to the second set. Then we don’t have to call the union method again to get the new union result.

As shown below, after adding a new element to the second set, if we re-iterate over the union (which is a view), we get the correct (new) result.

set2.add("d");
System.out.println(union); //[a, b, c, d]

The actual return type of the union method is SetUtils.SetView, which is a Set. In addition, the SetView class provides two methods, viz., copyInto and toSet, to copy the contents of the view into another set.

When using copyInto, we pass the set into which the contents of the view will be copied, as shown below.

SetUtils.SetView<String> unionSetView = SetUtils.union(set1, set2);
Set<String> mySet = new HashSet<>();
unionSetView.copyInto(mySet);
System.out.println(mySet); //[a, b, c, d]

Or we can call the toSet method, which will copy the contents of the view into a new Set (Implementation detail: It uses a HashSet).

System.out.println(unionSetView.toSet()); //[a, b, c, d]

SetUtils#intersection

The intersection method returns an unmodifiable view (SetView) of the intersection of the given Sets. The returned view has all the elements which are present in both the input sets.

In this example, there is only one common element (“a”) between the two input sets.

Set<String> set1 = Set.of("a", "b");
Set<String> set2 = Set.of("a", "c");
Set<String> intersection = SetUtils.intersection(set1, set2);
System.out.println(intersection); //[a]        

In the below example, there are no common elements between the two sets. Hence, the result of intersection is an empty set.

System.out.println(SetUtils.intersection(Set.of("a"), Set.of("b"))); //[]

SetUtils#difference

The difference method takes two sets (say a and b) and returns an unmodifiable view (SetView) having the difference of sets a – b. It has all the elements of set a which are not present in set b.

Here, the string “b” is the only element present in the first set which is not present in the second set.

Set<String> set1 = Set.of("a", "b");
Set<String> set2 = Set.of("a", "c");

Set<String> difference = SetUtils.difference(set1, set2); 
System.out.println(difference); //[b]

Two other examples follow.

System.out.println(SetUtils.difference(
      Set.of("a", "b"),
      Set.of("c")
)); //[a, b]

System.out.println(SetUtils.difference(
      Set.of("a", "b"),
      Set.of("b", "c", "a")
)); //[]

SetUtils#disjunction

Finally, the disjunction method returns an unmodifiable view (SetView) of the symmetric difference of the given sets. In other words, the result contains all the elements which are present in either the first or the second set, but not in both. This is equivalent to union(difference(a, b), difference(b, a)).

In this example, the string “a” is present in both the sets and hence it is not part of the result.

Set<String> set1 = Set.of("a", "b");
Set<String> set2 = Set.of("a", "c");

Set<String> disjunction = SetUtils.disjunction(set1, set2);
System.out.println(disjunction); //[b, c]

A few more examples follow.

System.out.println(SetUtils.disjunction(
        Set.of("a", "b", "c"),
        Set.of("b", "a")
)); //[c]
System.out.println(SetUtils.disjunction(
        Set.of("a", "b"),
        Set.of("b", "a", "c")
)); //[c]
System.out.println(SetUtils.disjunction(
        Set.of("a", "b", "c"),
        Set.of("b", "c", "a")
)); //[]
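
The equivalence with union(difference(a, b), difference(b, a)) can be checked with a small JDK-only sketch. Note that the helpers below make eager copies into a HashSet, unlike the SetUtils methods which return live views; they are illustrative stand-ins, not the library implementation.

```java
import java.util.HashSet;
import java.util.Set;

class SymmetricDifferenceSketch {

    // a - b: the elements of a that are not in b (an eager copy, not a view)
    static <E> Set<E> difference(Set<E> a, Set<E> b) {
        Set<E> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    // union(difference(a, b), difference(b, a))
    static <E> Set<E> disjunction(Set<E> a, Set<E> b) {
        Set<E> result = difference(a, b);
        result.addAll(difference(b, a));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(disjunction(Set.of("a", "b"), Set.of("a", "c")));      // [b, c]
        System.out.println(disjunction(Set.of("a", "b", "c"), Set.of("b", "a"))); // [c]
        System.out.println(disjunction(Set.of("a", "b"), Set.of("b", "a")));      // []
    }
}
```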

SetUtils#predicatedSet

The predicatedSet method takes a Set and a Predicate (org.apache.commons.collections4.Predicate) and returns a predicated set (validating set) which is backed by the passed set.

When we add new elements to the returned predicated set, it will add the element to the underlying set only if it passes the predicate. If it doesn’t pass the predicate check, it will throw an IllegalArgumentException.

In this example, we have a set of fruits which should hold only fruit names of even length. We create a predicated set with this set and a predicate to check the length.

Set<String> evenLengthFruits = new HashSet<>(Set.of("pear", "orange"));
Set<String> predicatedSet = SetUtils.predicatedSet(evenLengthFruits,
        s -> s.length() % 2 == 0);
System.out.println(predicatedSet); //[orange, pear]

Then, if we add a valid fruit name, it will be added to the underlying set.

predicatedSet.add("banana");
System.out.println(predicatedSet); //[orange, banana, pear]
System.out.println(evenLengthFruits); //[orange, banana, pear]

If we add a fruit whose name is of odd length, then it will throw an IllegalArgumentException.

//throws java.lang.IllegalArgumentException
predicatedSet.add("apple");

We have to be careful to not mutate the original set after creating a predicated set as it can become a backdoor way to add invalid elements into the original (underlying set) as shown below.

evenLengthFruits.add("apple"); //!!
System.out.println(evenLengthFruits); //[orange, banana, apple, pear]
System.out.println(predicatedSet); //[orange, banana, apple, pear]
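
One way to think about closing this backdoor is to never hand out a reference to the underlying set. The sketch below is a hypothetical, JDK-only validating set (using java.util.function.Predicate rather than the Commons Collections Predicate) that creates its backing HashSet internally; it only illustrates the idea and is not how the library implements its predicated set.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Predicate;

class ValidatingSetSketch {

    // Hypothetical minimal decorator: validates on add and delegates reads to a private set
    static final class ValidatingSet<E> extends AbstractSet<E> {
        private final Set<E> delegate = new HashSet<>(); // created internally, so no outside reference exists
        private final Predicate<? super E> predicate;

        ValidatingSet(Predicate<? super E> predicate) {
            this.predicate = predicate;
        }

        @Override
        public boolean add(E element) {
            if (!predicate.test(element)) {
                throw new IllegalArgumentException("Rejected by predicate: " + element);
            }
            return delegate.add(element);
        }

        @Override
        public Iterator<E> iterator() {
            return delegate.iterator();
        }

        @Override
        public int size() {
            return delegate.size();
        }
    }

    public static void main(String[] args) {
        Set<String> fruits = new ValidatingSet<String>(s -> s.length() % 2 == 0);
        fruits.add("pear");
        fruits.add("orange");
        try {
            fruits.add("apple"); // odd length, so it is rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: apple");
        }
        System.out.println(fruits.size()); // 2
    }
}
```

With the library itself, a similar effect can presumably be had by passing a freshly created set to predicatedSet and keeping only the returned reference.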

SetUtils – predicatedNavigableSet and predicatedSortedSet

The SetUtils class has two other methods viz., predicatedNavigableSet and predicatedSortedSet which accept a NavigableSet and a SortedSet, respectively. Let us look at using a NavigableSet (I’ll skip predicatedSortedSet as it is very similar).

As before, we have a set of even length fruit names, but here it is a TreeSet sorted in reverse order. We create a predicated set from this as before.

NavigableSet<String> evenLengthFruitsAsNavigableSet = new TreeSet<>(Comparator.reverseOrder());
evenLengthFruitsAsNavigableSet.addAll(Set.of("pear", "orange"));

SortedSet<String> predicatedNavigableSet = SetUtils.predicatedNavigableSet(evenLengthFruitsAsNavigableSet,
        s -> s.length() % 2 == 0);
System.out.println(predicatedNavigableSet); //[pear, orange]

As shown below, if we add an element which passes the predicate check, it will be added to the underlying set. If we add an invalid element, it will throw an IllegalArgumentException.

predicatedNavigableSet.add("banana");
System.out.println(predicatedNavigableSet); //[pear, orange, banana]
System.out.println(evenLengthFruitsAsNavigableSet); //[pear, orange, banana]

//throws java.lang.IllegalArgumentException
//predicatedNavigableSet.add("apple");

The backdoor mechanism warning applies here as well.

//backdoor!
evenLengthFruitsAsNavigableSet.add("apple");
System.out.println(evenLengthFruitsAsNavigableSet); //[pear, orange, banana, apple]
System.out.println(predicatedNavigableSet); //[pear, orange, banana, apple]

SetUtils#transformedSet

The transformedSet method takes a Set and a Transformer as arguments and returns a transformed set which is backed by the original set. A Transformer is a functional interface which can transform one object into another (it is very similar to a Java Function).

@FunctionalInterface
public interface Transformer<I, O> {
    O transform(I input);
}

All values we add through the returned transformed set will be first passed through this transformer before adding to the underlying set.

Note: If the original set has any initial elements, then those will not be passed through the transformer.

Here, we start off with a set having two elements. We create a transformer which takes a string and returns a new string by prepending “val: ” to it.

Set<String> originalSet = new HashSet<>(Set.of("val:1", "val:2"));
Set<String> transformedSet = SetUtils.transformedSet(originalSet,
        s -> "val: " + s);
System.out.println(transformedSet); //[val:1, val:2]

Let us add a couple of elements through the transformed set. As we can see, all the elements will be first processed by the transformer and only then added into the original set.

transformedSet.add("3");
transformedSet.add("4");

System.out.println(transformedSet); //[val:1, val:2, val: 3, val: 4]
System.out.println(originalSet); //[val:1, val:2, val: 3, val: 4]

We have to be careful to not modify or add new elements directly to the underlying set as it can act as a backdoor to add elements without being transformed.

//!!
originalSet.add("5");
System.out.println(originalSet); //[val:1, val:2, 5, val: 3, val: 4]

SetUtils – transformedNavigableSet and transformedSortedSet

To handle NavigableSet and SortedSet, SetUtils also provides the transformedNavigableSet and transformedSortedSet methods, respectively.

An example of using transformedNavigableSet is shown below. (We will skip transformedSortedSet as it is very similar.)

NavigableSet<String> originalNavigableSet = new TreeSet<>();
Set<String> transformedNavigableSet = SetUtils.transformedNavigableSet(originalNavigableSet,
        s -> "val: " + s);
transformedNavigableSet.add("1");
transformedNavigableSet.add("2");

System.out.println(transformedNavigableSet); //[val: 1, val: 2]
System.out.println(originalNavigableSet); //[val: 1, val: 2]

SetUtils#unmodifiableSet

There are two overloaded unmodifiableSet methods – one taking a var-args and one taking a Set. They create and return an unmodifiable set from the passed elements or the set.

//unmodifiable set from the given items
Set<String> unmodifiableSet = SetUtils.unmodifiableSet("a", "b");
System.out.println(unmodifiableSet); //[a, b]

If we attempt to add to the returned set, it will throw an UnsupportedOperationException.

//java.lang.UnsupportedOperationException
unmodifiableSet.add("c");

Note that if we create an unmodifiable set from a modifiable set (like a HashSet), we can still modify the underlying set directly.

//unmodifiable set backed by the given set
Set<String> hashSet = new HashSet<>(Set.of("a", "b"));
unmodifiableSet = SetUtils.unmodifiableSet(hashSet);
System.out.println(unmodifiableSet); //[a, b]

hashSet.add("c"); 
System.out.println(unmodifiableSet); //[a, b, c]
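
If a snapshot that does not track later changes is what we want, one option outside SetUtils is to take an immutable copy instead of a view. The JDK-only sketch below contrasts the two; Collections.unmodifiableSet stands in for the view here (an assumption made to keep the example free of dependencies), and Set.copyOf requires JDK 10 or later.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class UnmodifiableViewVsCopy {

    // Read-only view: later changes to the backing set remain visible through it
    static <E> Set<E> readOnlyView(Set<E> backing) {
        return Collections.unmodifiableSet(backing);
    }

    // Independent immutable copy: later changes to the source are not reflected
    static <E> Set<E> snapshot(Set<E> source) {
        return Set.copyOf(source);
    }

    public static void main(String[] args) {
        Set<String> hashSet = new HashSet<>(Set.of("a", "b"));
        Set<String> view = readOnlyView(hashSet);
        Set<String> copy = snapshot(hashSet);

        hashSet.add("c");

        System.out.println(view.contains("c")); // true  (the view tracks the backing set)
        System.out.println(copy.contains("c")); // false (the copy does not)
    }
}
```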

SetUtils – unmodifiableNavigableSet and unmodifiableSortedSet

As in other methods, we also have two separate methods (unmodifiableNavigableSet and unmodifiableSortedSet) for working on a NavigableSet and SortedSet. A short example follows for unmodifiableNavigableSet.

TreeSet<String> treeSet = new TreeSet<>(Set.of("b", "a"));
System.out.println(treeSet); //[a, b]
SortedSet<String> unmodifiableNavigableSet = SetUtils.unmodifiableNavigableSet(treeSet);
System.out.println(unmodifiableNavigableSet); //[a, b]

SetUtils – synchronizedSet and synchronizedSortedSet

The synchronizedSet and synchronizedSortedSet methods take a Set and a SortedSet as argument respectively, and return a synchronized set backed by the given set. They are just a wrapper over Collections.synchronizedSet and Collections.synchronizedSortedSet, respectively. Hence, we must manually synchronize when iterating over the elements.

Set<String> set = Set.of("a", "b");
Set<String> synchronizedSet = SetUtils.synchronizedSet(set);
System.out.println(synchronizedSet); //[a, b]
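
Manual synchronization during iteration means holding the lock of the synchronized set itself for the whole loop. Since the methods above are described as wrappers over the Collections equivalents, the sketch below uses Collections.synchronizedSet directly; the snapshot helper name is made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SynchronizedIterationSketch {

    // Copies the elements while holding the set's own lock, so that
    // no other thread can modify the set in the middle of the iteration
    static <E> List<E> snapshot(Set<E> synchronizedSet) {
        List<E> copy = new ArrayList<>();
        synchronized (synchronizedSet) {
            for (E element : synchronizedSet) {
                copy.add(element);
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Set<String> synchronizedSet = Collections.synchronizedSet(new HashSet<>(Set.of("a", "b")));
        System.out.println(snapshot(synchronizedSet).size()); // 2
    }
}
```

Single calls such as add or contains are already synchronized by the wrapper; only compound operations like iteration need the explicit synchronized block.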

Conclusion

Having learned about the Apache Commons Collections ListUtils class in the last post, this post covered the Apache Commons Collections SetUtils utility class. We looked at all the useful utility methods from the SetUtils class operating on Set instances.
