Introduction
The Google Guava library provides many classes and utilities for working with collections, and Multiset is one of them. In this post, we will learn about the Multiset in Google Guava.
Using Google Guava
You can include Google Guava in your project or application from Maven Central. Refer to the Getting Google Guava section in one of my previous posts to learn how to include the Google Guava library.
What is a Multiset
A Multiset is a collection that supports order-independent equality (like a Set), but it can have duplicate elements. A Multiset is also called a bag.
It also lets us get the number of times an element is present in the Multiset. This is called the count of that element. Elements of the multiset that are equal to each other are called occurrences of the same element.
First, we will look at the APIs of a Multiset. Second, we will learn about the static utilities provided by the Multisets class. Lastly, we will look at some implementations of Multiset and their behaviour.
I will use a HashMultiset to explain the APIs. When we move to the last part of this post, we will look at other implementing classes of a Multiset.
Adding to a multiset
The usual Set has the add method to add an element to the Set and the multiset supports that. As we have seen already, a Multiset allows duplicates. Hence, adding an element more than once will result in adding it that many times.
Apart from this, it also has an overloaded add method that takes the number of occurrences to add as the second parameter.
Multiset<String> multiset = HashMultiset.create();
multiset.add("apple");
multiset.add("orange", 2);
multiset.forEach(System.out::println);
The above code prints the following.
orange
orange
apple
Since we are using a HashMultiset, it behaves like a HashMap and thus does not guarantee any iteration order.
Iterating over a multiset or printing it
We have seen one way to print a Multiset. The forEach takes a Consumer using which we can print the elements. The default toString implementation the Multiset provides is interesting. Let us see how it prints the above multiset.
System.out.println(multiset);
Prints,
[orange x 2, apple]
If an element is present only once, it prints just that element. Otherwise, it prints the element followed by the character x and the count of that element. This is a useful and friendly representation of a multiset.
forEachEntry (Experimental method)
It has a method called forEachEntry which takes an ObjIntConsumer implementation. The consumer receives an element and an integer (the count). We can use it like this:
multiset.forEachEntry((fruit, c) -> System.out.println(fruit + " appears " + c + " times"));
It prints,
orange appears 2 times
apple appears 1 times
At the time of writing, this method is marked as @Beta, which means it is experimental and may change. Check its status in the Guava version you use before relying on it in production.
Query methods
contains
It is used to query whether a multiset contains an element. Let us use it on the multiset we constructed earlier.
System.out.println(multiset.contains("orange")); //true
System.out.println(multiset.contains("pear")); //false
containsAll
The containsAll method takes a collection and checks whether each element in the passed collection is present in the multiset. It does not check the number of times an element is present in the multiset. An example will make this clear.
System.out.println(multiset.containsAll(Arrays.asList("orange", "apple", "orange", "orange")));
The above will print true even though our multiset has only two oranges and the collection passed to containsAll has three.
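If the counts do matter, the check has to compare occurrences rather than mere presence. Guava's Multisets class offers containsOccurrences for this. The sketch below only models the idea with plain java.util maps so that it runs without Guava; the class and method names are made up for illustration and are not Guava's implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: models element counts with a plain Map.
class CountAwareContains {

    // Tally how many times each item occurs in the list.
    static Map<String, Integer> tally(List<String> items) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    // True only if 'have' holds at least as many of every item as 'wanted' asks for.
    static boolean containsOccurrences(Map<String, Integer> have, List<String> wanted) {
        for (Map.Entry<String, Integer> e : tally(wanted).entrySet()) {
            if (have.getOrDefault(e.getKey(), 0) < e.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> fruits = tally(Arrays.asList("apple", "orange", "orange"));
        List<String> threeOranges = Arrays.asList("orange", "orange", "orange");
        // Presence-only check, comparable to containsAll: true
        System.out.println(fruits.keySet().containsAll(threeOranges));
        // Count-aware check: false, because only two oranges are available
        System.out.println(containsOccurrences(fruits, threeOranges));
    }
}
```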
count
This method is used to get the count of an element.
System.out.println(multiset.count("apple")); //Outputs - 1
System.out.println(multiset.count("orange")); //Outputs - 2
System.out.println(multiset.count("pear")); //Outputs - 0
Remove methods
Similar to the add method, there are two varieties of the remove method. One is a plain remove (like in a Set) that takes the element to remove. The other also takes the number of occurrences to remove.
Let us modify our multiset (having an apple and two oranges) to add 5 grapes. Then we will remove some using the remove method.
multiset.add("grape", 5);
multiset.remove("grape"); //Removes one grape
multiset.remove("grape", 2); //Removes two grapes
System.out.println(multiset); //Prints - [orange x 2, apple, grape x 2]
Modifying the count of an element
It would be painful if a multiset only let us add and remove elements, because we could not easily change an element's count. The setCount method takes an element and the desired count and sets the count of that element to the passed value.
multiset.setCount("grape", 20);
System.out.println(multiset.count("grape")); //prints 20
There is another overloaded setCount method that operates on a condition. In addition to taking the new count, it takes the expected existing count (oldCount) of the element. It changes the count of that element to the new count only if its current count is equal to the provided oldCount.
The method returns true if the operation succeeded i.e., it changed the element count and false otherwise.
boolean success = multiset.setCount("grape", 20, 40);
System.out.println(success);
System.out.println(multiset.count("grape"));
System.out.println();
success = multiset.setCount("grape", 100000, 20);
System.out.println(success);
System.out.println(multiset.count("grape"));
The first setCount call sets the count of grape to 40 on the condition that its current count is 20, so it succeeds. The second call tries to set the count to 20 on the condition that the current count is 100000. That condition does not hold, so the call returns false and the count of grape remains 40.
The above code snippet outputs the following
true
40
false
40
Multiset’s view methods
A Multiset provides two methods that return a view of the multiset: elementSet and entrySet.
elementSet
The elementSet method returns the unique elements of the multiset. In other words, it returns the elements in a Set (and hence without duplicates) and without their counts. Since this is a view, the returned set is backed by the original multiset. Thus, if we remove an element from the returned set, all of the occurrences of that element will be removed from the backing multiset.
Set<String> elements = multiset.elementSet();
System.out.println(elements); // [orange, apple, grape]
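The view behaviour is the same idea as the keySet of a java.util.Map. As a rough analogy (an assumption made to keep the example free of dependencies, not how Guava implements it), the sketch below treats a map from element to count as the multiset and removes an element through the key-set view. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class: a Map<element, count> stands in for the multiset.
class ElementViewSketch {

    // Removes 'element' through the key-set view and returns the total count left in the map.
    static int removeThroughView(Map<String, Integer> counts, String element) {
        Set<String> elements = counts.keySet(); // a view, not a copy
        elements.remove(element);               // also drops the entry (and its count) from 'counts'
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("orange", 2);
        counts.put("apple", 1);
        counts.put("grape", 40);
        System.out.println(removeThroughView(counts, "grape")); // 3
        System.out.println(counts);                              // {orange=2, apple=1}
    }
}
```

Removing grape through the view drops all 40 occurrences at once, which is the effect described above for elementSet.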
entrySet
This is similar to a Map’s entrySet. It returns a view of the multiset where each entry has the element name and its count.
Set<Multiset.Entry<String>> entries = multiset.entrySet();
System.out.println(entries); //[orange x 2, apple, grape x 40]
We can use this to transform a Multiset into a Map<String, Integer> where the key is the element and value is its count.
Map<String, Integer> fruitsCount = entries.stream()
.collect(Collectors.toMap(Multiset.Entry::getElement, Multiset.Entry::getCount));
System.out.println(fruitsCount); //{orange=2, apple=1, grape=40}
Multiset - size
The size method of a Multiset returns the total number of occurrences of all the elements. To get the number of distinct elements, use the size method of entrySet or elementSet.
Calling these on our multiset gives,
System.out.println(multiset.size()); //43
System.out.println(multiset.elementSet().size()); //3
System.out.println(multiset.entrySet().size()); //3
It prints 43 as we have 2 oranges, 1 apple and 40 grapes.
RetainAll and RemoveAll methods
Both retainAll and removeAll take a collection of elements. The retainAll method retains only the elements present in the collection and removes the others from the multiset. The removeAll method removes the elements present in the collection.
Note that these methods look only at the element itself. They do not take into account the number of times an element is present (its count).
Let me copy the Multiset to demonstrate this, as these methods mutate the multiset.
Multiset<String> copy = HashMultiset.create(multiset);
copy.retainAll(Arrays.asList("apple", "grape", "pear"));
System.out.println(copy); //[apple, grape x 40]
If we ask it to retain a non-existing element (pear), it will not do anything. Asking to retain grape will retain all its occurrences.
copy = HashMultiset.create(multiset);
copy.removeAll(Arrays.asList("apple", "grape", "pear"));
System.out.println(copy); //[orange x 2]
Asking to remove grape will remove all of its occurrences.
If we wish to retain or remove by taking into account the count of an element, the Multisets class has utility methods to achieve this. We will see this in a moment.
Multisets class in Google Guava
It provides static utility methods for working with Multiset objects.
Creating an unmodifiable multiset
The unmodifiableMultiset method takes a Multiset and returns one that is unmodifiable. Invoking any method that can modify it (like add or remove) will throw an UnsupportedOperationException.
Finding the intersection between two multisets
The intersection method returns an unmodifiable view of the intersection of two multisets. It has an element in the result if and only if it is present in both the multisets. The count of that element in the result is the minimum of its count in the two multisets.
Multiset<String> multiset1 = HashMultiset.create();
multiset1.add("apple");
multiset1.add("orange", 2);
multiset1.add("grape", 6);
Multiset<String> multiset2 = HashMultiset.create();
multiset2.add("apple");
multiset2.add("orange", 1);
multiset2.add("grape", 10);
multiset2.add("pear", 6);
Multiset<String> intersection = Multisets.intersection(multiset1, multiset2);
System.out.println(intersection); //[orange, apple, grape x 6]
The first multiset has two oranges and the second has just one orange. Thus, the result has one orange. The same logic applies for others.
retainOccurrences and removeOccurrences
We saw that the retainAll and removeAll methods of the Multiset do not consider the count of the elements. Multisets has two methods that retain and remove by considering the count of the elements.
retainOccurrences
The retainOccurrences method works like this:
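For each element e in the first multiset, it keeps at most as many occurrences of e as the second multiset has. In other words, the resulting count of e is the minimum of its counts in the two multisets, and an element that is absent from the second multiset is removed entirely. An example will explain this better.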
Multiset<String> multiset1 = HashMultiset.create();
multiset1.add("apple");
multiset1.add("orange", 2);
multiset1.add("grape", 8);
multiset2 = HashMultiset.create();
multiset2.add("orange", 8);
multiset2.add("grape", 3);
multiset2.add("pear", 2);
Multisets.retainOccurrences(multiset1, multiset2);
System.out.println(multiset1); //[orange x 2, grape x 3]
We have asked it to retain 8 oranges, 3 grapes and 2 pears.
- Since pear is not there in the first multiset, it does nothing for it.
- Retain 8 oranges - Since it has only two oranges, it retains both (minimum of count of oranges in the two multisets)
- Retain 3 grapes - It retains only 3 grapes (removing 5)
- We haven’t asked it to retain an apple, so it will remove it.
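The rule walked through in the list above can be sketched with plain maps from element to count. This is only a model of the behaviour under that assumption, not Guava's code, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper class: models retainOccurrences on Map<element, count>.
class RetainOccurrencesSketch {

    // Caps every count in 'first' at the count found in 'second';
    // elements that 'second' does not have are dropped from 'first'.
    static void retainOccurrences(Map<String, Integer> first, Map<String, Integer> second) {
        Iterator<Map.Entry<String, Integer>> it = first.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            int allowed = second.getOrDefault(entry.getKey(), 0);
            if (allowed == 0) {
                it.remove();
            } else if (allowed < entry.getValue()) {
                entry.setValue(allowed);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();
        first.put("apple", 1);
        first.put("orange", 2);
        first.put("grape", 8);
        Map<String, Integer> second = new HashMap<>();
        second.put("orange", 8);
        second.put("grape", 3);
        second.put("pear", 2);
        retainOccurrences(first, second);
        System.out.println(first.get("orange"));        // 2
        System.out.println(first.get("grape"));         // 3
        System.out.println(first.containsKey("apple")); // false
    }
}
```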
removeOccurrences
For each element e in the second multiset, it removes as many occurrences of e from the first multiset as the second multiset contains. If the first multiset does not have that many occurrences, the count of e becomes 0, which means e is removed from the first multiset.
multiset1 = HashMultiset.create();
multiset1.add("apple");
multiset1.add("orange", 2);
multiset1.add("grape", 8);
multiset2 = HashMultiset.create();
multiset2.add("orange", 8);
multiset2.add("grape", 3);
multiset2.add("pear", 2);
Multisets.removeOccurrences(multiset1, multiset2);
System.out.println(multiset1); //[apple, grape x 5]
We want to remove 8 oranges, 3 grapes and 2 pears.
- We do not have pear in the first multiset and hence it does nothing for it.
- After removing 3 grapes, we will be left with 5 grapes.
- Remove 8 oranges - we have only 2 oranges in the first multiset and hence it removes oranges entirely.
- We did not tell it to remove any apples and thus it remains.
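The same subtraction can be modelled on plain maps from element to count. As before, this is a sketch under that assumption rather than Guava's implementation, and the names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class: models removeOccurrences on Map<element, count>.
class RemoveOccurrencesSketch {

    // Subtracts the counts in 'second' from 'first'; an element whose count would
    // reach zero or below is removed from 'first'.
    static void removeOccurrences(Map<String, Integer> first, Map<String, Integer> second) {
        for (Map.Entry<String, Integer> e : second.entrySet()) {
            // computeIfPresent removes the mapping when the function returns null
            first.computeIfPresent(e.getKey(),
                    (key, count) -> count > e.getValue() ? count - e.getValue() : null);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();
        first.put("apple", 1);
        first.put("orange", 2);
        first.put("grape", 8);
        Map<String, Integer> second = new HashMap<>();
        second.put("orange", 8);
        second.put("grape", 3);
        second.put("pear", 2);
        removeOccurrences(first, second);
        System.out.println(first.get("apple"));          // 1
        System.out.println(first.get("grape"));          // 5
        System.out.println(first.containsKey("orange")); // false
    }
}
```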
The toMultiset method
The toMultiset method returns a Java Collector that can be plugged into a stream pipeline. It takes three arguments, in this order: a function that maps a stream element to the item to store, a function that maps a stream element to the count for that item, and a supplier for the Multiset to fill. It accumulates each mapped item and its count into the multiset returned by the supplier. If two stream elements map to the same item, their counts are summed.
Let us say we have a list of Location objects. Each location has a name and numItems that represents the number of items at that location. We want to transform this list to a multiset with the element name as the location name and the count as the total number of items at that location.
The following code does that using toMultiset method.
static class Location {
    private String name;
    private int numItems;

    Location(String name, int numItems) {
        this.name = name;
        this.numItems = numItems;
    }

    public String getName() {
        return name;
    }

    public int getNumItems() {
        return numItems;
    }
}

List<Location> items = Arrays.asList(new Location("city1", 10),
        new Location("city2", 10),
        new Location("city1", 20),
        new Location("city3", 25));
Multiset<String> multiset = items.stream()
        .collect(Multisets.toMultiset(Location::getName, Location::getNumItems, HashMultiset::create));
System.out.println(multiset);//[city1 x 30, city2 x 10, city3 x 25]
The counts of city1 (10 and 20) are summed into one entry in the result multiset. The counts of city2 and city3 are carried over unchanged. This method enables us to convert a stream of elements into a multiset.
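For comparison, the same summing can be expressed with only the JDK by grouping on the name and summing the counts into a Map. This is a sketch of the equivalent aggregation, not a replacement for toMultiset: the result is a plain Map, not a Multiset. The class SummedCountsSketch and the method totalsByName are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class showing the JDK-only equivalent of the aggregation.
class SummedCountsSketch {

    static class Location {
        private final String name;
        private final int numItems;

        Location(String name, int numItems) {
            this.name = name;
            this.numItems = numItems;
        }

        String getName() {
            return name;
        }

        int getNumItems() {
            return numItems;
        }
    }

    // Groups the locations by name and sums numItems per name (TreeMap keeps names sorted).
    static Map<String, Integer> totalsByName(List<Location> locations) {
        return locations.stream()
                .collect(Collectors.groupingBy(Location::getName, TreeMap::new,
                        Collectors.summingInt(Location::getNumItems)));
    }

    public static void main(String[] args) {
        List<Location> items = Arrays.asList(new Location("city1", 10),
                new Location("city2", 10),
                new Location("city1", 20),
                new Location("city3", 25));
        System.out.println(totalsByName(items)); // {city1=30, city2=10, city3=25}
    }
}
```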
Other experimental methods
I would also like to share some nice methods on the Multisets class that were experimental (annotated with @Beta) at the time of writing. I will use the two multisets from the intersection example to explain all of the methods discussed in this section.
filter
The filter method takes a multiset and a predicate and returns a view of the multiset with the elements that satisfy the predicate.
Let us say we want to filter the first multiset to keep the elements whose name starts with a vowel. We can do it like this:
// Predicate here is com.google.common.base.Predicate, the type that Multisets.filter accepts
private static Predicate<String> startsWithVowel() {
List<String> vowels = Arrays.asList("a", "e", "i", "o", "u");
return str -> vowels.stream().anyMatch(vowelCharacter -> str.startsWith(vowelCharacter));
}
Multiset<String> filtered = Multisets.filter(multiset1, startsWithVowel());
System.out.println(filtered);//[orange x 2, apple]
union
Similar to intersection, we have a method to take union of two multisets. It returns an unmodifiable view as the result. It combines the elements from both the multisets by setting the count of the elements as the maximum from the two multisets.
Multiset<String> union = Multisets.union(multiset1, multiset2);
System.out.println(union); //[orange x 2, apple, grape x 10, pear x 6]
sum
This returns an unmodifiable view of the sum of two multisets. The count of each element is the sum of its counts in the two multisets.
Multiset<String> sum = Multisets.sum(multiset1, multiset2);
System.out.println(sum);//[orange x 3, apple x 2, grape x 16, pear x 6]
difference
Like the sum method, this returns an unmodifiable view of the difference of two multisets. The count of each element is its count in the first multiset minus its count in the second. If the count of an element in the first multiset is less than or equal to its count in the second multiset, it will not be in the result. Hence, only the elements that occur more often in the first multiset than in the second end up in the result.
Multiset<String> difference = Multisets.difference(multiset1, multiset2);
System.out.println(difference);//[orange]
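The four view methods (intersection, union, sum and difference) differ only in how they combine the two counts of each element. The sketch below spells out that arithmetic for the fruit counts of the two example multisets. It is a plain-Java illustration of the rules described in this post, with hypothetical names, and it does not call Guava.

```java
// Hypothetical helper class: the per-element count arithmetic behind the four views.
class CountArithmeticSketch {

    static int intersection(int a, int b) {
        return Math.min(a, b);
    }

    static int union(int a, int b) {
        return Math.max(a, b);
    }

    static int sum(int a, int b) {
        return a + b;
    }

    // A difference never goes negative; zero means the element is absent from the result.
    static int difference(int a, int b) {
        return Math.max(0, a - b);
    }

    public static void main(String[] args) {
        String[] fruits = {"apple", "orange", "grape", "pear"};
        int[] first = {1, 2, 6, 0};   // counts in the first multiset
        int[] second = {1, 1, 10, 6}; // counts in the second multiset
        for (int i = 0; i < fruits.length; i++) {
            System.out.println(fruits[i]
                    + ": intersection=" + intersection(first[i], second[i])
                    + ", union=" + union(first[i], second[i])
                    + ", sum=" + sum(first[i], second[i])
                    + ", difference=" + difference(first[i], second[i]));
        }
    }
}
```

A resulting count of 0 means the element does not appear in that view, which is why only orange shows up in the difference.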
Multiset classes
So far we have seen the methods provided by the Multiset interface and the utility methods of the Multisets class. Now, we will look at a few key concrete implementations of the Multiset interface.
To create an instance of any of the below classes, we can use the static create method on it.
HashMultiset
This is the one we have used for our examples so far. It behaves like a HashSet (or a HashMap since a HashSet uses a HashMap). It hence does not provide any guarantees about the ordering of the elements in the multiset.
LinkedHashMultiset
It provides a predictable iteration order. It is like a LinkedHashSet and it maintains the elements according to the insertion order.
Multiset<String> linkedHashMultiSet = LinkedHashMultiset.create();
linkedHashMultiSet.add("orange");
linkedHashMultiSet.add("apple");
System.out.println(linkedHashMultiSet); //[orange, apple]
TreeMultiset
This implementation of the Multiset orders the elements by their natural ordering or an explicitly provided Comparator. Thus, this is like a TreeMap or a TreeSet.
The example below orders strings by their natural ordering (lexicographic ordering).
Multiset<String> treeMultiSet = TreeMultiset.create();
treeMultiSet.add("orange");
treeMultiSet.add("apple");
treeMultiSet.add("pear");
System.out.println(treeMultiSet); //[apple, orange, pear]
Let us pass it a comparator to order strings by their length.
treeMultiSet = TreeMultiset.create(Comparator.comparing(s -> s.length()));
treeMultiSet.add("orange");
treeMultiSet.add("apple");
treeMultiSet.add("pear");
System.out.println(treeMultiSet); //[pear, apple, orange]
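One thing worth checking before using a custom comparator: sorted collections decide whether two elements are the same by the comparator, not by equals, and the Guava documentation for TreeMultiset warns that the comparison should be consistent with equals. The sketch below uses a plain java.util.TreeMap as a stand-in (an assumption made to keep the example free of dependencies) to show how a length-only comparator folds "apple" and "grape" into one key. I would expect a TreeMultiset created with the same comparator to count them as one element in the same way, but do verify this against the Guava version you use. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class: a TreeMap<word, count> ordered by word length only.
class LengthComparatorSketch {

    // Counts words in a map whose keys are compared by length, so words of
    // equal length share a single entry.
    static Map<String, Integer> countByLength(String... words) {
        Map<String, Integer> counts = new TreeMap<>(Comparator.comparingInt(String::length));
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "apple" and "grape" both have length 5, so they end up under one key
        System.out.println(countByLength("orange", "apple", "grape", "pear"));
        // {pear=1, apple=2, orange=1}
    }
}
```

If distinct strings of the same length must stay distinct, the comparator needs a tie-breaker, for example Comparator.comparingInt(String::length).thenComparing(Comparator.naturalOrder()).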
I have a separate post on the various Comparator.comparing methods. Do check it out after reading this if you are interested.
ConcurrentHashMultiset
This is the multiset counterpart of a ConcurrentHashMap: it supports concurrent modifications, so multiple threads can add and remove elements safely.
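To give a feel for the kind of problem this solves, here is a minimal sketch of counting from several threads. It uses a plain ConcurrentHashMap with merge as a stand-in for the multiset (an assumption made so the example runs without Guava), and the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper class: thread-safe counting with a ConcurrentHashMap.
class ConcurrentCountSketch {

    // Starts 'threads' workers that each add "apple" 'addsPerThread' times,
    // waits for them to finish, and returns the final count.
    static int countFromThreads(int threads, int addsPerThread) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < addsPerThread; j++) {
                    counts.merge("apple", 1, Integer::sum); // atomic update per key
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return counts.getOrDefault("apple", 0);
    }

    public static void main(String[] args) {
        System.out.println(countFromThreads(4, 1000)); // 4000
    }
}
```

With a non-thread-safe structure such as a HashMap, the same code could lose updates. A concurrent implementation is meant to avoid exactly that.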
Conclusion
In this post we learnt about the Multiset in Google Guava. We first learnt the basic APIs or methods provided by Multiset. Then we looked at the static utility methods from the Multisets class. Finally, we saw some common implementing classes of the Multiset interface along with their behaviour. I covered all these sections with sufficient examples.
Feel free to leave a comment if you have any questions or need clarification.
Check out the other posts on Google Guava to explore what other features it provides.