Collectors partitioningBy in Java

Introduction

The Java Collectors class has many static utility methods that return various collectors. I have already written about the toMap and groupingBy collectors in Java 8 and about the Collector methods added in JDK 9 to 11. In this post, we will learn about the Collectors partitioningBy method and see how the partitioning collector differs from the groupingBy collector.

Partitioning Collector with a Predicate

This is the first (and simpler) of the two Collectors partitioningBy methods. It takes a Predicate and returns a Collector that partitions the elements of the stream according to that predicate. Each element is passed to the predicate. Since a predicate can evaluate to either true or false, this Collector splits the elements into two lists and returns the result as a Map<Boolean, List<T>>.

This map will always have two entries: one for Boolean.TRUE and another for Boolean.FALSE. The elements for which the predicate returned true are collected into the list mapped to Boolean.TRUE, and the rest into the list mapped to Boolean.FALSE.

Note: There are no guarantees on the type, mutability, serializability, or thread-safety of the Map or the List returned.

Let us look at an example.

List<String> strings = List.of("apple", "orange", "banana", "pear");
Map<Boolean, List<String>> partitionedFruitsByLength = strings.stream()
        .collect(Collectors.partitioningBy(fruit -> fruit.length() > 4));

System.out.println(partitionedFruitsByLength);

We have a list of fruits. We create a stream out of the list and collect its elements using the Collector returned by Collectors.partitioningBy. We pass the predicate fruit -> fruit.length() > 4, which returns true if the length of the fruit's name is greater than 4 and false otherwise.

Only one fruit (pear) fails to satisfy this condition. The result is shown below.
{false=[pear], true=[apple, orange, banana]}

The Map has the fruit ‘pear’ mapped to the key false and the rest of the fruits are mapped to the key true.
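
In practice the two partitions are usually read with get(true) and get(false) rather than printed as a whole. The following is a minimal, self-contained sketch of that; the class name PartitionLookupExample and the helper partitionByLength are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class PartitionLookupExample {

    // Splits the names into those longer than four characters (true) and the rest (false).
    static Map<Boolean, List<String>> partitionByLength(List<String> names) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() > 4));
    }

    public static void main(String[] args) {
        Map<Boolean, List<String>> partitioned =
                partitionByLength(List.of("apple", "orange", "banana", "pear"));

        List<String> longNames = partitioned.get(true);   // elements that satisfied the predicate
        List<String> shortNames = partitioned.get(false); // elements that did not

        System.out.println(longNames);  // [apple, orange, banana]
        System.out.println(shortNames); // [pear]
    }
}
```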

The Partitioned Map always has two entries

The map returned by partitioning will always have mappings for both false and true. If the predicate returned true for all the elements, the mapped value for Boolean.FALSE will be an empty list and vice versa.

Let us change the predicate to return true if the length of the fruit's name is greater than or equal to 4.

partitionedFruitsByLength = strings.stream()
        .collect(Collectors.partitioningBy(fruit -> fruit.length() >= 4));
System.out.println(partitionedFruitsByLength);

This prints,

{false=[], true=[apple, orange, banana, pear]}

The value for false is an empty list.
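
The same guarantee holds at the extreme: even an empty stream should yield a map with both keys. Here is a small sketch of that case; EmptyPartitionExample and its helper are hypothetical names, not part of the original examples.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, used only for this illustration.
class EmptyPartitionExample {

    // Same predicate as in the post, applied to an arbitrary stream of names.
    static Map<Boolean, List<String>> partitionByLength(Stream<String> names) {
        return names.collect(Collectors.partitioningBy(name -> name.length() > 4));
    }

    public static void main(String[] args) {
        Map<Boolean, List<String>> partitioned = partitionByLength(Stream.empty());

        // Both keys are present even though no element was ever tested.
        System.out.println(partitioned);        // {false=[], true=[]}
        System.out.println(partitioned.size()); // 2
    }
}
```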

Partitioning Collectors with a Predicate and a downstream Collector

Sometimes, we might want to do more than just group the elements as a list. There is an overloaded partitioningBy method that takes a downstream collector in addition to a Predicate.

This method returns a Collector that partitions the elements by the passed predicate. But rather than collecting them into lists, it reduces each partition with the passed downstream collector. It still returns a Map with the two Boolean keys, but the value of each entry is determined by the downstream collector (which performs the downstream reduction).

Example: Let us use the same predicate as earlier but count the number of fruits falling in each category. We use a downstream collector of Collectors.counting() for this.

Map<Boolean, Long> fruitsCount = strings.stream()
        .collect(Collectors.partitioningBy(fruit -> fruit.length() > 4,
                Collectors.counting()));
System.out.println(fruitsCount); //{false=1, true=3}

We can see that the value for the key false is 1 as there was one fruit for which the predicate evaluated to false. And the value for the key true is 3.

If we had used a downstream collector of toList(), then it would be equivalent to the first partitioningBy method we saw.

partitionedFruitsByLength = strings.stream()
        .collect(Collectors.partitioningBy(fruit -> fruit.length() > 4,
                Collectors.toList()));

System.out.println(partitionedFruitsByLength); //{false=[pear], true=[apple, orange, banana]}
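
Any collector can serve as the downstream collector, so the value type of the map follows from it. As a sketch, the two helpers below (hypothetical names) use Collectors.summingInt and Collectors.joining with the same predicate; they are illustrations and not part of the original examples.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class DownstreamCollectorsExample {

    // Total number of characters of the names in each partition.
    static Map<Boolean, Integer> totalLengths(List<String> names) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() > 4,
                        Collectors.summingInt(String::length)));
    }

    // Names of each partition joined into one comma-separated String.
    static Map<Boolean, String> joinedNames(List<String> names) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() > 4,
                        Collectors.joining(", ")));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "orange", "banana", "pear");
        System.out.println(totalLengths(fruits)); // {false=4, true=17}
        System.out.println(joinedNames(fruits));  // {false=pear, true=apple, orange, banana}
    }
}
```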

Partition with a mapping Collector

Let us partition again in the same way but transform the fruit names into upper case before collecting them into the list. We can do this by using the mapping collector as the downstream collector. The mapping collector, in turn, uses toList as its own downstream collector.

Map<Boolean, List<String>> partitionedFruitsInUpperCaseByLength = strings.stream()
        .collect(Collectors.partitioningBy(fruit -> fruit.length() > 4,
                Collectors.mapping(String::toUpperCase, Collectors.toList())));

System.out.println(partitionedFruitsInUpperCaseByLength);

String::toUpperCase is a method reference equivalent to the lambda expression fruit -> fruit.toUpperCase(). This outputs,

{false=[PEAR], true=[APPLE, ORANGE, BANANA]}
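
The mapping function does not have to return the element's own type. As a sketch, the hypothetical helper below maps each name to its length, so the map's value type becomes List<Integer>.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class MappingToLengthExample {

    // Partitions by the same predicate, but stores the length of each name instead of the name.
    static Map<Boolean, List<Integer>> partitionedLengths(List<String> names) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() > 4,
                        Collectors.mapping(String::length, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "orange", "banana", "pear");
        System.out.println(partitionedLengths(fruits)); // {false=[4], true=[5, 6, 6]}
    }
}
```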

Similarities between Collectors partitioningBy and groupingBy

Let us now look at the similarities between partitioningBy and groupingBy Collectors. There are three overloaded groupingBy methods.

  • Grouping by a Function
  • Grouping by a Function and a downstream collector
  • Grouping by a Function, a supplier for the resulting map type, and a downstream collector
Among these, the first two are somewhat equivalent to the two partitioningBy methods we saw.
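
The third groupingBy overload has no partitioningBy counterpart. For reference, here is a minimal sketch of it that groups the fruits by name length into a TreeMap; the class and method names are hypothetical, and TreeMap is chosen only so that the keys come out sorted.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class GroupingWithMapSupplierExample {

    // groupingBy(classifier, mapFactory, downstream): the map factory decides the Map implementation.
    static Map<Integer, List<String>> groupByLengthSorted(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "orange", "banana", "pear");
        System.out.println(groupByLengthSorted(fruits)); // {4=[pear], 5=[apple], 6=[orange, banana]}
    }
}
```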
 
Example:
We partitioned the fruits by a predicate and counted the number of fruits in each partition. We can achieve the same using groupingBy too. For this input, both produce the same output.
 
//Group by length
Map<Boolean, List<String>> groupedByLength = strings.stream()
        .collect(Collectors.groupingBy(fruit -> fruit.length() > 4));
System.out.println(groupedByLength); //{false=[pear], true=[apple, orange, banana]}

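//Count per group (a new variable name, since fruitsCount was already declared above)
Map<Boolean, Long> groupedFruitsCount = strings.stream()
        .collect(Collectors.groupingBy(fruit -> fruit.length() > 4,
                Collectors.counting()));
System.out.println(groupedFruitsCount); //{false=1, true=3}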

A partitioningBy method takes a Predicate whereas groupingBy takes a Function. Here, both receive the same lambda expression, fruit -> fruit.length() > 4. The same lambda expression can act either as a Function (in groupingBy) or as a Predicate (in partitioningBy) because the target type of a lambda expression is determined by the context in which it appears.

When it acts as a Predicate, it accepts a String and returns a primitive boolean (Predicate<String>).
When it acts as a Function, it maps a String to a Boolean, with the boolean result boxed (Function<String, Boolean>).
 
The downstream collector used is exactly the same in both cases.
 
In summary, grouping by a function that returns a boolean is roughly equivalent to using partitioningBy. Thus, partitioningBy is the more appropriate choice here.

Differences between Collectors partitioningBy and groupingBy

Let us now look at the differences between partitioningBy and groupingBy Collectors. 

Difference in the result keys

The first difference is in the nature of the returned map. As we saw, partitioningBy always returns a map with exactly two entries, one for true and one for false. If no elements fall under a key, its value is an empty list.

But groupingBy does not behave like this. If we use the lambda fruit -> fruit.length() > 3, all the input elements are grouped into a list under the key true, and the result has no entry at all for the key false. This is because, for a partitioning Collector, the keys are known in advance, viz., true and false, so it always generates a map with two keys. A groupingBy, on the other hand, takes a general-purpose function and cannot determine all the possible output values of that function to include in the result.

partitionedFruitsByLength = strings.stream()
        .collect(Collectors.partitioningBy(fruit -> fruit.length() > 3));
System.out.println(partitionedFruitsByLength); //{false=[], true=[apple, orange, banana, pear]}

groupedByLength = strings.stream()
        .collect(Collectors.groupingBy(fruit -> fruit.length() > 3));
System.out.println(groupedByLength); //{true=[apple, orange, banana, pear]}
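
The practical consequence shows up when the missing key is read. The sketch below (hypothetical class and method names) compares get(false) on both maps and uses Map.getOrDefault as one possible way to guard the groupingBy result.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class MissingKeyExample {

    static Map<Boolean, List<String>> partition(List<String> names) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() > 3));
    }

    static Map<Boolean, List<String>> group(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(name -> name.length() > 3));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "orange", "banana", "pear");

        System.out.println(partition(fruits).get(false));               // []
        System.out.println(group(fruits).get(false));                   // null
        System.out.println(group(fruits).containsKey(false));           // false

        // One way to avoid a null check on the grouped map.
        System.out.println(group(fruits).getOrDefault(false, List.of())); // []
    }
}
```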

Predicate vs Function

The second difference relates to the lambda’s type. We have seen that the same lambda expression can be used in different contexts and take on a different target type in each. But once the lambda has been assigned to a variable of a particular functional interface type, that type is fixed at compile time and the two variables are not interchangeable.

Predicate<String> stringPredicate = str -> str.length() > 4;
Function<String, Boolean> function = str -> str.length() > 4;

//Each variable fits only its own method; swapping the two arguments would not compile
strings.stream()
        .collect(Collectors.partitioningBy(stringPredicate));
strings.stream()
        .collect(Collectors.groupingBy(function));

A Predicate<T> can be passed only to partitioningBy, and a Function<T, Boolean> only to groupingBy; neither variable can be passed directly to the other method.
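
If an existing variable of the "wrong" type has to be reused, a method reference can adapt it, because predicate::test and function::apply are again expressions that take their target type from the context. The following is a sketch of that idea with hypothetical class and method names.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class AdaptingFunctionalTypesExample {

    // Reuses a Predicate with groupingBy by viewing it as a Function (the boolean result is boxed).
    static Map<Boolean, List<String>> groupWithPredicate(List<String> names,
                                                         Predicate<String> predicate) {
        Function<String, Boolean> asFunction = predicate::test;
        return names.stream().collect(Collectors.groupingBy(asFunction));
    }

    // Reuses a Function with partitioningBy by viewing it as a Predicate.
    // The Boolean result is unboxed, so the function must never return null.
    static Map<Boolean, List<String>> partitionWithFunction(List<String> names,
                                                            Function<String, Boolean> function) {
        Predicate<String> asPredicate = function::apply;
        return names.stream().collect(Collectors.partitioningBy(asPredicate));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "orange", "banana", "pear");
        Predicate<String> stringPredicate = str -> str.length() > 4;
        Function<String, Boolean> function = str -> str.length() > 4;

        System.out.println(groupWithPredicate(fruits, stringPredicate).get(true)); // [apple, orange, banana]
        System.out.println(partitionWithFunction(fruits, function).get(false));    // [pear]
    }
}
```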

Conclusion

In this post, we looked at the Collectors partitioningBy method. We learnt that it partitions elements by a predicate and can accept a downstream collector for further reduction. We then saw the similarities and functional differences between the partitioningBy and groupingBy collectors.
Check out the other posts on Java Streams.
