Introduction
groupingBy is one of the static utility methods in the JDK's Collectors class. It returns a Collector that groups the stream elements by a key (for example, a field) that we choose. In this blog post, we will look at Collectors.groupingBy with examples.
Grouping by
Before we look at Collectors.groupingBy, I’ll show what grouping by means (feel free to skip this section if you already know this).
The formal definition of grouping is “the action of putting people or things in a group or groups”. Let us say we have a list of items where each item has a name, a price and a manufacturing date. We would like to group the items by manufacturing month. As a result, we would end up with a Map whose key is the month and whose value is the list of items that were manufactured in that month. We are thus collecting the items that have the same manufacturing month into one entry.
The groupingBy we are discussing here plays a role similar to the GROUP BY clause in SQL.
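To make the idea concrete, here is a minimal sketch of the item-by-month grouping done by hand with a plain loop, before any collector is involved. The Item class, its field names and the sample items are hypothetical and exist only for illustration; for brevity the map holds item names rather than whole items.

```java
import java.time.LocalDate;
import java.time.Month;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class ManualGroupingSketch {

    // Hypothetical item type for illustration; only the fields needed here.
    static final class Item {
        final String name;
        final LocalDate manufacturedOn;

        Item(String name, LocalDate manufacturedOn) {
            this.name = name;
            this.manufacturedOn = manufacturedOn;
        }
    }

    // Groups item names by manufacturing month using a plain loop.
    static Map<Month, List<String>> namesByMonth(List<Item> items) {
        Map<Month, List<String>> byMonth = new EnumMap<>(Month.class);
        for (Item item : items) {
            // Create the month's list on first use, then append to it.
            byMonth.computeIfAbsent(item.manufacturedOn.getMonth(), m -> new ArrayList<>())
                    .add(item.name);
        }
        return byMonth;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("Kettle", LocalDate.of(2020, 1, 15)),
                new Item("Toaster", LocalDate.of(2020, 3, 2)),
                new Item("Blender", LocalDate.of(2020, 1, 28)));
        System.out.println(namesByMonth(items));
        // prints {JANUARY=[Kettle, Blender], MARCH=[Toaster]}
    }
}
```

This bookkeeping (create the list if the key is new, then add the element) is what a grouping collector takes care of for us.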
Collectors groupingBy
Let us look at the javadoc of Collectors.groupingBy:
Returns a {@code Collector} implementing a “group by” operation on input elements of type {@code T}, grouping elements according to a classification function, and returning the results in a {@code Map}.
The classification function maps elements to some key type {@code K}. The collector produces a {@code Map<K, List<T>>} whose keys are the values resulting from applying the classification function to the input elements, and whose corresponding values are {@code List}s containing the input elements which map to the associated key under the classification function.
The basic groupingBy method accepts a function that maps each stream element (of type T) to a key (of type K). The collector returns a map whose keys are the K values produced. The value for each key is the list of stream elements that the function mapped to that key.
Basic groupingBy
Signature:
public static <T, K> Collector<T, ?, Map<K, List<T>>>
groupingBy(Function<? super T, ? extends K> classifier)
The mapper function discussed above is the classifier here. The result type of the Collector is Map<K, List<T>> as explained above.
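As a small illustration of this overload, the sketch below groups words by their length, so T is String and K is Integer. The class name, method name and sample words are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BasicGroupingSketch {

    // The classifier is String::length, so each key is a word length.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> grouped =
                byLength(Arrays.asList("ant", "crow", "bee", "eagle", "deer"));
        System.out.println(grouped.get(3)); // prints [ant, bee]
        System.out.println(grouped.get(4)); // prints [crow, deer]
        System.out.println(grouped.get(5)); // prints [eagle]
    }
}
```

The sketch looks up individual keys rather than printing the whole map, because the iteration order of the returned map is not specified.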
Using a Custom collector
Signature:
public static <T, K, A, D> Collector<T, ?, Map<K, D>>
groupingBy(Function<? super T, ? extends K> classifier,
Collector<? super T, A, D> downstream)
This is an overloaded version of the groupingBy method. Rather than just collecting the grouped items into a list, we can perform a reduction operation on the values associated with each key to convert them into some other value. To enable this, it accepts a downstream Collector.
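A minimal sketch of the downstream idea: instead of keeping a list of words per key, we reduce each group to a count with Collectors.counting(). The class name, method name and sample words are assumptions for illustration, and the sketch assumes every word is non-empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DownstreamSketch {

    // Assumes every word is non-empty, since charAt(0) is used as the key.
    static Map<Character, Long> countByFirstLetter(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(w -> w.charAt(0),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Character, Long> counts = countByFirstLetter(
                Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry", "apricot"));
        System.out.println(counts.get('a')); // prints 3
        System.out.println(counts.get('b')); // prints 2
        System.out.println(counts.get('c')); // prints 1
    }
}
```

Here D, the result type of the downstream collector, is Long, so the overall result is a Map<Character, Long> rather than a map of lists.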
Custom map supplier
Signature:
public static <T, K, D, A, M extends Map<K, D>>
Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier,
Supplier<M> mapFactory,
Collector<? super T, A, D> downstream)
For the two overloads above, the API documentation makes no guarantee on the type of the resultant map. If we want the result to be a particular map type, we can pass a supplier returning a map of that type.
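As a sketch of why the map type can matter, the example below passes LinkedHashMap::new as the map factory so that, for a sequential stream, the keys appear in the order in which they were first encountered. The class name, method name and sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MapFactorySketch {

    // LinkedHashMap keeps keys in the order they were first inserted.
    static Map<Integer, List<String>> byLengthInFirstSeenOrder(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length,
                        LinkedHashMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(byLengthInFirstSeenOrder(
                Arrays.asList("eagle", "ant", "crow", "bee")));
        // prints {5=[eagle], 3=[ant, bee], 4=[crow]}
    }
}
```

Note that this overload always needs an explicit downstream collector; Collectors.toList() reproduces the behaviour of the basic version.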
Example setup
I have created a Student class with simple fields for this example. The examples in the next section run against a list of students (named students) built from it.
import java.util.ArrayList;
import java.util.List;

public class Student {
    private Long id;
    private String name;
    private Integer age;
    private String department;
    private List<String> courses;
    private Address address;

    Student(Long id, String name, Integer age, String department, List<String> courses, Address address) {
        this.id = id;
        this.name = name;
        this.age = age;
        this.department = department;
        this.courses = courses;
        this.address = address;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public Integer getAge() {
        return age;
    }

    public String getDepartment() {
        return department;
    }

    public List<String> getCourses() {
        // Return a copy so that callers cannot modify the internal list
        return new ArrayList<>(courses);
    }

    public Address getAddress() {
        return address;
    }

    @Override
    public String toString() {
        // Printing only id and name for brevity
        return "id = " + id
                + ", name = " + name;
    }
}
public class Address {
    private String addressLine;
    private String city;
    private String zip;

    // Package-private (like the Student constructor) so that the example code can create addresses
    Address(String addressLine, String city, String zip) {
        this.addressLine = addressLine;
        this.city = city;
        this.zip = zip;
    }

    public String getAddressLine() {
        return addressLine;
    }

    public String getCity() {
        return city;
    }

    public String getZip() {
        return zip;
    }

    @Override
    public String toString() {
        return "addressLine = " + addressLine
                + ", city = " + city
                + ", zip = " + zip;
    }
}
Note that the getter method, getCourses, returns a copy of the list of courses rather than the underlying list itself. This keeps callers from modifying the student's internal list, which helps keep the class immutable.
To know more, I recommend reading Benefits of Immutable class in Java and How to make a class immutable.
Examples
In this section, we will look at various scenarios or use cases and solve each of them using groupingBy. Before you look at a solution, try to roughly construct it without using the groupingBy method. You will then realise how much easier groupingBy makes it for us to write code that is elegant, concise and error free.
Group the students by their name
From the list of students constructed earlier, group the students by their name.
Map<String, List<Student>> nameToStudents = students.stream()
        .collect(Collectors.groupingBy(Student::getName));
System.out.println(nameToStudents);
Prints,
{Julia=[id = 4, name = Julia], Don=[id = 1, name = Don], James=[id = 2, name = James], Ben=[id = 3, name = Ben], Mary=[id = 5, name = Mary]}
Student ids in each department
List the set of student ids by each department.
Map<String, Set<Long>> departmentToStudentIds = students.stream()
        .collect(Collectors.groupingBy(Student::getDepartment,
                Collectors.mapping(Student::getId, Collectors.toSet())));
System.out.println(departmentToStudentIds);
Prints,
{CS=[1, 2], Management=[5], Business=[3, 4]}
The classifier function of the groupingBy converts a Student to their department. We then use a downstream collector: Collectors.mapping extracts the id from each Student. The mapping collector is built for exactly this, applying a mapping function to each element before passing it on. Collectors.toSet() is in turn the downstream collector of the mapping collector; using it, we collect the student ids into a set.
To get the results in a TreeMap, we can pass a map supplier. The output below has the same entries as above, but since we are using a TreeMap, the keys are sorted by their natural ordering. Since the key is a String here, the natural ordering is the lexicographical ordering.
TreeMap<String, Set<Long>> departmentToStudentIdsAsTreeMap = students.stream()
        .collect(Collectors.groupingBy(Student::getDepartment,
                TreeMap::new,
                Collectors.mapping(Student::getId, Collectors.toSet())));
System.out.println(departmentToStudentIdsAsTreeMap);
Prints,
{Business=[3, 4], CS=[1, 2], Management=[5]}
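The mapping collector used above composes with other downstream collectors too. As a hedged sketch, the example below joins the student names of each department into one comma-separated string with Collectors.joining. To stay self-contained it uses hypothetical {name, department} string pairs in place of the Student class; the names and departments are taken from the outputs shown in this post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MappingSketch {

    // Each row is a hypothetical {name, department} pair standing in for a Student.
    static Map<String, String> namesByDepartment(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(row -> row[1],
                        // Extract the name first, then join the names of one department.
                        Collectors.mapping(row -> row[0], Collectors.joining(", "))));
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"Don", "CS"},
                new String[] {"James", "CS"},
                new String[] {"Ben", "Business"},
                new String[] {"Julia", "Business"},
                new String[] {"Mary", "Management"});
        Map<String, String> names = namesByDepartment(rows);
        System.out.println(names.get("CS"));         // prints Don, James
        System.out.println(names.get("Business"));   // prints Ben, Julia
        System.out.println(names.get("Management")); // prints Mary
    }
}
```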
Average student age by department
Find the average student age in a department.
Map<String, Double> departmentToAvgAge = students.stream()
        .collect(Collectors.groupingBy(Student::getDepartment,
                Collectors.averagingInt(Student::getAge)));
System.out.println(departmentToAvgAge);
Prints,
{CS=22.0, Management=24.0, Business=21.5}
Collectors.averagingInt is used to find the average age of the students.
In CS, there are two students, both aged 22, so the average is 22. The Management department has only one student, aged 24. The Business department has two students with ages 20 and 23, and thus the average is 21.5.
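The arithmetic above can be checked with a small self-contained sketch. The Member class is a hypothetical stand-in for Student that carries only a department and an age; the ages are the ones stated above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AverageAgeSketch {

    // Hypothetical stand-in for Student with only the fields needed here.
    static final class Member {
        final String department;
        final int age;

        Member(String department, int age) {
            this.department = department;
            this.age = age;
        }
    }

    static Map<String, Double> averageAgeByDepartment(List<Member> members) {
        return members.stream()
                .collect(Collectors.groupingBy(m -> m.department,
                        Collectors.averagingInt(m -> m.age)));
    }

    public static void main(String[] args) {
        Map<String, Double> averages = averageAgeByDepartment(Arrays.asList(
                new Member("CS", 22),
                new Member("CS", 22),
                new Member("Management", 24),
                new Member("Business", 20),
                new Member("Business", 23)));
        System.out.println(averages.get("CS"));         // prints 22.0
        System.out.println(averages.get("Management")); // prints 24.0
        System.out.println(averages.get("Business"));   // prints 21.5
    }
}
```

averagingInt always produces a Double, even when the average is a whole number, which is why the result map is a Map<String, Double>.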
Map student id to name, grouping by department
Construct a map of student id to name. Construct one such map for each department (grouping by department).
In other words, for each department, display the student id to name mapping.
Map<String, Map<Long, String>> deptToStudentIdToName = students.stream()
        .collect(Collectors.groupingBy(Student::getDepartment,
                Collectors.toMap(Student::getId, Student::getName)));
System.out.println(deptToStudentIdToName);
Prints,
{CS={1=Don, 2=James}, Management={5=Mary}, Business={3=Ben, 4=Julia}}
Collectors.toMap is used to create the desired inner map of student id to name. I have a separate post on Collectors.toMap.
Number of students from a city, grouped by department
For each department, show the number of students from each city.
Map<String, Map<String, Long>> deptToCityCount = students.stream()
        .collect(Collectors.groupingBy(Student::getDepartment,
                Collectors.groupingBy(student -> student.getAddress().getCity(),
                        Collectors.counting())));
System.out.println(deptToCityCount);
Prints,
{CS={Midland=1, San Jose=1}, Management={Kent=1}, Business={Dallas=2}}
We can achieve the same result by using Collectors.toMap (but using Collectors.counting is more concise).
deptToCityCount = students.stream()
        .collect(Collectors.groupingBy(Student::getDepartment,
                Collectors.toMap(s -> s.getAddress().getCity(),
                        student -> 1L,
                        Long::sum)));
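To check that the two approaches really agree, here is a self-contained sketch that runs both and compares the resulting maps. It uses hypothetical {department, city} string pairs in place of the Student and Address classes; the departments and cities are the ones from the output above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CountingVsToMapSketch {

    // Each row is a hypothetical {department, city} pair standing in for a Student.
    static Map<String, Map<String, Long>> viaCounting(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(row -> row[0],
                        Collectors.groupingBy(row -> row[1], Collectors.counting())));
    }

    static Map<String, Map<String, Long>> viaToMap(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(row -> row[0],
                        // Start each city at 1 and add the counts when a city repeats.
                        Collectors.toMap(row -> row[1], row -> 1L, Long::sum)));
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"CS", "Midland"},
                new String[] {"CS", "San Jose"},
                new String[] {"Management", "Kent"},
                new String[] {"Business", "Dallas"},
                new String[] {"Business", "Dallas"});
        System.out.println(viaCounting(rows).get("Business"));    // prints {Dallas=2}
        System.out.println(viaCounting(rows).equals(viaToMap(rows))); // prints true
    }
}
```

The merge function (Long::sum) is what makes the toMap version work; without it, toMap would throw an IllegalStateException on the second Dallas entry.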
Conclusion
In this post, we have looked at Collectors.groupingBy with a lot of examples. Collectors.groupingBy returns a collector that can be used to group the stream elements by a key. Grouping and reducing with it takes far less code than the pre-Java 8 way (without the Streams API).