Introduction

When using the Java 8 streams API, we can use the collect method to perform mutable reductions on the data stream. A mutable reduction operation is also known as a fold operation. It takes the individual elements of the stream and combines them into a single result container by applying a combining operation. Examples include finding the sum or minimum of a set of numbers. The collect method takes a Collector instance, and the toMap static method in the Collectors class provides one such Collector. It is used to reduce a stream of objects into a map. In this blog post, I will demonstrate the Java 8 Collectors.toMap method with examples and show how it’s used to fold a stream into a map.

Collectors.toMap - Formal definition

Collectors.toMap:

Returns a Collector that accumulates elements into a Map whose keys and values are the result of applying the provided mapping functions to the input elements.

Signature:

public static <T, K, U> 
    Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper, 
                                    Function<? super T, ? extends U> valueMapper)

A basic toMap method takes a key mapper and a value mapper. These are Java 8 Functions. These functions are used to derive the key and value. An input element from the stream is passed to the keyMapper. The output of that mapper will be used as the map key. Similarly, the valueMapper is used to derive the map value from the input.
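As a minimal sketch of the two mappers at work, the snippet below maps each word in a list to its length. The class name WordLengths, the method lengthsByWord and the sample words are illustrative assumptions and not part of the Student example used later in this post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for illustration.
class WordLengths {

    // Builds a map from each word (the key) to its length (the value).
    static Map<String, Integer> lengthsByWord(List<String> words) {
        Function<String, String> keyMapper = Function.identity(); // the word itself is the key
        Function<String, Integer> valueMapper = String::length;   // the word's length is the value
        return words.stream().collect(Collectors.toMap(keyMapper, valueMapper));
    }

    public static void main(String[] args) {
        Map<String, Integer> lengths = lengthsByWord(Arrays.asList("stream", "map", "fold"));
        System.out.println(lengths.get("stream")); // prints 6
        System.out.println(lengths.get("map"));    // prints 3
    }
}
```

The mappers are pulled out into named variables here only to make their roles visible; passing the method references directly to toMap works just as well.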

Collectors.toMap - Overloads

Merge function

A couple of overloads exist for the toMap method. The default behaviour of the Collector returned by the toMap method seen above is to throw an IllegalStateException when more than one input element from the stream gets mapped to the same key. In other words, it does not allow duplicate keys in the map. When the key mapper can produce the same key for multiple inputs in the stream, we can pass a function to resolve this and provide the new value. This function is a BinaryOperator.

A BinaryOperator<T> is a specialization of BiFunction<T, T, T>. It takes two arguments of the same type and returns a result of that same type (think of it as func(ob1, ob2) -> ob3 where ob1, ob2 and ob3 all have the same type).
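To make the shape of a BinaryOperator concrete, here is a small sketch with two operators. The class name BinaryOperatorDemo and the constant names are made up for this illustration.

```java
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical demo class, used only for illustration.
class BinaryOperatorDemo {

    // Adds two integers; both arguments and the result share the type Integer.
    static final BinaryOperator<Integer> SUM = (a, b) -> a + b;

    // A BinaryOperator<Integer> can be assigned to a BiFunction<Integer, Integer, Integer>.
    static final BiFunction<Integer, Integer, Integer> SUM_AS_BIFUNCTION = SUM;

    // Keeps the first of two strings and ignores the second.
    static final BinaryOperator<String> KEEP_FIRST = (first, second) -> first;

    public static void main(String[] args) {
        System.out.println(SUM.apply(2, 3));                 // prints 5
        System.out.println(SUM_AS_BIFUNCTION.apply(10, 20)); // prints 30
        System.out.println(KEEP_FIRST.apply("old", "new"));  // prints old
    }
}
```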

The signature for this is

public static <T, K, U> 
    Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                    Function<? super T, ? extends U> valueMapper,
                                    BinaryOperator<U> mergeFunction) 

The mergeFunction passed as the third argument is used to resolve collisions between values associated with the same key. It receives the value already in the map and the new value for that key, and returns the value to keep. The examples later in this post show it in action.
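A merge function does not have to discard one of the values; it can also combine them. The sketch below groups words by their length and joins colliding words with a comma. The class MergeDemo, the method wordsByLength and the sample words are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class MergeDemo {

    // Maps a word length to the words of that length, joined with ", ".
    static Map<Integer, String> wordsByLength(List<String> words) {
        return words.stream().collect(Collectors.toMap(
                String::length,       // key: the length of the word
                Function.identity(),  // value: the word itself
                // merge: called only when two words have the same length
                (existing, incoming) -> existing + ", " + incoming));
    }

    public static void main(String[] args) {
        Map<Integer, String> result = wordsByLength(Arrays.asList("map", "fold", "set", "tree"));
        System.out.println(result.get(3)); // prints map, set
        System.out.println(result.get(4)); // prints fold, tree
    }
}
```

For a sequential stream the values are merged in encounter order, which is why "map" appears before "set".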

Using a custom map type

The toMap methods seen so far return a HashMap. But this is not part of the contract and is not mentioned in the javadoc. Hence we should treat it as an implementation detail and not rely on it. If we need the result to be a specific type of map, we can specify it by passing a map supplier.

The map supplier is modelled as a Supplier that returns a subtype of Map.

Method Signature:

public static <T, K, U, M extends Map<K, U>>
    Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
                             Function<? super T, ? extends U> valueMapper,
                             BinaryOperator<U> mergeFunction,
                             Supplier<M> mapSupplier) 
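As a quick sketch of the map supplier, the snippet below passes LinkedHashMap::new so that the resulting map keeps the encounter order of the stream. The class EncounterOrderDemo, its method name and the sample words are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class EncounterOrderDemo {

    // Maps each word to its length, keeping the words in the order they were encountered.
    static Map<String, Integer> lengthsInEncounterOrder(List<String> words) {
        return words.stream().collect(Collectors.toMap(
                Function.identity(),
                String::length,
                (existing, incoming) -> existing, // keep the first value on a duplicate key
                LinkedHashMap::new));             // the supplier decides the map type
    }

    public static void main(String[] args) {
        System.out.println(lengthsInEncounterOrder(Arrays.asList("pear", "apple", "fig")));
        // prints {pear=4, apple=5, fig=3}
    }
}
```

Note that this overload always requires a merge function, even when no duplicate keys are expected.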

Student example

We will use the simple POJO below for the Student and start with a List<Student>. Each student has an id (unique), a name and an age.

public class Student {
    private Long id;
    private String name;
    private Integer age;

    public Student(Long id, String name, Integer age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public Integer getAge() {
        return age;
    }
    
    // equals and hashCode left out for brevity (not required for the examples that follow)
    
    @Override
    public String toString() {
        return "[" + id + ", " + name + "]";
    }
}

Let us create a list of Students that we will use later.

List<Student> students = new ArrayList<>();
students.add(new Student(1L, "John", 25));
students.add(new Student(2L, "Evans", 33));
students.add(new Student(3L, "Chris", 19));
students.add(new Student(4L, "Jennifer", 25));
students.add(new Student(5L, "Mitch", 29));
students.add(new Student(6L, "Evans", 25));

Notice that there are two students with the same name (Evans).

Collectors toMap Examples

Using a key and a value mapper

Let us say we want to construct a map from each student id to the student's name. The traditional way to do this looks like:

Map<Long, String> idToNames = new HashMap<>();
for (Student student : students) {
    idToNames.put(student.getId(), student.getName());
}

Using the Java 8 stream API with the collect method and Collectors.toMap offers a more succinct way:

Map<Long, String> idToNames = students.stream()
        .collect(Collectors.toMap(Student::getId, Student::getName));
System.out.println(idToNames);

Outputs:

{1=John, 2=Evans, 3=Chris, 4=Jennifer, 5=Mitch, 6=Evans}

Using a merge function

Now, let us convert the Student stream to obtain a mapping from the student name to the Student object itself.

Map<String, Student> nameToStudent = students.stream()
            .collect(Collectors.toMap(Student::getName, Function.identity()));

This will throw an IllegalStateException (java.lang.IllegalStateException: Duplicate key [2, Evans]) when it tries to add the second Evans object (with id 6). This is because the result map already contains an entry for the key Evans.

Sidenote: The above code was run with JDK 8. The error message is not very clear, as it reports the existing value (the String representation of the Student object) rather than the duplicate key (Evans). This has been changed in JDK 9 to report the duplicate key and the two conflicting values. Refer to this Stack Overflow post for details.
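The duplicate-key failure can also be observed in isolation. The sketch below uses plain strings instead of Student objects and simply reports whether the collector threw; the class DuplicateKeyDemo and its method are assumptions made for this illustration. The exception message is deliberately not printed or checked, since its wording differs between JDK versions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class DuplicateKeyDemo {

    // Returns true if collecting the names into a map fails because of a duplicate key.
    static boolean failsOnDuplicateKey(List<String> names) {
        try {
            // No merge function is given, so a repeated name cannot be resolved.
            Map<String, Integer> ignored = names.stream()
                    .collect(Collectors.toMap(Function.identity(), String::length));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnDuplicateKey(Arrays.asList("John", "Evans", "Evans"))); // prints true
        System.out.println(failsOnDuplicateKey(Arrays.asList("John", "Evans", "Chris"))); // prints false
    }
}
```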

Now, let us fix this by resolving the conflict. For the sake of this post, we will pick one of the Student objects when there is more than one Student with the same name.

nameToStudent = students.stream()
            .collect(Collectors.toMap(Student::getName, Function.identity(),
                    (oldStudent, newStudent) -> oldStudent));
System.out.println(nameToStudent);     

Outputs:

{Mitch=[5, Mitch], Jennifer=[4, Jennifer], Chris=[3, Chris], Evans=[2, Evans], John=[1, John]}

We have picked the Student object that is already in the map when we encounter a second one with the same name. The above lambda, when expanded as an anonymous class, looks like this:

BinaryOperator<Student> resolver = new BinaryOperator<Student>() {
    @Override
    public Student apply(Student student1, Student student2) {
        return student1;
    }
};

Using a map supplier

To demonstrate the last overload of the toMap method, we will collect the result into a TreeMap. A TreeMap keeps its entries sorted by the keys’ natural order or by a comparator, if one is passed. We will use a plain TreeMap and, since the keys of the map are Strings, it will order them lexicographically.

nameToStudent = students.stream()
        .collect(Collectors.toMap(Student::getName, Function.identity(),
                (oldStudent, newStudent) -> oldStudent,
                TreeMap::new));
System.out.println(nameToStudent);

Outputs:

{Chris=[3, Chris], Evans=[2, Evans], Jennifer=[4, Jennifer], John=[1, John], Mitch=[5, Mitch]}

Compare this output with the previous output to notice that the keys (student names) are now sorted lexicographically.
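When a comparator is needed, the map supplier can be written as a lambda instead of a constructor reference. The following sketch sorts the keys in reverse lexicographic order; the class ReverseOrderDemo, its method and the use of name lengths as values are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class ReverseOrderDemo {

    // Maps each name to its length in a TreeMap whose keys are sorted in reverse order.
    static TreeMap<String, Integer> lengthsInReverseOrder(List<String> names) {
        return names.stream().collect(Collectors.toMap(
                Function.identity(),
                String::length,
                (existing, incoming) -> existing, // keep the first value on a duplicate key
                () -> new TreeMap<String, Integer>(Comparator.<String>reverseOrder())));
    }

    public static void main(String[] args) {
        System.out.println(lengthsInReverseOrder(Arrays.asList("Chris", "Evans", "John")));
        // prints {John=4, Evans=5, Chris=5}
    }
}
```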

To learn about TreeMaps, refer to the post TreeMap in Java with examples.

Conclusion

In this post we have seen Java 8 Collectors.toMap with examples. The toMap method of Collectors returns a Collector that we can use to reduce a stream of elements into a map. First, the basic toMap method takes a key mapper and a value mapper to generate the map. Second, when the key mapper maps more than one element to the same key, it results in a collision, which we can resolve by passing a merge function. Finally, we can use a map supplier to collect into any subtype of Map.