Deduplicating a List with Java 8 Streams

Posted on 2022-11-12 by WalkonNet

Deduplicating a List with the Java 8 Stream API

To deduplicate whole elements, use distinct(). To deduplicate by a property, collect into a TreeSet ordered by that property:

ArrayList<LabelInfoDTO> collect = labelInfoDTOS.stream()
        .collect(Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(LabelInfoDTO::getLabelCode))),
                ArrayList::new));

Deduplicating a List by a field with Java 8 streams

Problem

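List deduplication comes up often in projects. The usual tool is the distinct() method of the Java 8 Stream API: list.stream().distinct().

This works directly when the list is a List<String> or List<Integer>, that is, when the elements are simple value types such as String or the primitive wrappers, which already implement equals and hashCode.

It also works for a List<Xxx>, where Xxx is a custom class that overrides equals and hashCode according to the business rules, for example treating two objects as equal when their ids match.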
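The two cases above can be sketched as follows. This is a minimal illustration, not code from the original post: the User class and the DistinctBasics helper are hypothetical names introduced only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical class for illustration: equality is decided by id only.
class User {
    final Long id;
    final String name;

    User(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return Objects.equals(id, ((User) o).id);
    }

    // Kept consistent with equals: equal users produce the same hash code.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

class DistinctBasics {
    // distinct() on a simple value type
    static List<String> distinctStrings(List<String> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    // distinct() on a custom type that overrides equals and hashCode
    static List<User> distinctUsers(List<User> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }
}
```

For an ordered stream such as one created from a List, distinct() keeps the first occurrence of each duplicate, so the surviving elements stay in their original relative order.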

Sometimes, however, you need to deduplicate by one particular field of the object.

For example:

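import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@NoArgsConstructor
@AllArgsConstructor
@Data
class Book {

    // Factory method: parses createTime ("yyyy-MM-dd HH:mm:ss") in the system default time zone.
    public static Book of(Long id, String name, String createTime) {
        LocalDateTime parsed = LocalDateTime.parse(createTime, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
        return new Book(id, name, Date.from(parsed.atZone(ZoneId.systemDefault()).toInstant()));
    }

    private Long id;

    private String name;

    private Date createTime;
}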

Now we want to deduplicate by the name field. Suppose the list is:

List<Book> books = new ArrayList<>();
books.add(Book.of(1L, "Thinking in Java", "2021-06-29 17:13:14"));
books.add(Book.of(2L, "Hibernate in action", "2021-06-29 18:13:14"));
books.add(Book.of(3L, "Thinking in Java", "2021-06-29 19:13:14"));

Approaches

1. Override equals and hashCode in Book so that equality is decided by name, then deduplicate with the stream's distinct() method.

Code:

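class Book {
    ...

    @Override
    public String toString() {
        return String.format("(%s,%s,%s)", id, name, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").format(createTime.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime()));
    }

    // Two books are equal when their names are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Book book = (Book) o;
        return Objects.equals(name, book.name);
    }

    // Must stay consistent with equals: distinct() is typically backed by a hash-based set,
    // so equal books have to produce the same hash code.
    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}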
 
List<Book> distinctNameBooks1 = books.stream().distinct().collect(Collectors.toList());
System.out.println(distinctNameBooks1);

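Summary:

With equals and hashCode overridden to compare what the use case requires, the stream's distinct() method can be used directly, which is convenient.

However, the class is sometimes inconvenient or impossible to modify, for example when it already has an established implementation or comes from a third-party package. In those cases this approach cannot flexibly deduplicate by an arbitrary field.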

2. Use Collectors.collectingAndThen together with Collectors.toCollection, collecting into a TreeSet whose constructor receives a comparator on the chosen field.

Code:

List<Book> distinctNameBooks2 = books.stream()
        .collect(Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Book::getName))),
                ArrayList::new));
System.out.println(distinctNameBooks2);

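Summary:

This uses collectors provided by the Stream API, and the code is concise. The drawback is that although duplicates are removed, the order of the list changes: the result comes back sorted by the comparator key (here, name) rather than in the original order, and some scenarios need the original order preserved.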
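The reordering can be seen in a small sketch. It assumes a simplified stand-in for Book (plain Java, no Lombok, no createTime) and a hypothetical TreeSetOrderDemo class, neither of which appears in the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class TreeSetOrderDemo {
    // Simplified stand-in for the article's Book, for illustration only.
    static class Book {
        final long id;
        final String name;

        Book(long id, String name) {
            this.id = id;
            this.name = name;
        }

        long getId() { return id; }
        String getName() { return name; }
    }

    // Approach 2: collect into a TreeSet ordered by name, then copy into a list.
    static List<Book> distinctByNameSorted(List<Book> books) {
        return books.stream().collect(Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<Book>(Comparator.comparing(Book::getName))),
                ArrayList::new));
    }

    // Ids of the surviving books, in result order.
    // The input order is 1, 2, 3; the result is ordered by name, giving [2, 1].
    static List<Long> idsAfterDedup() {
        List<Book> books = Arrays.asList(
                new Book(1L, "Thinking in Java"),
                new Book(2L, "Hibernate in action"),
                new Book(3L, "Thinking in Java"));
        return distinctByNameSorted(books).stream()
                .map(Book::getId)
                .collect(Collectors.toList());
    }
}
```

Book 3 is dropped because the TreeSet already holds an element that compares equal by name, and book 2 moves to the front because "Hibernate in action" sorts before "Thinking in Java".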

3. Deduplicate with the stream's filter method: define a helper method that takes a Function and returns a Predicate.

Code:

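// Returns a stateful predicate that accepts an element only the first time its key is seen.
// The backing HashMap is not thread-safe, so use this with sequential streams only.
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Map<Object, Boolean> seen = new HashMap<>();
    // putIfAbsent returns null only when the key was not already present
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}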
 
List<Book> distinctNameBooks3 = books.stream().filter(distinctByKey(o -> o.getName())).collect(Collectors.toList());
System.out.println(distinctNameBooks3);

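Summary:

Wrapping the logic in a helper method and combining it with filter allows flexible deduplication by any field while preserving the original list order. The drawbacks are that the helper holds a HashMap internally, which costs some memory, and that it requires an extra method definition.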
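If the same helper might be used with a parallel stream, a plain HashMap is not safe to share between threads. One possible variant, shown here as a sketch rather than as part of the original approach, swaps in a concurrent set. The names ConcurrentDistinct, distinctByKeyConcurrent and onePerLength are hypothetical, and this version assumes the extracted keys are never null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class ConcurrentDistinct {
    // Same idea as distinctByKey, but backed by a concurrent set so the predicate
    // can be shared by the threads of a parallel stream. Keys must be non-null.
    static <T> Predicate<T> distinctByKeyConcurrent(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        // Set.add returns true only the first time a key is inserted
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Example use: keep one word per distinct length.
    static List<String> onePerLength(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .filter(distinctByKeyConcurrent(String::length))
                .collect(Collectors.toList());
    }
}
```

On a sequential stream the first element for each key is the one kept. On a parallel stream exactly one element per key still survives, but which of the duplicates it is may vary from run to run.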

4. Deduplicate with the stream's filter method without a helper method, creating the HashMap outside the stream.

Code:

Map<Object, Boolean> map = new HashMap<>();
List<Book> distinctNameBooks4 = books.stream().filter(i -> map.putIfAbsent(i.getName(), Boolean.TRUE) == null).collect(Collectors.toList());
System.out.println(distinctNameBooks4);

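Summary:

This still relies on filter for deduplication. No separate method is created; a temporary HashMap is defined instead. The original list order is preserved, at the cost of some memory.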

PS: I have not yet found a natively supported stream operation that deduplicates by a field while preserving the original list order.
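One candidate that uses only standard collectors is Collectors.toMap with a LinkedHashMap as the map supplier. The sketch below is an assumption-laden illustration, not a built-in distinct-by-key operation: ToMapDistinct and distinctBy are hypothetical names, and the list elements are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDistinct {
    // Deduplicate by key, keeping the first occurrence of each key in encounter order.
    static <T, K> List<T> distinctBy(List<T> items, Function<? super T, ? extends K> keyExtractor) {
        Map<K, T> firstByKey = items.stream().collect(Collectors.toMap(
                keyExtractor,              // map key: the field to deduplicate by
                Function.identity(),       // map value: the element itself
                (first, second) -> first,  // on a duplicate key, keep the earlier element
                LinkedHashMap::new));      // LinkedHashMap iterates in insertion order
        return new ArrayList<>(firstByKey.values());
    }
}
```

With the article's data, a call such as distinctBy(books, Book::getName) would be expected to keep books 1 and 2 in their original order. Like the filter-based approaches, it holds one map entry per distinct key in memory.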

The above is based on personal experience. I hope it serves as a useful reference, and I hope you will continue to support WalkonNet.
