云栈社区»论坛 › 技术文档「 Note & Doc 」 › Java集合框架核心接口与实现类实战指南：从数据结构到应用场景 ...

发回帖发新帖

983 积分	0 好友	139 主题

发消息

Java集合框架核心接口与实现类实战指南：从数据结构到应用场景

发表于昨天 21:50 | 查看: 1| 回复: 0

集合框架是Java中用于存储和操作对象组的核心工具，它提供了一套标准化的接口和类，用于处理各种数据结构需求。

1. 集合的基本概念

什么是集合？

集合是一个用于存放对象的容器。Java集合框架定义在java.util包中，包含了一系列代表和操作集合的接口与类。

List<String> itemList = new ArrayList<>();
itemList.add("Element1");
itemList.add("Element2");
itemList.add("Element3");

为什么需要集合？

动态大小：与固定长度的数组不同，集合可以根据需要动态扩展或收缩。
类型安全：通过泛型确保在编译期进行类型检查。
丰富操作：内置了添加、删除、遍历、查找、排序等多种便捷方法。
算法支持：框架本身提供了常用的算法，如排序和随机打乱。

集合框架的层次结构

Java集合框架主要由两个根接口Collection和Map构成。

Collection接口
├── List接口（有序、可重复）
│   ├── ArrayList（基于动态数组）
│   ├── LinkedList（基于双向链表）
│   └── Vector（线程安全的动态数组，已较少使用）
├── Set接口（无序、不可重复）
│   ├── HashSet（基于哈希表）
│   ├── TreeSet（基于红黑树，有序）
│   └── LinkedHashSet（基于哈希表与链表，保持插入顺序）
└── Queue接口（队列）
    ├── PriorityQueue（优先级队列）
    └── ArrayDeque（双端队列，也可用作栈）

Map接口（键值对映射）
├── HashMap（基于哈希表）
├── TreeMap（基于红黑树，按键排序）
├── LinkedHashMap（保持键的插入顺序或访问顺序）
└── ConcurrentHashMap（线程安全的哈希表）

2. 泛型与类型安全

泛型是什么？

泛型允许在定义类、接口或方法时使用类型参数，从而提高代码的类型安全性和重用性。

// 不使用泛型：需要强制转换，存在运行时风险
List rawList = new ArrayList();
rawList.add("String");
rawList.add(123); // 可以放入任意类型
String str = (String) rawList.get(1); // 运行时报错：ClassCastException

// 使用泛型：编译时检查，安全
List<String> safeList = new ArrayList<>();
safeList.add("String");
// safeList.add(123); // 编译错误，无法通过
String safeStr = safeList.get(0); // 无需强制转换

泛型的优点：

类型安全：将运行时错误转移至编译时发现。
代码简洁：消除了繁琐的类型强制转换。
代码重用：可以编写适用于多种类型的通用代码。

泛型的使用

// 自定义泛型类
class Container<T> {
    private T item;

    public void set(T item) {
        this.item = item;
    }

    public T get() {
        return item;
    }
}

// 使用
Container<String> stringContainer = new Container<>();
stringContainer.set("Hello");
String value = stringContainer.get(); // 类型为String

Container<Integer> intContainer = new Container<>();
intContainer.set(100);
int num = intContainer.get(); // 类型为Integer

3. 迭代器（Iterator）

迭代器是什么？

迭代器（Iterator）提供了一种统一的方式来遍历集合中的所有元素，而不需要了解底层集合的具体实现。

List<String> list = new ArrayList<>();
list.add("A");
list.add("B");
list.add("C");

// 使用迭代器遍历
Iterator<String> iterator = list.iterator();
while (iterator.hasNext()) {
    String element = iterator.next();
    System.out.println(element);
}

// 使用for-each循环（语法糖，底层调用迭代器）
for (String element : list) {
    System.out.println(element);
}

迭代器的特点：

统一访问：所有实现了Iterable接口的集合都可以用相同方式遍历。
安全删除：可以使用iterator.remove()在遍历时安全地删除当前元素，避免并发修改异常。
单向移动：标准迭代器通常只支持从前向后移动。

迭代器的使用场景

List<String> items = new ArrayList<>();
items.add("Apple");
items.add("Banana");
items.add("Apple");
items.add("Cherry");

// 安全地删除所有“Apple”
Iterator<String> it = items.iterator();
while (it.hasNext()) {
    if ("Apple".equals(it.next())) {
        it.remove(); // 使用迭代器的remove方法
    }
}
System.out.println(items); // 输出：[Banana, Cherry]

常见错误：

List<String> list = new ArrayList<>();
list.add("one");
list.add("two");

// 错误！在for-each循环中直接修改集合结构
for (String s : list) {
    if ("one".equals(s)) {
        list.remove(s); // 抛出ConcurrentModificationException异常
    }
}

4. 链表（LinkedList）

链表是什么？

LinkedList是一个基于双向链表实现的List。每个节点（Node）除了存储数据外，还存储指向前驱和后继节点的引用。

LinkedList<String> linkedList = new LinkedList<>();

// 添加元素
linkedList.add("Middle"); // 添加到末尾
linkedList.addFirst("First"); // 添加到头部
linkedList.addLast("Last");  // 添加到尾部
linkedList.add(1, "Inserted"); // 插入到指定索引

// 访问元素
String first = linkedList.getFirst();
String last = linkedList.getLast();

// 删除元素
linkedList.removeFirst();
linkedList.removeLast();

链表的优缺点：

优点：

插入删除高效：在已知位置（通过迭代器获得）进行插入或删除操作，时间复杂度为O(1)，因为只需修改相邻节点的指针。
无容量限制：理论上可以无限添加元素。
内存非连续：节点分散在内存中，不需要大块连续内存空间。

缺点：

随机访问慢：访问第n个元素需要从头部或尾部开始遍历，时间复杂度为O(n)。
内存开销大：每个节点都需要额外的空间存储前后节点的引用。

链表 vs 数组列表

理解不同数据结构的时间复杂度对性能优化至关重要。

// 测试在列表头部插入元素的性能差异
final int COUNT = 10000;
List<Integer> arrayList = new ArrayList<>(COUNT); // 预分配容量
List<Integer> linkedList = new LinkedList<>();

// LinkedList在头部插入
long start = System.nanoTime();
for (int i = 0; i < COUNT; i++) {
    linkedList.add(0, i); // O(1)操作
}
long linkedListTime = System.nanoTime() - start;

// ArrayList在头部插入
start = System.nanoTime();
for (int i = 0; i < COUNT; i++) {
    arrayList.add(0, i); // 每次插入都需要移动后面所有元素，O(n)操作
}
long arrayListTime = System.nanoTime() - start;

System.out.println("LinkedList 头部插入耗时：" + linkedListTime);
System.out.println("ArrayList 头部插入耗时：" + arrayListTime);

5. 堆栈（Stack）

堆栈是什么？

堆栈（Stack）是一种遵循后进先出（LIFO, Last In First Out）原则的数据结构。主要操作是push（压栈）和pop（弹栈）。

// Java中的旧Stack类（继承自Vector，线程安全但性能一般，不推荐）
Stack<String> oldStack = new Stack<>();
oldStack.push("A");
oldStack.push("B");
String top = oldStack.peek(); // 查看栈顶元素"B"，不移除
String popped = oldStack.pop(); // 弹出并移除栈顶元素"B"

// 推荐使用Deque接口的实现来模拟栈
Deque<String> stack = new ArrayDeque<>();
stack.push("First"); // 入栈
stack.push("Second");
String item = stack.pop(); // 出栈，返回"Second"
String peekItem = stack.peek(); // 查看栈顶，返回"First"

堆栈的特点：

LIFO：最后一个放入的元素最先被取出。
受限访问：只能访问或操作栈顶元素。
核心操作：push(E), pop(), peek()。

堆栈的应用场景

// 1. 括号匹配验证
public boolean isValid(String s) {
    Deque<Character> stack = new ArrayDeque<>();
    for (char c : s.toCharArray()) {
        if (c == '(' || c == '[' || c == '{') {
            stack.push(c);
        } else {
            if (stack.isEmpty()) return false;
            char top = stack.pop();
            if ((c == ')' && top != '(') ||
                (c == ']' && top != '[') ||
                (c == '}' && top != '{')) {
                return false;
            }
        }
    }
    return stack.isEmpty();
}

// 2. 模拟撤销(Undo)功能
Deque<String> stateHistory = new ArrayDeque<>();
stateHistory.push("Initial State");
stateHistory.push("State after edit 1");
stateHistory.push("State after edit 2");

// 用户执行撤销
String currentState = stateHistory.pop(); // 回到"State after edit 1"

6. 队列（Queue）

队列是什么？

队列（Queue）是一种遵循先进先出（FIFO, First In First Out）原则的数据结构。

// 基本队列操作（以LinkedList实现为例）
Queue<String> queue = new LinkedList<>();
queue.offer("Alice"); // 入队，优于add()（后者在失败时抛异常）
queue.offer("Bob");
queue.offer("Charlie");

String head = queue.peek(); // 获取队首元素"Alice"，不移除
String served = queue.poll(); // 出队并返回队首元素"Alice"

// 双端队列（Deque）
Deque<String> deque = new ArrayDeque<>();
deque.offerFirst("Front"); // 从队首入队
deque.offerLast("Rear");   // 从队尾入队
String first = deque.pollFirst(); // 从队首出队
String last = deque.pollLast();   // 从队尾出队

队列的种类：

普通队列：Queue，FIFO。
双端队列：Deque，两端都可进行入队出队操作。
优先级队列：PriorityQueue，元素按优先级顺序出队。
阻塞队列：BlockingQueue（在java.util.concurrent包中），当队列空/满时，操作会阻塞或等待。

优先级队列（PriorityQueue）

// 按任务优先级处理
class Task {
    String name;
    int priority; // 数值越小，优先级越高

    public Task(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }
}

// 创建优先级队列，并指定比较器
PriorityQueue<Task> taskQueue = new PriorityQueue<>(
        Comparator.comparingInt(t -> t.priority));

taskQueue.offer(new Task("常规日志", 3));
taskQueue.offer(new Task("紧急错误", 1));
taskQueue.offer(new Task("一般警告", 2));

// 按优先级顺序出队处理
while (!taskQueue.isEmpty()) {
    Task task = taskQueue.poll();
    System.out.println("处理任务: " + task.name + " [优先级:" + task.priority + "]");
}
// 输出顺序：紧急错误 -> 一般警告 -> 常规日志

7. 映射（Map）

映射是什么？

Map用于存储键值对（Key-Value Pair）映射关系，提供了通过键快速查找值的功能。

// HashMap是最常用的Map实现
Map<String, String> countryCapital = new HashMap<>();

// 添加键值对
countryCapital.put("China", "Beijing");
countryCapital.put("Japan", "Tokyo");
countryCapital.put("USA", "Washington D.C.");

// 获取值
String capital = countryCapital.get("China"); // "Beijing"

// 检查键是否存在
boolean exists = countryCapital.containsKey("Japan");

// 遍历Map
// 1. 遍历键值对Entry（推荐）
for (Map.Entry<String, String> entry : countryCapital.entrySet()) {
    System.out.println(entry.getKey() + " -> " + entry.getValue());
}
// 2. 遍历键
for (String key : countryCapital.keySet()) {
    System.out.println("Key: " + key);
}
// 3. 遍历值
for (String value : countryCapital.values()) {
    System.out.println("Value: " + value);
}

常用Map实现类对比：

特性	HashMap	TreeMap	LinkedHashMap	ConcurrentHashMap
顺序	不保证顺序	按键的自然顺序或比较器顺序排序	保持键的插入顺序或访问顺序	不保证顺序
线程安全	否	否	否	是
允许null键/值	是/是	否（取决于比较器）/是	是/是	否/否
底层结构	数组+链表/红黑树	红黑树	数组+链表+双向链表	分段数组+链表/红黑树
get/put平均时间复杂度	O(1)	O(log n)	O(1)	O(1)

HashMap的工作原理简析

HashMap的核心在于哈希函数和冲突解决。

哈希函数：通过key.hashCode()计算哈希码，再经过扰动函数处理，最终映射到内部数组的一个下标。
哈希冲突：不同的键可能映射到同一数组下标。
链表法：每个数组位置（桶）是一个链表（或红黑树）的头节点，哈希冲突的元素被放入同一个链表中。
红黑树优化：当链表长度超过阈值（默认为8）且数组容量大于64时，链表会转换为红黑树，以提升极端情况下的查询效率（O(n) -> O(log n)）。
扩容：当元素数量超过容量 * 负载因子时，数组会扩容（通常翻倍），并重新计算所有元素的位置（rehash）。

TreeMap（有序映射）

TreeMap基于红黑树实现，能够保证键处于有序状态。

// 按键的自然顺序（String为字典序）排序
Map<String, Integer> scoreMap = new TreeMap<>();
scoreMap.put("Bob", 85);
scoreMap.put("Alice", 92);
scoreMap.put("Charlie", 78);

for (String name : scoreMap.keySet()) {
    System.out.println(name + ": " + scoreMap.get(name));
}
// 输出顺序：Alice -> Bob -> Charlie

// 使用自定义比较器（按分数降序）
Map<String, Integer> sortedByValue = new TreeMap<>((a, b) -> {
    int scoreCompare = scoreMap.get(b).compareTo(scoreMap.get(a));
    return scoreCompare != 0 ? scoreCompare : a.compareTo(b); // 分数相同按名字排序
});
sortedByValue.putAll(scoreMap);

LinkedHashMap（保持插入顺序）

LinkedHashMap是HashMap的子类，在哈希表的基础上维护了一个贯穿所有条目的双向链表，从而可以预测的迭代顺序（插入顺序或访问顺序）。

// 保持插入顺序
Map<String, String> insertOrderMap = new LinkedHashMap<>();
insertOrderMap.put("Key3", "Value3");
insertOrderMap.put("Key1", "Value1");
insertOrderMap.put("Key2", "Value2");

for (Map.Entry<String, String> entry : insertOrderMap.entrySet()) {
    System.out.println(entry.getKey()); // 输出顺序：Key3, Key1, Key2
}

// 创建按访问顺序排序的Map（可用于实现简单的LRU缓存）
Map<String, String> accessOrderMap = new LinkedHashMap<>(16, 0.75f, true);
accessOrderMap.put("A", "1");
accessOrderMap.put("B", "2");
accessOrderMap.put("C", "3");
accessOrderMap.get("A"); // 访问A，会将其移到链表末尾
// 此时迭代顺序为：B, C, A

8. 集合的最佳实践

选择正确的集合

根据具体场景选择合适的集合类，是编写高效、清晰代码的关键。

public class CollectionSelection {
    // 场景1：需要频繁按索引随机访问
    public List<String> getFastRandomAccessList() {
        return new ArrayList<>(); // 底层数组，O(1)访问
    }

    // 场景2：需要频繁在任意位置插入或删除
    public List<String> getFrequentInsertDeleteList() {
        return new LinkedList<>(); // 底层链表，已知位置时O(1)插入删除
    }

    // 场景3：需要存储不重复的元素
    public Set<String> getUniqueElementSet() {
        return new HashSet<>(); // O(1)的查找和添加
    }

    // 场景4：需要元素自动排序
    public Set<String> getAutoSortedSet() {
        return new TreeSet<>();
    }

    // 场景5：需要高效的键值对查找
    public Map<String, Object> getKeyValueStore() {
        return new HashMap<>(); // 通用首选
    }

    // 场景6：多线程环境下使用Map
    public Map<String, Object> getThreadSafeMap() {
        return new ConcurrentHashMap<>(); // 高并发首选
    }
}

性能考虑

了解不同集合在不同操作下的性能差异，有助于避免性能瓶颈。

public void comparePerformance() {
    final int SIZE = 100000;

    List<Integer> arrayList = new ArrayList<>(SIZE); // 预分配容量
    List<Integer> linkedList = new LinkedList<>();

    // 测试尾部追加：两者都很快（ArrayList可能需要偶尔扩容）
    long start = System.currentTimeMillis();
    for (int i = 0; i < SIZE; i++) arrayList.add(i);
    System.out.println("ArrayList尾部追加耗时: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < SIZE; i++) linkedList.add(i);
    System.out.println("LinkedList尾部追加耗时: " + (System.currentTimeMillis() - start));

    // 测试随机访问：ArrayList绝对优势
    start = System.currentTimeMillis();
    for (int i = 0; i < SIZE; i++) arrayList.get(i);
    System.out.println("ArrayList随机访问耗时: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < SIZE; i++) linkedList.get(i); // LinkedList需要遍历！
    System.out.println("LinkedList随机访问耗时: " + (System.currentTimeMillis() - start));
}

线程安全考虑

标准集合实现（如ArrayList, HashMap）不是线程安全的。在多线程环境下直接使用会导致数据不一致等问题。

// 不安全的做法
Map<String, Integer> unsafeMap = new HashMap<>();

// 两个线程同时修改
Runnable unsafeTask = () -> {
    for (int i = 0; i < 10000; i++) {
        unsafeMap.put("key", unsafeMap.getOrDefault("key", 0) + 1);
    }
};
// 运行后，unsafeMap.get("key")的结果很可能小于20000

// 解决方案1：使用并发集合（推荐）
Map<String, Integer> concurrentMap = new ConcurrentHashMap<>();

// 解决方案2：使用Collections的工具方法包装（性能开销较大，适用于低并发或遗留代码）
Map<String, Integer> synchronizedMap = Collections.synchronizedMap(new HashMap<>());
List<String> synchronizedList = Collections.synchronizedList(new ArrayList<>());

总结

Java集合框架是日常开发中最常用的工具库之一。核心接口List、Set、Queue、Map涵盖了绝大多数数据存储和处理场景。

核心选择指南：

ArrayList：默认的List选择，适用于大多数“读多写少”或“随机访问频繁”的场景。
LinkedList：适用于频繁在列表中间进行插入和删除操作的场景。
HashSet：默认的Set选择，用于快速去重和成员检查。
TreeSet：当需要有序且不重复的元素集合时使用。
HashMap：默认的Map选择，用于高效的键值对存取。
LinkedHashMap：当需要保持键的插入顺序或实现LRU缓存时使用。
ConcurrentHashMap：在多线程环境下需要高性能键值对存储时的首选。

掌握不同集合的特性和适用场景，能够帮助开发者编写出更高效、更健壮的代码。务必根据实际的访问模式（读/写/随机访问/顺序访问）、数据特性（是否需要排序、去重）和并发环境来做出最合适的选择。

上一篇：AIDC储能电源技术演进：从48V到800V高压直流架构的技术路线与市场机遇
下一篇：C++类型推导：auto与decltype(auto)核心差异与应用场景解析

Java, HashMap, ArrayList, LinkedList, ConcurrentMap