Elements in a HashMap Play Hide-and-Seek

2012-12-24


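You have clearly put a non-null key-value pair into a HashMap, yet at some point a get with that same key returns null, and the next get works again. Everyone knows this comes from HashMap not being thread-safe; the specific cause is analyzed below, starting with the get method: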

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        // indexFor returns the key's index in the table array. Each slot of
        // table holds a linked list; walk the list to find the key's value.
        for (Entry<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

Now look at the put method:

    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        // If this key has not been put before, addEntry is called
        addEntry(hash, key, value, i);
        return null;
    }

Next, the implementation of addEntry:

    void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K, V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K, V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }

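It contains an if block: when the number of elements in the map (more precisely, the number of elements minus one, because size is compared before it is incremented) is greater than or equal to the product of the capacity and the load factor, resize is executed. On to the resize method: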
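The arithmetic behind that check can be sketched on its own. This is a minimal illustration, not JDK code: the class name ThresholdDemo and both helper methods are made up for this example, and it assumes the threshold is simply the capacity times the load factor truncated to an int, with every put adding a new key.

```java
class ThresholdDemo {
    // Assumed formula: capacity times load factor, truncated to int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Returns the 1-based number of the put (of distinct keys) that first
    // satisfies the check `size++ >= threshold` used in addEntry.
    static int firstResizingPut(int capacity, float loadFactor) {
        int threshold = threshold(capacity, loadFactor);
        int size = 0;
        int puts = 0;
        while (true) {
            puts++;
            if (size++ >= threshold) {
                return puts;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("threshold(4, 0.5f) = " + threshold(4, 0.5f));
        System.out.println("first resizing put = " + firstResizingPut(4, 0.5f));
    }
}
```

With a capacity of 4 and a load factor of 0.5 the threshold is 2, so the third put of a new key is the first one to trigger resize.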

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int) (newCapacity * loadFactor);
    }

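resize allocates a new Entry array with twice the old capacity. The entries of the old array then have to be redistributed into the new array according to their hash values, which is the job of the transfer method: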

    void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K, V> e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Entry<K, V> next = e.next;
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }

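This method assigns the old array to src and iterates over it. Whenever a slot of src is non-null, that slot is set to null, which means the slot in the old array itself is cleared. This is the statement in question: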

    if (e != null) {
        src[j] = null;

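If a get call looks up this key at that moment, it still reads the old array, because resize assigns table = newTable only after transfer has returned. The slot it inspects has already been cleared, so it cannot find the value for the key.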
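The window can be replayed deterministically in a single thread. The sketch below is a simplified model, not the JDK source: Node stands in for HashMap.Entry, the supplemental hash() step is omitted (the raw hashCode is used), and the class and method names are invented for illustration. It performs the lookup at three points: before the transfer, right after the old slot is cleared, and against the new array once the transfer is done.

```java
class TransferWindowDemo {
    // Minimal stand-in for HashMap.Entry (hypothetical, not the JDK class).
    static final class Node {
        final String key;
        final String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Lookup against whichever array the reader currently sees as the table.
    static String get(Node[] table, String key) {
        for (Node e = table[indexFor(key.hashCode(), table.length)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    // Returns what a reader sees before, during, and after the transfer.
    static String[] run() {
        String key = "name1";
        Node[] oldTable = new Node[4];
        int i = indexFor(key.hashCode(), oldTable.length);
        oldTable[i] = new Node(key, "value1", null);
        String before = get(oldTable, key);

        // Transfer begins: the old slot is cleared while the old array
        // is still the one visible to readers.
        Node[] newTable = new Node[8];
        Node e = oldTable[i];
        oldTable[i] = null;
        String during = get(oldTable, key);

        // Transfer finishes: the chain is relinked into the new array.
        do {
            Node next = e.next;
            int j = indexFor(e.key.hashCode(), newTable.length);
            e.next = newTable[j];
            newTable[j] = e;
            e = next;
        } while (e != null);
        String after = get(newTable, key);

        return new String[] { before, during, after };
    }

    public static void main(String[] args) {
        for (String seen : run()) {
            System.out.println(seen);
        }
    }
}
```

The middle lookup returns null even though the entry was never removed from the map, which matches the value that disappears and then comes back.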

Now let's reproduce the scenario:

    import java.util.HashMap;
    import java.util.Map;

    public class TestHashMap {
        public static void main(String[] args) {
            final Map<String, String> map = new HashMap<String, String>(4, 0.5f);
            new Thread() {
                public void run() {
                    while (true) {
                        System.out.println(map.get("name1"));
                        try {
                            Thread.sleep(1000);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    }
                }
            }.start();
            for (int i = 0; i < 3; i++) {
                map.put("name" + i, "value" + i);
            }
        }
    }

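Debug the program above with a breakpoint on map.put and step into the put method. When i=2, a resize takes place. Pause for a moment at the point in transfer where the slot is set to null, and the value printed by the thread becomes null.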

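Summary: in concurrent programs HashMap produces many subtle problems whose cause is hard to see from the surface. So if a HashMap behaves in a way that defies intuition, concurrency is a likely cause.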
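One common way to avoid the problem is to use java.util.concurrent.ConcurrentHashMap in place of HashMap. The sketch below is only an illustration under stated assumptions: the class SafeMapDemo, the method countMissingReads, and the "extra" key names are made up for this example. A reader thread keeps looking up a key that was inserted before the thread started, while the main thread adds enough new keys to force the map to grow several times.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class SafeMapDemo {
    // Counts how many lookups of "name1" return null while another
    // thread keeps inserting new keys into the same map.
    static int countMissingReads(int extraKeys) {
        final Map<String, String> map = new ConcurrentHashMap<String, String>(4, 0.5f);
        map.put("name1", "value1");

        final AtomicBoolean done = new AtomicBoolean(false);
        final AtomicInteger misses = new AtomicInteger();

        Thread reader = new Thread(() -> {
            do {
                if (map.get("name1") == null) {
                    misses.incrementAndGet();
                }
            } while (!done.get());
        });
        reader.start();

        // Writer: the small initial capacity forces the table to grow repeatedly.
        for (int i = 0; i < extraKeys; i++) {
            map.put("extra" + i, "value" + i);
        }
        done.set(true);

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return misses.get();
    }

    public static void main(String[] args) {
        System.out.println("null reads: " + countMissingReads(100000));
    }
}
```

Because ConcurrentHashMap retrievals reflect the most recently completed updates, the lookup of an already inserted key does not come back null while the table grows. Collections.synchronizedMap(new HashMap<...>()) would also be safe here, at the cost of locking the whole map on every call.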

    public static void main(String[] args) {
        final Map<String, String> map = new HashMap<String, String>(4, 0.5f);
        Thread thread = new Thread() {
            @Override
            public void run() {
                while (true) {
                    System.out.println(map.get("name1"));
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        };
        thread.setDaemon(true);
        thread.start();
        for (int i = 0; i < 3; i++) {
            map.put("name" + i, "value" + i);
            System.out.println("put");
        }
        try {
            Thread.sleep(1000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

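I actually debugged this and saw no problem.

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

However, different objects can have the same hashCode, so each slot of the HashMap table holds a linked list. When two different keys have the same hashCode, both are placed in the same list. On lookup, the key's hashCode locates the list (whose elements are Entry<K,V> objects), and the list is traversed, calling equals on each key until the element is found (keys that are logically different must compare unequal under equals).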

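If everything sat in one linked list that you traversed directly, lookups would be very slow once the list grew large. In general, though, different objects have different hashCode values, so hashCode and indexFor() lead straight to the element's index and the element can be fetched directly. If hashCode values do happen to coincide, only a relatively short list has to be traversed.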

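So:
1. When you need to store and retrieve a large number of elements, a collection such as HashMap is naturally efficient.
2. When you define a class and need to override hashCode and equals, take care with both methods.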
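The point about equal hash codes can be checked with two well-known strings, "Aa" and "BB", which have the same hashCode. The sketch below is illustrative only: BucketDemo and its helper methods are invented names, and the slot index is computed from the raw hashCode, skipping the supplemental hash() step that the real put and get apply first.

```java
import java.util.HashMap;
import java.util.Map;

class BucketDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if both keys map to the same slot of a table with the given length.
    static boolean sameBucket(Object a, Object b, int length) {
        return indexFor(a.hashCode(), length) == indexFor(b.hashCode(), length);
    }

    // Stores both colliding keys and reads them back; equals tells them apart.
    static String lookup(String key) {
        Map<String, String> map = new HashMap<String, String>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());
        System.out.println(sameBucket("Aa", "BB", 16));
        System.out.println(lookup("Aa") + ", " + lookup("BB"));
    }
}
```

Both keys share a slot, yet each lookup still returns its own value, because the traversal compares keys with equals after the hash codes match.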

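Once hashCode and equals are defined properly, the load factor becomes an important factor. The larger the load factor, the more likely collisions are, but the better the table array is utilized; the smaller the load factor, the less likely collisions are, but much of the table array is wasted. A trade-off between time and space is needed.

#17 sebatinsky 2011-06-17
I have never looked into this, haha. I'll read it once I'm done with QQ.

#18 jv520jv 2011-06-17
kingkan wrote:
    HashMap is not thread-safe.
    Try ConcurrentHashMap.

The commenter above is right. Operating on a non-thread-safe container from multiple threads is clearly wrong. It is better to use ConcurrentHashMap, or Collections.synchronizedMap(map).

#19 angel243fly 2011-06-18
jv520jv wrote:
    kingkan wrote:
        HashMap is not thread-safe.
        Try ConcurrentHashMap.

    The commenter above is right. Operating on a non-thread-safe container from multiple threads is clearly wrong. It is better to use ConcurrentHashMap, or Collections.synchronizedMap(map).

Collections.synchronizedMap(map) is the easier one to use.

#20 tianzizhi 2011-06-18
ConcurrentHashMap
Collections.synchronizedMap(map).

These two are not in the same league:
the first locks only part of the map,
the second locks the whole map,
and the difference in efficiency is large.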
