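As the thread-safe counterpart of HashMap, ConcurrentHashMap is used very frequently. JDK8 almost completely rewrote the JDK7 implementation, so the two versions differ substantially. This article walks through both to see exactly what changed.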

ConcurrentHashMap (JDK7)

Storage structure

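In JDK7, ConcurrentHashMap is built from an array of Segments. Each Segment is a HashMap-like structure, so each one can resize its own table independently. The number of Segments, however, is fixed once the map is initialized. The default is 16 Segments, which means that by default at most 16 threads can write to the map concurrently (one per Segment).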

Initialization

The no-argument constructor:

/**
  * Creates a new, empty map with a default initial capacity (16),
  * load factor (0.75) and concurrencyLevel (16).
  */
public ConcurrentHashMap() {
  this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR, DEFAULT_CONCURRENCY_LEVEL);
}

It delegates to the parameterized constructor with three default values:

/**
  * The default initial capacity for this table,
  * used when not otherwise specified in a constructor.
  */
static final int DEFAULT_INITIAL_CAPACITY = 16;

/**
  * The default load factor for this table, used when not
  * otherwise specified in a constructor.
  */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
  * The default concurrency level for this table, used when not
  * otherwise specified in a constructor.
  */
static final int DEFAULT_CONCURRENCY_LEVEL = 16;

These three values are passed to the following constructor:

static final int MAX_SEGMENTS = 1 << 16; // slightly conservative
static final int MAXIMUM_CAPACITY = 1 << 30;
static final int MIN_SEGMENT_TABLE_CAPACITY = 2;

@SuppressWarnings("unchecked")
public ConcurrentHashMap(int initialCapacity,
                         float loadFactor, int concurrencyLevel) {
  if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
    throw new IllegalArgumentException();
  // Clamp the concurrency level to MAX_SEGMENTS
  if (concurrencyLevel > MAX_SEGMENTS)
    concurrencyLevel = MAX_SEGMENTS;
  // Find power-of-two sizes best matching arguments
  int sshift = 0;
  int ssize = 1;
  // ssize = smallest power of two >= concurrencyLevel
  while (ssize < concurrencyLevel) { 
    ++sshift;
    ssize <<= 1;
  }
  this.segmentShift = 32 - sshift; // shift used to extract the segment bits from a hash
  this.segmentMask = ssize - 1; // mask used to select a segment index
  if (initialCapacity > MAXIMUM_CAPACITY)
    initialCapacity = MAXIMUM_CAPACITY;
  // Per-segment table capacity (rounded up); with the defaults, 16 / 16 = 1
  int c = initialCapacity / ssize;
  if (c * ssize < initialCapacity)
    ++c;
  int cap = MIN_SEGMENT_TABLE_CAPACITY;
  while (cap < c)
    cap <<= 1;
  // create segments and segments[0]
  Segment<K,V> s0 =
    new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                     (HashEntry<K,V>[])new HashEntry[cap]);
  Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
  UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
  this.segments = ss;
}

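Summary of the initialization logic:

1. Validate the arguments.
2. Check the concurrency level concurrencyLevel; if it exceeds the maximum (1 << 16), clamp it to the maximum. The no-argument constructor uses 16.
3. Find the smallest power of two that is greater than or equal to concurrencyLevel and use it as ssize, the length of the Segment array (not the map capacity). The default is 16.
4. Record segmentShift = 32 - sshift, where sshift is the N in ssize = 2 ^ N. put() uses it to locate the Segment. The default is 32 - 4 = 28.
5. Record segmentMask = ssize - 1. The default is 16 - 1 = 15.
6. Initialize segments[0]. With the defaults its table length is 2, the load factor is 0.75, and the threshold is (int)(2 * 0.75) = 1, so the table is resized when the second entry is inserted. The other Segments are created lazily.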
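The arithmetic in these steps can be replayed outside the JDK. The following is a minimal sketch, not the JDK code itself: the class name `SegmentSizing` and the method `compute` are hypothetical, and only the sizing logic from the constructor shown earlier is reproduced.

```java
// Hypothetical helper that mirrors the sizing arithmetic of the JDK7 constructor.
final class SegmentSizing {
    static final int MAX_SEGMENTS = 1 << 16;
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final int MIN_SEGMENT_TABLE_CAPACITY = 2;

    /** Returns {ssize, segmentShift, segmentMask, cap, threshold}. */
    static int[] compute(int initialCapacity, float loadFactor, int concurrencyLevel) {
        if (concurrencyLevel > MAX_SEGMENTS)
            concurrencyLevel = MAX_SEGMENTS;
        int sshift = 0;
        int ssize = 1;
        // Smallest power of two >= concurrencyLevel
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        int segmentShift = 32 - sshift;
        int segmentMask = ssize - 1;
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        // Per-segment capacity, rounded up, then rounded up to a power of two
        int c = initialCapacity / ssize;
        if (c * ssize < initialCapacity)
            ++c;
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c)
            cap <<= 1;
        int threshold = (int) (cap * loadFactor);
        return new int[] {ssize, segmentShift, segmentMask, cap, threshold};
    }
}
```

Calling `SegmentSizing.compute(16, 0.75f, 16)` reproduces the default values listed above: 16 segments, a shift of 28, a mask of 15, a per-segment table of length 2, and a threshold of 1.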

put

/**
  * Maps the specified key to the specified value in this table.
  * Neither the key nor the value can be null.
  */
@SuppressWarnings("unchecked")
public V put(K key, V value) {
  Segment<K,V> s;
  if (value == null)
    throw new NullPointerException();
  int hash = hash(key.hashCode());
  // Unsigned-shift the hash right by segmentShift (28 by default), then AND it with segmentMask (15 by default)
  // => the top bits of the hash (4 bits by default) select the segment
  int j = (hash >>> segmentShift) & segmentMask;
  if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
       (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
    // The segment at this index does not exist yet: create it
    s = ensureSegment(j);
  return s.put(key, hash, value, false);
}

Initializing a Segment

/**
  * Returns the segment for the given index, creating it and
  * recording in segment table (via CAS) if not already present.
  */
@SuppressWarnings("unchecked")
private Segment<K,V> ensureSegment(int k) {
  final Segment<K,V>[] ss = this.segments;
  long u = (k << SSHIFT) + SBASE; // raw offset
  Segment<K,V> seg;
  // Check whether the Segment at offset u is null
  if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
    // Copy capacity and load factor from segments[0]
    Segment<K,V> proto = ss[0]; // use segment 0 as prototype
    int cap = proto.table.length;
    float lf = proto.loadFactor;
    int threshold = (int)(cap * lf);
    HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
    // Check offset u again, since another thread may have installed a Segment in the meantime
    if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
        == null) { // recheck
      Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
      // Spin: keep re-reading offset u until some thread has installed a Segment
      while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
             == null) {
        // Install via CAS; only one thread can succeed
        if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
          break;
      }
    }
  }
  return seg;
}

The put flow:

1. Compute the segment index for the key and read the Segment at that index.
2. If that Segment is null, initialize it.
3. Insert the key-value pair with Segment#put.
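To make the index computation concrete, here is a small sketch of how the high bits of a hash pick the segment while the low bits pick the bucket inside that segment's table. The class `SegmentIndex` and its method names are hypothetical, and the `hash` argument is assumed to be the already re-hashed value.

```java
// Hypothetical helper illustrating the two index computations used by put().
final class SegmentIndex {
    /** High bits of the hash choose the segment. */
    static int segmentFor(int hash, int segmentShift, int segmentMask) {
        return (hash >>> segmentShift) & segmentMask;
    }

    /** Low bits of the hash choose the bucket inside the segment's table. */
    static int bucketFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }
}
```

With the default shift of 28 and mask of 15, a hash of `0xA0000001` lands in segment 10, and in a table of length 2 it lands in bucket 1. Using different bits for the two lookups keeps entries in one segment from clustering in a single bucket.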

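The Segment initialization flow:

1. Check whether the Segment at the computed index is null.
2. If it is null, prepare to initialize it: use segments[0] as a prototype for the capacity, load factor, and threshold, and create the HashEntry array.
3. Check again whether the Segment at that index is null.
4. If it is still null, create the Segment with the new HashEntry array.
5. Spin while the slot is null, using CAS to install the new Segment at that index.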
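The same check, re-check, then CAS pattern can be written with public APIs instead of Unsafe. This is a sketch under that assumption: `LazySlots` and `ensure` are hypothetical names, and `AtomicReferenceArray` stands in for the raw-offset volatile reads and `compareAndSwapObject` used by the JDK.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Supplier;

// Hypothetical lazily populated slot array, modeled on ensureSegment().
final class LazySlots<T> {
    private final AtomicReferenceArray<T> slots;

    LazySlots(int size) {
        slots = new AtomicReferenceArray<>(size);
    }

    T ensure(int k, Supplier<T> factory) {
        T seg = slots.get(k);                 // volatile read
        if (seg == null) {
            T candidate = factory.get();      // build the value without holding a lock
            // Re-read until the slot is non-null; at most one CAS can succeed
            while ((seg = slots.get(k)) == null) {
                if (slots.compareAndSet(k, null, candidate)) {
                    seg = candidate;
                    break;
                }
            }
        }
        return seg;
    }
}
```

If two threads race on the same slot, both may build a candidate, but only the one whose CAS succeeds is installed, and both callers return that same instance.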

Next, the details of Segment#put():

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
  // Try to take the segment's exclusive ReentrantLock; on failure, fall back to scanAndLockForPut
  HashEntry<K,V> node = tryLock() ? null : scanAndLockForPut(key, hash, value);
  V oldValue;
  try {
    HashEntry<K,V>[] tab = table;
    // Compute the bucket index for the element
    int index = (tab.length - 1) & hash;
    // Volatile read of the head entry at index
    HashEntry<K,V> first = entryAt(tab, index);
    for (HashEntry<K,V> e = first;;) {
      // Walk the chain; if the key already exists, update its value
      if (e != null) {
        K k;
        if ((k = e.key) == key ||
            (e.hash == hash && key.equals(k))) {
          oldValue = e.value;
          if (!onlyIfAbsent) {
            e.value = value;
            ++modCount;
          }
          break;
        }
        e = e.next;
      }
      else {
        // Key not found: insert the new node at the head of the chain
        if (node != null)
          node.setNext(first);
        else
          node = new HashEntry<K,V>(hash, key, value, first);
        int c = count + 1;
        // Resize if the new count exceeds the threshold and the table is below the maximum capacity
        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
          rehash(node);
        else
          // Store node at index; node is now the head of the chain (possibly its only element)
          setEntryAt(tab, index, node);
        ++modCount;
        count = c;
        oldValue = null;
        break;
      }
    }
  } finally {
    unlock();
  }
  return oldValue;
}

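Segment extends ReentrantLock, so a Segment can acquire its own lock directly, and the put flow relies on this. The Segment#put flow in summary:

1. Call tryLock() to acquire the lock; if that fails, keep trying via scanAndLockForPut.
2. Compute the index where the entry belongs and read the HashEntry at that index.
3. Traverse the chain and put the new element.

Why traverse? The HashEntry read at that index may be null (an empty slot) or the head of an existing chain, and the two cases are handled differently.

If there is no HashEntry at that index:
1. If the new element count exceeds the threshold and the table is below the maximum capacity, resize (rehash also inserts the new node).
2. Otherwise, insert the node directly at the head.
If a HashEntry exists at that index:
1. Check whether the current node's key and hash match the key and hash being put. If they match, replace the old value.
2. If they do not match, move to the next node, until a match is found and its value replaced, or the chain is exhausted without a match.
  1. If the new element count exceeds the threshold and the table is below the maximum capacity, resize.
  2. Otherwise, insert at the head of the chain.
If the key was already present, the old value is returned after the replacement; otherwise null is returned.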

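scanAndLockForPut(): spins on tryLock() to acquire the lock, and once the number of retries exceeds MAX_SCAN_RETRIES it calls lock() and blocks until the lock is acquired. While spinning, it also walks the chain for the given hash and speculatively creates the new node if the key is not found.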

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
  HashEntry<K,V> first = entryForHash(this, hash);
  HashEntry<K,V> e = first;
  HashEntry<K,V> node = null;
  int retries = -1; // negative while locating node
  // Spin while trying to acquire the lock
  while (!tryLock()) {
    HashEntry<K,V> f; // to recheck first below
    if (retries < 0) {
      if (e == null) {
        if (node == null) // speculatively create node
          node = new HashEntry<K,V>(hash, key, value, null);
        retries = 0;
      }
      else if (key.equals(e.key))
        retries = 0;
      else
        e = e.next;
    }
    else if (++retries > MAX_SCAN_RETRIES) {
      // Too many retries: block until the lock is acquired
      lock();
      break;
    }
    else if ((retries & 1) == 0 &&
             (f = entryForHash(this, hash)) != first) {
      e = first = f; // re-traverse if entry changed
      retries = -1;
    }
  }
  return node;
}
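Stripped of the chain scanning, the locking strategy above is "spin a bounded number of times, then block". A minimal sketch of just that part follows; `SpinThenBlock`, `acquire`, and the `maxSpins` parameter are hypothetical names, not JDK APIs.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper showing bounded spinning followed by a blocking lock().
final class SpinThenBlock {
    /** Acquires the lock and returns how many tryLock() attempts failed first. */
    static int acquire(ReentrantLock lock, int maxSpins) {
        int retries = 0;
        while (!lock.tryLock()) {
            if (++retries > maxSpins) {
                lock.lock();   // stop spinning and block until the lock is free
                break;
            }
        }
        return retries;
    }
}
```

The caller owns the lock when `acquire` returns and must release it in a `finally` block, just as Segment#put does. Spinning is cheap when the lock is held briefly, while the fallback to `lock()` avoids burning CPU under heavy contention.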

rehash

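A rehash doubles the capacity of the Segment's table. When entries move from the old array to the new one, each entry either keeps its index or moves to index + oldCapacity. The node passed as the argument is head-inserted into its bucket after the resize.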

private void rehash(HashEntry<K,V> node) {
  /*
    * Reclassify nodes in each list to new table.  Because we
    * are using power-of-two expansion, the elements from
    * each bin must either stay at same index, or move with a
    * power of two offset. We eliminate unnecessary node
    * creation by catching cases where old nodes can be
    * reused because their next fields won't change.
    * Statistically, at the default threshold, only about
    * one-sixth of them need cloning when a table
    * doubles. The nodes they replace will be garbage
    * collectable as soon as they are no longer referenced by
    * any reader thread that may be in the midst of
    * concurrently traversing table. Entry accesses use plain
    * array indexing because they are followed by volatile
    * table write.
    */
  HashEntry<K,V>[] oldTable = table;
  int oldCapacity = oldTable.length;
  int newCapacity = oldCapacity << 1;
  threshold = (int)(newCapacity * loadFactor);
  HashEntry<K,V>[] newTable =
    (HashEntry<K,V>[]) new HashEntry[newCapacity];
  // New mask: with the default capacity 2, the doubled capacity is 4, so the mask is 3 (binary 11)
  int sizeMask = newCapacity - 1;
  for (int i = 0; i < oldCapacity ; i++) {
    HashEntry<K,V> e = oldTable[i];
    if (e != null) {
      HashEntry<K,V> next = e.next;
      // Compute the new index; it is either unchanged or old index + old capacity
      int idx = e.hash & sizeMask;
      if (next == null)   //  Single node on list
        // The bucket holds a single node, not a chain: move it directly
        newTable[idx] = e;
      else { // Reuse consecutive sequence at same slot
        // The bucket holds a chain
        HashEntry<K,V> lastRun = e;
        int lastIdx = idx;
        // After this loop, lastRun and every node after it map to the same new index
        for (HashEntry<K,V> last = next;
             last != null;
             last = last.next) {
          int k = last.hash & sizeMask;
          if (k != lastIdx) {
            lastIdx = k;
            lastRun = last;
          }
        }
        // lastRun and its successors share one new index, so reuse that tail as-is in the new table
        newTable[lastIdx] = lastRun;
        // Clone remaining nodes
        for (HashEntry<K,V> p = e; p != lastRun; p = p.next) {
          // Clone each remaining node and head-insert it at its new index k
          V v = p.value;
          int h = p.hash;
          int k = h & sizeMask;
          HashEntry<K,V> n = newTable[k];
          newTable[k] = new HashEntry<K,V>(h, p.key, v, n);
        }
      }
    }
  }
  // Head-insert the new node
  int nodeIndex = node.hash & sizeMask; // add the new node
  node.setNext(newTable[nodeIndex]);
  newTable[nodeIndex] = node;
  table = newTable;
}

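In the code above, the first inner for loop finds lastRun: the node from which every subsequent node maps to the same new index, so that tail can be moved to the new table as one chain without cloning. The second for loop clones each remaining node (those before lastRun) and head-inserts it into the chain at its new index.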
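The claim that an entry either stays put or moves by exactly the old capacity follows from the mask gaining one extra bit. A short sketch that checks this property directly (the class `RehashIndex` and its methods are hypothetical, and the capacity is assumed to be a power of two):

```java
// Hypothetical helper that checks the "stay or move by oldCapacity" property.
final class RehashIndex {
    /** Index of the hash in a table twice the size of oldCapacity. */
    static int newIndex(int hash, int oldCapacity) {
        int sizeMask = (oldCapacity << 1) - 1;
        return hash & sizeMask;
    }

    /** True if the new index is the old index or the old index plus oldCapacity. */
    static boolean staysOrMoves(int hash, int oldCapacity) {
        int oldIdx = hash & (oldCapacity - 1);
        int newIdx = newIndex(hash, oldCapacity);
        return newIdx == oldIdx || newIdx == oldIdx + oldCapacity;
    }
}
```

For example, with an old capacity of 4, hash 5 sits at index 1 before the resize and at index 5 (1 + 4) after it, while hash 1 stays at index 1. Which case applies depends only on the single hash bit that the new mask exposes.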

get

The logic of get:

1. Compute where the key is stored.
2. Traverse the chain at that position, looking for an equal key and returning its value.

/**
  * Returns the value to which the specified key is mapped,
  * or {@code null} if this map contains no mapping for the key.
  */
public V get(Object key) {
  Segment<K,V> s; // manually integrate access methods to reduce overhead
  HashEntry<K,V>[] tab;
  int h = hash(key.hashCode());
  long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
  // Locate the segment and table that hold the key
  if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
      (tab = s.table) != null) {
    for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
         (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
         e != null; e = e.next) {
      // Walk the chain looking for an equal key
      K k;
      if ((k = e.key) == key || (e.hash == h && key.equals(k)))
        return e.value;
    }
  }
  return null;
}

ConcurrentHashMap (JDK8)

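Storage structure

Compared with JDK7, the JDK8 ConcurrentHashMap changed considerably. The underlying structure is no longer a Segment array + HashEntry arrays + linked lists, but a Node array + linked lists / red-black trees. (A linked list is converted to a red-black tree only after it reaches a certain length.)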

Initialization `initTable`

/**
  * Table initialization and resizing control.  When negative, the
  * table is being initialized or resized: -1 for initialization,
  * else -(1 + the number of active resizing threads).  Otherwise,
  * when table is null, holds the initial table size to use upon
  * creation, or 0 for default. After initialization, holds the
  * next element count value upon which to resize the table.
  */
private transient volatile int sizeCtl;
/**
 * Initializes table, using the size recorded in sizeCtl.
 */
private final Node<K,V>[] initTable() {
  Node<K,V>[] tab; int sc;
  while ((tab = table) == null || tab.length == 0) {
    // sizeCtl < 0: another thread won the CAS and is initializing the table
    if ((sc = sizeCtl) < 0)
      // Yield the CPU and retry
      Thread.yield(); // lost initialization race; just spin
    else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
      try {
        if ((tab = table) == null || tab.length == 0) {
          int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
          @SuppressWarnings("unchecked")
          Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
          table = tab = nt;
          sc = n - (n >>> 2);
        }
      } finally {
        sizeCtl = sc;
      }
      break;
    }
  }
  return tab;
}

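The source above shows that ConcurrentHashMap initialization is completed by spinning plus a CAS. The key variable is sizeCtl, whose value encodes the current state:

-1 means the table is being initialized
-N (any other negative value) means a resize is in progress; according to the field's Javadoc, N - 1 threads are actively resizing
0 means the table has not been created yet and the default capacity will be used
>0 means the initial table size if the table has not been created yet, or the next resize threshold once it has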
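The "one thread wins the CAS, everyone else yields" protocol can be sketched with an `AtomicInteger` in place of the Unsafe field access. The class `LazyTable` and the accessor `threshold()` are hypothetical names, and resizing is left out entirely.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical one-shot table initializer modeled on initTable().
final class LazyTable {
    private static final int DEFAULT_CAPACITY = 16;
    private final AtomicInteger sizeCtl;   // initial size, -1 while initializing, then threshold
    private volatile Object[] table;

    LazyTable(int initialCapacity) {
        sizeCtl = new AtomicInteger(initialCapacity);
    }

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            int sc = sizeCtl.get();
            if (sc < 0) {
                Thread.yield();                      // another thread is initializing
            } else if (sizeCtl.compareAndSet(sc, -1)) {
                try {
                    if ((tab = table) == null) {     // re-check after winning the CAS
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        table = tab = new Object[n];
                        sc = n - (n >>> 2);          // 0.75 * n, the next resize threshold
                    }
                } finally {
                    sizeCtl.set(sc);
                }
                break;
            }
        }
        return tab;
    }

    int threshold() {
        return sizeCtl.get();
    }
}
```

After `new LazyTable(0).initTable()` the table has length 16 and the recorded threshold is 12, matching the default capacity and the 0.75 load factor; later calls return the same array.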

put

/**
  * Maps the specified key to the specified value in this table.
  * Neither the key nor the value can be null.
  *
  * <p>The value can be retrieved by calling the {@code get} method
  * with a key that is equal to the original key.
  */
public V put(K key, V value) {
  return putVal(key, value, false);
}
final V putVal(K key, V value, boolean onlyIfAbsent) {
  // Neither key nor value may be null
  if (key == null || value == null) throw new NullPointerException();
  int hash = spread(key.hashCode());
  int binCount = 0;
  for (Node<K,V>[] tab = table;;) {
    // f: head node of the target bin; fh: hash of that head node
    Node<K,V> f; int n, i, fh;
    if (tab == null || (n = tab.length) == 0)
      // The Node array does not exist yet: initialize it
      tab = initTable();
    else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
      // The bin is empty: insert with CAS, no lock needed; exit the loop on success
      if (casTabAt(tab, i, null,
                   new Node<K,V>(hash, key, value, null)))
        break;                   // no lock when adding to empty bin
    }
    else if ((fh = f.hash) == MOVED)
      tab = helpTransfer(tab, f);
    else {
      V oldVal = null;
      // Lock the head node of the bin, then insert or update
      synchronized (f) {
        if (tabAt(tab, i) == f) {
          // fh >= 0: the bin is a linked list
          if (fh >= 0) {
            binCount = 1;
            // Walk the list: overwrite a matching node or append a new one at the tail
            for (Node<K,V> e = f;; ++binCount) {
              K ek;
              if (e.hash == hash &&
                  ((ek = e.key) == key ||
                   (ek != null && key.equals(ek)))) {
                oldVal = e.val;
                if (!onlyIfAbsent)
                  e.val = value;
                break;
              }
              Node<K,V> pred = e;
              if ((e = e.next) == null) {
                pred.next = new Node<K,V>(hash, key,
                                          value, null);
                break;
              }
            }
          }
          // The bin is a red-black tree (its head is a TreeBin)
          else if (f instanceof TreeBin) {
            Node<K,V> p;
            binCount = 2;
            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                  value)) != null) {
              oldVal = p.val;
              if (!onlyIfAbsent)
                p.val = value;
            }
          }
        }
      }
      if (binCount != 0) {
        if (binCount >= TREEIFY_THRESHOLD)
          treeifyBin(tab, i);
        if (oldVal != null)
          return oldVal;
        break;
      }
    }
  }
  addCount(1L, binCount);
  return null;
}

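Summary of the put() flow:

1. Compute the hash from the key.
2. Check whether the table needs to be initialized.
3. Locate the bin for the key. If it is empty, the slot can be written directly: try a CAS, and if the CAS fails, the loop retries until the insert succeeds.
4. If the hash of the bin's head node is MOVED (-1), a resize is in progress, and the current thread helps with the transfer via helpTransfer.
5. If none of the above applies, write the data while holding a synchronized lock on the head node.
6. If binCount reaches TREEIFY_THRESHOLD, call treeifyBin. treeifyBin converts the list to a red-black tree only when the table length is at least 64.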
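Steps 3 and 5 are the core of the JDK8 design: a lock-free CAS for an empty bin and a per-bin `synchronized` block otherwise. The sketch below shows only those two steps on a fixed-size table. `MiniBins` is a hypothetical class, and it deliberately omits resizing, treeification, and removal.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical fixed-size map showing CAS-on-empty-bin plus lock-on-head insertion.
final class MiniBins<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(int hash, K key, V val) { this.hash = hash; this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node<K, V>> tab = new AtomicReferenceArray<>(16);

    private static int spread(int h) {
        return (h ^ (h >>> 16)) & 0x7fffffff;
    }

    V put(K key, V value) {
        int hash = spread(key.hashCode());
        int i = (tab.length() - 1) & hash;
        for (;;) {
            Node<K, V> f = tab.get(i);
            if (f == null) {
                // Empty bin: lock-free insert; on a lost race, loop and re-read the bin
                if (tab.compareAndSet(i, null, new Node<>(hash, key, value)))
                    return null;
            } else {
                synchronized (f) {                 // lock only this bin's head node
                    if (tab.get(i) == f) {         // head unchanged since it was read
                        for (Node<K, V> e = f;; e = e.next) {
                            if (e.hash == hash && key.equals(e.key)) {
                                V old = e.val;
                                e.val = value;
                                return old;
                            }
                            if (e.next == null) {  // end of list: append at the tail
                                e.next = new Node<>(hash, key, value);
                                return null;
                            }
                        }
                    }
                }
            }
        }
    }

    V get(K key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = tab.get((tab.length() - 1) & hash); e != null; e = e.next)
            if (e.hash == hash && key.equals(e.key))
                return e.val;
        return null;
    }
}
```

Because the lock is taken on the head node of a single bin, writers to different bins never contend, which is a much finer granularity than the one-lock-per-Segment scheme of JDK7.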

get

/**
 * Returns the value to which the specified key is mapped,
 * or {@code null} if this map contains no mapping for the key.
 *
 * <p>More formally, if this map contains a mapping from a key
 * {@code k} to a value {@code v} such that {@code key.equals(k)},
 * then this method returns {@code v}; otherwise it returns
 * {@code null}.  (There can be at most one such mapping.)
 */
public V get(Object key) {
  Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
  // Compute the spread hash of the key
  int h = spread(key.hashCode());
  if ((tab = table) != null && (n = tab.length) > 0 &&
      (e = tabAt(tab, (n - 1) & h)) != null) {
    // The bin is non-empty; check whether the head node's hash matches
    if ((eh = e.hash) == h) {
      if ((ek = e.key) == key || (ek != null && key.equals(ek)))
        // Same hash and equal key: return the head node's value
        return e.val;
    }
    else if (eh < 0)
      // Negative head hash: a resize is in progress or the bin is a red-black tree; delegate to find()
      return (p = e.find(h, key)) != null ? p.val : null;
    while ((e = e.next) != null) {
      // The bin is a linked list: traverse it
      if (e.hash == h &&
          ((ek = e.key) == key || (ek != null && key.equals(ek))))
        return e.val;
    }
  }
  return null;
}

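Overview of the get flow:

1. Compute the position from the hash.
2. Look up that position; if the head node is the one being searched for, return its value directly.
3. If the head node's hash is negative, a resize is in progress or the bin is a red-black tree, so search it with find().
4. If the bin is a linked list, traverse it to find the element.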
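A negative head hash can serve as a marker because the hash of an ordinary key is forced to be non-negative before it is stored. The sketch below reproduces the JDK8 spreading function and the marker constants; the wrapper class `HashSpread` is a hypothetical name.

```java
// Hypothetical wrapper around the JDK8 hash-spreading arithmetic.
final class HashSpread {
    static final int HASH_BITS = 0x7fffffff; // usable bits of a normal node hash
    static final int MOVED = -1;             // hash of a forwarding node during a resize
    static final int TREEBIN = -2;           // hash of the root holder of a tree bin

    /** XORs the high 16 bits into the low 16 bits and clears the sign bit. */
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }
}
```

For instance, `HashSpread.spread(-1)` is `0x7FFF0000`, which is non-negative, so a key can never be mistaken for a MOVED or TREEBIN marker. Folding the high bits into the low bits also helps small tables, which index with the low bits only.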

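Summary

The JDK7 ConcurrentHashMap uses segment locking: only one thread at a time can modify a given Segment. Each Segment is a HashMap-like structure with its own array, which can be resized, and hash collisions are chained into linked lists. The length of the Segment array, however, cannot change after initialization, so the write concurrency is fixed.

The JDK8 ConcurrentHashMap uses synchronized locks plus CAS. The structure also evolved from JDK7's Segment array + HashEntry arrays + linked lists into a Node array + linked lists / red-black trees, where a Node is similar to a HashEntry. Once collisions in a bin reach a certain level, the bin is converted to a red-black tree, and when they drop below a certain level it reverts to a linked list.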
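From the caller's side, all of this machinery sits behind the ordinary map API. As a usage sketch (assuming Java 8 or later; `WordCountDemo` and `countConcurrently` are hypothetical names), several threads can update one key safely with `merge`, which ConcurrentHashMap performs atomically:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo: several threads increment one counter through merge().
final class WordCountDemo {
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++)
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers)
                w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.get("hits");
    }
}
```

A separate `get` followed by `put` would not be safe here, because another thread could update the key between the two calls; `merge` keeps the whole update inside the per-bin lock.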

youyichannel

A guided read of the ConcurrentHashMap source code
