源碼剖析Golang中map擴容底層的實現

Posted on 2023-03-06 by WalkonNet

前言

之前的文章詳細介紹過Go切片和map的基本使用，以及切片的擴容機制。本文針對map的擴容，會從源碼的角度全面的剖析一下map擴容的底層實現。

map底層結構

主要包含兩個核心結構體hmap和bmap

數據會先存儲在正常桶hmap.buckets指向的bmap數組中，一個bmap隻能存儲8組鍵值對數據，超過則會將數據存儲到溢出桶hmap.extra.overflow指向的bmap數組中

那麼，當溢出桶也存儲不下瞭，會怎麼辦呢，數據得存儲到哪去呢？答案，肯定是擴容，那麼擴容怎麼實現的呢？接著往下看

擴容時機

在向 map 插入新 key 的時候，會進行條件檢測，符合下面這 2 個條件，就會觸發擴容

// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
  hashGrow(t, h)
  goto again // Growing the table invalidates everything, so try again
}

// growing reports whether h is growing. The growth may be to the same size or bigger.
func (h *hmap) growing() bool {
  return h.oldbuckets != nil
}

條件1：超過負載

map元素個數 > 6.5 * 桶個數

// overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor.
func overLoadFactor(count int, B uint8) bool {
  return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}

其中

bucketCnt = 8，一個桶可以裝的最大元素個數
loadFactor = 6.5，負載因子，平均每個桶的元素個數
bucketShift(B): 桶的個數

條件2：溢出桶太多

當桶總數 < 2 ^ 15 時，如果溢出桶總數 >= 桶總數，則認為溢出桶過多。

當桶總數 >= 2 ^ 15 時，直接與 2 ^ 15 比較，當溢出桶總數 >= 2 ^ 15 時，即認為溢出桶太多瞭。

// tooManyOverflowBuckets reports whether noverflow buckets is too many for a map with 1<<B buckets.
// Note that most of these overflow buckets must be in sparse use;
// if use was dense, then we'd have already triggered regular map growth.
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
  // If the threshold is too low, we do extraneous work.
  // If the threshold is too high, maps that grow and shrink can hold on to lots of unused memory.
  // "too many" means (approximately) as many overflow buckets as regular buckets.
  // See incrnoverflow for more details.
  if B > 15 {
    B = 15
  }
  // The compiler doesn't see here that B < 16; mask B to generate shorter shift code.
  return noverflow >= uint16(1)<<(B&15)
}

對於條件2，其實算是對條件1的補充。因為在負載因子比較小的情況下，有可能 map 的查找和插入效率也很低，而第 1 點識別不出來這種情況。

表面現象就是負載因子比較小，即 map 裡元素總數少，但是桶數量多（真實分配的桶數量多，包括大量的溢出桶）。比如不斷的增刪，這樣會造成overflow的bucket數量增多，但負載因子又不高，達不到第 1 點的臨界值，就不能觸發擴容來緩解這種情況。這樣會造成桶的使用率不高，值存儲得比較稀疏，查找插入效率會變得非常低，因此有瞭第 2 擴容條件。

擴容方式

雙倍擴容

針對條件1，新建一個buckets數組，新的buckets大小是原來的2倍，然後舊buckets數據搬遷到新的buckets，該方法我們稱之為雙倍擴容

等量擴容

針對條件2，並不擴大容量，buckets數量維持不變，重新做一遍類似雙倍擴容的搬遷動作，把松散的鍵值對重新排列一次，使得同一個 bucket 中的 key 排列地更緊密，節省空間，提高 bucket 利用率，進而保證更快的存取，該方法我們稱之為等量擴容

擴容函數

上面說的 hashGrow() 函數實際上並沒有真正地“搬遷”，它隻是分配好瞭新的 buckets，並將老的 buckets 掛到瞭 oldbuckets 字段上。

真正搬遷 buckets 的動作在 growWork() 函數中，而調用 growWork() 函數的動作是在 mapassign 和 mapdelete 函數中。也就是插入或修改、刪除 key 的時候，都會嘗試進行搬遷 buckets 的工作。先檢查 oldbuckets 是否搬遷完畢，具體來說就是檢查 oldbuckets 是否為 nil

func hashGrow(t *maptype, h *hmap) {
  // If we've hit the load factor, get bigger.
  // Otherwise, there are too many overflow buckets,
  // so keep the same number of buckets and "grow" laterally.
  bigger := uint8(1)
  if !overLoadFactor(h.count+1, h.B) {
    bigger = 0
    h.flags |= sameSizeGrow
  }
  oldbuckets := h.buckets
  newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)

  flags := h.flags &^ (iterator | oldIterator)
  if h.flags&iterator != 0 {
    flags |= oldIterator
  }
  // commit the grow (atomic wrt gc)
  h.B += bigger
  h.flags = flags
  h.oldbuckets = oldbuckets
  h.buckets = newbuckets
  h.nevacuate = 0
  h.noverflow = 0

  if h.extra != nil && h.extra.overflow != nil {
    // Promote current overflow buckets to the old generation.
    if h.extra.oldoverflow != nil {
      throw("oldoverflow is not nil")
    }
    h.extra.oldoverflow = h.extra.overflow
    h.extra.overflow = nil
  }
  if nextOverflow != nil {
    if h.extra == nil {
      h.extra = new(mapextra)
    }
    h.extra.nextOverflow = nextOverflow
  }

  // the actual copying of the hash table data is done incrementally
  // by growWork() and evacuate().
}

由於 map 擴容需要將原有的 key/value 重新搬遷到新的內存地址，如果map存儲瞭數以億計的key-value，一次性搬遷將會造成比較大的延時，因此 Go map 的擴容采取瞭一種稱為“漸進式”的方式，原有的 key 並不會一次性搬遷完畢，每次最多隻會搬遷 2 個 bucket。

func growWork(t *maptype, h *hmap, bucket uintptr) {
  // make sure we evacuate the oldbucket corresponding
  // to the bucket we're about to use
  evacuate(t, h, bucket&h.oldbucketmask())

  // evacuate one more oldbucket to make progress on growing
  if h.growing() {
    evacuate(t, h, h.nevacuate)
  }
}

總結

要想掌握Go map擴容的底層實現，必須先掌握map的底層結構設計。基於底層結構，再從底層實現的源碼，一步步分析。

到此這篇關於源碼剖析Golang中map擴容底層的實現的文章就介紹到這瞭,更多相關Golang map擴容內容請搜索WalkonNet以前的文章或繼續瀏覽下面的相關文章希望大傢以後多多支持WalkonNet！

源碼剖析Golang中map擴容底層的實現

目錄

前言

map底層結構

擴容時機

條件1：超過負載

條件2：溢出桶太多

擴容方式

雙倍擴容

等量擴容

擴容函數

總結

推薦閱讀：

發佈留言取消回覆

近期文章

目錄

前言

map底層結構

擴容時機

條件1：超過負載

條件2：溢出桶太多

擴容方式

雙倍擴容

等量擴容

擴容函數

總結

推薦閱讀：

發佈留言 取消回覆

近期文章

標籤

發佈留言取消回覆