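Legal notice: the article series "Linux 3.4.10 Kernel Memory Management Source Code Analysis" is published by Chen Jinfei (ancjf@163.com) at http://blog.csdn.net/ancjf under the GPL. Reposting is welcome; when reposting, please credit the author and include this notice.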
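Managing slab blocks
=========================
Every piece of memory handed out by the slab allocator comes from a block that the allocator requested from the buddy system, split into small chunks, and then handed out chunk by chunk. A slab block is divided into chunks of equal length. The start address of chunk 0 is stored in the s_mem member of struct slab, and every chunk has an index that is unique within its slab block.

A block obtained from the buddy system is described by a struct slab. The control data of a slab block is not held entirely in struct slab, because extra data is needed to record which chunks are free, and slab blocks in different slab caches may contain different numbers of chunks. Free chunks are managed as a singly linked list of indexes: each chunk has an index, and each index has one entry in the list that stores the index of the next free chunk. Allocation takes the entry at the head of the list, and freeing pushes the released entry onto the head. The free list is always stored immediately after the struct slab structure.

The figure below shows the control data of a slab block with 6 chunks, 3 of them free: the first cell holds the struct slab structure, and the cells after it hold the slab's free-index list.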
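The function that returns the address of a slab block's free-index list is slab_bufctl, implemented in mm/slab.c:

    static inline kmem_bufctl_t *slab_bufctl(struct slab *slabp)
    {
        return (kmem_bufctl_t *) (slabp + 1);
    }

Because slabp has type struct slab *, the expression slabp + 1 points to the first byte after the struct slab structure, which is where the index list begins.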
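The layout behind slab_bufctl can be reproduced in user space. The following is a minimal sketch, not kernel code: toy_slab, toy_bufctl_t, toy_bufctl and toy_slab_alloc are hypothetical names, the structure is a reduced stand-in for struct slab, and calloc stands in for wherever the kernel places the control data.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int toy_bufctl_t;  /* stand-in for kmem_bufctl_t */

/* Hypothetical, reduced stand-in for struct slab. */
struct toy_slab {
    void *s_mem;          /* address of chunk 0 */
    unsigned int inuse;   /* chunks currently handed out */
    toy_bufctl_t free;    /* index of the first free chunk */
};

/* Same idea as slab_bufctl(): the index array starts right after the header. */
static toy_bufctl_t *toy_bufctl(struct toy_slab *slabp)
{
    return (toy_bufctl_t *)(slabp + 1);
}

/* Allocate the header and an index array for num chunks as one piece. */
static struct toy_slab *toy_slab_alloc(unsigned int num)
{
    assert(num > 0);
    return calloc(1, sizeof(struct toy_slab) + num * sizeof(toy_bufctl_t));
}
```

For a slab of 6 chunks, toy_slab_alloc(6) reserves room for the header plus 6 index entries, and toy_bufctl(s)[i] addresses entry i without any pointer being stored in the header.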
The index_to_obj and obj_to_index functions
-------------------------
index_to_obj converts a chunk index to the chunk's virtual address, and obj_to_index converts a chunk's virtual address back to its index. Both are implemented in mm/slab.c:

    static inline void *index_to_obj(struct kmem_cache *cache, struct slab *slab,
                                     unsigned int idx)
    {
        return slab->s_mem + cache->buffer_size * idx;
    }

    /*
     * We want to avoid an expensive divide : (offset / cache->buffer_size)
     *   Using the fact that buffer_size is a constant for a particular cache,
     *   we can replace (offset / cache->buffer_size) by
     *   reciprocal_divide(offset, cache->reciprocal_buffer_size)
     */
    static inline unsigned int obj_to_index(const struct kmem_cache *cache,
                                            const struct slab *slab, void *obj)
    {
        u32 offset = (obj - slab->s_mem);
        return reciprocal_divide(offset, cache->reciprocal_buffer_size);
    }

Converting between index and address requires the chunk length, which a slab cache stores in the buffer_size member of struct kmem_cache. index_to_obj is straightforward, so only obj_to_index is discussed below.

reciprocal_buffer_size is computed in kmem_cache_init in mm/slab.c:

    cache_cache.reciprocal_buffer_size =
        reciprocal_value(cache_cache.buffer_size);

reciprocal_value is in lib/reciprocal_div.c:

    u32 reciprocal_value(u32 k)
    {
        u64 val = (1LL << 32) + (k - 1);
        do_div(val, k);
        return (u32)val;
    }

reciprocal_divide is in include/linux/reciprocal_div.h:

    static inline u32 reciprocal_divide(u32 A, u32 R)
    {
        return (u32)(((u64)A * R) >> 32);
    }

Putting these together, obj_to_index computes

    (((2^32 + (buffer_size - 1)) / buffer_size) * offset) / 2^32

where every division is an integer (truncating) division. To keep the derivation readable, write k for buffer_size and R for (2^32 + (k - 1)) / k, which is 2^32 / k rounded up. The value computed is then (R * offset) / 2^32.

Upper bound:

    (R * offset) / 2^32 <= (((2^32 + (k - 1)) * offset) / k) / 2^32
                        == (((2^32 + (k - 1)) * offset) / 2^32) / k
                        == ((2^32 * offset + (k - 1) * offset) / 2^32) / k
                        == offset / k

The last step uses the condition (k - 1) * offset < 2^32, so that the term (k - 1) * offset disappears under the integer division by 2^32.

Lower bound: offset / k <= (R * offset) / 2^32 holds provided that

    offset * 2^32 <= R * offset * k

that is, provided that 2^32 <= R * k. Since R is 2^32 / k rounded up, this condition is always satisfied.

The two bounds together give

    offset / buffer_size == (((2^32 + (buffer_size - 1)) / buffer_size) * offset) / 2^32

As the comment on obj_to_index says, the result is offset / buffer_size. The function is written this way only to avoid a division, because divide instructions are slow.
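The identity can also be checked numerically. The sketch below is a user-space approximation, not the kernel code: the toy_ functions are hypothetical copies of reciprocal_value and reciprocal_divide, with plain 64-bit division in place of do_div, and toy_check_indices is an illustrative helper that compares the reciprocal form with ordinary division for the start offset of every chunk.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 / k rounded up. Assumes k >= 2 so the result fits in 32 bits. */
static uint32_t toy_reciprocal_value(uint32_t k)
{
    assert(k >= 2);
    uint64_t val = (1ULL << 32) + (k - 1);
    return (uint32_t)(val / k);
}

/* Multiply by the reciprocal and keep the upper 32 bits of the product. */
static uint32_t toy_reciprocal_divide(uint32_t a, uint32_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}

/*
 * Returns 1 if, for every chunk index below num, the reciprocal form
 * maps the chunk's start offset back to that index, and 0 otherwise.
 */
static int toy_check_indices(uint32_t size, uint32_t num)
{
    uint32_t r = toy_reciprocal_value(size);

    for (uint32_t idx = 0; idx < num; idx++) {
        uint32_t offset = idx * size;   /* what obj - s_mem would be */

        if (toy_reciprocal_divide(offset, r) != offset / size)
            return 0;
        if (toy_reciprocal_divide(offset, r) != idx)
            return 0;
    }
    return 1;
}
```

Calling toy_check_indices(192, 21), for example, walks 21 chunks of 192 bytes each. The sizes are arbitrary illustration values, not taken from any real cache.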
The slab_get_obj and slab_put_obj functions
-------------------------
The function that allocates an object from a slab block is slab_get_obj, implemented in mm/slab.c:

    2866 static void *slab_get_obj(struct kmem_cache *cachep, struct slab *slabp,
    2867                           int nodeid)
    2868 {
    2869     void *objp = index_to_obj(cachep, slabp, slabp->free);
    2870     kmem_bufctl_t next;
    2871
    2872     slabp->inuse++;
    2873     next = slab_bufctl(slabp)[slabp->free];
    2874 #if DEBUG
    2875     slab_bufctl(slabp)[slabp->free] = BUFCTL_FREE;
    2876     WARN_ON(slabp->nodeid != nodeid);
    2877 #endif
    2878     slabp->free = next;
    2879
    2880     return objp;
    2881 }

Line 2869 obtains a pointer to the chunk at the head of the free-index list. Line 2872 increments the count of objects that have been handed out. Line 2873 reads the next free index, and line 2878 stores that index as the new list head.
The function that returns a chunk to its slab block is slab_put_obj, implemented in mm/slab.c:

    2883 static void slab_put_obj(struct kmem_cache *cachep, struct slab *slabp,
    2884                          void *objp, int nodeid)
    2885 {
    2886     unsigned int objnr = obj_to_index(cachep, slabp, objp);
    2887
    2888 #if DEBUG
    2889     /* Verify that the slab belongs to the intended node */
    2890     WARN_ON(slabp->nodeid != nodeid);
    2891
    2892     if (slab_bufctl(slabp)[objnr] + 1 <= SLAB_LIMIT + 1) {
    2893         printk(KERN_ERR "slab: double free detected in cache "
    2894                "'%s', objp %p\n", cachep->name, objp);
    2895         BUG();
    2896     }
    2897 #endif
    2898     slab_bufctl(slabp)[objnr] = slabp->free;
    2899     slabp->free = objnr;
    2900     slabp->inuse--;
    2901 }

Line 2886 computes the chunk's index. Lines 2898-2899 make that index the new list head and make its entry point to the previous head. Line 2900 decrements the count of objects in use.
The cache_init_objs function
-------------------------
The function that initializes a slab block's free-index list is cache_init_objs, implemented in mm/slab.c:

    static void cache_init_objs(struct kmem_cache *cachep,
                                struct slab *slabp)
    {
        int i;

        for (i = 0; i < cachep->num; i++) {
            void *objp = index_to_obj(cachep, slabp, i);
    #if DEBUG
            /* need to poison the objs? */
            if (cachep->flags & SLAB_POISON)
                poison_obj(cachep, objp, POISON_FREE);
            if (cachep->flags & SLAB_STORE_USER)
                *dbg_userword(cachep, objp) = NULL;

            if (cachep->flags & SLAB_RED_ZONE) {
                *dbg_redzone1(cachep, objp) = RED_INACTIVE;
                *dbg_redzone2(cachep, objp) = RED_INACTIVE;
            }
            /*
             * Constructors are not allowed to allocate memory from the same
             * cache which they are a constructor for.  Otherwise, deadlock.
             * They must also be threaded.
             */
            if (cachep->ctor && !(cachep->flags & SLAB_POISON))
                cachep->ctor(objp + obj_offset(cachep));

            if (cachep->flags & SLAB_RED_ZONE) {
                if (*dbg_redzone2(cachep, objp) != RED_INACTIVE)
                    slab_error(cachep, "constructor overwrote the"
                               " end of an object");
                if (*dbg_redzone1(cachep, objp) != RED_INACTIVE)
                    slab_error(cachep, "constructor overwrote the"
                               " start of an object");
            }
            if ((cachep->buffer_size % PAGE_SIZE) == 0 &&
                OFF_SLAB(cachep) && cachep->flags & SLAB_POISON)
                kernel_map_pages(virt_to_page(objp),
                                 cachep->buffer_size / PAGE_SIZE, 0);
    #else
            if (cachep->ctor)
                cachep->ctor(objp);
    #endif
            slab_bufctl(slabp)[i] = i + 1;
        }
        slab_bufctl(slabp)[i - 1] = BUFCTL_END;
    }

Leaving the DEBUG macro aside, cache_init_objs initializes the free-index list and calls the object constructor on every chunk. Entry i of the list is set to i + 1, and the last entry is then overwritten with BUFCTL_END to terminate the list.
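Initialization, allocation, and release can be tied together in a small user-space model. This is only a sketch under simplifying assumptions: all toy_ names and the TOY_ constants are hypothetical, the index array is an ordinary struct member rather than memory placed after the header, plain division replaces reciprocal_divide, and toy_init also resets free and inuse, which the cache_init_objs listing does not do.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM  4      /* chunks per slab (assumed value) */
#define TOY_SIZE 32     /* bytes per chunk (assumed value) */
#define TOY_END  (~0U)  /* plays the role of BUFCTL_END */

struct toy_slab {
    unsigned char mem[TOY_NUM * TOY_SIZE];  /* the chunks; mem acts as s_mem */
    unsigned int bufctl[TOY_NUM];           /* next free index for each chunk */
    unsigned int free;                      /* head of the free-index list */
    unsigned int inuse;                     /* chunks currently handed out */
};

/* Chain every index to its successor and terminate the list. */
static void toy_init(struct toy_slab *s)
{
    unsigned int i;

    for (i = 0; i < TOY_NUM; i++)
        s->bufctl[i] = i + 1;
    s->bufctl[TOY_NUM - 1] = TOY_END;
    s->free = 0;
    s->inuse = 0;
}

/* Pop the head of the free-index list and return the chunk's address. */
static void *toy_get(struct toy_slab *s)
{
    unsigned int idx = s->free;

    if (idx == TOY_END)   /* toy-only guard; slab_get_obj has no such check */
        return NULL;
    s->free = s->bufctl[idx];
    s->inuse++;
    return s->mem + (size_t)idx * TOY_SIZE;
}

/* Push the chunk's index back onto the head of the free-index list. */
static void toy_put(struct toy_slab *s, void *obj)
{
    unsigned int idx =
        (unsigned int)(((unsigned char *)obj - s->mem) / TOY_SIZE);

    assert(idx < TOY_NUM);
    s->bufctl[idx] = s->free;
    s->free = idx;
    s->inuse--;
}
```

After toy_init, two calls to toy_get return chunks 0 and 1 and leave free at 2. Returning chunk 0 with toy_put sets bufctl[0] to 2 and free to 0, so the next toy_get hands out chunk 0 again: the list behaves as a last-in, first-out stack.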