Copyright (C) 커널연구회 (www.kernel.bz)작성자: 정재준(rgbi3307 골뱅이 nate.com)문서위치 메인(출처): https://www.kernel.bz/blogPost/htree-intro아래 내용은 누구나 사용할 수 있습니다. 저자정보는 지우지 말고 위의 출처를 기입해 주시기 바랍니다.커널 버전 [v6.10)안녕하세요?그동안 리눅스 커널에 새로운 자료구조로 Hash Tree 을 구현하기 위해 2~3개월 집중하다보니,블로그 글을 오랜만에 올리게 되었습니다. v1.0 정도로 완성을 해서 오늘(8월 2일) 리눅스 커널 메인테이너 메일링 리스트에 올려서 공유 했습니다.평가를 받아 보는데 시간이 걸릴듯 합니다.일단, 커널 메인테이너들에게 공유한 내용을 그대로 여기에도 올립니다.(영문으로 작성 하다 보니 한글 설명이 미흡 합니다. 향후 한글로 꾸준히 관련 내용을 공유 하겠습니다.)제가 작성한 그림들을 보시면 충분히 이해할 수 있고, 영문도 키워드 중심으로 정리해서 이해하기 쉬울듯 합니다. Hash Trees (lib/htree) in Linux Kernel Author: JaeJoon Jung <rgbi3307@gmail.com>Date: Fri Aug 2 13:43:05 2024 +0900 new Hash Tree Features* Very small hash tree structure. [16 Bytes]* Dynamic memory allocation and free.* Both 32-bit and 64-bit indexes are possible* Generate hash keys uniformly based on the index.* Hash trees are balanced by hash keys, and have no rotation costs.* Standard deviation of hash key is 4 or less.* Algorithm O(n) is depth(d) x nodes(c)* Finding walks is (d x c), min:4, avg:12, max:20* First hash table has smallest, largest index, algorithm O(1).* The codes implementing of the algorithm is simple.* Adjust hash tree depth according to system memory and index nr.* Hash list nodes use include/linux/list.h, hlist as their base. Hash Tree Summary (include/linux/htree.h)size of one hash tree struct: [16]Bytessize of one data struct: (40)Bytessize of middle: 32Bytes if system has 16GB memory, number of index(nr) is 256M(middle)if system has 4GB memory, number of index(nr) is 64M(middle)...index max: 1 << 50: 2^50: 1P ( 1P x 32: 32P) --> depth:6 (64bits index)index max: 1 << 40: 2^40: 1T ( 1T x 32: 32T) --> depth:6 (64bits index)...index max: 1 << 32: 2^32: 4G ( 4G x 32: 128G) --> depth:5index max: 1 << 28: 2^29: 512M (512M x 32: 16G) --> depth:4 (32bits index)index max: 1 << 28: 2^28: 256M (256M x 32: 8G) --> depth:4index max: 1 << 26: 2^26: 64M ( 64M x 32: 2G) --> depth:3 (32bits index)index max: 1 << 25: 2^25: 32M ( 32M x 32: 1G) --> depth:2 if number of index(nr) is between 32M and 64M, hash tree depth is [2,3) hash array size(anum): 1 << (sbit - depth)dnum: [d0:anum x d1:anum x d2:anum x d3:anum x d4:anum x d5:anum ...) if starting hash bit(sbit) is 9:dnum: [d0:512 x d1:256 x d2:128 x d3:64 x d4:32 x d5:16 ...) dcnt(max index): (d:dnum * HTREE_NODE_CNT): (dnum * 4): d0:2K, d1:512K, d2:64M, d3:4G, d4:128G, d5:2T, ... asum(mid index): (d:dnum * HTREE_NODE_MIN): (dnum * 2): d0:1K, d1:256K, d2:32M, d3:2G, d4: 64G, d5:1T, ... htree depth avg(d): (3)hlist node cnt(c) : [4)algorithm O(n) : (d) x c == 3 x 4 == 12 (finding walks)memory efficiency : (dcnt / asum) == 85%(/100 == 0.85) (usage eff) htree depth(d): 0 ----> 1 ----> 2 ----> 3 ----> 4 ----> 5hash bits(b) : 9 ----> 8 ----> 7 ----> 6 ----> 5 ----> 4table size(t) : 512 ----> 256 ----> 128 ----> 64 ----> 32 ----> 16 d0:b9:t512+-----[4)--> d1:b8:t256+-------> d2:b7:t128+-------> d3:b6:t64+------> d4:b5:t32+------> d5:b4:t16 if sort flag is HTREE_FLAG_ASCD, first htree depth(d0) has smallest index.if sort flag is HTREE_FLAG_DECD, first htree depth(d0) has largest index.hts->most has the hash key position, algorithm O(1). If there is the same index:if req is htf_ins, the new udata is inserted next to each other.if req is htf_erase, the new udata is inserted, and old udata is erased. Hash Tree API flow (lib/htree.c, lib/htree-test.c) Please refer to the attached PDF for more detailed information. 소스 공유(github): https://github.com/kernel-bz/htree.git 설명 문서(PDF):https://github.com/kernel-bz/htree/tree/main/docs/htree-20240802.pdf Hash list nodes use include/linux/list.h, hlist as their base. 궁금한 사항들은 댓글이나 메일 주세요.감사합니다.