INTERNAL: Extract hash_tree module from coll_set and coll_map by zhy2on · Pull Request #970 · naver/arcus-memcached

zhy2on · 2026-04-22T11:14:33Z

🔗 Related Issue

https://github.com/jam2in/arcus-works/issues/844#issuecomment-4278095307

⌨️ What I did

coll_set.c와 coll_map.c에 각각 중복으로 구현되어 있던 hash tree 로직을 hash_tree.c / hash_tree.h로 추출했습니다.

왜 분리가 필요했나

set과 map은 내부적으로 동일한 hash tree 구조를 사용하지만, 그 구현이 각 파일에 독립적으로 존재했습니다. 노드 탐색, 원소 삽입/삭제, 트리 순회 등 구조적으로 완전히 동일한 로직이 두 곳에 중복되어 있었고, 이를 하나의 모듈로 통합하는 것이 이번 작업의 목표였습니다.

elem 구조체 통일

hash tree 모듈에서 key 비교 및 사용을 일관되게 하기 위해, set과 map의 elem 구조체를 htree_elem_item 하나로 통합했습니다. 원래 set의 elem은 <value> 형식, map의 elem은 <field, value> 형식으로 구조가 달랐는데, nfield / nbytes 필드를 공통으로 정의하여 통일했습니다. set은 field를 사용하지 않으므로 nfield = 0으로 처리하기로 했습니다. 마찬가지로 hash node 구조체도 htree_hash_node로 통합하고, 각 collection에서 사용하던 별도의 카운터 필드도 tot_elem_cnt로 통일했습니다.

추출 과정 중 어려움: space accounting & CLOG

단순히 공통 로직을 옮기면 될 것 같았지만, hash tree 로직 사이사이에 각 collection이 담당하던
space accounting(do_coll_space_incr/decr)과 CLOG 호출이 끼어 있었습니다. 예를 들어 원소를 삽입할 때는 노드가 split되는 시점에 node의 공간을 반영해야 하고, 원소가 실제로 연결된 시점에 elem의 공간을 반영해야 합니다. 삭제 시에도 마찬가지입니다.

이를 hash tree 내부로 가져오면 각 collection이 타입별로 처리하던 내용을 hash tree가 알아야 하게 되는데, 이는 hash tree가 각 collection에 의존하는 구조가 되어 "collection이 hash tree를 사용한다"는 의존 방향에 맞지 않아 보였습니다. 그렇다고 collection 쪽에만 두자니 hash tree 로직 중간에 실행되어야 하는 처리들을 표현할 방법이 없었습니다.

해결: 콜백 패턴 도입

hash tree 로직 중간에 collection-specific 처리가 필요한 지점마다 콜백 함수 포인터를 인자로 전달받아 실행하는 방식으로 해결했습니다. hash tree는 구조적인 동작만 담당하고, 각 collection은 자신의 책임(space accounting, CLOG, ccnt 관리 등)을 콜백으로 정의해서 넘겨주는 구조입니다.

도입된 콜백 목록:

on_elem_insert — 원소가 새로 삽입된 후 호출 (ccnt 증가, space incr, CLOG)
on_elem_replace — 원소가 교체된 후 호출 (space 조정, CLOG, old elem free)
on_elem_delete — 원소가 삭제된 후 호출 (ccnt 감소, space decr, CLOG, elem free)
on_node_insert — 노드 split으로 새 노드가 추가된 후 호출 (space incr)
on_node_remove — 노드가 제거된 후 호출 (space decr)
on_pre_insert — 실제 삽입 전 호출 (overflow check, sticky memory check)
on_pre_replace — 실제 교체 전 호출 (sticky memory check)

일관성 문제

처음에는 일관성을 위해 대부분의 로직을 hash tree 모듈로 추출하려 했습니다. 그런데 do_map_elem_update처럼 map에서만 사용되는 함수는 추출하려면 콜백을 여러 개 인자로 전달해야 하는데, 그렇게까지 해서 추출해도 다른 collection에서 재사용되지 않아 복잡도만 높아지는 결과가 됐습니다. 결국 해당 함수는 추출을 포기하고 collection 레이어에 남겼는데, 그러다 보니 일부는 hash tree 모듈에, 일부는 collection 레이어에 남는 구조가 되어 완전한 일관성을 갖추지는 못했습니다.

추후 개선 방향 (미결)

이번 PR은 일단 중복으로 존재하던 로직을 기존 동작을 바꾸지 않고 hash tree 모듈로 옮기는 것에 집중했습니다. 다만 작업하면서 아래 두 가지 방향이 더 나은 구조일 수 있겠다는 고민이 남았고, 추후 개선 여지로 남겨둡니다.

find / insert_at / update_at 분리: 현재 do_htree_elem_insert처럼 find와 실제 삽입이 하나의 함수에 묶여 있는 구조를 find / insert_at / update_at으로 분리하고, 각 collection에서 이를 조립해서 사용하는 방식. 이렇게 하면 update처럼 map 전용 로직도 hash tree의 기본 연산을 재사용해서 구현할 수 있고, find를 한 번만 호출하는 단일 패스도 유지될 것으로 예상합니다. (node split 발생 시에만 재순회 허용)
CLOG 통합: hash tree를 사용하는 한 데이터 구조와 복구 로직은 set/map이 동일합니다. CLOG를 hash tree 수준에서 재정의하고, ITEM_TYPE 등 collection별로 달라지는 정보만 별도로 정의하는 방식으로 콜백 없이 통합할 수 있을 것으로 예상합니다.

현재 커밋을 나눠서 올린 상태로 리뷰 후 최종본에는 커밋을 하나로 통일하겠습니다.

jhpark816

일부 리뷰

jhpark816 · 2026-04-22T11:54:01Z

+                                     htree_hash_node *node);
+
+void              do_htree_node_remove(htree_hash_node **root,
+                                       htree_hash_node *par_node, int par_hidx);


내부에서 호출하는 함수만 do_ prefix 가지도록 하고,
외부에 제공하는 함수는 htree_ prefix 가지도록 합시다.

htree_xxxx()

jhpark816 · 2026-04-22T11:54:32Z

+    htree_hash_node *node;
+    htree_elem_item *prev;
+    uint16_t         hidx;
+} htree_prev_info;


이런 자료구조가 노출되지 않게 구현할 수 있나요?

jhpark816 · 2026-04-22T11:55:14Z

+
+typedef ENGINE_ERROR_CODE (*htree_pre_insert_cb)(htree_elem_item *new_elem, void *ctx);
+
+typedef void (*htree_node_insert_cb)(void *ctx);


생각보다 callback 정의가 많네요.
callback 수를 줄일 수 있으면 좋겠습니다.

jhpark816 · 2026-04-22T11:57:32Z

-            ndeleted = set_elem_delete_with_count((void *)info, count);
+            ndeleted = do_set_elem_delete((void *)info, count, ELEM_DELETE_COLL);
        } else if (IS_MAP_ITEM(it)) {
-            ndeleted = map_elem_delete_with_count((void *)info, count);


이 코드는 그대로 둡시다.
외부 코드 파일에 있는 do_xxxx() 함수를 직접 호출하지 않게 합시다.

zhy2on · 2026-04-23T01:52:01Z

내부에서 호출하는 함수만 do_ prefix 가지도록 하고, > 외부에 제공하는 함수는 htree_ prefix 가지도록 합시다.

htree_xxxx()

피드백 반영하여 내부에서만 사용하는 함수들은 do_ prefix를 적용하고 header에서 선언을 제거했습니다.

이런 자료구조가 노출되지 않게 구현할 수 있나요?

htree_prev_info는 do_htree_elem_find로 elem을 찾은 뒤 그 위치(node, hidx, prev)를 직접 조작해 포인터 교체를 수행하기 위한 구조체였습니다.
find와 포인터 교체를 htree_elem_insert 내부에서 한 번에 처리하고, replace가 발생했을 때 교체된 old elem의 포인터만 replaced_out 파라미터로 반환하도록 해 htree_prev_info 사용을 없애고 정의도 없앴습니다. 외부에서 내부 구조를 알 필요가 없게 했습니다.

생각보다 callback 정의가 많네요. > callback 수를 줄일 수 있으면 좋겠습니다.

더 고민해보니 space accounting은 hash tree 내부로 이동하는 게 맞다고 판단했으며 이에 따라 콜백 수를 줄일 수 있게 됐습니다.
처음에는 space accounting을 hash tree 내부로 가져오면 hash tree가 각 collection의 타입별 처리를 알아야 하게 되어 "collection이 hash tree를 사용한다"는 의존 방향에 맞지 않는다고 판단해 콜백으로 분리했습니다.
그러나 다시 생각해보니 노드와 원소의 삽입·삭제가 Hash Tree 내부에서 수행되는 이상, 그 결과인 공간 변화를 추적하는 것 역시 Hash Tree의 역할로 보는 것이 맞아보였습니다. 아이템 타입만 다를 뿐 카운팅 로직이 완전히 동일하며, coll_meta_info*에서 ITEM_TYPE을 도출해 hash tree 내부에서 처리하는 방식으로 통합할 수 있었습니다.
다만, CLOG 처리를 위한 콜백은 아직 남아있습니다.

추가로 set/map_elem_get_all의 중복된 DFS 구현을 제거하고, 기존 htree_traverse_dfs_bycnt(count=0, delete=false)를 재활용하는 방식으로 단순화했습니다.

남아있는 CLOG 콜백도 hash tree 내부로 가져올 수 있는지 검토하고, 추가로 추출 가능한 부분이 있는지 확인한 뒤 다시 피드백을 요청드리겠습니다.

jhpark816

일부 리뷰

jhpark816 · 2026-04-23T06:06:40Z

set_hash_node, map_hash_node 타입을 따로 두지 않고,
htree_hash_node 타입을 그대로 사용합시다.

htree_hash_node 타입과 htree_elem_item 타입은 모두
hash_tree 파일의 header에서 제공되어야 하지 않는 지 ?

jhpark816 · 2026-04-23T06:14:12Z

+            htree_node_unlink(&info->root, NULL, 0, (coll_meta_info *)info);
        }
        return ret;
    }


hash_tree에 넣는 insert 기능은 hash_tree 파일에서 제공해야 하지 않는 지 ?

namsic · 2026-04-23T07:13:43Z

+    uint8_t  nfield;              /* key size in data[]: 0 for set/sorted_set (value-only key) */
+    uint8_t  reserved;
+    uint16_t nbytes;              /* value size in data[] */


nfield => 2bytes
nbytes => 4bytes

- htree_elem_item, htree_node 타입 정의 및 hash_tree.h/c 모듈 신설 - htree_elem_insert, htree_elem_update 등 핵심 연산 추출 - htree_elem_traverse_dfs_bycnt, bykey, rand, sampling 순회 함수 추출 - set/map의 중복 구현 제거 및 hash_tree API로 일원화 - set/map elem 구조체를 htree_elem_item 레이아웃으로 통합

zhy2on · 2026-04-28T08:45:11Z

hash_tree 모듈 추출 및 set/map 공통화

set과 map 컬렉션에 중복으로 존재하던 hash tree 자료구조 구현을 hash_tree.h / hash_tree.c로 분리하고, 두 컬렉션이 공통 인터페이스를 통해 사용하도록 리팩토링했습니다.

핵심 설계 원칙

hash_tree는 순수 자료구조 레이어로, 아래 책임만 가집니다:

노드/엘리먼트 할당·해제 (htree_elem_alloc, htree_elem_free)
삽입·갱신·탐색·순회·삭제 연산
node collapse (split 역전), root cleanup 내부화

아래는 각 컬렉션(caller) 책임으로 유지합니다:

CLOG (단, traverse delete 시 순회 중 CLOG를 남겨야 하므로 unlink_fn을 callback으로 전달)
space accounting
ccnt
item alloc/find, cache lock, log replay: 각 컬렉션 레이어에서 처리

elem 레이아웃 통일

htree_elem_item을 <nkey, nbytes, data[]> 구조로 정의하고 두 타입을 typedef로 통일:

set: nkey == nbytes (data 전체가 key)
map: nkey == field 길이, nbytes == field + value 길이 (data = <field, value>)

추출된 공통 함수

함수	역할	비고
htree_elem_alloc	elem 메모리 할당	cache lock 타이밍은 컬렉션이 제어해야 해서 public
htree_elem_free	elem 메모리 해제	동일 이유로 public
htree_elem_release	refcount 감소, 0이면 free
htree_elem_ntotal	elem 전체 크기 반환
htree_elem_insert	삽입 (replace, overflow, sticky 포함)
htree_elem_update	값 갱신 (in-place 또는 chain replace)
htree_elem_traverse_dfs_bycnt	DFS 순회, count개 조회/삭제
htree_elem_traverse_dfs_bykey	key 기반 단건 탐색/삭제
htree_elem_traverse_rand	랜덤 선택 (sparse: dedup hash table, dense: reservoir sampling)

namsic · 2026-04-28T11:56:34Z

+    ssize_t space_delta = 0;
+    ENGINE_ERROR_CODE ret = htree_elem_update((htree_node **)&info->root,
+                                              new_elem,
+                                              IS_STICKY_COLLFLG(info),


IS_STICKY_COLLFLG()는 #ifdef ENABLE_STICKY_ITEM 안쪽에서만 호출해야 합니다.

수정하였습니다.

void arcus_zk_init(char *ensemble_list, int zk_to, EXTENSION_LOGGER_DESCRIPTOR *logger, int verbose, size_t maxbytes, int port, #ifdef PROXY_SUPPORT char *proxy, #endif ENGINE_HANDLE_V1 *engine);

위와 같이 is_sticky 인자 자체를 아예 받지 않는 방법도 있는데,
내부 콜스택에서도 is_sticky 인자를 계속 전달하고 있어서 이러한 구조를 적용했을 때 구현이 지저분해질 여지가 있어 보입니다.

현재 시점에서는 수정된 구현처럼 disable sticky item일 때 항상 false 전달하는 것으로 알고 있겠습니다.

namsic · 2026-04-28T12:09:33Z

+#define HTREE_HASHTAB_SIZE       16
+#define HTREE_HASHIDX_MASK       0x0000000F
+#define HTREE_MAX_HASHCHAIN_SIZE 64
+
+#define HTREE_GET_HASHIDX(hval, hdepth) \
+    (((hval) & (HTREE_HASHIDX_MASK << ((hdepth)*4))) >> ((hdepth)*4))


내부 구현 목적으로만 사용하고 외부에서 알 필요가 없는 값인 것 같은데,
hash_tree.c로 옮겨도 되지 않나 생각합니다.

반영하였습니다.

HTREE_HASHIDX_MASK, HTREE_MAX_HASHCHAIN_SIZE는 hash_tree.h에 유지해야 하나요?

기존 코드에서 SET_HASHIDX_MASK, SET_HASHTAB_SIZE 등이 item_base.h에 선언된 패턴을 따랐습니다.
하지만 이는 외부에 노출할 필요 없는 구현 상수가 노출된 경우로 수정하는 것이 맞는 것 같습니다. 반영하도록 하겠습니다.

BTREE_MAX_DEPTH 또한 동일한 방향으로 정리가 필요하나, 이는 btree 모듈 추출 시 함께 진행하겠습니다.

namsic · 2026-04-28T12:31:02Z

+        if (max_count > 0 && (int)(*root_pptr)->tot_elem_cnt >= max_count) {
+            if (new_root) { do_htree_node_free(*root_pptr); *root_pptr = NULL; }
+            return ENGINE_EOVERFLOW;
+        }


현재 PR 기준으로 max_count에 0을 전달하는 경우는 없기는 하지만,
hash_tree 모듈 수준에서는 max_count가 0인 경우 element 개수 제한 없이 무한정 생성하도록 하나요?

hash tree 객체 단위로 최대 개수를 가지는 것이 아니라
각 insert 호출마다 max_count 값을 전달하는 것이 약간 어색해 보이기는 합니다.

max_count는 hash tree 자체의 속성이 아니라, hash tree를 사용하는 collection의 정책입니다. hash tree 자체는 제한 없이 element를 생성하는 것이 맞다고 생각했습니다.

원래는 caller에서 overflow를 먼저 판단하고 hash tree는 순수하게 insert만 담당하게 하는 게 맞지만, upsert 경로에서는 key가 이미 존재하면 replace(ccnt 불변), 존재하지 않으면 새 삽입(ccnt 증가)으로 분기되기 때문에 find 결과를 알기 전까지 overflow 여부를 판단할 수 없습니다.

이를 해결하는 방법으로

find를 두 번 수행: caller에서 먼저 find로 존재 여부를 확인한 뒤 overflow를 판단하고, hash tree에 insert를 요청하는 방식입니다. 불필요한 중복 탐색이 발생합니다.

find + insert_at 분리: htree_elem_find로 위치(node, hidx, prev)를 반환하고 caller가 overflow 판단 후 htree_elem_insert_at으로 삽입하는 방식입니다. 내부 구현 detail인 node/prev 포인터가 public API로 노출되어 캡슐화가 깨집니다.

두 방법을 생각해봤지만, max_count를 insert 인자로 넘기는 방식이 가장 괜찮은 방법인 것 같아 택했습니다.

namsic · 2026-04-28T12:53:12Z

+/* set/map elements are the canonical htree elem type.
+ * set:  nkey == nbytes (the whole value is the lookup key).
+ * map:  nkey == field length, nbytes == field + value length. */
+typedef htree_elem_item set_elem_item;
+typedef htree_elem_item map_elem_item;


type alias 설정하고 있는데, htree_elem_item 직접 사용하도록 변경해도 동작에 차이 없죠?

nkey + nvalue 조합보다, nkey + nbytes 조합이 구현하기에 낫다고 보는 것이죠?

type alias 설정하고 있는데, htree_elem_item 직접 사용하도록 변경해도 동작에 차이 없죠?

set, map만 놓고 보면 동작에 차이는 없습니다. 다만 default_engine.c, item_clog.c 등에서 collection별로 elem 타입을 구분해서 사용하는 곳이 많기 때문에, typedef alias로 유지하는 것이 낫다고 판단했습니다.

nkey + nvalue 조합보다, nkey + nbytes 조합이 구현하기에 낫다고 보는 것이죠?

기존 구조와의 일관성을 위해 채택했습니다. item_base.h를 보면 다른 자료구조에서도 nbytes로 data[] 전체 크기를 표현하고, data 안의 섹션을 나눌 경우 구분 기준값을 별도 필드로 두는 패턴을 사용하고 있습니다. htree_elem_item도 이 패턴에 맞춰 nbytes로 전체 크기를, nkey로 key/value 경계를 표현하도록 했습니다.

…로 이동

namsic · 2026-04-29T12:02:15Z

+    if (space_delta_out)
+        *space_delta_out = -(ssize_t)slabs_space_size(
+                               offsetof(htree_elem_item, data) + elem->nbytes);
+}


아래 내용이 맞는지 확인해 주세요.

여기 구현을 보면, space_delta_out에 이전 값과 무관하게 이 함수 자체의 변경분을 설정하고 있습니다.

그렇기 때문에, 이 함수를 호출하는 위치를 보면 아래와 같이 지역 변수에 값을 받아서
다시 자신의 space_delta_out에 더하고 있습니다.

ssize_t delta = 0; do_htree_elem_unlink(…, &delta); if (space_delta_out) *space_delta_out += delta;

이 위치에서 단순히 할당하는 대신 증감 연산 하도록 변경하면,
호출 위치에서 자신의 space_delta_out을 그대로 넘겨도 될 것 같습니다.

space_delta_out은 공간의 변화량을 담는 output(반환값) 입니다.
함수 실행 중 메모리 변동이 없으면 0, 할당/해제가 있으면 그 크기를 반환하게 될 것입니다.
현재 PR은 space_delta_out의 초기화 책임을 호출하는 측에서 갖도록 구현하고 있습니다.
즉, 인자로 넘어오는 space_delta_out 값이 이미 0이라는 전제 하에 구현합니다.
아래와 같이 public API 내부에서 값을 초기화하는 편이 자연스럽지 않나 생각합니다.

{ ssize_t space_delta; // 초기화할 필요 없음 htree_elem_traverse_rand(…, &space_delta); do_coll_space_decr(…, -space_delta); } int htree_elem_traverse_rand(…, ssize_t *space_delta_out) { if (space_delta_out) *space_delta_out = 0; // 내부 static 함수에서는 기존처럼 증감 연산 수행해도 됨 // (같은 모듈 내에서 이미 초기화한 것을 확신할 수 있으므로) }

또는, 모듈 수준에서는 사용자가 현재 메모리 사용량을 넘겨준다고 가정하고
초기화 없이 기존 값에 대한 증감 연산만을 수행하는 것으로 동작을 정의할 수도 있을텐데요.
(반환값을 다시 처리하지 않고 함수 수행 과정에서 메모리 변동을 실시간으로 따라가도록)
이 경우에는 delta_out이라는 용어가 약간 맞지 않는 것 같습니다.

전달 받은 공간값에 직접 변화량을 더하지 않았던 이유는 hash tree 레이어에서는 순수하게 변화량만 돌려주고, 그걸 반영하는 건 각 collection의 책임으로 두고 싶었기 때문입니다.
do_coll_space_incr/decr로 공간 변화를 관리하고 있기 때문에, 직접 info->stotal 값을 업데이트할 수 있는 것이 아니라 변화량을 do_coll_space_incr/decr에 넘겨 처리해야 했습니다.
이 때 해시트리에서 do_coll_space_incr/decr을 하려면 collection item의 타입을 알아야 하는데, 이를 인자로 전달받아 처리하기 보단 각 collection에서 반환받은 변화량을 직접 do_coll_space_incr/decr 하는 것이 맞다고 생각했습니다.

다만 이 경우에도 space_delta_out의 초기화를 함수 내부에서 맡는 것이 맞아 보입니다. 현재 방식을 유지하더라도 초기화는 내부로 옮기도록 하겠습니다.

jhpark816

일부 리뷰

jhpark816 · 2026-04-29T11:48:03Z

+    uint32_t hval;
+    struct _htree_elem_item *next;
+    uint16_t nkey;          /* bytes of data[] used as lookup key */
+    uint16_t nbytes;        /* total bytes in data[] */


nbytes는 uint32_t 타입 사용을 검토

jhpark816 · 2026-04-29T12:04:14Z

+            if (new_root) { do_htree_node_free(*root_pptr); *root_pptr = NULL; }
+            return ENGINE_ENOMEM;
+        }
+#endif


sticky 검사는 htree 외부에서 진행하는 것이 좋겠습니다.

jhpark816 · 2026-04-30T02:18:59Z

+    *node_out = node;
+    *hidx_out = hidx;
+    *prev_out = prev;
+    return find;


find == NULL인 경우 output parameter에 값을 설정할 필요가 있나요?

…PI로 재설계 - collection layer에서 sticky overflow, overflow check 하도록 변경 - htree_elem_link는 단순 링킹만 담당하도록 변경

jhpark816

일부 리뷰

jhpark816 · 2026-05-04T02:42:38Z

-                *exist = false;
+            *exist = htree_elem_traverse_dfs_bykey((htree_node **)&info->root, info->root,
+                                                   nbytes, (const unsigned char *)value,
+                                                   false, NULL, NULL, NULL, NULL);


htree_elem_find() 함수를 사용하면 어떤가요?

jhpark816 · 2026-05-04T02:47:27Z

+                                               field->length,
+                                               (const unsigned char *)field->value,
+                                               true, NULL,
+                                               do_map_elem_unlink_clog, info, &space_delta);


아래 2 함수로 구현하는 것은 어떤지?

htree_elem_find()

htree_elem_unlink()

…link 분리 - htree_elem_link → htree_elem_insert로 rename (node split 포함한 완전한 삽입 의미 명확화) - 순수 체인 링킹 로직을 do_htree_elem_link static 함수로 분리 - coll_set.c, coll_map.c 호출부 반영

- htree_elem_delete 추가: find + unlink + node merge를 한 번의 traverse로 처리

- htree_elem_insert: 삽입 전 hash chain 순회하여 중복 키 검사, 중복 시 ENGINE_ELEM_EEXISTS 반환 (new_root 생성 시 rollback) - coll_set: htree_elem_insert가 중복 검사를 담당하므로 do_set_elem_insert의 htree_elem_find 호출 제거 - htree_elem_insert에 단계별 주석 추가

- do_map_elem_replace: 기존 elem 교체 (sticky 체크, replace_at, CLOG, free, space 갱신) - do_map_elem_add: 신규 elem 삽입 (sticky/overflow 체크, insert, CLOG, ccnt, space 갱신) - do_map_elem_insert: find 후 replace/add 분기만 담당

- do_htree_elem_unlink: root부터 target node까지 경로상 모든 tot_elem_cnt 감소 (기존에 leaf node만 감소하던 버그 수정) - do_htree_elem_unlink: space_delta 계산 제거, 호출부에서 직접 계산 (do_htree_elem_link와 동일한 패턴으로 대칭화) - do_htree_elem_link/unlink: 단계별 주석 추가

- do_htree_node_link/unlink, do_htree_elem_link/unlink: space_delta 계산 제거, 순수 구조 조작만 담당 - space_delta 계산을 htree_elem_insert/delete 상위 함수로 이동 - space_delta_add() 헬퍼 추가로 반복 패턴 제거 - htree_elem_delete: insert와 대칭 구조로 단계별 주석 추가 - htree_elem_delete: node merge 조건 분기 통합

- do_set_elem_delete_with_value: htree_elem_traverse_dfs_bykey → htree_elem_delete + CLOG_SET_ELEM_DELETE + htree_elem_release - do_map_elem_delete_with_field: htree_elem_traverse_dfs_bykey → htree_elem_delete + CLOG_MAP_ELEM_DELETE + htree_elem_release

- delete=false 단순 존재 확인은 htree_elem_find로 충분

- item_base.h에 do_coll_space_update() inline 헬퍼 추가 - space_delta > 0이면 incr, < 0이면 decr, 0이면 no-op - coll_set.c, coll_map.c의 do_coll_space_incr/decr 직접 호출을 do_coll_space_update()로 교체

- htree_elem_traverse_dfs_bykey, do_htree_elem_traverse_bykey 제거 - delete=false: htree_elem_find로 대체 - delete=true: htree_elem_delete + caller CLOG으로 대체 - do_map_elem_get을 do_map_elem_traverse_all, do_map_elem_traverse_byfield로 분리 - numfields == 0: traverse_all (htree_elem_traverse_dfs_bycnt 호출) - numfields > 0: traverse_byfield (htree_elem_find/delete 호출) - do_coll_space_update를 item_base.h로 이동, set/map 중복 정의 제거

- do_set_elem_delete_with_value → do_set_elem_delete_by_value - do_map_elem_delete_with_field → do_map_elem_delete_by_field - do_map_elem_get_all → do_map_elem_traverse_all - do_map_elem_get_byfield → do_map_elem_traverse_by_field - htree_elem_traverse_dfs_bycnt → htree_elem_traverse_dfs_by_cnt

namsic · 2026-04-30T00:26:17Z

+
+void htree_elem_free(htree_elem_item *elem)
+{
+    do_item_mem_free(elem, offsetof(htree_elem_item, data) + elem->nbytes);


htree_elem_ntotal 사용하는 것은 어떤지?

Suggested change

do_item_mem_free(elem, offsetof(htree_elem_item, data) + elem->nbytes);

do_item_mem_free(elem, htree_elem_ntotal(elem));

namsic · 2026-05-05T05:08:21Z

+            if (new_root) {
+                do_htree_node_free(*root_pptr);
+                *root_pptr = NULL;
+            }


new_root는 기존에 hash tree가 없었던 상태에서만 설정되므로,
이 위치에서 new_root == true인 경우가 없을 것 같습니다.

아래 HTREE_MAX_HASHCHAIN_SIZE 비교에서도 기존에 element가 있었다면
new_root는 항상 false일 것으로 보입니다.

namsic self-requested a review April 22, 2026 11:15

zhy2on changed the title ~~Internal/extract hash tree module~~ INTERNAL: Extract hash_tree module from coll_set and coll_map Apr 22, 2026

jhpark816 reviewed Apr 22, 2026

View reviewed changes

zhy2on force-pushed the internal/extract-hash-tree-module branch from a614f6b to 28a1b82 Compare April 22, 2026 15:15

zhy2on force-pushed the internal/extract-hash-tree-module branch 2 times, most recently from 9434b3f to 2eef927 Compare April 23, 2026 02:50

This comment was marked as outdated.

Sign in to view

jhpark816 reviewed Apr 23, 2026

View reviewed changes

namsic reviewed Apr 23, 2026

View reviewed changes

zhy2on force-pushed the internal/extract-hash-tree-module branch 2 times, most recently from 2d1a15a to 234938c Compare April 28, 2026 08:16

zhy2on force-pushed the internal/extract-hash-tree-module branch from 234938c to 8d3450e Compare April 28, 2026 08:40

namsic reviewed Apr 28, 2026

View reviewed changes

zhy2on added 3 commits April 28, 2026 23:31

INTERNAL: 코드 정렬 및 HTREE_GET_HASHIDX 이동

8f80a26

INTERNAL: IS_STICKY_FOLLFLG()를 #ifdef ENABLE_STICKY_ITEM과 함께 호출하도록 수정

74b6857

INTERNAL: HTREE_HASHIDX_MASK, HTREE_MAX_HASHCHAIN_SIZE hash_tree.c 내부…

6a617d4

…로 이동

namsic reviewed Apr 29, 2026

View reviewed changes

INTERNAL: space_delta 초기화 hash tree 함수 내부로 이동

2c84f9d

jhpark816 reviewed Apr 30, 2026

View reviewed changes

INTERNAL: htree_elem_find / htree_elem_link / htree_elem_replace_at A…

ce2adfe

…PI로 재설계 - collection layer에서 sticky overflow, overflow check 하도록 변경 - htree_elem_link는 단순 링킹만 담당하도록 변경

jhpark816 reviewed May 4, 2026

View reviewed changes

zhy2on added 6 commits May 4, 2026 14:00

INTERNAL: htree_elem_delete 추가

7959cbf

- htree_elem_delete 추가: find + unlink + node merge를 한 번의 traverse로 처리

zhy2on added 5 commits May 4, 2026 18:17

INTERNAL: set_elem_exist에서 traverse_dfs_bykey를 htree_elem_find로 교체

0df06d6

- delete=false 단순 존재 확인은 htree_elem_find로 충분

namsic reviewed May 6, 2026

View reviewed changes


		typedef ENGINE_ERROR_CODE (htree_pre_insert_cb)(htree_elem_item new_elem, void *ctx);

		typedef void (htree_node_insert_cb)(void ctx);

	do_item_mem_free(elem, offsetof(htree_elem_item, data) + elem->nbytes);
	do_item_mem_free(elem, htree_elem_ntotal(elem));

Conversation

zhy2on commented Apr 22, 2026

🔗 Related Issue

⌨️ What I did

Uh oh!

jhpark816 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhy2on commented Apr 23, 2026

Uh oh!

This comment was marked as outdated.

Uh oh!

jhpark816 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhy2on commented Apr 28, 2026

hash_tree 모듈 추출 및 set/map 공통화

핵심 설계 원칙

elem 레이아웃 통일

추출된 공통 함수

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhy2on Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jhpark816 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jhpark816 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

zhy2on Apr 28, 2026 •

edited

Loading