A collection of high-performance, generic data structures implemented in C, including vectors, hashmaps, hashsets, binary search trees, queues, stacks, bit vectors, and dynamic strings.
- Generic Vector (
genVec): Dynamic array with automatic resizing - HashMap: Hash table with linear probing and dynamic resizing
- HashSet: Set implementation using hash table
- Binary Search Tree (
BST): Array-based BST with bit vector for tracking occupied nodes - Bit Vector (
bitVec): Space-efficient bit array - Queue: Circular queue implementation
- Stack: LIFO stack wrapper around vector
- String: Dynamic string with rich manipulation API
- Trie: Prefix tree for string storage (header-only)
- Custom memory management with optional delete functions
- Type-safe generic operations using
uint8_t*casting
A dynamic array that can store any data type.
Key Features:
- Automatic growth (1.5x) and shrinking (0.5x at 25% capacity)
- Custom delete functions for complex types
- Efficient memory management
- Rich API for insertion, removal, and access operations
- Stack allocation support with
genVec_init_stk()
Usage:
// Create a vector of integers
genVec* vec = genVec_init(10, sizeof(int), NULL);
// Push elements
int value = 42;
genVec_push(vec, (u8*)&value);
// Access elements
int result;
genVec_get(vec, 0, (u8*)&result);
// Stack allocation
genVec vec_stk;
genVec_init_stk(10, sizeof(int), NULL, &vec_stk);
genVec_destroy_stk(&vec_stk);
// Cleanup
genVec_destroy(vec);A hash table implementation with linear probing collision resolution.
Key Features:
- Dynamic resizing (grows at 70% load factor, shrinks at 20%)
- Prime number capacities for better distribution
- Tombstone marking for efficient deletion
- Custom hash, compare, and delete functions
- FNV-1a hash function by default
- In-place value modification with
hashmap_modify()
Usage:
// Create a hashmap with String keys and int values
hashmap* map = hashmap_create(
sizeof(String),
sizeof(int),
murmurhash3_string, // Custom hash for String
string_custom_delete, // Delete function for keys
NULL, // No delete needed for int values
string_custom_compare // Compare function for keys
);
// Insert key-value pair
String key;
string_create_onstack(&key, "hello");
int value = 100;
hashmap_put(map, (u8*)&key, (u8*)&value);
// Retrieve value
int retrieved;
if (hashmap_get(map, (u8*)&key, (u8*)&retrieved) == 0) {
printf("Value: %d\n", retrieved);
}
// Cleanup
string_destroy_fromstk(&key);
hashmap_destroy(map);A set implementation for storing unique elements.
Key Features:
- Same collision resolution and resizing strategy as HashMap
- Efficient membership testing
- Automatic duplicate prevention
- Linear probing with tombstone marking
- Dynamic resizing based on load factor
Usage:
// Create a hashset of integers
hashset* set = hashset_create(
sizeof(int),
NULL, // Use default FNV-1a hash
NULL, // No custom delete needed for ints
NULL // Use default memcmp compare
);
// Insert elements
int value = 42;
hashset_insert(set, (u8*)&value);
// Check membership
if (hashset_has(set, (u8*)&value)) {
printf("Element 42 found!\n");
}
// Remove element
hashset_remove(set, (u8*)&value);
// Cleanup
hashset_destroy(set);An array-based BST implementation using implicit heap indexing.
Key Features:
- Array-based storage with implicit parent-child relationships
- Bit vector to track occupied vs empty array slots
- Standard BST operations (insert, remove, search)
- Multiple traversal methods (preorder, inorder, postorder, BFS)
- Find min/max operations
- Balance operation (experimental)
Implementation Details:
- Parent index:
(i - 1) / 2 - Left child:
2 * i + 1 - Right child:
2 * i + 2 - Uses bit vector to distinguish allocated vs occupied slots
Usage:
// Create BST for integers
BST* bst = bst_create(
sizeof(int),
int_cmp, // Comparison function
int_to_str, // Convert to String for traversal
NULL // No custom delete needed
);
// Insert elements
int values[] = {50, 30, 70, 20, 40, 60, 80};
for (int i = 0; i < 7; i++) {
bst_insert(bst, (u8*)&values[i]);
}
// Search
int search_val = 40;
if (bst_search(bst, (u8*)&search_val)) {
printf("Found %d\n", search_val);
}
// Traversals
String* inorder = bst_inorder(bst);
printf("Inorder: %s\n", string_to_cstr(inorder));
string_destroy(inorder);
String* bfs = bst_bfs(bst);
printf("BFS: %s\n", string_to_cstr(bfs));
string_destroy(bfs);
// Find min/max
int min, max;
bst_find_min(bst, (u8*)&min);
bst_find_max(bst, (u8*)&max);
printf("Min: %d, Max: %d\n", min, max);
// Balance (WARNING: experimental, may have issues)
bst_balance(bst);
// Cleanup
bst_destroy(bst);A space-efficient bit array using byte-packed storage.
Key Features:
- 8x memory savings compared to bool arrays
- Individual bit manipulation (set, clear, test, toggle)
- Push/pop operations
- Built on top of
genVecfor automatic resizing
Usage:
// Create bit vector
bitVec* bvec = bitVec_create();
// Set bits
bitVec_set(bvec, 0); // Set bit 0
bitVec_set(bvec, 5); // Set bit 5
bitVec_set(bvec, 10); // Set bit 10
// Test bits
if (bitVec_test(bvec, 5)) {
printf("Bit 5 is set\n");
}
// Clear bit
bitVec_clear(bvec, 5);
// Toggle bit
bitVec_toggle(bvec, 0);
// Push a set bit
bitVec_push(bvec);
// Print byte at index
bitVec_print(bvec, 0); // Print first byte
// Cleanup
bitVec_destroy(bvec);A circular queue implementation with automatic growth.
Key Features:
- Circular buffer for O(1) enqueue/dequeue
- Automatic resizing (1.5x growth)
- No shrinking (for performance)
- Built on top of
genVec
Usage:
// Create queue of integers
Queue* q = queue_create(10, sizeof(int), NULL);
// Enqueue elements
for (int i = 0; i < 5; i++) {
enqueue(q, (u8*)&i);
}
// Dequeue element
int val;
dequeue(q, (u8*)&val);
printf("Dequeued: %d\n", val);
// Peek at front
queue_peek(q, (u8*)&val);
printf("Front: %d\n", val);
// Check size
printf("Size: %zu\n", queue_size(q));
// Cleanup
queue_destroy(q);A simple LIFO stack wrapper around genVec.
Key Features:
- Standard push/pop/peek operations
- Automatic resizing through underlying vector
- Thin wrapper for clarity
Usage:
// Create stack of integers
Stack* stk = stack_create(10, sizeof(int), NULL);
// Push elements
for (int i = 0; i < 5; i++) {
stack_push(stk, (u8*)&i);
}
// Pop element
int val;
stack_pop(stk, (u8*)&val);
printf("Popped: %d\n", val);
// Peek at top
stack_peek(stk, (u8*)&val);
printf("Top: %d\n", val);
// Cleanup
stack_destroy(stk);A robust string implementation with automatic memory management.
Key Features:
- Built on top of
genVec - Automatic null termination
- Rich string manipulation API
- Support for both heap and stack allocation
- Substring extraction
- Search operations
Usage:
// Create from C string
String* str = string_from_cstr("Hello");
// Append operations
string_append_cstr(str, " World");
string_append_char(str, '!');
// Insert operations
string_insert_char(str, 5, ',');
string_insert_cstr(str, 6, " beautiful");
// Access and modify
char c = string_at(str, 0);
string_set_char(str, 0, 'h');
// Search
int pos = string_find_cstr(str, "World");
if (pos != -1) {
printf("Found at position %d\n", pos);
}
// Substring
String* sub = string_substr(str, 0, 5);
printf("Substring: %s\n", string_to_cstr(sub));
string_destroy(sub);
// Comparison
String* other = string_from_cstr("hello, beautiful World!");
if (string_equals(str, other)) {
printf("Strings are equal\n");
}
// Cleanup
string_destroy(str);
string_destroy(other);A header-only trie implementation for efficient string storage and prefix matching.
Key Features:
- 29-character alphabet (a-z, apostrophe, hyphen, period)
- Word normalization (lowercase, digit removal, smart apostrophe handling)
- Space-efficient prefix sharing
- Fast prefix-based queries
Usage:
// Create trie
Trie* trie = trie_create();
// Insert words
trie_insert_cstr(trie, "hello");
trie_insert_cstr(trie, "world");
trie_insert_cstr(trie, "help");
// Cleanup
trie_destroy(trie);The library includes a Shakespeare text parser that demonstrates practical usage:
int parse(void) {
// Create hashmap with String keys and int values
hashmap* map = hashmap_create(
sizeof(String),
sizeof(int),
murmurhash3_string,
string_custom_delete,
NULL,
string_custom_compare
);
// Process file and count word frequencies
FILE* f = fopen("shakespeare.txt", "r");
char line[512];
char cleaned[256];
while (fgets(line, sizeof(line), f)) {
char* token = strtok(line, " \n\t\r");
while (token) {
// Clean word (remove numbers, handle contractions)
if (clean_word(token, cleaned, sizeof(cleaned))) {
String str;
string_create_onstack(&str, cleaned);
int count;
if (hashmap_get(map, (u8*)&str, (u8*)&count) == 0) {
count++;
} else {
count = 1;
}
hashmap_put(map, (u8*)&str, (u8*)&count);
}
token = strtok(NULL, " \n\t\r");
}
}
// Query results
String query;
string_create_onstack(&query, "romeo");
int count;
if (hashmap_get(map, (u8*)&query, (u8*)&count) == 0) {
printf("Count of %s: %d\n", string_to_cstr(&query), count);
}
// Cleanup
string_destroy_fromstk(&query);
hashmap_destroy(map);
fclose(f);
return 0;
}The library includes convenient wrapper functions in helper_functions.h:
// Vector helpers
vec_push_int(vec, 42);
int val = vec_get_int(vec, 0);
// HashMap helpers
map_put_strToInt(map, "key", 100);
int value = map_get_strToInt(map, "key");
// HashSet helpers
set_insert_int(set, 42);
if (set_has_int(set, 42)) { /* ... */ }
// Print functions
int_print((u8*)&value);
str_print((u8*)&string);
// Comparison functions
u8 cmp = int_cmp((u8*)&a, (u8*)&b); // Returns 1, 0, or 255
// Conversion functions
String* str = int_to_str((u8*)&value);- Ownership: Data structures take ownership of inserted data
- Custom Deleters: Optional delete functions for cleanup of complex types
- RAII-style: Destroy functions handle all cleanup automatically
- Stack Support: Many structures support stack allocation with separate cleanup
- Uses
uint8_t*(u8*) for generic byte manipulation - Size parameters ensure type safety at runtime
- Function pointers for type-specific operations
- Cache-friendly memory layout with alignment attributes
- Prime number capacities for hash tables
- Dynamic resizing with configurable load factors
- Linear probing for cache locality
- Circular buffer for queue efficiency
gcc -O2 -Wall -Wextra \
main.c \
gen_vector.c \
hashmap.c \
hashset.c \
String.c \
BST.c \
bit_vector.c \
Queue.c \
Stack.c \
-o programgenVec_init()- Create vectorgenVec_init_stk()- Initialize on stackgenVec_push()- Add element to endgenVec_pop()- Remove last elementgenVec_get()/genVec_get_ptr()- Access elementgenVec_insert()/genVec_insert_multi()- Insert operationsgenVec_remove()- Remove at positiongenVec_replace()- Replace elementgenVec_reserve()/genVec_reserve_val()- Pre-allocate capacitygenVec_copy()- Shallow copygenVec_destroy()/genVec_destroy_stk()- Cleanup
hashmap_create()- Create hashmaphashmap_put()- Insert/update key-value pairhashmap_get()- Retrieve value by keyhashmap_del()- Remove key-value pairhashmap_has()- Check key existencehashmap_modify()- Modify value in-place with callbackhashmap_print()- Print all key-value pairshashmap_destroy()- Free all memory
hashset_create()- Create hashsethashset_insert()- Add element (returns 0 if already exists)hashset_has()- Check element existencehashset_remove()- Remove elementhashset_print()- Print all elementshashset_destroy()- Free all memory
bst_create()- Create BSTbst_insert()- Insert elementbst_remove()- Remove elementbst_search()- Search for elementbst_find_min()/bst_find_max()- Find extremabst_preorder()/bst_inorder()/bst_postorder()- DFS traversalsbst_bfs()- Level-order traversalbst_balance()- Balance tree (experimental)bst_destroy()- Free all memory
bitVec_create()- Create bit vectorbitVec_set()- Set bit to 1bitVec_clear()- Set bit to 0bitVec_test()- Test if bit is setbitVec_toggle()- Flip bit valuebitVec_push()- Add set bitbitVec_pop()- Remove last bitbitVec_print()- Print byte contentsbitVec_destroy()- Free memory
queue_create()- Create queueenqueue()- Add element to backdequeue()- Remove element from frontqueue_peek()- View front elementqueue_size()/queue_empty()- Query statequeue_print()- Print contentsqueue_destroy()- Free memory
stack_create()- Create stackstack_push()- Push elementstack_pop()- Pop elementstack_peek()- View top elementstack_size()- Get sizestack_print()- Print contentsstack_destroy()- Free memory
string_create()/string_from_cstr()- Create stringstring_create_onstack()- Initialize on stackstring_append_*()- Append operationsstring_insert_*()- Insert operationsstring_remove_char()- Remove characterstring_compare()/string_equals()- Comparisonstring_find_*()- Search operationsstring_substr()- Extract substringstring_at()/string_set_char()- Character accessstring_len()/string_empty()- Query statestring_destroy()/string_destroy_fromstk()- Cleanup
Key constants in default_functions.h:
#define LOAD_FACTOR_GROW 0.70 // Resize up threshold
#define LOAD_FACTOR_SHRINK 0.20 // Resize down threshold
#define HASHMAP_INIT_CAPACITY 17 // Initial capacity (prime)Key constants for growth/shrink in gen_vector.c:
#define GROWTH 1.5 // Growth multiplier
#define SHRINK_AT 0.25 // Shrink when 25% full
#define SHRINK_BY 0.5 // Shrink to 50% capacitytypedef enum {
EMPTY = 0, // Bucket never used
FILLED = 1, // Bucket contains valid data
TOMBSTONE = 2 // Bucket previously used, now deleted
} STATE;#define PARENT(i) (((i) - 1) / 2)
#define L_CHILD(i) ((2 * (i)) + 1)
#define R_CHILD(i) ((2 * (i)) + 2)The parser includes advanced word normalization:
- Removes all digits
- Handles contractions (keeps apostrophes in words like "don't")
- Removes possessive forms (e.g., "Shakespeare's" → "shakespeare")
- Handles UTF-8 curly quotes
- Case-insensitive processing
- BST Balance: The
bst_balance()function is marked as experimental and may have correctness issues - Queue Shrinking: Queue does not shrink to optimize for performance
- String Helper Functions: Some string vector helpers in
helper_functions.hhave memory management issues (see WARN/TODO comments)