A generic and memory-safe C implementation of a hash map (sometimes called a dictionary or associative array) with separate chaining for collision handling and automatic resizing when a specified load factor is exceeded. This implementation is inspired by Go’s map in the sense that it supports arbitrary key and value types by storing them as raw bytes (void * plus a size).
- Features
- How It Works
- Usage
- API Reference
- Default Hash & Equality
- Custom Hash & Equality
- Example Program
- License
-
Generic Key/Value
- Store any type of data as the key or value by passing a pointer and size.
-
Separate Chaining
- Uses a linked list (chain) to handle collisions in each bucket.
-
Automatic Resizing
- When the number of stored pairs exceeds
capacity * load_factor, the map capacity is doubled to maintain efficient lookups.
- When the number of stored pairs exceeds
-
Default or Custom Hash/Equality
- A default hashing function and equality check (byte-wise comparison) are provided.
- You can also pass your own for specialized data types.
-
Shallow Copy
- The map copies the raw bytes of your key/value. If you store pointers, you must manage deeper memory yourself.
The load factor is a ratio that defines how “full” the map is allowed to become before a resize is triggered. It is calculated as:
current_size / capacity
- Default:
0.75f(or 75%) - When the size-to-capacity ratio exceeds
0.75, the map doubles its capacity and re-hashes existing entries into the new buckets.
Each bucket in the hash map array points to a linked list (chain) of entries. When two keys hash to the same bucket index:
- The new entry is placed at the head of the linked list.
- During a lookup or insertion, we traverse this linked list and compare keys to find the correct entry.
When the load factor is exceeded:
- The map doubles its capacity.
- All existing entries are re-hashed into the new buckets, preserving key-value associations.
- This usually brings the load factor back below the threshold to maintain O(1) amortized performance.
-
Clone this repository:
git clone https://github.com/araujo88/chashmap.git cd chashmap -
Build with
make:make
- By default, this creates an example binary named
chashmap_examplein the project root.
- By default, this creates an example binary named
-
Run the example:
./chashmap_example
You should see output demonstrating inserts, lookups, and removals.
- Copy the
include/chashmap.hheader andsrc/chashmap.cfile into your project, or simply add this repo as a submodule. - Ensure you include
chashmap.hin any source file that calls the hash map functions. - Link or compile
chashmap.calongside your code.
Below is a brief summary of the major functions. For detailed usage, see chashmap.h or the Example Program.
int hashmap_init(HashMap *map,
size_t capacity,
hash_func_t hash_func,
eq_func_t eq_func,
float load_factor);map: YourHashMapstruct (will be initialized).capacity: Initial number of buckets (use0to let the library choose a default).hash_func: Custom hash function (passNULLto use default).eq_func: Custom equality function (passNULLto use default).load_factor: Trigger resize threshold (<= 0=> default of0.75f).- Returns
0on success, non-zero on error.
int hashmap_insert(HashMap *map,
const void *key_data, size_t key_size,
const void *val_data, size_t val_size);- If the key already exists, the value is updated.
- If the key does not exist, a new entry is created.
- Memory: The map allocates and stores its own copies of
key_dataandval_data.
int hashmap_get(const HashMap *map,
const void *key_data, size_t key_size,
void **out_val, size_t *out_size);- Retrieves the value for a given key, if it exists.
- If
out_valandout_sizeare provided, a copy of the value is allocated and returned. - Caller must
free(*out_val)once done reading it. - Returns:
1if found0if not found< 0on error (e.g., invalid arguments)
int hashmap_remove(HashMap *map, const void *key_data, size_t key_size);- Removes the key/value if found.
- Returns:
1if removed0if not found< 0on error
void hashmap_destroy(HashMap *map);- Frees all buckets and entries, as well as keys and values stored within those entries.
- After calling, the
mapcan be reused only after callinghashmap_initagain.
- Default Hash: A variant of Jenkins' one-at-a-time hash.
- Default Equality: A byte-wise comparison via
memcmp.
These defaults are appropriate for most primitive and plain-old-data (POD) structures where a simple memory comparison is valid.
If you have keys that aren’t trivially comparable by bytes, or you need a more specific notion of equality, you can define custom functions:
typedef uint64_t (*hash_func_t)(const void *key_data, size_t key_size);
typedef int (*eq_func_t)(const void *key_a, const void *key_b, size_t key_size);- Hash Function: Must produce a 64-bit integer for any given key data.
- Equality Function: Must return non-zero if keys are equal,
0otherwise.
Suppose you have:
struct Point {
int x, y;
};You might define:
uint64_t custom_point_hash(const void *data, size_t size) {
// For demonstration, a simple polynomial combination:
// hash = x + 31 * y
if (size < sizeof(int) * 2) return 0; // safety check
const int *xy = (const int *)data;
return (uint64_t)xy[0] + 31ULL * (uint64_t)xy[1];
}
int custom_point_eq(const void *data1, const void *data2, size_t size) {
// Compare x and y
const int *xy1 = (const int *)data1;
const int *xy2 = (const int *)data2;
return (xy1[0] == xy2[0]) && (xy1[1] == xy2[1]);
}Then initialize your map with these functions:
HashMap map;
hashmap_init(&map, 16, custom_point_hash, custom_point_eq, 0.75f);
// usage...Below is a simple example that demonstrates inserting and retrieving different types of keys and values:
#include <stdio.h>
#include "chashmap.h"
// Example usage
int main(void)
{
HashMap map;
if (hashmap_init(&map, 0, NULL, NULL, 0.0f) != 0)
{
fprintf(stderr, "Failed to initialize HashMap.\n");
return 1;
}
// Insert int -> double
int key_int = 42;
double val_double = 3.14159;
hashmap_insert(&map, &key_int, sizeof(key_int), &val_double, sizeof(val_double));
// Insert string -> string
const char *key_str = "hello";
const char *val_str = "world";
hashmap_insert(&map, key_str, strlen(key_str) + 1, val_str, strlen(val_str) + 1);
// Insert custom struct as key
struct Point
{
int x, y;
} p = {10, 20};
const char *val_for_struct = "a point";
hashmap_insert(&map, &p, sizeof(p), val_for_struct, strlen(val_for_struct) + 1);
// Lookup int -> double
void *out_val = NULL;
size_t out_size = 0;
int found = hashmap_get(&map, &key_int, sizeof(key_int), &out_val, &out_size);
if (found == 1)
{
double retrieved = *(double *)out_val;
printf("Retrieved value for key %d is %f\n", key_int, retrieved);
free(out_val); // Must free after retrieval
}
// Lookup string -> string
found = hashmap_get(&map, key_str, strlen(key_str) + 1, &out_val, &out_size);
if (found == 1)
{
printf("Retrieved value for key \"%s\" is \"%s\"\n", key_str, (char *)out_val);
free(out_val);
}
// Remove int -> double
int removed = hashmap_remove(&map, &key_int, sizeof(key_int));
printf("Removed key %d: %s\n", key_int, (removed == 1) ? "true" : "false");
// Cleanup
hashmap_destroy(&map);
return 0;
}Compile and run the example to see output indicating successful storage, retrieval, and removal.
This project is provided under the MIT License, which permits reuse within proprietary projects. See LICENSE for details.
Feel free to tailor this hash map to suit your specific performance, memory, or customization requirements. If you encounter any issues or want to suggest improvements, please open an issue or submit a pull request. Happy coding!