A library for encoding and decoding HHC (hexahexacontadecimal) for 32-bit and 64-bit integers, with C++ and Python bindings.
Inspired by hexahexacontadecimal by Alexander Ljungberg.
This project is intended to be an exploration in creating performant, secure, and portable algorithms, while also being a learning experience for me. Take it with a grain of salt while the project works toward full production.
Why?
The base66 alphabet consists of all the URL unreserved characters (see RFC 3986). This makes it useful for creating short, human-readable URLs for things like short links, tracking codes, or other identifiers.
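As a minimal sketch, the 66 unreserved characters of RFC 3986 (letters, digits, and `-`, `.`, `_`, `~`) can be collected as shown here. The ASCII sort order is an assumption about this library's alphabet, not something this README states, so check the source for the authoritative ordering.

```python
import string

# RFC 3986, section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = string.ascii_letters + string.digits + "-._~"

# Assumption: digits are assigned in ASCII code point order.
ALPHABET = "".join(sorted(UNRESERVED))

print(len(ALPHABET))  # 66
print(ALPHABET)       # -.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~
```

Because every character is unreserved, an encoded value can be placed in a URL path or query string without percent-encoding.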
If encoding information in a URL is important, base66 outperforms base64 in information density. However, base66 is much slower to encode and decode. This project aims to provide optimized implementations that attempt to compete with base64 in speed for integer-sized operations.
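To put a number on the density claim: each base66 character carries log2(66) ≈ 6.044 bits, against exactly 6 bits for base64. The sketch below is illustrative arithmetic only; `chars_needed` is a hypothetical helper and not part of this library.

```python
import math

def chars_needed(bits: int, base: int) -> int:
    """Smallest number of base-`base` digits that can hold any `bits`-bit value."""
    max_value = (1 << bits) - 1
    count = 1
    capacity = base  # number of distinct values representable with `count` digits
    while capacity <= max_value:
        capacity *= base
        count += 1
    return count

for base in (64, 66):
    # Columns: base, bits per character, characters for 32-bit, characters for 64-bit
    print(base, round(math.log2(base), 4), chars_needed(32, base), chars_needed(64, base))
# 64 6.0 6 11
# 66 6.0444 6 11
```

The per-character gain is small, so it shortens the output only at larger widths; for 32-bit and 64-bit values both bases need 6 and 11 characters respectively.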
In theory, base66 will always underperform when compared against base64, because the naive algorithm has O(n^2) time complexity and inherently requires many modulo operations.
Procedural approaches are currently being explored, but are not yet implemented.
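For reference, the naive algorithm is repeated division by 66, taking each remainder as the next digit. For an arbitrary-precision input of n digits, each division costs O(n) and there are O(n) of them, which is where the quadratic bound comes from; for fixed 32-bit or 64-bit inputs the loop runs at most 6 or 11 times. This is a plain-Python sketch, not the library's implementation. It assumes an ASCII-ordered alphabet, and its choice to encode zero as a single `-` is also an assumption.

```python
# Assumed alphabet: the 66 RFC 3986 unreserved characters in ASCII order.
ALPHABET = "-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~"

def naive_encode(value: int) -> str:
    """Encode a non-negative integer in base 66, without padding."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value > 0:
        # One division and one modulo per output character.
        value, remainder = divmod(value, 66)
        digits.append(ALPHABET[remainder])
    # Digits come out least significant first, so reverse them.
    return "".join(reversed(digits))

print(naive_encode(424242))      # .TNv
print(naive_encode(9876543210))  # 5tVfK4
```

Unlike base64, 66 is not a power of two, so the digits cannot be extracted with shifts and masks; each one depends on a division of the remaining value.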
Dirty Benchmark
(Ryzen 9 5950X) at v1.0.8, clang 20.1.8:
Running ./benchmarks/hhc_benchmarks
Run on (32 X 5086.18 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x16)
L1 Instruction 32 KiB (x16)
L2 Unified 512 KiB (x16)
L3 Unified 32768 KiB (x2)
Load Average: 2.13, 1.53, 1.38
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------
BM_hhc32BitEncodePadded 3.64 ns 3.63 ns 191195224
BM_hhc32BitEncodeUnpadded 10.7 ns 10.7 ns 65552301
BM_hhc64BitEncodePadded 7.55 ns 7.52 ns 93689754
BM_hhc64BitEncodeUnpadded 13.8 ns 13.8 ns 51108440
BM_hhc32BitDecodeUnsafe 2.26 ns 2.25 ns 298530751
BM_hhc32BitDecodeSafe 5.03 ns 5.01 ns 100000000
BM_hhc64BitDecodeUnsafe 4.86 ns 4.85 ns 179958209
BM_hhc64BitDecodeSafePadded 8.32 ns 8.29 ns 83133464
BM_hhc64BitDecodeSafeUnpadded 9.25 ns 9.22 ns 75757135
BM_hhcValidateString32 2.82 ns 2.81 ns 260687294
BM_hhcValidateString64 6.96 ns 6.94 ns 101104402
HM_rand32Bit 4.77 ns 4.76 ns 146694025
BM_Permuted32Next 0.701 ns 0.699 ns 997647405
As you can see, the performance is not quite there yet. Additionally, the benchmarks still need to be segmented by performance case, such as unaligned versus aligned access, or strings of differing lengths.
Quick Start
Python
# Install from PyPI
pip install k-hhc
# Verify
python -c "import k_hhc; print(k_hhc.encode_padded_32bit(629717763))"
C++
# Clone the repository
git clone https://github.com/kirbyevanj/k-hhc.git
cd k-hhc
# Basic build: produces library, tests, and examples
mkdir build && cd build
cmake ..
cmake --build .
# Run the tests
ctest --output-on-failure
# Run the benchmarks
./benchmarks/hhc_benchmarks
# Run the examples
./examples/hhc_encode_example
./examples/hhc_decode_example
Building the Project
Prerequisites
- CMake 3.20 or higher
- C++17 compatible compiler (GCC 7+, Clang 5+, or MSVC 19.14+ / VS 2017 15.7+)
- Python 3.6+ and pybind11 (optional, for Python bindings)
Build Instructions
Basic Build
# Clone the repository
git clone https://github.com/kirbyevanj/k-hhc.git
cd k-hhc
# Create build directory
mkdir build && cd build
# Configure the project (defaults to Release build)
cmake ..
# Or specify build type explicitly
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
# Build all targets
cmake --build .
Build with Python Bindings
# Clone the repository
...
# Build with Python bindings
cmake -DHHC_BUILD_PYTHON=ON ..
cmake --build .
...
# Go to the python directory
cd ../python
# Build the Python package
pip install .
# Test the module
PYTHONPATH=./python python3 -c "import k_hhc; print(k_hhc.encode_padded_32bit(629717763))"
Build with Code Coverage
# Configure with coverage enabled (requires Clang)
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DHHC_ENABLE_COVERAGE=ON \
..
# Build the project
cmake --build .
# Run tests to generate coverage data
ctest --output-on-failure
# Generate coverage report
cmake --build . --target coverage
# View the HTML report
xdg-open coverage/html/index.html # Linux
# or
open coverage/html/index.html # macOS
Build with Fuzzing
# Configure with fuzzing enabled (requires Clang)
mkdir build && cd build
cmake -DHHC_ENABLE_FUZZING=ON \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
..
# Build fuzzing targets
cmake --build .
# Run a fuzzer
./fuzz/hhc_fuzz_decode32 -max_total_time=60
CMake Configuration Options
| Option | Default | Description |
| --- | --- | --- |
| HHC_ENABLE_COVERAGE | OFF | Enable LLVM code coverage instrumentation. Requires Clang compiler. Adds a coverage target that generates HTML reports. |
| HHC_BUILD_PYTHON | OFF | Build Python bindings using pybind11. Requires Python 3.6+ and pybind11. |
| HHC_ENABLE_FUZZING | OFF | Build libFuzzer targets for fuzzing. Requires Clang compiler with fuzzing support. |
| CMAKE_BUILD_TYPE | Release | Build type: Debug, Release, RelWithDebInfo, or MinSizeRel. |
| CMAKE_C_COMPILER | (system default) | C compiler to use (e.g., clang, gcc). |
| CMAKE_CXX_COMPILER | (system default) | C++ compiler to use (e.g., clang++, g++). |
Running Benchmarks
# From the build directory
./benchmarks/hhc_benchmarks
API Reference
External Dependencies
Dependencies are managed via CMake's ExternalProject_Add and will be automatically downloaded and built.
Python Bindings
Building from Source with CMake
mkdir build && cd build
cmake -DHHC_BUILD_PYTHON=ON ..
cmake --build .
# Test the module
PYTHONPATH=./python python3 -c "import k_hhc; print(k_hhc.encode_padded_32bit(42))"
Python API
import k_hhc
encoded = k_hhc.encode_padded_32bit(424242)
encoded = k_hhc.encode_unpadded_32bit(424242)
decoded = k_hhc.decode_32bit(".TNv")
encoded = k_hhc.encode_padded_64bit(9876543210)
encoded = k_hhc.encode_unpadded_64bit(9876543210)
decoded = k_hhc.decode_64bit("5tVfK4")
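To make the calls above concrete, here is a pure-Python model of validated decoding and of padding. It is a sketch under stated assumptions, not the library's implementation: the names `decode` and `pad` are hypothetical, the alphabet is assumed to be ASCII-ordered, the padding character is assumed to be the zero digit `-`, and the padded widths of 6 and 11 are the smallest lengths that can hold any 32-bit and 64-bit value.

```python
# Assumed alphabet: the 66 RFC 3986 unreserved characters in ASCII order.
ALPHABET = "-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def decode(text: str, bits: int) -> int:
    """Decode base 66 text, rejecting foreign characters and overflow."""
    value = 0
    for ch in text:
        if ch not in INDEX:
            raise ValueError(f"invalid character: {ch!r}")
        # Horner's method: shift the accumulated value by one digit, add the next.
        value = value * 66 + INDEX[ch]
    if value >= 1 << bits:
        raise ValueError("value does not fit in the requested width")
    return value

def pad(text: str, width: int) -> str:
    """Left-pad with the zero digit, which leaves the decoded value unchanged."""
    return text.rjust(width, ALPHABET[0])

print(decode(".TNv", 32))          # 424242
print(decode(pad(".TNv", 6), 32))  # 424242
print(decode("5tVfK4", 64))        # 9876543210
```

Under these assumptions the two strings in the API example decode to 424242 and 9876543210, the same values passed to the encode calls. The character check and the range check are the kind of work that would separate a validating decoder from an unchecked one.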
See python/examples/ for more detailed examples.