documentation (generated by doing ./make_test.sh 8): HTML
libsrt has been included into Paul Hsieh's String Library Comparisons table. What a great honor! :-)
libsrt: Safe Real-Time library for the C programming language
libsrt is a C library that provides string, vector, bit set, set, and map handling. It's been designed for avoiding explicit memory management when using dynamic-size data structures, allowing safe and expressive code, while keeping high performance. It is also suitable for low level and hard real-time applications, because functions are predictable in both space and time (asuming OS and underlying C library is also real-time suitable).
- Easy: write high-level-like C code. Write code faster and safer.
- Fast: using O(n)/O(log n)/O(1) state of the art algorithms.
- Useful: strings supporting raw data handling (per-byte), vector, tree, and map structures.
- Efficient: space optimized, minimum allocation calls (heap and stack support).
- Compatible: OS-independent (e.g. built-in space-optimized UTF-8 support).
- Predictable: suitable for microcontrollers and hard real-time compliant code.
- Unicode: although string internal representation is raw data (bytes), functions for handling Unicode interpretation/generation/transformation are provided, so when Unicode-specific functions are used, the result of these functions is stored internally as UTF-8 (also, caching some operations, like Unicode string length -e.g. ss_len()/ss_size() give length in bytes, and ss_len_u() the length in Unicode characters-).
How to build
In most POSIX systems "make" will be enough, as the Makefile includes platform detection for corner cases. However, you can use multiple flags for tuning the build (you can also mix them):
- make # Defaults (equivalent to make ADD_CFLAGS="-DS_CRC32_SLC=12")
- make -j 8 # Make, spawning up to 8 concurrent jobs (faster on multiprocessor systems)
- make DEBUG=1 # Disable optimizations (-O0 -ggdb)
- make PROFILING=1 # Profiling and coverage flags
- make CC=gcc PEDANTIC=1 C99=1 # Pedantic build for GCC in C99 mode (this should give no warnings)
- make CC=tcc # Use the Tiny C Compiler
- make CC=gcc CXX=g++ # Use gcc for the library and examples and g++ for the benchmark
- make CC=clang CXX=clang++ # Use CLang for the library and examples and CLang C++ for the benchmark
- make FORCE32=1 # Force 32 bit build on 64 bit system
- make MINIMAL=1 # Microcontroller-suitable build (optimize for size and low memory usage)
- make CC=gcc PROFILING=1 # Build with gcc and profiling
- make CC=gcc C90=1 # Build with gcc using C89/90 standard
- make CC=gcc C99=1 # Build with gcc using C99 standard
- make CC=gcc C11=1 # Build with gcc using C11 standard
- make CC=g++ # Build with g++ (C++ instead of C)
- make CC=clang++ CPP11=1 # Build with clang++ using C++11 standard (instead of C)
- make CC=clang CXX=clang++ C99=1 CPP11=1 # Build with clang using C99 mode and build the benchmark using C++11
- make CC=tcc DEBUG=1 # Build with TinyCC with debug symbols
- make CC=powerpc-linux-gnu-gcc # Build with gcc cross compiler (PPC)
- make CC=arm-linux-gnueabi-gcc # Build with gcc cross compiler (ARM)
- make ADD_CFLAGS="-DS_CRC32_SLC=0" # Build without CRC32 hash tables, 1 bit/loop (100MB/s on i5@3GHz)
- make ADD_CFLAGS="-DS_CRC32_SLC=1" # Build with CRC32 1024 byte hash table, 1 byte/loop (400MB/s on i5@3GHz)
- make ADD_CFLAGS="-DS_CRC32_SLC=4" # Build with CRC32 4096 byte hash table, 4 bytes/loop (1000MB/s on i5@3GHz)
- make ADD_CFLAGS="-DS_CRC32_SLC=8" # Build with CRC32 8192 byte hash table, 8 bytes/loop (2000MB/s on i5@3GHz)
- make ADD_CFLAGS="-DS_CRC32_SLC=12" # Build with CRC32 12288 byte hash table, 12 bytes/loop (2500MB/s on i5@3GHz) -this is the default CRC32 mode-
- make ADD_CFLAGS="-DS_CRC32_SLC=16" # Build with CRC32 16384 byte hash table, 16 bytes/loop (2700MB/s on i5@3GHz):
- make ADD_FLAGS=-DSD_DISABLE_HEURISTIC_GROWTH # Build with growth heuristics disabled (not recommended)
- make ADD_FLAGS=-DS_DISABLE_SM_STRING_OPTIMIZATION # Build without map string optimizations (not recommended, except for benchmarking)
- make HAS_PNG=1 HAS_JPG=1 # Build enabling PNG and JPG usage so the 'imgc' example can convert import/export those formats (libpng and jpeg 6b -e.g. libjpegturbo- compatible dev libs and headers must be installed in the system)
- Every make call, in addition to building the targets, it does a full test for that build (unit tests covering all the API function calls -'stest' executable-)
- Between make calls with different parameters, please use "make clean"
- It should build with any GCC version >= 2.95.2 (1999), in any hardware platform (x86, x86-64, MIPS32/64, MIPSLE, PPC/PPC64, ARM, etc.)
- CLang should work in all versions (except in cases of reporting wrong/missing compiler or C standard version)
- For Windows it is provided a .sln example just for building the test (used for the CI check)
- In BSD systems not using GNU Make as default, use gmake instead of make
For launching the extensive tests:
- ./make_test.sh # All tests (used in the CI validation): all builds (23 * 4 tests), Valgrind memcheck, CLang static analyzer, documentation generation and validation, coding style check
- ./make_test.sh 1 # Validate all available C/C++ builds
- ./make_test.sh 2 # Valgrind memcheck
- ./make_test.sh 4 # Clang static analyzer
- ./make_test.sh 8 # Generate documentation
- ./make_test.sh 16 # Check coding style
- ./make_test.sh 24 # Like 8 plus 16 (3 would be like 1 plus 2, etc.)
Ease of use
- Use strings, vectors, maps, sets, and bit sets in a similar way to higher level languages.
- Dynamic one-block linear addressing space.
- Internal structures use indexes instead of pointers (i.e. similar memory usage in both 32 and 64 bit mode).
- Details: doc/benchmarks.md
- Buffer direct access
- Preallocation hints (reducing memory allocation calls)
- Heap and stack memory allocation support
- Details: doc/benchmarks.md
Predictable (suitable for hard and soft real-time)
- Predictable execution speed: all API calls have documented time complexity. Also space complexity, when extra space involving dynamic memory is required.
- Hard real-time: allocating maximum size (strings, vector, trees, map) will imply calling 0 or 1 times to the malloc() function (C library). Zero times if using the stack as memory or externally allocated memory pool, and one if using the heap. This is important because malloc() implementation has both memory and time overhead, as internally manages a pool of free memory (which could have O(1), O(log(n)), or even O(n), depending on the compiler provider C library implementation).
- Soft real-time: logarithmic time memory allocation (when not doing preallocation): for 10^n new elements just log(n) calls to the malloc() function. This is not a guarantee for hard real time, but for soft real time (being careful could provide almost same results, except in the case of very poor malloc() implementation in the C library being used -not a problem with modern compilers-).
RAM, ROM, and disk operation
- Data structures can be stored in ROM memory.
- Data structures are suitable for memory mapped operation, and disk store/restore. This is currently true for strings, vectors, and bit sets. For maps, only when using integer data (as currently strings inside a tree are external references -this will be fixed in the future, adding the possibility of in-map arbitrary fixed data, allowing e.g. fixes size strings inside a map-)
Known edge case behavior
- Allowing both "carefree code" and per-operation error check. I.e. memory errors and UTF8 format error can be checked after every operation.
- Simple internal structures, with linear addressing. It allows to reduce memory allocation calls to the minimum (using the stack it is possible to avoid heap usage).
- Implementation kept as simple as possible/known.
- Focus in avoiding code-bloat, avoiding expensive code duplication/inlining (inlining is used when being "free" or because of big speed impact)
- Small code, suitable for static linking. In fact, currently no dynamic linking is provided (will be added, but it is not a priority).
- Coverage, profiling, and memory leak/overflow/uninitialized checks (Valgrind, CLang static analysis, Coverity, MS VS).
- C99 (and later) compatible (requiring alloca() function)
- C++11 (and later) compatible (requiring alloca() function)
- Endian-safe: the library works with any CPU endianess (tested with little and big endian, but it should work with other modes, e.g. with PDP endianess -not tested-)
- POSIX builds (Linux, BSD's) require GNU Make (e.g. on FreeBSD use 'gmake' instead of 'make')
- Compatibility with old C compilers:
- Oldest GNU C compiler being tested is GCC 2.95.2 (1999)
- Other old compilers may work in C89 mode only if following language extensions are available:
- __VA_ARGS__ macro
- type of bit-field in 'struct'
- %S printf extension (only for unit testing)
- Compatibility with old C++ compilers:
- It may work in C++98 mode only if following language extensions are available:
- Anonymous variadic macros
- Long long integer constant support (LL)
- It may work in C++98 mode only if following language extensions are available:
Double pointer usage: because of using just one allocation, write operations require to address a double pointer, so in the case of reallocation the source pointer could be changed.
Concurrent read-only operations are safe, but concurrent read/write must be protected by the user (e.g. using mutexes or spinlocks). That can be seen as a disadvantage or as a "feature" (it is faster).
String-specific advantages (srt_string)
- Binary data support
- I.e. strings can have zeros in the middle
- Search and replace into binary data is supported
- Unicode support
- Although strings internal storage is binary, Unicode-aware functions store data in UTF-8.
- Search and replace into UTF-8 data is supported
- Full and fast Unicode lowercase/uppercase support without requiring "setlocale" nor hash tables.
- Efficient raw and Unicode (UTF-8) handling. Unicode size is tracked, so resulting operations with cached Unicode size, will keep that, keeping the O(1) for getting that information afterwards.
- Find/search: O(n), one pass.
- Replace: O(n), one pass. Worst case overhead is limited to a realloc and a copy of the part already processed.
- Concatenation: O(n), one pass for multiple concatenation. I.e. Optimal concatenation of multiple elements require just one allocation, which is computed before the concatination. When concatenating ss_t strings the allocation size compute time is O(1).
- Resize: O(n) for worst case (when requiring reallocation for extra space. O(n) for resize giving as cut indicator the number of Unicode characters. O(1) for cutting raw data (bytes).
- Case conversion: O(n), one pass, using the same input if case conversion requires no allocation over current string capacity. If resize is required, in order to keep O(n) complexity, the string is scanned for computing required size. After that, the conversion outputs to the secondary string. Before returning, the output string replaces the input, and the input becomes freed.
- Avoid double copies for I/O (read access, write reserve)
- Avoid re-scan (e.g. commands with offset for random access)
- Transformation operations are supported in all dup/cpy/cat functions, in order to both increase expressiveness and avoid unnecessary copies (e.g. tolower, erase, replace, etc.). E.g. you can both convert to lower a string in the same container, or copy/concatenate to another container.
- Using just 4 byte overhead for strings with size <= 255 bytes
- Using sizeof(size_t) * 5 byte overhead for strings with size >= 256 bytes (e.g. 20 bytes for a 32-bit CPU, 40 for 64-bit)
- Data structure has no pointers, i.e. just one allocation is required for building a string. Or zero, if using the stack.
- No additional memory allocation for search.
- Extra memory allocation may be required for: UTF-8 uppercase/lowercase and replace.
- Strings can grow from 0 bytes to ((size_t)~0 - metainfo_size)
- String operations
- Copy, cat, tolower/toupper, find, split, printf, cmp, base64, data compression, crc32 on buffers, etc.
- All string operations allow C strings and raw buffers as input, without extra copies (ss_[c]refa functions)
- Allocation, buffer pre-reserve,
- Raw binary content is allowed, including 0's.
- "Wide char" and "C style" strings R/W interoperability support.
- I/O helpers: buffer read, reserve space for async write
- Aliasing suport, e.g. ss_cat(&a, a) is valid
- Misc string/buffer operations:
- Real-time O(n) data compression (stateless, unlimited buffer size, and hash table resource usage proportional to the input size, i.e. efficient also for small inputs)
- State of the art encodings: base64, hexadecimal, etc. (at GB/s speeds)
- State of the art CRC32 and Adler32 hashes on strings (at >2 GB/s speeds)
- Focus on reducing verbosity:
- ss_cat(&t, s1, ..., sN);
- ss_cat(&t, s1, s2, ss_printf(&s3, "%i", cnt), ..., sN);
- ss_free(&s1, &s2, ..., &sN);
- Expressive code without explicit memory handling
- Focus on reducing errors, e.g.
- If a string operation fails, the string is kept in the last successful state (e.g. ss_cat(&a, b, huge_string, other))
- String operations always return valid strings, e.g. This is OK: srt_string *s = NULL; ss_cpy_c(&s, "a"); Same behavior as: srt_string *s = ss_dup_c("a");
- ss_free(&s1, ..., &sN); (no manual set to NULL is required)
- No reference counting support. Rationale: simplicity.
Vector-specific advantages (srt_vector)
- Variable-length concatenation and push functions.
- Allow explicit size for allocation (8, 16, 32, 64 bits) with size-agnostic generic signed/unsigned functions (easier coding).
- Allow variable-size generic element.
- O(n) for 8-bit elements (counting sort algorithm), much faster than GNU/Clang qsort() (C), and up to 5x faster than GNU/Clang std::vector sort (C++)
- O(n log n) -pseudo O(n)- for 16/32/64-bit elements (in-place MSD binary radix sort algorithm), 2x-3x faster than GNU/Clang qsort() (C), performing similar to GNU/Clang std::vector sort (C++)
- O(n log n) for generic elements using the C library (qsort())
- No insert function. Rationale: insert is slow (O(n)). Could be added, if someone asks for it.
Map-specific advantages (srt_map)
- Abstraction over Red-Black tree implementation using linear memory pool with just 8 byte per node overhead, allowing up to (2^32)-1 nodes (for both 32 an 64 bit compilers). E.g. one million 32 bit key, 32 bit value map will take just 16MB of memory (16 bytes per element -8 byte metadata, 4 + 4 byte data-).
- Keys: integer (8, 16, 32, 64 bits) and string (ss_t)
- Values: integer (8, 16, 32, 64 bits), string (ss_t), and pointer
- O(1) for allocation
- O(1) for deleting maps without strings (one or zero calls to 'free' C function)
- O(n) for deleting maps with strings (n + one or zero calls to 'free' C function)
- O(n) for map copy (in case of maps without strings, would be as fast as a memcpy())
- O(log n) insert, search, delete
- O(n) sorted enumeration (amortized O(n log n))
- O(n) unsorted enumeration (faster than the sorted case)
- O(n) copy: tree structure is copied as fast as a memcpy(). For map types involving strings, additional allocation is used for duplicating strings.
- Because of being implemented as a tree, it is slower than a hash-map, on average. However, in total execution time is not that bad, as because of allocation heuristics a lot of calls to the allocator are avoided.
- There is room for node deletion speed up (currently deletion is a bit slower than insertion, because of an additional tree search used for avoiding having memory fragmentation, as implementation guarantees linear/compacted memory usage, it could be optimized with a small LRU for cases of multiple delete/insert operation mix).
|ISA||Word size||Endianess||Unaligned memory access HW support||OS||Compilers||Code analysis||Test coverage|
|x86, x86-64 (Core i5)||32, 64||little||yes||Linux Ubuntu 12.04/14.04/17.10||gcc, g++, tcc, clang, clang++||Valgrind, clang, Coverity||Travis CI (automatic, every public commit)|
|x86, x86-64 (Core i5)||32, 64||little||yes||Windows||Visual Studio Express 2013, AppVeyor's VS||VS||AppVeyor (automatic, every public commit)|
|x86, x86-64 (Core i5)||32, 64||little||yes||FreeBSD 10.2||gcc, g++, clang, clang++||Valgrind clang||manual|
|x86, x86-64 (Core2Duo)||32, 64||little||yes||Darwin 11.4.2||gcc, g++, clang, clang++||none||manual|
|ARMv5 (ARM926EJ-S)||32||little||no||Arch Linux||gcc, g++, clang, clang++||none||manual|
|ARMv5 (ARM926EJ-S)||32||little||no||Linux Debian 7.0 "Wheezy"||gcc, g++, clang, clang++||none||manual|
|ARMv5 (Feroceon)||32||little||no||Linux Debian 7.0 "Wheezy"||gcc, g++||none||manual|
|ARMv6 (ARM1176JZF-S)||32||little||yes||Linux Raspbian||gcc, g++, clang, clang++||Valgrind, clang||manual|
|ARMv7-A (Krait 400)||32||little||yes||Linux Android 5.1.1 + BusyBox||gcc, g++||none||manual|
|ARMv8-A (Cortex A53)||64||little||yes||Debian 8.5 "Jessie"||gcc, g++, clang, clang++||Valgrind, clang||manual|
|ARMv8-A (MSM8996)||64||little||yes||Linux Android 7.1.1||clang, clang++||clang||manual|
|MIPS, MIPS64 (Octeon)||32, 64||big||yes||EdgeOS v1.6.0 (Linux Vyatta-based using Debian 7 "Wheezy" packages)||gcc, g++, clang, clang++||Valgrind, clang||manual|
|MIPS (EE R5900)||32||little||no||Playstation 2 Linux (Red Hat based)||gcc, g++ 2.95.2||none||manual|
|PowerPC (G4)||32||big||yes||Linux Ubuntu 12.04||gcc, g++||none||manual|
|PowerPC, PPC64 (Cell)||32, 64||big||yes||Yellow Dog Linux 6.2 (Red Hat based)||gcc, g++ 4.1.2||none||manual|
Copyright (c) 2015-2018, F. Aragon. All rights reserved. Released under the BSD 3-Clause License (see the LICENSE file included).
email: faragon.github (GMail account, add @gmail.com)
Beta. API still can change: suggestions are welcome.
"to do" list
Acknowledgements and references