15 Apr 2019 • C++ tricks: compile time type IDs

Here's a little trick for getting a unique ID for each type in your codebase, entirely at compile time.

typeid

C++ has typeid, which is garbage. It works like this:

#include <stdio.h>
#include <typeinfo>

int main() {
	constexpr const std::type_info & a = typeid( int );
	printf( "%zu\n", a.hash_code() );
	return 0;
}

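and compiles to something like this (the listing is for a struct named A rather than int, hence the _ZTS1A symbol and the string length of 2)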

leaq    _ZTS1A(%rip), %rdi
movl    $3339675911, %edx
movl    $2, %esi
call    _ZSt11_Hash_bytesPKvmm@PLT

which looks not awesome. If we go look at gcc's implementation of hash_code we get

size_t hash_code() const noexcept {
	return _Hash_bytes(name(), __builtin_strlen(name()), static_cast<size_t>(0xc70f6907UL));
}

so this does a string hash at runtime every time you want to get the ID.

All of this could be trivially done at compile time and work fine with -fno-rtti, but it's C++ so they picked the absolute most useless implementation instead.
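
If you're stuck with typeid anyway, one stopgap is to pay for the string hash once per type and cache the result. A minimal sketch, assuming RTTI is enabled; CachedTypeHash is a made-up name, not anything from the standard library:

```cpp
#include <stddef.h>
#include <typeinfo>

// Hypothetical helper: hash_code() runs on the first call for each T,
// later calls only read the function-local static
template< typename T >
size_t CachedTypeHash() {
	static const size_t hash = typeid( T ).hash_code();
	return hash;
}
```

CachedTypeHash< int >() returns the same value as typeid( int ).hash_code(). It still needs RTTI and the static comes with a thread-safe init guard, so it's a band-aid rather than a fix.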

DIY

An actual solution is to use a constexpr hash function to hash the type name as it's written in the source, and (optionally) a little template to ensure the argument is actually a type.

#include <stdio.h>
#include <stdint.h>

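// compile time FNV-1a, recursive so it's a valid C++11 constexpr function
constexpr uint32_t Hash32_CT( const char * str, size_t n, uint32_t basis = UINT32_C( 2166136261 ) ) {
	// cast to uint8_t so chars with the high bit set aren't sign extended before the XOR
	return n == 0 ? basis : Hash32_CT( str + 1, n - 1, ( basis ^ uint8_t( str[ 0 ] ) ) * UINT32_C( 16777619 ) );
}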

struct A {
	int a;
};

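// T is only there so TYPEID( 1 ) fails to compile, and taking id as a
// template argument forces the hash to be evaluated at compile time
template< typename T, uint32_t id >
uint32_t typeid_helper() {
	return id;
}
#define TYPEID( T ) typeid_helper< T, Hash32_CT( #T, sizeof( #T ) - 1 ) >()

int main() {
	printf( "%u\n", TYPEID( A ) );
	// printf( "%u\n", TYPEID( 1 ) ); // error: 1 is not a type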
	return 0;
}

which compiles to

movl    $1735789992, %esi

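This breaks in many situations. The macro only hashes the spelling of its argument, so it doesn't understand namespaces, using, or nested types (A and ns::A hash differently even when they name the same type), it doesn't understand typedef, and if you have structs in different files with the same name (an ODR violation, so technically UB, but compilers don't warn about it so good luck with that) they get the same ID.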
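
To make the failure modes concrete, here's a sketch that checks them with static_assert. TYPEID_CT is a hypothetical variant of the macro that skips the helper template so the ID can be used in a constant expression, and the game/physics/render names are made up for the example:

```cpp
#include <stddef.h>
#include <stdint.h>
#include <type_traits>

// compile time FNV-1a, as in the listing above
constexpr uint32_t Hash32_CT( const char * str, size_t n, uint32_t basis = UINT32_C( 2166136261 ) ) {
	return n == 0 ? basis : Hash32_CT( str + 1, n - 1, ( basis ^ uint8_t( str[ 0 ] ) ) * UINT32_C( 16777619 ) );
}

// hypothetical: just the hash, usable inside static_assert
#define TYPEID_CT( T ) Hash32_CT( #T, sizeof( #T ) - 1 )

namespace game {
	struct A { int a; };
}
typedef game::A AliasOfA;
using game::A;

// one type, three spellings, three IDs
static_assert( std::is_same< A, game::A >::value, "same type" );
static_assert( std::is_same< A, AliasOfA >::value, "same type" );
static_assert( TYPEID_CT( A ) != TYPEID_CT( game::A ), "but different IDs" );
static_assert( TYPEID_CT( A ) != TYPEID_CT( AliasOfA ), "but different IDs" );

// two types, one spelling, one ID
namespace physics {
	struct Body { float mass; };
	constexpr uint32_t body_id = TYPEID_CT( Body );
}
namespace render {
	struct Body { int mesh; };
	constexpr uint32_t body_id = TYPEID_CT( Body );
}
static_assert( !std::is_same< physics::Body, render::Body >::value, "different types" );
static_assert( physics::body_id == render::body_id, "but the same ID" );
```

If you always spell a type the same way and keep names unique across the codebase, none of these bite.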

But it's good enough for ECS so good enough for me.
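
As a rough idea of what that can look like: because the hash is a constant expression, the IDs can go straight into case labels. A sketch, where TYPEID_CT (the macro minus the helper template), the Position/Health components, and ComponentSize are all invented for the example:

```cpp
#include <stddef.h>
#include <stdint.h>

// compile time FNV-1a, as in the listing above
constexpr uint32_t Hash32_CT( const char * str, size_t n, uint32_t basis = UINT32_C( 2166136261 ) ) {
	return n == 0 ? basis : Hash32_CT( str + 1, n - 1, ( basis ^ uint8_t( str[ 0 ] ) ) * UINT32_C( 16777619 ) );
}

// hypothetical: just the hash, usable as a case label
#define TYPEID_CT( T ) Hash32_CT( #T, sizeof( #T ) - 1 )

struct Position { float x, y; };
struct Health { int hp; };

// look up a component's size from the ID stored alongside it
size_t ComponentSize( uint32_t type_id ) {
	switch( type_id ) {
		case TYPEID_CT( Position ): return sizeof( Position );
		case TYPEID_CT( Health ): return sizeof( Health );
		default: return 0; // not a component we know about
	}
}
```

A nice side effect is that two component names hashing to the same ID gives you duplicate case labels, which is a compile error rather than a silent bug.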

Addendum: C++17

C++17 adds a feature called inline variables and you can implement typeid with them too:

#include <stdio.h>

// stuff this in a header somewhere
inline int type_id_seq = 0;
template< typename T > inline const int type_id = type_id_seq++;

int main() {
	printf( "%d\n", type_id< int > );
	printf( "%d\n", type_id< float > );
	printf( "%d\n", type_id< int > );
	return 0;
}

Basically type_id_seq = 0/type_id< int > = 0/etc (names get mangled but let's write them with template syntax for clarity) get put in .data, then some code runs before main that does type_id< int > = type_id_seq++; and so on.

The advantages of doing it this way are that if you stick in some more templates to remove const/references/etc you can make it actually always accurate, and that it counts from 0 so you can use typeids as array indices. The disadvantages are that it compiles to a load rather than a constant, it needs C++17, and the implementation is a WTF.
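
Here's roughly what "stick in some more templates" and "array indices" could look like. This is a sketch under a few assumptions: type_id_raw, TypeID, MAX_TYPES, instance_counts and CountInstance are names invented for the example, and the IDs are handed out in whatever order the initializers happen to run, so don't rely on specific values:

```cpp
#include <assert.h>
#include <type_traits>

inline int type_id_seq = 0;
template< typename T > inline const int type_id_raw = type_id_seq++;

// strip references and const/volatile so int, const int and int & share an ID
template< typename T >
int TypeID() {
	return type_id_raw< std::remove_cv_t< std::remove_reference_t< T > > >;
}

// IDs count up from 0 so they can index a plain array
constexpr int MAX_TYPES = 64; // arbitrary cap for the example
inline int instance_counts[ MAX_TYPES ];

template< typename T >
void CountInstance() {
	int id = TypeID< T >();
	assert( id >= 0 && id < MAX_TYPES );
	instance_counts[ id ]++;
}
```

Calling CountInstance< int >() and then CountInstance< const int & >() bumps the same slot twice. Call TypeID from main or later, not from another static initializer, or you might read the ID before it has been assigned.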