Integrating the allocator

At this point we've covered all the code needed to get a memory allocator up and running. In this last section, let's explore how we can make the memory allocator easier to integrate into an existing codebase. We're going to override the standard c allocation functions (malloc, free, memset, memcpy, realloc, calloc), as well as the C++ new / delete operators, and finally we will implement an allocator class that can be used with the standard c++ library containers.

C API

Implementing the C API is trivial, we have functions to allocate, free, set and copy memory. We can use macro functions to rename the Allocate, Release, Set, and Copy functions to malloc, free, memset, and memcpy respectivley. One bit of code that might look odd is the __LOCATION__ macro, it just makes a constant string which contains the file name and line that the macro appeared on.

#define xstr(a) str(a)
#define str(a) #a
#define __LOCATION__ "On line: " xstr(__LINE__) ", in file: " __FILE__

#define malloc(bytes) Allocate(bytes, DefaultAlignment, __LOCATION__, GlobalAllocator)
#define free(data) Release(data, __LOCATION__, GlobalAllocator)
#define memset(mem, val, size) Set(mem, val, size, __LOCATION__)
#define memcpy(dest, src, size) Copy(dest, src, size, __LOCATION__)

There are two more functions in the C API for memory that we should implement, calloc and realloc. Calloc, which allocates a contigous chunk of memory is similar to malloc, except calloc also clear the allocated memory. This function can be implemented with our existing Allocate and Set functions.

void* AllocateContigous(u32 num_elems, u32 elem_size, u32 alignment, const char* location, Allocator* allocator) {
    if (allocator == 0) {
        allocator = GlobalAllocator;
    }
    void* mem = Allocate(num_elems * elem_size, alignment, location, allocator);
    if (mem == 0) {
        return 0;
    }
    Set(mem, num_elems * elem_size, 0, location);

    return mem;
}

The realloc function allocates a new chunk of memory and copies the contents of the old memory to the new allocation. We can implement this function using Allocate, Copy and Release. If the new size being passed in is 0, and the memory being passed in is not 0, the memory will be freed and realloc will return 0.

void* ReAllocate(void* mem, u32 newSize, u32 newAlignment, const char* location, Allocator* allocator) {
    if (allocator == 0) {
        allocator = GlobalAllocator;
    }

    if (newSize == 0 && mem != 0) {
        Release(mem, location, allocator);
        return 0;
    }

    void* newMem = Allocate(newSize, newAlignment, location, allocator);
    u32 oldMemSize = 0;
    {
        u8* memory = (u8*)mem;
        Allocation* header = (Allocation*)(memory - sizeof(Allocation));
        oldMemSize = header->size - sizeof(Allocation);
        u32 allocationHeaderPadding = sizeof(Allocation) % header->alignment > 0 ? header->alignment - sizeof(Allocation) % header->alignment : 0;
        oldMemSize -= allocationHeaderPadding;
    }

    if (mem != 0 && newMem != 0) {
        u32 copySize = newSize;
        if (newSize > oldMemSize) {
            copySize = oldMemSize;
        }

        Copy(newMem, mem, copySize, location);
        Release(mem, location, allocator);
    }

    return newMem;
}

And finally we can define the AllocateContigous and ReAllocate functions as calloc and realloc. You might also want to consider creating actual functions named malloc, free, etc... That way the function can be an extern function and called without including the memory header file.

#define calloc(numelem, elemsize) AllocateContigous(numelem, elemsize, DefaultAlignment, __LOCATION__, GlobalAllocator)
#define realloc(mem, size) ReAllocate(mem, size, DefaultAlignment, __LOCATION__, GlobalAllocator)

Overloading the standard memory functions can get hairy, compilers do fun things like treat memset as an intrinsic instruction (which is the right call). Check out the mi-malloc docs for a more robust list of functions to re-implement

C++ API

Implementing the C++ API is a bit more verbose, but the functionality is pretty trivial. All we need to do is override new, delete, new[] and delete[]. This implementation isn't 100% to the C++ spec, for example new_handler is ignored. All of these overloaded functions are going to be calling our Allocate and Release functions.

We will need to override a total of 16 functions (3 new, 5 delete, 3 new[], and 5 delete[]) to cover the broad range of uses for new / delete. The placement new / delete operators are basically stub functions as they don't need to allocate any memory.

void* operator new (decltype(sizeof(0)) size) {
    return Allocate(size, DefaultAlignment, "internal - ::new(size_t)", GlobalAllocator);
}

void* operator new (decltype(sizeof(0)) size, const std::nothrow_t& nothrow_value) noexcept {
    return Allocate(size, DefaultAlignment, "internal - ::new(size_t, nothrow_t&)", GlobalAllocator);
}

void __cdecl operator delete (void* ptr) noexcept {
    return Release(ptr, "internal - ::delete(void*)", GlobalAllocator);
}

void __cdecl operator delete (void* ptr, const std::nothrow_t& nothrow_constant) noexcept {
    return Release(ptr, "internal - ::delete(void*, nothrow_t&)", GlobalAllocator);
}

void __cdecl operator delete(void* memory, const char* location, Allocator* allocator) {
    return Release(memory, location, allocator);
}

void __cdecl operator delete (void* ptr, decltype(sizeof(0)) size) noexcept {
    return Release(ptr, "internal - ::delete(void*, size_t)", GlobalAllocator);
}

void __cdecl operator delete (void* ptr, decltype(sizeof(0)) size, const std::nothrow_t& nothrow_constant) noexcept {
    return Release(ptr, "internal - ::delete(void*, size_t, nothrow_t&)", GlobalAllocator);
}

void* operator new[](decltype(sizeof(0)) size) {
    return Allocate(size, DefaultAlignment, "internal - ::new[](size_t)", GlobalAllocator);
}

void* operator new[](decltype(sizeof(0)) size, const std::nothrow_t& nothrow_value) noexcept {
    return Allocate(size, DefaultAlignment, "internal - ::new[](size_t, nothrow_t&)", GlobalAllocator);
}

void __cdecl operator delete[](void* ptr) noexcept {
    return Release(ptr, "internal - ::delete[](void*)", GlobalAllocator);
}

void __cdecl operator delete[](void* ptr, const std::nothrow_t& nothrow_constant) noexcept {
    return Release(ptr, "internal - ::delete[](void*, nothrow_t&)", GlobalAllocator);
}

void __cdecl operator delete[](void* ptr, decltype(sizeof(0)) size) noexcept {
    return Release(ptr, "internal - ::delete[](void*, size_t)", GlobalAllocator);
}

void __cdecl operator delete[](void* ptr, decltype(sizeof(0)) size, const std::nothrow_t& nothrow_constant) noexcept {
    return Release(ptr, "internal - ::delete[](void*, size_t, nothrow_t&)", GlobalAllocator);
}

This works pretty well, but there is one disadvantage, we lost the line number and file name where a memory allocation or release happened. We can solve this for the new function by making a version that takes more robust arguments like the allocation location. We will then #define new to call this version of new with extra arguments.

void* operator new (decltype(sizeof(0)) size, u32 alignment, const char* location, Allocator* allocator) noexcept {
    return Allocate(size, alignment, location, allocator);
}

#define new new(DefaultAlignment, __LOCATION__, GlobalAllocator)

We won't add additional paramaters to the delete operator. I just ignore the location for delete, but we could make a global pointer / integer and use a #define to set that. Similarly to what is being discussed in this stack overflow post.

Allocator API

The last API to integrate with is STL containers. If you make any new STL container, it should be able to use the allocator we just created. Implementing an STL allocator is trivial, it's covered in depth on the anki3d blog. We need to provide an allocator class that matches the API provided in the linked blog, which will act as a thin wrapper for the allocator.

template<typename T>
    struct stl_identity {
        typedef T type;
    };

    template <typename T>
    T&& stl_forward(typename stl_identity<T>::type& param) {
        return static_cast<T&&>(param);
    }

template<typename T>
class STLAllocator {
public:
    typedef ptr_type size_type;
    typedef ptr_type difference_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    typedef T value_type;
public:
    /// Default constructor
    inline STLAllocator() throw() { }

    inline ~STLAllocator() { }

    /// Copy constructor
    inline STLAllocator(const STLAllocator& other) throw() { }

    /// Copy constructor with another type
    template<typename U>
    inline STLAllocator(const STLAllocator<U>&) throw() { }

    /// Copy
    inline STLAllocator<T>& operator=(const STLAllocator& other) {
        return *this;
    }

    /// Copy with another type
    template<typename U>
    inline STLAllocator& operator=(const STLAllocator<U>& other) {
        return *this;
    }

    /// Get address of a reference
    inline pointer address(reference x) const {
        return &x;
    }

    /// Get const address of a const reference
    inline const_pointer address(const_reference x) const {
        return &x;
    }

    /// Allocate n elements of type T
    inline pointer allocate(size_type n, const void* hint = 0) {
        return Allocate(n * sizeof(T), DefaultAlignment, "STLAllocator::allocate", GlobalAllocator);
    }

    /// Free memory of pointer p
    inline void deallocate(void* p, size_type n) {
        Release(p, "STLAllocator::deallocate", GlobalAllocator);
    }

    /// Call the constructor of p
    inline void construct(pointer p, const T& val) {
        new ((T*)p) T(val);
    }

    /// Call the constructor of p with many arguments. C++11
    template<typename U, typename... Args>
    inline void construct(U* p, Args&&... args) {
        ::new((void*)p) U(stl_forward<Args>(args)...);
    }

    /// Call the destructor of p
    inline void destroy(pointer p) {
        p->~T();
    }

    /// Call the destructor of p of type U
    template<typename U>
    inline void destroy(U* p) {
        p->~U();
    }

    /// Get the max allocation size
    inline size_type max_size() const {
        return size_type(-1);
    }

    /// A struct to rebind the allocator to another allocator of type U
    template<typename U>
    struct rebind {
        typedef STLAllocator<U> other;
    };
};