As mentioned in the previous section, LLAMA distinguishes between the array dimensions and the record dimension. The most important difference is that the array dimensions can be defined at compile time or at run time, whereas the record dimension is defined fully at compile time. This allows the problem size itself to be a run time value while still leaving the compiler room to optimize the data access.

Array dimensions

The array dimensions form an \(N\)-dimensional array with \(N\) itself being a compile time value. The extent of each dimension can be either a compile time or a runtime value.

A simple definition of three array dimensions of the extents \(128 \times 256 \times 32\) looks like this:

llama::ArrayExtents extents{128, 256, 32};

The template arguments are deduced by the compiler using Class Template Argument Deduction (CTAD). The full type of extents is llama::ArrayExtents<int, llama::dyn, llama::dyn, llama::dyn>.

By explicitly specifying the template arguments, we can mix compile time and runtime extents, where the constant llama::dyn denotes a dynamic extent:

llama::ArrayExtents<int, llama::dyn, 256, llama::dyn> extents{128, 32};
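To illustrate how a single type can mix both kinds of extents, here is a minimal, self-contained sketch. It is not LLAMA's implementation: ExtentsSketch and dynSketch are hypothetical stand-ins for llama::ArrayExtents and llama::dyn, and the sketch only assumes that static extents live in the type while runtime extents are stored as values.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for llama::dyn: a sentinel marking a runtime extent.
inline constexpr std::size_t dynSketch = std::numeric_limits<std::size_t>::max();

// Hypothetical sketch type, NOT LLAMA's implementation.
template <typename Index, std::size_t... Sizes>
struct ExtentsSketch
{
    static constexpr std::size_t rank = sizeof...(Sizes);
    // Number of extents that are only known at run time.
    static constexpr std::size_t rankDynamic = ((Sizes == dynSketch ? 1u : 0u) + ... + 0u);
    static constexpr std::array<std::size_t, rank> sizes{Sizes...};

    // Only the runtime extents occupy storage.
    std::array<Index, rankDynamic> dynamicValues{};

    constexpr Index operator[](std::size_t dim) const
    {
        // Count the dynamic extents before dim to find the slot in dynamicValues.
        std::size_t dynPos = 0;
        for (std::size_t d = 0; d < dim; ++d)
            if (sizes[d] == dynSketch)
                ++dynPos;
        return sizes[dim] == dynSketch ? dynamicValues[dynPos] : static_cast<Index>(sizes[dim]);
    }
};

// Mirrors the example above: the extents 128 and 32 are runtime values, 256 is static.
constexpr ExtentsSketch<int, dynSketch, 256, dynSketch> exampleExtents{{128, 32}};
static_assert(exampleExtents[0] == 128 && exampleExtents[1] == 256 && exampleExtents[2] == 32);
```

In this sketch, the object stores two values, while the middle extent is read from the type.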

The template argument list specifies the integral type used for index calculations and the order and nature (compile time vs. runtime) of the extents. Choosing the right index type depends on the possible magnitude of values occurring during index calculations (e.g. int only allows a maximum flat index space and blob size of INT_MAX), as well as on target specific optimization aspects (e.g. size_t consuming more CUDA registers than unsigned int). An instance of llama::ArrayExtents can then be constructed with as many runtime extents as there are llama::dyn entries in the template argument list.
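The limit imposed by the index type can be checked with a small helper. The following is only a sketch: flatSizeFits is a hypothetical function that is not part of LLAMA, it assumes a 32-bit int, and it only considers the number of elements, not the blob size in bytes.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not LLAMA API): does the flat index space spanned by
// three extents fit into the chosen index type?
template <typename Index>
constexpr bool flatSizeFits(std::uint64_t x, std::uint64_t y, std::uint64_t z)
{
    // The product is computed in 64 bits so that moderate extents cannot
    // overflow the check itself.
    return x * y * z <= static_cast<std::uint64_t>(std::numeric_limits<Index>::max());
}

// 128 * 256 * 32 = 2^20 elements fit comfortably into an int.
static_assert(flatSizeFits<int>(128, 256, 32));
// 2048 * 2048 * 1024 = 2^32 elements exceed INT_MAX (assuming a 32-bit int).
static_assert(!flatSizeFits<int>(2048, 2048, 1024));
// A 64-bit index type can address the same extents.
static_assert(flatSizeFits<std::int64_t>(2048, 2048, 1024));
```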

By setting a specific value for all template arguments, the array extents are fully determined at compile time:

llama::ArrayExtents<int, 128, 256, 32> extents{};

This is important if such extents are later embedded into other LLAMA objects such as mappings or views, where they should not occupy any additional memory.

llama::ArrayExtents<int, 128, 256, 32> extents{};

struct S : llama::ArrayExtents<int, 128, 256, 32> { char c; } s;
static_assert(sizeof(s) == sizeof(char)); // empty base optimization eliminates storage
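The effect can also be reproduced without LLAMA. In the sketch below, StaticExtentsSketch is a hypothetical empty type standing in for fully static extents. It contrasts embedding such a type as a base class with embedding it as an ordinary member; the size equality for the base class case holds on common compilers, which apply the empty base optimization here.

```cpp
#include <type_traits>

// Hypothetical stand-in: all extents are encoded in the type, no data members.
struct StaticExtentsSketch
{
};

// Embedding as a base class: the empty base needs no storage.
struct WithBase : StaticExtentsSketch
{
    char c;
};

// Embedding as an ordinary member: the member needs its own address,
// so it occupies at least one byte.
struct WithMember
{
    StaticExtentsSketch e;
    char c;
};

static_assert(std::is_empty_v<StaticExtentsSketch>);
static_assert(sizeof(WithBase) == sizeof(char));
static_assert(sizeof(WithMember) > sizeof(char));
```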

To later describe indices into the array dimensions described by a llama::ArrayExtents, an instance of llama::ArrayIndex is used:

llama::ArrayIndex i{2, 3, 4};
// full type of i: llama::ArrayIndex<int, 3>

Contrary to llama::ArrayExtents which can store a mix of compile and runtime values, llama::ArrayIndex only stores runtime indices, so it is templated on the number of dimensions. This might change at some point in the future, if we find sufficient evidence that a design similar to llama::ArrayExtents is also useful for llama::ArrayIndex.
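To show how an index and the extents relate, the following sketch flattens an N-dimensional index in row-major order. It uses plain std::array instead of the LLAMA types, and linearizeRowMajor is a hypothetical helper. Row-major order is only an assumption made for illustration, not a statement about how LLAMA lays out data.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not LLAMA API): flattens an N-dimensional index into a
// single offset, with the last dimension varying fastest (row-major).
template <typename Index, std::size_t N>
constexpr Index linearizeRowMajor(const std::array<Index, N>& index, const std::array<Index, N>& extents)
{
    Index flat = 0;
    for (std::size_t d = 0; d < N; ++d)
        flat = flat * extents[d] + index[d];
    return flat;
}

// Index {2, 3, 4} within the extents {128, 256, 32}.
constexpr int exampleFlat = linearizeRowMajor<int, 3>({2, 3, 4}, {128, 256, 32});
static_assert(exampleFlat == (2 * 256 + 3) * 32 + 4);
```

Note that the first extent does not enter the calculation; it only bounds the valid range of the first index component.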

Record dimension

The record dimension is a tree structure completely defined at compile time. Nested C++ structs, which the record dimension tries to abstract, are trees too. Let’s have a look at this simple example struct for storing a pixel value:

struct Pixel {
    struct {
        float r;
        float g;
        float b;
    } color;
    char alpha;
};

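This defines the following tree: the root Pixel has the two children color and alpha; color has the three float leaves r, g and b, and alpha is a char leaf.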


Unfortunately, it is not yet possible in C++ to “iterate” over a struct at compile time and extract member types and names, as would be needed for LLAMA’s mapping (although there are proposals to provide such a facility). For now, LLAMA needs to define such a tree itself using two classes, llama::Record and llama::Field. llama::Record is a compile time list of llama::Fields. A llama::Field has a name and either a fundamental type or another llama::Record containing child llama::Fields. The name of a llama::Field needs to be a C++ type as well. We recommend creating empty tag types for this. These tags serve as names when describing accesses later. Furthermore, these tags also enable a semantic binding even between two different record dimensions.

To make the code easier to read, the following shortcuts are defined:

  • llama::Recordllama::Record

  • llama::Fieldllama::Field

A record dimension itself is just a llama::Record (or a fundamental type), as seen here for the given tree:

struct color {};
struct alpha {};
struct r {};
struct g {};
struct b {};

using RGB = llama::Record<
    llama::Field<r, float>,
    llama::Field<g, float>,
    llama::Field<b, float>
>;
using Pixel = llama::Record<
    llama::Field<color, RGB>,
    llama::Field<alpha, char>
>;
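How tags can serve as names inside such a compile time tree is sketched below. RecordSketch, FieldSketch, FieldType and GetType are hypothetical stand-ins written for this example; they are not LLAMA's classes or API. The sketch only demonstrates that a path of tags is enough to resolve the type of a leaf at compile time.

```cpp
#include <type_traits>

// Hypothetical stand-ins for llama::Field and llama::Record.
template <typename Tag, typename Type>
struct FieldSketch
{
};

template <typename... Fields>
struct RecordSketch
{
};

// Finds the type stored under Tag among the direct children of a record.
template <typename Record, typename Tag>
struct FieldType;

// Case 1: the first field carries the tag we are looking for.
template <typename Tag, typename Type, typename... Rest>
struct FieldType<RecordSketch<FieldSketch<Tag, Type>, Rest...>, Tag>
{
    using type = Type;
};

// Case 2: the first field has a different tag, continue with the remaining fields.
template <typename OtherTag, typename Type, typename... Rest, typename Tag>
struct FieldType<RecordSketch<FieldSketch<OtherTag, Type>, Rest...>, Tag>
    : FieldType<RecordSketch<Rest...>, Tag>
{
};

// Walks down the tree along a path of tags.
template <typename Record, typename... Tags>
struct GetType;

template <typename T>
struct GetType<T>
{
    using type = T; // the path is used up, T is the result
};

template <typename Record, typename Tag, typename... Tags>
struct GetType<Record, Tag, Tags...> : GetType<typename FieldType<Record, Tag>::type, Tags...>
{
};

// The same tree as above, built from the stand-ins.
struct color {};
struct alpha {};
struct r {};
struct g {};
struct b {};

using RGBSketch = RecordSketch<FieldSketch<r, float>, FieldSketch<g, float>, FieldSketch<b, float>>;
using PixelSketch = RecordSketch<FieldSketch<color, RGBSketch>, FieldSketch<alpha, char>>;

static_assert(std::is_same_v<GetType<PixelSketch, color, r>::type, float>);
static_assert(std::is_same_v<GetType<PixelSketch, alpha>::type, char>);
```

Because the tags are ordinary types, the same tag (for instance color) can appear in two unrelated records, which is what makes a semantic binding between different record dimensions possible.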

Arrays of compile-time extent are also supported as arguments to llama::Field. Such arrays are expanded into a llama::Record with multiple llama::Fields of the same type. E.g. llama::Field<Tag, float[4]> is expanded into

llama::Field<Tag, llama::Record<
    llama::Field<llama::RecordCoord<0>, float>,
    llama::Field<llama::RecordCoord<1>, float>,
    llama::Field<llama::RecordCoord<2>, float>,
    llama::Field<llama::RecordCoord<3>, float>
>>
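One way such an expansion could be expressed is with std::index_sequence. The sketch below is an assumption about the technique, not LLAMA's actual code: RecordSketch, FieldSketch, RecordCoordSketch and ExpandArray are hypothetical names introduced only for this example.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for llama::Field, llama::Record and llama::RecordCoord.
template <typename Tag, typename Type>
struct FieldSketch
{
};

template <typename... Fields>
struct RecordSketch
{
};

template <std::size_t... Coords>
struct RecordCoordSketch
{
};

template <typename T, typename Indices>
struct ExpandArrayImpl;

// Creates one field per array element, named by its position.
template <typename T, std::size_t... Is>
struct ExpandArrayImpl<T, std::index_sequence<Is...>>
{
    using type = RecordSketch<FieldSketch<RecordCoordSketch<Is>, T>...>;
};

// Types that are not arrays are left unchanged.
template <typename T>
struct ExpandArray
{
    using type = T;
};

// T[N] becomes a record with N fields of type T.
template <typename T, std::size_t N>
struct ExpandArray<T[N]> : ExpandArrayImpl<T, std::make_index_sequence<N>>
{
};

static_assert(std::is_same_v<
    ExpandArray<float[4]>::type,
    RecordSketch<
        FieldSketch<RecordCoordSketch<0>, float>,
        FieldSketch<RecordCoordSketch<1>, float>,
        FieldSketch<RecordCoordSketch<2>, float>,
        FieldSketch<RecordCoordSketch<3>, float>>>);
```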