API

Include llama.hpp to make all functionality available. All basic functionality of the library lives in the namespace llama or its sub-namespaces.

Useful helpers

template<typename T>
struct NrAndOffset
template<typename T>
constexpr auto llama::structName(T = {}) -> std::string_view
template<typename FromT, typename ToT>
using llama::CopyConst = std::conditional_t<std::is_const_v<FromT>, const ToT, ToT>

Alias for ToT, adding const if FromT is const qualified.
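As a small illustration of how such an alias behaves, the following standalone sketch repeats the definition in a hypothetical namespace sketch, so it compiles without LLAMA. The helper asBytes is invented for this example and is not part of the library.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch
{
    // Same definition as documented above, repeated in a hypothetical
    // namespace so that the snippet compiles without LLAMA.
    template<typename FromT, typename ToT>
    using CopyConst = std::conditional_t<std::is_const_v<FromT>, const ToT, ToT>;

    static_assert(std::is_same_v<CopyConst<int, float>, float>);
    static_assert(std::is_same_v<CopyConst<const int, float>, const float>);

    // Hypothetical helper: views an object as raw bytes, keeping its constness.
    template<typename T>
    auto asBytes(T& object) -> CopyConst<T, unsigned char>*
    {
        return reinterpret_cast<CopyConst<T, unsigned char>*>(&object);
    }

    inline void copyConstExample()
    {
        int mutableValue = 0;
        const int constValue = 0;
        // T is deduced as int or const int, and the constness carries over.
        static_assert(std::is_same_v<decltype(asBytes(mutableValue)), unsigned char*>);
        static_assert(std::is_same_v<decltype(asBytes(constValue)), const unsigned char*>);
        assert(asBytes(mutableValue) != nullptr);
        assert(asBytes(constValue) != nullptr);
    }
}
```

Calling sketch::copyConstExample() runs the checks; the static_asserts already hold at compile time.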

template<typename Derived, typename ValueType>
struct ProxyRefOpMixin

CRTP mixin for proxy reference types to support all compound assignment and increment/decrement operators.

template<typename T>
inline auto llama::decayCopy(T &&valueOrRef) -> typename internal::ValueOf<T>::type

Pulls a copy of the given value or reference. Proxy references are resolved to their value types.

template<typename Reference, typename = void>
struct ScopedUpdate : public internal::ValueOf::type

Scope guard type. ScopedUpdate takes a copy of a value through a reference and stores it internally during construction. The stored value is written back when ScopedUpdate is destroyed. ScopedUpdate tries to act like the stored value as much as possible, exposing member functions of the stored value and acting like a proxy reference if the stored value is a primitive type.
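The copy-in, write-back-on-destruction idea can be sketched with a much simpler guard. ScopedWriteBack below is a hypothetical, reduced stand-in: it does not inherit from the stored value or act as a proxy reference as ScopedUpdate does, it only shows the lifetime behavior.

```cpp
#include <cassert>
#include <utility>

namespace sketch
{
    // Hypothetical, simplified scope guard: copies the referenced value on
    // construction and writes the (possibly modified) copy back on destruction.
    template<typename T>
    class ScopedWriteBack
    {
    public:
        explicit ScopedWriteBack(T& target) : target_(target), value_(target)
        {
        }
        ScopedWriteBack(const ScopedWriteBack&) = delete;
        auto operator=(const ScopedWriteBack&) -> ScopedWriteBack& = delete;
        ~ScopedWriteBack()
        {
            target_ = std::move(value_);
        }

        // Access to the locally stored copy.
        auto get() -> T&
        {
            return value_;
        }

    private:
        T& target_;
        T value_;
    };

    inline void scopedWriteBackExample()
    {
        int target = 10;
        {
            ScopedWriteBack<int> local(target);
            for(int i = 1; i <= 4; i++)
                local.get() += i;
            assert(target == 10); // the target is untouched while the guard lives
        }
        assert(target == 20); // the copy was written back at scope exit
    }
}
```

A guard like this is typically useful when the target is expensive to access repeatedly, so that a loop works on a local copy and writes the result back once.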

Array

template<typename T, std::size_t N>
struct Array

Array class like std::array but suitable for use with offloading devices like GPUs.

Template Parameters:
  • T – type of array elements.

  • N – rank of the array.

template<typename T, std::size_t N>
inline constexpr auto llama::pushFront([[maybe_unused]] Array<T, N> a, T v) -> Array<T, N + 1>
template<typename T, std::size_t N>
inline constexpr auto llama::pushBack([[maybe_unused]] Array<T, N> a, T v) -> Array<T, N + 1>

template<typename T, std::size_t N>
constexpr auto llama::popFront([[maybe_unused]] Array<T, N> a)
template<typename T, std::size_t N>
inline constexpr auto llama::popBack([[maybe_unused]] Array<T, N> a)
template<typename T, std::size_t N>
inline constexpr auto llama::product(Array<T, N> a) -> T
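To show what these helpers compute, here is a standalone sketch built on std::array instead of llama::Array. The function bodies are an assumption about the intended semantics (insert at the front or back, multiply all elements) and are not taken from the library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch
{
    // Returns a copy of a with v inserted before the first element.
    template<typename T, std::size_t N>
    constexpr auto pushFront(std::array<T, N> a, T v) -> std::array<T, N + 1>
    {
        std::array<T, N + 1> r{};
        r[0] = v;
        for(std::size_t i = 0; i < N; i++)
            r[i + 1] = a[i];
        return r;
    }

    // Returns a copy of a with v appended after the last element.
    template<typename T, std::size_t N>
    constexpr auto pushBack(std::array<T, N> a, T v) -> std::array<T, N + 1>
    {
        std::array<T, N + 1> r{};
        for(std::size_t i = 0; i < N; i++)
            r[i] = a[i];
        r[N] = v;
        return r;
    }

    // Multiplies all elements, e.g. to get the element count of an N-dimensional extent.
    template<typename T, std::size_t N>
    constexpr auto product(std::array<T, N> a) -> T
    {
        T prod = 1;
        for(auto e : a)
            prod *= e;
        return prod;
    }

    // {2, 3} becomes {1, 2, 3} and then {1, 2, 3, 4}.
    inline constexpr auto example = pushBack(pushFront(std::array<int, 2>{2, 3}, 1), 4);
    static_assert(example.size() == 4 && example[0] == 1 && example[3] == 4);
    static_assert(product(example) == 24);
}
```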

Tuple

template<typename ...Elements>
struct Tuple
template<std::size_t I, typename ...Elements>
inline constexpr auto llama::get(Tuple<Elements...> &tuple) -> auto&
template<typename Tuple1, typename Tuple2>
inline constexpr auto llama::tupleCat(const Tuple1 &t1, const Tuple2 &t2)
template<std::size_t Pos, typename Tuple, typename Replacement>
inline constexpr auto llama::tupleReplace(Tuple &&tuple, Replacement &&replacement)

Creates a copy of a tuple with the element at position Pos replaced by replacement.

template<typename ...Elements, typename Functor>
inline constexpr auto llama::tupleTransform(const Tuple<Elements...> &tuple, const Functor &functor)

Applies a functor to every element of a tuple, creating a new tuple with the result of the element transformations. The functor needs to implement a template operator() to which all tuple elements are passed.
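The requirement of a templated operator() is easiest to see in a sketch. The version below uses std::tuple and std::apply rather than llama::Tuple, and the functor Twice is made up for the example.

```cpp
#include <cassert>
#include <tuple>

namespace sketch
{
    // Applies functor to every element and collects the results in a new tuple.
    template<typename... Elements, typename Functor>
    constexpr auto tupleTransform(const std::tuple<Elements...>& tuple, const Functor& functor)
    {
        return std::apply([&](const auto&... e) { return std::make_tuple(functor(e)...); }, tuple);
    }

    // The functor needs a templated call operator, because it is invoked
    // once with every element type of the tuple.
    struct Twice
    {
        template<typename T>
        constexpr auto operator()(const T& v) const
        {
            return v + v;
        }
    };

    inline void tupleTransformExample()
    {
        const auto result = tupleTransform(std::make_tuple(1, 2.5), Twice{});
        assert(std::get<0>(result) == 2); // int stays int
        assert(std::get<1>(result) == 5.0); // double stays double
    }
}
```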

template<typename ...Elements>
inline constexpr auto llama::popFront(const Tuple<Elements...> &tuple)

Returns a copy of the tuple without the first element.

Array dimensions

template<typename T = std::size_t, T... Sizes>
struct ArrayExtents : public llama::Array<std::size_t, ((Sizes == dyn) + ... + 0)>

ArrayExtents holding compile-time and runtime extents. This is conceptually equivalent to the std::extents of std::mdspan (see https://wg21.link/P0009), including the changes to make the size_type controllable.

Subclassed by llama::ArrayIndexRange< ArrayExtents >

template<typename SizeType, std::size_t N>
using llama::ArrayExtentsDynamic = ArrayExtentsNCube<SizeType, N, dyn>

N-dimensional ArrayExtents where all values are dynamic.

template<typename SizeType, std::size_t N, SizeType Extent>
using llama::ArrayExtentsNCube = decltype(internal::makeArrayExtents<SizeType, Extent>(std::make_index_sequence<N>{}))

N-dimensional ArrayExtents where all N extents are Extent.

template<typename T, std::size_t Dim>
struct ArrayIndex : public llama::Array<T, Dim>

Represents a run-time index into the array dimensions.

Template Parameters:

Dim – Compile-time number of dimensions.

template<typename ArrayExtents>
struct ArrayIndexIterator

Iterator supporting ArrayIndexRange.

template<typename ArrayExtents>
struct ArrayIndexRange : private llama::ArrayExtents<T, Sizes>

Range that allows iterating over all indices in an ArrayExtents.

template<typename SizeType, SizeType... Sizes, typename Func>
inline void llama::forEachArrayIndex(ArrayExtents<SizeType, Sizes...> extents, Func &&func)
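Visiting every index of a multi-dimensional extent can be sketched as an odometer-style loop. The function forEachIndex below is a hypothetical standalone version on std::array; the iteration order (last dimension fastest) is an assumption of this sketch and not a statement about LLAMA.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch
{
    // Calls func with every index inside extents. The last dimension varies
    // fastest, which is an assumption of this sketch.
    template<std::size_t N, typename Func>
    void forEachIndex(const std::array<std::size_t, N>& extents, Func&& func)
    {
        static_assert(N > 0, "this sketch needs at least one dimension");
        for(auto e : extents)
            if(e == 0)
                return; // empty index space
        std::array<std::size_t, N> index{};
        while(true)
        {
            func(index);
            // Increment like an odometer, starting at the last dimension.
            std::size_t d = N;
            while(d > 0)
            {
                d--;
                if(++index[d] < extents[d])
                    break;
                index[d] = 0;
                if(d == 0)
                    return; // the first dimension wrapped around: done
            }
        }
    }

    inline void forEachIndexExample()
    {
        std::vector<std::array<std::size_t, 2>> visited;
        forEachIndex(std::array<std::size_t, 2>{2, 3}, [&](auto index) { visited.push_back(index); });
        assert(visited.size() == 6);
        assert((visited[1] == std::array<std::size_t, 2>{0, 1}));
        assert((visited[3] == std::array<std::size_t, 2>{1, 0}));
    }
}
```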

Record dimension

template<typename ...Fields>
struct Record

A type list of Fields which may be used to define a record dimension.

template<typename Tag, typename Type>
struct Field

Record dimension tree node which may either be a leaf or refer to a child tree presented as another Record.

Template Parameters:
  • Tag – Name of the node. May be any type (struct, class).

  • Type – Type of the node. May be one of three cases: 1. Another sub tree consisting of a nested Record. 2. An array of static size of any type, in which case a Record with as many Fields as the array size is created, named RecordCoord specialized on consecutive numbers I. 3. A scalar type different from Record, making this node a leaf of this type.

struct NoName

Anonymous naming for a Field.

template<typename Field>
using llama::GetFieldTag = mp_first<Field>

Get the tag from a Field.

template<typename Field>
using llama::GetFieldType = mp_second<Field>

Get the type from a Field.

template<typename RecordDim, typename RecordCoord, bool Align = false>
constexpr std::size_t llama::offsetOf = flatOffsetOf<FlatRecordDim<RecordDim>, flatRecordCoord<RecordDim, RecordCoord>, Align>

The byte offset of an element in a record dimension if it were a normal struct.

Template Parameters:
  • RecordDim – Record dimension tree.

  • RecordCoord – Record coordinate of an element in the record dimension tree.
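The effect of the Align flag on such an offset can be illustrated with a runtime sketch over a flat list of fields. FieldInfo and the function below are hypothetical, and the sizes and alignments in the example are assumed values for a typical 64-bit platform.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch
{
    // Hypothetical description of one leaf field.
    struct FieldInfo
    {
        std::size_t size;
        std::size_t alignment;
    };

    // Byte offset of field number `field` when the fields are laid out in order.
    // With align == true, padding is inserted so that every field starts at a
    // multiple of its alignment, as a compiler would do for a normal struct.
    template<std::size_t N>
    constexpr auto offsetOf(const std::array<FieldInfo, N>& fields, std::size_t field, bool align) -> std::size_t
    {
        std::size_t offset = 0;
        for(std::size_t i = 0; i <= field; i++)
        {
            if(align)
                offset = (offset + fields[i].alignment - 1) / fields[i].alignment * fields[i].alignment;
            if(i < field)
                offset += fields[i].size;
        }
        return offset;
    }

    // Assumed layout data for struct { char c; double d; int i; }.
    inline constexpr std::array<FieldInfo, 3> exampleFields{{{1, 1}, {8, 8}, {4, 4}}};
    static_assert(offsetOf(exampleFields, 1, false) == 1); // packed: d directly follows c
    static_assert(offsetOf(exampleFields, 1, true) == 8); // aligned: 7 padding bytes after c
    static_assert(offsetOf(exampleFields, 2, false) == 9);
    static_assert(offsetOf(exampleFields, 2, true) == 16);
}
```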

template<typename T, bool Align = false, bool IncludeTailPadding = true>
constexpr std::size_t llama::sizeOf = sizeof(T)

The size of a type T.

template<typename T>
constexpr std::size_t llama::alignOf = alignof(T)

The alignment of a type T.

template<typename RecordDim, typename RecordCoord>
using llama::GetTags = typename internal::GetTagsImpl<RecordDim, RecordCoord>::type

Get the tags of all Fields from the root of the record dimension tree up to the node identified by RecordCoord.

template<typename RecordDim, typename RecordCoord>
using llama::GetTag = typename internal::GetTagImpl<RecordDim, RecordCoord>::type

Get the tag of the Field at a RecordCoord inside the record dimension tree.

template<typename RecordDimA, typename RecordCoordA, typename RecordDimB, typename RecordCoordB>
constexpr auto llama::hasSameTags

Is true if, starting at two coordinates in two record dimensions, all subsequent nodes in the record dimension tree have the same tag.

Template Parameters:
  • RecordDimA – First record dimension.

  • RecordCoordA – RecordCoord based on RecordDimA along which the tags are compared.

  • RecordDimB – Second record dimension.

  • RecordCoordB – RecordCoord based on RecordDimB along which the tags are compared.

template<typename RecordDim, typename ...TagsOrTagList>
using llama::GetCoordFromTags = typename internal::GetCoordFromTagsImpl<RecordDim, RecordCoord<>, TagsOrTagList...>::type

Converts a series of tags, or a list of tags, navigating down a record dimension into a RecordCoord. A RecordCoord will be passed through unmodified.

template<typename RecordDim, typename ...RecordCoordOrTags>
using llama::GetType = typename internal::GetTypeImpl<RecordDim, RecordCoordOrTags...>::type

Returns the type of a node in a record dimension tree identified by a given RecordCoord or a series of tags.

template<typename RecordDim>
using llama::FlatRecordDim = typename internal::FlattenRecordDimImpl<RecordDim>::type

Returns a flat type list containing all leaf field types of the given record dimension.

template<typename RecordDim, typename RecordCoord>
constexpr std::size_t llama::flatRecordCoord = 0

The equivalent zero based index into a flat record dimension (FlatRecordDim) of the given hierarchical record coordinate.

template<typename RecordDim>
using llama::LeafRecordCoords = typename internal::LeafRecordCoordsImpl<RecordDim, RecordCoord<>>::type

Returns a flat type list containing all record coordinates to all leaves of the given record dimension.

template<typename RecordDim, template<typename> typename FieldTypeFunctor>
using llama::TransformLeaves = TransformLeavesWithCoord<RecordDim, internal::MakePassSecond<FieldTypeFunctor>::template fn>

Creates a new record dimension where each new leaf field’s type is the result of applying FieldTypeFunctor to the original leaf field’s type.

template<typename RecordDimA, typename RecordDimB>
using llama::MergedRecordDims = typename decltype(internal::mergeRecordDimsImpl(mp_identity<RecordDimA>{}, mp_identity<RecordDimB>{}))::type

Creates a merged record dimension, where duplicated, nested fields are unified.

template<typename RecordDim, typename Functor, typename ...Tags>
inline constexpr void llama::forEachLeafCoord(Functor &&functor, Tags...)

Iterates over the record dimension tree and calls a functor on each element.

Parameters:
  • functor – Functor to execute at each element. Needs to have operator() with a template parameter for the RecordCoord in the record dimension tree.

  • baseTags – Tags used to define where the iteration should be started. The functor is called on elements beneath this coordinate.

template<typename RecordDim, typename Functor, std::size_t... Coords>
inline constexpr void llama::forEachLeafCoord(Functor &&functor, RecordCoord<Coords...> baseCoord)

Iterates over the record dimension tree and calls a functor on each element.

Parameters:
  • functor – Functor to execute at each element. Needs to have operator() with a template parameter for the RecordCoord in the record dimension tree.

  • baseCoord – RecordCoord at which the iteration should be started. The functor is called on elements beneath this coordinate.

template<typename RecordDim, std::size_t... Coords>
constexpr auto llama::prettyRecordCoord(RecordCoord<Coords...> = {}) -> std::string_view

Returns a pretty representation of the record coordinate inside the given record dimension. Tags are interspersed by ‘.’ and arrays are represented using subscript notation (“[123]”).

Record coordinates

template<std::size_t... Coords>
struct RecordCoord

Represents a coordinate for a record inside the record dimension tree.

Template Parameters:

Coords... – the compile time coordinate.

Public Types

using List = mp_list_c<std::size_t, Coords...>

The list of integral coordinates as mp_list.

template<typename L>
using llama::RecordCoordFromList = internal::mp_unwrap_values_into<L, RecordCoord>

Converts a type list of integral constants into a RecordCoord.

template<typename ...RecordCoords>
using llama::Cat = RecordCoordFromList<mp_append<typename RecordCoords::List...>>

Concatenate a set of RecordCoords.

template<typename RecordCoord>
using llama::PopFront = RecordCoordFromList<mp_pop_front<typename RecordCoord::List>>

RecordCoord without first coordinate component.

template<typename First, typename Second>
constexpr auto llama::recordCoordCommonPrefixIsBigger = internal::recordCoordCommonPrefixIsBiggerImpl(First{}, Second{})

Checks whether the first RecordCoord is bigger than the second.

template<typename First, typename Second>
constexpr auto llama::recordCoordCommonPrefixIsSame = internal::recordCoordCommonPrefixIsSameImpl(First{}, Second{})

Checks whether two RecordCoords are the same or one is the prefix of the other.
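The "same or prefix" relation can be sketched with a plain compile-time index list. Coord and commonPrefixIsSame below are hypothetical stand-ins written for this example; they only mirror the behavior described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

namespace sketch
{
    // Hypothetical stand-in for a record coordinate: a compile-time list of indices.
    template<std::size_t... Coords>
    struct Coord
    {
    };

    // True if both coordinates agree on their common prefix, that is, if they
    // are identical or one is a prefix of the other.
    template<std::size_t... A, std::size_t... B>
    constexpr auto commonPrefixIsSame(Coord<A...>, Coord<B...>) -> bool
    {
        constexpr std::size_t a[] = {A..., 0}; // the trailing 0 avoids zero-sized arrays
        constexpr std::size_t b[] = {B..., 0};
        constexpr std::size_t n = std::min(sizeof...(A), sizeof...(B));
        for(std::size_t i = 0; i < n; i++)
            if(a[i] != b[i])
                return false;
        return true;
    }

    static_assert(commonPrefixIsSame(Coord<0, 1>{}, Coord<0, 1>{})); // identical
    static_assert(commonPrefixIsSame(Coord<0, 1>{}, Coord<0, 1, 2>{})); // first is a prefix
    static_assert(commonPrefixIsSame(Coord<>{}, Coord<3>{})); // the empty coordinate is a prefix of all
    static_assert(!commonPrefixIsSame(Coord<0, 1>{}, Coord<0, 2>{})); // differ inside the prefix
}
```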

Views

template<typename Mapping, typename Allocator = bloballoc::Vector, typename Accessor = accessor::Default>
inline auto llama::allocView(Mapping mapping = {}, const Allocator &alloc = {}, Accessor accessor = {}) -> View<Mapping, internal::AllocatorBlobType<Allocator, typename Mapping::RecordDim>, Accessor>

Creates a view based on the given mapping, e.g. mapping::AoS or mapping::SoA. For allocating the view’s underlying memory, the specified allocator callable is used (or the default one, which is bloballoc::Vector). The allocator callable is called with the alignment and size of bytes to allocate for each blob of the mapping. Value-initialization is performed for all fields by calling constructFields. This function is the preferred way to create a View. See also allocViewUninitialized.

template<typename Mapping, typename BlobType, typename Accessor>
inline void llama::constructFields(View<Mapping, BlobType, Accessor> &view)

Value-initializes all fields reachable through the given view. That is, constructors are run and fundamental types are zero-initialized. Computed fields are constructed if they return l-value references and assigned a default constructed value if they return a proxy reference.

template<typename Mapping, typename Allocator = bloballoc::Vector, typename Accessor = accessor::Default>
inline auto llama::allocViewUninitialized(Mapping mapping = {}, const Allocator &alloc = {}, Accessor accessor = {})

Same as allocView but does not run field constructors.

template<std::size_t Dim, typename RecordDim>
inline auto llama::allocScalarView() -> decltype(auto)

Allocates a View holding a single record backed by a byte array (bloballoc::Array).

Template Parameters:

Dim – Dimension of the ArrayExtents of the View.

template<typename RecordDim>
using llama::One = RecordRef<decltype(allocScalarView<0, RecordDim>()), RecordCoord<>, true>

A RecordRef that owns and holds a single value.

template<typename View, typename BoundRecordCoord, bool OwnView>
inline auto llama::copyRecord(const RecordRef<View, BoundRecordCoord, OwnView> &rr)

Returns a One with the same record dimension as the given record ref, with values copied from rr.

template<typename ViewFwd, typename TransformBlobFunc, typename = std::enable_if_t<isView<std::decay_t<ViewFwd>>>>
inline auto llama::transformBlobs(ViewFwd &&view, const TransformBlobFunc &transformBlob)

Applies the given transformation to the blobs of a view and creates a new view with the transformed blobs and the same mapping and accessor as the old view.

template<typename View, typename NewBlobType = CopyConst<std::remove_reference_t<View>, std::byte>*, typename = std::enable_if_t<isView<std::decay_t<View>>>>
inline auto llama::shallowCopy(View &&view)

Creates a shallow copy of a view. This copy must not outlive the view, since it references its blob array.

Template Parameters:

NewBlobType – The blob type of the shallow copy. Must be a non-owning, pointer-like type.

Returns:

A new view with the same mapping as view, where each blob refers to the blob in view.

template<typename NewMapping, typename ViewFwd, typename = std::enable_if_t<isView<std::decay_t<ViewFwd>>>>
inline auto llama::withMapping(ViewFwd &&view, NewMapping newMapping = {})
template<typename NewAccessor, typename ViewFwd, typename = std::enable_if_t<isView<std::decay_t<ViewFwd>>>>
inline auto llama::withAccessor(ViewFwd &&view, NewAccessor newAccessor = {})

Blob allocators

struct Vector

Allocates heap memory managed by a std::vector for a View, which is copied each time a View is copied.

struct SharedPtr

Allocates heap memory managed by a std::shared_ptr for a View. This memory is shared between all copies of a View.

struct UniquePtr

Allocates heap memory managed by a std::unique_ptr for a View. This memory can only be uniquely owned by a single View.
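The different copy behavior of these three storage types follows directly from the standard library types they are named after. The sketch below uses a hypothetical ByteView wrapper instead of a real View and only demonstrates what happens to the bytes on copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

namespace sketch
{
    // Hypothetical minimal "view": just a blob of bytes of some owning type.
    template<typename Blob>
    struct ByteView
    {
        Blob blob;

        auto operator[](std::size_t i) -> std::byte&
        {
            return blob[i];
        }
    };

    // std::unique_ptr storage cannot be copied at all, only moved.
    static_assert(!std::is_copy_constructible_v<ByteView<std::unique_ptr<std::byte[]>>>);

    inline void copySemanticsExample()
    {
        // std::vector storage: copying the view copies the bytes.
        ByteView<std::vector<std::byte>> v1{std::vector<std::byte>(4)};
        auto v2 = v1;
        v2[0] = std::byte{42};
        assert(v1[0] == std::byte{0});

        // std::shared_ptr storage: copies of the view share the same bytes.
        ByteView<std::shared_ptr<std::byte[]>> s1{std::shared_ptr<std::byte[]>(new std::byte[4]{})};
        auto s2 = s1;
        s2[0] = std::byte{42};
        assert(s1[0] == std::byte{42});
    }
}
```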

template<std::size_t BytesToReserve>
struct Array

Allocates statically sized memory for a View, which is copied each time a View is copied.

Template Parameters:

BytesToReserve – the amount of memory to reserve.

template<std::size_t Alignment>
struct AlignedArray : public llama::Array<std::byte, BytesToReserve>

Mappings

template<typename TArrayExtents, typename TRecordDim, FieldAlignment TFieldAlignment = FieldAlignment::Align, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder>
struct AoS : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>

Array of struct mapping. Used to create a View via allocView.

template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::AlignedAoS = AoS<ArrayExtents, RecordDim, FieldAlignment::Align, LinearizeArrayIndexFunctor>

Array of struct mapping preserving the alignment of the field types by inserting padding.

See also

AoS

template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::MinAlignedAoS = AoS<ArrayExtents, RecordDim, FieldAlignment::Align, LinearizeArrayIndexFunctor, PermuteFieldsMinimizePadding>

Array of struct mapping preserving the alignment of the field types by inserting padding and permuting the field order to minimize this padding.

See also

AoS

template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::PackedAoS = AoS<ArrayExtents, RecordDim, FieldAlignment::Pack, LinearizeArrayIndexFunctor>

Array of struct mapping packing the field types tightly, violating the type’s alignment requirements.

See also

AoS

template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::AlignedSingleBlobSoA = SoA<ArrayExtents, RecordDim, Blobs::Single, SubArrayAlignment::Align, LinearizeArrayIndexFunctor>

Struct of array mapping storing the entire layout in a single blob. The starts of the sub arrays are aligned by inserting padding.

See also

SoA

template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::PackedSingleBlobSoA = SoA<ArrayExtents, RecordDim, Blobs::Single, SubArrayAlignment::Pack, LinearizeArrayIndexFunctor>

Struct of array mapping storing the entire layout in a single blob. The sub arrays are tightly packed, violating the type’s alignment requirements.

See also

SoA

template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::MultiBlobSoA = SoA<ArrayExtents, RecordDim, Blobs::OnePerField, SubArrayAlignment::Pack, LinearizeArrayIndexFunctor>

Struct of array mapping storing each attribute of the record dimension in a separate blob.

See also

SoA
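The difference between the AoS and SoA variants comes down to how a pair of (array index, field) is turned into a blob number and a byte offset. The following sketch hard-codes a hypothetical record with a double x and a float y and a one-dimensional array; the function names and the packed layout are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch
{
    // Hypothetical record with two fields: a double `x` (8 bytes) and a float `y` (4 bytes).
    inline constexpr std::size_t fieldSize[] = {8, 4};
    inline constexpr std::size_t packedRecordSize = 12;

    // Packed AoS: whole records are stored one after another in a single blob.
    constexpr auto aosOffset(std::size_t i, std::size_t field) -> std::size_t
    {
        const std::size_t offsetInRecord = field == 0 ? 0 : fieldSize[0];
        return i * packedRecordSize + offsetInRecord;
    }

    // Packed single blob SoA: all n values of `x` first, then all n values of `y`.
    constexpr auto soaSingleBlobOffset(std::size_t i, std::size_t field, std::size_t n) -> std::size_t
    {
        const std::size_t subArrayStart = field == 0 ? 0 : n * fieldSize[0];
        return subArrayStart + i * fieldSize[field];
    }

    struct BlobAndOffset
    {
        std::size_t blob;
        std::size_t offset;
    };

    // Multi blob SoA: every field lives in its own blob.
    constexpr auto soaMultiBlobOffset(std::size_t i, std::size_t field) -> BlobAndOffset
    {
        return {field, i * fieldSize[field]};
    }

    // Location of `y` of the record at index 3, for an array of 10 records.
    static_assert(aosOffset(3, 1) == 44); // 3 * 12 + 8
    static_assert(soaSingleBlobOffset(3, 1, 10) == 92); // 10 * 8 + 3 * 4
    static_assert(soaMultiBlobOffset(3, 1).blob == 1 && soaMultiBlobOffset(3, 1).offset == 12);
}
```

An aligned variant would additionally round the start of each field or sub array up to the field's alignment.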

template<typename TArrayExtents, typename TRecordDim, typename TArrayExtents::value_type Lanes, FieldAlignment TFieldAlignment = FieldAlignment::Align, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder>
struct AoSoA : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>

Array of struct of arrays mapping. Used to create a View via allocView.

template<typename RecordDim, std::size_t VectorRegisterBits>
constexpr std::size_t llama::mapping::maxLanes

The maximum number of vector lanes that can be used to fetch each leaf type in the record dimension into a vector register of the given size in bits.

template<typename TArrayExtents, typename TRecordDim, typename Bits = typename TArrayExtents::value_type, SignBit SignBit = SignBit::Keep, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder, typename TStoredIntegral = internal::StoredUnsignedFor<TRecordDim>>
struct BitPackedIntAoS : public llama::mapping::internal::BitPackedIntCommon<TArrayExtents, TRecordDim, typename TArrayExtents::value_type, SignBit::Keep, LinearizeArrayIndexRight, internal::StoredUnsignedFor<TRecordDim>>

Array of struct mapping using bit packing to reduce size/precision of integral data types. If your record dimension contains non-integral types, split them off using the Split mapping first.

Template Parameters:
  • Bits – If Bits is llama::Constant<N>, the compile-time N specifies the number of bits to use. If Bits is an integral type T, the number of bits is specified at runtime, passed to the constructor and stored as type T. Must not be zero and must not be bigger than the bits of TStoredIntegral.

  • SignBit – When set to SignBit::Discard, discards the sign bit when storing signed integers. All numbers will be read back positive.

  • TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.

  • PermuteFields – Defines how the record dimension’s fields should be permuted.

  • TStoredIntegral – Integral type used as storage of reduced precision integers. Must be std::uint32_t or std::uint64_t.

template<typename TArrayExtents, typename TRecordDim, typename Bits = typename TArrayExtents::value_type, SignBit SignBit = SignBit::Keep, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, typename TStoredIntegral = internal::StoredUnsignedFor<TRecordDim>>
struct BitPackedIntSoA : public llama::mapping::internal::BitPackedIntCommon<TArrayExtents, TRecordDim, typename TArrayExtents::value_type, SignBit::Keep, LinearizeArrayIndexRight, internal::StoredUnsignedFor<TRecordDim>>

Struct of array mapping using bit packing to reduce size/precision of integral data types. If your record dimension contains non-integral types, split them off using the Split mapping first.

Template Parameters:
  • Bits – If Bits is llama::Constant<N>, the compile-time N specifies the number of bits to use. If Bits is an integral type T, the number of bits is specified at runtime, passed to the constructor and stored as type T. Must not be zero and must not be bigger than the bits of TStoredIntegral.

  • SignBit – When set to SignBit::Discard, discards the sign bit when storing signed integers. All numbers will be read back positive.

  • TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.

  • TStoredIntegral – Integral type used as storage of reduced precision integers. Must be std::uint32_t or std::uint64_t.
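The general idea of bit packing integers with a runtime bit count can be sketched as follows. BitPackedArray is a hypothetical, unsigned-only container that stores values in 32-bit words; it ignores sign handling and is not how the LLAMA mappings are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch
{
    // Stores unsigned values using `bits` bits each inside 32-bit words.
    class BitPackedArray
    {
    public:
        // One extra word is allocated so that word + 1 is always a valid index.
        BitPackedArray(std::size_t count, unsigned bits) : bits_(bits), words_((count * bits + 31) / 32 + 1)
        {
            assert(bits > 0 && bits <= 32);
        }

        void store(std::size_t i, std::uint32_t value)
        {
            const std::size_t bitPos = i * bits_;
            const std::size_t word = bitPos / 32;
            const auto shift = static_cast<unsigned>(bitPos % 32);
            const std::uint64_t mask = (std::uint64_t{1} << bits_) - 1;
            // Work on two adjacent words at once, since a value may straddle a word boundary.
            std::uint64_t both = words_[word] | (std::uint64_t{words_[word + 1]} << 32);
            both = (both & ~(mask << shift)) | ((value & mask) << shift);
            words_[word] = static_cast<std::uint32_t>(both);
            words_[word + 1] = static_cast<std::uint32_t>(both >> 32);
        }

        auto load(std::size_t i) const -> std::uint32_t
        {
            const std::size_t bitPos = i * bits_;
            const std::size_t word = bitPos / 32;
            const auto shift = static_cast<unsigned>(bitPos % 32);
            const std::uint64_t mask = (std::uint64_t{1} << bits_) - 1;
            const std::uint64_t both = words_[word] | (std::uint64_t{words_[word + 1]} << 32);
            return static_cast<std::uint32_t>((both >> shift) & mask);
        }

    private:
        unsigned bits_;
        std::vector<std::uint32_t> words_;
    };

    inline void bitPackedExample()
    {
        BitPackedArray values(10, 5); // ten values with 5 bits each
        for(std::uint32_t i = 0; i < 10; i++)
            values.store(i, i * 3);
        for(std::uint32_t i = 0; i < 10; i++)
            assert(values.load(i) == i * 3);
        values.store(6, 37); // 37 needs 6 bits, so only the low 5 bits (value 5) are kept
        assert(values.load(6) == 5);
        assert(values.load(5) == 15 && values.load(7) == 21); // neighbours are untouched
    }
}
```

Values that do not fit into the chosen number of bits lose their upper bits, which is the precision reduction mentioned above.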

template<typename TArrayExtents, typename TRecordDim, typename ExponentBits = typename TArrayExtents::value_type, typename MantissaBits = ExponentBits, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder, typename TStoredIntegral = internal::StoredIntegralFor<TRecordDim>>
struct BitPackedFloatAoS : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 0>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 1>
template<typename TArrayExtents, typename TRecordDim, typename ExponentBits = typename TArrayExtents::value_type, typename MantissaBits = ExponentBits, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, typename TStoredIntegral = internal::StoredIntegralFor<TRecordDim>>
struct BitPackedFloatSoA : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 0>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 1>

Struct of array mapping using bit packing to reduce size/precision of floating-point data types. The bit layout is [1 sign bit, exponentBits bits from the exponent, mantissaBits bits from the mantissa]+ and tries to follow IEEE 754. Infinity and NAN are supported. If the packed exponent bits are not big enough to hold a number, it will be set to infinity (preserving the sign). If your record dimension contains non-floating-point types, split them off using the Split mapping first.

Template Parameters:
  • ExponentBits – If ExponentBits is llama::Constant<N>, the compile-time N specifies the number of bits to use to store the exponent. If ExponentBits is llama::Value<T>, the number of bits is specified at runtime, passed to the constructor and stored as type T. Must not be zero.

  • MantissaBits – Like ExponentBits but for the mantissa bits. Must not be zero (otherwise values turn INF).

  • TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.

  • TStoredIntegral – Integral type used as storage of reduced precision floating-point values.

template<typename TArrayExtents, typename TRecordDim, template<typename, typename> typename InnerMapping>
struct Bytesplit : private InnerMapping<TArrayExtents, internal::SplitBytes<TRecordDim>>

Meta mapping splitting each field in the record dimension into an array of bytes and mapping the resulting record dimension using a further mapping.

template<typename RC, typename BlobArray>
struct Reference : public llama::ProxyRefOpMixin<Reference<RC, BlobArray>, GetType<TRecordDim, RC>>
template<typename ArrayExtents, typename RecordDim, template<typename, typename> typename InnerMapping>
struct Byteswap : public llama::mapping::Projection<ArrayExtents, RecordDim, InnerMapping, internal::MakeByteswapProjectionMap<RecordDim>>

Mapping that swaps the byte order of all values when loading/storing.
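What swapping the byte order of a single value means can be shown with a few lines of portable C++. The function sketch::byteswap below is written for this example and works on any trivially copyable type; it says nothing about how the mapping performs the swap internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <utility>

namespace sketch
{
    // Reverses the byte order of a trivially copyable value.
    template<typename T>
    auto byteswap(T value) -> T
    {
        static_assert(std::is_trivially_copyable_v<T>);
        unsigned char bytes[sizeof(T)];
        std::memcpy(bytes, &value, sizeof(T));
        for(std::size_t i = 0; i < sizeof(T) / 2; i++)
            std::swap(bytes[i], bytes[sizeof(T) - 1 - i]);
        std::memcpy(&value, bytes, sizeof(T));
        return value;
    }

    inline void byteswapExample()
    {
        assert(byteswap(std::uint32_t{0x11223344}) == 0x44332211u);
        assert(byteswap(std::uint16_t{0x00FF}) == 0xFF00u);
        // Swapping twice restores the original value, also for floating-point types.
        assert(byteswap(byteswap(1.5f)) == 1.5f);
    }
}
```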

template<typename ArrayExtents, typename RecordDim, template<typename, typename> typename InnerMapping, typename ReplacementMap>
struct ChangeType : public llama::mapping::Projection<ArrayExtents, RecordDim, InnerMapping, internal::MakeProjectionMap<RecordDim, ReplacementMap>>

Mapping that changes the type in the record domain for a different one in storage. Conversions happen during load and store.

Template Parameters:

ReplacementMap – A type list of binary type lists (a map) specifying which type or the type at a RecordCoord (map key) to replace by which other type (mapped value).

template<typename Mapping, typename Mapping::ArrayExtents::value_type Granularity = 1, typename TCountType = std::size_t>
struct Heatmap : private Mapping

Forwards all calls to the inner mapping. Counts all accesses made to blocks inside the blobs, which allows extracting a heatmap.

Template Parameters:
  • Mapping – The type of the inner mapping.

  • Granularity – The granularity in bytes on which to count accesses. A value of 1 counts every byte individually. A value of, e.g., 64 counts accesses per 64-byte block.

  • TCountType – Data type used to count the number of accesses. Atomic increments must be supported for this type.
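How the granularity groups byte accesses into blocks can be sketched with a small counter class. AccessCounter is hypothetical, and the choice to increment every block touched by an access once is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch
{
    // Counts accesses to a blob of blobSize bytes in blocks of `granularity` bytes.
    class AccessCounter
    {
    public:
        AccessCounter(std::size_t blobSize, std::size_t granularity)
            : granularity_(granularity)
            , counts_((blobSize + granularity - 1) / granularity)
        {
        }

        // Records an access to `size` bytes starting at byte `offset`.
        // The caller must stay within the blob size given at construction.
        void record(std::size_t offset, std::size_t size)
        {
            if(size == 0)
                return;
            const std::size_t first = offset / granularity_;
            const std::size_t last = (offset + size - 1) / granularity_;
            for(std::size_t block = first; block <= last; block++)
                counts_[block]++;
        }

        auto counts() const -> const std::vector<std::size_t>&
        {
            return counts_;
        }

    private:
        std::size_t granularity_;
        std::vector<std::size_t> counts_;
    };

    inline void accessCounterExample()
    {
        AccessCounter counter(256, 64); // four blocks of 64 bytes
        counter.record(0, 4); // block 0
        counter.record(60, 8); // straddles blocks 0 and 1
        counter.record(128, 4); // block 2
        assert((counter.counts() == std::vector<std::size_t>{2, 1, 1, 0}));
    }
}
```

With a granularity of 1 the same class would keep one counter per byte, at the cost of a much larger counter array.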

Public Functions

template<typename Blobs, typename OStream>
inline void writeGnuplotDataFileAscii(const Blobs &blobs, OStream &&os, bool trimEnd = true, std::size_t wrapAfterBlocks = 64) const

Writes a data file suitable for gnuplot containing the heatmap data. You can use the script provided by gnuplotScriptAscii to plot this data file.

Parameters:
  • blobs – The blobs of the view containing this mapping

  • os – The stream to write the data to. Should be some form of std::ostream.

Public Static Attributes

static constexpr std::string_view gnuplotScriptAscii

An example script for plotting the ASCII heatmap data using gnuplot.

static constexpr std::string_view gnuplotScriptBinary

An example script for plotting the binary heatmap data using gnuplot.

template<typename TArrayExtents, typename TRecordDim>
struct Null : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>

The Null mapping maps all elements to nothing. Writing data through a reference obtained from the Null mapping discards the value. Reading through such a reference returns a default constructed object.

template<typename TArrayExtents, typename TRecordDim, FieldAlignment TFieldAlignment = FieldAlignment::Align, template<typename> typename PermuteFields = PermuteFieldsMinimizePadding>
struct One : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>

Maps all array dimension indices to the same location and lays out struct members consecutively. This mapping is used for temporary, single element views.

template<typename TArrayExtents, typename TRecordDim, template<typename, typename> typename InnerMapping, typename TProjectionMap>
struct Projection : private InnerMapping<TArrayExtents, internal::ReplaceTypesByProjectionResults<TRecordDim, TProjectionMap>>

Mapping that projects types in the record domain to different types. Projections are executed during load and store.

Template Parameters:

TProjectionMap – A type list of binary type lists (a map) specifying a projection (map value) for a type or the type at a RecordCoord (map key). A projection is a type with two functions: struct Proj { static auto load(auto&& fromMem); static auto store(auto&& toMem); };
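A projection with a load and a store function, applied on every read and write, can be sketched with a small proxy reference. DoubleAsFloat and ProjectedRef are hypothetical names for this example; the real mapping wires the projection into the view, which is not shown here.

```cpp
#include <cassert>

namespace sketch
{
    // Hypothetical projection: values are doubles in the program, but floats in memory.
    struct DoubleAsFloat
    {
        static auto load(float fromMem) -> double
        {
            return fromMem;
        }

        static auto store(double toMem) -> float
        {
            return static_cast<float>(toMem);
        }
    };

    // Minimal proxy reference applying a projection on every read and write.
    template<typename Proj, typename Stored, typename Value>
    class ProjectedRef
    {
    public:
        explicit ProjectedRef(Stored& storage) : storage_(storage)
        {
        }

        // Reading converts from the stored representation.
        operator Value() const
        {
            return Proj::load(storage_);
        }

        // Writing converts to the stored representation.
        auto operator=(Value v) -> ProjectedRef&
        {
            storage_ = Proj::store(v);
            return *this;
        }

    private:
        Stored& storage_;
    };

    inline void projectionExample()
    {
        float memory = 0;
        ProjectedRef<DoubleAsFloat, float, double> ref(memory);
        ref = 0.1; // stored with float precision
        const double readBack = ref;
        assert(memory == static_cast<float>(0.1));
        assert(readBack == static_cast<double>(memory));
    }
}
```

The value read back has gone through the float representation, so it is generally not bit-identical to the double that was written.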

template<typename TArrayExtents, typename TRecordDim, Blobs TBlobs = Blobs::OnePerField, SubArrayAlignment TSubArrayAlignment = TBlobs == Blobs::Single ? SubArrayAlignment::Align : SubArrayAlignment::Pack, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFieldsSingleBlob = PermuteFieldsInOrder>
struct SoA : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>

Struct of array mapping. Used to create a View via allocView. We recommend using multiple blobs when the array extents are dynamic and an aligned single blob version when they are static.

Template Parameters:
  • TBlobs – If OnePerField, every element of the record dimension is mapped to its own blob.

  • TSubArrayAlignment – Only relevant when TBlobs == Single, ignored otherwise. If Align, aligns the sub arrays created within the single blob by inserting padding. If the array extents are dynamic, this may add some overhead to the mapping logic.

  • TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.

  • PermuteFieldsSingleBlob – Defines how the record dimension’s fields should be permuted if Blobs is Single. See PermuteFieldsInOrder, PermuteFieldsIncreasingAlignment, PermuteFieldsDecreasingAlignment and PermuteFieldsMinimizePadding.

template<typename TArrayExtents, typename TRecordDim, typename TSelectorForMapping1, template<typename...> typename MappingTemplate1, template<typename...> typename MappingTemplate2, bool SeparateBlobs = false>
struct Split

Mapping which splits off a part of the record dimension and maps it differently than the rest.

Template Parameters:
  • TSelectorForMapping1 – Selects a part of the record dimension to be mapped by MappingTemplate1. Can be a RecordCoord, a type list of RecordCoords, a type list of tags (selecting one field), or a type list of type lists of tags (selecting one field per sub list).

  • MappingTemplate1 – The mapping used for the selected part of the record dimension.

  • MappingTemplate2 – The mapping used for the not selected part of the record dimension.

  • SeparateBlobs – If true, both pieces of the record dimension are mapped to separate blobs.

template<typename Mapping, typename TCountType = std::size_t, bool MyCodeHandlesProxyReferences = true>
struct FieldAccessCount : public Mapping

Forwards all calls to the inner mapping. Counts all accesses made through this mapping and allows printing a summary.

Template Parameters:
  • Mapping – The type of the inner mapping.

  • TCountType – The type used for counting the number of accesses.

  • MyCodeHandlesProxyReferences – If false, FieldAccessCount will avoid proxy references but can then only count the number of address computations.

struct FieldHitsArray : public llama::Array<AccessCounts<CountType>, flatFieldCount<RecordDim>>

Public Functions

inline auto totalBytes() const

When MyCodeHandlesProxyReferences is true, returns a pair of the total read and written bytes. If false, returns the total bytes of accessed data as a single value.

struct TotalBytes
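
The following standalone sketch illustrates why proxy references allow reads and writes to be counted separately. It does not use LLAMA and is not its implementation; the names AccessCounts, CountingRef, totalBytes and demo are hypothetical and live in a local namespace sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

namespace sketch
{
    // Hypothetical per-field counters.
    struct AccessCounts
    {
        std::size_t reads = 0;
        std::size_t writes = 0;
    };

    // Hypothetical proxy reference: stands in for a T& and records each access.
    template<typename T>
    struct CountingRef
    {
        T& target;
        AccessCounts& counts;

        // Reading converts the proxy to T and counts one read.
        operator T() const
        {
            ++counts.reads;
            return target;
        }

        // Assigning through the proxy counts one write.
        auto operator=(const T& v) -> CountingRef&
        {
            ++counts.writes;
            target = v;
            return *this;
        }
    };

    // Total bytes read and written for a field of type T, as a pair.
    template<typename T>
    auto totalBytes(const AccessCounts& c) -> std::pair<std::size_t, std::size_t>
    {
        return {c.reads * sizeof(T), c.writes * sizeof(T)};
    }

    // Performs one write and two reads through the proxy and returns the counters.
    inline auto demo() -> AccessCounts
    {
        float storage = 1.0f;
        AccessCounts counts;
        CountingRef<float> ref{storage, counts};
        ref = 2.0f; // one write
        const float a = ref; // one read
        const float b = ref; // one read
        assert(a == 2.0f && b == 2.0f);
        return counts;
    }
} // namespace sketch
```

Without a proxy, an access would hand out a plain T&, and only the address computation could be observed, not whether the caller went on to read or write.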

Accessors

struct Default

Default accessor. Passes through the given reference.

Subclassed by llama::accessor::internal::StackedLeave< 0, Default >, llama::View< TMapping, TBlobType, TAccessor >

struct ByValue

Allows only read access and returns values instead of references to memory.

struct Const

Allows only read access by qualifying the references to memory with const.

struct Restrict

Qualifies references to memory with __restrict. Only works on l-value references.

struct Atomic

Accessor wrapping a reference into a std::atomic_ref. Can only wrap l-value references.

template<typename ...Accessors>
struct Stacked : public llama::accessor::internal::StackedLeave<0, Default>

Accessor combining multiple other accessors. The contained accessors are applied in left to right order to the memory location when forming the reference returned from a view.
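
The left to right application can be pictured with a small standalone sketch. The types below are simplified re-implementations in a local namespace sketch, written under the assumption that an accessor is a callable taking a reference and returning a possibly transformed reference; they are not LLAMA's definitions.

```cpp
#include <type_traits>
#include <utility>

namespace sketch
{
    // Passes the given reference through unchanged.
    struct Default
    {
        template<typename Ref>
        constexpr auto operator()(Ref&& r) const -> Ref
        {
            return std::forward<Ref>(r);
        }
    };

    // Qualifies an l-value reference with const.
    struct Const
    {
        template<typename T>
        constexpr auto operator()(T& r) const -> const T&
        {
            return r;
        }
    };

    // Returns a copy of the value instead of a reference.
    struct ByValue
    {
        template<typename T>
        constexpr auto operator()(const T& r) const -> T
        {
            return r;
        }
    };

    template<typename... Accessors>
    struct Stacked;

    // End of the recursion: nothing left to apply.
    template<>
    struct Stacked<>
    {
        template<typename Ref>
        constexpr auto operator()(Ref&& r) const -> Ref
        {
            return std::forward<Ref>(r);
        }
    };

    // Applies First, then hands the result to the remaining accessors.
    template<typename First, typename... Rest>
    struct Stacked<First, Rest...>
    {
        template<typename Ref>
        constexpr auto operator()(Ref&& r) const -> decltype(auto)
        {
            return Stacked<Rest...>{}(First{}(std::forward<Ref>(r)));
        }
    };
} // namespace sketch
```

In this sketch, Stacked<Const, ByValue>{}(x) for an int x first forms a const int& and then copies it, so the result type is int. The reverse order does not compile, because Const only accepts l-value references.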

RecordDim field permuters

template<typename TFlatRecordDim>
struct PermuteFieldsInOrder

Retains the order of the record dimension’s fields.

template<typename FlatOrigRecordDim, template<typename, typename> typename Less>
struct PermuteFieldsSorted

Sorts the record dimension’s fields according to a given predicate on the field types.

Template Parameters:

Less – A binary predicate accepting two field types, which exposes a member value. Value must be true if the first field type is less than the second one, otherwise false.

template<typename FlatRecordDim>
using llama::mapping::PermuteFieldsIncreasingAlignment = PermuteFieldsSorted<FlatRecordDim, internal::LessAlignment>

Sorts the record dimension fields by increasing alignment of its fields.

template<typename FlatRecordDim>
using llama::mapping::PermuteFieldsDecreasingAlignment = PermuteFieldsSorted<FlatRecordDim, internal::MoreAlignment>

Sorts the record dimension fields by decreasing alignment of its fields.

template<typename FlatRecordDim>
using llama::mapping::PermuteFieldsMinimizePadding = PermuteFieldsIncreasingAlignment<FlatRecordDim>

Sorts the record dimension fields by the alignment of its fields to minimize padding.
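
To see why field order affects padding, the following standalone sketch computes the size of one record for a given field order, placing each field at its natural alignment. Field, roundUp and paddedSize are hypothetical helpers, and the sizes and alignments (1 for a char-like field, 8 for a double-like field) are assumptions typical of 64-bit platforms.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    // Size and alignment of one field; a stand-in for a field type.
    struct Field
    {
        std::size_t size;
        std::size_t align;
    };

    // Rounds offset up to the next multiple of align.
    constexpr auto roundUp(std::size_t offset, std::size_t align) -> std::size_t
    {
        return (offset + align - 1) / align * align;
    }

    // Size of one record with the fields laid out in the given order,
    // including padding between fields and tail padding.
    template<std::size_t N>
    constexpr auto paddedSize(const std::array<Field, N>& fields) -> std::size_t
    {
        std::size_t offset = 0;
        std::size_t maxAlign = 1;
        for(std::size_t i = 0; i < N; i++)
        {
            offset = roundUp(offset, fields[i].align) + fields[i].size;
            if(fields[i].align > maxAlign)
                maxAlign = fields[i].align;
        }
        return roundUp(offset, maxAlign);
    }

    constexpr Field charField{1, 1}; // assumed: size 1, alignment 1
    constexpr Field doubleField{8, 8}; // assumed: size 8, alignment 8

    // The same three fields in declaration order and sorted by alignment.
    constexpr std::array<Field, 3> inOrder{{charField, doubleField, charField}};
    constexpr std::array<Field, 3> increasing{{charField, charField, doubleField}};
    constexpr std::array<Field, 3> decreasing{{doubleField, charField, charField}};
} // namespace sketch
```

Under these assumptions the declaration order needs 24 bytes per record, while both sorted orders need 16.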

Common utilities

struct LinearizeArrayIndexRight

Functor that maps an ArrayIndex into linear numbers, where the fast moving index should be the rightmost one, which models how C++ arrays work and is analogous to mdspan’s layout_right. E.g. ArrayIndex<3> a; stores 3 indices where a[2] should be incremented in the innermost loop.

Public Functions

template<typename ArrayExtents>
inline constexpr auto operator()(const typename ArrayExtents::Index &ai, const ArrayExtents &extents) const -> typename ArrayExtents::value_type
Parameters:
  • ai – Index in the array dimensions.

  • extents – Total size of the array dimensions.

Returns:

Linearized index.
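
A minimal standalone sketch of such a row-major linearization, using std::array in place of ArrayIndex and ArrayExtents. The function name linearizeRight is hypothetical and the code is not LLAMA's implementation.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    // Row-major linearization: the rightmost index is the fast moving one.
    template<std::size_t N>
    constexpr auto linearizeRight(
        const std::array<std::size_t, N>& ai,
        const std::array<std::size_t, N>& extents) -> std::size_t
    {
        std::size_t linear = 0;
        // Horner scheme from the leftmost (slowest) to the rightmost (fastest) index.
        for(std::size_t d = 0; d < N; d++)
            linear = linear * extents[d] + ai[d];
        return linear;
    }
} // namespace sketch
```

For extents {2, 3, 4}, incrementing the last index moves one step in the linear domain, while incrementing the first index moves 3 * 4 = 12 steps.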

struct LinearizeArrayIndexLeft

Functor that maps an ArrayIndex into linear numbers the way Fortran arrays work. The fast moving index of the ArrayIndex object should be the leftmost one. E.g. ArrayIndex<3> a; stores 3 indices where a[0] should be incremented in the innermost loop.

Public Functions

template<typename ArrayExtents>
inline constexpr auto operator()(const typename ArrayExtents::Index &ai, const ArrayExtents &extents) const -> typename ArrayExtents::value_type
Parameters:
  • ai – Index in the array dimensions.

  • extents – Total size of the array dimensions.

Returns:

Linearized index.
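
A minimal standalone sketch of a column-major (Fortran-style) linearization, using std::array in place of ArrayIndex and ArrayExtents. The function name linearizeLeft is hypothetical and the code is not LLAMA's implementation.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    // Column-major linearization: index 0 is the fast moving one.
    template<std::size_t N>
    constexpr auto linearizeLeft(
        const std::array<std::size_t, N>& ai,
        const std::array<std::size_t, N>& extents) -> std::size_t
    {
        std::size_t linear = 0;
        // Horner scheme from the last (slowest) down to the first (fastest) index.
        for(std::size_t d = N; d-- > 0;)
            linear = linear * extents[d] + ai[d];
        return linear;
    }
} // namespace sketch
```

For extents {2, 3, 4}, incrementing a[0] moves one step in the linear domain, while incrementing a[2] moves 2 * 3 = 6 steps.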

struct LinearizeArrayIndexMorton

Functor that maps an ArrayIndex into linear numbers using the Z-order space filling curve (Morton codes).

Public Functions

template<typename ArrayExtents>
inline constexpr auto operator()(const typename ArrayExtents::Index &ai, [[maybe_unused]] const ArrayExtents &extents) const -> typename ArrayExtents::value_type
Parameters:
  • ai – Coordinate in the array dimensions.

  • extents – Total size of the array dimensions.

Returns:

Linearized index.
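
For illustration, a standalone sketch of a 2D Morton code built by bit interleaving. The helper names spreadBits and morton2D are hypothetical, the restriction to two 16-bit coordinates is a simplification, and which coordinate occupies the even bits is an assumption of this sketch. As in the signature above, the extents are not needed to compute the code.

```cpp
#include <cstdint>

namespace sketch
{
    // Spreads the lower 16 bits of v so that a zero bit follows each input bit.
    constexpr auto spreadBits(std::uint32_t v) -> std::uint32_t
    {
        v &= 0x0000FFFFu;
        v = (v | (v << 8)) & 0x00FF00FFu;
        v = (v | (v << 4)) & 0x0F0F0F0Fu;
        v = (v | (v << 2)) & 0x33333333u;
        v = (v | (v << 1)) & 0x55555555u;
        return v;
    }

    // 2D Morton code: bits of x go to even positions, bits of y to odd positions.
    constexpr auto morton2D(std::uint32_t x, std::uint32_t y) -> std::uint32_t
    {
        return spreadBits(x) | (spreadBits(y) << 1);
    }
} // namespace sketch
```

Walking the linear index from 0 to 3 visits (0,0), (1,0), (0,1), (1,1), which is the characteristic Z shape; the pattern then repeats recursively for larger blocks.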

Dumping

llama::toSvg
llama::toHtml

Data access

template<typename TMapping, typename TBlobType, typename TAccessor = accessor::Default>
struct View : private TMapping, private llama::accessor::Default

Central LLAMA class holding memory for storage and giving access to values stored there defined by a mapping. A view should be created using allocView.

Template Parameters:
  • TMapping – The mapping used by the view to map accesses into memory.

  • TBlobType – The storage type used by the view holding memory.

  • TAccessor – The accessor to use when an access is made through this view.

Public Functions

View() = default

Performs default initialization of the blob array.

inline explicit View(Mapping mapping, Array<BlobType, Mapping::blobCount> blobs = {}, Accessor accessor = {})

Creates a LLAMA View manually. Prefer the allocation functions allocView and allocViewUninitialized if possible.

Parameters:
  • mapping – The mapping used by the view to map accesses into memory.

  • blobs – An array of blobs providing storage space for the mapped data.

  • accessor – The accessor to use when an access is made through this view.

inline auto operator()(ArrayIndex ai) const -> decltype(auto)

Retrieves the RecordRef at the given ArrayIndex index.

template<typename ...Indices, std::enable_if_t<std::conjunction_v<std::is_convertible<Indices, size_type>...>, int> = 0>
inline auto operator()(Indices... indices) const -> decltype(auto)

Retrieves the RecordRef at the ArrayIndex index constructed from the passed component indices.

inline auto operator[](ArrayIndex ai) const -> decltype(auto)

Retrieves the RecordRef at the given ArrayIndex index.

inline auto operator[](size_type index) const -> decltype(auto)

Retrieves the RecordRef at the 1D ArrayIndex index constructed from the passed index.

template<typename TStoredParentView>
struct SubView

Like a View, but array indices are shifted.

Template Parameters:

TStoredParentView – Type of the underlying view. May be cv qualified and/or a reference type.

Public Types

using ParentView = std::remove_const_t<std::remove_reference_t<StoredParentView>>

Type of the parent view.

Public Functions

inline explicit SubView(ArrayIndex offset)

Creates a SubView given an offset. The parent view is default constructed.

template<typename StoredParentViewFwd>
inline SubView(StoredParentViewFwd &&parentView, ArrayIndex offset)

Creates a SubView given a parent View and offset.

inline auto operator()(ArrayIndex ai) const -> decltype(auto)

Same as View::operator()(ArrayIndex), but shifted by the offset of this SubView.

template<typename ...Indices>
inline auto operator()(Indices... indices) const -> decltype(auto)

Same as corresponding operator in View, but shifted by the offset of this SubView.

Public Members

const ArrayIndex offset

offset by which this view’s ArrayIndex indices are shifted when passed to the parent view.
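
The index shifting can be sketched without LLAMA as a thin wrapper that adds a fixed offset before forwarding to its parent. Grid, ShiftedView, Index2D and demo are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch
{
    using Index2D = std::array<std::size_t, 2>;

    // Hypothetical parent view: a 4x4 grid of ints stored in row-major order.
    struct Grid
    {
        std::array<int, 16> data{};

        auto operator()(Index2D ai) -> int&
        {
            return data[ai[0] * 4 + ai[1]];
        }
    };

    // Forwards every access to the parent after adding a fixed offset.
    template<typename Parent>
    struct ShiftedView
    {
        Parent& parent;
        Index2D offset;

        auto operator()(Index2D ai) const -> decltype(auto)
        {
            return parent(Index2D{ai[0] + offset[0], ai[1] + offset[1]});
        }
    };

    // Writes through the shifted view and reads the value back from the parent.
    inline auto demo() -> int
    {
        Grid grid;
        ShiftedView<Grid> sub{grid, {1, 2}};
        sub({0, 0}) = 42; // lands on grid element (1, 2)
        assert(&sub({0, 0}) == &grid({1, 2}));
        return grid({1, 2});
    }
} // namespace sketch
```

The wrapper stores no data of its own; element (0, 0) of the shifted view and element (1, 2) of the parent are the same object.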

template<typename TView, typename TBoundRecordCoord, bool OwnView>
struct RecordRef : private TView::Mapping::ArrayExtents::Index

Record reference type returned by View after resolving an array dimensions coordinate or partially resolving a RecordCoord. A record reference does not hold data itself, it just binds enough information (array dimensions coord and partial record coord) to retrieve it later from a View. Record references should not be created by the user. They are returned from various access functions in View and RecordRef itself.

Public Types

using View = TView

View this record reference points into.

using BoundRecordCoord = TBoundRecordCoord

Record coords into View::RecordDim which are already bound by this RecordRef.

using AccessibleRecordDim = GetType<RecordDim, BoundRecordCoord>

Subtree of the record dimension of View starting at BoundRecordCoord. If BoundRecordCoord is RecordCoord<> (default) AccessibleRecordDim is the same as Mapping::RecordDim.

Public Functions

inline RecordRef()

Creates an empty RecordRef. Only available if the view is owned. Used by llama::One.

template<typename OtherView, typename OtherBoundRecordCoord, bool OtherOwnView>
inline RecordRef(const RecordRef<OtherView, OtherBoundRecordCoord, OtherOwnView> &recordRef)

Creates a RecordRef from a different RecordRef. Only available if the view is owned. Used by llama::One.

template<typename T, typename = std::enable_if_t<!isRecordRef<T>>>
inline explicit RecordRef(const T &scalar)

Creates a RecordRef from a scalar. Only available if the view is owned. Used by llama::One.

template<std::size_t... Coord>
inline auto operator()(RecordCoord<Coord...>) const -> decltype(auto)

Access a record in the record dimension underneath the current record reference using a RecordCoord. If the access resolves to a leaf, an l-value reference to a variable inside the View storage is returned, otherwise another RecordRef.

template<typename ...Tags>
inline auto operator()(Tags...) const -> decltype(auto)

Access a record in the record dimension underneath the current record reference using a series of tags. If the access resolves to a leaf, an l-value reference to a variable inside the View storage is returned, otherwise another RecordRef.

struct Loader
struct LoaderConst

Copying

template<typename SrcMapping, typename SrcBlob, typename DstMapping, typename DstBlob>
void llama::copy(const View<SrcMapping, SrcBlob> &srcView, View<DstMapping, DstBlob> &dstView, std::size_t threadId = 0, std::size_t threadCount = 1)

Copy data from source to destination view. Both views need to have the same array and record dimensions, but may have different mappings. The blobs need to be read- and writeable. Delegates to Copy to choose an implementation.

Parameters:
  • threadId – Optional. Zero-based id of calling thread for multi-threaded invocations.

  • threadCount – Optional. Thread count in case of multi-threaded invocation.

template<typename SrcMapping, typename DstMapping, typename SFINAE = void>
struct Copy

Generic implementation of copy defaulting to fieldWiseCopy. LLAMA provides several specializations of this construct for specific mappings. Users are encouraged to specialize this template for further combinations of mappings when they can provide a better copy algorithm.

template<typename SrcMapping, typename SrcBlob, typename DstMapping, typename DstBlob>
void llama::fieldWiseCopy(const View<SrcMapping, SrcBlob> &srcView, View<DstMapping, DstBlob> &dstView, std::size_t threadId = 0, std::size_t threadCount = 1)

Field-wise copy from source to destination view. Both views need to have the same array and record dimensions.

Parameters:
  • threadId – Optional. Thread id in case of multi-threaded copy.

  • threadCount – Optional. Thread count in case of multi-threaded copy.
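
One way to picture the threadId and threadCount parameters is that each caller copies only its own share of the elements. The standalone sketch below assumes a contiguous, equal-sized partitioning, which may differ from how LLAMA actually divides the work; Particle, ParticlesSoA, fieldWiseCopyChunk and demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch
{
    // Source layout: array of structs.
    struct Particle
    {
        float x;
        float mass;
    };

    // Destination layout: struct of arrays.
    struct ParticlesSoA
    {
        std::vector<float> x;
        std::vector<float> mass;
    };

    // Copies, field by field, the share of elements assigned to the calling thread.
    inline void fieldWiseCopyChunk(
        const std::vector<Particle>& src,
        ParticlesSoA& dst,
        std::size_t threadId = 0,
        std::size_t threadCount = 1)
    {
        const std::size_t n = src.size();
        const std::size_t begin = n * threadId / threadCount;
        const std::size_t end = n * (threadId + 1) / threadCount;
        for(std::size_t i = begin; i < end; i++)
        {
            dst.x[i] = src[i].x;
            dst.mass[i] = src[i].mass;
        }
    }

    // Runs the shares of three "threads" one after another and checks the result.
    inline auto demo() -> bool
    {
        std::vector<Particle> src(10);
        for(std::size_t i = 0; i < src.size(); i++)
            src[i] = Particle{static_cast<float>(i), static_cast<float>(2 * i)};
        ParticlesSoA dst{std::vector<float>(10), std::vector<float>(10)};

        const std::size_t threadCount = 3;
        for(std::size_t t = 0; t < threadCount; t++)
            fieldWiseCopyChunk(src, dst, t, threadCount);

        for(std::size_t i = 0; i < src.size(); i++)
            if(dst.x[i] != src[i].x || dst.mass[i] != src[i].mass)
                return false;
        assert(dst.x.size() == src.size());
        return true;
    }
} // namespace sketch
```

In a real multi-threaded program each call would run on its own thread; since the shares do not overlap, no synchronization between the calls is needed in this sketch.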

template<typename SrcMapping, typename SrcBlob, typename DstMapping, typename DstBlob>
void llama::aosoaCommonBlockCopy(const View<SrcMapping, SrcBlob> &srcView, View<DstMapping, DstBlob> &dstView, std::size_t threadId = 0, std::size_t threadCount = 1)

AoSoA copy strategy which transfers data in common blocks. SoA mappings are also allowed for at most 1 argument.

Parameters:
  • threadId – Optional. Zero-based id of calling thread for multi-threaded invocations.

  • threadCount – Optional. Thread count in case of multi-threaded invocation.

SIMD

template<typename Simd, typename SFINAE = void>
struct SimdTraits

Traits of a specific Simd implementation. Please specialize this template for the SIMD types you are going to use in your program. Each specialization SimdTraits<Simd> must provide:

  • an alias value_type to indicate the element type of the Simd.

  • a static constexpr size_t lanes variable holding the number of SIMD lanes of the Simd.

  • a static auto loadUnaligned(const value_type* mem) -> Simd function, loading a Simd from the given memory address.

  • a static void storeUnaligned(Simd simd, value_type* mem) function, storing the given Simd to a given memory address.

  • a static auto gather(const value_type* mem, std::array<int, lanes> indices) -> Simd function, gathering values into a Simd from the memory addresses identified by mem + indices * sizeof(value_type).

  • a static void scatter(Simd simd, value_type* mem, std::array<int, lanes> indices) function, scattering the values from a Simd to the memory addresses identified by mem + indices * sizeof(value_type).
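
As a rough illustration of such a specialization, the standalone sketch below emulates a 4-lane SIMD type with a plain array. The primary template is redeclared in a local namespace sketch so that the example compiles without LLAMA; Float4 and demo are hypothetical, and a real specialization would target llama::SimdTraits and an actual SIMD type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch
{
    // Local stand-in for the primary template.
    template<typename Simd, typename SFINAE = void>
    struct SimdTraits;

    // Hypothetical 4-lane SIMD type emulated with an array.
    struct Float4
    {
        std::array<float, 4> v;
    };

    template<>
    struct SimdTraits<Float4>
    {
        using value_type = float;
        static constexpr std::size_t lanes = 4;

        // Loads lanes consecutive elements starting at mem.
        static auto loadUnaligned(const value_type* mem) -> Float4
        {
            Float4 s{};
            for(std::size_t i = 0; i < lanes; i++)
                s.v[i] = mem[i];
            return s;
        }

        // Stores lanes consecutive elements starting at mem.
        static void storeUnaligned(Float4 simd, value_type* mem)
        {
            for(std::size_t i = 0; i < lanes; i++)
                mem[i] = simd.v[i];
        }

        // Lane i is read from element indices[i] of mem.
        static auto gather(const value_type* mem, std::array<int, lanes> indices) -> Float4
        {
            Float4 s{};
            for(std::size_t i = 0; i < lanes; i++)
                s.v[i] = mem[indices[i]];
            return s;
        }

        // Lane i is written to element indices[i] of mem.
        static void scatter(Float4 simd, value_type* mem, std::array<int, lanes> indices)
        {
            for(std::size_t i = 0; i < lanes; i++)
                mem[indices[i]] = simd.v[i];
        }
    };

    // Exercises load, gather and scatter on a small buffer.
    inline auto demo() -> bool
    {
        using Traits = SimdTraits<Float4>;
        const std::array<float, 8> mem{0.f, 1.f, 2.f, 3.f, 4.f, 5.f, 6.f, 7.f};
        const Float4 loaded = Traits::loadUnaligned(mem.data() + 1);
        const Float4 gathered = Traits::gather(mem.data(), {7, 0, 3, 5});
        std::array<float, 8> out{};
        Traits::scatter(gathered, out.data(), {0, 1, 2, 3});
        assert(loaded.v[0] == 1.f && loaded.v[3] == 4.f);
        return gathered.v[0] == 7.f && out[2] == 3.f && out[3] == 5.f;
    }
} // namespace sketch
```

The scalar loops only document the intended semantics; a specialization for a real SIMD library would forward to that library's load, store, gather and scatter operations.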

template<typename Simd, typename SFINAE = void>
constexpr auto llama::simdLanes = SimdTraits<Simd>::lanes

The number of SIMD lanes the given SIMD vector or Simd<T> has. If Simd is not a structural Simd or SimdN, this is a shortcut for SimdTraits<Simd>::lanes.

template<typename RecordDim, std::size_t N, template<typename, auto> typename MakeSizedSimd>
using llama::SimdizeN = typename internal::SimdizeNImpl<RecordDim, N, MakeSizedSimd>::type

Transforms the given record dimension into a SIMD version of it. Each leaf field type will be replaced by a sized SIMD vector with length N, as determined by MakeSizedSimd. If N is 1, SimdizeN<T, 1, …> is an alias for T.

template<typename RecordDim, template<typename> typename MakeSimd>
using llama::Simdize = TransformLeaves<RecordDim, MakeSimd>

Transforms the given record dimension into a SIMD version of it. Each leaf field type will be replaced by a SIMD vector, as determined by MakeSimd.

template<typename RecordDim, template<typename> typename MakeSimd>
constexpr std::size_t llama::simdLanesWithFullVectorsFor

Determines the number of SIMD lanes suitable to process all types occurring in the given record dimension. The algorithm ensures that even SIMD vectors for the smallest field type are filled completely and may thus require multiple SIMD vectors for some field types.

Template Parameters:
  • RecordDim – The record dimension to simdize

  • MakeSimd – Type function creating a SIMD type given a field type from the record dimension.

template<typename RecordDim, template<typename> typename MakeSimd>
constexpr std::size_t llama::simdLanesWithLeastRegistersFor

Determines the number of SIMD lanes suitable to process all types occurring in the given record dimension. The algorithm ensures that the smallest number of SIMD registers is needed and may thus only partially fill registers for some data types.

Template Parameters:
  • RecordDim – The record dimension to simdize

  • MakeSimd – Type function creating a SIMD type given a field type from the record dimension.
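
The difference between the two strategies can be worked through with plain arithmetic. The standalone sketch below assumes 32-byte SIMD registers and takes the largest and the smallest native lane count over all field types, which matches the two descriptions above but is not taken from LLAMA's source; all names are hypothetical.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    // Assumed register width in bytes; purely illustrative.
    constexpr std::size_t registerBytes = 32;

    // Lanes of one full register holding elements of the given size.
    constexpr auto lanesFor(std::size_t elementSize) -> std::size_t
    {
        return registerBytes / elementSize;
    }

    // Largest native lane count: every vector is full, wide types need several registers.
    template<std::size_t N>
    constexpr auto lanesWithFullVectors(const std::array<std::size_t, N>& fieldSizes) -> std::size_t
    {
        static_assert(N > 0, "at least one field is required");
        std::size_t lanes = lanesFor(fieldSizes[0]);
        for(std::size_t i = 1; i < N; i++)
            lanes = lanesFor(fieldSizes[i]) > lanes ? lanesFor(fieldSizes[i]) : lanes;
        return lanes;
    }

    // Smallest native lane count: one register per field, narrow types fill it only partially.
    template<std::size_t N>
    constexpr auto lanesWithLeastRegisters(const std::array<std::size_t, N>& fieldSizes) -> std::size_t
    {
        static_assert(N > 0, "at least one field is required");
        std::size_t lanes = lanesFor(fieldSizes[0]);
        for(std::size_t i = 1; i < N; i++)
            lanes = lanesFor(fieldSizes[i]) < lanes ? lanesFor(fieldSizes[i]) : lanes;
        return lanes;
    }

    // Field sizes in bytes of a record with a float, a double and a one-byte field.
    constexpr std::array<std::size_t, 3> fieldSizes{4, 8, 1};
} // namespace sketch
```

For these field sizes the first strategy yields 32 lanes, so the float field occupies 4 registers and the double field 8. The second strategy yields 4 lanes, so each field fits in one register, but the float register is half full and the one-byte register is one eighth full.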

template<typename T, std::size_t N, template<typename, auto> typename MakeSizedSimd>
using llama::SimdN = typename std::conditional_t<isRecordDim<T>, std::conditional_t<N == 1, mp_identity<One<T>>, mp_identity<One<SimdizeN<T, N, MakeSizedSimd>>>>, std::conditional_t<N == 1, mp_identity<T>, mp_identity<SimdizeN<T, N, MakeSizedSimd>>>>::type

Creates a SIMD version of the given type. If T is a record dimension, creates a One where each field is a SIMD type of the original field type. The SIMD vectors have length N. If N is 1, an ordinary One of the record dimension T is created. If T is not a record dimension, a SIMD vector with value type T and length N is created. If N is 1 (and T is not a record dimension), then T is produced.

template<typename T, template<typename> typename MakeSimd>
using llama::Simd = typename std::conditional_t<isRecordDim<T>, mp_identity<One<Simdize<T, MakeSimd>>>, mp_identity<Simdize<T, MakeSimd>>>::type

Creates a SIMD version of the given type. If T is a record dimension, creates a One where each field is a SIMD type of the original field type.

template<typename T, typename Simd>
inline void llama::loadSimd(const T &srcRef, Simd &dstSimd)

Loads SIMD vectors of data starting from the given record reference to dstSimd. Only field tags occurring in RecordRef are loaded. If Simd contains multiple fields of SIMD types, a SIMD vector will be fetched for each of the fields. The number of elements fetched per SIMD vector depends on the SIMD width of the vector. Simd is allowed to have different vector lengths per element.

template<typename Simd, typename TFwd>
inline void llama::storeSimd(const Simd &srcSimd, TFwd &&dstRef)

Stores SIMD vectors of element data from the given srcSimd into memory starting at the provided record reference. Only field tags occurring in RecordRef are stored. If Simd contains multiple fields of SIMD types, a SIMD vector will be stored for each of the fields. The number of elements stored per SIMD vector depends on the SIMD width of the vector. Simd is allowed to have different vector lengths per element.

template<std::size_t N, template<typename, auto> typename MakeSizedSimd, typename View, typename UnarySimdFunction>
void llama::simdForEachN(View &view, UnarySimdFunction f)
template<template<typename> typename MakeSimd, template<typename, auto> typename MakeSizedSimd, typename View, typename UnarySimdFunction>
void llama::simdForEach(View &view, UnarySimdFunction f)

Macros

LLAMA_INDEPENDENT_DATA

May be put in front of a loop statement. Indicates that all (!) data access inside the loop is independent, so the loop can be safely vectorized. Example:

LLAMA_INDEPENDENT_DATA
for(int i = 0; i < N; ++i)
    // because of LLAMA_INDEPENDENT_DATA the compiler knows that a and b
    // do not overlap and the operation can safely be vectorized
    a[i] += b[i];

LLAMA_FORCE_INLINE

Forces the compiler to inline a function annotated with this macro.

LLAMA_UNROLL(...)

Requests the compiler to unroll the loop following this directive. An optional unrolling count may be provided as argument, which must be a constant expression.

LLAMA_HOST_ACC

Some offloading parallelization language extensions such as CUDA, OpenACC or OpenMP 4.5 need to specify whether a class, struct, function or method “resides” on the host, the accelerator (the offloading device) or both. LLAMA supports this by marking every function needed on an accelerator with LLAMA_HOST_ACC.

LLAMA_FN_HOST_ACC_INLINE
LLAMA_LAMBDA_INLINE

Gives a strong indication to the compiler to inline the attributed lambda.

LLAMA_COPY(x)

Forces a copy of a value. This is useful to prevent ODR usage of constants when compiling for GPU targets.