API
Including llama.hpp makes all functionality available.
All basic functionality of the library is in the namespace llama or its sub-namespaces.
Useful helpers
-
template<typename T>
struct NrAndOffset
-
template<typename FromT, typename ToT>
using llama::CopyConst = std::conditional_t<std::is_const_v<FromT>, const ToT, ToT> Alias for ToT, adding const if FromT is const qualified.
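The effect of this alias can be checked in isolation. The following sketch restates the alias outside the llama namespace, exactly as given in the signature above, and verifies three cases at compile time. It does not include llama.hpp.

```cpp
#include <type_traits>

// Stand-alone restatement of the alias from the signature above.
template<typename FromT, typename ToT>
using CopyConst = std::conditional_t<std::is_const_v<FromT>, const ToT, ToT>;

// A const-qualified source type adds const to the target type.
static_assert(std::is_same_v<CopyConst<const int, float>, const float>);
// A non-const source type leaves the target type unchanged.
static_assert(std::is_same_v<CopyConst<int, float>, float>);
// A reference to const is not itself const qualified, so nothing is added.
static_assert(std::is_same_v<CopyConst<const int&, float>, float>);
```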
-
template<typename Derived, typename ValueType>
struct ProxyRefOpMixin CRTP mixin for proxy reference types to support all compound assignment and increment/decrement operators.
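A minimal sketch of the CRTP idea behind such a mixin, assuming the derived proxy reference offers a conversion to its value type and an assignment from it. The names OpMixinSketch, DoubledRef and mixinDemo are hypothetical and only illustrate the pattern; this is not LLAMA's implementation.

```cpp
// Hypothetical mixin: derives compound operators from load (conversion) and store (assignment).
template<typename Derived, typename ValueType>
struct OpMixinSketch
{
    constexpr auto operator+=(const ValueType& rhs) -> Derived&
    {
        auto& self = static_cast<Derived&>(*this);
        ValueType v = static_cast<ValueType>(self); // load through the proxy
        v += rhs;
        self = v; // store through the proxy
        return self;
    }

    constexpr auto operator++() -> Derived&
    {
        return *this += ValueType{1};
    }
};

// Hypothetical proxy reference whose storage holds twice the logical value.
struct DoubledRef : OpMixinSketch<DoubledRef, int>
{
    int* storage;

    constexpr explicit DoubledRef(int* s) : storage(s) {}
    constexpr operator int() const { return *storage / 2; }
    constexpr auto operator=(int v) -> DoubledRef&
    {
        *storage = v * 2;
        return *this;
    }
};

constexpr auto mixinDemo() -> int
{
    int raw = 10; // logical value 5
    DoubledRef ref{&raw};
    ref += 3; // logical value 8
    ++ref;    // logical value 9
    return raw; // storage now holds 18
}
static_assert(mixinDemo() == 18);
```

The proxy type only implements load and store; every compound operator is expressed once in the mixin.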
-
template<typename T>
inline auto llama::decayCopy(T &&valueOrRef) -> typename internal::ValueOf<T>::type Pulls a copy of the given value or reference. Proxy references are resolved to their value types.
-
template<typename Reference, typename = void>
struct ScopedUpdate : public internal::ValueOf::type Scope guard type. ScopedUpdate takes a copy of a value through a reference and stores it internally during construction. The stored value is written back when ScopedUpdate is destroyed. ScopedUpdate tries to act like the stored value as much as possible, exposing member functions of the stored value and acting like a proxy reference if the stored value is a primitive type.
Array
-
template<typename T, std::size_t N>
struct Array Array class like std::array but suitable for use with offloading devices like GPUs.
- Template Parameters:
T – type of array elements.
N – number of elements of the array.
-
template<typename T, std::size_t N>
inline constexpr auto llama::pushFront([[maybe_unused]] Array<T, N> a, T v) -> Array<T, N + 1>
-
template<typename T, std::size_t N>
inline constexpr auto llama::pushBack([[maybe_unused]] Array<T, N> a, T v) -> Array<T, N + 1>
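The signatures above return an array that is one element longer than the input. A sketch of that behaviour on std::array, assuming pushFront prepends and pushBack appends the value v; the names pushFrontSketch and pushBackSketch are hypothetical stand-ins and not part of LLAMA.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in: returns a copy of a with v prepended.
template<typename T, std::size_t N>
constexpr auto pushFrontSketch([[maybe_unused]] std::array<T, N> a, T v) -> std::array<T, N + 1>
{
    std::array<T, N + 1> r{};
    r[0] = v;
    for(std::size_t i = 0; i < N; i++)
        r[i + 1] = a[i];
    return r;
}

// Hypothetical stand-in: returns a copy of a with v appended.
template<typename T, std::size_t N>
constexpr auto pushBackSketch([[maybe_unused]] std::array<T, N> a, T v) -> std::array<T, N + 1>
{
    std::array<T, N + 1> r{};
    for(std::size_t i = 0; i < N; i++)
        r[i] = a[i];
    r[N] = v;
    return r;
}

static_assert(pushFrontSketch(std::array<int, 2>{1, 2}, 0)[0] == 0);
static_assert(pushBackSketch(std::array<int, 2>{1, 2}, 3)[2] == 3);
```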
-
template<typename T, std::size_t N>
constexpr auto llama::popFront([[maybe_unused]] Array<T, N> a)
Tuple
-
template<typename ...Elements>
struct Tuple
-
template<std::size_t I, typename ...Elements>
inline constexpr auto llama::get(Tuple<Elements...> &tuple) -> auto&
-
template<typename Tuple1, typename Tuple2>
inline constexpr auto llama::tupleCat(const Tuple1 &t1, const Tuple2 &t2)
-
template<std::size_t Pos, typename Tuple, typename Replacement>
inline constexpr auto llama::tupleReplace(Tuple &&tuple, Replacement &&replacement) Creates a copy of a tuple with the element at position Pos replaced by replacement.
-
template<typename ...Elements, typename Functor>
inline constexpr auto llama::tupleTransform(const Tuple<Elements...> &tuple, const Functor &functor) Applies a functor to every element of a tuple, creating a new tuple with the result of the element transformations. The functor needs to implement a template
operator()
to which all tuple elements are passed.
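A sketch of the same idea on std::tuple, assuming the functor is applied to each element in order and the results are collected into a new tuple. The names tupleTransformSketch, transformImpl and Doubler are hypothetical; the functor has a template operator() as required above.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Helper expanding the index sequence: applies f to every element of t.
template<typename Tuple, typename Functor, std::size_t... Is>
constexpr auto transformImpl(const Tuple& t, const Functor& f, std::index_sequence<Is...>)
{
    return std::make_tuple(f(std::get<Is>(t))...);
}

// Hypothetical stand-in for tupleTransform, working on std::tuple.
template<typename... Elements, typename Functor>
constexpr auto tupleTransformSketch(const std::tuple<Elements...>& tuple, const Functor& functor)
{
    return transformImpl(tuple, functor, std::index_sequence_for<Elements...>{});
}

// Example functor with a template operator() accepting any element type.
struct Doubler
{
    template<typename T>
    constexpr auto operator()(const T& v) const
    {
        return v + v;
    }
};

static_assert(std::get<0>(tupleTransformSketch(std::make_tuple(1, 2.5), Doubler{})) == 2);
static_assert(std::get<1>(tupleTransformSketch(std::make_tuple(1, 2.5), Doubler{})) == 5.0);
```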
Array dimensions
-
template<typename T = std::size_t, T... Sizes>
struct ArrayExtents : public llama::Array<std::size_t, ((Sizes == dyn) + ... + 0)> ArrayExtents holding compile-time and run-time extents. This is conceptually equivalent to the std::extents of std::mdspan (see https://wg21.link/P0009), including the changes to make the size_type controllable.
Subclassed by llama::ArrayIndexRange< ArrayExtents >
-
template<typename SizeType, std::size_t N>
using llama::ArrayExtentsDynamic = ArrayExtentsNCube<SizeType, N, dyn> N-dimensional ArrayExtents where all values are dynamic.
-
template<typename SizeType, std::size_t N, SizeType Extent>
using llama::ArrayExtentsNCube = decltype(internal::makeArrayExtents<SizeType, Extent>(std::make_index_sequence<N>{})) N-dimensional ArrayExtents where all N extents are Extent.
-
template<typename T, std::size_t Dim>
struct ArrayIndex : public llama::Array<T, Dim> Represents a run-time index into the array dimensions.
- Template Parameters:
Dim – Compile-time number of dimensions.
-
template<typename ArrayExtents>
struct ArrayIndexIterator Iterator supporting ArrayIndexRange.
-
template<typename ArrayExtents>
struct ArrayIndexRange : private llama::ArrayExtents<T, Sizes> Range allowing to iterate over all indices in an ArrayExtents.
Record dimension
-
template<typename ...Fields>
struct Record A type list of Fields which may be used to define a record dimension.
-
template<typename Tag, typename Type>
struct Field Record dimension tree node which may either be a leaf or refer to a child tree presented as another Record.
- Template Parameters:
Tag – Name of the node. May be any type (struct, class).
Type – Type of the node. May be one of three cases: 1. Another subtree consisting of a nested Record. 2. An array of static size of any type, in which case a Record with as many Fields as the array size is created, named RecordCoord specialized on consecutive numbers I. 3. A scalar type different from Record, making this node a leaf of this type.
-
template<typename RecordDim, typename RecordCoord, bool Align = false>
constexpr std::size_t llama::offsetOf = flatOffsetOf<FlatRecordDim<RecordDim>, flatRecordCoord<RecordDim, RecordCoord>, Align> The byte offset of an element in a record dimension if it would be a normal struct.
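The difference between Align = false and Align = true can be sketched with plain size and alignment numbers. The following is an illustration of struct-like offsets under the usual rule that each member starts at a multiple of its alignment; LeafLayout, offsetOfSketch and demoLeaves are hypothetical names, and the computation is not taken from LLAMA's source.

```cpp
#include <cstddef>

// Hypothetical description of one leaf field: its size and alignment in bytes.
struct LeafLayout
{
    std::size_t size;
    std::size_t align;
};

constexpr auto roundUp(std::size_t value, std::size_t multiple) -> std::size_t
{
    return (value + multiple - 1) / multiple * multiple;
}

// Byte offset of leaf `index` if the leaves were laid out like a normal struct.
template<std::size_t N>
constexpr auto offsetOfSketch(const LeafLayout (&leaves)[N], std::size_t index, bool align) -> std::size_t
{
    std::size_t offset = 0;
    for(std::size_t i = 0; i < index; i++)
    {
        if(align)
            offset = roundUp(offset, leaves[i].align);
        offset += leaves[i].size;
    }
    return align ? roundUp(offset, leaves[index].align) : offset;
}

// Example record: a 1-byte, an 8-byte and a 4-byte leaf.
constexpr LeafLayout demoLeaves[] = {{1, 1}, {8, 8}, {4, 4}};
static_assert(offsetOfSketch(demoLeaves, 1, false) == 1); // packed
static_assert(offsetOfSketch(demoLeaves, 1, true) == 8);  // 7 padding bytes inserted
```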
-
template<typename T, bool Align = false, bool IncludeTailPadding = true>
constexpr std::size_t llama::sizeOf = sizeof(T) The size of a type T.
-
template<typename RecordDim, typename RecordCoord>
using llama::GetTags = typename internal::GetTagsImpl<RecordDim, RecordCoord>::type Get the tags of all Fields from the root of the record dimension tree down to the node identified by RecordCoord.
-
template<typename RecordDim, typename RecordCoord>
using llama::GetTag = typename internal::GetTagImpl<RecordDim, RecordCoord>::type Get the tag of the Field at a RecordCoord inside the record dimension tree.
-
template<typename RecordDimA, typename RecordCoordA, typename RecordDimB, typename RecordCoordB>
constexpr auto llama::hasSameTags Is true if, starting at two coordinates in two record dimensions, all subsequent nodes in the record dimension tree have the same tag.
- Template Parameters:
RecordDimA – First record dimension.
RecordCoordA – RecordCoord based on RecordDimA along which the tags are compared.
RecordDimB – second record dimension.
RecordCoordB – RecordCoord based on RecordDimB along which the tags are compared.
-
template<typename RecordDim, typename ...TagsOrTagList>
using llama::GetCoordFromTags = typename internal::GetCoordFromTagsImpl<RecordDim, RecordCoord<>, TagsOrTagList...>::type Converts a series of tags, or a list of tags, navigating down a record dimension into a RecordCoord. A RecordCoord will be passed through unmodified.
-
template<typename RecordDim, typename ...RecordCoordOrTags>
using llama::GetType = typename internal::GetTypeImpl<RecordDim, RecordCoordOrTags...>::type Returns the type of a node in a record dimension tree identified by a given RecordCoord or a series of tags.
-
template<typename RecordDim>
using llama::FlatRecordDim = typename internal::FlattenRecordDimImpl<RecordDim>::type Returns a flat type list containing all leaf field types of the given record dimension.
-
template<typename RecordDim, typename RecordCoord>
constexpr std::size_t llama::flatRecordCoord = 0 The equivalent zero based index into a flat record dimension (FlatRecordDim) of the given hierarchical record coordinate.
-
template<typename RecordDim>
using llama::LeafRecordCoords = typename internal::LeafRecordCoordsImpl<RecordDim, RecordCoord<>>::type Returns a flat type list containing all record coordinates to all leaves of the given record dimension.
-
template<typename RecordDim, template<typename> typename FieldTypeFunctor>
using llama::TransformLeaves = TransformLeavesWithCoord<RecordDim, internal::MakePassSecond<FieldTypeFunctor>::template fn> Creates a new record dimension where each new leaf field’s type is the result of applying FieldTypeFunctor to the original leaf field’s type.
- template<typename RecordDimA, typename RecordDimB> llama::MergedRecordDims = typename decltype(internal::mergeRecordDimsImpl(mp_identity< RecordDimA >{}, mp_identity< RecordDimB >{}))::type
Creates a merged record dimension, where duplicated, nested fields are unified.
-
template<typename RecordDim, typename Functor, typename ...Tags>
inline constexpr void llama::forEachLeafCoord(Functor &&functor, Tags...) Iterates over the record dimension tree and calls a functor on each element.
- Parameters:
functor – Functor to execute at each element. Needs to have operator() with a template parameter for the RecordCoord in the record dimension tree.
baseTags – Tags used to define where the iteration should be started. The functor is called on elements beneath this coordinate.
-
template<typename RecordDim, typename Functor, std::size_t... Coords>
inline constexpr void llama::forEachLeafCoord(Functor &&functor, RecordCoord<Coords...> baseCoord) Iterates over the record dimension tree and calls a functor on each element.
- Parameters:
functor – Functor to execute at each element. Needs to have operator() with a template parameter for the RecordCoord in the record dimension tree.
baseCoord – RecordCoord at which the iteration should be started. The functor is called on elements beneath this coordinate.
-
template<typename RecordDim, std::size_t... Coords>
constexpr auto llama::prettyRecordCoord(RecordCoord<Coords...> = {}) -> std::string_view Returns a pretty representation of the record coordinate inside the given record dimension. Tags are interspersed by ‘.’ and arrays are represented using subscript notation (“[123]”).
Record coordinates
-
template<std::size_t... Coords>
struct RecordCoord Represents a coordinate for a record inside the record dimension tree.
- Template Parameters:
Coords... – the compile time coordinate.
-
template<typename L>
using llama::RecordCoordFromList = internal::mp_unwrap_values_into<L, RecordCoord> Converts a type list of integral constants into a RecordCoord.
-
template<typename ...RecordCoords>
using llama::Cat = RecordCoordFromList<mp_append<typename RecordCoords::List...>> Concatenate a set of RecordCoords.
-
template<typename RecordCoord>
using llama::PopFront = RecordCoordFromList<mp_pop_front<typename RecordCoord::List>> RecordCoord without first coordinate component.
-
template<typename First, typename Second>
constexpr auto llama::recordCoordCommonPrefixIsBigger = internal::recordCoordCommonPrefixIsBiggerImpl(First{}, Second{}) Checks whether the first RecordCoord is bigger than the second.
-
template<typename First, typename Second>
constexpr auto llama::recordCoordCommonPrefixIsSame = internal::recordCoordCommonPrefixIsSameImpl(First{}, Second{}) Checks whether two RecordCoords are the same or one is the prefix of the other.
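The "same or prefix" check can be sketched on plain index arrays. This is an illustration only: it uses std::array values as a stand-in for the compile-time RecordCoord, and commonPrefixIsSameSketch is a hypothetical name.

```cpp
#include <array>
#include <cstddef>

// True if both coordinates agree on all components they have in common,
// i.e. they are identical or one is a prefix of the other.
template<std::size_t N, std::size_t M>
constexpr auto commonPrefixIsSameSketch(
    [[maybe_unused]] const std::array<std::size_t, N>& a,
    [[maybe_unused]] const std::array<std::size_t, M>& b) -> bool
{
    const std::size_t common = N < M ? N : M;
    for(std::size_t i = 0; i < common; i++)
        if(a[i] != b[i])
            return false;
    return true;
}

// {1, 2} is a prefix of {1, 2, 3}.
static_assert(commonPrefixIsSameSketch(std::array<std::size_t, 2>{1, 2}, std::array<std::size_t, 3>{1, 2, 3}));
// {1, 2} and {1, 3} diverge at the second component.
static_assert(!commonPrefixIsSameSketch(std::array<std::size_t, 2>{1, 2}, std::array<std::size_t, 2>{1, 3}));
```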
Views
-
template<typename Mapping, typename Allocator = bloballoc::Vector, typename Accessor = accessor::Default>
inline auto llama::allocView(Mapping mapping = {}, const Allocator &alloc = {}, Accessor accessor = {}) -> View<Mapping, internal::AllocatorBlobType<Allocator, typename Mapping::RecordDim>, Accessor> Creates a view based on the given mapping, e.g. mapping::AoS or mapping::SoA. For allocating the view’s underlying memory, the specified allocator callable is used (or the default one, which is bloballoc::Vector). The allocator callable is called with the alignment and size of bytes to allocate for each blob of the mapping. Value-initialization is performed for all fields by calling constructFields. This function is the preferred way to create a View. See also allocViewUninitialized.
-
template<typename Mapping, typename BlobType, typename Accessor>
inline void llama::constructFields(View<Mapping, BlobType, Accessor> &view) Value-initializes all fields reachable through the given view. That is, constructors are run and fundamental types are zero-initialized. Computed fields are constructed if they return l-value references and assigned a default constructed value if they return a proxy reference.
-
template<typename Mapping, typename Allocator = bloballoc::Vector, typename Accessor = accessor::Default>
inline auto llama::allocViewUninitialized(Mapping mapping = {}, const Allocator &alloc = {}, Accessor accessor = {}) Same as allocView but does not run field constructors.
-
template<std::size_t Dim, typename RecordDim>
inline auto llama::allocScalarView() -> decltype(auto) Allocates a View holding a single record backed by a byte array (bloballoc::Array).
- Template Parameters:
Dim – Dimension of the ArrayExtents of the View.
-
template<typename RecordDim>
using llama::One = RecordRef<decltype(allocScalarView<0, RecordDim>()), RecordCoord<>, true> A RecordRef that owns and holds a single value.
-
template<typename View, typename BoundRecordCoord, bool OwnView>
inline auto llama::copyRecord(const RecordRef<View, BoundRecordCoord, OwnView> &rr) Returns a One with the same record dimension as the given record ref, with values copied from rr.
-
template<typename ViewFwd, typename TransformBlobFunc, typename = std::enable_if_t<isView<std::decay_t<ViewFwd>>>>
inline auto llama::transformBlobs(ViewFwd &&view, const TransformBlobFunc &transformBlob) Applies the given transformation to the blobs of a view and creates a new view with the transformed blobs and the same mapping and accessor as the old view.
-
template<typename View, typename NewBlobType = CopyConst<std::remove_reference_t<View>, std::byte>*, typename = std::enable_if_t<isView<std::decay_t<View>>>>
inline auto llama::shallowCopy(View &&view) Creates a shallow copy of a view. This copy must not outlive the view, since it references its blob array.
- Template Parameters:
NewBlobType – The blob type of the shallow copy. Must be a non owning pointer like type.
- Returns:
A new view with the same mapping as view, where each blob refers to the blob in view.
-
template<typename NewMapping, typename ViewFwd, typename = std::enable_if_t<isView<std::decay_t<ViewFwd>>>>
inline auto llama::withMapping(ViewFwd &&view, NewMapping newMapping = {})
-
template<typename NewAccessor, typename ViewFwd, typename = std::enable_if_t<isView<std::decay_t<ViewFwd>>>>
inline auto llama::withAccessor(ViewFwd &&view, NewAccessor newAccessor = {})
Blob allocators
-
struct Vector
Allocates heap memory managed by a std::vector for a View, which is copied each time a View is copied.
-
struct SharedPtr
Allocates heap memory managed by a std::shared_ptr for a View. This memory is shared between all copies of a View.
-
struct UniquePtr
Allocates heap memory managed by a std::unique_ptr for a View. This memory can only be uniquely owned by a single View.
-
template<std::size_t BytesToReserve>
struct Array Allocates statically sized memory for a View, which is copied each time a View is copied.
- Template Parameters:
BytesToReserve – the amount of memory to reserve.
-
template<std::size_t Alignment>
struct AlignedArray : public llama::Array<std::byte, BytesToReserve>
Mappings
-
template<typename TArrayExtents, typename TRecordDim, FieldAlignment TFieldAlignment = FieldAlignment::Align, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder>
struct AoS : public llama::mapping::MappingBase<TArrayExtents, TRecordDim> Array of struct mapping. Used to create a View via allocView.
- Template Parameters:
TFieldAlignment – If Align, padding bytes are inserted to guarantee that struct members are properly aligned. If Pack, struct members are tightly packed.
TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.
PermuteFields – Defines how the record dimension’s fields should be permuted. See PermuteFieldsInOrder, PermuteFieldsIncreasingAlignment, PermuteFieldsDecreasingAlignment and PermuteFieldsMinimizePadding.
-
template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::AlignedAoS = AoS<ArrayExtents, RecordDim, FieldAlignment::Align, LinearizeArrayIndexFunctor> Array of struct mapping preserving the alignment of the field types by inserting padding.
-
template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::MinAlignedAoS = AoS<ArrayExtents, RecordDim, FieldAlignment::Align, LinearizeArrayIndexFunctor, PermuteFieldsMinimizePadding> Array of struct mapping preserving the alignment of the field types by inserting padding and permuting the field order to minimize this padding.
-
template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::PackedAoS = AoS<ArrayExtents, RecordDim, FieldAlignment::Pack, LinearizeArrayIndexFunctor> Array of struct mapping packing the field types tightly, violating the type’s alignment requirements.
-
template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::AlignedSingleBlobSoA = SoA<ArrayExtents, RecordDim, Blobs::Single, SubArrayAlignment::Align, LinearizeArrayIndexFunctor> Struct of array mapping storing the entire layout in a single blob. The starts of the sub arrays are aligned by inserting padding.
-
template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::PackedSingleBlobSoA = SoA<ArrayExtents, RecordDim, Blobs::Single, SubArrayAlignment::Pack, LinearizeArrayIndexFunctor> Struct of array mapping storing the entire layout in a single blob. The sub arrays are tightly packed, violating the type’s alignment requirements.
-
template<typename ArrayExtents, typename RecordDim, typename LinearizeArrayIndexFunctor = LinearizeArrayIndexRight>
using llama::mapping::MultiBlobSoA = SoA<ArrayExtents, RecordDim, Blobs::OnePerField, SubArrayAlignment::Pack, LinearizeArrayIndexFunctor> Struct of array mapping storing each attribute of the record dimension in a separate blob.
-
template<typename TArrayExtents, typename TRecordDim, typename TArrayExtents::value_type Lanes, FieldAlignment TFieldAlignment = FieldAlignment::Align, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder>
struct AoSoA : public llama::mapping::MappingBase<TArrayExtents, TRecordDim> Array of struct of arrays mapping. Used to create a View via allocView.
- Template Parameters:
Lanes – The size of the inner arrays of this array of struct of arrays.
TFieldAlignment – If Align, padding bytes are inserted to guarantee that struct members are properly aligned. If Pack, struct members are tightly packed.
PermuteFields – Defines how the record dimension’s fields should be permuted. See PermuteFieldsInOrder, PermuteFieldsIncreasingAlignment, PermuteFieldsDecreasingAlignment and PermuteFieldsMinimizePadding.
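To compare the layouts described so far, the following sketch computes the byte offset of one field of record i under a packed AoS, a packed single-blob SoA and a packed AoSoA layout. It illustrates the general layout ideas only: the two-field record, the function names and the formulas are assumptions for this example and are not taken from LLAMA's implementation.

```cpp
#include <cstddef>

// Hypothetical packed record with two fields: a 4-byte and an 8-byte value.
constexpr std::size_t fieldSize[] = {4, 8};
constexpr std::size_t fieldCount = 2;

// Sum of the sizes of all fields preceding `field`; sizeBefore(fieldCount) is the record size.
constexpr auto sizeBefore(std::size_t field) -> std::size_t
{
    std::size_t sum = 0;
    for(std::size_t f = 0; f < field; f++)
        sum += fieldSize[f];
    return sum;
}

// Array of structs: whole records are stored one after another.
constexpr auto aosOffset(std::size_t i, std::size_t field) -> std::size_t
{
    return i * sizeBefore(fieldCount) + sizeBefore(field);
}

// Struct of arrays in one blob: n values of field 0, then n values of field 1, ...
constexpr auto soaOffset(std::size_t i, std::size_t field, std::size_t n) -> std::size_t
{
    return n * sizeBefore(field) + i * fieldSize[field];
}

// Array of struct of arrays: blocks of `lanes` records, each block laid out like a small SoA.
constexpr auto aosoaOffset(std::size_t i, std::size_t field, std::size_t lanes) -> std::size_t
{
    const std::size_t block = i / lanes;
    const std::size_t lane = i % lanes;
    return block * lanes * sizeBefore(fieldCount) + lanes * sizeBefore(field) + lane * fieldSize[field];
}

static_assert(aosOffset(5, 1) == 64);      // 5 * 12 + 4
static_assert(soaOffset(5, 1, 100) == 440); // 100 * 4 + 5 * 8
static_assert(aosoaOffset(5, 1, 4) == 72);  // 1 * 4 * 12 + 4 * 4 + 1 * 8
```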
-
template<typename RecordDim, std::size_t VectorRegisterBits>
constexpr std::size_t llama::mapping::maxLanes The maximum number of vector lanes that can be used to fetch each leaf type in the record dimension into a vector register of the given size in bits.
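One plausible reading of this description is that the largest leaf type limits the lane count: a register of the given size must hold that many values of every leaf type. A sketch under that assumption, taking the leaf types as a plain parameter pack instead of a record dimension; maxLanesSketch is a hypothetical name and the formula is an assumption, not LLAMA's code.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Number of lanes such that `lanes` values of the largest leaf type fit into the register.
template<std::size_t VectorRegisterBits, typename... Leaves>
constexpr auto maxLanesSketch() -> std::size_t
{
    std::size_t maxBytes = 1;
    // Fold over all leaf types to find the largest one.
    ((maxBytes = sizeof(Leaves) > maxBytes ? sizeof(Leaves) : maxBytes), ...);
    return VectorRegisterBits / (maxBytes * CHAR_BIT);
}

// A 256-bit register holds four 64-bit values, so four lanes work for both leaf types.
static_assert(maxLanesSketch<256, std::uint32_t, std::uint64_t>() == 4);
```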
-
template<typename TArrayExtents, typename TRecordDim, typename Bits = typename TArrayExtents::value_type, SignBit SignBit = SignBit::Keep, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder, typename TStoredIntegral = internal::StoredUnsignedFor<TRecordDim>>
struct BitPackedIntAoS : public llama::mapping::internal::BitPackedIntCommon<TArrayExtents, TRecordDim, typename TArrayExtents::value_type, SignBit::Keep, LinearizeArrayIndexRight, internal::StoredUnsignedFor<TRecordDim>> Array of struct mapping using bit packing to reduce size/precision of integral data types. If your record dimension contains non-integral types, split them off using the Split mapping first.
- Template Parameters:
Bits – If Bits is llama::Constant<N>, the compile-time N specifies the number of bits to use. If Bits is an integral type T, the number of bits is specified at runtime, passed to the constructor and stored as type T. Must not be zero and must not be bigger than the bits of TStoredIntegral.
SignBit – When set to SignBit::Discard, discards the sign bit when storing signed integers. All numbers will be read back positive.
TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.
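PermuteFields – Defines how the record dimension’s fields should be permuted. See PermuteFieldsInOrder, PermuteFieldsIncreasingAlignment, PermuteFieldsDecreasingAlignment and PermuteFieldsMinimizePadding.
TStoredIntegral – Integral type used as storage of reduced precision integers. Must be std::uint32_t or std::uint64_t.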
-
template<typename TArrayExtents, typename TRecordDim, typename Bits = typename TArrayExtents::value_type, SignBit SignBit = SignBit::Keep, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, typename TStoredIntegral = internal::StoredUnsignedFor<TRecordDim>>
struct BitPackedIntSoA : public llama::mapping::internal::BitPackedIntCommon<TArrayExtents, TRecordDim, typename TArrayExtents::value_type, SignBit::Keep, LinearizeArrayIndexRight, internal::StoredUnsignedFor<TRecordDim>> Struct of array mapping using bit packing to reduce size/precision of integral data types. If your record dimension contains non-integral types, split them off using the Split mapping first.
- Template Parameters:
Bits – If Bits is llama::Constant<N>, the compile-time N specifies the number of bits to use. If Bits is an integral type T, the number of bits is specified at runtime, passed to the constructor and stored as type T. Must not be zero and must not be bigger than the bits of TStoredIntegral.
SignBit – When set to SignBit::Discard, discards the sign bit when storing signed integers. All numbers will be read back positive.
TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.
TStoredIntegral – Integral type used as storage of reduced precision integers. Must be std::uint32_t or std::uint64_t.
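The bit packing idea can be sketched independently of the mapping machinery: values are truncated to a fixed number of bits and stored back to back in words of the stored integral type. The following sketch assumes std::uint32_t storage words and unsigned values; storeBits, loadBits and bitPackRoundTrip are hypothetical helpers, and sign handling is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Writes the lowest `bits` bits of value as element `index` of a densely packed bit stream.
// Requires 1 <= bits <= 32 and (index + 1) * bits <= Words * 32.
template<std::size_t Words>
constexpr void storeBits(std::array<std::uint32_t, Words>& storage, std::size_t index, unsigned bits, std::uint32_t value)
{
    const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
    const std::size_t bitOffset = index * bits;
    const std::size_t word = bitOffset / 32;
    const unsigned shift = static_cast<unsigned>(bitOffset % 32);
    // Work on a 64-bit window so values straddling two words need no special case.
    std::uint64_t window = storage[word];
    if(word + 1 < Words)
        window |= std::uint64_t{storage[word + 1]} << 32;
    window = (window & ~(mask << shift)) | ((value & mask) << shift);
    storage[word] = static_cast<std::uint32_t>(window);
    if(word + 1 < Words)
        storage[word + 1] = static_cast<std::uint32_t>(window >> 32);
}

// Reads element `index` back from the packed bit stream.
template<std::size_t Words>
constexpr auto loadBits(const std::array<std::uint32_t, Words>& storage, std::size_t index, unsigned bits) -> std::uint32_t
{
    const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
    const std::size_t bitOffset = index * bits;
    const std::size_t word = bitOffset / 32;
    const unsigned shift = static_cast<unsigned>(bitOffset % 32);
    std::uint64_t window = storage[word];
    if(word + 1 < Words)
        window |= std::uint64_t{storage[word + 1]} << 32;
    return static_cast<std::uint32_t>((window >> shift) & mask);
}

// Five 12-bit values fit into 64 bits; the third one straddles the word boundary.
constexpr auto bitPackRoundTrip() -> bool
{
    std::array<std::uint32_t, 2> storage{};
    for(std::uint32_t i = 0; i < 5; i++)
        storeBits(storage, i, 12, 1000 + i);
    for(std::uint32_t i = 0; i < 5; i++)
        if(loadBits(storage, i, 12) != 1000 + i)
            return false;
    return true;
}
static_assert(bitPackRoundTrip());
```

Values that do not fit into the chosen number of bits lose their upper bits, which is the size/precision trade-off named in the description.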
-
template<typename TArrayExtents, typename TRecordDim, typename ExponentBits = typename TArrayExtents::value_type, typename MantissaBits = ExponentBits, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFields = PermuteFieldsInOrder, typename TStoredIntegral = internal::StoredIntegralFor<TRecordDim>>
struct BitPackedFloatAoS : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 0>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 1>
-
template<typename TArrayExtents, typename TRecordDim, typename ExponentBits = typename TArrayExtents::value_type, typename MantissaBits = ExponentBits, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, typename TStoredIntegral = internal::StoredIntegralFor<TRecordDim>>
struct BitPackedFloatSoA : public llama::mapping::MappingBase<TArrayExtents, TRecordDim>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 0>, public llama::internal::BoxedValue<typename TArrayExtents::value_type, 1> Struct of array mapping using bit packing to reduce size/precision of floating-point data types. The bit layout is [1 sign bit, exponentBits bits from the exponent, mantissaBits bits from the mantissa]+ and tries to follow IEEE 754. Infinity and NAN are supported. If the packed exponent bits are not big enough to hold a number, it will be set to infinity (preserving the sign). If your record dimension contains non-floating-point types, split them off using the Split mapping first.
- Template Parameters:
ExponentBits – If ExponentBits is llama::Constant<N>, the compile-time N specifies the number of bits to use to store the exponent. If ExponentBits is llama::Value<T>, the number of bits is specified at runtime, passed to the constructor and stored as type T. Must not be zero.
MantissaBits – Like ExponentBits but for the mantissa bits. Must not be zero (otherwise values turn INF).
TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.
TStoredIntegral – Integral type used as storage of reduced precision floating-point values.
-
template<typename TArrayExtents, typename TRecordDim, template<typename, typename> typename InnerMapping>
struct Bytesplit : private InnerMapping<TArrayExtents, internal::SplitBytes<TRecordDim>> Meta mapping splitting each field in the record dimension into an array of bytes and mapping the resulting record dimension using a further mapping.
-
template<typename RC, typename BlobArray>
struct Reference : public llama::ProxyRefOpMixin<Reference<RC, BlobArray>, GetType<TRecordDim, RC>>
-
template<typename ArrayExtents, typename RecordDim, template<typename, typename> typename InnerMapping>
struct Byteswap : public llama::mapping::Projection<ArrayExtents, RecordDim, InnerMapping, internal::MakeByteswapProjectionMap<RecordDim>> Mapping that swaps the byte order of all values when loading/storing.
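What swapping the byte order means for a single 32-bit value can be sketched as follows. byteswap32 is a hypothetical helper written for this illustration and says nothing about how the mapping implements the swap internally.

```cpp
#include <cstdint>

// Reverses the order of the four bytes of a 32-bit value.
constexpr auto byteswap32(std::uint32_t v) -> std::uint32_t
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

static_assert(byteswap32(0x11223344u) == 0x44332211u);
// Swapping twice restores the original value, so load and store can use the same operation.
static_assert(byteswap32(byteswap32(0xDEADBEEFu)) == 0xDEADBEEFu);
```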
-
template<typename ArrayExtents, typename RecordDim, template<typename, typename> typename InnerMapping, typename ReplacementMap>
struct ChangeType : public llama::mapping::Projection<ArrayExtents, RecordDim, InnerMapping, internal::MakeProjectionMap<RecordDim, ReplacementMap>> Mapping that changes the type in the record domain for a different one in storage. Conversions happen during load and store.
- Template Parameters:
ReplacementMap – A type list of binary type lists (a map) specifying which type or the type at a RecordCoord (map key) to replace by which other type (mapped value).
-
template<typename Mapping, typename Mapping::ArrayExtents::value_type Granularity = 1, typename TCountType = std::size_t>
struct Heatmap : private Mapping Forwards all calls to the inner mapping. Counts all accesses made to blocks inside the blobs, allowing to extract a heatmap.
- Template Parameters:
Mapping – The type of the inner mapping.
Granularity – The granularity in bytes on which to count accesses. A value of 1 counts every byte individually. A value of e.g. 64 counts accesses per 64-byte block.
TCountType – Data type used to count the number of accesses. Atomic increments must be supported for this type.
Public Functions
-
template<typename Blobs, typename OStream>
inline void writeGnuplotDataFileAscii(const Blobs &blobs, OStream &&os, bool trimEnd = true, std::size_t wrapAfterBlocks = 64) const Writes a data file suitable for gnuplot containing the heatmap data. You can use the script provided by gnuplotScript to plot this data file.
- Parameters:
blobs – The blobs of the view containing this mapping
os – The stream to write the data to. Should be some form of std::ostream.
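The role of the Granularity parameter can be sketched with a plain counter array. The sketch assumes that an access increments the counter of every block it touches once; countAccess and heatmapDemo are hypothetical names, and the counting rule is an assumption for illustration rather than a statement about LLAMA's implementation.

```cpp
#include <array>
#include <cstddef>

// Increments the counter of every block touched by an access of `size` bytes at `offset`.
// Requires size > 0 and granularity > 0.
template<std::size_t Blocks>
constexpr void countAccess(std::array<std::size_t, Blocks>& counts, std::size_t offset, std::size_t size, std::size_t granularity)
{
    const std::size_t first = offset / granularity;
    const std::size_t last = (offset + size - 1) / granularity;
    for(std::size_t b = first; b <= last && b < Blocks; b++)
        counts[b]++;
}

constexpr auto heatmapDemo() -> std::array<std::size_t, 4>
{
    std::array<std::size_t, 4> counts{};
    countAccess(counts, 60, 8, 64);  // bytes 60..67 straddle blocks 0 and 1
    countAccess(counts, 128, 4, 64); // bytes 128..131 lie in block 2
    countAccess(counts, 0, 1, 64);   // byte 0 lies in block 0
    return counts;
}

static_assert(heatmapDemo()[0] == 2);
static_assert(heatmapDemo()[1] == 1);
```

With a granularity of 1 the same function would keep one counter per byte, at the cost of a counter array as large as the blob.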
-
template<typename TArrayExtents, typename TRecordDim>
struct Null : public llama::mapping::MappingBase<TArrayExtents, TRecordDim> The Null mapping maps all elements to nothing. Writing data through a reference obtained from the Null mapping discards the value. Reading through such a reference returns a default constructed object.
-
template<typename TArrayExtents, typename TRecordDim, FieldAlignment TFieldAlignment = FieldAlignment::Align, template<typename> typename PermuteFields = PermuteFieldsMinimizePadding>
struct One : public llama::mapping::MappingBase<TArrayExtents, TRecordDim> Maps all array dimension indices to the same location and lays out struct members consecutively. This mapping is used for temporary, single element views.
- Template Parameters:
TFieldAlignment – If Align, padding bytes are inserted to guarantee that struct members are properly aligned. If Pack, struct members are tightly packed.
PermuteFields – Defines how the record dimension’s fields should be permuted. See PermuteFieldsInOrder, PermuteFieldsIncreasingAlignment, PermuteFieldsDecreasingAlignment and PermuteFieldsMinimizePadding.
-
template<typename TArrayExtents, typename TRecordDim, template<typename, typename> typename InnerMapping, typename TProjectionMap>
struct Projection : private InnerMapping<TArrayExtents, internal::ReplaceTypesByProjectionResults<TRecordDim, TProjectionMap>> Mapping that projects types in the record domain to different types. Projections are executed during load and store.
- Template Parameters:
TProjectionMap – A type list of binary type lists (a map) specifying a projection (map value) for a type or the type at a RecordCoord (map key). A projection is a type with two functions: struct Proj { static auto load(auto&& fromMem); static auto store(auto&& toMem); };
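A projection type following the load/store shape described above could look like the sketch below, which keeps double values in float storage. The name DoubleAsFloat and the concrete parameter types are assumptions for this illustration; how such a type is registered in a projection map is not shown.

```cpp
// Hypothetical projection: the record dimension sees double, the storage holds float.
struct DoubleAsFloat
{
    // Called when reading: converts the stored representation to the record type.
    static constexpr auto load(float fromMem) -> double
    {
        return fromMem;
    }

    // Called when writing: converts the record type to the stored representation.
    static constexpr auto store(double toMem) -> float
    {
        return static_cast<float>(toMem);
    }
};

// Values exactly representable as float survive the round trip unchanged.
static_assert(DoubleAsFloat::load(DoubleAsFloat::store(0.5)) == 0.5);
```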
-
template<typename TArrayExtents, typename TRecordDim, Blobs TBlobs = Blobs::OnePerField, SubArrayAlignment TSubArrayAlignment = TBlobs == Blobs::Single ? SubArrayAlignment::Align : SubArrayAlignment::Pack, typename TLinearizeArrayIndexFunctor = LinearizeArrayIndexRight, template<typename> typename PermuteFieldsSingleBlob = PermuteFieldsInOrder>
struct SoA : public llama::mapping::MappingBase<TArrayExtents, TRecordDim> Struct of array mapping. Used to create a View via allocView. We recommend using multiple blobs when the array extents are dynamic and an aligned single blob version when they are static.
- Template Parameters:
TBlobs – If OnePerField, every element of the record dimension is mapped to its own blob.
TSubArrayAlignment – Only relevant when TBlobs == Single, ignored otherwise. If Align, aligns the sub arrays created within the single blob by inserting padding. If the array extents are dynamic, this may add some overhead to the mapping logic.
TLinearizeArrayIndexFunctor – Defines how the array dimensions should be mapped into linear numbers and how big the linear domain gets.
PermuteFieldsSingleBlob – Defines how the record dimension’s fields should be permuted if Blobs is Single. See PermuteFieldsInOrder, PermuteFieldsIncreasingAlignment, PermuteFieldsDecreasingAlignment and PermuteFieldsMinimizePadding.
-
template<typename TArrayExtents, typename TRecordDim, typename TSelectorForMapping1, template<typename...> typename MappingTemplate1, template<typename...> typename MappingTemplate2, bool SeparateBlobs = false>
struct Split Mapping which splits off a part of the record dimension and maps it differently than the rest.
- Template Parameters:
TSelectorForMapping1 – Selects a part of the record dimension to be mapped by MappingTemplate1. Can be a RecordCoord, a type list of RecordCoords, a type list of tags (selecting one field), or a type list of type lists of tags (selecting one field per sub list).
MappingTemplate1 – The mapping used for the selected part of the record dimension.
MappingTemplate2 – The mapping used for the not selected part of the record dimension.
SeparateBlobs – If true, both pieces of the record dimension are mapped to separate blobs.
-
template<typename Mapping, typename TCountType = std::size_t, bool MyCodeHandlesProxyReferences = true>
struct FieldAccessCount : public Mapping Forwards all calls to the inner mapping. Counts all accesses made through this mapping and allows printing a summary.
- Template Parameters:
Mapping – The type of the inner mapping.
TCountType – The type used for counting the number of accesses.
MyCodeHandlesProxyReferences – If false, FieldAccessCount will avoid proxy references but can then only count the number of address computations.
-
struct FieldHitsArray : public llama::Array<AccessCounts<CountType>, flatFieldCount<RecordDim>>
Public Functions
-
inline auto totalBytes() const
When MyCodeHandlesProxyReferences is true, returns a pair of the total read and written bytes. If false, returns the total bytes of accessed data as a single value.
-
struct TotalBytes
-
inline auto totalBytes() const
Accessors
-
struct Default
Default accessor. Passes through the given reference.
Subclassed by llama::accessor::internal::StackedLeave< 0, Default >, llama::View< TMapping, TBlobType, TAccessor >
-
struct ByValue
Allows only read access and returns values instead of references to memory.
-
struct Const
Allows only read access by qualifying the references to memory with const.
-
struct Restrict
Qualifies references to memory with __restrict. Only works on l-value references.
-
struct Atomic
Accessor wrapping a reference into a std::atomic_ref. Can only wrap l-value references.
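The accessors above can be thought of as small function objects that receive a reference to memory and decide what the caller gets back. The sketch below illustrates this for three of them. It is standalone, the names (DefaultSketch, ByValueSketch, ConstSketch, accessSketch) are hypothetical, and the real accessors may differ in signature and detail.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Passes the reference through unchanged.
struct DefaultSketch {
    template <typename T>
    T& operator()(T& ref) const { return ref; }
};

// Returns a copy, so the caller cannot write to memory.
struct ByValueSketch {
    template <typename T>
    std::remove_cv_t<T> operator()(T& ref) const { return ref; }
};

// Returns a const-qualified reference, so the caller cannot write to memory.
struct ConstSketch {
    template <typename T>
    const T& operator()(T& ref) const { return ref; }
};

// Hypothetical access function: every access is routed through the accessor.
template <typename Accessor>
decltype(auto) accessSketch(float* data, std::size_t i, Accessor accessor) {
    return accessor(data[i]);
}

inline void accessorExample() {
    float data[2] = {1.0f, 2.0f};
    accessSketch(data, 0, DefaultSketch{}) = 5.0f; // writable reference
    assert(data[0] == 5.0f);
    static_assert(std::is_same_v<decltype(accessSketch(data, 1, ByValueSketch{})), float>);
    static_assert(std::is_same_v<decltype(accessSketch(data, 1, ConstSketch{})), const float&>);
}
```

An accessor based on std::atomic_ref would follow the same pattern but requires C++20, so it is left out of this sketch.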
RecordDim field permuters
-
template<typename TFlatRecordDim>
struct PermuteFieldsInOrder Retains the order of the record dimension’s fields.
-
template<typename FlatOrigRecordDim, template<typename, typename> typename Less>
struct PermuteFieldsSorted Sorts the record dimension’s fields according to a given predicate on the field types.
- Template Parameters:
Less – A binary predicate accepting two field types, which exposes a member value. The member value must be true if the first field type is less than the second one, otherwise false.
-
template<typename FlatRecordDim>
using llama::mapping::PermuteFieldsIncreasingAlignment = PermuteFieldsSorted<FlatRecordDim, internal::LessAlignment> Sorts the record dimension fields by increasing alignment of its fields.
-
template<typename FlatRecordDim>
using llama::mapping::PermuteFieldsDecreasingAlignment = PermuteFieldsSorted<FlatRecordDim, internal::MoreAlignment> Sorts the record dimension fields by decreasing alignment of its fields.
-
template<typename FlatRecordDim>
using llama::mapping::PermuteFieldsMinimizePadding = PermuteFieldsIncreasingAlignment<FlatRecordDim> Sorts the record dimension fields by the alignment of its fields to minimize padding.
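To see why sorting fields by alignment can reduce padding, the following standalone sketch computes the size of a record for a given field order, assuming natural alignment of each field and tail padding to the largest alignment. The helper names (FieldSketch, packedSize, sortedByIncreasingAlignment, exampleFields) are hypothetical, and the layout rules are an assumption for illustration, not a description of how LLAMA computes offsets.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Size and alignment of one field, in bytes.
struct FieldSketch {
    std::size_t size;
    std::size_t align;
};

// Size of a record that places `fields` in the given order, aligning each
// field naturally and padding the end to the largest alignment.
template <std::size_t N>
constexpr std::size_t packedSize(const std::array<FieldSketch, N>& fields) {
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (const FieldSketch& f : fields) {
        offset = (offset + f.align - 1) / f.align * f.align; // round up to alignment
        offset += f.size;
        maxAlign = std::max(maxAlign, f.align);
    }
    return (offset + maxAlign - 1) / maxAlign * maxAlign;
}

// Reorders the fields by increasing alignment, keeping ties in original order.
template <std::size_t N>
std::array<FieldSketch, N> sortedByIncreasingAlignment(std::array<FieldSketch, N> fields) {
    std::stable_sort(fields.begin(), fields.end(),
                     [](const FieldSketch& a, const FieldSketch& b) { return a.align < b.align; });
    return fields;
}

// Example record: char, double, char, int (sizes typical for 64-bit platforms).
inline std::array<FieldSketch, 4> exampleFields() {
    return std::array<FieldSketch, 4>{{{1, 1}, {8, 8}, {1, 1}, {4, 4}}};
}

inline void paddingExample() {
    const auto fields = exampleFields();
    assert(packedSize(fields) == 24);                              // declaration order
    assert(packedSize(sortedByIncreasingAlignment(fields)) == 16); // sorted by alignment
}
```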
Common utilities
-
struct LinearizeArrayIndexRight
Functor that maps an ArrayIndex into linear numbers, where the fast moving index should be the rightmost one, which models how C++ arrays work and is analogous to mdspan’s layout_right. E.g. ArrayIndex<3> a; stores 3 indices where a[2] should be incremented in the innermost loop.
Public Functions
-
template<typename ArrayExtents>
inline constexpr auto operator()(const typename ArrayExtents::Index &ai, const ArrayExtents &extents) const -> typename ArrayExtents::value_type - Parameters:
ai – Index in the array dimensions.
extents – Total size of the array dimensions.
- Returns:
Linearized index.
-
struct LinearizeArrayIndexLeft
Functor that maps an ArrayIndex into linear numbers the way Fortran arrays work, analogous to mdspan’s layout_left. The fast moving index of the ArrayIndex object should be the first one. E.g. ArrayIndex<3> a; stores 3 indices where a[0] should be incremented in the innermost loop.
Public Functions
-
template<typename ArrayExtents>
inline constexpr auto operator()(const typename ArrayExtents::Index &ai, const ArrayExtents &extents) const -> typename ArrayExtents::value_type - Parameters:
ai – Index in the array dimensions.
extents – Total size of the array dimensions.
- Returns:
Linearized index.
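The difference between LinearizeArrayIndexRight and LinearizeArrayIndexLeft can be illustrated with a standalone sketch. It uses std::array as a stand-in for ArrayIndex and the extents, and the function names (linearizeRight, linearizeLeft) and the alias IndexSketch are hypothetical, not part of the library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t Rank>
using IndexSketch = std::array<std::size_t, Rank>;

// Row-major (C++ arrays, layout_right): the rightmost index moves fastest.
template <std::size_t Rank>
constexpr std::size_t linearizeRight(const IndexSketch<Rank>& ai, const IndexSketch<Rank>& extents) {
    std::size_t linear = 0;
    for (std::size_t d = 0; d < Rank; ++d)
        linear = linear * extents[d] + ai[d];
    return linear;
}

// Column-major (Fortran arrays, layout_left): the leftmost index moves fastest.
template <std::size_t Rank>
constexpr std::size_t linearizeLeft(const IndexSketch<Rank>& ai, const IndexSketch<Rank>& extents) {
    std::size_t linear = 0;
    for (std::size_t d = Rank; d-- > 0;)
        linear = linear * extents[d] + ai[d];
    return linear;
}

inline void linearizeExample() {
    const IndexSketch<3> extents{4, 5, 6};
    const IndexSketch<3> ai{1, 2, 3};
    assert(linearizeRight(ai, extents) == 45); // 3 + 6 * (2 + 5 * 1)
    assert(linearizeLeft(ai, extents) == 69);  // 1 + 4 * (2 + 5 * 3)
}
```

Incrementing the last component changes the row-major result by one, while incrementing the first component changes the column-major result by one. That is what "fast moving index" refers to.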
-
struct LinearizeArrayIndexMorton
Functor that maps an ArrayIndex into linear numbers using the Z-order space filling curve (Morton codes).
Public Functions
-
template<typename ArrayExtents>
inline constexpr auto operator()(const typename ArrayExtents::Index &ai, [[maybe_unused]] const ArrayExtents &extents) const -> typename ArrayExtents::value_type - Parameters:
ai – Coordinate in the array dimensions.
extents – Total size of the array dimensions.
- Returns:
Linearized index.
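A Morton code interleaves the bits of the individual indices, which is why the extents are not needed for the computation. The sketch below shows the two-dimensional case for 16-bit indices. It is standalone, the names (dilate2, mortonLinearize2D) are hypothetical, and the choice of which index occupies the odd bits is an assumption that may not match the library.

```cpp
#include <cassert>
#include <cstdint>

// Spreads the lower 16 bits of v so that a zero bit sits between each pair
// of neighbouring bits (bit i moves to bit 2 * i).
constexpr std::uint32_t dilate2(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Z-order index of (y, x): bits of x land on even positions, bits of y on odd ones.
constexpr std::uint32_t mortonLinearize2D(std::uint32_t y, std::uint32_t x) {
    return (dilate2(y) << 1) | dilate2(x);
}

// The first four codes trace the characteristic "Z" through a 2x2 block.
static_assert(mortonLinearize2D(0, 0) == 0);
static_assert(mortonLinearize2D(0, 1) == 1);
static_assert(mortonLinearize2D(1, 0) == 2);
static_assert(mortonLinearize2D(1, 1) == 3);
```

With extents that are not powers of two, some codes within the covered range are never produced, so storage sized for the largest code contains gaps.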
Dumping
llama::toSvg, llama::toHtml (no documentation was generated for these functions).
Data access
-
template<typename TMapping, typename TBlobType, typename TAccessor = accessor::Default>
struct View : private TMapping, private llama::accessor::Default Central LLAMA class holding memory for storage and giving access to values stored there defined by a mapping. A view should be created using allocView.
- Template Parameters:
TMapping – The mapping used by the view to map accesses into memory.
TBlobType – The storage type used by the view holding memory.
TAccessor – The accessor to use when an access is made through this view.
Public Functions
-
View() = default
Performs default initialization of the blob array.
-
inline explicit View(Mapping mapping, Array<BlobType, Mapping::blobCount> blobs = {}, Accessor accessor = {})
Creates a LLAMA View manually. Prefer the allocation functions allocView and allocViewUninitialized if possible.
- Parameters:
mapping – The mapping used by the view to map accesses into memory.
blobs – An array of blobs providing storage space for the mapped data.
accessor – The accessor to use when an access is made through this view.
-
inline auto operator()(ArrayIndex ai) const -> decltype(auto)
Retrieves the RecordRef at the given ArrayIndex index.
-
template<typename ...Indices, std::enable_if_t<std::conjunction_v<std::is_convertible<Indices, size_type>...>, int> = 0>
inline auto operator()(Indices... indices) const -> decltype(auto) Retrieves the RecordRef at the ArrayIndex index constructed from the passed component indices.
-
inline auto operator[](ArrayIndex ai) const -> decltype(auto)
Retrieves the RecordRef at the given ArrayIndex index.
-
inline auto operator[](size_type index) const -> decltype(auto)
Retrieves the RecordRef at the 1D ArrayIndex index constructed from the passed index.
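The division of labour between a view and its mapping can be shown with a toy model: the view owns storage, and the mapping decides where each (index, field) pair lives. The sketch below is standalone and heavily simplified (one dimension, two float fields addressed by number, a single std::vector as storage). All names are hypothetical and none of this reflects the actual View interface beyond the general idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record with two float fields, addressed by field number.
constexpr std::size_t fieldCountSketch = 2;

// Array of structs: fields of one record are adjacent.
struct AoSMappingSketch {
    std::size_t extent;
    std::size_t slot(std::size_t index, std::size_t field) const { return index * fieldCountSketch + field; }
};

// Struct of arrays: values of one field are adjacent.
struct SoAMappingSketch {
    std::size_t extent;
    std::size_t slot(std::size_t index, std::size_t field) const { return field * extent + index; }
};

// Toy view: owns the storage and asks the mapping where to look.
template <typename Mapping>
class ViewSketch {
public:
    explicit ViewSketch(Mapping mapping) : mapping_(mapping), storage_(mapping.extent * fieldCountSketch) {}

    float& operator()(std::size_t index, std::size_t field) { return storage_[mapping_.slot(index, field)]; }

    const std::vector<float>& storage() const { return storage_; }

private:
    Mapping mapping_;
    std::vector<float> storage_;
};

inline void viewExample() {
    ViewSketch<AoSMappingSketch> aos(AoSMappingSketch{3});
    ViewSketch<SoAMappingSketch> soa(SoAMappingSketch{3});
    aos(1, 0) = 10.0f;
    soa(1, 0) = 10.0f;
    assert(aos.storage()[2] == 10.0f); // AoS: record 1 starts at slot 2
    assert(soa.storage()[1] == 10.0f); // SoA: field 0 of record 1 is slot 1
}
```

The calling code is identical for both layouts. Only the mapping type changes.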
-
template<typename TStoredParentView>
struct SubView Like a View, but array indices are shifted.
- Template Parameters:
TStoredParentView – Type of the underlying view. May be cv qualified and/or a reference type.
Public Types
-
using ParentView = std::remove_const_t<std::remove_reference_t<StoredParentView>>
Type of the parent view.
Public Functions
-
inline explicit SubView(ArrayIndex offset)
Creates a SubView given an offset. The parent view is default constructed.
-
template<typename StoredParentViewFwd>
inline SubView(StoredParentViewFwd &&parentView, ArrayIndex offset)
-
inline auto operator()(ArrayIndex ai) const -> decltype(auto)
Same as View::operator()(ArrayIndex), but shifted by the offset of this SubView.
Public Members
-
const ArrayIndex offset
Offset by which this view’s ArrayIndex indices are shifted when passed to the parent view.
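The index shifting performed by a sub view can be sketched in a few lines. The example is standalone and one-dimensional, uses a std::vector as the parent, and the class name SubViewSketch is hypothetical. The real SubView works on ArrayIndex values and may store the parent by value or by reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forwards every access to the parent, shifted by a fixed offset.
template <typename ParentView>
class SubViewSketch {
public:
    SubViewSketch(ParentView& parent, std::size_t shift) : offset(shift), parent_(parent) {}

    decltype(auto) operator()(std::size_t i) const { return parent_[i + offset]; }

    const std::size_t offset; // shift applied to every index

private:
    ParentView& parent_;
};

inline void subViewExample() {
    std::vector<int> data{10, 20, 30, 40};
    SubViewSketch<std::vector<int>> sub(data, 2);
    assert(sub(0) == 30); // index 0 of the sub view is index 2 of the parent
    sub(1) = 99;          // writes go through to the parent
    assert(data[3] == 99);
}
```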
-
template<typename TView, typename TBoundRecordCoord, bool OwnView>
struct RecordRef : private TView::Mapping::ArrayExtents::Index Record reference type returned by View after resolving an array dimensions coordinate or partially resolving a RecordCoord. A record reference does not hold data itself, it just binds enough information (array dimensions coord and partial record coord) to retrieve it later from a View. Record references should not be created by the user. They are returned from various access functions in View and RecordRef itself.
Public Types
-
using BoundRecordCoord = TBoundRecordCoord
Record coords into View::RecordDim which are already bound by this RecordRef.
-
using AccessibleRecordDim = GetType<RecordDim, BoundRecordCoord>
Subtree of the record dimension of View starting at BoundRecordCoord. If BoundRecordCoord is RecordCoord<> (default), AccessibleRecordDim is the same as Mapping::RecordDim.
Public Functions
-
inline RecordRef()
Creates an empty RecordRef. Only available if the view is owned. Used by llama::One.
-
template<typename OtherView, typename OtherBoundRecordCoord, bool OtherOwnView>
inline RecordRef(const RecordRef<OtherView, OtherBoundRecordCoord, OtherOwnView> &recordRef) Creates a RecordRef from a different RecordRef. Only available if the view is owned. Used by llama::One.
-
template<typename T, typename = std::enable_if_t<!isRecordRef<T>>>
inline explicit RecordRef(const T &scalar) Creates a RecordRef from a scalar. Only available if the view is owned. Used by llama::One.
-
template<std::size_t... Coord>
inline auto operator()(RecordCoord<Coord...>) const -> decltype(auto) Access a record in the record dimension underneath the current record reference using a RecordCoord. If the access resolves to a leaf, an l-value reference to a variable inside the View storage is returned, otherwise another RecordRef.
-
template<typename ...Tags>
inline auto operator()(Tags...) const -> decltype(auto) Access a record in the record dimension underneath the current record reference using a series of tags. If the access resolves to a leaf, an l-value reference to a variable inside the View storage is returned, otherwise another RecordRef.
-
struct Loader
-
struct LoaderConst
Copying
-
template<typename SrcMapping, typename SrcBlob, typename DstMapping, typename DstBlob>
void llama::copy(const View<SrcMapping, SrcBlob> &srcView, View<DstMapping, DstBlob> &dstView, std::size_t threadId = 0, std::size_t threadCount = 1) Copy data from source to destination view. Both views need to have the same array and record dimensions, but may have different mappings. The blobs need to be read- and writeable. Delegates to Copy to choose an implementation.
- Parameters:
threadId – Optional. Zero-based id of calling thread for multi-threaded invocations.
threadCount – Optional. Thread count in case of multi-threaded invocation.
-
template<typename SrcMapping, typename DstMapping, typename SFINAE = void>
struct Copy Generic implementation of copy defaulting to fieldWiseCopy. LLAMA provides several specializations of this construct for specific mappings. Users are encouraged to also specialize this template with better copy algorithms for further combinations of mappings, if they can and want to provide a better implementation.
-
template<typename SrcMapping, typename SrcBlob, typename DstMapping, typename DstBlob>
void llama::fieldWiseCopy(const View<SrcMapping, SrcBlob> &srcView, View<DstMapping, DstBlob> &dstView, std::size_t threadId = 0, std::size_t threadCount = 1) Field-wise copy from source to destination view. Both views need to have the same array and record dimensions.
- Parameters:
threadId – Optional. Thread id in case of multi-threaded copy.
threadCount – Optional. Thread count in case of multi-threaded copy.
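The role of threadId and threadCount can be illustrated with a standalone sketch that copies from an array-of-structs layout into a struct-of-arrays layout. Each caller copies only its own contiguous slice of elements. The types and the function name (ParticleSketch, ParticlesSoASketch, fieldWiseCopySketch) are hypothetical, and the way work is split between threads is an assumption, not a statement about the library's partitioning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct ParticleSketch {      // source layout: array of structs
    float x;
    float mass;
};

struct ParticlesSoASketch {  // destination layout: struct of arrays
    std::vector<float> x;
    std::vector<float> mass;
};

// Copies the slice of elements belonging to thread `threadId` out of
// `threadCount`, one field at a time. `dst` must already have src.size() elements.
inline void fieldWiseCopySketch(const std::vector<ParticleSketch>& src, ParticlesSoASketch& dst,
                                std::size_t threadId = 0, std::size_t threadCount = 1) {
    const std::size_t n = src.size();
    const std::size_t perThread = (n + threadCount - 1) / threadCount; // rounded up
    const std::size_t begin = std::min(n, threadId * perThread);
    const std::size_t end = std::min(n, begin + perThread);
    for (std::size_t i = begin; i < end; ++i) {
        dst.x[i] = src[i].x;
        dst.mass[i] = src[i].mass;
    }
}

inline void copyExample() {
    const std::vector<ParticleSketch> src{{0.0f, 0.0f}, {1.0f, 10.0f}, {2.0f, 20.0f}};
    ParticlesSoASketch dst{std::vector<float>(3), std::vector<float>(3)};
    fieldWiseCopySketch(src, dst); // single-threaded: copies everything
    assert(dst.x[2] == 2.0f && dst.mass[1] == 10.0f);
}
```

In a multi-threaded setting, every thread would call the function with its own threadId and the shared threadCount. The slices do not overlap, so no synchronization is needed during the copy.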
-
template<typename SrcMapping, typename SrcBlob, typename DstMapping, typename DstBlob>
void llama::aosoaCommonBlockCopy(const View<SrcMapping, SrcBlob> &srcView, View<DstMapping, DstBlob> &dstView, std::size_t threadId = 0, std::size_t threadCount = 1) AoSoA copy strategy which transfers data in common blocks. SoA mappings are also allowed for at most 1 argument.
- Parameters:
threadId – Optional. Zero-based id of calling thread for multi-threaded invocations.
threadCount – Optional. Thread count in case of multi-threaded invocation.
SIMD
-
template<typename Simd, typename SFINAE = void>
struct SimdTraits Traits of a specific Simd implementation. Please specialize this template for the SIMD types you are going to use in your program. Each specialization SimdTraits<Simd> must provide:
an alias value_type to indicate the element type of the Simd.
a static constexpr size_t lanes variable holding the number of SIMD lanes of the Simd.
a static auto loadUnaligned(const value_type* mem) -> Simd function, loading a Simd from the given memory address.
a static void storeUnaligned(Simd simd, value_type* mem) function, storing the given Simd to a given memory address.
a static auto gather(const value_type* mem, std::array<int, lanes> indices) -> Simd function, gathering values into a Simd from the memory addresses identified by mem + indices * sizeof(value_type).
a static void scatter(Simd simd, value_type* mem, std::array<int, lanes> indices) function, scattering the values from a Simd to the memory addresses identified by mem + indices * sizeof(value_type).
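A specialization providing the members listed above might look like the following sketch. To stay self-contained it does not include LLAMA: SimdTraitsSketch is a hypothetical stand-in for the primary template, Float4Sketch is a hypothetical 4-lane vector type emulated with std::array, and the indices passed to gather and scatter are treated as element offsets, which is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane float vector standing in for a real SIMD type.
struct Float4Sketch {
    std::array<float, 4> lane;
};

// Stand-in for the primary template. Specializations supply the members.
template <typename Simd>
struct SimdTraitsSketch;

template <>
struct SimdTraitsSketch<Float4Sketch> {
    using value_type = float;
    static constexpr std::size_t lanes = 4;

    // Loads `lanes` consecutive elements starting at mem.
    static auto loadUnaligned(const value_type* mem) -> Float4Sketch {
        Float4Sketch s{};
        for (std::size_t i = 0; i < lanes; ++i)
            s.lane[i] = mem[i];
        return s;
    }

    // Stores all lanes to consecutive elements starting at mem.
    static void storeUnaligned(Float4Sketch simd, value_type* mem) {
        for (std::size_t i = 0; i < lanes; ++i)
            mem[i] = simd.lane[i];
    }

    // Loads lane i from element mem[indices[i]].
    static auto gather(const value_type* mem, std::array<int, lanes> indices) -> Float4Sketch {
        Float4Sketch s{};
        for (std::size_t i = 0; i < lanes; ++i)
            s.lane[i] = mem[indices[i]];
        return s;
    }

    // Stores lane i to element mem[indices[i]].
    static void scatter(Float4Sketch simd, value_type* mem, std::array<int, lanes> indices) {
        for (std::size_t i = 0; i < lanes; ++i)
            mem[indices[i]] = simd.lane[i];
    }
};
```

A real specialization would replace the loops with the load, store, gather and scatter operations of the SIMD library in use.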
-
template<typename Simd, typename SFINAE = void>
constexpr auto llama::simdLanes = SimdTraits<Simd>::lanes The number of SIMD lanes the given SIMD vector or Simd<T> has. If Simd is not a structural Simd or SimdN, this is a shortcut for SimdTraits<Simd>::lanes.
-
template<typename RecordDim, std::size_t N, template<typename, auto> typename MakeSizedSimd>
using llama::SimdizeN = typename internal::SimdizeNImpl<RecordDim, N, MakeSizedSimd>::type Transforms the given record dimension into a SIMD version of it. Each leaf field type will be replaced by a sized SIMD vector with length N, as determined by MakeSizedSimd. If N is 1, SimdizeN<T, 1, …> is an alias for T.
-
template<typename RecordDim, template<typename> typename MakeSimd>
using llama::Simdize = TransformLeaves<RecordDim, MakeSimd> Transforms the given record dimension into a SIMD version of it. Each leaf field type will be replaced by a SIMD vector, as determined by MakeSimd.
-
template<typename RecordDim, template<typename> typename MakeSimd>
constexpr std::size_t llama::simdLanesWithFullVectorsFor Determines the number of simd lanes suitable to process all types occurring in the given record dimension. The algorithm ensures that even SIMD vectors for the smallest field type are filled completely and may thus require multiple SIMD vectors for some field types.
- Template Parameters:
RecordDim – The record dimension to simdize
MakeSimd – Type function creating a SIMD type given a field type from the record dimension.
-
template<typename RecordDim, template<typename> typename MakeSimd>
constexpr std::size_t llama::simdLanesWithLeastRegistersFor Determines the number of simd lanes suitable to process all types occurring in the given record dimension. The algorithm ensures that the smallest number of SIMD registers is needed and may thus only partially fill registers for some data types.
- Template Parameters:
RecordDim – The record dimension to simdize
MakeSimd – Type function creating a SIMD type given a field type from the record dimension.
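The trade-off between the two lane-count strategies can be made concrete with a small calculation. The sketch below assumes a hypothetical register width of 32 bytes and reads the two descriptions as "largest lane count of any field type" and "smallest lane count of any field type" respectively. That reading, and all names used, are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical SIMD register width in bytes (e.g. a 256-bit register).
constexpr std::size_t registerBytesSketch = 32;

// Number of elements of type T that fit into one register.
template <typename T>
constexpr std::size_t lanesOf = registerBytesSketch / sizeof(T);

// Every register is filled completely: driven by the smallest field type.
template <typename... Fields>
constexpr std::size_t lanesWithFullVectors = std::max({lanesOf<Fields>...});

// Every field type needs only one register: driven by the largest field type.
template <typename... Fields>
constexpr std::size_t lanesWithLeastRegisters = std::min({lanesOf<Fields>...});

// Registers needed to hold `lanes` elements of type T.
template <typename T>
constexpr std::size_t registersFor(std::size_t lanes) {
    return (lanes + lanesOf<T> - 1) / lanesOf<T>;
}

// Record with double, float and 8-bit integer fields:
static_assert(lanesWithFullVectors<double, float, std::int8_t> == 32);
static_assert(lanesWithLeastRegisters<double, float, std::int8_t> == 4);
static_assert(registersFor<double>(32) == 8);     // full vectors: doubles need 8 registers
static_assert(registersFor<std::int8_t>(4) == 1); // least registers: bytes fill 4 of 32 lanes
```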
-
template<typename T, std::size_t N, template<typename, auto> typename MakeSizedSimd>
using llama::SimdN = typename std::conditional_t<isRecordDim<T>, std::conditional_t<N == 1, mp_identity<One<T>>, mp_identity<One<SimdizeN<T, N, MakeSizedSimd>>>>, std::conditional_t<N == 1, mp_identity<T>, mp_identity<SimdizeN<T, N, MakeSizedSimd>>>>::type Creates a SIMD version of the given type. If T is a record dimension, creates a One where each field is a SIMD type of the original field type. The SIMD vectors have length N. If N is 1, an ordinary One of the record dimension T is created. If T is not a record dimension, a SIMD vector with value T and length N is created. If N is 1 (and T is not a record dimension), then T is produced.
-
template<typename T, template<typename> typename MakeSimd>
using llama::Simd = typename std::conditional_t<isRecordDim<T>, mp_identity<One<Simdize<T, MakeSimd>>>, mp_identity<Simdize<T, MakeSimd>>>::type Creates a SIMD version of the given type. If T is a record dimension, creates a One where each field is a SIMD type of the original field type.
-
template<typename T, typename Simd>
inline void llama::loadSimd(const T &srcRef, Simd &dstSimd) Loads SIMD vectors of data starting from the given record reference to dstSimd. Only field tags occurring in RecordRef are loaded. If Simd contains multiple fields of SIMD types, a SIMD vector will be fetched for each of the fields. The number of elements fetched per SIMD vector depends on the SIMD width of the vector. Simd is allowed to have different vector lengths per element.
-
template<typename Simd, typename TFwd>
inline void llama::storeSimd(const Simd &srcSimd, TFwd &&dstRef) Stores SIMD vectors of element data from the given srcSimd into memory starting at the provided record reference. Only field tags occurring in RecordRef are stored. If Simd contains multiple fields of SIMD types, a SIMD vector will be stored for each of the fields. The number of elements stored per SIMD vector depends on the SIMD width of the vector. Simd is allowed to have different vector lengths per element.
-
template<std::size_t N, template<typename, auto> typename MakeSizedSimd, typename View, typename UnarySimdFunction>
void llama::simdForEachN(View &view, UnarySimdFunction f)
-
template<template<typename> typename MakeSimd, template<typename, auto> typename MakeSizedSimd, typename View, typename UnarySimdFunction>
void llama::simdForEach(View &view, UnarySimdFunction f)
Macros
-
LLAMA_INDEPENDENT_DATA
May be put in front of a loop statement. Indicates that all (!) data accesses inside the loop are independent, so the loop can be safely vectorized. Example:
LLAMA_INDEPENDENT_DATA
for(int i = 0; i < N; ++i)
    // because of LLAMA_INDEPENDENT_DATA the compiler knows that a and b
    // do not overlap and the operation can safely be vectorized
    a[i] += b[i];
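Macros of this kind typically expand to a compiler-specific loop pragma. The following standalone sketch shows one way such a macro could be written. SKETCH_INDEPENDENT_DATA is a hypothetical name, and the selection of pragmas is an assumption, not LLAMA's actual definition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro, NOT LLAMA's definition. Clang is tested first because
// it also defines __GNUC__.
#if defined(__clang__)
#    define SKETCH_INDEPENDENT_DATA _Pragma("clang loop vectorize(assume_safety) interleave(assume_safety)")
#elif defined(__GNUC__)
#    define SKETCH_INDEPENDENT_DATA _Pragma("GCC ivdep")
#elif defined(_MSC_VER)
#    define SKETCH_INDEPENDENT_DATA __pragma(loop(ivdep))
#else
#    define SKETCH_INDEPENDENT_DATA
#endif

// The caller promises that a and b do not overlap.
inline void addInPlace(float* a, const float* b, std::size_t n) {
    SKETCH_INDEPENDENT_DATA
    for (std::size_t i = 0; i < n; ++i)
        a[i] += b[i];
}
```

The annotation is a promise by the programmer. If the arrays do overlap, the vectorized loop may produce different results than the scalar one.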
-
LLAMA_FORCE_INLINE
Forces the compiler to inline a function annotated with this macro.
-
LLAMA_UNROLL(...)
Requests the compiler to unroll the loop following this directive. An optional unrolling count may be provided as argument, which must be a constant expression.
-
LLAMA_HOST_ACC
Some offloading parallelization language extensions such as CUDA, OpenACC or OpenMP 4.5 need to specify whether a class, struct, function or method “resides” on the host, the accelerator (the offloading device) or both. LLAMA supports this by marking every function needed on an accelerator with LLAMA_HOST_ACC.
-
LLAMA_FN_HOST_ACC_INLINE
-
LLAMA_LAMBDA_INLINE
Gives strong indication to the compiler to inline the attributed lambda.
-
LLAMA_COPY(x)
Forces a copy of a value. This is useful to prevent ODR usage of constants when compiling for GPU targets.
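The ODR issue arises when a constant is bound to a reference, for example when it is passed to a function taking const references. The sketch below shows the pattern on the host. SKETCH_COPY and LimitsSketch are hypothetical names, and the macro body is an assumption about how such a helper could be written rather than LLAMA's definition.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>

// Hypothetical macro: yields a temporary copy of x instead of x itself.
#define SKETCH_COPY(x) std::decay_t<decltype(x)>(x)

struct LimitsSketch {
    static constexpr int maxCount = 8;
};

inline int clampToLimit(int requested) {
    // std::min takes its arguments by const reference. Passing
    // LimitsSketch::maxCount directly would bind a reference to the constant
    // and thereby ODR-use it. Passing a temporary copy avoids that.
    return std::min(requested, SKETCH_COPY(LimitsSketch::maxCount));
}
```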