Components

Overview

This is an overview of the components available in timemory. For detailed information on member functions and other specifics, please refer to the Doxygen documentation.

The component documentation below is grouped into general subsections and then sorted alphabetically. In general, which member functions are present is not that important as long as you use the variadic component bundlers: when a component without a start() member function is bundled alongside components that have one, the bundler skips the call to start() for that component and invokes it only on the components it was intended for.

Component Basics

Timemory components are C++ structs (classes whose members default to public instead of private) which define a single collection instance. For example, the wall_clock component is written as a simple class with two 64-bit integers and start() and stop() member functions.

// This "component" is for conceptual demonstration only
// It is not intended to be copy+pasted
struct wall_clock
{
    int64_t m_value = 0;
    int64_t m_accum = 0;

    void start();
    void stop();
};

The start() member function records a timestamp and temporarily assigns it to one of the integers. The stop() member function records another timestamp, computes the difference, assigns the difference to the first integer, and adds the difference to the second integer.

void wall_clock::start()
{
    m_value = get_timestamp();
}

void wall_clock::stop()
{
    // compute difference b/t when start and stop were called
    m_value = (get_timestamp() - m_value);
    // accumulate the difference
    m_accum += m_value;
}

Thus, after start() and stop() are invoked twice on the object:

wall_clock foo;

foo.start();
sleep(1); // sleep for 1 second
foo.stop();

foo.start();
sleep(1); // sleep for 1 second
foo.stop();

The first integer (m_value) represents the most recent timing interval of 1 second and the second integer (m_accum) represents the accumulated timing interval totaling 2 seconds. This design not only encapsulates how to take the measurement, but also provides its own data storage model. With this design, timemory measurements naturally support asynchronous data collection.

Additionally, as part of the design for generating the call-graph, call-graphs are accumulated locally on each thread and on each process and merged at the termination of the thread or process. This allows parallel data to be collected free from synchronization overheads. On the worker threads, there is a concept of being at “sea-level”: the call-graph’s relative position based on the base-line of the primary thread in the application. When a worker thread is at sea-level, it reads the position of the call-graph on the primary thread and creates a copy of that entry in its call-graph, ensuring that when merged into the primary thread at the end, the accumulated call-graph across all threads is inserted into the appropriate location. This approach has been found to produce the fewest artifacts.
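Returning to the conceptual wall_clock above, the following is a minimal runnable sketch of the same start()/stop() bookkeeping. It assumes that get_timestamp() returns nanoseconds from std::chrono::steady_clock; the names wall_clock_sketch and time_two_intervals are hypothetical and are not part of timemory.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

// Assumption: timestamps are nanoseconds read from a steady (monotonic) clock
inline int64_t get_timestamp()
{
    using namespace std::chrono;
    return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
}

// Conceptual stand-in only, not the timemory wall_clock implementation
struct wall_clock_sketch
{
    int64_t m_value = 0;  // most recent interval (holds the start timestamp while running)
    int64_t m_accum = 0;  // sum of all completed intervals

    void start() { m_value = get_timestamp(); }

    void stop()
    {
        // compute the difference between when start and stop were called
        m_value = get_timestamp() - m_value;
        // accumulate the difference
        m_accum += m_value;
    }
};

// Runs two timed intervals of `ms` milliseconds each and returns the component
inline wall_clock_sketch time_two_intervals(int ms)
{
    wall_clock_sketch foo;
    for(int i = 0; i < 2; ++i)
    {
        foo.start();
        std::this_thread::sleep_for(std::chrono::milliseconds(ms));
        foo.stop();
    }
    return foo;
}
```

After time_two_intervals(1000) returns, m_value holds roughly one second and m_accum roughly two seconds, both expressed in nanoseconds under the assumption above.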

In general, components do not need to conform to a specific interface. This is a relatively unique approach. Most performance analysis tools which allow user extensions use callbacks and dynamic polymorphism to integrate the user extensions into their workflow. It should be noted that there is nothing preventing a component from creating a similar system, but timemory is designed to query the presence of member function names for feature detection and adapt accordingly to the overloads of that function name and its return type. This is all possible due to the template-based design, which makes extensive use of variadic functions to accept any arguments at a high level and SFINAE to decide at compile time which function to invoke (if a function is invoked at all). For example:

  • component A can contain these member functions:

    • void start()

    • int get()

    • void set_prefix(const char*)

  • component B can contain these member functions:

    • void start()

    • void start(cudaStream_t)

    • double get()

  • component C can contain these member functions:

    • void start()

    • void set_prefix(const std::string&)

And for a given bundle component_tuple<A, B, C> obj:

  • When obj is created, a string identifier, an instance of a source_location struct, or a hash is required

    • This is the label for the measurement

    • If a string is passed, obj generates the hash and adds the hash and the string to a hash-map if it didn’t previously exist

    • A::set_prefix(const char*) will be invoked with the underlying const char* from the string that the hash maps to in the hash-map

    • C::set_prefix(const std::string&) will be invoked with the string that the hash maps to in the hash-map

    • It will be detected that B does not have a member function named set_prefix and no member function will be invoked

  • Invoking obj.start() calls the following member functions on instances of A, B, and C:

    • A::start()

    • B::start()

    • C::start()

  • Invoking obj.start(cudaStream_t) calls the following member functions on instances of A, B, and C:

    • A::start()

    • B::start(cudaStream_t)

    • C::start()

  • Invoking obj.get():

    • Returns std::tuple<int, double> because it detects the two return types from A and B and the lack of get() member function in component C.
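The dispatch rules listed above can be illustrated with a small, self-contained sketch. This is not the timemory implementation: mini_bundle, invoke_start, priority, has_get, and get_one are hypothetical names, an int stands in for cudaStream_t, and the sketch assumes C++17 (std::apply, std::void_t, fold expressions, if constexpr). It only shows how expression SFINAE can select start(args...), fall back to start(), or call nothing, and how get() can collect only the return types that exist.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical components mirroring A, B, and C (an int stands in for cudaStream_t)
struct A
{
    int  started = 0;
    void start() { ++started; }
    int  get() const { return 1; }
};

struct B
{
    int    started = 0, stream_started = 0;
    void   start() { ++started; }
    void   start(int /*stream*/) { ++stream_started; }
    double get() const { return 2.5; }
};

struct C
{
    int  started = 0;
    void start() { ++started; }
};

// Overload ranking: priority<2> is tried first, then priority<1>, then priority<0>
template <int N>
struct priority : priority<N - 1>
{};
template <>
struct priority<0>
{};

// 1st choice: the component has a start(...) accepting the given arguments
template <typename T, typename... Args>
auto invoke_start(priority<2>, T& obj, Args&... args) -> decltype(obj.start(args...), void())
{
    obj.start(args...);
}

// 2nd choice: the component only has start() without arguments
template <typename T, typename... Args>
auto invoke_start(priority<1>, T& obj, Args&...) -> decltype(obj.start(), void())
{
    obj.start();
}

// Last resort: the component has no usable start(), so nothing is called
template <typename T, typename... Args>
void invoke_start(priority<0>, T&, Args&...)
{}

// Detects whether T has a get() member function callable on a const object
template <typename T, typename = void>
struct has_get : std::false_type
{};
template <typename T>
struct has_get<T, std::void_t<decltype(std::declval<const T&>().get())>> : std::true_type
{};

// Contributes a one-element tuple if get() exists, otherwise an empty tuple
template <typename T>
auto get_one(const T& obj)
{
    if constexpr(has_get<T>::value)
        return std::make_tuple(obj.get());
    else
        return std::tuple<>{};
}

template <typename... Types>
struct mini_bundle
{
    std::tuple<Types...> data;

    // Expands into one direct call per component; the overload is chosen at compile time
    template <typename... Args>
    void start(Args... args)
    {
        std::apply(
            [&](auto&... comp) { (invoke_start(priority<2>{}, comp, args...), ...); }, data);
    }

    // Concatenates the results of every component that provides get()
    auto get() const
    {
        return std::apply(
            [](const auto&... comp) { return std::tuple_cat(get_one(comp)...); }, data);
    }
};
```

With mini_bundle<A, B, C> obj, calling obj.start(7) reaches A::start(), B::start(int), and C::start(), and obj.get() has the type std::tuple<int, double>. No loop or runtime feature check remains after template expansion, which is the property the list above describes.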

This design has several benefits and one downside in particular. The benefits are that timemory: (1) makes it extremely easy to create a unified interface between two or more components with different interfaces/capabilities, (2) invokes the different interfaces efficiently since no feature detection logic is required at runtime, and (3) lets components define their own interface.

With respect to #2, consider the two more traditional implementations. If callbacks are used, a function pointer exists and a component which does not implement a feature will either have a null function pointer (requiring a check at run time) or the tool will implement an array of function pointers with an unknown size at compile time. The latter case requires heap allocations (which are expensive operations) and, in both cases, the loop over the function pointers will likely be quite inefficient since function pointers have a very high probability of thrashing the instruction cache. If dynamic polymorphism is used, then virtual table look-ups are required during every iteration. In the timemory approach, none of these additional overheads are present and there isn’t even a loop: the bundle either expands into a direct call to the member function without any abstractions or expands into nothing.

With respect to #1 and #3, this has some interesting implications with regard to a universal instrumentation interface and is discussed in the following section and the CONTRIBUTING.md documentation.

The aforementioned downside is a byproduct of all this flexibility and adaptation to the custom interface of each component: directly using the template interface can take quite a long time to compile.

Component Metadata

template<int Idx>
struct tim::component::enumerator : public tim::component::properties<placeholder<nothing>>

This is a critical specialization for mapping strings and integers to component types at runtime (it should always be specialized alongside tim::component::properties) and it is also critical for performing template metaprogramming “loops” over all the components. E.g.:

template <size_t Idx>
using Enumerator_t = typename tim::component::enumerator<Idx>::type;

template <size_t... Idx>
auto init(std::index_sequence<Idx...>)
{
    // expand for [0, TIMEMORY_COMPONENTS_END)
    TIMEMORY_FOLD_EXPRESSION(tim::storage_initializer::get<
        Enumerator_t<Idx>>());
}

void init()
{
    init(std::make_index_sequence<TIMEMORY_COMPONENTS_END>{});
}

tparam Idx

Enumeration value
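To make the index-to-type “loop” more concrete, here is a self-contained sketch of the same pattern with stand-in types. Everything in it is hypothetical (component_id, enumerator_like, wall_clock_like, cpu_clock_like, placeholder_like, collect_labels); it only assumes that each enumeration value maps to a type through a template specialization and that unspecialized indices fall back to a placeholder.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical enumeration and component types (not the timemory definitions)
enum component_id : int
{
    WALL_CLOCK_ID = 0,
    CPU_CLOCK_ID,
    COMPONENTS_END
};

struct wall_clock_like
{
    static std::string label() { return "wall"; }
};

struct cpu_clock_like
{
    static std::string label() { return "cpu"; }
};

struct placeholder_like
{
    static std::string label() { return ""; }
};

// Primary template: indices without a specialization map to the placeholder
template <int Idx>
struct enumerator_like
{
    using type = placeholder_like;
};

template <>
struct enumerator_like<WALL_CLOCK_ID>
{
    using type = wall_clock_like;
};

template <>
struct enumerator_like<CPU_CLOCK_ID>
{
    using type = cpu_clock_like;
};

// Compile-time "loop": the index pack is expanded once per enumeration value
template <std::size_t... Idx>
std::array<std::string, sizeof...(Idx)> collect_labels(std::index_sequence<Idx...>)
{
    return { { enumerator_like<static_cast<int>(Idx)>::type::label()... } };
}

// Expands for [0, COMPONENTS_END)
inline std::array<std::string, COMPONENTS_END> collect_labels()
{
    return collect_labels(std::make_index_sequence<COMPONENTS_END>{});
}
```

The pack expansion visits every index in order, so adding a new enumeration value and its specialization extends the "loop" without touching collect_labels().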

Public Types

using type = placeholder<nothing>
using value_type = TIMEMORY_COMPONENT

Public Functions

inline bool operator==(int) const
inline bool operator==(const char*) const
inline bool operator==(const std::string&) const
inline void serialize(Archive&, const unsigned int)
inline TIMEMORY_COMPONENT operator()()
inline constexpr operator TIMEMORY_COMPONENT() const

Public Static Functions

static inline constexpr bool specialized()
static inline constexpr const char *enum_string()
static inline constexpr const char *id()
static inline idset_t ids()

Public Static Attributes

static constexpr bool value = false

template<typename Tp>
struct tim::component::metadata

Provides forward declaration support for assigning static metadata properties. This is most useful for specialization of template components. If this class is specialized for a component, then the component does not need to provide the static member functions label() and description().

Public Types

using type = Tp
using value_type = TIMEMORY_COMPONENT

Public Static Functions

static std::string name()
static std::string label()
static std::string description()
static inline std::string extra_description()
static inline constexpr bool specialized()

Public Static Attributes

static constexpr TIMEMORY_COMPONENT value = TIMEMORY_COMPONENTS_END

template<typename Tp>
struct tim::component::properties : public tim::component::static_properties<Tp>

This is a critical specialization for mapping strings and integers to component types at runtime. The enum_string() function is the enum id as a string. The id() function is (typically) the name of the C++ component as a string. The ids() function returns a set of strings which are alternative string identifiers to the enum string or the string ID. Additionally, it provides serialization of these values.

tparam Tp

Component type

A macro is provided to simplify this specialization:

TIMEMORY_PROPERTY_SPECIALIZATION(wall_clock, TIMEMORY_WALL_CLOCK, "wall_clock",
                                 "real_clock", "virtual_clock")

In the above, the first parameter is the C++ type, the second is the enumeration id, the enum string is automatically generated via preprocessor # on the second parameter, the third parameter is the string ID, and the remaining values are placed in the ids(). Additionally, this macro specializes the tim::component::enumerator.
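As a rough illustration of what such a specialization provides, the sketch below hand-writes a stand-in for one component. It is not the expansion of TIMEMORY_PROPERTY_SPECIALIZATION: properties_like, component_enum, SKETCH_WALL_CLOCK, wall_clock_like, and the matches() helpers are hypothetical names, and std::set<std::string> is only assumed as a plausible shape for the set of alternative identifiers.

```cpp
#include <set>
#include <string>

// Hypothetical stand-ins; not the timemory types or the macro's actual expansion
enum component_enum : int
{
    SKETCH_WALL_CLOCK = 0,
    SKETCH_COMPONENTS_END
};

struct wall_clock_like
{};

// Primary template: a type without a specialization reports specialized() == false
template <typename Tp>
struct properties_like
{
    static constexpr bool specialized() { return false; }
};

template <>
struct properties_like<wall_clock_like>
{
    static constexpr component_enum value = SKETCH_WALL_CLOCK;

    static constexpr bool        specialized() { return true; }
    static constexpr const char* enum_string() { return "SKETCH_WALL_CLOCK"; }
    static constexpr const char* id() { return "wall_clock"; }
    static std::set<std::string> ids() { return { "real_clock", "virtual_clock" }; }

    // String lookup: accept the enum string, the string ID, or any alternative ID
    static bool matches(const std::string& key)
    {
        return key == enum_string() || key == id() || ids().count(key) > 0;
    }

    // Integer lookup: accept the enumeration value
    static bool matches(int idx) { return idx == static_cast<int>(value); }
};
```

A runtime lookup can then test a user-provided string or integer against each specialization in turn, which is the role the text assigns to the string and enumeration identifiers.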

Public Types

using type = Tp
using value_type = TIMEMORY_COMPONENT

Public Functions

template<typename Archive>
inline void serialize(Archive&, const unsigned int)
inline TIMEMORY_COMPONENT operator()()
inline constexpr operator TIMEMORY_COMPONENT() const

Public Static Functions

static inline constexpr bool specialized()
static inline constexpr const char *enum_string()
static inline constexpr const char *id()
static inline idset_t ids()

Public Static Attributes

static constexpr TIMEMORY_COMPONENT value = TIMEMORY_COMPONENTS_END

template<typename Tp, bool PlaceHolder = concepts::is_placeholder<Tp>::value>
struct static_properties

Provides three variants of a matches function for determining if a component is identified by a given string or enumeration value.

tparam Tp

Component type

tparam PlaceHolder

Whether or not the component type is a placeholder type that should be ignored during runtime initialization.

Subclassed by tim::component::properties< placeholder< nothing > >, tim::component::properties< Tp >

Timing Components

struct cpu_clock : public tim::component::base<cpu_clock>

This component extracts only the CPU time spent in both user- and kernel-mode. It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct cpu_util : public tim::component::base<cpu_util, std::pair<int64_t, int64_t>>

This computes the CPU utilization percentage for the calling process and child processes. It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct kernel_mode_time : public tim::component::base<kernel_mode_time, int64_t>

This is the total amount of time spent executing in kernel mode.

struct monotonic_clock : public tim::component::base<monotonic_clock>

Clock that increments monotonically, tracking the time since an arbitrary point, and that continues to increment while the system is asleep.

struct monotonic_raw_clock : public tim::component::base<monotonic_raw_clock>

Clock that increments monotonically, tracking the time since an arbitrary point like CLOCK_MONOTONIC. However, this clock is unaffected by frequency or time adjustments. It should not be compared to other system time sources.
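For readers unfamiliar with these clocks, the following sketch reads them directly through POSIX clock_gettime. It assumes a POSIX system; CLOCK_MONOTONIC_RAW is not available everywhere, so the sketch falls back to CLOCK_MONOTONIC when the identifier is not defined. The helper names (read_clock_ns, monotonic_ns, monotonic_raw_ns) are hypothetical, and no claim is made that timemory reads the clocks in exactly this way.

```cpp
#include <cstdint>
#include <time.h>

// Reads the given POSIX clock in nanoseconds; returns -1 if the clock is unavailable
inline int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;
    if(clock_gettime(id, &ts) != 0)
        return -1;
    return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

inline int64_t monotonic_ns() { return read_clock_ns(CLOCK_MONOTONIC); }

inline int64_t monotonic_raw_ns()
{
#if defined(CLOCK_MONOTONIC_RAW)
    return read_clock_ns(CLOCK_MONOTONIC_RAW);
#else
    // fallback for platforms that do not define the raw clock
    return read_clock_ns(CLOCK_MONOTONIC);
#endif
}
```

As with the other clocks, a single reading is measured from an arbitrary point and is only meaningful once it is subtracted from a later reading of the same clock.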

struct process_cpu_clock : public tim::component::base<process_cpu_clock>

This clock measures the CPU time within the current process (excludes child processes). It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct process_cpu_util : public tim::component::base<process_cpu_util, std::pair<int64_t, int64_t>>

This computes the CPU utilization percentage for ONLY the calling process (excludes child processes). It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.
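The idea of a utilization percentage as a ratio of two differences can be sketched as follows. This is an illustration only: it assumes a POSIX system, where std::clock() reports the processor time consumed by the calling process, and it uses std::chrono::steady_clock for the wall-clock interval. The names cpu_util_sample and measure_cpu_util are hypothetical.

```cpp
#include <chrono>
#include <ctime>

// CPU time and wall-clock time over the same interval, both in seconds
struct cpu_util_sample
{
    double cpu_sec  = 0.0;
    double wall_sec = 0.0;

    // utilization = CPU time difference / wall-clock time difference
    double percent() const { return (wall_sec > 0.0) ? 100.0 * cpu_sec / wall_sec : 0.0; }
};

// Runs `work` and records the difference of each clock across the call
template <typename Func>
cpu_util_sample measure_cpu_util(Func&& work)
{
    const std::clock_t cpu_beg  = std::clock();
    const auto         wall_beg = std::chrono::steady_clock::now();
    work();
    const std::clock_t cpu_end  = std::clock();
    const auto         wall_end = std::chrono::steady_clock::now();

    cpu_util_sample s;
    s.cpu_sec  = static_cast<double>(cpu_end - cpu_beg) / CLOCKS_PER_SEC;
    s.wall_sec = std::chrono::duration<double>(wall_end - wall_beg).count();
    return s;
}
```

Neither std::clock() value is meaningful on its own; only the difference across the measured region is, which is the caveat repeated throughout this section.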

struct system_clock : public tim::component::base<system_clock>

This component extracts only the CPU time spent in kernel-mode. It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct thread_cpu_clock : public tim::component::base<thread_cpu_clock>

This clock measures the CPU time within the current thread (excludes sibling/child threads). It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct thread_cpu_util : public tim::component::base<thread_cpu_util, std::pair<int64_t, int64_t>>

This computes the CPU utilization percentage for ONLY the calling thread (excludes sibling and child threads). It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct user_clock : public tim::component::base<user_clock>

This component extracts only the CPU time spent in user-mode. It is only relevant as a time when a difference is computed. Do not use a single CPU time as an amount of time; it doesn’t work that way.

struct user_mode_time : public tim::component::base<user_mode_time, int64_t>

This is the total amount of time spent executing in user mode.

struct wall_clock : public tim::component::base<wall_clock, int64_t>

Resource Usage Components

struct current_peak_rss : public tim::component::base<current_peak_rss, std::pair<int64_t, int64_t>>

This struct extracts the absolute value of the high-water mark of the resident set size (RSS) at the start and stop points. RSS is the current amount of memory in RAM.

struct num_io_in : public tim::component::base<num_io_in>

The number of times the file system had to perform input.

struct num_io_out : public tim::component::base<num_io_out>

The number of times the file system had to perform output.

struct num_major_page_faults : public tim::component::base<num_major_page_faults>

The number of page faults serviced that required I/O activity.

struct num_minor_page_faults : public tim::component::base<num_minor_page_faults>

The number of page faults serviced without any I/O activity; here I/O activity is avoided by reclaiming a page frame from the list of pages awaiting reallocation.

struct page_rss : public tim::component::base<page_rss, int64_t>

This struct measures the resident set size (RSS) currently allocated in pages of memory. Unlike peak_rss, this value will fluctuate as memory is freed and allocated.

struct peak_rss : public tim::component::base<peak_rss>

This struct extracts the high-water mark (or a change in the high-water mark) of the resident set size (RSS), which is the current amount of memory in RAM. When used on a system with swap enabled, this value may fluctuate, but it should not on an HPC system.

struct priority_context_switch : public tim::component::base<priority_context_switch>

The number of times a context switch resulted from a higher priority process becoming runnable or from the current process exceeding its time slice.

struct virtual_memory : public tim::component::base<virtual_memory>

This struct extracts the virtual memory usage.

struct voluntary_context_switch : public tim::component::base<voluntary_context_switch>

The number of times a context switch resulted from a process voluntarily giving up the processor before its time slice was completed (usually to await availability of a resource).
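Several of the descriptions in this section closely match the fields documented for POSIX getrusage. As a hedged illustration, and assuming (not confirmed by this page) that those fields are the underlying data source, the sketch below takes one snapshot of the relevant counters. The struct rusage_snapshot and the function read_rusage are hypothetical names.

```cpp
#include <sys/resource.h>

// Snapshot of the rusage fields whose descriptions resemble the components above
struct rusage_snapshot
{
    long peak_rss          = 0;  // ru_maxrss (kilobytes on Linux, bytes on macOS)
    long minor_page_faults = 0;  // ru_minflt
    long major_page_faults = 0;  // ru_majflt
    long io_in             = 0;  // ru_inblock
    long io_out            = 0;  // ru_oublock
    long voluntary_cs      = 0;  // ru_nvcsw
    long priority_cs       = 0;  // ru_nivcsw
    bool valid             = false;
};

// Reads the counters for the calling process; `valid` is false if the call failed
inline rusage_snapshot read_rusage()
{
    rusage_snapshot snap;
    struct rusage   ru;
    if(getrusage(RUSAGE_SELF, &ru) == 0)
    {
        snap.peak_rss          = ru.ru_maxrss;
        snap.minor_page_faults = ru.ru_minflt;
        snap.major_page_faults = ru.ru_majflt;
        snap.io_in             = ru.ru_inblock;
        snap.io_out            = ru.ru_oublock;
        snap.voluntary_cs      = ru.ru_nvcsw;
        snap.priority_cs       = ru.ru_nivcsw;
        snap.valid             = true;
    }
    return snap;
}
```

Taking one snapshot before and one after a region and subtracting the fields gives the per-region counts; the peak RSS field is a high-water mark, so its difference only shows growth of the peak.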

I/O Components

struct read_bytes : public tim::component::base<read_bytes, std::pair<int64_t, int64_t>>

I/O counter for bytes read. Attempt to count the number of bytes which this process really did cause to be fetched from the storage layer. Done at the submit_bio() level, so it is accurate for block-backed filesystems.

struct read_char : public tim::component::base<read_char, std::pair<int64_t, int64_t>>

I/O counter for chars read. The number of bytes which this task has caused to be read from storage. This is simply the sum of bytes which this process passed to read() and pread(). It includes things like tty IO and it is unaffected by whether or not actual physical disk IO was required (the read might have been satisfied from pagecache).

struct written_bytes : public tim::component::base<written_bytes, std::array<int64_t, 2>>

I/O counter for bytes written. Attempt to count the number of bytes which this process caused to be sent to the storage layer. This is done at page-dirtying time.

struct written_char : public tim::component::base<written_char, std::array<int64_t, 2>>

I/O counter for chars written. The number of bytes which this task has caused, or shall cause to be written to disk. Similar caveats apply here as with tim::component::read_char (rchar).
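On Linux, counters with these meanings are exposed as "key: value" lines in /proc/<pid>/io under the keys rchar, wchar, read_bytes, and write_bytes. The sketch below parses that layout from any input stream. It is an illustration under the assumption that this file is the data source; parse_proc_io is a hypothetical helper, not a timemory function.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Parses "key: value" lines, the layout used by /proc/<pid>/io on Linux
inline std::map<std::string, int64_t> parse_proc_io(std::istream& in)
{
    std::map<std::string, int64_t> counters;
    std::string                    line;
    while(std::getline(in, line))
    {
        const auto pos = line.find(':');
        if(pos == std::string::npos)
            continue;  // skip lines that are not key/value pairs

        std::istringstream value(line.substr(pos + 1));
        int64_t            n = 0;
        if(value >> n)
            counters[line.substr(0, pos)] = n;
    }
    return counters;
}
```

Passing a std::ifstream opened on /proc/self/io yields the counters for the calling process. Reading the file at the start and stop points and subtracting the values gives the per-region totals.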

User Bundle Components

Timemory provides the user_bundle component as a generic component bundler that the user can use to insert components at runtime. This component is heavily used when mapping timemory to languages other than C++. Timemory implements many specializations of this template class for various tools. For example, user_mpip_bundle is the bundle used by the MPI wrappers, user_profiler_bundle is used by the Python function profiler, user_trace_bundle is used by the dynamic instrumentation tool timemory-run and the Python line tracing profiler, etc. These specializations are all individually configurable and it is recommended that applications create their own specialization specific to their project. This ensures that the set of components configured by your application will not be affected by a third-party library configuring its own set of components.

The general design is that each user-bundle:

  • Has its own unique environment variable for exclusive configuration, usually "TIMEMORY_<LABEL>_COMPONENTS", e.g.:

    • "TIMEMORY_TRACE_COMPONENTS" for user_trace_bundle

    • "TIMEMORY_MPIP_COMPONENTS" for user_mpip_bundle

  • If the unique environment variable is set, only the components in the variable are used

    • Thus making the bundle uniquely configurable

  • If the unique environment variable is not set, it searches one or more backup environment variables, the last of which is "TIMEMORY_GLOBAL_COMPONENTS"

    • Thus, if no specific environment variables are set, all user bundles collect the components specified in "TIMEMORY_GLOBAL_COMPONENTS"

  • If the unique environment variable is set to "none", it terminates searching the backup environment variables

    • Thus, "TIMEMORY_GLOBAL_COMPONENTS" can be set but the user can suppress a specific bundle from being affected by this configuration

  • If the unique environment variable contains "fallthrough", it will continue adding the components specified by the backup environment variables

    • Thus, the components specified in "TIMEMORY_GLOBAL_COMPONENTS" and "TIMEMORY_<LABEL>_COMPONENTS" will be added
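The lookup rules above can be read as a small resolution algorithm. The sketch below is one interpretation of those rules, not the timemory implementation: resolve_components and env_map are hypothetical names, a std::map stands in for the process environment so the logic can be exercised without setting real variables, and a comma is assumed as the delimiter between component names.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the process environment: variable name -> value
using env_map = std::map<std::string, std::string>;

// Resolves the component list for one bundle. `vars` is ordered: the unique
// variable first, then the backup variables, ending with the global variable.
inline std::vector<std::string> resolve_components(const std::vector<std::string>& vars,
                                                   const env_map&                  env)
{
    std::vector<std::string> result;
    for(const auto& var : vars)
    {
        auto itr = env.find(var);
        if(itr == env.end() || itr->second.empty())
            continue;  // not set: try the next backup variable

        bool               fallthrough = false;
        std::istringstream iss(itr->second);
        std::string        token;
        while(std::getline(iss, token, ','))
        {
            if(token == "none")
                return result;  // terminate the search entirely
            if(token == "fallthrough")
                fallthrough = true;
            else if(!token.empty())
                result.push_back(token);
        }
        if(!fallthrough)
            break;  // a set variable is exclusive unless it requests fallthrough
    }
    return result;
}
```

For example, with vars = { "TIMEMORY_TRACE_COMPONENTS", "TIMEMORY_GLOBAL_COMPONENTS" }, setting only the global variable configures the bundle from it, setting the trace variable to "none" yields no components, and adding "fallthrough" to the trace variable appends the global components to the bundle-specific ones.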

template<size_t Idx, typename Tag>
struct user_bundle : public tim::component::base<user_bundle<Idx, Tag>, void>, public tim::concepts::runtime_configurable, private tim::component::internal::user_bundle

Third-Party Interface Components

struct allinea_map : public tim::component::base<allinea_map, void>, private tim::policy::instance_tracker<allinea_map, false>

Controls the AllineaMap sampling profiler.

struct caliper_marker : public tim::component::base<caliper_marker, void>, public tim::component::caliper_common

Standard marker for the Caliper Performance Analysis Toolbox.

struct caliper_config : public tim::component::base<caliper_config, void>, private tim::policy::instance_tracker<caliper_config, false>

Component which provides Caliper cali::ConfigManager.

struct caliper_loop_marker : public tim::component::base<caliper_loop_marker, void>, public tim::component::caliper_common

Loop marker for the Caliper Performance Analysis Toolbox.

struct craypat_counters : public tim::component::base<craypat_counters, std::vector<unsigned long>>

Retrieves the names and values of any counter events that have been set to count on the hardware category.

struct craypat_flush_buffer : public tim::component::base<craypat_flush_buffer, unsigned long>

Writes all the recorded contents in the data buffer. Returns the number of bytes flushed.

struct craypat_heap_stats : public tim::component::base<craypat_heap_stats, void>

Dumps the craypat heap statistics.

struct craypat_record : public tim::component::base<craypat_record, void>, private tim::policy::instance_tracker<craypat_record>

Provides scoping for the CrayPAT profiler. Global initialization stops the profiler; the first call to start() starts the profiler again on the calling thread. Instance counting is enabled per-thread and each call to start() increments the counter. Calls to stop() have no effect until the counter reaches zero, at which point the profiler is turned off again.

struct craypat_region : public tim::component::base<craypat_region, void>, private tim::policy::instance_tracker<craypat_region, false>

Adds a region label to the CrayPAT profiling output.
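The per-thread instance counting described for craypat_record can be sketched without any CrayPAT dependency. In the sketch below, fake_profiler is a hypothetical stand-in for the external profiler's on/off switch and scoped_record is a hypothetical component; the only assumption taken from the text is that start() increments a thread-local counter and that the profiler is switched off again when matching stop() calls bring the counter back to zero.

```cpp
// Hypothetical stand-in for an external profiler's per-thread on/off state
struct fake_profiler
{
    static bool& active()
    {
        static thread_local bool v = false;
        return v;
    }
};

// Sketch of per-thread instance counting around a profiler toggle
struct scoped_record
{
    // number of start() calls on this thread not yet matched by stop()
    static int& count()
    {
        static thread_local int n = 0;
        return n;
    }

    void start()
    {
        // only the first outstanding start() on this thread turns the profiler on
        if(count()++ == 0)
            fake_profiler::active() = true;
    }

    void stop()
    {
        // only the stop() that brings the counter back to zero turns it off
        if(count() > 0 && --count() == 0)
            fake_profiler::active() = false;
    }
};
```

Nested regions therefore keep the profiler running until the outermost region on that thread has stopped.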

struct gperftools_cpu_profiler : public tim::component::base<gperftools_cpu_profiler, void>
struct gperftools_heap_profiler : public tim::component::base<gperftools_heap_profiler, void>
struct likwid_marker : public tim::component::base<likwid_marker, void>

Provides likwid perfmon marker forwarding. Requires.

struct likwid_nvmarker : public tim::component::base<likwid_nvmarker, void>

Provides likwid nvmon marker forwarding. Requires.

template<typename Api>
struct ompt_handle : public tim::component::base<ompt_handle<Api>, void>, private tim::policy::instance_tracker<ompt_handle<Api>>
struct tau_marker : public tim::component::base<tau_marker, void>

Forwards timemory labels to TAU (Tuning and Analysis Utilities).

struct vtune_event : public tim::component::base<vtune_event, void>

Implements __itt_event

struct vtune_frame : public tim::component::base<vtune_frame, void>

Implements __itt_domain

struct vtune_profiler : public tim::component::base<vtune_profiler, void>, private tim::policy::instance_tracker<vtune_profiler, false>

Implements __itt_pause() and __itt_resume() to control where the VTune profiler is active.

Hardware Counter Components

template<int... EventTypes>
struct papi_tuple : public tim::component::base<papi_tuple<EventTypes...>, std::array<long long, sizeof...(EventTypes)>>, private tim::policy::instance_tracker<papi_tuple<EventTypes...>>, private tim::component::papi_common

This component is useful for bundling together a fixed set of hardware counter identifiers which require no runtime configuration.

// the "Instructions" alias below explicitly collects the total instructions,
// the number of load instructions, the number of store instructions
using Instructions = papi_tuple<PAPI_TOT_INS, PAPI_LD_INS, PAPI_SR_INS>;

Instructions inst{};
inst.start();
...
inst.stop();
std::vector<double> data = inst.get();

tparam EventTypes

Compile-time constant list of PAPI event identifiers

template<typename RateT, int... EventTypes>
struct papi_rate_tuple : public tim::component::base<papi_rate_tuple<RateT, EventTypes...>, std::pair<papi_tuple<EventTypes...>, RateT>>, private tim::component::papi_common

This component pairs a tim::component::papi_tuple with a component which will provide an interval over which the hardware counters will be reported, e.g. if RateT is tim::component::wall_clock, the reported values will be the hardware-counters w.r.t. the wall-clock time. If RateT is tim::component::cpu_clock, the reported values will be the hardware counters w.r.t. the cpu time.

// the "Instructions" alias below explicitly collects the total instructions per
// second, the number of load instructions per second, and the number of store
// instructions per second
using Instructions = papi_rate_tuple<wall_clock, PAPI_TOT_INS, PAPI_LD_INS,
                                     PAPI_SR_INS>;

Instructions inst{};
inst.start();
...
inst.stop();
std::vector<double> data = inst.get();

tparam RateT

Component whose value will be the divisor for all the hardware counters

tparam EventTypes

Compile-time constant list of PAPI event identifiers

template<size_t MaxNumEvents>
struct papi_array : public tim::component::base<papi_array<MaxNumEvents>, std::array<long long, MaxNumEvents>>, private tim::policy::instance_tracker<papi_array<MaxNumEvents>>, private tim::component::papi_common
struct papi_vector : public tim::component::base<papi_vector, std::vector<long long>>, private tim::policy::instance_tracker<papi_vector>, private tim::component::papi_common

Miscellaneous Components

template<typename ...Types>
struct cpu_roofline : public tim::component::base<cpu_roofline<Types...>, std::pair<std::vector<long long>, double>>

Combines hardware counters and timers and executes the empirical roofline toolkit during application termination to estimate the peak possible performance for the machine.

tparam Types

Variadic list of data types for roofline analysis

typedef cpu_roofline<double> tim::component::cpu_roofline_dp_flops

A specialization of tim::component::cpu_roofline for 64-bit floating point operations.

using tim::component::cpu_roofline_flops = cpu_roofline<float, double>
typedef cpu_roofline<float> tim::component::cpu_roofline_sp_flops

A specialization of tim::component::cpu_roofline for 32-bit floating point operations.

GPU Components

struct cuda_event : public tim::component::base<cuda_event, float>

Records the time interval between two points in a CUDA stream. Less accurate than ‘cupti_activity’ for kernel timing but does not require linking to the CUDA driver.

struct cupti_activity : public tim::component::base<cupti_activity, intmax_t>

CUPTI activity tracing component for high-precision kernel timing. For low-precision kernel timing, use tim::component::cuda_event component.

struct cupti_counters : public tim::component::base<cupti_counters, cupti::profiler::results_t>

NVprof-style hardware counters via the CUPTI callback API. Collecting these hardware counters has a higher overhead than the new CUPTI Profiling API (tim::component::cupti_profiler). However, there are currently some issues with nesting the Profiling API and it is currently recommended to use this component for NVIDIA hardware counters in timemory. The callback API / NVprof is quite specific about the distinction between an “event” and a “metric”. For your convenience, timemory removes this distinction: events can be specified arbitrarily as metrics and vice-versa, and this component will sort them into their appropriate category. For the full list of the available events/metrics, use timemory-avail -H from the command-line.

template<typename ...Types>
struct gpu_roofline : public tim::component::base<gpu_roofline<Types...>, std::tuple<cupti_activity::value_type, cupti_counters::value_type>>

Combines hardware counters and timers and executes the empirical roofline toolkit during application termination to estimate the peak possible performance for the machine.

tparam Types

Variadic list of data types for roofline analysis

typedef gpu_roofline<double> tim::component::gpu_roofline_dp_flops

A specialization of tim::component::gpu_roofline for 64-bit floating point operations.

using tim::component::gpu_roofline_flops = gpu_roofline<float, double>
typedef gpu_roofline<cuda::fp16_t> tim::component::gpu_roofline_hp_flops

A specialization of tim::component::gpu_roofline for 16-bit floating point operations (depending on availability).

typedef gpu_roofline<float> tim::component::gpu_roofline_sp_flops

A specialization of tim::component::gpu_roofline for 32-bit floating point operations.

struct tim::component::nvtx_marker : public tim::component::base<nvtx_marker, void>

Inserts NVTX markers with the current timemory prefix. The default color scheme is a round-robin of red, blue, green, yellow, purple, cyan, pink, and light_green. These colors.

Public Functions

inline explicit nvtx_marker(const nvtx::color::color_t &_color)

Construct with a specific color.

inline explicit nvtx_marker(cuda::stream_t _stream)

Construct with a specific CUDA stream.

inline nvtx_marker(const nvtx::color::color_t &_color, cuda::stream_t _stream)

Construct with a specific color and CUDA stream.

inline void start()

start an nvtx range. Equivalent to nvtxRangeStartEx

inline void stop()

stop the nvtx range. Equivalent to nvtxRangeEnd. Depending on settings::nvtx_marker_device_sync() this will either call cudaDeviceSynchronize() or cudaStreamSynchronize(m_stream) before stopping the range.

inline void mark_begin()

asynchronously add a marker. Equivalent to nvtxMarkA

inline void mark_end()

asynchronously add a marker. Equivalent to nvtxMarkA

inline void mark_begin(cuda::stream_t _stream)

asynchronously add a marker for a specific stream. Equivalent to nvtxMarkA

inline void mark_end(cuda::stream_t _stream)

asynchronously add a marker for a specific stream. Equivalent to nvtxMarkA

inline void set_stream(cuda::stream_t _stream)

set the current CUDA stream

inline void set_color(nvtx::color::color_t _color)

set the current color

Data Tracking Components

template<typename InpT, typename Tag>
struct tim::component::data_tracker : public tim::component::base<data_tracker<InpT, Tag>, InpT>

This component is provided to facilitate data tracking. The first template parameter is the type of data to be tracked, the second is a custom tag for differentiating trackers which handle the same data types but record different high-level data.

Usage:

// declarations

struct myproject {};

using itr_tracker_type   = data_tracker<uint64_t, myproject>;
using err_tracker_type   = data_tracker<double, myproject>;

// add statistics capabilities
TIMEMORY_STATISTICS_TYPE(itr_tracker_type, int64_t)
TIMEMORY_STATISTICS_TYPE(err_tracker_type, double)

// set the label and descriptions
TIMEMORY_METADATA_SPECIALIZATION(
    itr_tracker_type, "myproject_iterations", "short desc", "long description")

TIMEMORY_METADATA_SPECIALIZATION(
    err_tracker_type, "myproject_convergence", "short desc", "long description")

// this is the generic bundle pairing a timer with an iteration tracker
// using this and not updating the iteration tracker will create entries
// in the call-graph with zero iterations.
using bundle_t           = tim::auto_tuple<wall_clock, itr_tracker_type>;

// this is a dedicated bundle for adding data-tracker entries. This style
// can also be used with the iteration tracker or you can bundle
// both trackers together. The auto_tuple will call start on construction
// and stop on destruction so one can construct a nameless temporary of
// this bundle type and call store(...) on the nameless tmp. This will
// ensure that the statistics are updated for each entry
//
using err_bundle_t       = tim::auto_tuple<err_tracker_type>;

// usage in a function is implied below

double err             = std::numeric_limits<double>::max();
const double tolerance = 1.0e-6;

bundle_t t("iteration_time");

while(err > tolerance)
{
    // store the starting error
    double initial_err = err;

    // add 1 for each iteration. Stats only updated when t is destroyed or t.stop() is
    // called
    t.store(std::plus<uint64_t>{}, 1);

    // ... do something ...

    // construct a nameless temporary which records the change in the error and
    // update the statistics <-- "foo" will have mean/min/max/stddev of the
    // error
    err_bundle_t{ "foo" }.store(err - initial_err);

    // NOTE: std::plus is used with t above bc it has multiple components so std::plus
    // helps ensure 1 doesn't get added to some other component with `store(int)`
    // In above err_bundle_t, there is only one component so there is no concern.
}
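
The comments in the usage example mention that each entry ends up with mean/min/max/stddev statistics. As a rough illustration of how such statistics can be maintained one sample at a time, here is a standalone sketch using Welford's online algorithm. The running_stats type and its members are hypothetical and are not timemory's statistics implementation, which may compute these quantities differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical accumulator; timemory's statistics type may differ.
struct running_stats
{
    std::int64_t count     = 0;
    double       mean      = 0.0;
    double       m2        = 0.0;  // sum of squared deviations from the mean
    double       min_value = std::numeric_limits<double>::max();
    double       max_value = std::numeric_limits<double>::lowest();

    // Welford's online update: a single pass that stores no samples
    void add(double x)
    {
        ++count;
        double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);
        min_value = std::min(min_value, x);
        max_value = std::max(max_value, x);
    }

    // sample standard deviation (0 when fewer than two samples were added)
    double stddev() const
    {
        return (count > 1) ? std::sqrt(m2 / static_cast<double>(count - 1)) : 0.0;
    }
};
```

In this sketch, calling add(...) once per loop iteration plays the role that store(...) on the nameless temporary plays in the example.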

When creating new data trackers, it is recommended to have this in a header:

TIMEMORY_DECLARE_EXTERN_COMPONENT(custom_data_tracker_t, true, data_type)

And this in one source file (preferably one that is not re-compiled often):

TIMEMORY_INSTANTIATE_EXTERN_COMPONENT(custom_data_tracker_t, true, data_type)
TIMEMORY_INITIALIZE_STORAGE(custom_data_tracker_t)

where custom_data_tracker_t is the custom data tracker type (or an alias to the type) and data_type is the data type being tracked.

Public Functions

inline auto get() const

get the data in the final form after unit conversion

inline auto get_display() const

get the data in a form suitable for display

inline auto get_secondary() const

map of the secondary entries. When TIMEMORY_ADD_SECONDARY is enabled, the contents of this map will be added as direct children of the current node in the call-graph.

template<typename T>
void store(T &&val, enable_if_acceptable_t<T, int> = 0)

store some data. Uses tim::data::handler for the type.

template<typename T>
void store(handler_type&&, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which takes a handler to ensure proper overload resolution

template<typename FuncT, typename T>
auto store(FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0) -> decltype(std::declval<handler_type>().store(*this, std::forward<FuncT>(f), std::forward<T>(val)), void())

overload which uses a lambda to bypass the default behavior of how the handler updates the values

template<typename FuncT, typename T>
auto store(handler_type&&, FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0) -> decltype(std::declval<handler_type>().store(*this, std::forward<FuncT>(f), std::forward<T>(val)), void())

overload which uses a lambda to bypass the default behavior of how the handler updates the values and takes a handler to ensure proper overload resolution
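
To make the difference between the plain store(...) overloads and the functor overloads more concrete, the following standalone sketch shows one plausible reading. It assumes that the default behavior overwrites the current value and that the functor receives the current value and the new value. The simple_tracker type is hypothetical and far simpler than data_tracker; it has no handler, no storage, and no SFINAE constraints.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical, simplified tracker; not the timemory data_tracker.
template <typename T>
struct simple_tracker
{
    T value{};

    // default behavior in this sketch: overwrite the current value
    void store(T val) { value = std::move(val); }

    // functor overload: combine the current value with the new one
    template <typename FuncT>
    void store(FuncT&& f, T val)
    {
        value = std::forward<FuncT>(f)(value, std::move(val));
    }
};

// Count loop iterations by adding 1 on every pass.
inline std::uint64_t count_iterations(int n)
{
    simple_tracker<std::uint64_t> tracker;
    for(int i = 0; i < n; ++i)
        tracker.store(std::plus<std::uint64_t>{}, 1);
    return tracker.value;
}
```

Passing std::plus turns the update into an accumulation, which is the pattern the iteration tracker in the usage example relies on.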

template<typename T>
void mark_begin(T &&val, enable_if_acceptable_t<T, int> = 0)

The combination of mark_begin(...) and mark_end(...) can be used to store some initial data which may be needed later. When mark_end(...) is called, the value is updated with the difference of the value provided to mark_end and the temporary stored during mark_begin.

template<typename T>
void mark_begin(handler_type&&, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which takes a handler to ensure proper overload resolution

template<typename FuncT, typename T>
void mark_begin(FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which uses a lambda to bypass the default behavior of how the handler updates the values

template<typename FuncT, typename T>
void mark_begin(handler_type&&, FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which uses a lambda to bypass the default behavior of how the handler updates the values and takes a handler to ensure proper overload resolution

template<typename T>
void mark_end(T &&val, enable_if_acceptable_t<T, int> = 0)

The combination of mark_begin(...) and mark_end(...) can be used to store some initial data which may be needed later. When mark_end(...) is called, the value is updated with the difference of the value provided to mark_end and the temporary stored during mark_begin. It may be valid to call mark_end without calling mark_begin but the result will effectively be a more expensive version of calling store.

template<typename T>
void mark_end(handler_type&&, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which takes a handler to ensure proper overload resolution

template<typename FuncT, typename T>
void mark_end(FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which uses a lambda to bypass the default behavior of how the handler updates the values

template<typename FuncT, typename T>
void mark_end(handler_type&&, FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which uses a lambda to bypass the default behavior of how the handler updates the values and takes a handler to ensure proper overload resolution
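
The mark_begin(...) and mark_end(...) pairing described above can be sketched with a minimal standalone type. The delta_tracker name and its three members are assumptions made for illustration. The real component routes the update through its handler and records the result in the call-graph.

```cpp
// Hypothetical, simplified tracker; not the timemory data_tracker.
template <typename T>
struct delta_tracker
{
    T value{};  // most recent difference
    T accum{};  // running total of differences
    T temp{};   // temporary held between mark_begin and mark_end

    // remember the initial data
    void mark_begin(T val) { temp = val; }

    // record the difference between the final and the initial data
    void mark_end(T val)
    {
        value = val - temp;
        accum += value;
        temp = T{};  // without a new mark_begin, mark_end acts like a store
    }
};

// Example: how much a quantity grew between two observations.
inline long growth(long before, long after)
{
    delta_tracker<long> t;
    t.mark_begin(before);
    t.mark_end(after);
    return t.value;
}
```

Calling mark_end(...) on a fresh delta_tracker subtracts a zero-initialized temporary, which mirrors the remark that doing so is effectively a more expensive store.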

template<typename T>
this_type *add_secondary(const std::string &_key, T &&val, enable_if_acceptable_t<T, int> = 0)

add a secondary value to the current node in the call-graph. When TIMEMORY_ADD_SECONDARY is enabled, the contents of this map will be added as direct children of the current node in the call-graph. This is useful for finer-grained details that might not always be desirable to display.

template<typename T>
this_type *add_secondary(const std::string &_key, handler_type &&h, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which takes a handler to ensure proper overload resolution

template<typename FuncT, typename T>
this_type *add_secondary(const std::string &_key, FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which uses a lambda to bypass the default behavior of how the handler updates the values

template<typename FuncT, typename T>
this_type *add_secondary(const std::string &_key, handler_type &&h, FuncT &&f, T &&val, enable_if_acceptable_t<T, int> = 0)

overload which uses a lambda to bypass the default behavior of how the handler updates the values and takes a handler to ensure proper overload resolution

inline void set_value(const value_type &v)

set the current value

inline void set_value(value_type &&v)

set the current value via move

Public Static Functions

static inline std::string &label()

a reference is returned here so that it can be easily updated

static std::string &description()

a reference is returned here so that it can be easily updated

static inline auto &get_unit()

this returns a reference so that it can be easily modified

template<typename LhsT, typename RhsT, typename HashT = std::hash<LhsT>>
class vector_map : public std::vector<std::pair<LhsT, RhsT>>

typedef data_tracker<intmax_t, TIMEMORY_API> tim::component::data_tracker_integer

typedef data_tracker<size_t, TIMEMORY_API> tim::component::data_tracker_unsigned

using tim::component::data_tracker_floating = data_tracker<double, TIMEMORY_API>

Function Wrapping Components

template<size_t Nt, typename BundleT, typename DiffT>
struct tim::component::gotcha : public tim::component::base<gotcha<Nt, BundleT, DiffT>, void>, public tim::concepts::external_function_wrapper

The gotcha component rewrites the global offset table such that calling the wrapped function actually invokes either a function which is wrapped by timemory instrumentation or is replaced by a timemory component with a function call operator (operator()) whose return value and arguments exactly match the original function. This component is only available on Linux and can only be applied to external, dynamically-linked functions (i.e. functions defined in a shared library). If the BundleT template parameter is a non-empty component bundle, this component will surround the original function call with:

bundle_type _obj{ "<NAME-OF-ORIGINAL-FUNCTION>" };
_obj.construct(_args...);
_obj.start();
_obj.audit("<NAME-OF-ORIGINAL-FUNCTION>", _args...);

Ret _ret = <CALL-ORIGINAL-FUNCTION>

_obj.audit("<NAME-OF-ORIGINAL-FUNCTION>", _ret);
_obj.stop();

tparam Nt

Max number of functions which will be wrapped by this component

tparam BundleT

Component bundle to wrap around the function(s)

tparam DiffT

Differentiator type to distinguish different sets of wrappers with identical values of Nt and BundleT (or provide function call operator if replacing functions instead of wrapping functions)

If the BundleT template parameter is an empty variadic class, e.g. std::tuple<>, tim::component_tuple<>, etc., and the DiffT template parameter is a timemory component, the assumption is that the DiffT component has a function call operator which should replace the original function call. For instance, void* malloc(size_t) can be replaced with a component with void* operator()(size_t). The following component replaces exp:

// replace 'double exp(double)'
struct exp_replace : base<exp_replace, void>
{
    double operator()(double value)
    {
        float result = expf(static_cast<float>(value));
        return static_cast<double>(result);
    }
};

Example usage:

#include <timemory/timemory.hpp>

#include <cassert>
#include <cmath>
#include <cstdio>
#include <tuple>

// wall_clock and cpu_clock live in the tim::component namespace
using namespace tim::component;

using empty_tuple_t = std::tuple<>;
using base_bundle_t = tim::component_tuple<wall_clock, cpu_clock>;
using gotcha_wrap_t = tim::component::gotcha<2, base_bundle_t, void>;
using gotcha_repl_t = tim::component::gotcha<2, empty_tuple_t, exp_replace>;
using impl_bundle_t = tim::mpl::append_type_t<base_bundle_t,
                                              tim::type_list<gotcha_wrap_t,
                                                             gotcha_repl_t>>;

void init_wrappers()
{
    // wraps the sin and cos math functions
    gotcha_wrap_t::get_initializer() = []()
    {
        TIMEMORY_C_GOTCHA(gotcha_wrap_t, 0, sin);   // index 0 wraps sin
        TIMEMORY_C_GOTCHA(gotcha_wrap_t, 1, cos);   // index 1 wraps cos
    };

    // replaces the 'exp' function which may be 'exp' in symbols table
    // or '__exp_finite' in symbols table (use `nm <binary>` to determine)
    gotcha_repl_t::get_initializer() = []()
    {
        TIMEMORY_C_GOTCHA(gotcha_repl_t, 0, exp);
        TIMEMORY_DERIVED_GOTCHA(gotcha_repl_t, 1, exp, "__exp_finite");
    };
}

// the following is useful to avoid having to call 'init_wrappers()' explicitly:
// use comma operator to call 'init_wrappers' and return true
static auto called_init_at_load = (init_wrappers(), true);

int main()
{
    assert(called_init_at_load == true);

    double angle = 45.0 * (M_PI / 180.0);

    impl_bundle_t _obj{ "main" };

    // gotcha wrappers not activated yet
    printf("cos(%f) = %f\n", angle, cos(angle));
    printf("sin(%f) = %f\n", angle, sin(angle));
    printf("exp(%f) = %f\n", angle, exp(angle));

    // gotcha wrappers are reference counted according to start/stop
    _obj.start();

    printf("cos(%f) = %f\n", angle, cos(angle));
    printf("sin(%f) = %f\n", angle, sin(angle));
    printf("exp(%f) = %f\n", angle, exp(angle));

    _obj.stop();

    // gotcha wrappers will be deactivated
    printf("cos(%f) = %f\n", angle, cos(angle));
    printf("sin(%f) = %f\n", angle, sin(angle));
    printf("exp(%f) = %f\n", angle, exp(angle));

    return 0;
}
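
The wrapping and the reference-counted activation shown in the example can be pictured with a plain function-pointer slot. This is only a conceptual analogy: GOTCHA operates on the global offset table of a running process, whereas the sketch below redirects an ordinary pointer. The names slot, cos_slot, wrapped_cos, activate, deactivate, and call_cos are hypothetical, and the sketch is not thread-safe.

```cpp
#include <cmath>

// Hypothetical stand-in for a global offset table entry: callers go through
// this pointer, so redirecting it changes which function they reach.
using unary_fn = double (*)(double);

inline double original_cos(double x) { return std::cos(x); }

struct slot
{
    unary_fn original = &original_cos;
    unary_fn current  = &original_cos;
    int      refcount = 0;  // number of active start() calls
    int      calls    = 0;  // how many calls went through the wrapper
};

inline slot& cos_slot()
{
    static slot s;
    return s;
}

// Wrapper: record the call, then forward to the original function.
inline double wrapped_cos(double x)
{
    ++cos_slot().calls;
    return cos_slot().original(x);
}

// The first activation installs the wrapper; later ones only bump the count.
inline void activate()
{
    if(cos_slot().refcount++ == 0)
        cos_slot().current = &wrapped_cos;
}

// The last deactivation restores the original function.
inline void deactivate()
{
    if(--cos_slot().refcount == 0)
        cos_slot().current = cos_slot().original;
}

// Stand-in for a call site that resolves the function through the table.
inline double call_cos(double x) { return cos_slot().current(x); }
```

Calls made through call_cos(...) are counted only between the first activate() and the matching last deactivate(), which corresponds to the three groups of printf calls in main().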

Public Static Functions

static inline get_select_list_t &get_permit_list()

when a permit list is provided, only these functions are wrapped by GOTCHA

static inline get_select_list_t &get_reject_list()

reject listed functions are never wrapped by GOTCHA

static inline void add_global_suppression(const std::string &func)

add function names at runtime to suppress wrappers

static inline auto get_ready()

get an array of whether the wrappers are filled and ready

static inline auto set_ready(bool val)

set filled wrappers to array of ready values

static inline auto set_ready(const std::array<bool, Nt> &values)

set filled wrappers to array of ready values

template<size_t N, typename Ret, typename ...Args>
struct instrument

struct tim::component::malloc_gotcha : public tim::component::base<malloc_gotcha, double>, public tim::concepts::external_function_wrapper

Public Functions

inline void audit(audit::incoming, size_t nbytes)

nbytes is passed to malloc

inline void audit(audit::incoming, size_t nmemb, size_t size)

nmemb and size are passed to calloc

inline void audit(audit::outgoing, void *ptr)

void* is returned from malloc and calloc

inline void audit(audit::incoming, void *ptr)

void* is passed to free
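
The audit overloads above are distinguished by a tag argument and by the types of the forwarded arguments. The standalone sketch below shows how such tag dispatch can be used to track live allocations. The audit_sketch namespace, the allocation_audit type, and its bookkeeping are hypothetical and do not reflect how malloc_gotcha stores its data.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical tag types standing in for audit::incoming / audit::outgoing.
namespace audit_sketch
{
struct incoming {};
struct outgoing {};
}  // namespace audit_sketch

struct allocation_audit
{
    std::size_t pending_bytes = 0;  // size seen on the way into malloc/calloc
    std::size_t live_bytes    = 0;  // bytes allocated and not yet freed
    std::unordered_map<void*, std::size_t> sizes;

    // arguments of malloc
    void audit(audit_sketch::incoming, std::size_t nbytes) { pending_bytes = nbytes; }

    // arguments of calloc
    void audit(audit_sketch::incoming, std::size_t nmemb, std::size_t size)
    {
        pending_bytes = nmemb * size;
    }

    // return value of malloc and calloc
    void audit(audit_sketch::outgoing, void* ptr)
    {
        if(ptr != nullptr)
        {
            sizes[ptr] = pending_bytes;
            live_bytes += pending_bytes;
        }
        pending_bytes = 0;
    }

    // argument of free
    void audit(audit_sketch::incoming, void* ptr)
    {
        auto itr = sizes.find(ptr);
        if(itr == sizes.end())
            return;
        live_bytes -= itr->second;
        sizes.erase(itr);
    }
};
```

Because the tag and the argument types select the overload, one component can observe several different wrapped functions without any runtime branching on the function name.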

struct memory_allocations : public tim::component::base<memory_allocations, void>, public tim::concepts::external_function_wrapper, private tim::policy::instance_tracker<memory_allocations, true>

This component wraps malloc, calloc, free, and CUDA/HIP malloc/free via GOTCHA and tracks the number of bytes requested/freed in each call. This component is useful for detecting the locations where memory re-use would provide a performance benefit.

Base Components

template<typename Tp, typename Value>
struct tim::component::base : public tim::component::empty_base, private tim::component::base_state, private base_data_t<Tp, Value>, public tim::concepts::component

Public Types

using EmptyT = std::tuple<>
template<typename U>
using vector_t = std::vector<U>
using Type = Tp
using value_type = Value
using data_type = base_data_t<Tp, Value>
using accum_type = typename data_type::accum_type
using last_type = typename data_type::last_type
using dynamic_type = typename trait::dynamic_base<Tp>::type
using cache_type = typename trait::cache<Tp>::type
using this_type = Tp
using base_type = base<Tp, Value>
using base_storage_type = tim::base::storage
using storage_type = storage<Tp, Value>
using graph_iterator = graph_iterator_t<Tp>
using state_t = state<this_type>
using statistics_policy = policy::record_statistics<Tp, Value>
using fmtflags = std::ios_base::fmtflags

Public Functions

~base() = default

void set_started()

store that start has been called

void set_stopped()

store that stop has been called

void reset()

reset the values

void get(void *&ptr, size_t _typeid_hash) const

assign type to a pointer

inline auto get() const

retrieve the current measurement value in the units for the type

inline auto get_display() const

retrieve the current measurement value in the units for the type in a format that can be piped to the output stream operator (‘<<’)

inline Type &operator+=(const Type &rhs)
inline Type &operator-=(const Type &rhs)
inline Type &operator*=(const Type &rhs)
inline Type &operator/=(const Type &rhs)
inline Type &operator+=(const Value &rhs)
inline Type &operator-=(const Value &rhs)
inline Type &operator*=(const Value &rhs)
inline Type &operator/=(const Value &rhs)
template<typename Up = Tp>
void print(std::ostream&, enable_if_t<trait::uses_value_storage<Up, Value>::value, int> = 0) const
template<typename Up = Tp>
void print(std::ostream&, enable_if_t<!trait::uses_value_storage<Up, Value>::value, long> = 0) const
template<typename Archive, typename Up = Type, enable_if_t<!trait::custom_serialization<Up>::value, int> = 0>
void load(Archive &ar, unsigned int)

serialization load (input)

template<typename Archive, typename Up = Type, enable_if_t<!trait::custom_serialization<Up>::value, int> = 0>
void save(Archive &ar, unsigned int version) const

serialization store (output)

inline int64_t get_laps() const

get the number of measurements

inline auto get_iterator() const
inline void set_laps(int64_t v)
inline void set_iterator(graph_iterator itr)
inline decltype(auto) load()
inline decltype(auto) load() const
inline auto plus(crtp::base, const base_type &rhs)
inline auto minus(crtp::base, const base_type &rhs)
inline bool get_depth_change() const
inline bool get_is_flat() const
inline bool get_is_invalid() const
inline bool get_is_on_stack() const
inline bool get_is_running() const
inline bool get_is_transient() const
inline void set_depth_change(bool v)
inline void set_is_flat(bool v)
inline void set_is_invalid(bool v)
inline void set_is_on_stack(bool v)
inline void set_is_running(bool v)
inline void set_is_transient(bool v)
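
The base class uses the curiously recurring template pattern (CRTP): the derived component passes itself as Tp so that operators declared in the base can return the derived type. The following standalone sketch illustrates the idea under strong simplifications. mini_base and counter are hypothetical types, the lap counting in set_stopped() is an assumption of the sketch, and none of the storage, state flags, or traits of tim::component::base are modeled.

```cpp
#include <cstdint>

// Hypothetical, heavily simplified CRTP base; not tim::component::base.
template <typename Tp, typename Value>
struct mini_base
{
    Value        value{};  // most recent measurement
    Value        accum{};  // accumulated measurement
    std::int64_t laps       = 0;
    bool         is_running = false;

    void set_started() { is_running = true; }

    // in this sketch, every completed start/stop pair counts as one lap
    void set_stopped()
    {
        is_running = false;
        ++laps;
    }

    // returns the derived type, which is what the CRTP parameter enables
    Tp& operator+=(const Tp& rhs)
    {
        value += rhs.value;
        accum += rhs.accum;
        laps += rhs.laps;
        return static_cast<Tp&>(*this);
    }
};

// A toy component which records the value 1 each time it is stopped.
struct counter : mini_base<counter, std::int64_t>
{
    void start() { set_started(); }
    void stop()
    {
        set_stopped();
        value = 1;
        accum += value;
    }
};

// Merge two counters, similar in spirit to merging per-thread results.
inline counter merged()
{
    counter a;
    counter b;
    a.start();
    a.stop();
    b.start();
    b.stop();
    b.start();
    b.stop();
    a += b;
    return a;
}
```

The result of merged() carries the combined accumulation and lap count of both objects while still being a counter rather than a mini_base.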

Public Static Functions

template<typename ...Args>
static inline void configure(Args&&...)
static opaque get_opaque(scope::config)

get the opaque binding for user-bundle

template<typename Vp, typename Up = Tp, enable_if_t<trait::sampler<Up>::value, int> = 0>
static void add_sample(Vp&&)
static base_storage_type *get_storage()
template<typename Up = Type, typename UnitT = typename trait::units<Up>::type, enable_if_t<std::is_same<UnitT, int64_t>::value, int> = 0>
static int64_t unit()
template<typename Up = Type, typename UnitT = typename trait::units<Up>::display_type, enable_if_t<std::is_same<UnitT, std::string>::value, int> = 0>
static std::string display_unit()
template<typename Up = Type, typename UnitT = typename trait::units<Up>::type, enable_if_t<std::is_same<UnitT, int64_t>::value, int> = 0>
static int64_t get_unit()
template<typename Up = Type, typename UnitT = typename trait::units<Up>::display_type, enable_if_t<std::is_same<UnitT, std::string>::value, int> = 0>
static std::string get_display_unit()
static short get_width()
static short get_precision()
static fmtflags get_format_flags()
static std::string label()
static std::string description()
static std::string get_label()
static std::string get_description()
template<typename ...Args>
static inline opaque get_opaque(Args&&...)

Public Static Attributes

static constexpr bool is_component = true
static constexpr bool timing_category_v = trait::is_timing_category<Type>::value
static constexpr bool memory_category_v = trait::is_memory_category<Type>::value
static constexpr bool timing_units_v = trait::uses_timing_units<Type>::value
static constexpr bool memory_units_v = trait::uses_memory_units<Type>::value
static constexpr bool percent_units_v = trait::uses_percent_units<Type>::value
static constexpr auto ios_fixed = std::ios_base::fixed
static constexpr auto ios_decimal = std::ios_base::dec
static constexpr auto ios_showpoint = std::ios_base::showpoint
static const fmtflags format_flags = ios_fixed | ios_decimal | ios_showpoint
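
The format_flags attribute combines three standard std::ios_base flags. As a small illustration of what that combination does to a stream, here is a standalone sketch. The format_value function is hypothetical, and its width and precision parameters merely stand in for the values returned by get_width() and get_precision().

```cpp
#include <ios>
#include <sstream>
#include <string>

// Format a value with fixed notation, decimal base, and a forced decimal point.
inline std::string format_value(double v, short width, short precision)
{
    std::ostringstream os;
    os.setf(std::ios_base::fixed | std::ios_base::dec | std::ios_base::showpoint);
    os.width(width);          // minimum field width, right-justified by default
    os.precision(precision);  // digits after the decimal point in fixed mode
    os << v;
    return os.str();
}
```

For example, format_value(1.5, 8, 3) yields the five characters 1.500 padded on the left with three spaces.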

Friends

friend struct node::graph< Tp >
friend struct operation::init_storage< Tp >
friend struct operation::fini_storage< Tp >
friend struct operation::cache< Tp >
friend struct operation::construct< Tp >
friend struct operation::set_prefix< Tp >
friend struct operation::push_node< Tp >
friend struct operation::pop_node< Tp >
friend struct operation::record< Tp >
friend struct operation::reset< Tp >
friend struct operation::measure< Tp >
friend struct operation::start< Tp >
friend struct operation::stop< Tp >
friend struct operation::set_started< Tp >
friend struct operation::set_stopped< Tp >
friend struct operation::minus< Tp >
friend struct operation::plus< Tp >
friend struct operation::multiply< Tp >
friend struct operation::divide< Tp >
friend struct operation::base_printer< Tp >
friend struct operation::print< Tp >
friend struct operation::print_storage< Tp >
friend struct operation::copy< Tp >
friend struct operation::sample< Tp >
friend struct operation::serialization< Tp >
friend struct operation::finalize::get< Tp, true >
friend struct operation::finalize::get< Tp, false >
friend struct operation::finalize::merge< Tp, true >
friend struct operation::finalize::merge< Tp, false >
friend struct operation::finalize::print< Tp, true >
friend struct operation::finalize::print< Tp, false >
friend struct operation::compose
inline friend std::ostream &operator<<(std::ostream &os, const base_type &obj)

struct tim::component::empty_base

A very lightweight base which provides no storage.

Subclassed by tim::component::base< mpi_trace_gotcha, void >, tim::component::base< pthread_gotcha, void >, tim::component::base< allinea_map, void >, tim::component::base< caliper_config, void >, tim::component::base< caliper_loop_marker, void >, tim::component::base< caliper_marker, void >, tim::component::base< cpu_clock >, tim::component::base< cpu_roofline< Types… >, std::pair< std::vector< long long >, double > >, tim::component::base< cpu_util, std::pair< int64_t, int64_t > >, tim::component::base< craypat_counters, std::vector< unsigned long > >, tim::component::base< craypat_flush_buffer, unsigned long >, tim::component::base< craypat_heap_stats, void >, tim::component::base< craypat_record, void >, tim::component::base< craypat_region, void >, tim::component::base< cuda_event, float >, tim::component::base< cuda_profiler, void >, tim::component::base< cupti_activity, intmax_t >, tim::component::base< cupti_counters, cupti::profiler::results_t >, tim::component::base< cupti_pcsampling, cupti::pcsample >, tim::component::base< current_peak_rss, std::pair< int64_t, int64_t > >, tim::component::base< data_tracker< InpT, Tag >, InpT >, tim::component::base< gotcha< Nt, BundleT, DiffT >, void >, tim::component::base< gperftools_cpu_profiler, void >, tim::component::base< gperftools_heap_profiler, void >, tim::component::base< gpu_roofline< Types… >, std::tuple< cupti_activity::value_type, cupti_counters::value_type > >, tim::component::base< hip_event, float >, tim::component::base< kernel_mode_time, int64_t >, tim::component::base< likwid_marker, void >, tim::component::base< likwid_nvmarker, void >, tim::component::base< malloc_gotcha, double >, tim::component::base< memory_allocations, void >, tim::component::base< monotonic_clock >, tim::component::base< monotonic_raw_clock >, tim::component::base< mpip_handle< Toolset, Tag >, void >, tim::component::base< ncclp_handle< Toolset, Tag >, void >, tim::component::base< network_stats, cache::network_stats >, 
tim::component::base< nothing, skeleton::base >, tim::component::base< num_io_in >, tim::component::base< num_io_out >, tim::component::base< num_major_page_faults >, tim::component::base< num_minor_page_faults >, tim::component::base< nvtx_marker, void >, tim::component::base< ompt_data_tracker< Api >, void >, tim::component::base< ompt_handle< Api >, void >, tim::component::base< page_rss, int64_t >, tim::component::base< papi_array< MaxNumEvents >, std::array< long long, MaxNumEvents > >, tim::component::base< papi_rate_tuple< RateT, EventTypes… >, std::pair< papi_tuple< EventTypes… >, RateT > >, tim::component::base< papi_tuple< EventTypes… >, std::array< long long, sizeof…(EventTypes)> >, tim::component::base< papi_vector, std::vector< long long > >, tim::component::base< peak_rss >, tim::component::base< perfetto_trace, void >, tim::component::base< placeholder< Types… >, void >, tim::component::base< priority_context_switch >, tim::component::base< process_cpu_clock >, tim::component::base< process_cpu_util, std::pair< int64_t, int64_t > >, tim::component::base< read_bytes, std::pair< int64_t, int64_t > >, tim::component::base< read_char, std::pair< int64_t, int64_t > >, tim::component::base< roctx_marker, void >, tim::component::base< system_clock >, tim::component::base< tau_marker, void >, tim::component::base< thread_cpu_clock >, tim::component::base< thread_cpu_util, std::pair< int64_t, int64_t > >, tim::component::base< timestamp, timestamp_entry_t >, tim::component::base< trip_count >, tim::component::base< user_bundle< Idx, Tag >, void >, tim::component::base< user_clock >, tim::component::base< user_mode_time, int64_t >, tim::component::base< virtual_memory >, tim::component::base< voluntary_context_switch >, tim::component::base< vtune_event, void >, tim::component::base< vtune_frame, void >, tim::component::base< vtune_profiler, void >, tim::component::base< wall_clock, int64_t >, tim::component::base< written_bytes, std::array< int64_t, 2 > >, 
tim::component::base< written_char, std::array< int64_t, 2 > >, tim::component::base< kernel_logger, void >, tim::component::base< sampler< CompT< Types… >, N, SigIds… >, void >, tim::component::base< Tp, Value >, tim::component::base< Tp, void >, tim::component::printer

Public Types

using storage_type = empty_storage
using base_type = void

Public Functions

inline void get() const

Public Static Functions

template<typename ...Args>
static inline opaque get_opaque(Args&&...)