timemory 3.3.0
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework for creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
#include "timemory/backends/device.hpp"
#include "timemory/components/cuda/backends.hpp"
#include "timemory/defines.h"
#include "timemory/ert/aligned_allocator.hpp"
#include "timemory/ert/counter.hpp"
#include "timemory/ert/data.hpp"
#include "timemory/ert/kernels.hpp"
#include "timemory/ert/types.hpp"
#include "timemory/settings/declaration.hpp"
#include <cstdint>
#include <functional>
Classes
struct tim::ert::configuration< DeviceT, Tp, CounterT >
struct tim::ert::executor< DeviceT, Tp, CounterT >
struct tim::ert::executor< device::gpu, Tp, CounterT >
struct tim::ert::callback< ExecutorT >
    For variadic expansion to set the callback.
Namespaces
namespace tim
namespace tim::ert
Macros
#define TIMEMORY_VEC 256
#define TIMEMORY_USER_ERT_FLOPS
Functions
template<typename DeviceT, typename CounterT, typename Tp, typename... Types, typename DataType = exec_data<CounterT>, typename DataPtr = std::shared_ptr<DataType>, typename std::enable_if<(sizeof...(Types) == 0), int>::type = 0>
std::shared_ptr< DataType > tim::ert::execute (std::shared_ptr< DataType > _data = std::make_shared< DataType >())
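The `std::enable_if<(sizeof...(Types) == 0), int>` parameter in the signature above restricts that overload to calls where no further types remain in the pack. The following is a minimal, self-contained sketch of that general idiom, not timemory's implementation: `demo_data` and `demo_execute` are hypothetical names, and the recursive overload is an assumption about how such a terminating overload is typically paired, since this page lists only the terminating case.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for exec_data<CounterT>: stores one entry per processed type.
struct demo_data
{
    std::vector<std::size_t> sizes;
};

// Terminating overload: participates only when the pack is empty,
// mirroring the enable_if<(sizeof...(Types) == 0)> constraint shown above.
template <typename Tp, typename... Types,
          typename std::enable_if<(sizeof...(Types) == 0), int>::type = 0>
std::shared_ptr<demo_data>
demo_execute(std::shared_ptr<demo_data> data = std::make_shared<demo_data>())
{
    data->sizes.push_back(sizeof(Tp));
    return data;
}

// Recursive overload (assumed counterpart): participates only when at least
// one more type remains, handles Tp, then forwards the same data to the rest.
template <typename Tp, typename... Types,
          typename std::enable_if<(sizeof...(Types) > 0), int>::type = 0>
std::shared_ptr<demo_data>
demo_execute(std::shared_ptr<demo_data> data = std::make_shared<demo_data>())
{
    data->sizes.push_back(sizeof(Tp));
    return demo_execute<Types...>(data);
}
```

Calling `demo_execute<float, double, char>()` in this sketch records one entry per type, in order, in a single shared `demo_data` instance. The defaulted `std::shared_ptr` argument lets the first call create the storage while later calls in the chain reuse it.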
Provides configuration for executing the empirical roofline toolkit (ERT).
Definition in file configuration.hpp.
#define TIMEMORY_USER_ERT_FLOPS
Definition at line 52 of file configuration.hpp.
#define TIMEMORY_VEC 256
Definition at line 48 of file configuration.hpp.