API¶
This section provides the high-level details for general and toolkit usage. The library and Python APIs are recommended for general use; the C++ toolkit is recommended for developers who want to build performance monitoring into their own frameworks and APIs. Detailed documentation on the toolkit API can be found in the Doxygen section.
- C / C++ / Fortran Library
- Python Bindings
- C++ Variadic Template Bundlers
- C++ Type-Traits
- C++ Concepts
- C++ Policies
- C++ Operations
- C++ Utilities
- Memory Management
Basic Usage¶
See the Getting Started Basics Section.
Creating Custom Tools/Components¶
Written in C++
Direct access to performance analysis data in Python and C++
Create your own components: any one-time measurement or start/stop paradigm can be wrapped with timemory
Flexible and easily extensible interface: no data type restrictions in custom components
Custom Logging Component Example¶
Because almost any argument can be passed to a component, timemory components are not limited to
performance measurements: they can also serve as very efficient logging and debugging components, since they can
easily be compiled out of the code by setting the is_available type-trait to false.
// This "component" is for conceptual demonstration only
// It is not intended to be copy+pasted
struct log_message : base<log_message, void>
{
public:
// use return type SFINAE to check whether "val" supports operator<<
// if Tp does not support <<, then this function will not be called
template <typename Tp>
auto start(const char* label, const Tp& val) -> decltype(std::cerr << val, void())
{
std::cerr << "[LOG:START][" << label << "]> " << val << std::endl;
}
// use return type SFINAE to check whether "val" supports operator<<
// if Tp does not support <<, then this function will not be called
template <typename Tp>
auto stop(const char* label, const Tp& val) -> decltype(std::cerr << val, void())
{
std::cerr << "[LOG:STOP][" << label << "]> " << val << std::endl;
}
};
Usage could look like the following. The calls within the loop, e.g. logger_t{}.start("foo", val),
will never create an instance of log_message unless ENABLE_LOG_MESSAGE is defined at compile-time.
When it is not defined, the construction of logger_t{} and the call to start("foo", val) have no side-effects,
so the entire line will likely be optimized away (depending on the optimization settings).
// log_message will NEVER be called when ENABLE_LOG_MESSAGE is not defined
#if !defined(ENABLE_LOG_MESSAGE)
TIMEMORY_DEFINE_CONCRETE_TRAIT(is_available, component::log_message, false_type)
#endif
// just a general bundle that uses the TIMEMORY_GLOBAL_COMPONENTS environment variable
using bundle_t = tim::component_tuple<global_user_bundle>;
// a dedicated bundle for logging
using logger_t = tim::lightweight_bundle<log_message>;
void foo(double val)
{
// basic label == function name
TIMEMORY_BASIC_MARKER(bundle_t, "")
for(int i = 0; i < 10; ++i)
{
logger_t{}.start("foo", val); // write log message or nothing if not available
val += i;
logger_t{}.stop("foo", val); // write log message or nothing if not available
}
}
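The two mechanisms used above (return-type SFINAE on operator<< and an availability trait that removes the call entirely) can be sketched without timemory. The following is a minimal, self-contained illustration, not timemory's implementation: is_available, mini_bundle, disabled_log, opaque, log_sink, and demo_logging are hypothetical names introduced only for this sketch, and output goes to a string stream instead of std::cerr so the result can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an availability trait; defaults to "available".
template <typename T>
struct is_available : std::true_type
{};

// Sink used instead of std::cerr so that the output can be checked.
inline std::ostringstream& log_sink()
{
    static std::ostringstream sink;
    return sink;
}

struct log_message
{
    template <typename Tp>
    void start(const char* label, const Tp& val)
    {
        start_impl(label, val, 0);  // literal 0 is an int: prefers the int overload
    }

private:
    // Participates in overload resolution only if "stream << val" compiles.
    template <typename Tp>
    auto start_impl(const char* label, const Tp& val, int)
        -> decltype(std::declval<std::ostream&>() << val, void())
    {
        log_sink() << "[LOG:START][" << label << "]> " << val << '\n';
    }

    // Fallback for types without operator<<: do nothing.
    template <typename Tp>
    void start_impl(const char*, const Tp&, long)
    {}
};

// Hypothetical minimal bundle: forwards to the component only if available.
template <typename Comp>
struct mini_bundle
{
    template <typename... Args>
    void start(Args&&... args)
    {
        dispatch(typename is_available<Comp>::type{}, std::forward<Args>(args)...);
    }

private:
    template <typename... Args>
    void dispatch(std::true_type, Args&&... args)
    {
        Comp{}.start(std::forward<Args>(args)...);
    }

    // Unavailable component: no instance is created and nothing is called.
    template <typename... Args>
    void dispatch(std::false_type, Args&&...)
    {}
};

// A logger that has been "compiled out" by specializing the trait.
struct disabled_log : log_message
{};
template <>
struct is_available<disabled_log> : std::false_type
{};

struct opaque
{};  // has no operator<<

inline std::string demo_logging()
{
    log_sink().str("");
    mini_bundle<log_message>{}.start("foo", 1.5);       // streamable: logged
    mini_bundle<log_message>{}.start("foo", opaque{});  // not streamable: no-op
    mini_bundle<disabled_log>{}.start("foo", 2.5);      // trait is false: skipped
    return log_sink().str();
}
```

Only the first call in demo_logging() produces output. The other two compile to empty functions, which is what allows an optimizer to remove them.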
Composable Components Example¶
Building a brand-new component is straightforward.
In fact, new components can simply be composites of existing components.
For example, if a component for measuring the FLOP-rate (floating point operations per second)
is desired, it can be composed from the wall_clock and papi_tuple components and will have
the features of both:
// This "component" is for conceptual demonstration only
// It is not intended to be copy+pasted
struct flop_rate : base<flop_rate, double>
{
private:
wall_clock wc;
papi_tuple<PAPI_DP_OPS> hw;
public:
void start()
{
wc.start();
hw.start();
}
void stop()
{
wc.stop();
hw.stop();
}
auto get() const
{
return hw.get() / wc.get();
}
};
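The same composition pattern can be tried without PAPI or timemory. The sketch below is an illustration under stated assumptions: stopwatch, op_counter, op_rate, and measure_sum are hypothetical names, and the hardware counter is replaced by a software counter that the caller increments by hand. It also guards against a zero elapsed time, which the conceptual example above does not.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a timer component.
struct stopwatch
{
    using clock_type = std::chrono::steady_clock;

    void start() { m_begin = clock_type::now(); }
    void stop() { m_elapsed += clock_type::now() - m_begin; }

    // accumulated time in seconds
    double get() const { return std::chrono::duration<double>(m_elapsed).count(); }

private:
    clock_type::time_point m_begin{};
    clock_type::duration   m_elapsed{ 0 };
};

// Hypothetical stand-in for a hardware counter: a software counter that
// only accumulates while it is active.
struct op_counter
{
    void start() { m_active = true; }
    void stop() { m_active = false; }
    void add(std::int64_t n)
    {
        if(m_active)
            m_count += n;
    }
    std::int64_t get() const { return m_count; }

private:
    bool         m_active = false;
    std::int64_t m_count  = 0;
};

// Composite component: operations per second from the two parts above.
struct op_rate
{
    void start()
    {
        timer.start();
        ops.start();
    }
    void stop()
    {
        timer.stop();
        ops.stop();
    }
    void         add(std::int64_t n) { ops.add(n); }
    std::int64_t count() const { return ops.get(); }
    double       get() const
    {
        const double secs = timer.get();
        // avoid dividing by zero when the clock did not advance
        return (secs > 0.0) ? static_cast<double>(ops.get()) / secs : 0.0;
    }

private:
    stopwatch  timer;
    op_counter ops;
};

// Measures a small loop that performs one multiply and one add per iteration.
inline op_rate measure_sum(int n)
{
    op_rate         rate;
    volatile double sum = 0.0;
    rate.start();
    for(int i = 0; i < n; ++i)
    {
        sum = sum + 0.5 * i;
        rate.add(2);
    }
    rate.stop();
    return rate;
}
```

Calling measure_sum(1000).get() yields a machine-dependent rate; only the operation count is deterministic.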
Extended Example¶
The wall_clock component shows how a custom component inherits category-based formatting properties
(is_timing_category) and timing unit conversion (uses_timing_units) through type-traits.
The tau_marker component shows how easily timemory markers can be forwarded to external instrumentation:
TIMEMORY_DECLARE_COMPONENT(wall_clock)
TIMEMORY_DECLARE_COMPONENT(tau_marker)
// type-traits for wall-clock
TIMEMORY_DEFINE_CONCRETE_TRAIT(is_timing_category, component::wall_clock, true_type)
TIMEMORY_DEFINE_CONCRETE_TRAIT(uses_timing_units, component::wall_clock, true_type)
TIMEMORY_STATISTICS_TYPE(component::wall_clock, double)
namespace tim
{
namespace component
{
//
// the system's real time (i.e. wall time) clock, expressed as the
// amount of time since the epoch.
//
// NOTE: 'value', 'accum', 'get_unit()', etc. are provided by base class
//
struct wall_clock : public base<wall_clock, int64_t>
{
using ratio_t = std::nano;
using value_type = int64_t;
using base_type = base<wall_clock, value_type>;
static std::string label() { return "wall"; }
static std::string description() { return "wall-clock timer"; }
static value_type record()
{
// use STL steady_clock to get time-stamp in nanoseconds
using clock_type = std::chrono::steady_clock;
using duration_type = std::chrono::duration<clock_type::rep, ratio_t>;
return std::chrono::duration_cast<duration_type>(
clock_type::now().time_since_epoch()).count();
}
double get_display() const { return get(); }
double get() const
{
// get_unit() provided by base class via uses_timing_units type-trait
auto val = (is_transient) ? accum : value;
return static_cast<double>(val) / ratio_t::den * get_unit();
}
void start()
{
value = record();
}
void stop()
{
value = (record() - value);
accum += value;
}
};
//
// forwards timemory instrumentation to TAU instrumentation.
//
struct tau_marker : public base<tau_marker, void>
{
// timemory component api
using value_type = void;
using this_type = tau_marker;
using base_type = base<this_type, value_type>;
static std::string label() { return "tau"; }
static std::string description() { return "TAU_start and TAU_stop instrumentation"; }
static void global_init(storage_type*) { Tau_set_node(dmp::rank()); }
static void thread_init(storage_type*) { TAU_REGISTER_THREAD(); }
tau_marker() = default;
tau_marker(const std::string& _prefix) : m_prefix(_prefix) {}
void start() { Tau_start(m_prefix.c_str()); }
void stop() { Tau_stop(m_prefix.c_str()); }
void set_prefix(const std::string& _prefix) { m_prefix = _prefix; }
// This 'set_prefix(...)' member function is a great example of the template
// meta-programming provided by timemory: at compile-time, timemory checks
// whether components have this member function and, if and only if it exists,
// timemory will call this member function for the component and provide the
// marker label.
private:
std::string m_prefix = "";
};
} // namespace component
} // namespace tim
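The compile-time check described in the set_prefix comment above can be illustrated with a small detection idiom. This is a sketch of the general technique, not the mechanism timemory actually uses: try_set_prefix, apply_label, with_prefix, and without_prefix are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <string>

// Preferred overload: participates only if "obj.set_prefix(label)" compiles.
template <typename T>
auto try_set_prefix(T& obj, const std::string& label, int)
    -> decltype(obj.set_prefix(label), bool())
{
    obj.set_prefix(label);
    return true;
}

// Fallback overload: chosen when T has no suitable set_prefix member.
template <typename T>
bool try_set_prefix(T&, const std::string&, long)
{
    return false;
}

// Passes the label to the component if, and only if, it can accept one.
// Returns whether the label was applied.
template <typename T>
bool apply_label(T& obj, const std::string& label)
{
    return try_set_prefix(obj, label, 0);  // 0 is an int: prefers the first overload
}

struct with_prefix
{
    void        set_prefix(const std::string& p) { prefix = p; }
    std::string prefix;
};

struct without_prefix
{};
```

apply_label compiles for both types. The decision is made entirely during overload resolution, so a component without set_prefix pays no runtime cost.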
Using the two tools together in C++ is as easy as the following:
#include <timemory/timemory.hpp>
using namespace tim::component;
using comp_bundle_t = tim::component_tuple_t<wall_clock, tau_marker>;
using auto_bundle_t = tim::auto_tuple_t<wall_clock, tau_marker>;
// "auto" types automatically start/stop based on scope
void foo()
{
comp_bundle_t t("foo");
t.start();
// do something
t.stop();
}
void bar()
{
auto_bundle_t t("bar");
// do something
}
int main(int argc, char** argv)
{
tim::init(argc, argv);
foo();
bar();
tim::finalize();
}
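The difference between the explicit start/stop bundle and the "auto" bundle comes down to scope-based (RAII) lifetime management. A minimal sketch of that idea, assuming nothing about how timemory implements it: event_recorder, scoped, manual_region, and scoped_region are hypothetical names, and the component records events instead of measuring anything so that the call order can be observed.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical component that records start/stop events.
struct event_recorder
{
    static std::vector<std::string>& events()
    {
        static std::vector<std::string> log;
        return log;
    }

    explicit event_recorder(std::string label)
    : m_label(std::move(label))
    {}

    void start() { events().push_back("start:" + m_label); }
    void stop() { events().push_back("stop:" + m_label); }

private:
    std::string m_label;
};

// Hypothetical "auto" wrapper: starts on construction, stops on destruction.
template <typename Comp>
class scoped
{
public:
    explicit scoped(std::string label)
    : m_comp(std::move(label))
    {
        m_comp.start();
    }
    ~scoped() { m_comp.stop(); }

    // non-copyable so that stop() runs exactly once
    scoped(const scoped&) = delete;
    scoped& operator=(const scoped&) = delete;

private:
    Comp m_comp;
};

inline void manual_region()
{
    event_recorder t("foo");
    t.start();
    // do something
    t.stop();
}

inline void scoped_region()
{
    scoped<event_recorder> t("bar");
    // do something
}  // stop() is called here, when t goes out of scope
```

The scoped form also calls stop() on early returns and when an exception unwinds the stack, which the manual form does not.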
The pure template interface increases compile times and is only available in C++, so a library interface for C, C++, and Fortran is also available:
#include <timemory/library.h>
void foo()
{
uint64_t idx = timemory_get_begin_record("foo");
// do something
timemory_end_record(idx);
}
void bar()
{
timemory_push_region("bar");
// do something
timemory_pop_region("bar");
}
int main(int argc, char** argv)
{
timemory_init_library(argc, argv);
timemory_push_components("wall_clock,tau_marker");
foo();
bar();
timemory_pop_components();
timemory_finalize_library();
}
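The library example above uses two calling styles: a handle-based pair (begin returns an index that is later passed to end) and a label-based push/pop pair. The sketch below illustrates how such an extern "C" facade can hide C++ state behind plain functions. It is purely an assumption-laden illustration and says nothing about timemory's internals: every demo_* function and everything in the detail namespace is hypothetical, and the rule that a pop must match the most recent push is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

namespace detail
{
struct record
{
    std::string label;
    bool        open;
};

// C++ state hidden behind the C-compatible functions below
inline std::map<std::uint64_t, record>& records()
{
    static std::map<std::uint64_t, record> data;
    return data;
}

inline std::vector<std::string>& region_stack()
{
    static std::vector<std::string> data;
    return data;
}
}  // namespace detail

extern "C"
{
    // Handle style: every call returns a unique index, even for equal labels.
    std::uint64_t demo_begin_record(const char* label)
    {
        static std::uint64_t next_id = 0;
        const std::uint64_t  id      = next_id++;
        detail::records()[id]        = detail::record{ label, true };
        return id;
    }

    void demo_end_record(std::uint64_t id)
    {
        auto itr = detail::records().find(id);
        if(itr != detail::records().end())
            itr->second.open = false;
    }

    // Label style: regions are matched by name in last-in, first-out order.
    void demo_push_region(const char* label)
    {
        detail::region_stack().push_back(label);
    }

    // Returns 1 if the label matched the innermost region, 0 otherwise.
    int demo_pop_region(const char* label)
    {
        auto& stack = detail::region_stack();
        if(stack.empty() || stack.back() != label)
            return 0;
        stack.pop_back();
        return 1;
    }
}
```

The handle style lets two records with the same label overlap without ambiguity, while the label style needs no extra variable at the call site.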
In Python:
import timemory
from timemory.profiler import profile
from timemory.util import auto_tuple

def get_config(items=["wall_clock", "tau_marker"]):
    """
    Converts strings to enumerations
    """
    return [getattr(timemory.component, x) for x in items]

@profile(["wall_clock", "tau_marker"])
def foo():
    """
    @profile (also available as context-manager) enables full python instrumentation
    of every subsequent python call
    """
    # ...

@auto_tuple(get_config())
def bar():
    """
    @auto_tuple (also available as context-manager) enables instrumentation
    of only this function
    """
    # ...

if __name__ == "__main__":
    foo()
    bar()
    timemory.finalize()