CPU Performance Counters Library Functions cpc_bind_curlwp(3CPC)
NAME
cpc_bind_curlwp, cpc_bind_pctx, cpc_bind_cpu, cpc_unbind,
cpc_request_preset, cpc_set_restart - bind request sets to
hardware counters
SYNOPSIS
cc [ flag... ] file... -lcpc [ library... ]
#include <libcpc.h>
int cpc_bind_curlwp(cpc_t *cpc, cpc_set_t *set, uint_t flags);
int cpc_bind_pctx(cpc_t *cpc, pctx_t *pctx, id_t id, cpc_set_t *set,
    uint_t flags);
int cpc_bind_cpu(cpc_t *cpc, processorid_t id, cpc_set_t *set,
    uint_t flags);
int cpc_unbind(cpc_t *cpc, cpc_set_t *set);
int cpc_request_preset(cpc_t *cpc, int index, uint64_t preset);
int cpc_set_restart(cpc_t *cpc, cpc_set_t *set);
DESCRIPTION
These functions program the processor's hardware counters
according to the requests contained in the set argument. If
these functions are successful, then upon return the physical
counters will have been assigned to count events on behalf of
each request in the set, and each counter will be enabled as
configured.
The cpc_bind_curlwp() function binds the set to the calling
LWP. If successful, a performance counter context is
associated with the LWP that allows the system to virtualize
the hardware counters to that specific LWP.
By default, the system binds the set to the current LWP
only. If the CPC_BIND_LWP_INHERIT flag is present in the
flags argument, however, any subsequent LWPs created by the
current LWP will inherit a copy of the request set. The
newly created LWP will have its virtualized 64-bit counters
initialized to the preset values specified in set, and the
counters will be enabled and begin counting events on behalf
of the new LWP. This automatic inheritance behavior can be
useful when dealing with multithreaded programs to determine
aggregate statistics for the program as a whole.
If the CPC_BIND_LWP_INHERIT flag is specified and any of the
requests in the set have the CPC_OVF_NOTIFY_EMT flag set,
the process will immediately dispatch a SIGEMT signal to the
freshly created LWP so that it can preset its counters
appropriately on the new LWP. This initialization condition
can be detected using cpc_set_sample(3CPC) and looking at
the counter value for any requests with CPC_OVF_NOTIFY_EMT
set. The value of any such counters will be UINT64_MAX.
The cpc_bind_pctx() function binds the set to the LWP
specified by the pctx-id pair, where pctx refers to a handle
returned from libpctx and id is the ID of the desired LWP in
the target process. If successful, a performance counter
context is associated with the specified LWP and the system
virtualizes the hardware counters to that specific LWP. The
flags argument is reserved for future use and must always be
0.
The cpc_bind_cpu() function binds the set to the specified
CPU and measures events occurring on that CPU regardless of
which LWP is running. Only one such binding can be active on
the specified CPU at a time. As long as any application has
bound a set to a CPU, per-LWP counters are unavailable and
any attempt to use either cpc_bind_curlwp() or
cpc_bind_pctx() returns EAGAIN. The first invocation of
cpc_bind_cpu() invalidates all currently bound per-LWP
counter sets, and any attempt to sample an invalidated set
returns EAGAIN. To bind to a CPU, the library binds the
calling LWP to the measured CPU with processor_bind(2). The
application must not change its processor binding until
after it has unbound the set with cpc_unbind(). The flags
argument is reserved for future use and must always be 0.
The cpc_request_preset() function updates the preset and
current value stored in the indexed request within the
currently bound set. This changes the starting value for the
specified request for the calling LWP only, and the change
takes effect at the next call to cpc_set_restart().
When a performance counter counting on behalf of a request
with the CPC_OVF_NOTIFY_EMT flag set overflows, the
performance counters are frozen and the LWP to which the set
is bound receives a SIGEMT signal. The cpc_set_restart()
function can be called from a SIGEMT signal handler function
to quickly restart the hardware counters. Counting begins
from each request's original preset (see
cpc_set_add_request(3CPC)), or from the preset specified in
a prior call to cpc_request_preset(). Applications
performing performance counter overflow profiling should use
the cpc_set_restart() function to quickly restart counting
after receiving a SIGEMT overflow signal and recording any
relevant program state.
The cpc_unbind() function unbinds the set from the resource
to which it is bound. All hardware resources associated with
the bound set are freed and, if the set was bound to a CPU,
the calling LWP is unbound from the corresponding CPU. See
processor_bind(2).
RETURN VALUES
Upon successful completion these functions return 0.
Otherwise, -1 is returned and errno is set to indicate the
error.
ERRORS
Applications wanting to get detailed error values should
register an error handler with cpc_seterrhndlr(3CPC).
Otherwise, the library will output a specific error
description to stderr.
These functions will fail if:
EACCES      For cpc_bind_curlwp(), the system has Pentium 4
            processors with HyperThreading and at least one
            physical processor has more than one hardware
            thread online. See NOTES.

            For cpc_bind_cpu(), the process does not have the
            cpc_cpu privilege to access the CPU's counters.

            For cpc_bind_curlwp(), cpc_bind_cpu(), and
            cpc_bind_pctx(), access to the requested
            hypervisor event was denied.

EAGAIN      For cpc_bind_curlwp() and cpc_bind_pctx(), the
            performance counters are not available for use by
            the application.

            For cpc_bind_cpu(), another process has already
            bound to this CPU. Only one process is allowed to
            bind to a CPU at a time and only one set can be
            bound to a CPU at a time.

EINVAL      The set does not contain any requests or
            cpc_set_add_request() was not called.

            The value given for an attribute of a request is
            out of range.

            The system could not assign a physical counter to
            each request in the set. See NOTES.

            One or more requests in the set conflict and
            cannot be programmed simultaneously.

            The set was not created with the same cpc handle.

            For cpc_bind_cpu(), the specified processor does
            not exist.

            For cpc_unbind(), the set is not bound.

            For cpc_request_preset() and cpc_set_restart(),
            the calling LWP does not have a bound set.

ENOSYS      For cpc_bind_cpu(), the specified processor is
            not online.

ENOTSUP     The cpc_bind_curlwp() function was called with
            the CPC_OVF_NOTIFY_EMT flag, but the underlying
            processor is not capable of detecting counter
            overflow.

ESRCH       For cpc_bind_pctx(), the specified LWP in the
            target process does not exist.
EXAMPLES
Example 1 Use hardware performance counters to measure
events in a process.
The following example demonstrates how a standalone
application can be instrumented with the libcpc(3LIB)
functions to use hardware performance counters to measure
events in a process. The application performs 20 iterations
of a computation, measuring the counter values for each
iteration. By default, the example makes use of two counters
to measure external cache references and external cache hits.
These options are only appropriate for UltraSPARC processors.
By setting the EVENT0 and EVENT1 environment variables to
other strings (a list of which can be obtained from the -h
option of the cpustat(1M) or cputrack(1) utilities), other
events can be counted. The error() routine is assumed to be a
user-provided routine analogous to the familiar printf(3C)
function from the C library that also performs an exit(2)
after printing the message.
#include <inttypes.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/lwp.h>
#include <libcpc.h>

int
main(int argc, char *argv[])
{
    int iter;
    char *event0 = NULL, *event1 = NULL;
    cpc_t *cpc;
    cpc_set_t *set;
    cpc_buf_t *diff, *after, *before;
    int ind0, ind1;
    uint64_t val0, val1;

    if ((cpc = cpc_open(CPC_VER_CURRENT)) == NULL)
        error("perf counters unavailable: %s", strerror(errno));

    /* Fall back to the UltraSPARC external cache events. */
    if ((event0 = getenv("EVENT0")) == NULL)
        event0 = "EC_ref";
    if ((event1 = getenv("EVENT1")) == NULL)
        event1 = "EC_hit";

    if ((set = cpc_set_create(cpc)) == NULL)
        error("could not create set: %s", strerror(errno));

    if ((ind0 = cpc_set_add_request(cpc, set, event0, 0, CPC_COUNT_USER, 0,
        NULL)) == -1)
        error("could not add first request: %s", strerror(errno));

    if ((ind1 = cpc_set_add_request(cpc, set, event1, 0, CPC_COUNT_USER, 0,
        NULL)) == -1)
        error("could not add second request: %s", strerror(errno));

    if ((diff = cpc_buf_create(cpc, set)) == NULL)
        error("could not create buffer: %s", strerror(errno));
    if ((after = cpc_buf_create(cpc, set)) == NULL)
        error("could not create buffer: %s", strerror(errno));
    if ((before = cpc_buf_create(cpc, set)) == NULL)
        error("could not create buffer: %s", strerror(errno));

    if (cpc_bind_curlwp(cpc, set, 0) == -1)
        error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

    for (iter = 1; iter <= 20; iter++) {

        if (cpc_set_sample(cpc, set, before) == -1)
            break;

        /* ==> Computation to be measured goes here <== */

        if (cpc_set_sample(cpc, set, after) == -1)
            break;

        /* diff = after - before, then read both counters. */
        cpc_buf_sub(cpc, diff, after, before);
        cpc_buf_get(cpc, diff, ind0, &val0);
        cpc_buf_get(cpc, diff, ind1, &val1);

        (void) printf("%3d: %" PRIu64 " %" PRIu64 "\n", iter,
            val0, val1);
    }

    /* The loop exits early only when sampling fails. */
    if (iter != 21)
        error("cannot sample set: %s", strerror(errno));

    cpc_close(cpc);

    return (0);
}
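The examples leave error() to the reader. One possible sketch of such a routine follows, assuming it should print a printf-style message and a newline to stderr and then terminate. The helper names vreport() and report(), and the exit status of 1, are illustrative choices and are not part of libcpc.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: write a formatted message followed by a
 * newline to out. Returns the value of vfprintf(), that is, the
 * number of characters in the message excluding the newline.
 */
static int
vreport(FILE *out, const char *fmt, va_list ap)
{
    int n = vfprintf(out, fmt, ap);

    (void) fputc('\n', out);
    return (n);
}

/* Hypothetical non-fatal variant: print the message and return. */
int
report(FILE *out, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vreport(out, fmt, ap);
    va_end(ap);
    return (n);
}

/* Sketch of error(): print the message to stderr, then exit. */
void
error(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    (void) vreport(stderr, fmt, ap);
    va_end(ap);
    exit(1);
}
```

With this sketch, a call such as error("cannot bind lwp%d: %s", id, strerror(errno)) prints one line to stderr and ends the process.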
Example 2 Write a signal handler to catch overflow signals.
The following example builds on Example 1 and demonstrates
how to write the signal handler to catch overflow signals. A
counter is preset so that it is 1000 counts short of
overflowing. After 1000 counts the signal handler is invoked.
The signal handler:
cpc_t *cpc;
cpc_set_t *set;
cpc_buf_t *buf;
int index;

void
emt_handler(int sig, siginfo_t *sip, void *arg)
{
    ucontext_t *uap = arg;
    uint64_t val;

    /* Ignore anything that is not a CPC overflow notification. */
    if (sig != SIGEMT || sip->si_code != EMT_CPCOVF) {
        psignal(sig, "example");
        psiginfo(sip, "example");
        return;
    }

    (void) printf("lwp%d - si_addr %p ucontext: %%pc %p %%sp %p\n",
        _lwp_self(), (void *)sip->si_addr,
        (void *)uap->uc_mcontext.gregs[PC],
        (void *)uap->uc_mcontext.gregs[SP]);

    if (cpc_set_sample(cpc, set, buf) != 0)
        error("cannot sample: %s", strerror(errno));

    cpc_buf_get(cpc, buf, index, &val);

    (void) printf("0x%" PRIx64 "\n", val);
    (void) fflush(stdout);

    /*
     * Update a request's preset and restart the counters. Counters which
     * have not been preset with cpc_request_preset() will resume counting
     * from their current value. ind1 and val1 are assumed here to be
     * file-scope variables holding the request index and the new preset.
     */
    if (cpc_request_preset(cpc, ind1, val1) != 0)
        error("cannot set preset for request %d: %s", ind1,
            strerror(errno));

    if (cpc_set_restart(cpc, set) != 0)
        error("cannot restart lwp%d: %s", _lwp_self(), strerror(errno));
}
The setup code, which can be positioned after the code that
opens the CPC library and creates a set:
#define PRESET (UINT64_MAX - 999ull)

    struct sigaction act;
    ...
    act.sa_sigaction = emt_handler;
    bzero(&act.sa_mask, sizeof (act.sa_mask));
    act.sa_flags = SA_RESTART | SA_SIGINFO;
    if (sigaction(SIGEMT, &act, NULL) == -1)
        error("sigaction: %s", strerror(errno));

    /* cpc_set_add_request() returns the request index, or -1. */
    if ((index = cpc_set_add_request(cpc, set, event, PRESET,
        CPC_COUNT_USER | CPC_OVF_NOTIFY_EMT, 0, NULL)) == -1)
        error("cannot add request to set: %s", strerror(errno));

    if ((buf = cpc_buf_create(cpc, set)) == NULL)
        error("cannot create buffer: %s", strerror(errno));

    if (cpc_bind_curlwp(cpc, set, 0) == -1)
        error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

    for (iter = 1; iter <= 20; iter++) {
        /* ==> Computation to be measured goes here <== */
    }

    cpc_unbind(cpc, set);      /* done */
ATTRIBUTES
See attributes(5) for descriptions of the following
attributes:
ATTRIBUTE TYPE          ATTRIBUTE VALUE
Interface Stability     Evolving
MT-Level                Safe
SEE ALSO
cpustat(1M), cputrack(1), psrinfo(1M), processor_bind(2),
cpc_seterrhndlr(3CPC), cpc_set_sample(3CPC), libcpc(3LIB),
attributes(5)
NOTES
When a set is bound, the system assigns a physical hardware
counter to count on behalf of each request in the set. If
such an assignment is not possible for all requests in the
set, the bind function returns -1 and sets errno to EINVAL.
The assignment of requests to counters depends on the
capabilities of the available counters. Some processors (such
as Pentium 4) have a complicated counter control mechanism
that requires the reservation of limited hardware resources
beyond the actual counters. As a result, two requests for
different events might be impossible to count at the same
time. See the processor manual as referenced by
cpc_cpuref(3CPC) for details about the underlying processor's
capabilities and limitations.
Some processors can be configured to dispatch an interrupt
when a physical counter overflows. The most obvious use for
this facility is to ensure that the full 64-bit counter
values are maintained without repeated sampling. Certain
hardware, such as the UltraSPARC processor, does not record
which counter overflowed. A more subtle use for this
facility is to preset the counter to a value slightly less
than the maximum value, then use the resulting interrupt to
catch the counter overflow associated with that event. The
overflow can then be used as an indication of the frequency
of the occurrence of that event.
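The arithmetic behind this technique can be sketched as follows. The two function names are hypothetical, and the sketch assumes that the counter is restarted from the same preset after every overflow, so that each SIGEMT signal stands for roughly one interval's worth of events. It follows the convention of Example 2, where a preset of UINT64_MAX - 999 overflows after 1000 counts.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the preset that leaves exactly `interval`
 * events before the 64-bit counter overflows. interval must be
 * at least 1.
 */
uint64_t
preset_for_interval(uint64_t interval)
{
    return (UINT64_MAX - (interval - 1));
}

/*
 * Hypothetical helper: rough event count implied by a number of
 * overflow signals, assuming each one represents `interval` events.
 */
uint64_t
estimated_events(uint64_t overflows, uint64_t interval)
{
    return (overflows * interval);
}
```

A larger interval produces fewer signals and less perturbation, at the cost of a coarser estimate.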
The interrupt generated by the processor might not be
particularly precise. That is, the particular instruction
that caused the counter overflow might be earlier in the
instruction stream than is indicated by the program counter
value in the ucontext.
When a request is added to a set with the CPC_OVF_NOTIFY_EMT
flag set, then as before, the control registers and counter
are preset from the 64-bit preset value given. When the flag
is set, however, the kernel arranges to send the calling
process a SIGEMT signal when the overflow occurs. The
si_code member of the corresponding siginfo structure is set
to EMT_CPCOVF and the si_addr member takes the program
counter value at the time the overflow interrupt was
delivered. Counting is disabled until the set is bound
again.
If the CPC_CAP_OVERFLOW_PRECISE bit is set in the value
returned by cpc_caps(3CPC), the processor is able to
determine precisely which counter has overflowed after
receiving the overflow interrupt. On such processors, the
SIGEMT signal is sent only if a counter overflows and the
request that the counter is counting has the
CPC_OVF_NOTIFY_EMT flag set. If the capability is not present
on the processor, the system sends a SIGEMT signal to the
process if any of its requests have the CPC_OVF_NOTIFY_EMT
flag set and any counter in its set overflows.
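The two delivery rules can be restated as a small decision function. This is only a model of the rule as described here, not library or kernel code: the struct req_state type and the sigemt_expected() function are hypothetical names, and the precise argument stands for whether CPC_CAP_OVERFLOW_PRECISE is set in the value returned by cpc_caps(3CPC).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request state used only by this model. */
struct req_state {
    bool notify_emt;    /* request has CPC_OVF_NOTIFY_EMT set */
    bool overflowed;    /* counter for this request overflowed */
};

/*
 * Model of the documented rule: with precise overflow detection,
 * a signal is expected only when an overflowing counter belongs to
 * a request with the notify flag. Without it, any overflow in the
 * set triggers a signal if any request has the flag.
 */
bool
sigemt_expected(bool precise, const struct req_state *reqs, size_t n)
{
    bool any_notify = false;
    bool any_overflow = false;
    size_t i;

    for (i = 0; i < n; i++) {
        if (precise && reqs[i].notify_emt && reqs[i].overflowed)
            return (true);
        any_notify = any_notify || reqs[i].notify_emt;
        any_overflow = any_overflow || reqs[i].overflowed;
    }
    return (!precise && any_notify && any_overflow);
}
```

On imprecise hardware a handler can therefore be invoked for an overflow on a request that did not ask for notification, so it should not assume which counter overflowed.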
Different processors have different counter ranges
available, though all processors supported by Solaris allow
at least 31 bits to be specified as a counter preset value.
Portable preset values lie in the range UINT64_MAX to
UINT64_MAX-INT32_MAX.
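A check against this portable range might look like the sketch below. The macro PORTABLE_PRESET_MIN and the function preset_is_portable() are hypothetical names introduced for illustration. The bounds are taken directly from the range stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest preset in the portable range quoted above. */
#define PORTABLE_PRESET_MIN (UINT64_MAX - (uint64_t)INT32_MAX)

/*
 * Hypothetical helper: true if preset lies between
 * UINT64_MAX - INT32_MAX and UINT64_MAX inclusive.
 */
bool
preset_is_portable(uint64_t preset)
{
    return (preset >= PORTABLE_PRESET_MIN);
}
```

The PRESET value used in Example 2, UINT64_MAX - 999, falls inside this range.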
The appropriate preset value will often need to be
determined experimentally. Typically, this value will depend
on the event being measured as well as the desire to minimize
the impact of the act of measurement on the event being
measured. Less frequent interrupts and samples lead to less
perturbation of the system.
If the processor cannot detect counter overflow, the bind
fails and errno is set to ENOTSUP. Only user events can be
measured using this technique. See Example 2.
Pentium 4
Most Pentium 4 events require the specification of an event
mask for counting. The event mask is specified with the
emask attribute.
Pentium 4 processors with HyperThreading Technology have
only one set of hardware counters per physical processor. To
use cpc_bind_curlwp() or cpc_bind_pctx() to measure per-LWP
events on a system with Pentium 4 HT processors, a system
administrator must first take processors in the system
offline until each physical processor has only one hardware
thread online (see the -p option to psrinfo(1M)). If a
second hardware thread is brought online, all per-LWP bound
contexts will be invalidated and any attempt to sample or
bind a CPC set will return EAGAIN.
Only one CPC set at a time can be bound to a physical
processor with cpc_bind_cpu(). Any call to cpc_bind_cpu()
that attempts to bind a set to a processor that shares a
physical processor with a processor that already has a
CPU-bound set returns an error.
To measure the shared state on a Pentium 4 processor with
HyperThreading, the count_sibling_usr and count_sibling_sys
attributes are provided for use with cpc_bind_cpu(). These
attributes behave exactly as the CPC_COUNT_USER and
CPC_COUNT_SYSTEM request flags, except that they act on the
sibling hardware thread sharing the physical processor with
the CPU measured by cpc_bind_cpu().
Some CPC sets will fail to bind due to resource constraints.
The most common type of resource constraint is an ESCR
conflict among one or more requests in the set. For example,
the branch_retired event cannot be measured on counters 12
and 13 simultaneously because both counters require the
CRU_ESCR2 ESCR to measure this event. To measure
branch_retired events simultaneously on more than one
counter, use counters such that one counter uses CRU_ESCR2
and the other counter uses CRU_ESCR3. See the processor
documentation for details.
SunOS 5.11 Last change: 05 Mar 2007