31 Jul 2012 00:33
[PATCH] Fix unit mask handling (including 'extra' bits) in operf
Maynard Johnson <maynardj <at> us.ibm.com>
2012-07-30 22:33:51 GMT
2012-07-30 22:33:51 GMT
*Suravee*, *Andi*, please review this patch and give a thumbs-up or let me
know if you find problems.
*Suravee*, one known problem is that 'cpuid_vendor' is not a known function
on my system, so I commented out that part below. Can you provide an
alternative means to determine if the system is "AuthenticAMD"?
*Andi*, there is a problem in opreport where incorrect event information is
printed out if using a named unit mask. This is not a new problem; opreport
prints wrong info whether opcontrol or operf was used to collect the profile.
For example, I tested this patch below with uops_retired, using both "0x2"
and "total_cycles" for unit mask values. When using "total_cycles", the header
info printed by opreport is:
CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Counted uops_retired events (uops that actually retired.) with a unit mask of 0x00 (multiple flags) count 2000000
Note the incorrect unit mask info--^
This is because opreport determines what event(s) was(were) being profiled
by inspecting the header file created in the samples directory. It gets the
event code and numerical unit mask values and calls op_find_event().
There's currently no way to save a unit mask string in the samples header; if
there were, we would then need a new version of op_find-event() that would take
a um string instead of a um number.
Since this problem has existed since named unit masks were introduced, I guess
it must not be too serious as it's not been reported by anyone. But besides the
downside of opreport not showing correct event information, there's a more serious
problem. The samples are stored in sample files named something like:
uops_retired.2000000.0.all.all.all
(Continue reading)
RSS Feed