Andi Kleen | 10 Jan 13:32 2014

[PATCH 1/4] perf, tools: Add support for prepending LBRs to the callstack

From: Andi Kleen <ak@linux.intel.com>

I never found the default LBR display mode which generates histograms
of individual branches particularly useful.

This implements an alternative mode that creates histograms over complete
branch traces, instead of individual branches, similar to how normal
callgraphs are handled. This is done by putting the LBR entries in
front of the normal callgraph and then using the normal callgraph
histogram infrastructure to unify them.

This way in complex functions we can understand the control flow
that led to a particular sample.

The default output is unchanged.

This is only implemented in perf report, no change to record
or anywhere else.

This adds the basic code to report:
- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set, include the LBR in the callstack in machine.c.
The rest of the history code is unchanged and doesn't know the difference
between an LBR entry and a normal call entry.

Current limitations:
- There is no attempt to cut off the LBR at the beginning of the function,
so there may be small overlaps between the callstack and the LBR.
- The LBR flags (mispredict etc.) are not shown in the history

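
As a rough illustration of the prepending step described in patch 1/4, the
following C sketch places LBR entries in front of an ordinary callchain. It is
not the code from the patch: the struct branch_entry layout, the function name
build_branch_callchain, and the choice to emit both the target and the source
address of each branch are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one LBR record (one taken branch). */
struct branch_entry {
	uint64_t from;	/* address of the branch instruction */
	uint64_t to;	/* address of the branch target */
};

/*
 * Build a combined chain: the LBR entries come first (most recent branch
 * first), followed by the ordinary callchain (innermost frame first).
 * Returns the number of addresses written to out, never more than out_len.
 */
static size_t build_branch_callchain(const struct branch_entry *lbr, size_t nr_lbr,
				     const uint64_t *callchain, size_t nr_chain,
				     uint64_t *out, size_t out_len)
{
	size_t n = 0;

	assert(out != NULL || out_len == 0);

	/* Assumption: each branch contributes its target, then its source. */
	for (size_t i = 0; i < nr_lbr && n + 2 <= out_len; i++) {
		out[n++] = lbr[i].to;
		out[n++] = lbr[i].from;
	}

	/* The regular callchain follows unchanged. */
	for (size_t i = 0; i < nr_chain && n < out_len; i++)
		out[n++] = callchain[i];

	return n;
}
```

Because the result is a flat list of addresses, histogram code that already
merges callchains could treat it like any other chain, which matches the
description that the rest of the history code does not know the difference.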

Andi Kleen | 10 Jan 13:32 2014

[PATCH 2/4] perf, tools: Add --branch-call-stack option to report

From: Andi Kleen <ak@linux.intel.com>

Add a --branch-call-stack option to perf report that changes all
the settings necessary for using the branches in callstacks.

This is just a shortcut to make the feature nicer to use.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt |  5 +++++
 tools/perf/builtin-report.c              | 25 ++++++++++++++++++++++---
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 10a2798..77ec0b9 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -223,6 +223,11 @@ OPTIONS
 	branch stacks and it will automatically switch to the branch view mode,
 	unless --no-branch-stack is used.

+--branch-call-stack::
+	Add the addresses of sampled taken branches to the callstack.
+	This allows to examine the path the program took to each sample.
+	The data collection must have used -b or -j.	
+
 --objdump=<path>::
         Path to objdump binary.

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c

Andi Kleen | 10 Jan 13:32 2014

[PATCH 3/4] perf, tools: Filter out small loops from LBR-as-call-stack

From: Andi Kleen <ak@linux.intel.com>

Small loops can cause unnecessary duplication in the LBR-as-callstack,
because the loop body appears multiple times. Filter out duplications
from the LBR before unifying it into the histories.  This way the
same loop body only appears once.

This uses a simple hash-based cycle detector. It takes some shortcuts
(hash collisions are not handled), so in rare cases duplicates may
be missed.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/machine.c | 73 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index a7e538b..0fb4e9a 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -10,6 +10,7 @@
 #include "thread.h"
 #include <stdbool.h>
 #include "unwind.h"
+#include "linux/hash.h"

 int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 {
@@ -1302,6 +1303,46 @@ static int add_callchain_ip(struct machine *machine,
 	return callchain_cursor_append(&callchain_cursor, ip, al.map, al.sym);
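
The loop filtering described in patch 3/4 can be sketched roughly as follows.
This is an independent illustration, not the code from the truncated diff
above: struct branch_entry, hash_addr, same_run and filter_loops are
hypothetical names, and the multiplicative hash is simply one plausible choice.
Like the approach described, it keeps a small table keyed on the hashed branch
source address and does not resolve collisions, so some duplicates can be
missed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one LBR record. */
struct branch_entry {
	uint64_t from;
	uint64_t to;
};

#define CHASH_BITS 6
#define CHASH_SIZE (1u << CHASH_BITS)
#define NO_ENTRY   SIZE_MAX

/* Multiplicative hash that keeps the top CHASH_BITS bits. */
static unsigned int hash_addr(uint64_t addr)
{
	return (unsigned int)((addr * 0x9E3779B97F4A7C15ULL) >> (64 - CHASH_BITS));
}

/* True if the n entries starting at a equal the n entries starting at b. */
static bool same_run(const struct branch_entry *a,
		     const struct branch_entry *b, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (a[i].from != b[i].from || a[i].to != b[i].to)
			return false;
	return true;
}

/*
 * Remove immediately repeated runs of branches in place, so that a loop
 * body appears only once. Returns the new number of entries.
 */
static size_t filter_loops(struct branch_entry *l, size_t nr)
{
	size_t seen[CHASH_SIZE];
	size_t i = 0;

	assert(l != NULL || nr == 0);
	for (size_t k = 0; k < CHASH_SIZE; k++)
		seen[k] = NO_ENTRY;

	while (i < nr) {
		unsigned int h = hash_addr(l[i].from);
		size_t prev = seen[h];

		/* Same source address seen before: candidate loop. */
		if (prev != NO_ENTRY && l[prev].from == l[i].from) {
			size_t period = i - prev;

			/* Only drop a repetition that fits completely. */
			if (i + period <= nr && same_run(l + prev, l + i, period)) {
				memmove(l + i, l + i + period,
					(nr - (i + period)) * sizeof(*l));
				nr -= period;
				continue;	/* re-check the entry now at i */
			}
		}
		/* No collision handling: the latest entry takes the slot. */
		seen[h] = i;
		i++;
	}
	return nr;
}
```

Re-checking index i after each removal lets a loop body that repeats many
times collapse to a single copy. A colliding address only overwrites the
table slot, which is the kind of shortcut that can cause a duplicate to be
missed.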

Andi Kleen | 10 Jan 13:32 2014

[PATCH 4/4] perf, tools: Enable printing the srcline in the history

From: Andi Kleen <ak@linux.intel.com>

For lbr-as-callgraph we need to see the line number in the history,
because many LBR entries can be in a single function, and just
showing the same function name many times is not useful.

When the history code is configured to sort by address, also try to
resolve the address to a file:srcline and display this in the browser.
If that doesn't work, still display the address.

This can also be useful without LBRs for understanding which call in a large
function (or in which inlined function) called something else.

Contains fixes from Namhyung Kim

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/ui/browsers/hists.c | 15 ++++++++++++---
 tools/perf/ui/stdio/hist.c     | 16 +++++++++++++---
 tools/perf/util/callchain.h    |  1 +
 tools/perf/util/machine.c      |  2 +-
 tools/perf/util/srcline.c      |  8 +++++++-
 5 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index a440e03..5e0688b 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -399,9 +399,18 @@ static char *callchain_list__sym_name(struct callchain_list *cl,
 {
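
The fallback behaviour described in patch 4/4 (show file:srcline when the
address can be resolved, otherwise still show the address) might look roughly
like the sketch below. It is not the code from the truncated diff: the
srcline_fn callback type, format_chain_entry, demo_resolve and the exact output
layout are all hypothetical and stand in for whatever the real resolver and
browser code do.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical resolver: returns "file:line", or NULL if unknown. */
typedef const char *(*srcline_fn)(uint64_t addr);

/*
 * Format one callchain entry as "<symbol> <file:line>" when the address
 * resolves, and as "<symbol> <hex address>" otherwise.
 * Returns the value of snprintf().
 */
static int format_chain_entry(char *buf, size_t len, const char *sym,
			      uint64_t addr, srcline_fn resolve)
{
	const char *srcline = resolve ? resolve(addr) : NULL;

	assert(buf != NULL && len > 0);
	if (!sym)
		sym = "[unknown]";

	if (srcline)
		return snprintf(buf, len, "%s %s", sym, srcline);

	/* Resolution failed: fall back to the raw address. */
	return snprintf(buf, len, "%s %#" PRIx64, sym, addr);
}

/* Stub resolver used only for demonstration: knows a single address. */
static const char *demo_resolve(uint64_t addr)
{
	return addr == 0x1000 ? "foo.c:42" : NULL;
}
```

With the stub resolver, an entry for symbol main at 0x1000 is formatted as
"main foo.c:42", while an entry at 0x2000 falls back to "main 0x2000". Several
entries within one function then stay distinguishable, which is the motivation
given in the patch description.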

Jiri Olsa | 11 Jan 16:36 2014

Re: [PATCH 1/4] perf, tools: Add support for prepending LBRs to the callstack

On Fri, Jan 10, 2014 at 04:32:03AM -0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> I never found the default LBR display mode which generates histograms
> of individual branches particularly useful.
> 
> This implements an alternative mode that creates histograms over complete
> branch traces, instead of individual branches, similar to how normal
> callgraphs are handled. This is done by putting it in
> front of the normal callgraph and then using the normal callgraph
> histogram infrastructure to unify them.
> 
> This way in complex functions we can understand the control flow
> that lead to a particular sample.
> 
> The default output is unchanged.
> 
> This is only implemented in perf report, no change to record
> or anywhere else.
> 
> This adds the basic code to report:
> - add a new "branch" option to the -g option parser to enable this mode
> - when the flag is set include the LBR into the callstack in machine.c.
> The rest of the history code is unchanged and doesn't know the difference
> between LBR entry and normal call entry.

sounds like nice idea, but I could not get the patchset applied
on acme's perf/core

jirka

Andi Kleen | 11 Jan 18:58 2014

Re: [PATCH 1/4] perf, tools: Add support for prepending LBRs to the callstack

On Sat, Jan 11, 2014 at 04:36:14PM +0100, Jiri Olsa wrote:
> On Fri, Jan 10, 2014 at 04:32:03AM -0800, Andi Kleen wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> > 
> > I never found the default LBR display mode which generates histograms
> > of individual branches particularly useful.
> > 
> > This implements an alternative mode that creates histograms over complete
> > branch traces, instead of individual branches, similar to how normal
> > callgraphs are handled. This is done by putting it in
> > front of the normal callgraph and then using the normal callgraph
> > histogram infrastructure to unify them.
> > 
> > This way in complex functions we can understand the control flow
> > that lead to a particular sample.
> > 
> > The default output is unchanged.
> > 
> > This is only implemented in perf report, no change to record
> > or anywhere else.
> > 
> > This adds the basic code to report:
> > - add a new "branch" option to the -g option parser to enable this mode
> > - when the flag is set include the LBR into the callstack in machine.c.
> > The rest of the history code is unchanged and doesn't know the difference
> > between LBR entry and normal call entry.
> 
> sounds like nice idea, but I could not get the patchset applied
> on acme's perf/core


Arnaldo Carvalho de Melo | 11 Jan 20:16 2014

Re: [PATCH 1/4] perf, tools: Add support for prepending LBRs to the callstack

Em Sat, Jan 11, 2014 at 06:58:16PM +0100, Andi Kleen escreveu:
> On Sat, Jan 11, 2014 at 04:36:14PM +0100, Jiri Olsa wrote:
> > On Fri, Jan 10, 2014 at 04:32:03AM -0800, Andi Kleen wrote:
> > > From: Andi Kleen <ak@linux.intel.com>
> > > 
> > > I never found the default LBR display mode which generates histograms
> > > of individual branches particularly useful.
> > > 
> > > This implements an alternative mode that creates histograms over complete
> > > branch traces, instead of individual branches, similar to how normal
> > > callgraphs are handled. This is done by putting it in
> > > front of the normal callgraph and then using the normal callgraph
> > > histogram infrastructure to unify them.
> > > 
> > > This way in complex functions we can understand the control flow
> > > that lead to a particular sample.
> > > 
> > > The default output is unchanged.
> > > 
> > > This is only implemented in perf report, no change to record
> > > or anywhere else.
> > > 
> > > This adds the basic code to report:
> > > - add a new "branch" option to the -g option parser to enable this mode
> > > - when the flag is set include the LBR into the callstack in machine.c.
> > > The rest of the history code is unchanged and doesn't know the difference
> > > between LBR entry and normal call entry.
> > 
> > sounds like nice idea, but I could not get the patchset applied
> > on acme's perf/core

Arnaldo Carvalho de Melo | 11 Jan 20:18 2014

Re: [PATCH 1/4] perf, tools: Add support for prepending LBRs to the callstack

Em Sat, Jan 11, 2014 at 04:16:57PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Sat, Jan 11, 2014 at 06:58:16PM +0100, Andi Kleen escreveu:
> > On Sat, Jan 11, 2014 at 04:36:14PM +0100, Jiri Olsa wrote:
> > > On Fri, Jan 10, 2014 at 04:32:03AM -0800, Andi Kleen wrote:
> > > > From: Andi Kleen <ak@linux.intel.com>
> > > > 
> > > > I never found the default LBR display mode which generates histograms
> > > > of individual branches particularly useful.
> > > > 
> > > > This implements an alternative mode that creates histograms over complete
> > > > branch traces, instead of individual branches, similar to how normal
> > > > callgraphs are handled. This is done by putting it in
> > > > front of the normal callgraph and then using the normal callgraph
> > > > histogram infrastructure to unify them.
> > > > 
> > > > This way in complex functions we can understand the control flow
> > > > that lead to a particular sample.
> > > > 
> > > > The default output is unchanged.
> > > > 
> > > > This is only implemented in perf report, no change to record
> > > > or anywhere else.
> > > > 
> > > > This adds the basic code to report:
> > > > - add a new "branch" option to the -g option parser to enable this mode
> > > > - when the flag is set include the LBR into the callstack in machine.c.
> > > > The rest of the history code is unchanged and doesn't know the difference
> > > > between LBR entry and normal call entry.
> > > 
> > > sounds like nice idea, but I could not get the patchset applied

Andi Kleen | 11 Jan 20:30 2014

Re: [PATCH 1/4] perf, tools: Add support for prepending LBRs to the callstack

> What was your build command line?
> 
> Here, on a f18 system it works with these:
> 
> $ make -C tools/perf O=/tmp/build/perf install
> 
> $ cd tools/perf ; make
> 
> Trying on another system...

Sorry for the false alarm. It looks like it was a problem on my side.
Works now.

-andi
