Using OProfile

OProfile is a system-wide profiler for Linux systems, capable of profiling all running code at low overhead. OProfile is released under the GNU General Public License. OProfile consists of a kernel driver and a daemon for collecting sample data, and several post-profiling tools for turning data into information.

OProfile leverages the hardware performance counters of the CPU to enable profiling of a wide variety of interesting statistics. The statistics can also be used for basic time-spent profiling to find CPU usage bottlenecks in the whole system, and within processes. All code is profiled: hardware and software interrupt handlers, kernel modules, the kernel, shared libraries, and applications.

OProfile is currently in alpha status. However, it has proven stable on a large number of different configurations.

Use OProfile to find the bottleneck processes and functions in applications. You can then use Valgrind or Callgrind on the device to get callgraph data for your application, and Kcachegrind on Linux to view the callgraphs and see why code ends up in those bottleneck functions. The tools are used in this order for the following reasons:

OProfile provides system-wide information (important in case the usage of a system daemon is a bottleneck) whereas Valgrind profiles only the given process.
As OProfile profiles the whole system, its callgraphs cannot be very deep for performance reasons. Valgrind-provided callgraphs are more informative and readable.

Valgrind is also the recommended tool for finding, for example, leaks, causes of crashes, and racy data accesses. For more information, see Using Valgrind.

Packages

source: oprofile

binary: oprofile

Installing OProfile on the Harmattan device

Install OProfile through the developer mode applet.

Prerequisite: Developer mode must be enabled.

1. Select Settings > Security > Developer mode.
2. Install the Performance bundle package by clicking Install.
3. You get a notification screen that lists all the applications to be installed in the bundle package. To install, click OK.
4. A dependency notice appears. Click Accept.

For more information on developer mode and installable tools, see Activating developer mode.

Installing debug symbols

To view useful profiling information at the function level, you must also install debug symbols. Debug symbols normally come in debug (-dbg) packages. The easiest way to install all the -dbg packages required for a given binary is to use the debug-dep-install script, which comes with the maemo-debug-scripts package. Install maemo-debug-scripts with the developer mode applet. For more information, see maemo-debug-scripts.

Debug symbol packages must match their binary packages. If the report does not show symbols for all the libraries and binaries you are interested in, you can install the missing packages after profiling.

Using the tool

You can use OProfile to collect system-wide statistics and profiling information.

Collecting statistics

1. On the device, enter the following commands:

opcontrol --deinit
opcontrol --no-vmlinux
opcontrol --separate=kernel
opcontrol --init

Like the --separate=library option, the --separate=kernel option separates the collected statistics for each process and its components. In most use cases, processes (implicitly) request other processes, such as the X server and hildon-desktop, to do work for them. To optimise CPU usage, you need to see which processes in the whole system use the most CPU, and in which of their components (binary or libraries). The --separate=kernel option also assigns CPU usage within the kernel to the processes that caused it. The vmlinux binary name is used for this part (no-vmlinux when profiling with the --no-vmlinux option, as in the reports below).

2. Reset any previously collected statistics. This does no harm even if you have not profiled anything before. Enter the following command:

opcontrol --reset

3. Start the use case you are interested in and enter the following command:

opcontrol --start

4. When you have finished, enter the following command:

opcontrol --stop

Now you have collected the data.
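
When the same use case is profiled repeatedly, the steps above can be wrapped in a small shell function. The following is a minimal sketch, not part of OProfile: the function name profile_use_case and the OPCONTROL override variable are hypothetical, and the sketch assumes that the use case can be started from the command line and that profiling should start just before it.

```shell
#!/bin/sh
# Hypothetical helper that wraps the opcontrol sequence from steps 1-4.
# Set OPCONTROL=echo for a dry run that only prints the opcontrol options.
OPCONTROL="${OPCONTROL:-opcontrol}"

profile_use_case() {
    # Step 1: reinitialise OProfile with per-process kernel separation.
    "$OPCONTROL" --deinit
    "$OPCONTROL" --no-vmlinux
    "$OPCONTROL" --separate=kernel
    "$OPCONTROL" --init
    # Step 2: drop statistics left over from earlier runs.
    "$OPCONTROL" --reset
    # Step 3: start sampling, then run the use case given as arguments.
    "$OPCONTROL" --start
    "$@"
    status=$?
    # Step 4: stop sampling and hand back the exit status of the use case.
    "$OPCONTROL" --stop
    return "$status"
}
```

Calling profile_use_case followed by the command that exercises the use case profiles one run of that command. The collected data can then be inspected with opreport as described in the next section.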

Before looking into the results, see the Interpreting profiling results documentation.

Viewing profile reports

To see a basic report for each process, enter the following command:

 opreport 
Overflow stats not available
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
          TIMER:0|
  samples|     -%|
------------------
    65956 77.0217 functracer                    <- this process's percentage of total CPU usage
                  TIMER:0|
          samples|     -%|
        ------------------
            58174 88.2012 no-vmlinux            <- this process's kernel load (percentage of the 77% above)
             4626  7.0138 libc-2.10.1.so
             1815  2.7518 libunwind-arm.so.7.0.0
             1180  1.7891 functracer
               74  0.1122 libsp-rtrace1.so.1.0.2
               55  0.0834 memory.so
               32  0.0485 libbfd-2.19.51.20090709.so
    12368 14.4430 no-vmlinux                    <- kernel activity under no process = IDLE
     4952  5.7828 xrestop
                  TIMER:0|
          samples|     -%|
        ------------------
             4383 88.5097 no-vmlinux
              190  3.8368 libxcb.so.1.1.0
              164  3.3118 libX11.so.6.3.0
              150  3.0291 libc-2.10.1.so
               30  0.6058 anon (tgid:8373 range:0x3aac9000-0x3aaca000)
               22  0.4443 xrestop
               10  0.2019 libncurses.so.5.7
                3  0.0606 libXRes.so.1.0.0
     1719  2.0074 Xorg
                  TIMER:0|
          samples|     -%|
        ------------------
             1176 68.4119 no-vmlinux
              467 27.1670 Xorg
               49  2.8505 libc-2.10.1.so
               20  1.1635 fbdev_drv.so
                5  0.2909 libudev.so.0.5.0
                2  0.1163 libextmod.so
      308  0.3597 busybox
           ...
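
The percentages in the nested rows are relative to the owning process, not to the whole system. To compare components across processes, multiply the two percentages. The following sketch does this with awk; the function name system_share is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: system-wide share of a component, computed from the
# process's share of the total ($1) and the component's share of the
# process ($2), both in percent as printed by opreport.
system_share() {
    awk -v proc="$1" -v comp="$2" 'BEGIN { printf "%.2f\n", proc * comp / 100 }'
}

# Kernel load caused by functracer in the report above.
system_share 77.0217 88.2012    # prints 67.93
```

In this example, about 68% of all samples were taken in kernel code running on behalf of functracer.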

After you know the processes and components with the highest CPU usage, you need to find out the bottleneck functions or functionality in them. For this, you need to install debug symbols for them.

Note: A lot of kernel CPU activity that the --separate=kernel option does not assign to any process means that the system or kernel is idle. If your use case is (unexpectedly) slow even though the system idles a lot, the cause is usually locking or another inter-process interaction issue that cannot be analysed by looking at CPU usage. You may be able to analyse the issue with strace.

To see more detailed symbol analysis, enter the following command:

 opreport -l
Overflow stats not available
warning: /no-vmlinux could not be found.
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples -%        image name               app name                 symbol name
58174    68.1027  no-vmlinux               functracer               /no-vmlinux
12368    14.4789  no-vmlinux               no-vmlinux               /no-vmlinux         <- kernel activity under no app/process = IDLE
4383      5.1311  no-vmlinux               xrestop                  /no-vmlinux
1815      2.1248  libunwind-arm.so.7.0.0   functracer               /usr/lib/libunwind-arm.so.7.0.0
1362      1.5945  libc-2.10.1.so           functracer               strncpy
1176      1.3767  no-vmlinux               Xorg                     /no-vmlinux
527       0.6169  libc-2.10.1.so           functracer               memmove
467       0.5467  Xorg                     Xorg                     /usr/bin/Xorg
398       0.4659  libc-2.10.1.so           functracer               vfprintf
314       0.3676  libc-2.10.1.so           functracer               ptrace
198       0.2318  no-vmlinux               busybox                  /no-vmlinux
176       0.2060  libc-2.10.1.so           functracer               memcpy
...

In the example above, function names are shown for the C-library calls made by functracer, but not for the functionality with the highest usage inside the functracer, xrestop, and Xorg processes or the libunwind library. Additional debug symbols should therefore be installed and opreport -l run again. Note that anything with fewer than a few hundred samples is not statistically reliable.
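
Both checks can be automated by filtering the opreport -l output. The following is a minimal sketch with hypothetical function names. It assumes the five-column layout shown above (samples, percentage, image name, app name, symbol name) and treats a symbol column that starts with a slash as a sign of missing debug symbols.

```shell
#!/bin/sh
# Hypothetical helpers for post-processing "opreport -l" output read from
# standard input. Assumed column layout, as in the example above:
#   samples  %  image-name  app-name  symbol-name

# Print only the data rows that have at least $1 samples.
rows_with_min_samples() {
    awk -v min="$1" '$1 ~ /^[0-9]+$/ && ($1 + 0) >= (min + 0)'
}

# Print each image (once) whose symbol column is a file path instead of a
# function name, which suggests that its debug symbols are missing.
images_missing_symbols() {
    awk '$1 ~ /^[0-9]+$/ && $5 ~ /^\// && !seen[$3]++ { print $3 }'
}
```

For example, opreport -l | rows_with_min_samples 300 | images_missing_symbols lists the images that still need debug symbols among the rows with at least 300 samples. The threshold of 300 is an arbitrary stand-in for "a few hundred".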

Once you know which functionality is a bottleneck, you need to find out whether your program is (indirectly) causing the use of that functionality in the first place, whether it uses it too much or too often, or whether the bottleneck functionality itself should be optimised. This analysis falls to the developers of the tested program, as only they know what their program is trying to achieve, why, and how. Before this kind of analysis, it is too early to assign or report bugs for the lower-level components.

Profiling the kernel

If you want to profile the kernel, you need to specify the vmlinux file corresponding to your kernel. Enter the following command:

 opcontrol --vmlinux=/usr/lib/debug/vmlinux-<version>

This file, which comes from the kernel-debug package, contains both the kernel binary and its debug symbols. The opreport command can then show, for example, the kernel function names in the report output.
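
If it is not certain that the kernel-debug package is installed, the choice between the --vmlinux and --no-vmlinux options can be made by a script. The following is a minimal sketch: the function name select_kernel_image and the OPCONTROL override variable are hypothetical, and the fallback to --no-vmlinux is an assumption about the desired behaviour.

```shell
#!/bin/sh
# Hypothetical helper: point OProfile at the kernel image given as $1 if
# that file is readable, otherwise profile without kernel symbols.
# Set OPCONTROL=echo for a dry run that only prints the chosen option.
OPCONTROL="${OPCONTROL:-opcontrol}"

select_kernel_image() {
    if [ -r "$1" ]; then
        "$OPCONTROL" --vmlinux="$1"
    else
        echo "warning: $1 is not readable, using --no-vmlinux" >&2
        "$OPCONTROL" --no-vmlinux
    fi
}
```

Call the function with the path of the vmlinux file from the kernel-debug package as its only argument.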

To get results that include the callgraph and kernel module symbols, enter the following command:

 opreport -l -c -p /lib/modules/<version>/ > opreport.txt

Using opannotate

See Using opannotate to make sense of profiles.

Other profiling events supported by OProfile

OProfile supports several profiling methods, but which of them work depends on the product. The default is set to the one that works.

For more information, see OProfile documentation.

Further information

For more information on the tools, see the following links: