Gprof Quick-Start Guide

What is gprof?

Grpof is a kind of performance analysis tools called profiler for C/C++ programs. Profiling is a technique for determining how a program uses processor resources, allows you to learn where your program spent its time and which functions called which other functions while it was executing. According to this information, you can easily find the functions consuming more time than you expected and probably optimize them for better performance.

Three steps to use gprof

It is very easy to use gprof. We only need to add the option -pg when compiling the program; then execute the program as usually, a file named gmon.out will be created in the current directory; and finally, run the gprof tool on the profiling data file to produce human-readable output. Gpof provides two forms of results: flat profile and call graph. We will see both of them latter.

Here is a simple c++ program as example to demonstrate the procedure.

// example1.cpp

#include <cstdlib>
#include <stdio.h>
using namespace std;
//declaration of functions
int func1();
int func2();

int func1(void) {
    int i=0,g=0;
    while(i++<100000) {
        g+=i;
    }
    return g;
}

int func2(void) {
    int i=0,g=0;
    while(i++<400000) {
        g+=i;
    }
    return g;
}

int main(int argc, char** argv) {
    int iterations = 10000;
    printf(`Number of iterations = %d\n`, iterations);
    while(iterations--) {
        func1();
        func2();
    }
}

Step 1. Compiling for profiling

Before profiling the program, we need to recompile it specifically for profiling. Why? Because the way grpof works is to insert code at the begining and the end of each function in the program to collect the timing information, and the option -pg is used to tell the compiler to insert those code automactily.

( If you want to know more detail, you can read the Implementation of Profiling and Section 3: Profilling Concepts.)

In command line, we type:

$ g++ example1.cpp -o example1 -pg
$ ls
example1 example1.cpp

As the above, a executale named example1 from the source file example1.cpp with profiling turned on is generated in the current directory.

Step 2. Run the program normally

Once the program is compiled whtih profiling turned on, you can run the program normally, that is, pretend you did not do anything to it and do what you would normally do.

$ ./example1
number of iterations = 10000
$ ls
example1 example1.cpp gmon.out

A new file gmon.out is creatd in the current directory. This file contains the raw data in a gprof-readable format.

Things to keep in mind:

Step3. Run gprof

You can run gprof like this:

$ gprof exectable-name [ data-file ] [ > output-file ]

If you don't specify the name of a data file, gmon.out is assumed. Following the gprof command with > output-file causes the output of gprof to be saved to output-file so you can examine it later. For this example, the program name is example1 and we will save the output into a file called example1.txt:

$ gprof example1 > example1.txt
$ ls
example1 example1.cpp example1.txt gmon.out

We see that the file example1.txt is generated in the current directory. Now you can use your favourite text editor to examine this file!

Analyzing gprof's output

As we mentioned brefore, there are two forms of output gprof produces: the flat profile and the call graph. The explainations of each column for those two are also included in the output file.

Here is part of the flat profile we just got:

Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 80.80      9.59     9.59    10000   959.15   959.15  func2()
 20.33     12.00     2.41    10000   241.31   241.31  func1()

and the call graph:

Call graph (explanation follows)
granularity: each sample hit covers 2 byte(s) for 0.08% of 12.00 seconds
index % time    self  children    called     name
                                                 <spontaneous>
[1]    100.0    0.00   12.00                 main [1]
                9.59    0.00   10000/10000       func2() [2]
                2.41    0.00   10000/10000       func1() [3]
-----------------------------------------------
                9.59    0.00   10000/10000       main [1]
[2]     79.9    9.59    0.00   10000         func2() [2]
-----------------------------------------------
                2.41    0.00   10000/10000       main [1]
[3]     20.1    2.41    0.00   10000         func1() [3]
-----------------------------------------------

It is clear to see that the function main() invokes func1() and func2() which consume 80.80% and 20.33% of the total time, respectively. Don't be surprised that the addtion of those two percents is above 100%, that is caused by the calculating algorithm used by gprof.

Another example

Here is another simple program for you to practice. Before run it by yourself, could you guess which function is the most expensive one by reading the code below? Take a try!

/*--------------------------------------------------
  Gprof Example 2
  --------------------------------------------------
  main
    |-- func1 -- func3
    |-- func2 
  --------------------------------------------------
*/
#include <cstdlib>
#include <stdio.h>
using namespace std;
//declaration of functions
int func1();
int func2();
int func3();

int func1(void) {
    int i=0,g=0;
    while(i++<10000) {
        g+=i;
        func3();
    }
    return g;
}

int func2(void) {
    int i=0,g=0;
    while(i++<40000) {
        g+=i;
    }
    return g;
}


int func3() {
    int i=0,g=0;
    while(i++ < 1000) {
        g++;
    }
    return g;
}


int main(int argc, char** argv) {
    int iterations = 1000;
    printf("Number of iterations = %d\n", iterations);
    while(iterations--) {
        printf("the iteration No. %d\n", 1000-iterations);
        func1();
        func2();
    }
}  

Extended reading