man prospect (Commands) - prospect - a performance analysis tool

NAME

prospect - a performance analysis tool

SYNOPSIS

prospect --help

prospect command

prospect [-f] [-o results-file] [-e] command

prospect [-s] [-o results-file] [-e] command

prospect [-k] [-o results-file] [-e] command

prospect [--follow-forks] [--output results-file] [--disassemble] command

prospect [--system-wide] [--output results-file] [--disassemble] command

prospect [--kernel-only] [--output results-file] [--disassemble] command

DESCRIPTION

Prospect is a performance analysis tool. It was originally developed on HP-UX 9.0 in 1988 and has enjoyed a dedicated audience both inside and outside HP for performance tuning on HP-UX. On Linux, prospect uses the GPL kernel module oprofile to sample the program counter and to record fork, exec, map, and exit events and the paths of mappings. From these it constructs both a user and a kernel profile for, optionally, every process running on the system.

Prospect executes command and collects system data while the command is running. Prospect lets you report data on a single command, the command and its children, or all processes on the system. See the Program Options subsection below.

Prospect does not require any special builds or links for command; the only requirement for generating a profile is that command not be stripped. For stripped shared libraries, prospect outputs a symbol guess; see the Stripped Shared Library Profiles subsection in the NOTES section below. Prospect is very non-intrusive, generally requiring approximately 1% to 5% of the SPU while measuring and approximately 15% to 25% for analysis after the run is done. Processes run at full speed and skewing of results is minimal.

Kernel profiling is possible as an alternative to kernel-gprof with the --kernel-only (or -k for short) feature. There is no requirement for a debug kernel build for this feature: the kernel runs at full speed and under true load conditions.

Program Options

prospect command [args ...]
Profile the command (with its optional args) and output the user and kernel profile for that process only on standard output. The sampling time is bounded by the exec'ed command. This is the default behavior of prospect and it may be modified with the options that follow.
-h, -?, --help
Output quick reference usage instructions.
-v, --version
Output prospect revision information.
-o File, --output File
Output profiles to File instead of standard output.
-x, --xml-output
Output XML instead of ASCII in output.
--xml-dtd
Output the XML DTD to standard output.
-f, --follow-forks
Profile the immediate command and all its descendants. Useful for profiling multi-threaded programs and benchmark sets started from a shell or script.
-s, --system-wide
Profile all active processes on the system. Output will also contain a global kernel profile at the end.
-k, --kernel-only
Profile the kernel only (see also Kernel Profiling under NOTES below). Four different views of the kernel profile are presented and the sampling period is bounded by the exec'ed command.
-m, --merge-profiles
Merge profiles of identical binary objects. This means for example that since threads all share the same text (a binary object), you will get just one merged profile for all threads in a multi-threaded application. Also, the profiles of short-lived processes will be summed up and merged into one profile for that binary object if either --follow-forks or --system-wide modes are active. See also Merged Profiles under NOTES below.
-e, --disassemble
Enables an extended profile to be included in process profiles. The extended profile includes the assembly instructions inside symbols that are the heavy hitters. Include two -e options to force disassembled profiles for all symbols. Also note that if a file is stripped, this option will output the disassembled hits even though symbols are missing.
--force-disassemble
This has the same effect as the two -e options mentioned above: it forces disassembly for all functions regardless of the number of hits to them.
-H Hertz, --sampling-frequency Hertz
Set the sampling frequency to Hertz. By default, prospect samples at 200 Hz.
-M seconds, --min-time seconds
Specify the minimum run time required to appear in the output results. Any user time, extrapolated from sampled hits, that is less than this value will not be printed. Set this to 0 (zero) to disable the limit. The default value is 0.001 seconds.
--system-sort
Sort output by descending system time instead of descending user time.
--no-sort
Do not sort the system call tables in the output.
--system-map /path/to/System.map
Set the path to the System.map file manually. You may have to do this if prospect does not find a System.map file that matches the currently executing kernel. By default, prospect looks in /lib/modules/`uname -r`/build/System.map first, and in /boot/System.map-`uname -r` second.
--vmlinux /path/to/vmlinux
Set the path to the uncompressed vmlinux kernel file manually. You may have to do this if prospect does not find the uncompressed vmlinux file. By default, prospect looks in /lib/modules/`uname -r`/build/vmlinux for this file. On a system where you have not built the kernel (i.e. a vanilla distribution), you will not have this file available. If you cannot make this file available (usually by building the kernel, alternatively by uncompressing the compressed kernel image vmlinuz), then you will not be able to view disassembled profiles for the kernel.
-B File, --binary-trace-save File
Save the raw results of the run to a binary trace File instead of processing the run and producing the usual output above. This method has two advantages. First, prospect incurs less overhead in both processing and post-processing the trace. Second, you can post-process the saved binary run multiple times with the -b or --binary-trace-read option described next and get exactly the same results every time, since the run is effectively in "playback" mode. This second advantage is especially useful if your benchmark runs for a long time and it is not convenient to re-run prospect on it every time you need a different post-processing option.
-b File, --binary-trace-read File
Read a run from a saved binary trace File instead of the system. This option is useful to post-process a run saved with the -B option mentioned above. All prospect options are valid in this mode just like in a "real" run.
-a File, --ascii-trace File
Output an ASCII translation of either the saved binary trace or the actual run into the specified File. Expect more overhead when this option is on. This is not an issue if you are reading from a saved binary trace.
-d Level, --debug-level Level
Set the debugging output level to Level. Higher levels give more debugging information on standard error. You can also increase the debugging level by listing multiple -d options on the command line. This option is primarily used for debugging; however, if you have undecoded PCs in your profiles (commonly from code that executed from RAM, stripped files, etc.), prospect will print out the PC for these hits with this option.
--debug-system System
Debug the specified subsystem. Currently, prospect accepts only the "dis" and "kmod" keywords for the System argument. The dis subsystem outputs usage statistics for the GDB slave processes used for disassembly. With this information, you can tune prospect's usage of GDB pipes and set the number of concurrent GDBs with the --concurrent-gdbs Number option below. The kmod subsystem debugs the kernel module facilities.
--debug-trace File
Use a debugging trace file for debugging prospect. The file is printed in flushed mode, thus there is a possibility of missing buffers when this is used.
--no-gettimeofday-in-trace
If specified, prospect will not timestamp each record in the --debug-trace debugging trace file by using gettimeofday() calls.
--buf-flush-rate Rate
Set the periodic oprofile buffer flush rate in seconds. Prospect has this set to 2 seconds as default. If you set this to 0 (zero), then the buffer will never be automatically flushed and in order to stop prospect you will need to manually flush the buffer.
--rt-priority Priority
Set the real-time priority that prospect will run at. The possible range is 0-99 and prospect runs at 0 by default (this is the normal timeshare priority).
--inherit-priority
If given, this option makes prospect not fall back to timeshare before exec'ing the child program. Thus if you are running prospect under realtime priority, this flag will make the child run at the same realtime priority. Prospect uses the SCHED_RR (round robin) realtime scheduler if --rt-priority is greater than 0 (real time).
--concurrent-gdbs Number
Set the number of concurrently opened GDB processes for disassembly. By default prospect uses up to a maximum of 8 concurrently open GDBs. Each GDB process may take approximately 3.2 MB of memory. So, if you find yourself with too much memory pressure at the post-processing stage, you can tell prospect to use fewer GDB processes with this switch. If you set this to one, then only one GDB process will be opened and closed for each region. The penalty for having too few GDBs open is slower performance for prospect in the post-processing stage. If you have plenty of memory, go ahead and set this value high for better performance. Use the --debug-system dis switch described above to tune for your situation.
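
As a rough illustration of the memory trade-off described for --concurrent-gdbs, the following shell sketch multiplies a GDB count by the approximate 3.2 MB per process quoted above. The function name gdb_memory_mb is hypothetical, and the per-process figure is only the estimate given in this page, not a measurement.

```shell
# Hypothetical helper: estimate the memory (in MB) used by N concurrent
# GDB processes, assuming the approximate 3.2 MB per GDB quoted above.
gdb_memory_mb() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.1f\n", n * 3.2 }'
}

gdb_memory_mb 8    # the default maximum of 8 GDBs
gdb_memory_mb 2    # a reduced setting for a memory-constrained system
```

With the default maximum of 8, the estimate comes to about 25.6 MB, which gives a baseline to compare against when choosing a smaller or larger value.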

EXAMPLES

Note that while long options are used in the examples below for clarity, these long options do have short counterparts for brevity.

1.
The following example runs prospect and saves the results to the results_file file. Only data from the immediate child, i.e. command, is reported. Output from command goes to standard output (if it has any and it uses stdout).

$ prospect -o results_file command

2.
This example reports data on the command and all its children. The shell allows several commands to be executed while prospect is collecting data. It also gives further control on the length of time that prospect collects data. In this example, prospect will collect data until the shell is exited.

$ prospect --follow-forks -o results_file bash

$ command1

$ command2

...

$ exit

3.
This example reports data on all active processes on the system while the command is running. In this example, the command used is sleep so that prospect collects data for a fixed period of time (sixty seconds in this case). This method works well for measuring system activity that does not lend itself to a direct command.

$ prospect --system-wide -o results_file sleep 60

4.
This example generates a kernel profile by sampling for 60 seconds and saving the results to the file kern_profile.

$ prospect --kernel-only -o kern_profile sleep 60
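
The save-once, replay-many workflow described for --binary-trace-save and --binary-trace-read under Program Options might be wrapped as in the sketch below. The function names save_trace and replay_trace and the file names are hypothetical. Whether prospect still expects a command argument when reading a saved trace is not stated in this page, so treat the replay invocation as an assumption and append a command if your version asks for one.

```shell
# PROSPECT may be overridden, e.g. to point at a specific build.
PROSPECT=${PROSPECT:-prospect}

# save_trace TRACE COMMAND [ARGS...]
# Run COMMAND once and save the raw samples instead of producing a profile.
save_trace() {
    trace=$1
    shift
    "$PROSPECT" --binary-trace-save "$trace" "$@"
}

# replay_trace TRACE OUTPUT [OPTIONS...]
# Post-process a saved trace into OUTPUT, passing any extra prospect options.
replay_trace() {
    trace=$1
    out=$2
    shift 2
    "$PROSPECT" --binary-trace-read "$trace" --output "$out" "$@"
}

# Example session (not executed here):
#   save_trace run.trace sleep 60
#   replay_trace run.trace by_user.txt
#   replay_trace run.trace by_system.txt --system-sort
```

Because both replays read the same saved trace, the two output files differ only in the post-processing options applied.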

NOTES

Possible Side Effects

Running prospect does have the effect of one more process running on the system as well as data collection in the kernel and transportation to user space at periodic intervals. This is usually nominal at about 5% of system utilization. Sometimes this can climb to higher numbers, especially when the system is at 100% already.

Process Profiles

For process profiles, a program does not have to be compiled with any particular options, and can be running before or after prospect is started. However, if the program has been stripped, then just the PC values are given instead of the routine names in the profile. The kernel is generally not stripped and the symbol list is obtained from the System.map file, so those routine names should appear in the profiles.
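
To check in advance whether kernel routine names are likely to resolve, the default System.map search order documented for the --system-map option can be mirrored in a small shell function. This is a sketch only: find_system_map is a hypothetical name, the optional second argument (a root prefix) exists purely to make the function easy to try out, and prospect's own lookup may differ in detail.

```shell
# Hypothetical helper: look for System.map in the two default locations
# listed under --system-map, in the documented order.
# Usage: find_system_map [KERNEL_RELEASE] [ROOT_PREFIX]
find_system_map() {
    release=${1:-$(uname -r)}
    root=${2:-}
    for candidate in \
        "$root/lib/modules/$release/build/System.map" \
        "$root/boot/System.map-$release"
    do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example (not executed here):
#   find_system_map || echo "no System.map found; consider --system-map"
```

If the function finds nothing, passing an explicit path with --system-map is the documented fallback.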

Stripped Shared Library Profiles

Stripped shared libraries will show funny looking symbols in the form of symbol1->symbol2 in the output profiles. This is because prospect uses the dynamic symbol table if the shared library has been stripped. A property of the dynamic symbol table is that static symbols have been eliminated since no one would/could link to them.

Thus, prospect attempts to give the user a clue as to a possible place where the symbol hits are by bounding the hit from the closest global symbol to the next in the hopes that if the hits are in static symbols, they probably belong to one of the global symbols listed.

Sampling Frequency and Aliasing

Prospect samples at a default sampling frequency of 200 Hz. This is adequate for most longer running processes. Depending on how lock-step a process is with the system and thus sampling clock, you may experience aliasing. When this happens, function symbols may have biased sample counts. If you suspect aliasing is happening, and in fact even if you don't, you can alleviate this effect by re-running your measurement with different sampling frequencies with the -H or --sampling-frequency option and compare the results.
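
One way to carry out the comparison suggested above is to repeat the same measurement at several frequencies and write each result to its own file. The sketch below assumes a hypothetical wrapper named sweep_frequencies; the frequencies in the example are arbitrary illustrations, and this page does not state which values prospect accepts.

```shell
# PROSPECT may be overridden, e.g. to point at a specific build.
PROSPECT=${PROSPECT:-prospect}

# sweep_frequencies PREFIX "HZ_LIST" COMMAND [ARGS...]
# Run COMMAND once per frequency in HZ_LIST, writing PREFIX.<hz> each time.
sweep_frequencies() {
    prefix=$1
    rates=$2
    shift 2
    for hz in $rates; do
        "$PROSPECT" --sampling-frequency "$hz" --output "$prefix.$hz" "$@"
    done
}

# Example (not executed here):
#   sweep_frequencies results "100 200 400" my_benchmark
#   diff results.100 results.400 | less
```

Symbols whose share of hits changes markedly between runs are candidates for aliasing; symbols that stay stable are more likely to reflect real behavior.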

Multithreaded Processes

On Linux, threads are implemented with the clone() system call and show up as regular processes from prospect's point of view. The easiest way to look at a multithreaded application is to use the --follow-forks option for displaying the multithreaded command and all its children. For example:

$ prospect --follow-forks -o mt_out my_multithreaded_app

This will output all the threads generated by my_multithreaded_app in the file mt_out sorted by user time consumed. Note that all threads will look like processes going by the same name. See also the -m / --merge-profiles discussion below.

Creating Merged Profiles

The -m or --merge-profiles option creates merged profiles of identical binary objects. This means that if there are multiple copies of a program running and/or multiple short-lived processes in the run, then profiles for processes that belong to the same binary object (the text file of the program) are merged together.

The feature has two advantages. First, if you are trying to get an idea of the performance of a short-lived process and increasing the sampling frequency with the -H option still does not give you enough samples, then the merged profiles feature will, so long as you run many copies or multiple runs of that short-lived process and prospect observes all of them (hint: use either --follow-forks on a shell or --system-wide mode for this).

Second, for multi-threaded processes, this option will merge all profiles of all threads in a process container into one profile for that container. Make sure you run in the --follow-forks mode for this to work. This effectively gives you a profile of what all threads are doing in one profile. Some people like to see this alternate view of a multithreaded process. Whether this is useful to you will depend on the design of the multithreaded process you are monitoring.
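
Combining the two options discussed in this subsection might look like the following sketch. The wrapper name merged_profile and the file and program names in the example are hypothetical; only the option names come from this page.

```shell
# PROSPECT may be overridden, e.g. to point at a specific build.
PROSPECT=${PROSPECT:-prospect}

# merged_profile OUTPUT COMMAND [ARGS...]
# Follow all descendants of COMMAND and merge profiles per binary object.
merged_profile() {
    out=$1
    shift
    "$PROSPECT" --follow-forks --merge-profiles --output "$out" "$@"
}

# Examples (not executed here):
#   merged_profile mt_merged my_multithreaded_app   # one profile for all threads
#   merged_profile runs_merged bash                 # repeat a short command in the shell, then exit
```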

Difficult to Measure Commands

Normally it is best for prospect to execute the program that you want to measure. However, this may be difficult when another program execs your program. An example is the X server. In this situation, you should use the --system-wide option and run a command that will exist for the lifetime of your program.

For example, if you want to analyze a new X server invocation, you could run the following prospect command (this assumes you are using bash):

$ prospect --system-wide -o results_file read

and then restart the X server. You will have to execute the above command in something other than an xterm; a virtual text console is a good choice. When you have finished the X server activity that you are interested in, type <return> to complete the read command.

Moving Around in the Output File

The prospect output file can get rather large but can be navigated using the following hints.

Search for:

==--
to find processes.
USER
to find user profiles.
KERNEL
to find kernel profiles.
Skip
to find things skipped in output.

Common start-up edit commands (vi commands are in quotes):

vi file, "/USER", "n"
Find first profile for most user time during run.
vi file, " /==--"
Find the start of the first process with most user time.
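
Outside an editor, the same search strings can be used with grep to get a quick index of a large output file. The function name list_markers is hypothetical; the patterns are the ones listed above, matched as plain substrings, so a few unrelated lines containing these words may also appear.

```shell
# Hypothetical helper: print every line (with its line number) that
# contains one of the navigation markers named in this subsection.
list_markers() {
    grep -n -e '==--' -e 'USER' -e 'KERNEL' -e 'Skip' -- "$1"
}

# Example (not executed here):
#   list_markers results_file | less
```

The line numbers in the listing can then be used to jump directly to a process or profile in an editor or pager.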

Kernel Profiling

When you use prospect to generate profiles of user processes, you get kernel profiles of kernel code executed on behalf of each user process as part of the output. Also, hits to kernel code executed in interrupt context and stolen from idle can show up as charged to arbitrary processes. So long as this is kept in mind, this presents an accurate picture of process performance; however, if you are interested in kernel performance, then this is not enough.

With the Kernel Profiling feature, all kernel hits including those in interrupt context, those stolen from idle, and hits due to kernel threads, are summed into one kernel profile. This comes with the same advantages as prospect always has - no special build requirements and extremely low intrusion.

The output contains four profiles:

1. A global kernel profile of all system hits.

2. A kernel profile of hits due to user processes.

3. A kernel profile of hits due to kernel threads.

4. A kernel profile of hits to process zero.

Note that the first profile is the sum of the last three.

The Kernel Profiling feature is activated by the --kernel-only or -k command line option. The time interval to sample is specified as before, by the run time of the command. Thus, to sample the kernel for 120 seconds and output the results to a file called "output", do:

$ prospect --kernel-only -o output sleep 120

The kernel can be profiled with this feature under actual load conditions and without any build requirements; i.e., you can profile a performance-built kernel just as easily as a debug build, and the kernel runs at full speed.

And remember, if the output contains too many unimportant symbols (those with too few hits), adjust it with the --min-time seconds option.

COPYING

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

DEPENDENCIES

Prospect depends on the open source software called oprofile. The appropriate module is included in the prospect distribution file and has a README file for building. Oprofile was developed by John Levon at the Victoria University of Manchester, UK. The active site for the oprofile software is http://oprofile.sourceforge.net. Note that prospect depends on version 0.5.4 of oprofile. Later versions may work but are not guaranteed. Versions earlier than 0.5 do not work. If you are using an earlier version of oprofile and upgrading oprofile is a problem for you, please use an earlier version of prospect, specifically version 0.9.6a, for best compatibility. 0.9.6a is also the version of prospect you should use for Red Hat 8.0 systems.

AUTHORS

Prospect for Linux was created in 2001 by Alex Tsariounov (HP:LOSL), with help from Bob Montgomery (HP:LOSL). Prospect includes code developed by Keith Fish (HP:TCL) and Doug Baskins (HP:Retired). Prospect uses the GPL Oprofile sampling module developed by John Levon (The Victoria University of Manchester, UK) and others, see http://oprofile.sf.net.

Prospect is currently sponsored by HP:LOSL, Fort Collins Colorado.

BUGS

Use the bug tracker feature of the SourceForge hosting service for prospect at:

http://sourceforge.net/tracker/?group_id=53151&atid=469285

The prospect web site is at http://prospect.sf.net

Please make use of the mailing list facilities of SourceForge for the prospect project to ask questions and discuss this tool.

COPYRIGHT

Copyright (C) 2001-2004 Hewlett-Packard Company.

SEE ALSO

oprofile(1), opcontrol(1), gprof(1), readprofile(1), pfmon(1), profil(3), top(1), gtop(1), gkrellm(1), sar(1)

README - Included prospect readme file.

README.INSTALL - Included information about installation of prospect and oprofile.

README.ia64 - Included information about installation of prospect and oprofile for ia64 systems.

NEWS - Included prospect release notes.

http://prospect.sourceforge.net - prospect web site.

http://oprofile.sourceforge.net - oprofile web site.