tsprof, pmprof:  direct measurement subroutine profiler
for Linux on i686+

Features
Black box profile
Flat profile
Subroutine details
Text reports
Order info
Lite
Personal
If you want to know where an existing program spends its time, then use the tsprof package.  No recompilation, no relinking required.  Avoid the extra space, time, and administrative chore of creating and maintaining a separate version for profiling.  Get the detailed, comprehensive, and usable information you need, on subroutines in all modules, even for pthreads and SMP, just by running your existing program.

Black box profile

black box profile histogram
The Black box window shows a bar graph of time spent in a routine and the routines it calls, sorted by decreasing percentage of time, and displayed on a logarithmic scale.  Moving the pointer over the window produces a flying annotation which gives the percentage of total time and the identity of the routine under the pointer.  Cursor arrow keys move the pointer one position at a time.

Flat profile

flat profile area map
The Flat window displays an area map of percentage of time spent within a routine itself, not counting time spent in called routines.  Area on the screen is proportional to time.  The colors distinguish adjacent areas, and have no other meaning.  Like the black box window, pointer motion in the flat window produces a flying annotation, and cursor arrow keys move the pointer.

Subroutine details

detail area map of callers and callees for one subroutine

For any routine, a keystroke or button click and menu pick creates a new window which shows the flow of black box time through that routine.  On the left appear its callers.  On the right appear the routine itself and the subroutines it calls.

Text reports

        Date of execution and report; clock frequency

Monitored execution  Wed Jan 16 08:35:57 2002  /tmp/tsdata-test2t-XXX760
Report generated     Wed Jan 16 08:36:18 2002  by tsprof 1.00  2002-01-14
Clock frequency was  701.599MHz.
Profiling bounds:  15000 subroutines,  25002 arcs,  149 deep,  1024 recursive
 

    Identification of modules, and totals of subroutines and call arcs

 subroutines     call arcs
called  alloc   used  alloc      module
    54    118    223    558    0:/lib/ld-linux.so.2
    51   2382   1610   5592    1:/lib/i686/libc.so.6
    47    234    262    558    2:/opt/tsprof/lib/libpthread.so.0
     7    409    489   1152    3:/lib/i686/libm.so.6
    11     14     17     54    4:./test2t
====== ====== ====== ======
   170   3157   2601   7914
 

    Hierarchical profile

The list is sorted in descending order of total time spent inside a routine, including everything it calls.  The call arcs appear immediately after the caller, and the percentage for an arc is relative to the time for the caller.  Indentation helps to reinforce the relative nature of arc percentages.  `@millisec' is the average unit cost (seconds / count) * 1000.  The count and ticks fields contain only non recursive entries; times for sub trees are additive.  A plus sign `+' immediately after the count indicates that there were recursive entries; see the other sections of the report for more information.
Black box  4.443 seconds ('+' excludes Recursive entries)
   count         ticks     millisec    @millisec   % [id]  module:name
       4    3117005674    4442.7168    1110.6792 100 [1] 0:<top_level>
       4       9118305      12.9965       3.2491     0 [1] 0:<top_level> (self)
       2    2967637315    4229.8198    2114.9099    95 [1]->[2] 2:pthread_start_thread
       1     139273656     198.5089     198.5089     4 [1]->[9] 0:_dl_start
       1        564040       0.8039       0.8039     0 [1]->[15] 1:__libc_start_main
       1        285255       0.4066       0.4066     0 [1]->[22] 2:__pthread_manager
       1        125077       0.1783       0.1783     0 [1]->[31] 0:_dl_init
       1          2026       0.0029       0.0029     0 [1]->[25] 0:fixup
       2    2967637315    4229.8198    2114.9099  95 [2] 2:pthread_start_thread
       2          8932       0.0127       0.0064     0 [2] 2:pthread_start_thread (self)
       2    2967534529    4229.6733    2114.8366   100 [2]->[3] 4:thread_func
       2         88734       0.1265       0.0632     0 [2]->[38] 2:__pthread_do_exit
       2          1798       0.0026       0.0013     0 [2]->[117] 1:modify_ldt
       2          1729       0.0025       0.0012     0 [2]->[98] 1:sigprocmask
       2          1593       0.0023       0.0011     0 [2]->[136] 1:__getpid
   ...
An example from pmprof is
   count         ticks  %t   0:0x00410003  %0   1:0x000100a2  %1  [id]  module:name
       1     134947088 100        4568827 100       46855422 100   [1] 0:<top_level>
       1       9365553     7       278133     6      3416141     7   [1] 0:<top_level> (self)
       1       2244003     2        62306     1       280331     1   [1]->[6] 1:__libc_start_main
       1         28604     0          477     0         7859     0   [1]->[82] 0:_dl_init
       1     123306051    91      4227851    93     43150287    92   [1]->[2] 0:_dl_start
       1          2877     0           60     0          804     0   [1]->[25] 0:fixup
       1     123306051  91        4227851  93       43150287  92   [2] 0:_dl_start
       1          8944     0          393     0         4193     0   [2] 0:_dl_start (self)
       1     123297107   100      4227458   100     43146094   100   [2]->[3] 0:_dl_start_final
   ...
 

    Flat profile

The list is sorted in descending order of time spent within a routine itself, not including time spent in called subroutines.  Recursive entries contribute to both count and ticks.  A count with a following minus sign '-' indicates that there were some recursive entries; see the separate list, which excludes the non recursive calls.
Flat  4.443 seconds ('-' includes Recursive entries)
   count         ticks     millisec    @millisec   % [id]  module:name
 4000000    1661985354    2368.8537       0.0006  53 [5] 3:sqrt
      22     656716585     936.0284      42.5467  21 [4] 4:spin
 4000000     648807712     924.7558       0.0002  21 [8] 1:__isnan
       1     137709425     196.2794     196.2794   4 [12] 0:dl_main
       4       9118305      12.9965       3.2491   0 [1] 0:<top_level>
   ...
An example from pmprof is
   count         ticks  %t   0:0x00410003  %0   1:0x000100a2  %1  [id]  module:name
       1     122195089  91        4165565  91       42710040  91   [5] 0:dl_main
       1       9365553   7         278133   6        3416141   7   [1] 0:<top_level>
      93-       161946   0          48231   1         143839   0   [15] 1:msort_with_tmp
       3        375496   0          31198   1         157956   0   [9] 0:_dl_relocate_object
     311        479618   0          19176   0         240501   1   [13] 0:_dl_lookup_versioned_symbol
   ...
 

Recursive entries
   count         ticks     millisec    @millisec   % [id]  module:name
       2          3707       0.0053       0.0026   0 [27] 1:malloc
       1         15379       0.0219       0.0219    47 [27]->[76] 1:malloc_starter
       1         13067       0.0186       0.0186    40 [27]->[67] 1:chunk_alloc
       1          2816       0.0040       0.0040     9 [27]->[25] 0:fixup
       1           676       0.0010       0.0010     2 [27]->[82] 2:__pthread_mutex_unlock
       1           660       0.0009       0.0009     2 [27]->[158] 2:__pthread_mutex_trylock
       1           431       0.0006       0.0006     1 [27]->[168] 2:libc_internal_tsd_get
 
 

Features

  tsprof,
pmprof
tsprof
Lite
tsprof
Personal
prof
(gcc -p)
gprof
(gcc -pg)
sprof
(glibc)
oprofile
no recompile,
no relink
+ + +     + +
main
program
+ + + + +   +
shared
libraries
+ + +     1 +
dynamic
modules
+ + +     +
pthreads +           .
SMP +           +
nodes + + + + + + .
arcs + + +   + .  
CPU events +           +
direct time
measurement
+ + +        
text
output
+ + + + + + +
interactive
graphics
+ + +        
print
graphics
+ + +   add-on    
follow
fork+exec
+ +         +
#! interp + + +       .
runtime API + + +       .
wallclock mode + + +        

The tsprof package runs on a Linux i686+ kernel modified with the perfctr patches http://www.csd.uu.se/~mikpe/linux/perfctr.  The perfctr patches virtualize the hardware time stamp and performance counters for use by any individual process.  Tsprof reports entry counts, and time based on direct measurement via the virtualized hardware time stamp counter.  In addition to entry and tick counts, pmprof reports counts of upto two virtualized hardware performance monitor counters.  Both work with a wide range of binary executable programs, including shared libraries, dynamic modules, and pthreads including SMP.
 

Order info

tsprof, pmprof

The data gathering, analysis, and display functions in the tsprof, pmprof package are proprietary software which is licensed for use on one named machine (node-locked license).  The delivered package includes a selection of ready-to-run Linux 2.6/2.4 kernels with the perfctr patches already applied.  tsprof, pmprof License USD$595; discounts are available for quantity purchases or educational use.

tsprof Lite

The features of tsprof Lite edition are the same as tsprof, except the Lite edition does not cover pthreads, SMP, or CPU events. tsprof Lite provides counts and times for nodes and arcs of the main program, shared libraries, and dynamic modules; with text, interactive, and print graphics output. tsprof Lite license USD$295; discounts are available for quantity purchases or educational use.

tsprof Personal

The features of tsprof Personal edition are the same as tsprof, except the Personal edition does not cover pthreads, SMP, CPU events, or static executables, and does not follow fork+exec. tsprof Personal provides counts and times for nodes and arcs of the main program, shared libraries, and dynamic modules; with text, interactive, and print graphics output. tsprof Personal license allows use by the purchaser only, on any computer where the purchaser is the primary agent responsible for monitored program invocation, execution, and its consequences. tsprof Personal uses a software key file and generates personalized output. tsprof Personal license USD$79 pre-paid only; limit one per person, natural persons only (entities not eligible), not discountable; electronic delivery only (via .tgz attachment to e-mail and purchaser-initiated ftp of .rpm containing patched kernel.)

Satisfaction guaranteed

Within 30 days of delivery, if you are not satisfied that the tsprof edition you licensed performs substantially according to the product description and documentation, then return it with your reason for a refund of price paid.

E-mail inquiries to sales@BitWagon.com.  Prices, terms, and specifications subject to change without notice.  Orders subject to acceptance.


BitWagon Software LLC
9975 SW Hawthorne Lane
Portland, OR 97225  USA

http://www.BitWagon.com
sales@BitWagon.com

FAX +1 503 297 3754
TEL +1 503 297 3754



Copyright © 2002 BitWagon Software LLC.  All rights reserved. tsprof and pmprof are trademarks of BitWagon Software LLC. Linux is a registered trademark of Linus Torvalds. PostScript is a registered trademark of Adobe Systems Inc. Each trademark and registered trademark belongs to its respective owner.