Tcl Dev Kit User Guide Tutorial: Hotspot Analysis, or the Importance of Choosing a Good Algorithm

    * Introduction
    * Step One: Open the Tutorial Program
    * Step Two: View Unexecuted Lines
    * Step Three: View Executed Lines
    * Step Four: Use Coverage and Profiling Dialog
    * Step Five: An Improved Algorithm
    * Step Six: Saving Profile Data
          o Example


    Introduction

This tutorial shows how to use the Tcl Dev Kit Debugger with profiling
to analyze the efficiency of your code. When the "Profiling" option is
selected in the Project Settings <Debugger.html#ChangeProCoverage>, the
Tcl Dev Kit Debugger will generate additional data about the time
required to execute individual commands. The sample projects use integer
multiplication as implemented in Tcl. While you would probably never
implement multiplication in a scripting language, this example
demonstrates how the choice of algorithm affects the performance of an
operation.


    Step One: Open the Tutorial Program

    * *UNIX*: Change to the directory
      |/demos/TclDevKit/TclDebugger/Profiling/| beneath the root Tcl Dev
      Kit installation directory, and run:

tcldebugger mult_1.tpj

    * *Windows*: Click *Start | Programs | ActiveState Tcl Dev Kit | Tcl
      Dev Kit Demos | Profiling and Coverage | Hotspot Multiplication I*

The Tcl Dev Kit Debugger will open and load the |mult_1| project. The
file |mult_1_run.tcl| will open in the Code display portion of the
debugger's main window.


    Step Two: View Unexecuted Lines

Run the application in the debugger as usual. You can watch variables,
use the eval console, and perform all other debugging tasks while
coverage information is being gathered in the background.

As the debugger executes lines in the application, lines that have not
been executed will be displayed with a yellow background.

When the script has finished running, select *View | Code Coverage*, or
click the *Coverage* button on the toolbar. The *Code Coverage and
Profiling* dialog will be displayed.


    Step Three: View Executed Lines

Use the radio buttons at the top of the dialog to switch between the
highlighting modes. In Step Two, we discussed highlighting uncovered
code. In this step, we'll look at highlighting covered code.

To view qualitative profiling information, select *Highlight Covered
Code for Selected File*. Now, executed commands are colored in orange
and black. The darker the color, the greater the number of times the
command was executed.

Notice that lines 13 and 14 have a black background. These lines have
been executed the most often. These two lines make up the main loop of
the multiplication command and, together, are the "hotspot" - that is,
the part of the application that must be optimized to improve performance.


    Step Four: Use the Coverage and Profiling Dialog

A complex application might have several hotspots scattered over
multiple files, making this type of visual inspection tedious. To make
this easy, in the Code Coverage and Profiling dialog, select the *Calls*
tab. (The default tab, *Files*, shows a list of files for which
profiling information was collected.)

The *Calls* tab displays a list of all the lines in the program that
have been executed. The list contains the line number, the file in which
the line is located, and the number of calls made to the line. In
addition, the Calls tab displays the minimum, average, maximum, and
total time spent to execute a particular command. Times are shown in
microseconds. By default, the list is sorted by the number of calls, so
that the most frequently executed lines are displayed at the top.

If you were working with a large file, you could click on a line in the
list and then click the *Show Code* button to position the code display
in the debugger to the selected file and line.

In our example, we see that the commands on lines 13 and 14 were called
a whopping 15007 times, and used 212,705 and 229,367 microseconds of
processing time, respectively.


    Step Five: An Improved Algorithm

Think about the algorithms used in the hotspots that were highlighted by
the debugger in the previous step. In the real world, you might think
about moving this part from a Tcl implementation to one based upon C.
However, it is also possible to use a different algorithm to make the
calculation more efficient.

The second tutorial project contains a program that rewrites the
multiplication loop to use a halving/doubling algorithm, instead of
linearly incrementing the result.

Select *File | Close Project* to close the first project, then select
*File | Open Project* to load the |mult_2.tpj| project. The
|mult_2_run.tcl| file will open in the Code display portion of the
debugger's main window.

Run the program to the end, then open the Code Coverage and Profiling
dialog, select *Highlight Covered Code...*, and click the *Calls* tab.

While the multiplication loop still contains commands that are called
most often, the absolute numbers are drastically reduced, a direct
result of choosing a superior algorithm. The highest number of calls on
a line is now 41 instead of 15007, and the total processing time 
has been reduced to 1,465 microseconds. This occurs in the command that
decides between incremental and halving steps. We also see that the
halving step is called more often than the incremental step (about twice
as often in fact). This is not surprising considering that it is this
step that reduces the overall number of iterations through the loop.


    Step Six: Saving Profile Data

You can use the *Save Data* button in the Code Coverage and Profiling
dialog to export coverage information in a standard format for
post-processing with external tools like Excel, statistics applications,
etc. Clicking this button writes the current profiling information in
CSV format to a file.

The generated file contains the following columns:

*NrCallsToCmd*
    Number of calls to the described command
*Filename*
    Name of the file it is in
*Min*
    Minimum amount of time, in microseconds, spent on the executed command
*Avg*
    Average amount of time, in microseconds, spent on the executed command
*Max*
    Maximum amount of time, in microseconds, spent on the executed command
*Total*
    Total amount of time, in microseconds, spent on the executed command
*CmdLine*
    Line in the named file the command is on
*CmdCharIndex*
    Index of the command in the named file in characters
*CmdLength*
    Length of the command in characters


      Example

Look at the first row in the Excel spreadsheet pictured above. The
command on line 13 of "mult_1_run.tcl" was called 15007 times. The first
character of the command is the 129th character in the file (counting
EOL as one character), and the command consists of 10 characters.

You can view profiling information for both of the tutorial projects in
the files "mult_1.csv" and "mult_2.csv", located in the same directory
as the tutorial files.

