Monday, October 22, 2007

Parsing Enterprise Library Trace Output in Ruby

The Enterprise Library Tracer class is really handy, and I use it all the time for profiling slow pieces of code. The only problem is that its output is illegible. Using the default formatter, this is what one trace looks like:

    1 ----------------------------------------
    2 Timestamp: 10/24/2007 12:42:00 AM
    3 Message: Start Trace: Activity '03952c36-9e3c-4175-bd5a-9f2ea108ae7e' in method 'App.Results.LoadData' at 1981206585161 ticks
    4 Category: GetPhotoQuery
    5 Priority: 5
    6 EventId: 1
    7 Severity: Start
    8 Title:TracerEnter
    9 Machine: GABE-THINKPAD
   10 Application Domain: /LM/W3SVC/1/Root/App-11-128376599444062500
   11 Process Id: 3564
   12 Process Name: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\aspnet_wp.exe
   13 Win32 Thread Id: 4164
   14 Thread Name: 
   15 Extended Properties: 
   16 ----------------------------------------
   17 ----------------------------------------
   18 Timestamp: 10/24/2007 12:42:09 AM
   19 Message: End Trace: Activity 'b7b7c71f-30ae-486c-8206-4d87dd44e2fc' in method 'App.Results.LoadData' at 1981236796826 ticks (elapsed time: 0.07 seconds)
   20 Category: GetPhotoQuery
   21 Priority: 5
   22 EventId: 1
   23 Severity: Stop
   24 Title:TracerExit
   25 Machine: GABE-THINKPAD
   26 Application Domain: /LM/W3SVC/1/Root/App-11-128376599444062500
   27 Process Id: 3564
   28 Process Name: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\aspnet_wp.exe
   29 Win32 Thread Id: 4164
   30 Thread Name: 
   31 Extended Properties: 
   32 ----------------------------------------

If you are using multiple nested traces, it's impossible to interpret. A custom log formatter can help.

    1 <formatters>
    2   ...
    3   <add template="{category}({timestamp}): {message}" 
    4     type="Microsoft.Practices.EnterpriseLibrary.Logging.Formatters.TextFormatter, Microsoft.Practices.EnterpriseLibrary.Logging, Version=3.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"
    5     name="Short Text Formatter" />
    6 </formatters>

Each entry is now compressed onto one line, and I've gotten rid of the information I don't need:

    1 GetPhotoQuery(8/24/2007 12:16:48 AM): Start Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.Results.LoadData' at 2228474435074 ticks
    2 GetRecords, GetPhotoQuery(8/24/2007 12:16:48 AM): Start Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.DataLayer.getPhotoQuery' at 2228474857791 ticks
    3 GetRecords, GetPhotoQuery(8/24/2007 12:16:48 AM): End Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.DataLayer.getPhotoQuery' at 2228475263799 ticks (elapsed time: 0.113 seconds)
    4 CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:48 AM): Start Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.DataLayer.getPhotoQuery' at 2228475271352 ticks
    5 LoadAdditionalImages, CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:48 AM): Start Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.PhotoRecList..ctor' at 2228475285334 ticks
    6 LoadAdditionalImages, CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:48 AM): End Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.PhotoRecList..ctor' at 2228475413732 ticks (elapsed time: 0.035 seconds)
    7 LoadPhotoList, CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:48 AM): Start Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.PhotoRecList..ctor' at 2228475416500 ticks
    8 LoadPhotoList, CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:52 AM): End Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.PhotoRecList..ctor' at 2228491499906 ticks (elapsed time: 4.493 seconds)
    9 SortingResults, CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:52 AM): Start Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.PhotoRecList..ctor' at 2228491505534 ticks
   10 SortingResults, CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:52 AM): End Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.PhotoRecList..ctor' at 2228491518346 ticks (elapsed time: 0.003 seconds)
   11 CreatePhotoRecList, GetPhotoQuery(8/24/2007 12:16:52 AM): End Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.DataLayer.getPhotoQuery' at 2228491520681 ticks (elapsed time: 4.539 seconds)
   12 GetPhotoQuery(8/24/2007 12:16:52 AM): End Trace: Activity '466569a2-9713-4e66-aeca-d7d9f87aaae5' in method 'App.Results.LoadData' at 2228491521707 ticks (elapsed time: 4.773 seconds)

Better, but still hard to interpret. If you squint you can tell that this is a series of nested traces. Execution time is at the end of the line. I used to be in the habit of manually reformatting these files: adding indentation to show nesting and removing the attributes I didn't care about. But this kind of busywork is the problem computers were invented to solve. 37 lines of Ruby later, I had a script which would parse trace output like so:

    1 GetPhotoQuery
    2     GetRecords
    3     GetRecords completed at 8/24/2007 12:16:48 AM (0.113 seconds elapsed)
    4     CreatePhotoRecList
    5         LoadAdditionalImages
    6         LoadAdditionalImages completed at 8/24/2007 12:16:48 AM (0.035 seconds elapsed)
    7         LoadPhotoList
    8         LoadPhotoList completed at 8/24/2007 12:16:52 AM (4.493 seconds elapsed)
    9         SortingResults
   10         SortingResults completed at 8/24/2007 12:16:52 AM (0.003 seconds elapsed)
   11     CreatePhotoRecList completed at 8/24/2007 12:16:52 AM (4.539 seconds elapsed)
   12 GetPhotoQuery completed at 8/24/2007 12:16:52 AM (4.773 seconds elapsed)

Which shows me exactly where the bottleneck is. Some highlights of the script:
  • It chooses the file to parse based on the the script's own filename. Just copy it to the directory your trace file is in and rename it to, eg., mytraces.log.trace_parser.rb to parse the file mytraces.log. This piece of convention over configuration was inspired by Rails.
  • It generates the readable trace and automatically opens it in notepad for you, so getting the parsed output is a one-step process.
The Ruby source looks like this:

    1 #outputs a file with the trace information in a more readable format
    2 
    3 #convention: the log filename is the script filename minus
    4 #   the trailing name and rb extension 
    5 src_filename = __FILE__.chomp ".trace_parser.rb"
    6 dest_filename = "#{src_filename}.readable"
    7 dest = File.open(dest_filename, "w")
    8 
    9 File.open src_filename do |src|
   10   #need to keep track of how deeply nested we are
   11   indent_level = -1
   12 
   13   #print a line with the correct indent
   14   print_line = lambda { |s| dest.puts(" " * 4 * indent_level + s) }
   15 
   16   while line = src.gets do
   17     ##### try to capture the start of a trace #####
   18     _, name = /^(\w+).*\Start\sTrace/.match(line).to_a
   19     if name then
   20       indent_level += 1
   21       print_line.call(name)
   22       next # move along to the next line
   23     end
   24 
   25     ##### try to capture the end of a trace #####
   26     _, name, time, sec = /^(\w+).*\((.*)\).*End\sTrace.*elapsed time: (\d+\.?\d*) seconds/.match(line).to_a 
   27     if name then
   28       print_line.call("#{name} completed at #{time} (#{sec} seconds elapsed)")
   29       indent_level -= 1
   30       next # move along to the next line
   31     end
   32   end
   33 end
   34 
   35 dest.close
   36 
   37 `notepad #{dest_filename}`

No comments: