Mikael Eliasson - Elmah on steroids

Error logging is an important tool in knowing the health and quality of you website after you deploy. I will look at how to enhance ELMAH in this post.

Background

The first time I implemented error logging in one of my applications I built it myself from scratch. It worked pretty fine most of the time but all errors where not caught. When I heard about ELMAH I started to think that maybe my not very modular and only half working implementation was a waste of time. When Hanselman made a post about ELMAH I decided to try it out. I found that it was very easy to get working but I found the interface useless. Getting an overview while wading through a textual log is not the easiest thing and even in my custom implementation I had graphs for this. So I decided to add a better interface to my log.

This post will cover how to create the extended interface. If you are interested in how to set up ELMAH I suggest you read the post by Hanselman.

What you need

On the serverside I will use SQL server as storage for the log and Linq-To-Sql to retrive the data. On the client side I will use some jQuery and Flot. You will also need the source code for Elmah.

Extending the logging

I wanted to be able to group errors by date, type and url and to show the information using charts. It turn out that ELMAH natively supports the first two but the url is embedded in the error XML which is stored in a text field. Looking through the text fields for that url seemed like a stupid thing to do so I downloaded the code and extended it to store the url in a varchar field. First we start with the changes to the SQL. Open the SQL file called SQLServer.sql distributed with the code and make the following additions

The url field would probably be a good idea to make larger as urls can be quite long if you have filtering.

The next thing we need to do is to change the code slightly. I should tell you that these changes will only make it work with SQL. If you want to log as XML or something similar you will have to find out how to do that yourself.

First we make a change in the constructor of Error.cs

Then we make a change in the Log method of Error.cs

And the last thing to change is in the private class Commands inside ErrorLog.cs

With those changes in place ELMAH will now log the URL in a separate field. So lets move on to creating the user interface.

Creating the services and repositories

First I will show you the classes I use in my project. I try to avoid using linq-to-sql classes or EF classes in my application because these are too tied to the data access. Instead I create my POCO classes and project my quesries from the Linq.to-sql classes to my domain classes inside the repository. Here is the two classes I use for the error logging.

public class ErrorLogItem
{
    public System.Guid ErrorId { get; set; }
    public string Type { get; set; }
    public string Source { get; set; }
    public string Message { get; set; }
    public string User { get; set; }
    public int StatusCode { get; set; }
    public string Url { get; set; }
    public int Sequence { get; set; }
    public string AllXml { get; set; }

    public System.DateTime TimeUtc
    {
        get;
        set;
    }

    public string TimeString
    {
        get { return TimeUtc.ToString("yyyy-MM-dd hh:mm:ss"); }
    }

}

public class ErrorGrouping
{
    public string Key { get; set; }
    public int Count { get; set; }

}

The repository is a bit more interesting but not much. As I'm reading and deleting deleting errors it is not much to it.

public class ErrorLogRepository : Gmok.IErrorLogRepository
{
    public ErrorLogRepository(Gmok.ConnectionStringReader reader)
    {
        db = new DataClassesDataContext(reader.ConnectionString);
    }
    DataClassesDataContext db; 

    public void BatchDelete(System.Linq.Expressions.Expression<Func<Gmok.ErrorLogItem, bool>> expression)
    {
        var ids = GetErrorQuery().Where(expression).Select(e => e.ErrorId).ToArray();
        var itemsToDelete = this.db.ELMAH_Errors.Where(e => ids.Contains(e.ErrorId));

        this.db.ELMAH_Errors.DeleteAllOnSubmit(itemsToDelete);
    }

    public void BatchDelete(Guid[] ids)
    {
        var itemsToDelete = this.db.ELMAH_Errors.Where(e => ids.Contains(e.ErrorId));

        this.db.ELMAH_Errors.DeleteAllOnSubmit(itemsToDelete);
    }

    public void SaveChanges()
    {
        this.db.SubmitChanges();
    }

    public IQueryable<Gmok.ErrorLogItem> GetErrorQuery()
    {
        IQueryable<Gmok.ErrorLogItem> query = from e in this.db.ELMAH_Errors
                                                        select new Gmok.ErrorLogItem
                                                        {
                                                            ErrorId = e.ErrorId,
                                                            Message = e.Message,
                                                            Sequence = e.Sequence,
                                                            Source = e.Source,
                                                            TimeUtc = e.TimeUtc,
                                                            StatusCode = e.StatusCode,
                                                            Type = e.Type,
                                                            Url = e.Url,
                                                            AllXml = e.AllXml,
                                                            User = e.User
                                                        };

        return query;
    }
}

There are two functions to delete a batch of errors here. To keep the size of the error log in check it is a good idea to make it easy to delete errors that are of no interest anymore. I have not really taken that part as far as I would like. For example I would like to be able to delete all errors that occured before a certain date. It should be easy to fix I just have not got around to it. To decouple my controlers from the data access I have a service layer too. I let the service layer do most of the filtering, paging etc etc so that I could change my data access without having to rewrite all that logic. This is the service

public class ErrorLogService : IErrorLogService
{
    private IErrorLogRepository repo;
    public ErrorLogService(IErrorLogRepository repo)
    {
        this.repo = repo;
    }
    public PagedList<ErrorLogItem> GetErrors(int pageIndex, int pageSize)
    {
        var query = this.repo.GetErrorQuery().Select(noXmlSelector).OrderByDescending(e => e.Sequence);
        return new PagedList<ErrorLogItem>(query, pageIndex, pageSize);
    }

    public PagedList<ErrorLogItem> GetErrors(int pageIndex, int pageSize, System.Linq.Expressions.Expression<Func<ErrorLogItem, bool>> expression)
    {
        var query = this.repo.GetErrorQuery().Where(expression).Select(noXmlSelector).OrderByDescending(e => e.Sequence);
        return new PagedList<ErrorLogItem>(query, pageIndex, pageSize);
    }

    public IList<ErrorLogItem> GetErrors(DateTime from, DateTime to)
    {
        throw new NotImplementedException();
    }

    public PagedList<ErrorGrouping> GetErrorGroupingByDate(int pageIndex, int pageSize, DateTime dateFrom, DateTime dateTo)
    {
        var query = from e in this.repo.GetErrorQuery()
                    where e.TimeUtc.Date >= dateFrom && e.TimeUtc.Date <= dateTo 
                    group e by e.TimeUtc.Date into g
                    orderby g.Count() descending
                    select new ErrorGrouping
                    {
                        Count = g.Count(),
                        Key = g.Key.ToString()
                    };
        return new PagedList<ErrorGrouping>(query, pageIndex, pageSize);
    }

    public PagedList<ErrorGrouping> GetErrorGroupingByType(int pageIndex, int pageSize, DateTime dateFrom, DateTime dateTo)
    {
        var query = from e in this.repo.GetErrorQuery()
                    where e.TimeUtc.Date >= dateFrom && e.TimeUtc.Date <= dateTo 
                    group e by e.Type into g
                    orderby g.Count() descending
                    select new ErrorGrouping
                    {
                        Count = g.Count(),
                        Key = g.Key
                    };
        return new PagedList<ErrorGrouping>(query, pageIndex, pageSize);
    }

    public PagedList<ErrorGrouping> GetErrorGroupingByUrl(int pageIndex, int pageSize, DateTime dateFrom, DateTime dateTo)
    {
        var query = from e in this.repo.GetErrorQuery()
                    where e.TimeUtc.Date >= dateFrom && e.TimeUtc.Date <= dateTo 
                    group e by e.Url into g
                    orderby g.Count() descending
                    select new ErrorGrouping
                    {
                        Count = g.Count(),
                        Key = g.Key
                    };
        return new PagedList<ErrorGrouping>(query, pageIndex, pageSize);
    }

    public void DeleteErrors(System.Linq.Expressions.Expression<Func<ErrorLogItem, bool>> expression)
    {
        this.repo.BatchDelete(expression);
        this.repo.SaveChanges();
    }

    public void DeleteErrors(Guid[] ids)
    {
        this.repo.BatchDelete(ids);
        this.repo.SaveChanges();
    }

    public ErrorLogItem GetItem(Guid id)
    {
        return this.repo.GetErrorQuery().FirstOrDefault(e => e.ErrorId == id);
    }

    private Expression<Func<ErrorLogItem, ErrorLogItem>> noXmlSelector = e => new ErrorLogItem
    {
        ErrorId = e.ErrorId,
        Message = e.Message,
        Sequence = e.Sequence,
        Source = e.Source,
        TimeUtc = e.TimeUtc,
        Url = e.Url,
        StatusCode = e.StatusCode,
        Type = e.Type,
        User = e.User
    };

}

The service mainly contains methods to filter the error log and return a paged list of errors and error groupings. The groupings are what I will use for the graph and then when I click a bar in the chart I will load the list of errors that match that bar. The noXmlSelector is an expressions that selects everything but the XML from the error. The reason is that the XML might be very large and it is not nessescary to tranfer that over the wire as I will not display that anyway.

The next step is to build the controller. The controller is a bit more verbose so I will not post the whole thing. The interesting thing is the index Action.

// GET: /ErrorLog/
[GmokAuthorize(Roles = SystemRoles.Admin)]
public virtual ActionResult Index(DateTime? fromDate, DateTime? toDate, string groupby)
{
	fromDate = fromDate ?? DateTime.Now.AddDays(-7).Date;
	toDate = toDate ?? DateTime.Today;
	IndexViewModel vm = new IndexViewModel();
	vm.FromDate = fromDate.Value;
	vm.ToDate = toDate.Value;

	if (groupby == "date")
	{
		vm.ErrorGroupings = this.logService.GetErrorGroupingByDate(0, 10, fromDate.Value, toDate.Value);
	}
	else if (groupby == "type")
	{
		vm.ErrorGroupings = this.logService.GetErrorGroupingByType(0, 10, fromDate.Value, toDate.Value);
	}
	else
	{
		vm.ErrorGroupings = this.logService.GetErrorGroupingByUrl(0, 10, fromDate.Value, toDate.Value);
		groupby = "url";
	}

	if(Request.IsAjaxRequest()){
		return Json(vm.ErrorGroupings, JsonRequestBehavior.AllowGet);
	}

	vm.LastErrors = this.logService.GetErrors(0, 20);
	vm.GroupOptions = new SelectList(new string[] { "date", "type", "url" }, groupby);
	return View(vm);
}

That Action load the error groupings that match the selected filter and also group them by the selected grouping. As mentioned I can group by url, date, and error type and I can filter by date. There are a few other Actions that looks like this

[GmokAuthorize(Roles = SystemRoles.Admin)]
public virtual ActionResult GetErrorsByType(string type, DateTime? fromDate, DateTime? toDate)
{
    fromDate = fromDate ?? DateTime.Now.AddDays(-7);
    toDate = toDate ?? DateTime.Now;

    IList<ErrorLogItem> items = this.logService.GetErrors(0, 50, e => e.Type == type && e.TimeUtc.Date >= fromDate && e.TimeUtc.Date <= toDate);
    return Json(items, JsonRequestBehavior.AllowGet);

}

Those methods are used to fetch a list of errors using AJAX when I click on one of the bars in the chart.

Building the UI

First I will show you what the UI is going to look like. This is what I want when I open the page

And if I click on one of the bars I get

As youy can see in the upper image there are no labels which I would like to have but Flot only works with certain types of labels. Instead I have made it so that a tooltip with the label is shown when I hover over one of the bars. The html needed to display the chart is

<div>
    <div>
        <% using (Html.BeginForm(MVC.ErrorLog.Index(), FormMethod.Get))
           { %>
        <label for="groupby">
            Group by:
        </label>
        <%= Html.DropDownList("groupby", Model.GroupOptions) %>
        <label for="from">
            From:
        </label>
        <%= Html.TextBox("FromDate", Model.FromDate.ToString("yyyy-MM-dd"))%>
        <label for="to">
            To:
        </label>
        <%= Html.TextBox("ToDate", Model.ToDate.ToString("yyyy-MM-dd")) %>
        <input type="submit" value="Load" />
        <% } %>
    </div>
    <script type="text/javascript">
        var groups = [];
        var translatedTicks = [];

        <% int count = 0;
        foreach (var item in Model.ErrorGroupings) { %>
            groups.push([<%= "" + (count) %>, <%= item.Count %>]);
            translatedTicks.push([<%= "" + (count++) %>, "<%= item.Key %>"]);
        <% } %>
    </script>
    <div id="placeholder" style="height: 300px">
    </div>
    <div id="errorListDiv">
    </div>
    <p>
        <input id="DeleteErrorsButton" type="button" value="Delete errors" />
    </p>
</div>

There is ofcourse some javascript needed to make the graph work but I will come to that later. The groups and translatedTicks might seem odd but the groups is a matrix used to create the graph and the translatedTicks is the same matrix but with the Key which is a string instead of the count. The last one is used to show the tooltip.

The javascript needed to get this work is fairly simple

var plot;
var lastPoint;
///Init the graph, handles drawing and binding hover and click event
function initGraph() {
    plot = $.plot($("#placeholder"), [{ data: groups, label: "Count", color: "#cc9900"}],
    {
        bars: { show: true, align: "center" },
        grid: { hoverable: true, clickable: true },
        xaxis: { ticks: [], autoscaleMargin: 0.02 }
    });

    //Bind the hover event
    $("#placeholder").bind("plothover", function (event, pos, item) {

        if (item) {

            $("#tooltip").remove();
            var label = translatedTicks[item.datapoint[0]][1];
            var value = item.datapoint[1];

            showTooltip(item.pageX, item.pageY,
                    label + "<br /> " + value + " errors");
        }
        else {
            $("#tooltip").remove();
        }
    });

    ///Bind the click event
    $("#placeholder").bind("plotclick", function (event, pos, item) {
        if (item) {
            if (item != lastPoint) {
                if (lastPoint != null) {
                    plot.unhighlight(lastPoint.series, lastPoint.datapoint);
                }
                lastPoint = item;
                plot.highlight(item.series, item.datapoint);
                loadErrors(translatedTicks[item.datapoint[0]][1]);
            }
        }
    });
}

///Show the tooltip placed above the graph
function showTooltip(x, y, contents) {
    $('<div id="tooltip">' + contents + '</div>').css({
        position: 'absolute',
        display: 'none',
        top: y - 40,
        left: x + 20,
        border: '1px solid #fdd',
        padding: '2px',
        'background-color': '#fee',
        opacity: 0.80
    }).appendTo("body").fadeIn(200);
}

The initGraph is called from the $(document).ready. It starts with setting up the graph and then it first bind the hover function and then the click function. When we hover we wanted a tooltip to be displayed with the correct label. I look at what index the current datapoint had and then I look up the key at that index and displays that value.

When a bar is clicked I want to load the errors that belongs to that bar. and I also want to make sure that the correct bar is higlighted. The loading of the errors is made by two functions

//Load the errors for the key.
function loadErrors(key) {
    if ($("#groupby").val() == 'date') {
        $.getJSON("<%= Url.Action(MVC.ErrorLog.GetErrorsByDate()) %>/", { date: key, fromDate: $("#FromDate").val(), toDate: $("#ToDate").val() }, populateTable);
    } else if ($("#groupby").val() == 'type') {
        $.getJSON("<%= Url.Action(MVC.ErrorLog.GetErrorsByType()) %>/", { type: key, fromDate: $("#FromDate").val(), toDate: $("#ToDate").val() }, populateTable);
    } else {
        $.getJSON("<%= Url.Action(MVC.ErrorLog.GetErrorsByUrl()) %>/", { url: key, fromDate: $("#FromDate").val(), toDate: $("#ToDate").val() }, populateTable);
    }
}

//Populate the table with the data. It is using the ErrorGroup template found above
function populateTable(data, status) {
    var json = { Errors: data };
    var html = parseTemplate($("#ErrorGroup").html(), json);
    $("#errorListDiv").html(html);

    //Add a toggle all function to the toggle link
    $("#errorListDiv #errorGroupCheckToggler").click(function (event) {
        event.preventDefault();
        $(this).toggleClass("checked");
        if ($(this).hasClass("checked")) {
            $("#errorListDiv input:checkbox").attr("checked", true);
        } else {
            $("#errorListDiv input:checkbox").attr("checked", false);
        }

    });

}

In the first function the call that matches the current state is made. As callback paramemeter the second function is passed in. In the second function another function called parseTemplate is called. This is a template engine first developed by John Resig and then modified by Rick Strahl. What it takes the data and a template and then inserts the values. I like this approach because it looks very much like ASP.NET MVC view engine. The template is defined by me and looks like this

<script id="ErrorGroup" type="text/html">
      <table class="smallTable">
               <thead>
                   <tr>
                       <th>

                       </th>
                       <th>
                           <a id="errorGroupCheckToggler" href="#">Toggle</a>
                       </th>
                       <th>
                           StatusCode
                       </th>
                       <th>
                           Url
                       </th>
                       <th>
                           Date
                       </th>
                       <th>
                           User
                       </th>
                       <th>
                           Type
                       </th>
                   </tr>
               </thead>
               <tbody>
                   <# for(var i = 0;i < Errors.length; i++){
                       var item = Errors[i];
                       #>
                   <tr>
                       <td>
                            <a class="detailsLink" href="/elmah.axd/detail?id=<#= item.ErrorId #>" target="_blank">view</a>
                       </td>
                       <td>
                           <input class="deleteErrorCheckBox" type="checkbox" value="<#= item.ErrorId #>" />
                       </td>
                       <td>
                           <#= item.StatusCode #>
                       </td>
                        <td>
                           <#= item.Url #>
                       </td>
                       <td>
                           <#= item.TimeString #>
                       </td>
                       <td>
                           <#= item.User #>
                       </td>
                       <td>
                           <#= item.Type #>
                       </td>
                   </tr>
                   <tr>
                       <td class="text" colspan="7">
                           <#= item.Message #>
                       </td>
                   </tr>
                   <# } #>
               </tbody>
           </table>
 </script>

The html that is created is then inserted in the page and the last thing I do is to bind the toggle function to the toggle link defined in the template. The checkboxes is used when I want to delete errors. I'm not going to show how to do that as this is already a long post. It will be left as an exercise for the reader.

Summary

Adding error logging to you project gives you the tools to understand what goes wrong and which "problem areas" that need the most improvement. Especially as a young programmer I found it valuable get the information about where my applications breaks down because I do not have the same experience and understanding of my weak areas. By making it possible to group errors by Url I can also find out where I do have the most noticeable defects. By knowing this I can make sure I fix the defects that causes the most grief to the users first.