You ever have a bug that you just can’t put your finger on. Well this incomplete post is a tale of such a bug. I never had a chance to finish the post or the bug hunt. The post stares at me every time I log into my blog, haunting me. So, I had to post it to release it from its torment.
Early one evening I get the “White Screen of Death” in an ASP.NET application. Basically, it is an empty web page, just empty HTML tag when it should be a full blown data driven web page. You scared yet? This isn’t the first time I have seen this in this application. I am pretty sure it is related to other errors as I remember seeing errors in the application log when ever I see the screen in certain scenarios. The problem is, on this evening it happened in a scenarios without logged errors, but that doesn’t mean there are no errors…right?
Next to see if the error may have been captured elsewhere I set off to check the server logs. I check the server event logs for issues. Nothing jumps out at me. Then I want to check the IIS logs for the page request so I need to turn on failed request logging on the server. I haven’t done this in a while so I Binged it and got a good hit on this post, http://www.iis.net/configreference/system.applicationhost/sites/sitedefaults/tracefailedrequestslogging. Now I have it installed and configured, I just don’t know how to inspect the resulting traces. Another Bing and I found the answer at http://www.trainsignal.com/blog/iis-7-troubleshooting. OK, tracing is working and I can view the trace files, now I can’t reproduce the error, oh the horror… figures 😦
Finally, I found a failing scenario and I get nothing of any value in the trace. So I run the scenario again, I check the server event logs again, you guessed it, nothing jumps out at me. So far nothing, I don’t see anything obvious, well nothing I’m noticing as I could just be burnt out and missing the obvious (It happens).
I do a little more Binging and the pickings are slim. I get hits on white page issues involving SSLAlwaysNegoClientCert. Seems some people were having an issue with a 413 – Request Entity Too Large error causing the white page (https://communities.bmc.com/docs/DOC-6259), but there is no way this could be my issue because I’m not uploading anything…right? There is no way ViewState is incredibly large…nah…I’ll check anyway. Better safe than sorry and maybe something will finally jump out at me.
So, I need to:
- Check ViewState, mainly the size
- Check Network Traffic, view what is being sent and recieved
- See if I can repro the scenario locally so I can step through it in a debugger
And this ends our post. Sorry for the abrupt cliff hanger with no solution. I never finished the exploration on this as it was a very low priority edge case bug. It does provide some links on IIS tracing and a little insight into my thought process at the time on trying to discover the source of the bug. One thing I have always admired about many of the smart developers I work with is there thought processes and tool sets used in investigating issues. I have always felt that I came up short in my ability to quickly discover the root cause of issues so I have great respect for the software Sherlock Holmes of the world.
Well hindsight is 20/20 and the biggest mistake that I see is I didn’t automate the scenario. If I would have captured the scenario in a test, I could open it right now and continue where I left off, but now I have no idea where to even start to find the scenario that triggered this issue. So, the issue may still be lurking deep in the bowels of the application. This is the real horror in this post. Oh well, I have a lesson learned. Automate my bug hunt scenarios.
Actually, I wrote this some time ago and remember spending about a hour or so running through this drill in vain. So, in an effort not waste something that may be of value later I decided to just post it to stop it from haunting my post list.