Archive for August 2005

Appending to LARGE log files

A colleague mentioned today that the performance of logging to a text file suffers progressively as the file grows. I decided to run some tests to confirm this.

First, I created 3 programs that logged one line at a time to a log file, each time opening the file (for append), writing to it, and then closing the file. The 3 programs compared performance of VB6 file IO, the FileSystemObject from the MS Scripting library, and .NET System.IO. Performance of all 3 was comparable (about 1300 lines per second), and appeared to be bound by CPU rather than the disk.
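As a rough illustration, the open/append/close loop described above could be sketched in Python as follows. This is not the original VB6, FileSystemObject, or .NET test code; the function names `append_line` and `lines_per_second` are made up for the example, and the throughput it reports will depend on the machine and disk.

```python
import time


def append_line(path, line):
    # Open for append, write one line, and close again on every call,
    # mirroring the open/write/close pattern described above.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def lines_per_second(path, count):
    # Time `count` separate open/append/close cycles and return the rate.
    start = time.perf_counter()
    for i in range(count):
        append_line(path, f"log line {i}")
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")
```

Calling `lines_per_second("append_test.log", 1000)` gives a figure that can be compared loosely with the rate quoted above.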

I then created another .NET app to test the effect of the initial file size on the process of appending to the file. The app created a file of a specified size (on an external USB HD), and then attempted to append a single line to that file. Performance suffered badly, just as my colleague had predicted. For a new 5GB file, it took 5 minutes to append a single line. Those 5 minutes were all disk activity, no CPU.
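A minimal sketch of this kind of test, in Python rather than .NET, might look like the following. It rests on one assumption the post does not confirm: that the file is created by extending it to the target size without writing any data. The names `create_file_of_size` and `time_single_append` are hypothetical.

```python
import os
import time


def create_file_of_size(path, size_bytes):
    # Assumption: the file is extended to its final size without writing
    # data. On most systems the new area reads back as zero bytes.
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def time_single_append(path, line):
    # Time one open/append/close cycle, forcing the data out to the disk
    # so that the measurement includes the physical write.
    start = time.perf_counter()
    with open(path, "ab") as f:
        f.write(line.encode("utf-8") + b"\n")
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start
```

Timing the first call to `time_single_append` against a second call on the same file is the comparison of interest. Results will vary with the file system, the operating system, and the drive.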

After that first appended line, performance returned to the normal speed. Closing the app, waiting several minutes, and appending another line performed normally, i.e. well. My inference is that, at some level, the OS has to read the whole file before it can append to it. It then caches whatever information it needs, and can append to the file normally. I don’t know how long it caches the info for.

As a final test, I created a 5GB file in an NTFS compressed directory. The file just contained NULLs, so I’m sure it compressed well. Appending a first line to this file did NOT suffer the performance penalty that an uncompressed file did. I think this is because reading the compressed file does not require much disk activity, since it is very small on the disk.

So, if your app has large log files, you might be well served to place them on a compressed directory, or else to limit their size. Otherwise, your app may be very slow to start up as it tries to write that first “I’m starting” message to the log.
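One way to limit log size is to rotate the file once it reaches a threshold. The sketch below uses `RotatingFileHandler` from Python's standard `logging.handlers` module; the helper name `make_size_limited_logger` and the parameter values are illustrative only. Unlike the test programs above, the handler keeps the file open between writes.

```python
import logging
import logging.handlers


def make_size_limited_logger(path, max_bytes, backups):
    # Build a logger whose file is rolled over before it would reach
    # max_bytes. Older files are kept as path.1, path.2, and so on,
    # up to `backups` copies.
    logger = logging.getLogger(f"size_limited.{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups, encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With something like `make_size_limited_logger("app.log", 10_000_000, 3)`, the active log stays below roughly 10MB, so a startup message never has to be appended to a multi-gigabyte file.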

Testing Responsibilities

There is a common phrase I hear, at least where I work: “Programmers must always test their code”. When fixing bugs, I disagree.

The way I see it, a programmer’s responsibility is to be able to assert with a high degree of probability that the code changes they made:
a) fixed the problem
b) did not break anything else

Manually testing your own code is a great way to do (a), provided you understood the problem correctly to begin with! Testing your own code is pretty bad at catching (b), because if you did not realize your change could break something, that something probably looks unrelated to you, so you would not think to test it.

It comes down to time-management. Given that I have a limited amount of time to do the tasks on my todo list, it is pragmatic to consider what techniques give me the best return on my time-investment. Manual testing does not always make it to the top of my list.

Some techniques that I use to supplement or replace manual testing are:
* code inspection. Taking a look at the code and some of the surrounding code, and thoughtfully considering my changes in their proper context.
* adding comments. Often, when I comment the code to explain why something is done one way or another, I am forced to consider what happens in other scenarios.
* re-reading the bug report. Reviewing the report to ensure that I understood the requirements.
* discussion. Discussing the issue may identify mistaken assumptions, or additional factors.
* NUnit tests. These offer more bang for the buck than manual tests, because they persist beyond my single session, and they also document the requirement.
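
To illustrate the last item: NUnit is a .NET framework, but the same idea can be sketched with Python's built-in `unittest` module. Everything here is hypothetical, including the function `parse_quantity` and the imagined bug report that said input such as "1,200" was rejected. One test pins down the fix, (a), and the others guard behaviour that should not have changed, (b).

```python
import unittest


def parse_quantity(text):
    # Hypothetical function after the fix: thousands separators are
    # removed before converting the text to an integer.
    return int(text.replace(",", ""))


class ParseQuantityTests(unittest.TestCase):
    def test_thousands_separator_is_accepted(self):
        # (a) The behaviour described in the imagined bug report is fixed.
        self.assertEqual(parse_quantity("1,200"), 1200)

    def test_plain_input_still_works(self):
        # (b) Input that worked before the change still works.
        self.assertEqual(parse_quantity("7"), 7)

    def test_non_numeric_input_is_still_rejected(self):
        # (b) Invalid input still raises an error rather than being accepted.
        with self.assertRaises(ValueError):
            parse_quantity("twelve")
```

Saved in a file, the tests can be run with `python -m unittest`, and they remain in place to catch the same problem if it reappears.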