Parsing Complex Log Files

Parsing log files (or any other unstructured data) is a challenging task. Unlike structured formats such as XML or JSON, plain-text log files do not follow any strict rules, and their format may change without warning. It is entirely up to the developer of the application to decide what gets logged and in what format, and the format of log entries may even change between releases of the software. As a system administrator, you may need to negotiate some sort of approval procedure so that, once you automate log parsing, you are not caught by surprise when the format of the file changes. It is also best to engage the developers so that they use the same tools as you do; if they depend on those tools themselves, they are less likely to break them.

In this chapter I'm going to use the catalina.out file generated by the Tomcat application server. In this example the application itself does not write any log messages, so the only log entries you will find there come from the JVM and Tomcat. If you are using a different application container, such as Jetty or JBoss, your log entries may look different. Even with Tomcat, you can override the default behavior and the way messages are formatted, so look at the log files that you are dealing with and adjust the examples in this chapter to match your environment.
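To make the idea of adjusting the parser to your environment more concrete, here is a minimal sketch in Python. It rests on several assumptions that you should check against your own files. It assumes a two-line entry layout in the style of the java.util.logging SimpleFormatter: a header line with a timestamp, class, and method, followed by a "LEVEL: message" line and optional continuation lines such as a stack trace. The sample entries, the com.example class names, and the parse_entries helper are all hypothetical and exist only for illustration. The timestamp parsing also assumes English month names and an English AM/PM marker.

```python
import re
from datetime import datetime

# Assumed header layout: "<Mon D, YYYY H:MM:SS AM|PM> <class> <method>".
# Adjust this pattern if your log files use a different formatter.
HEADER_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) "
    r"(?P<source>\S+) (?P<method>\S+)$"
)
# Assumed message layout: "<LEVEL>: <text>".
LEVEL_RE = re.compile(r"^(?P<level>[A-Z]+): (?P<text>.*)$")

# Hypothetical sample data, not taken from a real server.
SAMPLE = """\
Jan 5, 2009 10:15:32 AM com.example.app.Startup start
INFO: Hypothetical startup message
Jan 5, 2009 10:16:01 AM com.example.app.RequestHandler handle
SEVERE: Hypothetical failure while handling a request
java.lang.NullPointerException
\tat com.example.app.RequestHandler.handle(RequestHandler.java:42)
"""


def parse_entries(lines):
    """Group raw lines into entries; return (entries, unmatched_lines)."""
    entries, unmatched = [], []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        header = HEADER_RE.match(line)
        if header:
            # A header line always starts a new entry.
            current = {
                "timestamp": datetime.strptime(
                    header["ts"], "%b %d, %Y %I:%M:%S %p"
                ),
                "source": header["source"],
                "method": header["method"],
                "level": None,
                "message": [],
            }
            entries.append(current)
            continue
        if current is None:
            # Text before the first header does not fit the assumed format.
            unmatched.append(line)
            continue
        level = LEVEL_RE.match(line)
        if current["level"] is None and level:
            current["level"] = level["level"]
            current["message"].append(level["text"])
        else:
            # Continuation line, for example part of a stack trace.
            current["message"].append(line)
    return entries, unmatched


if __name__ == "__main__":
    parsed, leftovers = parse_entries(SAMPLE.splitlines())
    for entry in parsed:
        print(entry["timestamp"], entry["level"], entry["message"][0])
    print("lines that did not fit the assumed format:", len(leftovers))
```

Collecting the lines that match nothing, rather than silently dropping them, gives you an early hint that the format has changed. If that list starts growing after a software release, the patterns probably need to be revisited.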
