I'm on a mission! I have to search log files that are 150 MB or larger. These are files generated by syslogd. Here is a sample of the output.
2023-01-15 00:00:10 Mail.Debug 111.111.111.111 join[00000]: server.steve.net[1.1.1.1] 1168837209-0ff301650000-3Bdfku 1168837209 1168837210 SCAN – [email protected] [email protected] – 2 39 *abcdef.xyz.com SUBJ:Blah this is replaced
I don't profess to be an expert, but this has got to be easier than I'm making it. I want to share my experiences so far, and I love adventures like this because I learn a lot. What am I after? I want the 'server.steve.net[1.1.1.1]' or just the [1.1.1.1].
Example #1 – For small files this works well
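As a quick illustration of the target, here is a minimal sketch that pulls both forms out of a single line. The shortened sample line and the names $line, $ipPattern and $hostPattern are assumptions made up for illustration; they are not taken from the scripts in this post.

```powershell
# Shortened, hypothetical sample line modeled on the syslog output above.
$line = '2023-01-15 00:00:10 Mail.Debug 111.111.111.111 join[00000]: server.steve.net[1.1.1.1] 1168837209 SCAN'

# An IPv4-looking address inside square brackets; group 1 is the bare address.
$ipPattern = '\[(\d{1,3}(?:\.\d{1,3}){3})\]'
# The same thing, preceded by the host name that is glued to the bracket.
$hostPattern = '\S+\[\d{1,3}(?:\.\d{1,3}){3}\]'

$ipMatch = [regex]::Match($line, $ipPattern)
$hostMatch = [regex]::Match($line, $hostPattern)

$ipMatch.Value            # [1.1.1.1]
$ipMatch.Groups[1].Value  # 1.1.1.1
$hostMatch.Value          # server.steve.net[1.1.1.1]
```

Note that neither the unbracketed 111.111.111.111 nor join[00000] is picked up, because the pattern insists on four dot-separated groups of digits between the brackets.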
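$sb = new-object System.Text.StringBuilder
# Match a bracketed IPv4-style address such as [1.1.1.1]
$re = new-object regex('\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]')
# get-content reads the whole file into memory, so this only suits small files
$m = $re.match((get-content mySysLogfile.txt))
while ($m.Success)
{
    # [void] discards the StringBuilder that AppendLine returns
    [void]$sb.AppendLine($m.value)
    $m = $m.NextMatch()
}
$sb.ToString() > st1.txt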
Example #2 – Works for extracting the data from large files, but it takes a couple of hours to run.
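$sb = new-object System.Text.StringBuilder
$re = new-object regex('\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]')
$m = [System.IO.File]::OpenText("d:tempsyslogcatchall16.txt")
# Compare against $null so that a blank line does not end the loop early
while ($null -ne ($line = $m.ReadLine()))
{
    $match = $re.Match($line)
    # Record only lines that contain a match; [void] discards AppendLine's return value
    if ($match.Success) { [void]$sb.AppendLine($match.Value) }
}
$m.Close()
$sb.ToString() > st1.txt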
Log Parser example
'Example 1
logparser -i:tsv "select top 50 Count(extract_token(field6,1, '[')) as CountOfIt,extract_token(Field6,1,'[') as IPAddress into Steve.csv from '\\ServerName\ShareName\syslogd11.txt' Group By IPAddress order by CountOfIt DESC" -headerRow:off -iSeparator:'spaces'
'Example 2
logparser -i:tsv "select Top 10 Count(field6),Field6 from \\ServerName\ShareName\syslogd11.txt Group By Field6" -headerRow:off -iSeparator:'spaces'
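For comparison, roughly the same count-and-sort that the first Log Parser query performs can be sketched in PowerShell. This is only a sketch under assumptions: it needs PowerShell 3.0 or later with .NET 4 (for [System.IO.File]::ReadLines), the temp file and its three sample lines are made up so the block runs on its own, and no timing against a real 150 MB log is claimed.

```powershell
# Hypothetical sample data so the sketch is self-contained; point $path at a real log instead.
$path = [System.IO.Path]::GetTempFileName()
@(
    'a join[00000]: server.steve.net[1.1.1.1] SCAN'
    'b join[00000]: other.example.net[2.2.2.2] SCAN'
    'c join[00000]: server.steve.net[1.1.1.1] SCAN'
) | Set-Content -Path $path

$re = [regex]'\[(\d{1,3}(?:\.\d{1,3}){3})\]'
$counts = @{}

# ReadLines streams the file one line at a time instead of loading it all.
foreach ($line in [System.IO.File]::ReadLines($path)) {
    $m = $re.Match($line)
    if ($m.Success) {
        $ip = $m.Groups[1].Value
        # A missing key reads as $null, which [int] converts to 0.
        $counts[$ip] = [int]$counts[$ip] + 1
    }
}

# Top 50 addresses by hit count, like "select top 50 ... order by CountOfIt DESC".
$top = $counts.GetEnumerator() |
    Sort-Object -Property Value -Descending |
    Select-Object -First 50 -Property @{n='IPAddress';e={$_.Key}}, @{n='CountOfIt';e={$_.Value}}

$top
Remove-Item $path
```

Only one counter per distinct address is held in memory, so memory use follows the number of unique addresses rather than the size of the file.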
'Findstr
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/findstr.mspx?mfr=true
'QGrep
http://www.ss64.com/nt/qgrep.html
In conclusion, the clear winner was Log Parser: its speed and accuracy were great. PowerShell was 'cool' but took too long. Maybe as I get better at PowerShell, that will change. Findstr and QGrep appear to be geared more toward pulling out entire lines of text. That was my experience; it could be down to my lack of advanced knowledge of these tools. I use FINDSTR a lot for quick searches, and it is faster than FIND. I was hoping to use regular expressions with it, but found PowerShell easier to use for regex. I didn't try a grep utility found on SourceForge, because Log Parser did the trick. If you have other experiences using FINDSTR, QGrep or some other tool, please pass them along. Hope this helps!
3 Comments
http:// said
I noticed that you didn't mention "select-string", which is part of PowerShell. It is described on Ian Griffiths' site, http://www.interact-sw.co.uk/iangblog/2006/06/03/pshfindstr
steve schofield said
Thanks for posting. I didn't know about select-string.
Avanti said
I want to design a log file analyzer. Can anyone suggest some search algorithms so that log files can be searched in less time?