I'm on a mission! I have to search log files that are 150 MB and larger. The files are generated by syslogd. Here is a sample line of output.
2023-01-15 00:00:10 Mail.Debug 111.111.111.111 join[00000]: server.steve.net[1.1.1.1] 1168837209-0ff301650000-3Bdfku 1168837209 1168837210 SCAN – [email protected] [email protected] – 2 39 *abcdef.xyz.com SUBJ:Blah this is replaced
I don't profess to be an expert, but this has got to be easier than I'm making it. I want to share my experiences so far; I love adventures like this because I learn a lot. What am I after? I want the 'server.steve.net[1.1.1.1]' field, or just the [1.1.1.1] part.
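Example #1 – For small files this works well
$sb = New-Object System.Text.StringBuilder
# Matches a bracketed IPv4 address such as [1.1.1.1]
$re = New-Object regex('\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]')
# Get-Content returns an array of lines; PowerShell joins them into one string for Match()
$m = $re.Match((Get-Content mySysLogfile.txt))
while ($m.Success)
{
    # [void] keeps the StringBuilder that AppendLine returns out of the output
    [void]$sb.AppendLine($m.Value)
    $m = $m.NextMatch()
}
$sb.ToString() > st1.txt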
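Example #2 – Works for extracting the data from large files, but it takes a couple of hours.
$sb = New-Object System.Text.StringBuilder
$re = New-Object regex('\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]')
$reader = [System.IO.File]::OpenText("d:\temp\syslogcatchall16.txt")
# Compare against $null so that an empty line does not end the loop early
while ($null -ne ($line = $reader.ReadLine()))
{
    $match = $re.Match($line)
    if ($match.Success)
    {
        # [void] keeps the StringBuilder that AppendLine returns out of the output
        [void]$sb.AppendLine($match.Value)
    }
}
$reader.Close()
$sb.ToString() > st1.txt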
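A variation on the streaming approach, as a rough sketch rather than a measured improvement: it writes each match straight to the output file instead of collecting everything in a StringBuilder first. The function name Export-BracketedIp and its parameters are made up for illustration, and it assumes PowerShell 2.0 or later (for try/finally).

```powershell
# Hypothetical helper (not from the original post): streams a log file and
# writes each bracketed IPv4 match, e.g. [1.1.1.1], to an output file.
function Export-BracketedIp {
    param(
        [string]$Path,     # input log file; use a full path, .NET ignores the PowerShell location
        [string]$OutPath   # output file, one match per line
    )
    $re = New-Object regex '\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]'
    $reader = $null
    $writer = $null
    $count = 0
    try {
        $reader = New-Object System.IO.StreamReader $Path
        $writer = New-Object System.IO.StreamWriter $OutPath
        # Compare against $null so that an empty line does not end the loop
        while ($null -ne ($line = $reader.ReadLine())) {
            $m = $re.Match($line)
            if ($m.Success) {
                $writer.WriteLine($m.Value)   # first match on the line only
                $count++
            }
        }
    }
    finally {
        if ($writer) { $writer.Close() }
        if ($reader) { $reader.Close() }
    }
    $count   # number of lines that contained a match
}
```

It could be called as Export-BracketedIp -Path 'C:\logs\sample.txt' -OutPath 'C:\logs\st1.txt', where both paths are placeholders. Only the first match per line is written, which mirrors Example #2.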
Log Parser example
'Example 1
logparser -i:tsv "select top 50 Count(extract_token(field6,1, '[')) as CountOfIt,extract_token(Field6,1,'[') as IPAddress into Steve.csv from '\\ServerName\ShareName\syslogd11.txt' Group By IPAddress order by CountOfIt DESC" -headerRow:off -iSeparator:'spaces'
'Example 2
logparser -i:tsv "select Top 10 Count(field6),Field6 from \\ServerName\ShareName\syslogd11.txt Group By Field6" -headerRow:off -iSeparator:'spaces'
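For comparison, the count-and-rank idea behind these queries can be sketched in PowerShell with a hashtable. This is not a translation of the Log Parser query and makes no claim about matching its speed. The function name Get-TopBracketedIp and its parameters are made up for illustration, and it assumes PowerShell 2.0 or later.

```powershell
# Hypothetical helper (not from the original post): counts bracketed IPv4
# addresses in a log file and returns the most frequent ones.
function Get-TopBracketedIp {
    param(
        [string]$Path,    # input log file; use a full path
        [int]$Top = 50    # how many addresses to return
    )
    # Group 1 captures the address without the surrounding brackets
    $re = New-Object regex '\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]'
    $counts = @{}
    $reader = New-Object System.IO.StreamReader $Path
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            $m = $re.Match($line)
            if ($m.Success) {
                $ip = $m.Groups[1].Value
                # [int]$null is 0, so the first occurrence becomes 1
                $counts[$ip] = [int]$counts[$ip] + 1
            }
        }
    }
    finally {
        $reader.Close()
    }
    # Rank by count, highest first, and keep the top entries
    $counts.GetEnumerator() |
        Sort-Object -Property Value -Descending |
        Select-Object -First $Top |
        ForEach-Object {
            New-Object PSObject -Property @{ IPAddress = $_.Key; CountOfIt = $_.Value }
        }
}
```

The result could be piped to Export-Csv to get a file similar in spirit to Steve.csv. Only one pass over the file is needed, and memory use grows with the number of distinct addresses rather than with the file size.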
'Findstr
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/findstr.mspx?mfr=true
'QGrep
http://www.ss64.com/nt/qgrep.html
In conclusion, the clear winner was Log Parser: its speed and accuracy were great. PowerShell was 'cool' but took too long on the large files; maybe that will change as I get better at PowerShell. Findstr and QGrep appear to be geared toward returning entire matching lines rather than extracting part of a line. That was my experience, though it could be down to my lack of advanced knowledge of these tools. I use FINDSTR a lot for quick searches, and it is faster than FIND. I had hoped to use its regular expressions, but found PowerShell easier for regex work. I didn't try the grep utilities available on SourceForge, because Log Parser did the trick. If you have other experiences using FINDSTR, QGrep or some other tool, please pass them along. Hope this helps!
3 Comments
http:// said
I noticed that you didn't mention "select-string", which is part of PowerShell. It is described on Ian Griffiths' site, http://www.interact-sw.co.uk/iangblog/2006/06/03/pshfindstr
steve schofield said
Thanks for posting, I didn't know about select-string.
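A minimal sketch of what a Select-String version might look like. It assumes PowerShell 2.0 or later, where each result carries a Matches property, and the function name Find-BracketedIp is made up for illustration. It has not been timed against the large files.

```powershell
# Hypothetical helper (not from the original post): returns the first
# bracketed IPv4 match, e.g. [1.1.1.1], from each line that contains one.
function Find-BracketedIp {
    param([string]$Path)   # input log file
    Select-String -Path $Path -Pattern '\[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\]' |
        ForEach-Object { $_.Matches[0].Value }
}
```

The output could be redirected to a file, for example Find-BracketedIp 'C:\logs\sample.txt' > st1.txt, where the path is a placeholder.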
Avanti said
I want to design a log file analyzer. Can anyone suggest some searching algorithms so that log files can be searched in less time (faster)?