Apache log shell scripts
Look for bytes returned > 1,000,000
We were looking for a bug that was dumping way too much data, and we needed a way to find records that returned more than a million bytes. David Choi figured this out.
cat access_log* | grep browseinst | awk -F\" '{ print $1" ["$2"] ["$6"] "$3 }' | grep browseinst | awk '{if ($NF > 1000000) print $0}'
Explanation
- cat access_log* - feed the contents of all access_log files into the next stage of the pipeline
- grep browseinst - keep only lines that contain “browseinst”
- awk -F\" '{ print $1" ["$2"] ["$6"] "$3 }' - split each line into fields delimited by double quotes, then print only the 1st, 2nd, 6th, and 3rd fields, in that order, with the 2nd and 6th wrapped in square brackets
- grep browseinst - filter for “browseinst” again, because the first match could have come from the referrer field, which we don’t want and which the previous step dropped
- awk '{if ($NF > 1000000) print $0}' - if the last whitespace-separated field (the bytes returned) is greater than 1 million, print the whole line
(Note: try pulling the pipeline apart and building it back up, looking at the output at each step. That’s the only way I could make sense of it.)
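One way to follow that advice is to run the pipeline one stage at a time against a small throwaway log. The sketch below assumes the logs are in Apache combined log format; the three sample lines, the temporary directory, and the `stage1` to `stage4` variable names are invented for illustration and are not from the real logs.

```shell
#!/bin/sh
# Hypothetical sample data: three invented lines in combined log format.
tmpdir=$(mktemp -d)
cat > "$tmpdir/access_log.1" <<'EOF'
192.0.2.10 - - [14/Jul/2010:10:00:00 -0700] "GET /?page=browseinst&term=101 HTTP/1.1" 200 1500000 "-" "ExampleAgent/1.0"
192.0.2.11 - - [14/Jul/2010:10:01:00 -0700] "GET /?page=browseinst&term=101 HTTP/1.1" 200 2048 "-" "ExampleAgent/1.0"
192.0.2.12 - - [14/Jul/2010:10:02:00 -0700] "GET /?page=home HTTP/1.1" 200 3000000 "http://example.com/?page=browseinst" "ExampleAgent/1.0"
EOF

# Stage 1: every line that mentions browseinst anywhere, referrer included.
stage1=$(cat "$tmpdir"/access_log* | grep browseinst)

# Stage 2: keep client/timestamp, request, user agent, and status/bytes.
stage2=$(printf '%s\n' "$stage1" | awk -F\" '{ print $1" ["$2"] ["$6"] "$3 }')

# Stage 3: the referrer is gone now, so referrer-only matches drop out.
stage3=$(printf '%s\n' "$stage2" | grep browseinst)

# Stage 4: keep lines whose last field (bytes returned) exceeds 1,000,000.
stage4=$(printf '%s\n' "$stage3" | awk '{ if ($NF > 1000000) print $0 }')

# Show what each stage produced.
printf -- '--- after first grep ---\n%s\n' "$stage1"
printf -- '--- after first awk ---\n%s\n' "$stage2"
printf -- '--- after second grep ---\n%s\n' "$stage3"
printf -- '--- after second awk ---\n%s\n' "$stage4"

rm -rf "$tmpdir"
```

With this sample data, the first grep keeps all three lines, the second grep drops the one that only matched in the referrer, and the final awk leaves just the 1,500,000-byte request.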
Output
128.97.62.186 - - [14/Jul/2010:10:25:18 -0700] [GET /?page=browseinst&term=101&lastalpha=I&instructor=2 HTTP/1.1] [Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6 ( .NET CLR 3.5.30729)] 200 1026117
128.97.198.33 - - [14/Jul/2010:22:48:16 -0700] [GET /?page=browseinst&term=101&lastalpha=T&instructor=1207888 HTTP/1.1] [Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6] 200 1059649
…
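As a possible alternative, the same filtering can be done in a single awk call that tests the request field directly, so no second grep is needed to rule out referrer matches. This is a sketch, not the command we actually ran: it assumes combined log format, and the sample lines, the temporary directory, and the `big` and `status_bytes` names are made up for the example.

```shell
#!/bin/sh
# Hypothetical sample data: three invented lines in combined log format.
tmpdir=$(mktemp -d)
cat > "$tmpdir/access_log.1" <<'EOF'
192.0.2.10 - - [14/Jul/2010:10:00:00 -0700] "GET /?page=browseinst&term=101 HTTP/1.1" 200 1500000 "-" "ExampleAgent/1.0"
192.0.2.11 - - [14/Jul/2010:10:01:00 -0700] "GET /?page=browseinst&term=101 HTTP/1.1" 200 2048 "-" "ExampleAgent/1.0"
192.0.2.12 - - [14/Jul/2010:10:02:00 -0700] "GET /?page=home HTTP/1.1" 200 3000000 "http://example.com/?page=browseinst" "ExampleAgent/1.0"
EOF

# Split on double quotes: $2 is the request, $3 is " status bytes ",
# $6 is the user agent. Only the request field is checked for browseinst.
big=$(awk -F\" '
  $2 ~ /browseinst/ {
    # Break " 200 1500000 " into words; the last word is the byte count.
    n = split($3, status_bytes, " ")
    # Adding 0 forces a numeric comparison (a "-" byte count becomes 0).
    if (status_bytes[n] + 0 > 1000000)
      print $1" ["$2"] ["$6"] "$3
  }
' "$tmpdir"/access_log*)

printf '%s\n' "$big"

rm -rf "$tmpdir"
```

Reading the files directly with awk also avoids the extra cat, though the original pipeline is easier to take apart stage by stage.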