Apache log shell scripts
Look for bytes returned > 1,000,000
We were looking for a bug that was dumping way too much data, and we needed a way to find records that returned more than a million bytes. David Choi figured this out.
cat access_log* | grep browseinst | awk -F\" '{ print $1" ["$2"] ["$6"] "$3 }' | grep browseinst | awk '{if ($NF > 1000000) print $0}'
Explanation
- cat access_log* – feed contents of all access_log files into next part of script
- grep browseinst – only return lines with “browseinst”
- awk -F\" '{ print $1" ["$2"] ["$6"] "$3 }' – split the line into fields delimited by double quotes, then print only the 1st, 2nd, 6th, and 3rd fields, in that order (client and timestamp, request, user agent, status and bytes)
- grep browseinst – filter for “browseinst” again, because the first match could have come from the referrer field, which we don’t want and which is no longer part of the line
- awk '{if ($NF > 1000000) print $0}' – if the last whitespace-separated field (the byte count) is greater than 1 million, print the whole line
(Note: try pulling it apart and building it back up, looking at the output at each step. That’s the only way I could make sense of it.)
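A minimal sketch of that pull-it-apart approach, run against two made-up log lines instead of real access_log files. The sample data and the variable names (sample, step1, step2, step3) are assumptions for illustration only; the second sample line mentions browseinst only in its referrer, so it should survive the first grep and be dropped by the second.

```shell
# Two made-up lines in Apache "combined" format (hypothetical data).
# Line 1 requests browseinst; line 2 only has browseinst in the referrer.
sample='10.0.0.1 - - [14/Jul/2010:10:25:18 -0700] "GET /?page=browseinst&term=101 HTTP/1.1" 200 1026117 "-" "Mozilla/5.0"
10.0.0.2 - - [14/Jul/2010:10:26:00 -0700] "GET /index.php HTTP/1.1" 200 2000000 "http://example.com/?page=browseinst" "Mozilla/5.0"'

# Step 1: first grep, then split on double quotes and keep fields 1, 2, 6, 3.
step1=$(printf '%s\n' "$sample" | grep browseinst | awk -F\" '{ print $1" ["$2"] ["$6"] "$3 }')

# Step 2: grep again; the referrer is gone, so referrer-only matches drop out.
step2=$(printf '%s\n' "$step1" | grep browseinst)

# Step 3: keep lines whose last whitespace-separated field (bytes) is over 1,000,000.
step3=$(printf '%s\n' "$step2" | awk '{ if ($NF > 1000000) print $0 }')

# Look at each stage in turn.
printf 'after step 1:\n%s\n\n' "$step1"
printf 'after step 2:\n%s\n\n' "$step2"
printf 'after step 3:\n%s\n' "$step3"
```

Both lines show up after step 1, and only the 10.0.0.1 line is left after steps 2 and 3.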
Output
128.97.62.186 - - [14/Jul/2010:10:25:18 -0700] [GET /?page=browseinst&term=101&lastalpha=I&instructor=2 HTTP/1.1] [Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6 ( .NET CLR 3.5.30729)] 200 1026117
128.97.198.33 - - [14/Jul/2010:22:48:16 -0700] [GET /?page=browseinst&term=101&lastalpha=T&instructor=1207888 HTTP/1.1] [Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6] 200 1059649
...
Here’s an update, with partially labelled output.
Here's an awk script to get requests that returned more than 1,000,000 bytes, but also show the number of seconds each request took.
cat /logs/httpd/ssl_access_log | awk -F\" '{ print $1" [URL: "$2"] [BROWSER: "$6"] [REFERER: "$4"] [SECONDS: "$7"] "$3 }' | awk '{if ($NF > 1000000) print $0}'
108.13.60.167 - - [27/Nov/2011:04:14:00 -0800] [URL: GET /file.php/7718/course_materials/Geography_131_early_HoloceneNew.pdf?forcedownload=1 HTTP/1.1] [BROWSER: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.24) Gecko/20111103 Firefox/3.6.24] [REFERER: https://classes.sscnet.ucla.edu/course/view.php?name=11F-GEOGM131-1] [SECONDS: 17] 200 3529789
75.47.248.169 - - [27/Nov/2011:04:16:50 -0800] [URL: GET /file.php/7826/course_materials/Anthro33Lecture13.ppt.pdf?forcedownload=1 HTTP/1.1] [BROWSER: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2] [REFERER: https://classes.sscnet.ucla.edu/.../course/view/11F-ANTHRO33-1?topic=7] [SECONDS: 16] 200 1298914
75.47.248.169 - - [27/Nov/2011:04:17:08 -0800] [URL: GET /file.php/7826/course_materials/Anthro33Lecture15.ppt.pdf?forcedownload=1 HTTP/1.1] [BROWSER: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2] [REFERER: https://classes.sscnet.ucla.edu/.../course/view/11F-ANTHRO33-1?topic=8] [SECONDS: 19] 200 1578667
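The field numbers passed to awk -F\" depend on the log format, so it can help to dump every double-quote-delimited field with its index before picking which ones to print. The sketch below does that for one made-up log line. It assumes the server's log format appends the time taken after the quoted user agent (for example via %T in a custom LogFormat), which is a guess based on the [SECONDS: ...] output and not something stated here; the sample line and the variable names line and fields are hypothetical.

```shell
# Made-up log line (hypothetical data): combined format plus a trailing
# duration in seconds after the quoted user agent.
line='10.0.0.3 - - [27/Nov/2011:04:14:00 -0800] "GET /file.php/1/example.pdf HTTP/1.1" 200 2500000 "https://example.com/course" "Mozilla/5.0" 17'

# Print each double-quote-delimited field next to its field number.
fields=$(printf '%s\n' "$line" | awk -F\" '{ for (i = 1; i <= NF; i++) print i ": [" $i "]" }')

printf '%s\n' "$fields"
```

With a line shaped like this one, field 2 is the request, field 3 holds the status and byte count, field 4 the referrer, field 6 the user agent, and field 7 the trailing duration. If your logs are laid out differently, the numbers in the print statement need to change to match.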