Server-side kubernetes nginx-ingress log analysis using GoAccess
I manage a single-node Kubernetes cluster to run some of my fun side projects, and I am slowly getting rid of Google Analytics from them. So I was looking for a server-side log analyser, and GoAccess seems to have what I need, mostly.
GoAccess is a very fast open source web log analyser and interactive viewer that runs in a terminal in *nix systems or through your browser. It provides fast and valuable HTTP statistics for system administrators that require a visual server report on the fly.
Parsing nginx-ingress container log
GoAccess recognises most common log formats generated by IIS, Apache and nginx. However, the logs generated by the nginx-ingress container that sits as a gateway to the services running on my Kubernetes cluster were quite different. Here is an example log line:
{"log":"123.123.223.23 - [123.123.223.27] - - [30/Jul/2019:04:45:57 +0000] \"GET /about-me HTTP/1.1\" 500 117356 \"-\" \"Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)\" 458 13.047 [mustakim-site-service-80] 172.17.0.17:5000 117349 13.048 500 830f5081a7740db11eb2ffbce625dafc\n","stream":"stdout","time":"2019-07-30T04:45:57.689166775Z"}
So I had to tell GoAccess how to parse these logs by providing values for the --log-format, --date-format and --time-format command line arguments.
Arguments | Value |
---|---|
--log-format | %^ %^ [%h] - - [%d:%t] %~ %~ %m %U %^ %s %b %R %u %^ %^ %^ %^ %^ %T %^ |
--date-format | %d/%b/%Y |
--time-format | %H:%M:%S +0000 |
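
To make the table concrete, here is a minimal Python sketch that assembles a GoAccess command line from those three values. The function name `goaccess_command` and the example paths are illustrative assumptions, not part of my actual script; `-o` is GoAccess's output-file option, and the sketch only builds and prints the command rather than running it.

```python
import shlex

# Values copied from the table above.
LOG_FORMAT = "%^ %^ [%h] - - [%d:%t] %~ %~ %m %U %^ %s %b %R %u %^ %^ %^ %^ %^ %T %^"
DATE_FORMAT = "%d/%b/%Y"
TIME_FORMAT = "%H:%M:%S +0000"


def goaccess_command(log_path, report_path):
    """Build the argument list for one GoAccess run (hypothetical helper).

    Returning a list instead of a shell string avoids quoting problems,
    since the log format contains spaces, brackets and percent signs.
    """
    return [
        "goaccess",
        log_path,
        "-o", report_path,  # write a static HTML report
        f"--log-format={LOG_FORMAT}",
        f"--date-format={DATE_FORMAT}",
        f"--time-format={TIME_FORMAT}",
    ]


if __name__ == "__main__":
    # Example paths are assumptions for illustration only.
    cmd = goaccess_command("imported-log.log", "report.html")
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, assuming `goaccess` is on the `PATH`.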
Keep in mind that the log json property of the container logs (usually found in /var/log/containers/) does not always contain access logs similar to the example above. It also contains other miscellaneous logs generated by the container. Also, all the logs are combined into one or many files (depending on the number of replicas available for the ingress deployment). So I decided that, whatever I do:
- I need to grep with the service name to extract logs for a particular service.
- If I need to generate a dashboard for all requests, then I'll simply grep Mozilla!
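
As a rough illustration of that filtering, the sketch below does a plain substring match over raw container log lines, which is approximately what grep does with a fixed string. The helper name `grep_lines` is an assumption for this example, and the second sample line is a made-up placeholder standing in for the non-access messages the container also writes.

```python
SAMPLE_LINES = [
    # The access log line shown earlier in this post, kept as raw JSON text.
    r'{"log":"123.123.223.23 - [123.123.223.27] - - [30/Jul/2019:04:45:57 +0000] \"GET /about-me HTTP/1.1\" 500 117356 \"-\" \"Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)\" 458 13.047 [mustakim-site-service-80] 172.17.0.17:5000 117349 13.048 500 830f5081a7740db11eb2ffbce625dafc\n","stream":"stdout","time":"2019-07-30T04:45:57.689166775Z"}',
    # Placeholder (not a real ingress message) for a miscellaneous, non-access line.
    r'{"log":"(placeholder: some non-access message)\n","stream":"stderr","time":"2019-07-30T04:45:58.000000000Z"}',
]


def grep_lines(lines, needle):
    """Keep only the lines that contain `needle` (hypothetical helper)."""
    return [line for line in lines if needle in line]


if __name__ == "__main__":
    per_service = grep_lines(SAMPLE_LINES, "mustakim-site-service")
    all_requests = grep_lines(SAMPLE_LINES, "Mozilla")
    print(len(per_service), len(all_requests))
```

Matching on Mozilla is only a heuristic: it keeps requests whose user agent contains that string and silently drops any client that does not send one.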
How it's done
1. Get the names of all Kubernetes services.
2. Generate an index.html file that contains links to each static HTML file generated by GoAccess, in order to navigate easily. This will go to /storage/goaccess/out/, which is the root of a static web server that is already running.
3. Combine all logs generated by the Kubernetes nginx-ingress from /var/log/containers/.
4. grep the logs to extract the logs of each service (this makes sure other logs are skipped).
5. Copy the extracted log to a temporary location (in my case: /storage/goaccess/imported-logs/imported-log.log).
6. Run GoAccess and pass it the log as well as instructions on how to parse it.
7. The generated static HTML (that renders the nice dashboard) will go to /storage/goaccess/out/{svc_name}.html.
8. Repeat steps 3 - 7 for each of the services.
9. Repeat the above, but grep Mozilla and save as all.html, so we have another dashboard for all requests to the server.
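
The steps above can be sketched in Python roughly as follows. This is not the gist itself, just a minimal outline under stated assumptions: the service names are passed in by the caller (how they are fetched is left out), `run_goaccess` is a hypothetical callable that turns one log file into one HTML report, and all paths are parameters so nothing here is tied to my server layout.

```python
import glob
import html
from pathlib import Path


def render_index(report_names):
    """Return a bare-bones index.html linking to <name>.html for each report."""
    items = "\n".join(
        f'  <li><a href="{html.escape(name)}.html">{html.escape(name)}</a></li>'
        for name in report_names
    )
    return f"<!doctype html>\n<title>GoAccess reports</title>\n<ul>\n{items}\n</ul>\n"


def extract_lines(log_glob, needle):
    """Combine every log file matching `log_glob` and keep lines containing `needle`."""
    kept = []
    for path in sorted(glob.glob(log_glob)):
        with open(path, encoding="utf-8", errors="replace") as fh:
            kept.extend(line for line in fh if needle in line)
    return kept


def build_reports(service_names, log_glob, imported_log, out_dir, run_goaccess):
    """Write index.html, then one report per service plus a catch-all report.

    `run_goaccess(log_path, report_path)` is an assumed callback that
    invokes GoAccess; it is injected so this outline stays side-effect free
    apart from writing the index and the temporary log file.
    """
    imported_log = Path(imported_log)
    out_dir = Path(out_dir)
    imported_log.parent.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)

    # (text to search for, report name): one per service, then all requests.
    jobs = [(name, name) for name in service_names] + [("Mozilla", "all")]

    # Index page with a link to every report that is about to be generated.
    index_html = render_index([report for _, report in jobs])
    (out_dir / "index.html").write_text(index_html, encoding="utf-8")

    for needle, report in jobs:
        # Extract the matching lines into the temporary log file, overwriting it.
        lines = extract_lines(log_glob, needle)
        imported_log.write_text("".join(lines), encoding="utf-8")
        # Hand the temporary log to GoAccess and name the report after the job.
        run_goaccess(imported_log, out_dir / f"{report}.html")

    return [report for _, report in jobs]
```

Reusing a single temporary log file keeps disk usage flat, at the cost of forcing the reports to be generated one after another.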
The script
This is nowhere near perfect, but it's a good start. I will keep the gist updated as I improve this.