$ echo "howdy" | cowsay
< howdy >
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Docker image for delivering AWS S3 logs via AWStats

I had a small project that needed some simple stats for static content sitting in an AWS S3 bucket. I could have forwarded everything to Elastic+Kibana and shown some fancy graphs and charts, but I was only being asked for what I could easily produce via AWStats.

AWS S3 log format for awstats

For S3 logging, awstats needs its LogFormat set as follows:

%other %extra1 %time1 %host %logname %other %method %url %methodurl %code %other %extra2 %bytesd %other %extra3 %refererquot %uaquot %other %other %other %other %other %virtualname %other

Amazon’s documentation is available here.
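To see which column of a log line each LogFormat token lines up with, a rough Python sketch like the one below can help. It is not how awstats itself parses logs. The sample line is made up for illustration (placeholder owner, requester and host IDs), and the tokenizer simply assumes that a field is either a [bracketed] timestamp, a "quoted" string, or a run of non-space characters.

```python
import re

# The LogFormat string from this post.
LOG_FORMAT = (
    "%other %extra1 %time1 %host %logname %other %method %url %methodurl "
    "%code %other %extra2 %bytesd %other %extra3 %refererquot %uaquot "
    "%other %other %other %other %other %virtualname %other"
)

# Assumption: a field is a [bracketed] value, a "quoted" value, or a bare word.
FIELD_RE = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')

# Hypothetical S3 access log line; every value here is a placeholder.
SAMPLE_LINE = (
    'OWNERID mybucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 REQUESTER REQID '
    'REST.GET.OBJECT index.html "GET /index.html HTTP/1.1" 200 - 113 113 7 5 '
    '"-" "curl/7.0" - HOSTID SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader '
    'mybucket.s3.amazonaws.com TLSv1.2'
)


def pair_fields(line):
    """Pair each LogFormat token with the field found at the same position."""
    names = LOG_FORMAT.split()
    values = FIELD_RE.findall(line)
    if len(names) != len(values):
        raise ValueError(
            f"expected {len(names)} fields, found {len(values)}"
        )
    return list(zip(names, values))


if __name__ == "__main__":
    for name, value in pair_fields(SAMPLE_LINE):
        print(f"{name:14} {value}")
```

A line with a different number of fields raises a ValueError, which is a quick way to spot log lines that would not match the format.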

Bulk OCRing mixed content and exporting as PDF

This is written more as an aide-memoire to myself than anything. It’s a process I’m currently using to bulk-process a set of documents in various formats (MS Word, PPT, PDF, LibreOffice etc.): convert them all to PDF, run OCR on any embedded images, and then load the end result into Elasticsearch via Tika (that final step is not documented here; there is plenty of documentation for it elsewhere).
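As a rough illustration of the convert-then-OCR part of such a pipeline, here is a minimal Python sketch. The tool choice is an assumption, not necessarily what the full post uses: it assumes LibreOffice's `soffice` for the conversion to PDF and `ocrmypdf` for the OCR pass. The function names, the suffix list and the directory layout are hypothetical.

```python
from pathlib import Path
import subprocess

# Assumption: these suffixes are handed to LibreOffice for conversion.
OFFICE_SUFFIXES = {".doc", ".docx", ".ppt", ".pptx", ".odt", ".odp"}


def plan_commands(src: Path, work_dir: Path, out_dir: Path) -> list[list[str]]:
    """Return the commands needed to turn one document into an OCRed PDF."""
    cmds = []
    suffix = src.suffix.lower()
    if suffix in OFFICE_SUFFIXES:
        # Headless LibreOffice writes <stem>.pdf into work_dir.
        cmds.append([
            "soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(work_dir), str(src),
        ])
        pdf = work_dir / (src.stem + ".pdf")
    elif suffix == ".pdf":
        pdf = src
    else:
        # Unknown type: nothing to do in this sketch.
        return []
    # --skip-text leaves pages that already have a text layer alone.
    cmds.append(["ocrmypdf", "--skip-text", str(pdf), str(out_dir / pdf.name)])
    return cmds


def process(src: Path, work_dir: Path, out_dir: Path) -> None:
    """Run the planned commands, stopping on the first failure."""
    for cmd in plan_commands(src, work_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Keeping the command planning separate from the execution makes it easy to print the commands for a whole directory as a dry run before processing anything.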