My Current Approach to Web Analytics

During the technological writer’s block that has been the last three and a half years, one of my posts has become quite popular several times due to high-profile people sharing it on different sites. Some of these times I noticed because I was credited and tagged in the post, but mostly I found out by accident. I still remember when Leo told me the post was on the front page of Hacker News, and when Alex told me a post in r/gifs had made it to the front page of Reddit. I still struggle to wrap my head around any of these feats, even more so taking into account that I did not post either of them (yes, I am missing out on a lot of fake internet points). These were two notable ones, but sometimes people were sharing the post on social media and I was missing out on some very interesting conversations.

That made me wish I had some kind of basic web analytics. I have already written what I think about Google Analytics: it is overkill for a simple blog, and I have serious doubts about the ethical consequences of the level of tracking it performs. This post is a short write-up about the alternative I found, how it compares to Google Analytics, and my thoughts on adding analytics to the site.

A plausible alternative

The best alternative I found was Plausible Analytics, from a small European company that provides a minimalist, open source alternative. This is a list of the main things that caught my eye when looking at it:

  • It is open source: not only the actual agent code but also the backend, which means that a self-hosted version can be run on any machine, and it is trivial to point the agent to report to a custom server.

  • It is concise: I will never be able to understand all of Google Analytics. It is a full-time job, and I just want a quick overview of some basic web metrics to understand how the blog is doing. Plausible simply displays a concise dashboard with visits, time spent and referrals, as well as basic device info. You can check all of it at a glance.

  • It is not comprehensive: Google Analytics is a very, very powerful tracking tool. It is borderline creepy how detailed the information is, and how granular it can get. Plausible define themselves as privacy focused, since they provide just aggregate, anonymous metrics and allow you to define basic goals to measure. They are also EU-based, which is reassuring since Google Analytics seems to struggle to comply with GDPR.

Completeness in the metrics

All the aforementioned points made me consider Plausible Analytics a very relevant alternative, but what really interested me was the fact that Plausible includes thorough documentation on how to use a proxy to serve the agent code. That way, this simple tracking will not be affected by adblockers and similar solutions 1 and we will, indeed, have better analytics than with Google’s solution.

Even though Plausible already ran the numbers on a much more relevant sample and stated that 60% of the users already block Google Analytics, I wanted to test it on my target audience by leaving both solutions on my site for a week and comparing their metrics afterwards.

During the short period of July 19th to July 22nd, Google Analytics registered 65 users while Plausible (with its proxy setup) registered 111, which translates to roughly 40% of the users in this time period having some kind of blocking technology. It is not a very significant sample, but I am not looking forward to keeping Google Analytics on my website for much longer. Still, it is a number to take into account if you need reliable metrics for any project.
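For anyone who wants to repeat the comparison on their own site, the arithmetic behind that figure can be sketched in a few lines of Python. This is only a rough estimate under my own assumptions: it treats the proxied Plausible count as the real number of visitors and assumes both tools count a "user" in the same way. The function name `blocked_share` is made up for this example.

```python
def blocked_share(blocked_tool_users: int, proxied_tool_users: int) -> float:
    """Estimate the share of visitors that the blockable tool did not see.

    Assumption: the proxied tool sees every visitor, so the difference
    between both counts is the number of visitors using a blocker.
    """
    if proxied_tool_users <= 0:
        raise ValueError("proxied_tool_users must be positive")
    return (proxied_tool_users - blocked_tool_users) / proxied_tool_users


# Counts from July 19th to July 22nd, as reported above.
google_analytics_users = 65
plausible_users = 111

share = blocked_share(google_analytics_users, plausible_users)
print(f"Estimated share of visitors blocking Google Analytics: {share:.1%}")
# prints "Estimated share of visitors blocking Google Analytics: 41.4%"
```

With only 111 users in the sample, this is a ballpark number rather than a measurement.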

I still have some internal debate on how fair it is to use the proxy solution to serve the code: I am effectively hiding tracking code that the user is actively trying to block. In the end, my reasoning was that I am myself a uBlock Origin user not because of simple visit-counting scripts, but because of heavy tracking solutions that base their profitability on providing detailed user behavior to an advertisement company. Whether I am correct or this is cognitive dissonance is yet to be proven, though. 2

Using and publishing the metrics

Now that the tools are up and running, I can finally detect trends and spikes in visits, as well as their approximate sources, which is much more than I could rely on before 3. It is also just nice to be able to see that someone is reading my posts, and that I am not just writing into the void. However, to be transparent with any readers who may be concerned, I have made the metrics dashboard public so that everyone is able to check it. This is exactly the dashboard I can see, not some kind of minimized version. I will link to this dashboard in the About page as well, to reassure any skeptical users. And please don’t laugh at the numbers, I know there are not that many people reading the blog (yet!).

All in all, I am really happy with the solution I found. After some very simple modifications to my Hugo theme, I was able to get all the data I wanted (no more, no less) in a very intuitive dashboard. I hope to see it gain some traction with new posts in the future!

  1. Except for not running any JavaScript code; that is indeed the only reliable way to prevent any kind of analytics. ↩︎

  2. Plausible provides a compelling case as well. ↩︎

  3. My friends. Basically, my friends were the closest thing I had to traffic spike notifications. ↩︎