How we built a GDPR compliant website analytics platform without using cookies
Fathom Analytics is GDPR compliant website analytics without cookies.
July 21, 2019
Running a privacy-focused business means we always look for ways to improve user privacy while providing simple website analytics software. Our goal was to continue to provide users with unique visits & total page views without interfering with the visitor experience. Nobody wants those annoying popups on their website unless they have to have them. That's why, back in 2019, we invented analytics that need no cookie notice, following growing demand from our customers.
So let's dive into how we built this in a privacy-focused way:
No cookie-like technology
One of the things we had to ensure was that our software included no cookies or cookie-like technology. We also wanted to comply with ePrivacy rules, including the PECR and other member state implementations.
This meant we couldn't use any of the following:
1. Cookies
2. localStorage
3. sessionStorage
4. Data derived from "Terminal Equipment" (timezone, device height/width, etc.)
And honestly, these limitations were reasonable. We don't want to be looking at things on the visitors' device, as that feels invasive. So these technical constraints were perfect.
Analytics without cookies
One of the things Fathom Analytics needs is the ability to track visits to the site and individual pages. Tracking page views alone, without visits, is completely useless and means that you won't have insight into how many people visit your site & pages each day.
Processing personal data (IP address & User-Agent count as personal data under the GDPR) is not an issue in itself, because the GDPR offers six lawful bases for doing so. Our customers rely on legitimate interest. There is no risk to the data subject, and we ensure their data is anonymized.
To do this, we combine the following data to create a unique hash for the user:
- A salt based on IP address & Site ID
- IP Address
- Hostname (The domain of the website, e.g. https://usefathom.com)
- Site ID
We then take this data and compute a SHA-256 hash of it. The output looks something like this: cd3f1ed906bb12b62dd5eff809aa1778211a02d1c11992476f0c9977c0db0646
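As an illustrative sketch only, not the production implementation: the snippet below combines the four inputs listed above and hashes them with SHA-256. The function name `visitor_hash`, the separator character, the order of the fields, and the example values are all assumptions made for the example; the article does not say how the fields are joined.

```python
import hashlib


def visitor_hash(salt: str, ip_address: str, hostname: str, site_id: str) -> str:
    """Return the hex SHA-256 digest of the combined inputs (hypothetical helper)."""
    # Join with a separator that is unlikely to appear in any field, so that
    # ("ab", "c") and ("a", "bc") do not collapse into the same input string.
    payload = "\x1f".join([salt, ip_address, hostname, site_id])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Example values are made up; 203.0.113.7 is a documentation-only IP address.
example = visitor_hash("example-salt", "203.0.113.7", "usefathom.com", "ABCDEF")
print(len(example))  # 64 hex characters, i.e. 256 bits
```

The same inputs always give the same digest, which is what lets repeat page views from one visitor be grouped, while changing any input gives an unrelated digest.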
The hashes we generate are impossible for us to "de-hash" (hashes and salts, explained). It's important to note that hashing is not the same as encryption. With encryption, you can decrypt the data, whereas hashing is one-way.
We recycle our salt string every day at midnight. This isn't necessarily needed, but it's added complexity against future computing power & rainbow tables.
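A daily salt rotation could be sketched roughly as follows. This is a hedged illustration, not the real mechanism: the class name `DailySalt`, the use of a random 32-byte salt, and the choice of UTC midnight are assumptions, since the article only says the salt is recycled every day at midnight.

```python
import secrets
from datetime import date, datetime, timezone
from typing import Optional


class DailySalt:
    """Hands out a random salt that is replaced when the UTC date changes."""

    def __init__(self) -> None:
        self._day: Optional[date] = None
        self._salt = ""

    def current(self, now: Optional[datetime] = None) -> str:
        today = (now or datetime.now(timezone.utc)).date()
        if today != self._day:
            # New day: discard the old salt and generate a fresh one.
            self._day = today
            self._salt = secrets.token_hex(32)
        return self._salt


salts = DailySalt()
before_midnight = datetime(2019, 7, 21, 23, 59, tzinfo=timezone.utc)
after_midnight = datetime(2019, 7, 22, 0, 0, tzinfo=timezone.utc)
print(salts.current(before_midnight) == salts.current(after_midnight))  # False
```

Because the old salt is thrown away, a hash computed today cannot be matched against a hash computed for the same visitor tomorrow.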
With this solution, we then have an anonymized hash stored in our database, and you cannot use this hash to identify an individual.
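To show why storing only the hash is still enough for both metrics, here is a small sketch under assumed names: `summarize` and the `(visitor_hash, path)` record shape are hypothetical and say nothing about the actual database schema.

```python
from typing import Dict, Iterable, Tuple


def summarize(pageviews: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count total page views and unique visits from (visitor_hash, path) records."""
    hashes = [visitor_hash for visitor_hash, _path in pageviews]
    return {
        "pageviews": len(hashes),           # every record is one page view
        "unique_visits": len(set(hashes)),  # distinct hashes are distinct visitors
    }


records = [("a1b2", "/"), ("a1b2", "/pricing"), ("c3d4", "/")]
print(summarize(records))  # {'pageviews': 3, 'unique_visits': 2}
```

No IP address or other identifier is needed at this stage; the counts come entirely from comparing opaque hashes.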
Schrems II compliant analytics
To respond to the Schrems II ruling, we spent a long time working with our privacy officer and EU legal team. We created isolated, European analytics, which we call EU Isolation. Long story short, we ensure that EU traffic is processed only on EU-owned servers, where it undergoes an additional round of hashing, using a secret key kept on our EU infrastructure, before anything touches US-owned services.
In addition, no US services have access to these servers. Not even GitHub. We self-host our continuous integration systems to ensure we adhere to complete EU Isolation. We don't even let US contractors have access; only EU & Canadian engineers do. And the beauty of our solution is that we're a Canadian company, meaning we benefit from an adequacy decision under the GDPR.
The purpose of the hashes is to make sure that there is no way for us to ever identify an individual from the data we track. This is paramount to Fathom Analytics being truly "privacy-focused" and is what makes us stand out among website analytics providers. It is also important for GDPR compliance.
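One common way to do a keyed "additional round of hashing" is HMAC-SHA256. The sketch below uses it purely as an assumption; the article does not name the construction, and `eu_rehash` and the example key are hypothetical.

```python
import hashlib
import hmac


def eu_rehash(visitor_hash: str, eu_secret_key: bytes) -> str:
    """Apply a keyed second hash (HMAC-SHA256) to an existing visitor hash."""
    return hmac.new(eu_secret_key, visitor_hash.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for the example; a real key would be generated randomly and
# kept only on the EU infrastructure.
key = b"example-key-kept-in-the-eu"
first_round = "cd3f1ed906bb12b62dd5eff809aa1778211a02d1c11992476f0c9977c0db0646"
print(eu_rehash(first_round, key))
```

Without the key, a party that sees only the output cannot recompute it from the original inputs, even if it knows every other value that went into the first hash.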
Since the start, we've wanted GDPR, CCPA and PECR compliance, because we agree with privacy laws that respect the digital rights of internet users. Here are the key points from GDPR's Recital 26 and our comments:
| Recital 26 | Our comment |
|---|---|
| To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. | A natural person / anonymous user can't be singled out because everything is hashed. The only piece that can be connected is when we update the user's previous pageview, but that is all done in a single database transaction, and we don't keep query logs. |
| To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments. | Brute-forcing a 256-bit hash would cost 10^44 times the Gross World Product (GWP). 2019 GWP is US$88.08 trillion ($88,080,000,000,000), so we're at least a few dollars short of brute-forcing a 256-bit hash. |
| The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. | We have rendered the data anonymous to the point where we could not identify a natural person from the hash. |
| This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes. | It's possible that GDPR does not apply to Fathom since data is made completely anonymous. Even if GDPR did still apply, we reiterate the stance that there is a legitimate business interest in understanding how your website is performing. |
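
To give a feel for the scale behind the brute-force comment in the table, here is a back-of-the-envelope calculation. It does not reproduce the 10^44 cost figure; it only shows how long an exhaustive search of all 256-bit digests would take at an assumed rate. The rate of 10^18 hashes per second is a made-up, deliberately generous assumption.

```python
# Number of possible 256-bit digests.
SEARCH_SPACE = 2 ** 256

# Hypothetical attacker speed (an assumption for illustration only).
ASSUMED_HASHES_PER_SECOND = 10 ** 18

SECONDS_PER_YEAR = 365.25 * 24 * 3600

years = SEARCH_SPACE / ASSUMED_HASHES_PER_SECOND / SECONDS_PER_YEAR
print(f"{SEARCH_SPACE:.3e} possible digests")
print(f"{years:.2e} years to try them all at the assumed rate")
```

Even at that rate, the search would take on the order of 10^51 years.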
By doing everything we've outlined above, we're a great Google Analytics alternative. There are other analytics platforms that are already cookie-free, but they don't provide their users with total visits, only pageviews. Anyone who needs website analytics to make money from their online business could tell you why total visits is such an important metric, and it's one we've worked hard to deliver in our privacy-focused web analytics software.
We are incredibly open to any ideas, comments or concerns. This is a big step up from what we had, but there's always room for improvement.