
How to build a privacy-first software business

Written by Jack Ellis, co-founder of Fathom Analytics

Published on: October 5th, 2020

Fathom Analytics is run in a very transparent way. We are honest and vulnerable in our blog posts, sharing our personal stories from the software world, and we regularly share our thoughts, opinions, challenges, and successes on our podcast, Above Board.

Because of this, people regularly ask us how they can make their business more privacy-focused. And what do we do when we get lots of the same question? We blog about it so that everyone can benefit. So let’s get into it.

Practice what you preach

Privacy is not a marketing gimmick. You cannot just slap “privacy-first” on a product and watch the world change. I’m jumping straight into this point because I see it all the time. When Cloudflare launched their privacy-first analytics service, the first thing I noticed was that my browser had blocked 9 trackers/adverts from their blog. This included spy pixels from Facebook & Google.

If people are making moves to protect the privacy of their website visitors, I respect that; I’m not a privacy radical who is never satisfied. But if your announcement says “if you use those services, you’re giving up the privacy of your users to understand how what you’ve put online is performing” and, in the same blog post, you’re giving up the privacy of your users, we know it’s a marketing gimmick. Cloudflare has a privacy-focused track record, so I’m not going to pile on them here, but this is a prime example of how the privacy-focused crowd has incredible bullshit detectors, and they will see through the marketing. So practice what you preach.

Privacy should be the foundation of everything you do

We built Fathom before we had a deep understanding of GDPR, CCPA, ePrivacy (UK: PECR), etc. Originally, our software wasn’t built to satisfy privacy laws; we built it because we wanted simpler software that put user privacy ahead of everything. But what does this mean?

Every time we make a product or business decision, we discuss what impact this has on a user’s right to privacy (as well as the privacy of people who visit websites that use our software). When we started in 2018, that was the main focus, and we used cookies in a very privacy-focused way. We would assign a random ID to a user, store it in a cookie, and that would be the identifier. But we soon realized that we couldn’t continue to do this if we wanted to be compliant with EU privacy laws.

Once we got better bearings with the law (thanks to some of the world’s best consultants and EU lawyers), we continued to consider user privacy ahead of everything, but we also started to think about privacy law. Here’s an example of how we consider software additions (with a real feature request):

  1. A handful of people asked for user journeys. They want to see how users use their site, where they fall off, and where they convert.
  2. A smart way of doing this would be keeping history in cookies and then syncing it. But we can’t because of privacy law.
  3. We don’t like the idea because we don’t believe that keeping user histories is right. We go to great lengths to protect user privacy and this would be a step in the wrong direction.
  4. We won’t build it.

This was an interesting consideration. But even if you were to argue “user journeys are fine, just make them anonymous”, you cannot do it at a technical level. ePrivacy (UK: PECR) prevents you from using cookies. And if you were to assign a fingerprint to a user on the server-side, and tie that to everything they did, you’d be getting into a questionable area. I’m sure it wouldn’t be a problem if, for example, we knew that someone viewed our about, pricing, and podcast pages. But imagine if Fathom was being used on Quora.com, and people were looking at some highly sensitive questions, and a fingerprint was being tied to their history, solely to build user journeys. That would be a disaster. And we invented the EU privacy law compliant analytics technique that said no to that kind of tracking.

Here are a few quick examples of areas of our business, outside of our privacy-focused analytics software, and how we deal with privacy:

  • Access logs - We only keep access logs for 24 hours to prevent malicious attacks.
  • Advertising - We will only advertise if it can be done in a privacy-focused way. This means that we’re not going to pay Facebook for targeted advertising, which we disagree with. We also won’t pay Google via their Adwords service; I’ve heard mixed opinions on that, and I know DuckDuckGo (privacy-focused search) advertises with them, so we’re going to be speaking to DuckDuckGo to get their thoughts. Instead, we prefer podcasts and other privacy-friendly channels.
  • Billing - If a customer downgrades or deletes their account, we wipe their Stripe billing profile. We still keep invoices for legal reasons, but our access to their credit card is gone. Also, when they delete their account, we queue up their account for deletion immediately.
  • Cookies - We don’t use cookies on our marketing website. Once you get into the main Fathom application, we use cookies for security reasons & for sessions (needed to keep you logged in, and these are still compliant with privacy laws because they’re essential). But we avoid cookies whenever we can help it, and we never use tracking cookies.
  • Customer data - Our customer data stays in the database. Our data policy states that we may move personal data to Canada (since that’s where we are based), but the chances of that happening are very low.
  • Transactional emails - We use Postmark, so only 45 days of history is kept. And we never track opens.
  • YouTube - We wanted to reach the millions of people using YouTube. And we wanted to start doing videos. But we didn’t want to link people to YouTube since it tracks them. So we built a “shield” that you can see here, which allows you to view the video on YouTube or watch a YouTube-free version.
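To make the access-log point concrete, here’s a minimal sketch of a 24-hour retention job. The directory layout, file pattern, and function name are all hypothetical for illustration, not our actual setup:

```python
import time
from pathlib import Path

def purge_old_logs(log_dir: str, max_age_hours: float = 24) -> list[str]:
    """Delete access logs older than max_age_hours; return the names removed."""
    cutoff = time.time() - max_age_hours * 3600
    removed = []
    for log_file in Path(log_dir).glob("*.log"):
        # Compare the file's last-modified time against the cutoff
        if log_file.stat().st_mtime < cutoff:
            log_file.unlink()
            removed.append(log_file.name)
    return sorted(removed)
```

Run something like this on a schedule (cron, a queued job, etc.) and logs simply stop existing after a day, so there’s nothing to leak or subpoena.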

There are hundreds (thousands?) of other ways, but specifics aren’t important. The pattern is that user privacy is at the core of every decision we make. And honestly, even if you’re just starting to get interested in protecting your users’ privacy, you are miles ahead of some of the bigger companies who don’t give a rat’s derrière.

Getting to grips with privacy laws

Before we get into this, I have to add a disclaimer and state that I am not a lawyer, so this is not legal advice in any shape or form.

Everyone knows about GDPR. If you’re in the EU, you haven’t got a choice, and you must abide by it. If you’re outside of the EU, you’re in an interesting position, because the law was made by EU lawmakers. And whilst the EU believes you must comply with it, the challenges of enforcing GDPR fines against companies outside of the EU would be tremendous. What we’re seeing instead is the data protection authorities going after the EU subsidiaries (e.g. Facebook Ireland, Google Ireland, etc.).

With us, because we have a lot of customers in the EU (including European governments who use Fathom), we need to ensure our privacy law compliance for their sake. Interestingly, complying with EU privacy laws doesn’t change anything with regards to the core of our business. It adds a few technical limitations, but we already protect user privacy. So the baseline here is that we will comply with GDPR and ePrivacy (UK: PECR). And we will also comply with CCPA because we have customers there too.

GDPR

This is a huge piece of legislation, and it’s a headache to get into by yourself. This isn’t an article on “how to be GDPR compliant”, it’s an article on how to build a privacy-focused software business. For us, we hired a GDPR expert, who eventually became our privacy officer. She was able to help us with GDPR compliance, and we would never have been able to do it (to the level we’ve done it) on our own. Additionally, she was able to connect us to some incredible lawyers, who helped us with new documentation.

In the early days, we had a friend (Sam) who was a privacy consultant. He was able to give us a foundation, but we knew that we needed more. One quick point about GDPR is that there are 6 lawful bases for processing personal data:

  1. Consent
  2. Contract
  3. Legal Obligation
  4. Vital Interests
  5. Public Task
  6. Legitimate Interests

The GDPR is not as simple as “do this” or “do that” so we do recommend hiring an expert rather than trying to wing it yourself.

ePrivacy (PECR)

This is a privacy directive that came from the EU. The UK has left the EU, but it had already implemented the directive beforehand as PECR (the UK’s implementation of the ePrivacy directive).

This directive goes into a lot of things and covers various marketing activities, but the key highlight is that you are not allowed to use cookies without a lawful basis. Why? Because the directive states that you cannot store or access information stored on “Terminal Equipment” without consent, unless it’s considered essential. And “Terminal Equipment” is the user’s device.

So what does this mean? It means:

  • No cookies
  • No localStorage
  • No sessionStorage
  • No checking various capabilities of the machine

And it’s something that a lot of people forget about or don’t understand. This is the reason for us ditching cookies back in 2019. We invented an ePrivacy compliant analytics solution off the back of the ICO issuing advice about cookies. The title of that blog post speaks about GDPR, and we wrote this long before we knew what we know now. The solution was mostly for ePrivacy, not GDPR, as we were already GDPR compliant with ease.
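To illustrate the general idea behind cookieless analytics (this is a simplified sketch of the rotating-salt approach, not our exact production implementation; all names here are illustrative): instead of storing an ID on the visitor’s device, the server derives a one-way hash from request data plus a salt that rotates daily. Nothing touches the “Terminal Equipment”, and because yesterday’s salt is discarded, hashes can’t be linked across days:

```python
import hashlib
import secrets
from datetime import date

# One random salt per day; when the day rolls over, the old salt is
# thrown away, so a hash can never be tied back to a visitor later.
_daily_salts: dict[date, str] = {}

def _salt_for(day: date) -> str:
    if day not in _daily_salts:
        _daily_salts[day] = secrets.token_hex(16)
    return _daily_salts[day]

def anonymous_visitor_id(ip: str, user_agent: str, site_id: str, day: date) -> str:
    """Derive a one-way, per-day visitor hash. Nothing is stored on the device."""
    payload = f"{_salt_for(day)}:{site_id}:{ip}:{user_agent}"
    return hashlib.sha256(payload.encode()).hexdigest()
```

The hash is stable within a day (so you can count unique visitors) but changes every day and is never reversible to an IP or user agent.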

As I said in the GDPR section, read the laws (and advice) yourself and hire a consultant if you need gaps filled in. It’s an excellent investment.

CCPA

CCPA came into action in January 2020, and we were involved in lobbying related to it alongside DuckDuckGo, Brave, Ghostery, ProtonMail, FastMail, and others. You can read more about that here: 24 Tech Companies Back CCPA Amendment to Make It Stronger: Privacy for All Act of 2019.

CCPA is a funny one because it only applies to you if you meet one of the following criteria:

  1. Make at least $25 million in annual revenue
  2. Hold more than 50,000 users’ or devices’ data (this includes website visitors)
  3. Earn more than 50% of revenue from selling data

For criterion number 1, you'd have to be running a very successful business for that to apply, but criterion number 2 applies to anyone who has 50,000+ visitors, which is very easy to hit if you're a global business. Whenever I get into conversations about CCPA, I always send people to this incredible article: California Consumer Privacy Act (CCPA) Compliance Guide, because it’s pretty simple to follow.

Other privacy laws

I’m not going to get into any other privacy laws, as I only want to touch on ones that we’ve had to work with as a business. Ultimately, when you’re starting a business, you don’t have the luxury of lawyers, so get reading and pay for them to confirm your understanding or correct it. It’s much easier for you to read, establish a baseline understanding, and then pay hourly for a consultant or lawyer to fill in the gaps or correct you. If you’re already running a successful business, just pay for an expert and save yourself a ton of time.

Position of strength

We are in the age of “get customers, worry about profit later”. If you watch talks from Y Combinator startups, they’ll talk about how they focused on scaling the user base and finding product-market fit. This can be fine, and I’m not going to bash all venture capital, but we should be aware that taking money from people can end up biting us in the butt down the road.

When building software like Facebook, it’s certainly much harder to think about profit from the start, because you have a situation where scaling users like crazy is the only way to grow. People want to be where their friends are, and Facebook had to offer that for free. Additionally, in the early days (even into the somewhat later days), businesses didn’t think social media was worth anything. So how could they generate revenue without investment? Perhaps it wasn’t possible.

But the reality of the situation is that, because so much money was already invested in Facebook, they had to start making a return for their investors. And they decided to sell user data. Perhaps that was always part of their strategy? And what’s fascinating is that the next 10 years may not be kind to Facebook. The world is slowly turning on them, lawmakers are becoming more aware of their dangerous attitude to privacy, they’re caught in the middle of political wars, and I would hate to be Mark Zuckerberg. But let’s move away from the doom and gloom of Facebook, which we already know about, and stop talking about unicorn tech startups. Facebook is just an extreme example of how bad a company can end up. Let’s talk about how a business can set itself up for being in a position of strength, and why it’s important.

Being in a position of strength is fundamentally about control. You can choose to run your business in the way you know to be right. You don’t have to sell user data or compromise on your ethics for purposes of profitability. This is a huge luxury in this day and age. A lot of people start businesses because they want to be “in charge”, and then they sell parts of their business and get new accountability in their lives.

So what can we do to put ourselves in a position of strength?

Focus on profitability

Profitable businesses don’t need investors. If you can build your business to profit in a way that is ethical to you, you can re-invest that money in growing your business at a moderate, sustainable pace. You won’t have to take millions of venture capital, struggle to monetize, and decide to sell your user data. This has been our focus from the start with our business. We have focused on building a top-quality software product, and charged appropriately. Which brings me to the next point…

You need margin

Low margin businesses are stressful and growth can end up being damaging, not beneficial. You need to build and keep contingency funds. And if you’re only charging a few dollars above what your costs are, you’re going to suffer if things go wrong. Sure, you might argue that low margins work at high volume, but high volume also means that any problems you have are even bigger because you have more users who are affected. Margin helps you build contingency, and when bad weather comes, you will be ready for it. Notice that I said “when” and not “if”. One of our friends, Justin Jackson (cofounder of Transistor), wrote a fantastic article about margin in business: Good businesses have margin.

Tune into yourself

The last few paragraphs have been quite anti-investment. I appreciate that sometimes you cannot build a company without investment, but that doesn’t mean you have to sacrifice your position of strength.

One of the big things I don’t like about venture capital is that some firms will sway you into doing things you’re not entirely comfortable with. You need to tune into yourself and become aware of your core values. If you’re a “make money at the cost of anything” kind of person, then you shouldn’t even be reading this article. But if you’re someone who has ethics and wants to do business in a sustainable, privacy-focused way, it’s so damn important that you do what’s right.

Some VC firms do care about doing what’s right, and that’s fantastic (for them), but a lot of firms care about one thing only: money. Because that’s what investors care about. And those VC firms are motivated to make money, not operate in a principled way. Although, now that we’re seeing the market demand more privacy-focused solutions, we’re seeing venture capital pour into that area.

Data Minimization

Data minimization is the practice of intentionally reducing the amount of personal data you collect about someone. Following this practice is incredibly simple when your business doesn’t rely on selling user data. All you have to do is think about what you’re storing and question the amount of data that you need.

In the past, companies have opted to save everything. Access logs, user activity, and everything they can get their hands on. And they'd keep it permanently. Why? Because even if they’re not selling your data, they can use your data to make business decisions that help grow their revenue.

Imagine you ran a website like Quora. You have millions of people coming to your website, browsing, and asking questions. Some people ask and browse highly sensitive questions. Let’s imagine you own this business. You have a couple of options:

Option 1 - No minimization

You keep track of every single question a user asks & views. And you tie that history to an account or some kind of server-side fingerprint.

Pros: You have a backlog of user activity and can profile users, build/improve algorithms, and increase interaction with your product. And make more money.

Cons: If your database is compromised, this is sensitive data that becomes public. I can now go into a database, type in your email, and find everything you did. Or maybe you have a static IP, and I can search for your activity by IP. Or maybe I know which town/city you’re in, and can isolate it down that way.

Option 2 - Data minimization

You anonymize activity so there’s no way of tying it back to an individual (via IP or user profile). Anonymizing the questions users ask would be a bit more challenging, but you certainly wouldn’t need to track which questions they view.

Pros: You get business insight and can see how people behave inside your application. And no personal data is tied to the behavior. If the database is compromised, you wouldn’t be able to see a user’s history.

Cons: Less ability to profile users (this is a pro in my eyes!).
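As a sketch of what data minimization can look like in code (the field names and prefix lengths here are illustrative choices, not a prescription): whitelist the fields you actually need for aggregate insight, and truncate identifiers like IP addresses before anything touches disk:

```python
import ipaddress

# Whitelist, don't blacklist: anything not explicitly needed is dropped.
ALLOWED_FIELDS = {"page", "action", "hour"}  # coarse time bucket, no user fields

def minimize_event(raw_event: dict) -> dict:
    """Strip an event down to the fields needed for aggregate insight."""
    return {k: v for k, v in raw_event.items() if k in ALLOWED_FIELDS}

def truncate_ip(ip: str) -> str:
    """Zero the host bits so the address identifies a network, not a person."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48  # /24 for IPv4, /48 for IPv6
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```

With this pattern, a database breach exposes page views and coarse timestamps, not people.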

Server locations

Where you host your users’ data can be of huge importance. And this section certainly ties into the Data Minimization section above. Any data you store can legally be pursued by governments. Some people will react to that statement with “well, they shouldn’t be breaking the law”. But it’s not about that. And this article doesn’t exist to teach you why privacy is important. If you want some casual reading on that, my co-founder Paul Jarvis wrote a fantastic post, But I have nothing to hide, that will help.

So imagine you’re processing data on citizens in the EU, and you're keeping substantial data about them (which they've agreed to; perhaps you're a health & fitness company). You’ve got to think about GDPR compliance. You store a whole bunch of personal data about them. What should you do? Here are 2 simple ways to handle this:

  1. Be clear in your terms about where their data is stored
  2. Allow EU users to have their data stored & processed in the EU

You need to be smart about how you do this. If you’re hoping to serve a global market, and you keep all of your servers in the EU for GDPR compliance reasons, you’re being ridiculous. Customers in the USA, Canada, India, Australia, etc. will suffer because of this, and you need a middle ground.

A smart solution would be the following:

  1. EU Customer? Here’s EU infrastructure, and we won’t transfer your data to the US
  2. US Customer? Here’s US infrastructure, we’ll keep your data out of the EU
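A minimal sketch of that routing decision (the hostnames and abbreviated country list are invented for illustration):

```python
# Hypothetical country list and hostnames, purely for illustration.
EU_COUNTRIES = {"DE", "FR", "NL", "IE", "ES", "IT"}  # abbreviated for the sketch

REGIONS = {
    "eu": "db.eu-central.example.com",
    "us": "db.us-east.example.com",
}

def infrastructure_for(country_code: str) -> str:
    """Pin a customer's data to a region based on where they live."""
    region = "eu" if country_code.upper() in EU_COUNTRIES else "us"
    return REGIONS[region]
```

The decision is made once, at signup, and the customer’s data never leaves that region afterwards.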

The challenge is that if you’re a US-based company handling data for foreign citizens, Section 702 of the FISA Amendments Act of 2008 allows the US government to compel you to hand over that data. So even if you do use EU servers, you’re a US business and must hand over data. If you’re an EU-based company, you can get the best of both worlds: EU-based hosting for EU customers and US-based hosting for US customers. Happy days.

As you can tell, there is a lot of complexity in these areas. The server location only partially matters, and the recommendation is to invest significantly in data minimization.

Security

This area is a huge one. Security and privacy go hand in hand (macOS groups them together into “Security & Privacy”), and privacy cannot be upheld without security. You cannot protect your users’ data if your server & database haven’t been adequately hardened. So what should you do?

Spend the big bucks

Pay for the best possible infrastructure and consider your technical expertise. Do you have experience hardening servers and keeping them up to date? Do you know the ins and outs of MySQL security? These are all things to consider when thinking about security.

We spend more on infrastructure than any of our competitors. We do it because we want industry-leading uptime, but we also do it because we want the best people in the world to manage our services. We pay for managed services (I wrote about why managed services are better) so that we have top tier server nerds, who have hundreds of years of combined experience. I am the “CTO” of our company and it would take me years to get as good as a single person who is responsible for managing our servers. So instead of trying to do that, we pay a premium for hundreds of geniuses to keep an eye on our server fleet. And we love it. It’s worth every penny. Managing infrastructure yourself just creates added work and added risk. But, hey, if you’re into DevOps, you do you!

Use tried and tested technology

Imagine you’ve been using one of the leading website CMS systems, something battle-tested like Statamic, and you or your developers have spotted a new kid on the block. You want to move away from Statamic and try it out. But the risk is that this new kid hasn’t yet had any real production experience. Being an early technology adopter can be great, but not when your business depends on it. If you’re going to make security one of your core values (which you should), I’d recommend sticking with battle-tested options.

Restrict access

As I was writing this blog post, I received an email from Gymshark, a clothing company my wife likes. They informed me that 2 rogue employees at their provider, Shopify, had stolen customer data from at least 100 merchants on the platform.

As a “CTO”, and as a developer, my first response is: why the hell did they have access to all of that data? Why did 2 members of support have access to data they didn’t need? Were they doing support on all that data? It just doesn’t make sense to me. I’m speculating, of course, but we have to do better.

For support teams, there should be some kind of mechanism in place where the customer has to authorize them for access. A lot of companies already do this. You’ll be asked, as a customer, to give support staff a “key” (which expires) and they’ll get access to your account. This is such a fantastic way of doing it because the customer has full control throughout the situation and the keys expire after the support session has ended.
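A rough sketch of such a customer-authorized, expiring support key (an in-memory toy to show the shape of the mechanism, not a production design; all names are invented):

```python
import secrets
import time

# In-memory store for the sketch; a real system would persist these.
_active_keys: dict[str, tuple[str, float]] = {}  # key -> (account_id, expires_at)

def grant_support_access(account_id: str, ttl_seconds: int = 3600) -> str:
    """Customer-initiated: mint a random key that unlocks ONE account, briefly."""
    key = secrets.token_urlsafe(32)
    _active_keys[key] = (account_id, time.time() + ttl_seconds)
    return key

def support_can_access(key: str, account_id: str) -> bool:
    """Support staff present the key; it must match the account and be unexpired."""
    entry = _active_keys.get(key)
    if entry is None:
        return False
    granted_account, expires_at = entry
    return granted_account == account_id and time.time() < expires_at

def revoke(key: str) -> None:
    """Customer (or session end) can revoke early."""
    _active_keys.pop(key, None)
```

The key points are that the customer initiates the grant, the key is scoped to a single account, and it stops working on its own even if nobody remembers to revoke it.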

Your default position shouldn’t be “here is access to everything”. Instead, you should limit what people can see. Access to over 1 million orders by default? No, that is a disaster.

Bringing it all together, running a privacy-first business

So there’s a lot of stuff you need to think about when running a privacy-first business. More than I could ever fit in a single blog post. The modern consumer is demanding privacy-first businesses because they’re sick of their privacy being invaded. A lot will change over the next 10 years, as more lawmakers get involved and demand soars. A huge step you can take immediately is to remove Google & Facebook spy pixels from your website and move to a privacy-first analytics service like Fathom Analytics. When user privacy & business interests align, the world becomes a better place.

Filed under: privacy