The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

A couple of weeks ago I wrote about some big changes afoot for Have I Been Pwned (HIBP), namely the introduction of annual billing and new rate limits. Today, they're finally here! These are two of the most eagerly awaited, most requested features on HIBP's UserVoice so it's great to see them finally knocked off after years of waiting. In implementing all this, there are changes to the existing "one size fits all" model so if you're using the HIBP API, please make sure you read this carefully and understand the impact (if any) on you. Here goes:

The Rate Limits and (Some) Pricing are Different

The launch blog post for the authenticated API explained the original rationale behind the $3.50 per month price and most importantly, how I wanted to ensure it didn't pose a barrier:

In choosing the $3.50 figure, I wanted to ensure it was a number that was inconsequential to a legitimate user of the service

As I said in the previous blog post, what I didn't understand at the time was that paradoxically, the low amount was a barrier to many organisations! But equally, it's made the API super accessible to the masses so that price stays. The rate limit, however, needed revisiting and to understand why, let's go back to the beginning:

The "1 request per 1,500ms" rate dated all the way back to 2016 when I'd initially attempted to combat abuse by applying the limit per IP. This was an entirely non-empirical, gut feel, "let's just try and fix the problem right now" decision and it was only very recently I actually started trawling through the data and looking at how the API was being consumed. 1 request every 1,500ms is a maximum of 57,600 requests in a day; here's the number of requests by the top 20 consumers of the service in a recent 24-hour mid-week period:

Keeping in mind that you're never going to achieve the full 57,600 requests in a day as you'd have to time every single one of them perfectly so as not to hit the rate limit, only 1 subscriber even achieved half that potential. In fact, only 9 subscribers achieved even a quarter of the potential with everyone else very quickly falling back to a small fraction of even that. To be fair, I'm conscious that I'm taking a full day of data and talking about requests as if they were evenly distributed across the entire period when there are inevitably use cases where it's more a short burst rather than a prolonged, even distribution. Regardless, what the data is saying is that the default "one size fits all" rate limit is way above and beyond what almost every single subscriber is actually consuming, and by a significant order of magnitude too. In a way, what we ended up with is the little guys subsidising the big guys.

The bottom line is that we're simultaneously adding a bunch of higher rate limits whilst reducing the entry level rate limit. It's easier if you see it all in context so let's just jump straight into the pricing (all in USD):

This is from Stripe's embeddable pricing table I mentioned in the previous post and it's what you see when you first sign up for a key. With new limits, it's easier to talk about "requests per minute" or RPM so that's the nomenclature we're sticking with now. That entry level 10RPM model will work for well in excess of 90% of current subscribers and it's only a very small percentage of the existing subscriber base exceeding it. (And yes, again, I know these requests are sometimes made in bursts but even still, 10RPM is far in excess of the vast majority of use cases.)
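To put the old and new limits on the same footing, the arithmetic can be sketched as below. This is only an illustration: the function names are made up for the example, and the figures come straight from the text (1 request per 1,500ms, and the 10RPM entry level).

```python
MS_PER_MINUTE = 60_000
MINUTES_PER_DAY = 24 * 60


def rpm_from_interval_ms(interval_ms: float) -> float:
    """Convert 'one request every N milliseconds' into requests per minute."""
    return MS_PER_MINUTE / interval_ms


def max_requests_per_day(rpm: float) -> int:
    """Theoretical ceiling if every request were timed perfectly for 24 hours."""
    return round(rpm * MINUTES_PER_DAY)


legacy_rpm = rpm_from_interval_ms(1_500)          # the legacy limit expressed as RPM
legacy_daily = max_requests_per_day(legacy_rpm)   # the 57,600 figure quoted earlier
entry_daily = max_requests_per_day(10)            # the new entry-level plan

print(f"Legacy limit: {legacy_rpm:.0f} RPM, up to {legacy_daily:,} requests per day")
print(f"10RPM plan: up to {entry_daily:,} requests per day")
print(f"10RPM is {entry_daily / legacy_daily:.0%} of the legacy ceiling")
```

In other words, the legacy limit works out to 40RPM, and the 10RPM plan allows a quarter of the old theoretical daily maximum.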

There are economies of scale that have been factored in here. Going from 10RPM to 100RPM isn't a 10x increase in price, it's about a 7x increase. Going to 5 times more requests is only 4 times the price, and so on and so forth. The hope is that this makes it easier for the folks who were previously buying multiple keys to justify scratching all the kludge previously used to do that and replacing it with a single key at a higher RPM.
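The economies of scale can be expressed as a relative cost per request. The sketch below uses only the multiples quoted in the text ("about 7x" the price for 10x the rate, and 4x the price for 5x the rate) rather than any actual dollar figures, and the function name is purely illustrative.

```python
from fractions import Fraction


def relative_cost_per_request(price_multiple: int, rate_multiple: int) -> Fraction:
    """Per-request cost on the larger plan, relative to the smaller plan."""
    return Fraction(price_multiple, rate_multiple)


# Multiples taken from the text; "about 7x" is treated as exactly 7 here.
ten_x_rate = relative_cost_per_request(price_multiple=7, rate_multiple=10)
five_x_rate = relative_cost_per_request(price_multiple=4, rate_multiple=5)

print(f"10x the rate at ~7x the price: {float(ten_x_rate):.0%} of the per-request cost")
print(f"5x the rate at 4x the price: {float(five_x_rate):.0%} of the per-request cost")
```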

To get to this outcome, we trawled back through heaps of data ranging from the high-level aggregated stats in the earlier chart to the nature of the organisations buying multiple keys (which we can obviously determine based on the email address used). I also chatted with a bunch of API users both during this process and over the preceding years and have a pretty good sense of the use cases. A few trends became immediately clear:

Firstly, use cases that are genuinely personal have a very low rate limit requirement. Checking your own address(es) or those of your family by a custom app, for example. Or one of my favourite uses (and one I definitely use), the Home Assistant integration:

On an ongoing basis, HA makes 1 request every 15 minutes. That's all. Each time we looked at genuine personal use cases, 10RPM was plenty.

Next, we found a bunch of use cases used within internal corporate environments, for example to monitor staff exposure in breaches. Now we're talking larger numbers of requests, but it's also something that's way more efficiently done via the existing domain search feature on the website. It's an on-demand, self-service and totally free feature that's been there for years. I know it's not API-based and there are good reasons for that (see the comment from me on that idea), but there's also the Enterprise route if API access is really that important (more on that later). Other examples included things like scanning customer emails to assess exposure at points where, for example, account takeover was a risk. In each of these cases, we're primarily talking about business entities using the service and I'm comfortable with commercial ventures wearing a greater cost.

And finally, there were the "heavy hitters", the ones with large volumes of keys. One such example using the API en masse provides security services to the big end of town and was funded to the tune of a figure that looks like a phone number. And again, I'm perfectly comfortable with them wearing a cost that's more commensurate to the value as opposed to a figure that was originally arrived at just to keep the bad guys out.

Existing Subscribers are Grandfathered in for 60 Days

Before I talk about the annual pricing, I want to make sure this headline is clear. Nothing changes for existing subscribers until the 6th of Jan next year, which is 60 days from today. On that date, the legacy rate limit of 1 request every 1,500ms will roll to the new 10RPM limit at exactly the same price. For that handful of big users for whom the 10RPM limit will be insufficient, you've got a couple of months to work out the best path forward. I'll be emailing every single active subscriber today to ensure everyone is notified well in advance (there's also an updated Terms of Use which requires a notification email to be sent).

What does this mean in practical terms? If you want annual billing or a higher rate limit, you can go and implement that whenever you're ready (more on that soon). Alternatively, if you just want to stick with 10RPM then you don't have to do anything; nothing will change. What I do strongly suggest though (and this hasn't changed, it's always been the guidance), is to make sure you're handling HTTP 429 responses gracefully. Regardless of what your rate limit is, if you're consuming the API in a fashion where you're not directly controlling the rate yourself, make sure you handle those responses appropriately.
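As a minimal sketch of what handling a 429 "gracefully" might look like, the helper below waits and retries instead of failing or hammering the API. Everything named here is an assumption for illustration: `send` stands in for whatever HTTP call your client makes and is expected to return a `(status, headers, body)` tuple, and the code assumes the 429 response may carry a standard `Retry-After` header expressed in seconds. Check the API documentation for the exact behaviour before relying on it.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, body): a deliberately minimal response shape
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], fallback: float) -> float:
    """Read a Retry-After header given in seconds; use the fallback otherwise."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                # Retry-After may also be an HTTP date; this sketch does not parse it.
                return fallback
    return fallback


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_wait: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns something other than HTTP 429."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts:
            # Prefer the server's hint; otherwise back off a little more each time.
            sleep(retry_after_seconds(headers, fallback=base_wait * attempt))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` in as a parameter keeps the helper easy to test without real delays. In production you would leave it at the default and wrap your actual request in `send`.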

Billing Can Now Occur Annually

This is the easy one to explain: annual payments are now a thing 😊 As I explained in the previous blog post, frequent payments of small amounts can play havoc with reimbursements in the corporate environment. It sucks, I've been there, but it is what it is. Annual billing alleviates that through a combination of a 12x reduction in the frequency of an expense claim and a larger single sum that's easier to explain to your procurement people than $3.50.

So, what do you charge for annual rather than monthly billing? My initial temptation was just to make it literally 12 times more because I don't have a lot of patience for spivvy marketing guff. However, there's a valid case to be made that a 12x reduction on individual payments warrants a discount as it removes overhead from our end (there's a constant percentage of all payments that are disputed or fail or cause other demands on our time), plus there's an argument to be made along the lines of customer loyalty warranting a discount. There's also just the very simple mathematics of the whole thing, best illustrated by a recent payment in Stripe:

That's 8.5% that disappears on every transaction, largely due to the 30c AUD charge no matter what the price of the transaction is:

The point is that there's merit for all in incentivising annual rather than monthly payments. We decided to look at what a typical annual discount was and time and time again, found the same thing:

Or in other words, a couple of months for free when you sign up for a year. In fact, coincidentally, that's exactly what I just signed up for with Nabu Casa (Home Assistant cloud) after receiving an email saying annual billing was now available 😊

It's never exactly 17%, rather it's like each example took 17% off 12 months' worth of a normal monthly fee then moved the number to something that looked pretty 🙂 Some examples were less (Pluralsight is 14%) and others were more (the higher tiers of Zendesk are 20%), but ultimately we decided to work to that 17% number and came up with the following:

In keeping with the "pay for 10, get 12" theme, these prices are exactly 10 times the monthly ones. Easy peasy.
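The "pay for 10, get 12" arithmetic can be sketched as follows. The only price used is the $3.50 monthly entry-level figure from the text; the function names are illustrative, and the other tiers are not reproduced here.

```python
from decimal import Decimal

MONTHS_PER_YEAR = 12


def annual_price(monthly: Decimal, months_charged: int = 10) -> Decimal:
    """Annual price when only `months_charged` of the 12 months are billed."""
    return monthly * months_charged


def annual_discount(months_charged: int = 10) -> Decimal:
    """Fraction saved versus paying the monthly price twelve times."""
    return 1 - Decimal(months_charged) / Decimal(MONTHS_PER_YEAR)


monthly = Decimal("3.50")
yearly = annual_price(monthly)
saving = monthly * MONTHS_PER_YEAR - yearly
discount_pct = round(annual_discount() * 100, 1)

print(f"Monthly ${monthly} x 12 = ${monthly * MONTHS_PER_YEAR}")
print(f"Annual plan = ${yearly} (saves ${saving}, about {discount_pct}%)")
```

Ten months out of twelve is a discount of roughly 16.7%, which is where the rounded 17% figure comes from.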

Stripe Customer Portal Magic Makes Changing Plans Easy

As I mentioned in the "big changes ahead" blog post, I've been deleting code like crazy in favour of deferring more processing back to Stripe themselves. By using their Customer Portal paradigm, it's now easy to change an existing plan:

The change can be to a different rate limit or to a different renewal cadence:

Stripe automatically prorates everything too so whilst you can upgrade immediately to a higher RPM or from monthly to annually, you'll only pay for the difference between the previous plan and the new one. Or, you can downgrade and on next renewal the lower plan will be automatically applied. It's super simple and it's all self-service.

Enterprise

For more than 7 years now, a small handful of organisations have used HIBP in a larger scale commercial fashion. Some of them you're familiar with, for example both 1Password and Mozilla do email address searches using k-Anonymity and that's not something that's a self-service "put your card into Stripe" sort of model (in part because k-Anonymity returns a huge number of results for each search). Infosec firms use Enterprise to support customers via domain level API searches. Identity theft companies use it to advise customers when they're exposed in a breach. One firm even uses it to help detect bot signups; it turns out that so many of us are so pwned, if someone signs up for their service and they're not pwned, that's a little bit suspicious (that's just one of many indicators they use).

This is a fundamentally different model, one that involves a close working relationship, lots of legal documents, procurement people, invoicing instead of credit cards and all sorts of other "Enterprisey" things. That still exists and nothing in today's blog post changes that. I mention this now in today's post simply because some of the folks from those organisations with Enterprise subscriptions will read this post and wonder where they sit. Likewise, I suspect those "100+ key" subscribers of the public API really should be on Enterprise and I'll be reaching out to them separately given the rate limit change will have a bigger impact on them than most.

In Closing

For the vast majority of users who are only at a fraction of the old rate limit, nothing changes other than there now being a key available for 17% less than before on an annual subscription. Meanwhile, for the folks battling corporate bureaucracy around small, frequent payments, this will sort you out and give you choices around rate limits you didn't have before.

There will be some people that fall between the cracks of the use cases outlined above and won't be happy with the changes. I expect that - I know it will happen - but I hope the rationale outlined here demonstrates the volume of thought and consideration that has gone into trying to find the sweet spot for pricing and rate limits. I also expect people will ask about adding other rate limits, for example to fill the gaps between say, 100RPM and 500RPM. We started out with more options, but a combination of that creating the whole paradox of choice problem and deeper analysis of how the API was actually being used led us to simplifying things. But who knows over the longer term, feedback is certainly welcome.

Lastly, if you're watching closely, you'll notice a lot more structure going in around the way HIBP is run. Last week I wrote about rolling out Zendesk for support so there's now a formal ticketing system in place. I also explained how Charlotte is playing a very active role in the management of HIBP and in the coming months, you'll see more around other initiatives to make the project more sustainable. I'm thinking of it like this: what must HIBP do to be sustainable in a post-Troy world? Or in other words, how can we get what has increasingly become an essential service for so many to be more robust and more self-sustaining beyond what one person can do as a sole operator devoting spare time to a passion project?

Stay tuned, there's much more to come 🙂

Hi, I'm Troy Hunt, I write this blog, create courses for Pluralsight and am a Microsoft Regional Director and MVP who travels the world speaking at events and training technology professionals