New Changes to Pull Rate Limits Coming to the Docker Hub
Why developers use Docker — a bit of history
Docker has now been around for about seven years, and as most of you will already know, it is incredibly easy to use: install a Docker client and you are ready to run your container, with no logins or subscriptions required. This was made possible by the generosity of Docker, Inc., who consistently provided container images free of charge and without any form of registration. You could, of course, also register and even purchase additional services, but for the majority of development environments, free access was more than sufficient: you could simply select an official Docker image and download it from the Docker Hub. This was always particularly helpful for startups and SMEs, who could keep running costs down during the development stages of their projects.
But alas, 2020 is the year of crushed dreams and dashed expectations, and as always with these sorts of things, nothing in life is ever free. Unsurprisingly, therefore, as Docker's new leadership settles in, the company has started to change its business model and move closer towards monetisation.
The onus on us, therefore, is to adapt.
What and when is going to change at the Docker Hub?
As of the 4th of November 2020, the Docker Hub still provides 2,500 pulls every 6 hours for both free and anonymous users, but according to the company's official statement, this allowance will gradually be reduced over the coming weeks and months. Very soon, you can expect limits of 100 pulls every 6 hours for anonymous users and 200 pulls every 6 hours if you log in with a free account: a staggering reduction by any measure.
What do the upcoming Docker Hub rate limits mean from a business perspective?
These limits may block your CI/CD pipelines and production environments, in particular Kubernetes clusters, until the next 6-hour window begins, and it is important to recognise that this could happen during deployment or at any other critical stage of development. To put it another way, your development process may soon be handed a most unwelcome technical blocker, one which could not only leave important fixes undeliverable but also derail your marketing plans completely.
Officially, pulls recorded by the Docker Hub are counted per unique IP address. This means that your entire local network, sitting behind a single public IP address, contributes to the same limit counter: the cap applies to the network as a whole, not to each teammate or even each machine. One way to somewhat improve the situation is for each team member to register for a free account, but this only helps with local pulls made by individual developers. Like it or not, CI/CD is a single, shared system for the entire project or organisation, and it will often use a single account as well.
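If you would like to know how close you are to the limit, the registry itself will tell you: the Docker Hub returns `ratelimit-limit` and `ratelimit-remaining` headers on manifest requests. A minimal sketch using `curl` against the special `ratelimitpreview/test` image, following Docker's documented procedure (per that documentation, a HEAD request should not itself consume a pull):

```shell
#!/bin/sh
# Obtain an anonymous pull token for the rate-limit preview image.
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" \
  | grep -o '"token":"[^"]*"' | cut -d'"' -f4)

# A HEAD request on the manifest returns the current rate-limit headers,
# e.g. "ratelimit-limit: 100;w=21600" (100 pulls per 21600 s = 6 h).
curl -s --head -H "Authorization: Bearer $TOKEN" \
  "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" \
  | grep -i '^ratelimit'
```

Run this from the network your CI/CD uses; because the counter is per IP, the numbers you see are the ones your whole team shares.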
How can I avoid issues with the new Docker Hub rate limits or at least minimise the chance thereof?
So with that in mind, let’s talk about your options from a production environment and CI/CD perspective:
Do nothing
If your number of CI/CD runs per 6-hour window is negligible, and you are also fairly certain that you will never be dealing with spikes, then the newly imposed 100-pull limit will likely be sufficient for you, and you are free to keep everything as it is.
Register for a free account
If, on the other hand, merely doubling the above limit would give your business enough room for manoeuvre, then you should register for a free account and configure your CI/CD to use it. Bear in mind that in this case your pull capacity will be capped at 200 pulls every 6 hours.
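In practice, this means authenticating before any image is pulled in the pipeline. A sketch of the login step (the secret names `DOCKER_HUB_USER` and `DOCKER_HUB_TOKEN` are placeholders for variables you would configure in your own CI system):

```shell
#!/bin/sh
# Authenticate against the Docker Hub at the start of the CI job so that
# subsequent pulls count against the account's 200-pull window rather
# than the anonymous 100-pull one. Prefer an access token generated in
# Docker Hub account settings over your real password.
echo "$DOCKER_HUB_TOKEN" | docker login --username "$DOCKER_HUB_USER" --password-stdin
```

Most CI services (GitLab CI, GitHub Actions, CircleCI and so on) provide masked secret variables precisely for credentials like these.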
Register and subscribe to a paid Docker Hub plan $$
The other obvious solution is to purchase a Pro plan for your CI/CD environment and configure it for data transfers with hub.docker.com. This gives you unlimited pulls and ensures your CI/CD pipelines never hamper your business expansion.
Optimise your processes — how to stay within pull rate limits
More often than not, a burst of pulls from the Docker Hub within a short space of time is an exceptional event. It usually happens at the very beginning of a project, when the project direction changes, or when programmers start developing a brand new feature. In the vast majority of cases, we stick firmly to the same Docker images. This naturally leads to thoughts of caching: the CI/CD environment could fetch data from the Docker Hub once and then simply reuse it. As a result, not only would you minimise your chances of hitting Docker's new, more stringent pull limits, but you may also improve your pipelines' performance.
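One straightforward way to implement such a cache is Docker's own `registry` image, which can run as a pull-through cache (a mirror) in front of the Docker Hub. A minimal sketch, assuming a single host running both the mirror and the Docker daemon (the port and container name are arbitrary choices):

```shell
#!/bin/sh
# Start the official registry image as a pull-through cache for the
# Docker Hub: each image is fetched upstream once, then served locally.
docker run -d --name hub-mirror -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2

# Then point the Docker daemon at the mirror by adding the following to
# /etc/docker/daemon.json and restarting the daemon:
#
#   { "registry-mirrors": ["http://localhost:5000"] }
```

With this in place, repeated pulls of the same Hub image across pipeline runs are served from the mirror and only the first fetch counts against the upstream limit.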
For enterprises or large teams and projects
In the case of larger projects and companies, the situation may prove significantly different. After all, many enterprises will already have efficient caching mechanisms in place. These will likely take the form of locally hosted packaging systems that are able to mirror official repositories. Beyond offering great benefits such as build performance optimisation and mitigating issues with official repositories, these solutions gain favour with larger businesses due to their security and monitoring policies.
Even better, however, there are products that can help with mirroring and with hosting artefacts in general: so be sure to have a look at Artifactory and Nexus, as they will likely prove exceedingly useful.
For smaller teams and projects
If your projects or development teams are on the smaller side, it may be enough to simply utilise Docker's native build cache. So long as there are no changes to the build context, every image layer downloaded or created during a build can be stored locally and reused. For this to work, however, you will have to avoid the “dind” (Docker-in-Docker) approach and apply best practices to your Dockerfiles so that layers can be reused wherever possible. Not only can you reuse images from hub.docker.com, but you can also reuse the layers containing your project dependencies, so you won't have to download them on every single pipeline run. Your pipelines should become noticeably faster as a result.
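The single most effective Dockerfile habit here is ordering instructions from least to most frequently changed, so that the expensive dependency layers survive ordinary source-code edits. A sketch for a hypothetical Node.js project (the file names, base image and registry name are all illustrative):

```shell
#!/bin/sh
# Write a Dockerfile that copies the dependency manifests first, so the
# costly "npm ci" layer is rebuilt only when dependencies change, not on
# every source edit.
cat > Dockerfile <<'EOF'
FROM node:14-alpine
WORKDIR /app
# Dependency layer: invalidated only when package files change.
COPY package.json package-lock.json ./
RUN npm ci
# Source layer: invalidated on every code change, but cheap to rebuild.
COPY . .
CMD ["node", "index.js"]
EOF

# In CI, where the local layer cache may be empty between runs, a
# previously pushed image can seed the cache:
docker pull myregistry/myapp:latest || true
docker build --cache-from myregistry/myapp:latest -t myregistry/myapp:latest .
```

The `--cache-from` flag lets the builder reuse layers from the pulled image, which is a common pattern for stateless CI runners.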
Migrating away from hub.docker.com
Finally, if your environments are already based on cloud solutions, particularly Google's Kubernetes Engine (GKE) or Amazon's Elastic Kubernetes Service (EKS), and your cloud provider already includes a Docker image registry in what you pay for, then it may be worth migrating your images to that provider entirely, thus reducing your overall dependence on Docker, Inc. Bear in mind that hub.docker.com is also expected to impose further limits, such as an image-retention policy for inactive images, although enforcement of that policy has been postponed to mid-2021.
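Migrating an individual image is mostly a matter of retagging and pushing. A sketch assuming Google Container Registry and a project named `my-project` (both placeholders; ECR and other registries work the same way with their own login commands):

```shell
#!/bin/sh
# Configure the Docker client to authenticate against GCR. This assumes
# the gcloud CLI is installed and already logged in; other registries
# provide equivalent login commands.
gcloud auth configure-docker

# Copy an image from the Docker Hub into your own registry.
docker pull nginx:1.19
docker tag nginx:1.19 gcr.io/my-project/nginx:1.19
docker push gcr.io/my-project/nginx:1.19

# From here on, Kubernetes manifests can reference
# gcr.io/my-project/nginx:1.19 instead of the Docker Hub image.
```

Once all images your deployments reference live in the provider's registry, the Docker Hub limits no longer affect your cluster's ability to schedule pods.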
Should you be worried about changes to the free Docker plans?
Considering everything we know so far, it is clear that many software projects will be forced to reevaluate their strategies for setting up their environments. Nonetheless, the good news here is that Docker have announced a new phased approach to this transition period that should — at least in theory — leave us with plenty of time to reinvent our environments.
For large-scale projects, however, the sheer magnitude of change may quickly turn into a significant challenge which not only requires extra development and DevOps capacity, but could also use a heavy dose of creative solutions.
Just keep in mind that other flagship providers are also aiming to support their clients. Amazon, for example, have recently announced a new public container registry that will allow developers to deploy container images publicly.
When it comes to these challenges, SPG will always apply the best practices available to your projects, and will aim to go even further if time and budget allow. You may be asking yourself whether to take the Docker-in-Docker approach, how to make your Docker image builds as stable and robust as possible, or whether your layer caching is doing as much as it can; the answers will vary considerably depending on the tools you use, your development environment and your own development process.
Thankfully, however, with many years of experience in Docker usage and configuration, and a full arsenal of development tools (including docker-compose, Kubernetes, Helm, and popular services such as GitLab CI, Travis CI, CircleCI, GitHub Actions, TeamCity and others), Software Planet Group would be happy to assist you with your project audit and optimisation, and help you achieve the results you need.