4 tools to prevent leaks in public code repositories

Feature
Nov 10, 2021 · 6 mins
DevSecOps, Threat and Vulnerability Management

Use these tools to find your company's exposed secrets in repositories such as GitLab, GitHub, or Google Cloud Build before attackers do.

Secrets stored in Git repositories have been a thorn in the side of developers and a go-to source for attackers for a long time. Ensuring that sensitive information is stored appropriately and scrubbed from repositories has become a necessity to reduce the likelihood of software being compromised, often in very public ways. While this seems obvious, it’s easy to overlook hardcoded connection strings, passwords, and even plaintext credentials stored by the development tool itself. Visual Studio, for instance, can store SQL connection credentials in plaintext unless told otherwise.

In 2020 alone, GitGuardian detected over 2 million secrets in public repositories. It has been widely hypothesized that a leaked credential produced by an intern played a part in the execution of the SolarWinds attack. With high-profile cases like this, it’s worth taking a minute to evaluate whether your own projects could be exposed in this way.

The trick is finding the secrets to begin with. They are often tucked away in code or obscure XML files and encoded in ways that make them difficult to find. Manually scrubbing code is error prone and likely to result in oversights. Unfortunately, since Git, like other source control systems, retains previous commits, cleaning up an exposed secret goes beyond merely deleting the secret from the code and recommitting. It needs to be purged from the history, which can sometimes mean starting over. Because of this, it’s important to get things right, and to get them right early in the process.
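To see why deleting a secret and recommitting is not enough, you can ask Git which commits ever added or removed a given string. The following is a minimal sketch that wraps Git's pickaxe search (`git log -S`). The helper names `pickaxe_command`, `parse_log`, and `commits_touching` are illustrative assumptions, not part of any tool discussed here.

```python
import subprocess


def pickaxe_command(secret: str) -> list[str]:
    """Build a `git log` call listing commits, on any ref, whose diff
    changes the number of occurrences of `secret` (Git's -S search)."""
    return ["git", "log", "--all", "--format=%h %s", f"-S{secret}"]


def parse_log(output: str) -> list[tuple[str, str]]:
    """Split '<short hash> <subject>' lines into (hash, subject) pairs."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        short_hash, _, subject = line.partition(" ")
        commits.append((short_hash, subject))
    return commits


def commits_touching(secret: str, repo_path: str = ".") -> list[tuple[str, str]]:
    """Run the search inside `repo_path`; requires git on the PATH."""
    result = subprocess.run(
        pickaxe_command(secret),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_log(result.stdout)
```

Any commit this returns still carries the string in the repository's history, even when the current working tree is clean. Note that passing a real secret on the command line can expose it to other users on the same machine, so this is best treated as a one-off check.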

Fortunately, several tools are available to help deal with this sort of issue. While most are command-line tools, some are web-based. All share similar functionality but achieve the result in slightly different ways. The major pieces of information that they look for include usernames, passwords, private keys, and other potentially sensitive information.

When considering which one to use, it’s important to evaluate your own technical abilities, time available to learn a new tool, whether you need custom detections, and budget.

Gitleaks

GitLeaks can be installed using Homebrew, Docker, or Go, so it is available on multiple platforms. Once it is installed, you can define rules and run the command-line tool to scan your Git repository. Rules are written in TOML (Tom’s Obvious Minimal Language), which looks something like a cross between JSON and Windows INI files. Each rule is a regular expression, so it helps to be fluent in those.

Fortunately, once you have learned regular expressions, there is really no limit to the types of patterns that you can match against. GitLeaks also provides quite a few sample configurations that you can use if you need to refresh your knowledge of regular expressions or need a good starting place for building your own. In addition, a default configuration will catch most of the common secrets that you don’t want bad guys or dishonest employees to find.
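To make the rule idea concrete, here is a rough sketch of how a regex-based rule set can flag lines in a file. It is written in Python rather than as a real GitLeaks configuration. The two rules, the `RULES` structure, and the `scan_text` helper are illustrative assumptions, and Python's regex dialect differs in places from the one GitLeaks uses.

```python
import re

# A GitLeaks rule in TOML looks roughly like this (field names can differ
# between GitLeaks versions; this rule is illustrative, not a shipped one):
#
#   [[rules]]
#   description = "AWS access key ID"
#   regex = '''AKIA[0-9A-Z]{16}'''
#
# The same idea expressed as plain Python data:
RULES = [
    {"description": "AWS access key ID", "regex": r"AKIA[0-9A-Z]{16}"},
    {
        "description": "Hardcoded password assignment",
        "regex": r"(?i)password\s*[:=]\s*['\"][^'\"]{4,}['\"]",
    },
]


def scan_text(text: str, rules=RULES) -> list[tuple[int, str]]:
    """Return (line_number, rule_description) for every rule that matches."""
    compiled = [(rule["description"], re.compile(rule["regex"])) for rule in rules]
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for description, pattern in compiled:
            if pattern.search(line):
                findings.append((line_number, description))
    return findings


sample = 'region = "us-east-1"\nkey = "AKIAIOSFODNN7EXAMPLE"\nPassword = "s3cr3t!"\n'
print(scan_text(sample))
# prints [(2, 'AWS access key ID'), (3, 'Hardcoded password assignment')]
```

Each new kind of secret becomes one more entry in the rule list, which is why fluency in regular expressions matters more than the configuration syntax itself.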

After running the tool, you’ll get a list of issues and a return code that you can use to automate your scans. GitLeaks is a powerful tool that requires you to know what you are doing, but it’s lightweight enough to fit into anybody’s development pipeline.
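A pipeline step can act on that return code with very little glue. The sketch below assumes the scanner exits with 0 when nothing is found and with a non-zero code otherwise. The `GITLEAKS_COMMAND` invocation is an assumption based on GitLeaks 8.x and may differ for your installed version, and the function names are hypothetical.

```python
import subprocess
import sys

# Assumed invocation for GitLeaks 8.x; other releases use different
# subcommands and flags, so check `gitleaks --help` for your version.
GITLEAKS_COMMAND = ["gitleaks", "detect", "--source", "."]


def scan_passes(command: list[str]) -> bool:
    """Run a scanner and report whether it exited with code 0.

    A non-zero code may mean findings or a tool error; either way the
    pipeline should stop and a person should look at the output.
    """
    completed = subprocess.run(command)
    return completed.returncode == 0


def enforce(command: list[str]) -> None:
    """Abort the current process with a non-zero status if the scan fails."""
    if not scan_passes(command):
        sys.exit("Secret scan failed: review the findings before pushing.")
```

Calling `enforce(GITLEAKS_COMMAND)` from a pre-push hook or CI job would then stop the build whenever the scanner reports a problem.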

GittyLeaks

GittyLeaks has a similar name to GitLeaks but is a very different tool. It’s written in Python and is somewhat of a one-trick pony. After installing it with pip, you can execute it from the folder into which your Git repository is cloned. It will look for words like username, password, and email that may have been overlooked. It’s still a work in progress and lacks a solid mechanism for customizing the search patterns or handling remediation, but it does the job of finding secrets and doesn’t require learning a special rule syntax as GitLeaks does.
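The keyword approach can be sketched in a few lines of Python. This is not GittyLeaks' actual implementation, only an illustration of the general technique: the `KEYWORDS` tuple, the `ASSIGNMENT` pattern, and both function names are assumptions made for the example.

```python
import re
from pathlib import Path

KEYWORDS = ("username", "password", "email")

# A keyword, an optional closing quote (for JSON-style keys), then ":" or "="
# followed by a non-empty value.
ASSIGNMENT = re.compile(
    r"(" + "|".join(KEYWORDS) + r")['\"]?\s*[:=]\s*\S+",
    re.IGNORECASE,
)


def find_suspects(text: str) -> list[tuple[int, str]]:
    """Return (line_number, keyword) for lines that assign a value to a keyword."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        match = ASSIGNMENT.search(line)
        if match:
            hits.append((line_number, match.group(1).lower()))
    return hits


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every readable text file under `root`, skipping the .git folder."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        hits = find_suspects(text)
        if hits:
            results[str(path)] = hits
    return results
```

A fixed keyword list like this is easy to run but produces false positives and misses anything named differently, which is the trade-off against a rule-based scanner.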

The one tricky piece of any Python tool is ensuring that you are running both the right version of Python and the right version of pip. As long as you pay close attention to both, you should have no issues running the tool and getting good results.

SpectralOps

SpectralOps from Spectral is a paid solution. While Spectral does not publicly provide pricing, you can sign up for a demo on its website to inquire about it. The benefits of SpectralOps are many. It integrates with multiple data sources beyond just Git. It can interface with GitHub, GitLab, NPM, Google Cloud Build, and more. Much like the free options, it is a command-line tool that scans your code before it goes to the cloud.

What sets SpectralOps apart is that it can easily integrate with several continuous integration (CI) systems, which are tools that automatically build and test code changes. SpectralOps uses specially crafted YAML files called detectors to pick up on secrets defined in your code. While Spectral is constantly expanding this library, you are free to write your own.

SpectralOps is clearly geared toward a larger development team with a budget and has a lot of support behind it. If you need a step above free tools and have privacy concerns, then SpectralOps is worth a look.

GitGuardian

GitGuardian is a fully web-based solution that continually scans your repositories for secrets. It’s almost entirely automated and functions much like an endpoint protection product for your code. It connects directly to GitHub, Bitbucket, GitLab or your internal repositories and continually monitors them for secrets. Any secrets discovered can be dealt with, tracked, and remediated from within the app.

While you will still need to make manual edits to your Git repository, GitGuardian makes sure that you follow all the necessary steps to ensure proper secret removal. GitGuardian is free for private repositories and small teams up to 25 developers. Anything over that starts at $434 per month and goes up from there.

The one piece that some may take issue with is the fact that it does need to directly connect to your repository. If you have certain privacy policies or other concerns that might prohibit this, it may make more sense to look for an offline tool. That said, if you don’t like to fuss with a command line and prefer a visual approach, GitGuardian is the way to go.

No matter your budget or technical skill level, there is a tool available to help you ensure that mission-critical secrets are not leaked to the world through your source control system. By taking some time to adopt one of these applications into your development process, you can take a large step toward making your organization and your user base feel less at risk and toward avoiding the notoriety of front-page headlines.

daniel_brame

I'm a highly caffeinated father, husband, hacker, developer, and fintech enthusiast who can nerd out about basically anything.