Of secrets and histories

Posted by Jesse Portnoy on February 04, 2025 · 1 mins read

An all-too-common and dangerous mistake I encounter is source repos that either have secrets (sensitive data) committed to them or have had the secrets removed from the files but not purged from the commit history. The latter is even more common, but it’s no less serious: fishing secrets out of the history is very easy, and it’s the first thing I’d look for if I were an attacker.

So, how do we avoid this potentially very expensive mistake?

  • Never commit secrets to any repo. You might think that if the repo is private, this isn’t a problem; if so, you’d be wrong:) Firstly, it may be private today, but tomorrow? Who knows… (as an avid FOSS supporter, I certainly want all code to be open, and there are plenty of examples of commercial entities that opted to do just that with their previously closed/proprietary repos).

Secondly, even if the repo will forever remain private, that’s not to say the source will not end up in the hands of less-than-benevolent people. So, what to do to protect our secrets?

  • Consider using a secret management system (there are loads, FOSS and otherwise; run a search, as you may already be using a platform that offers this particular service)
  • Whichever way you choose to store your secrets, make sure that any files containing them are listed in the .gitignore file (if you use a different source control system, find its counterpart; they all have one)
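To make the "keep secrets out of the repo" idea concrete, here is a minimal Python sketch that reads a secret from the environment instead of from a committed file. The names `get_secret`, `MissingSecretError` and `DB_PASSWORD` are made up for illustration; your secret manager or platform will likely have its own way of injecting values.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name: str) -> str:
    """Return the secret stored in the environment variable `name`.

    Failing loudly is deliberate: a fallback default would have to be
    written in the source, which is exactly what we are trying to avoid.
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not set")
    return value
```

A caller would then do something like `password = get_secret("DB_PASSWORD")`, with the actual value supplied by the deployment environment or by a local file that is listed in .gitignore.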

The above actions, good as they are, require all humans involved to adhere to these policies and, as you surely know, people often don’t. Luckily, you can make them:) Gitleaks is a project that aims to do just that. Its README includes a full GH Action example that runs the tool (similar hooks for other platforms are available, though writing your own is also easy enough), as well as a pre-commit one. You do want both: it’s far better to catch these things before you commit, and the Action acts as a safety net for anyone who skipped the hook. Make sure your README includes a note about how to set up the pre-commit hooks because, once these things get into the commit log, purging them is, well, involved and annoying:)
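If you are curious what such a check boils down to, here is a toy Python sketch of a pattern-based scanner. It is NOT how Gitleaks is implemented and the three rules below are my own illustrative picks (real tools ship far larger and smarter rule sets), so treat it as a way to understand the idea rather than something to rely on.

```python
import re

# Illustrative rules only; a real scanner has many more and tunes them carefully.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded-password": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]{4,}['\"]"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings
```

A pre-commit hook built on this idea would run the check over the staged changes and exit with a non-zero status when `find_secrets` returns anything, which is what blocks the commit.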

Hopefully, you are now convinced that you should set these hooks up on any repo that includes (or may in the future include) sensitive data, but what about all your existing repos and their commit histories? As a first step, use gitleaks to detect compromised repos. Next, if you use Git (note that Git is NOT shorthand for GitHub; in 2025, there’s a high chance you are using Git, no matter what platform you host your code on), take a look at git-filter-repo, which is designed to assist with this purging task. You may also want to read “Removing sensitive data from a repository”. If you’re not using Git, check your source control documentation for its counterpart purging option.
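As a rough sketch of the purging step: git-filter-repo has a `--replace-text` option that takes a file of expressions to scrub from history. The Python helper below (a hypothetical `build_replacements`, not part of any tool) generates such a file, assuming the one-expression-per-line `literal:SECRET==>REPLACEMENT` format as I understand it from the tool's manual; do check the manual before running it on a real repo.

```python
def build_replacements(secrets: list[str], placeholder: str = "***REMOVED***") -> str:
    """Build the contents of an expressions file for `git filter-repo --replace-text`.

    Each secret becomes one `literal:` line, so it is matched verbatim
    rather than being interpreted as a regex or glob.
    """
    lines = []
    for secret in secrets:
        # The file is line-based and uses "==>" as a separator, so values
        # containing either cannot be expressed as a simple literal line.
        if "\n" in secret or "==>" in secret:
            raise ValueError("secret contains a newline or '==>'")
        lines.append(f"literal:{secret}==>{placeholder}")
    return "\n".join(lines) + "\n"
```

You would write the result to a file kept outside the repo (it contains the secrets, after all), then run something like `git filter-repo --replace-text replacements.txt` on a fresh clone. Either way, treat any secret that was ever committed as compromised and rotate it; rewriting history does not un-leak it.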

Don’t be that person who could have prevented a crucial and embarrassing breach, but didn’t:)