Last night, developer and privacy activist Resynth1943 announced that GitHub’s source code had been leaked on GitHub itself, in GitHub’s own DMCA repository. There is a lot to unpack here, but first things first: this isn’t as big a deal as it might sound.
GitHub Enterprise Server != GitHub.com
Shortly after Resynth1943, who appears to have broken the news and described the code as “just leaked” by an unknown individual, re-shared the announcement on Hacker News, GitHub CEO Nat Friedman showed up on Hacker News to provide some context.
According to Friedman, the upload in question was actually GitHub Enterprise Server, not the GitHub website itself. While the two share a substantial amount of code, the distinction is significant. Part of that significance is that GitHub itself was not actually hacked.
Although neither GitHub nor GitHub Enterprise Server is open source, the source code of GitHub Enterprise Server is routinely delivered to customers, albeit usually in a minified and obfuscated format. According to Friedman, GitHub inadvertently shipped some customers a complete, unobfuscated version of the GHES tarball several months ago; this is the code that was dumped into GitHub’s public DMCA repository.
A DMCA-related axe to grind
It seems the “unknown individual” Resynth1943 referred to uploaded the leaked source code largely out of anger over the recent youtube-dl takedown.
The code itself was placed in GitHub’s DMCA repository, which serves as a record of DMCA takedown requests, published as GitHub receives them, similar to the Chilling Effects notices you may have seen in Google search results for many years.
What is this?
Inspired by Lumen (formerly Chilling Effects) and Google, this repository contains the DMCA takedown notices and counter notices we have received here at GitHub. We publish them as they are received, with only personally identifiable information redacted.
Resynth1943’s announcement also criticized Microsoft as hypocritical for not deliberately open-sourcing GitHub, while suggesting that GitHub would perhaps be less secure now that its code had leaked.
How do you fake a commit?
The commit itself was labeled as having been made by user Nat, aka Nat Friedman, the current CEO of GitHub. Just like the content of the commit, this is misleading: Git itself, the source code versioning system underlying GitHub, does not significantly protect against user impersonation. The commit in question is not labeled “verified,” meaning it was not signed with Friedman’s GPG key.
Git commits, just like email messages, allow users to put whatever information they want in the user.name and user.email fields. This makes forging that information trivial. Unless the commit is actually signed with the GPG key associated with that email address, there is no real verification that it came from where it says it did.
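To make that concrete, here is a minimal Python sketch of what the text of a commit object looks like. The header layout (tree, author, committer, blank line, message) follows Git's commit object format; the helper names build_commit and has_signature_header, the example.com address, and the zero timestamp are illustrative assumptions, not details from the incident.

```python
def build_commit(tree_id, name, email, message, timestamp=0, tz="+0000"):
    """Assemble the body of a Git commit object as bytes.

    Nothing here checks that `name` or `email` belong to the caller;
    Git stores whatever strings it is given.
    """
    ident = f"{name} <{email}> {timestamp} {tz}"
    lines = [
        f"tree {tree_id}",
        f"author {ident}",
        f"committer {ident}",
        "",
        message,
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


def has_signature_header(commit_body):
    """Report whether the commit carries a `gpgsig` header at all.

    This only detects the header's presence; it does not validate
    the signature against any key.
    """
    headers = commit_body.split(b"\n\n", 1)[0]
    return any(line.startswith(b"gpgsig ") for line in headers.split(b"\n"))


# The well-known ID of Git's empty tree, so the example needs no repository.
EMPTY_TREE = "4b825dc642cb6eb9a0ab5f2463ae2d1e5b6bbc27"

forged = build_commit(
    EMPTY_TREE, "Someone Else", "someone.else@example.com", "not really theirs"
)
print(forged.decode("utf-8"))
print("carries a signature header:", has_signature_header(forged))
```

The author line is ordinary text supplied by whoever creates the commit. Only a signature header, checked against a known key, ties the commit to a person, and this sketch deliberately stops short of that check.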
This raises the question of how a commit from some random user would show up in GitHub’s DMCA repository in the first place, but the answer there doesn’t involve any real account compromise either.
When you push a commit into a Git repository, you get a hash that identifies that commit and can be used to locate it in the tree. GitHub, the Web application that provides in-browser access to that underlying Git structure, keeps all forks of a Git repository in a single underlying repository, although it usually doesn’t expose them that way in the URL structure.
Use a fork, Luke
So, in order to create the illusion that GitHub CEO Nat Friedman made a commit to GitHub’s DMCA repository, the unknown individual first needs to fork the DMCA repository. After forking the repository, which creates a copy they have the privilege of committing to, the next step is to commit the leaked source, forging Friedman’s name and email address in the user.name and user.email fields.
This results in a forked repository containing the bogus commit. But it still won’t look convincing: after all, the URL will still point to the fork, and with it the attacker’s real GitHub account and username. Under the hood, however, both parent and fork are part of the same repository at the underlying Git level. This allows the attacker to construct a URL that makes the commit appear to have been made to the main repository, not the fork.
To complete the deception, the attacker started with https://github.com/github/dmca, then appended tree/$hash to the end, where $hash is the hash of the commit made to their own fork, and presto! The result is a URL that appears to show a commit, made by CEO Nat Friedman, to GitHub’s own DMCA repository.
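The URL construction described above can be sketched in a few lines of Python. The function name commit_view_url and the all-zero placeholder hash are hypothetical; the real hash from the incident is not reproduced here.

```python
def commit_view_url(repo_url, commit_hash):
    """Join a repository URL and a commit hash using the tree/$hash pattern."""
    return f"{repo_url.rstrip('/')}/tree/{commit_hash}"


PARENT_REPO = "https://github.com/github/dmca"
FORK_COMMIT = "0" * 40  # placeholder standing in for a commit made in a fork

print(commit_view_url(PARENT_REPO, FORK_COMMIT))
```

Nothing in the resulting URL names the fork or the account that created the commit, which is what makes the page look as though it belongs to the parent repository.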
GitHub was not “hacked,” but there is plenty of room for improvement
On the plus side, there are no real compromises here. The source code was given to customers freely, if accidentally, rather than taken from a compromised server. Similarly, Friedman did not lose control of his own account, and GitHub did not lose control of its DMCA repository. In Friedman’s words on Hacker News, “everything is fine, the situation is normal, the lark is on its wings, the snail is on the thorn, and all is well with the world.”
While all the shenanigans documented here are expected behavior (if you want verified identity, you should sign your commits with a GPG key), those expectations themselves are probably set much lower than they should be. Managing GPG keys is still difficult enough to be a significant hurdle for many developers. More importantly, GitHub does not put in place any controls to emphasize the presence, or absence, of such signatures.
We have seen a lot of suggestions floating around for tooltips, such as “this user usually signs their commits, and this one is not signed,” where appropriate. We also think it is past time to fix the issue that allows attackers to misrepresent which repository they have committed to, using the fork-and-manual-URL-construction technique we described above.
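One way such a tooltip rule could work is sketched below. This is purely an illustration of the suggestion, not anything GitHub is known to implement: the function name flag_unusual_unsigned, the input shape, and both thresholds are assumptions chosen for the example.

```python
from collections import defaultdict


def flag_unusual_unsigned(commits, min_history=5, min_signed_ratio=0.9):
    """Return IDs of unsigned commits from authors who usually sign.

    `commits` is an iterable of (commit_id, author_email, is_signed)
    tuples in chronological order. Both thresholds are arbitrary.
    """
    signed = defaultdict(int)
    total = defaultdict(int)
    flagged = []
    for commit_id, email, is_signed in commits:
        seen = total[email]
        # Only judge authors with enough prior history to have a habit.
        if (not is_signed and seen >= min_history
                and signed[email] / seen >= min_signed_ratio):
            flagged.append(commit_id)
        total[email] += 1
        signed[email] += int(is_signed)
    return flagged


history = [(f"c{i}", "dev@example.com", True) for i in range(6)]
history += [("c6", "dev@example.com", False), ("x0", "new@example.com", False)]
print(flag_unusual_unsigned(history))
```

In this example only c6 is flagged: it breaks an established signing habit, while the unsigned commit from the new address has no history to compare against.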
Finally, maybe it’s time to seriously discuss whether unsigned commits should be the default in the first place. We live in a world where even simple Web browsing is largely expected to be done using authentication and encryption, which makes the kind of casual spoofing seen here all the more surprising and worrying.