Secure Self-Review: Preventing Package Manager Worms
When the vulnerability
"npm fails to restrict the actions of malicious npm packages"
was announced, I felt the response was underwhelming. To be fair, it's
not a problem unique to npm. This is going to be an issue for any package
manager or non-centralized software distribution mechanism. And it's very
hard to address. Nevertheless, it's also very alarming!
Every programmer should
be reviewing their own code before checkin (many do this already, tho a
surprising number do not). However, to be an effective security measure,
the review should be conducted in an environment isolated from the development
environment. Few people do THAT. The proper workflow is:
- Make changes and test them in a dev environment.
- Switch to the checkin environment, review and commit the changes
there.
I would like to make the case for more programmers to use this kind of
workflow. And I'm offering textrecv, a (crude!) tool to help enable it.
It automates the unstated middle step of the above workflow:
- Changes are automatically, invisibly and safely duplicated into the
checkin environment.
Consider this tool just a stab in the right direction: it addresses
part of the problem and is not a definitive solution.
So far, I'm addressing only the git version of the npm vulnerability
which we started with.
textrecv doesn't help
you with full-fledged package managers such as npm, rubygems, debian,
etc.
Self-review depends on human judgement and is thus fallible. But if
widely adopted, this strategy would catch most worms quite quickly.
Self-review is just one piece of an overall
strategy. I'm not even sure what all the other pieces would be.
To help package managers, there ought to be a service that builds
and publishes packages from git projects, perhaps automatically when
a tag of a certain format is committed. Ideally this service could
make packages without running any code in the project being built.
(That is difficult to impossible with most existing package managers.)
An alternative, tho less attractive, is doing builds in a secure
sandbox.
But what I'm offering today is just this piece:
textsend and textrecv comprise a system for safely moving source code
from one system to another over a network.
textsend pushes source code from a 'dev' system where it is authored
and tested to a 'checkin' system running textrecv where it is lightly
sanitized. Then git (or another vcs) is used to publish it.
textsend and textrecv automate part of a workflow that helps deter
unauthorized changes to your projects. However, they are no substitute
for thinking. Self-review is still very important. You _must_
review changes before checking the code in. This is the opportunity to
find changes that you didn't make (however unlikely you might consider
that), along with all the other types of bugs and problems.
You were reviewing before checkin anyway, right? To make sure you
didn't leave in printfs or other debugging code, and to help you
compose the commit message, etc. All we're doing is moving the review
and check-in to a different environment. Don't think of these happening
in dev anymore.
As programmers, we need to take more seriously the responsibility we
have. We are creating software to run on other people's computers.
We need to take some reasonable steps to ensure the programs we create
are free of security holes. And we should be taking steps to make
sure we are not the inadvertent agents of malware propagation!
I want to promote a culture of greater review of software. Programmers
should conduct both self-review and peer review more often.
In order to use textsend/textrecv properly, you must have some other
system for achieving privilege isolation: the dev environment should
be in a separate jail, container, virtual machine or distinct physical
machine from the checkin environment. Proper setup to achieve this
isolation is well beyond the scope of this screed, but here are a
few specifics to note:
- Dev should not have any way to talk to checkin except to send
source code for review and commit.
- You should not be able to ssh from dev to checkin. (probably not
from dev to anywhere else either.)
- Dev should not know any secret keys, passwords, or other secret
credentials known to checkin. (Except for the secret gpg2 key used
to authenticate data sent from textsend to textrecv.)
- In particular, the git password and/or secret key used to push to
origin should not be present on dev.
- You must never run any of the code received by textrecv
in the checkin environment. Best practice, when possible, is
for the interpreter or compiler needed to run the
code simply not to be present on checkin.
- In particular, tests, build scripts and packaging scripts
should never be run in the checkin
environment.
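As a rough illustration of the "interpreter simply not present" advice
above, here is a minimal sketch of a check you might run once while
setting up the checkin environment. It is not part of textsend or
textrecv. The tool list and the function name tools_present are
assumptions made up for this example; adjust the list to the languages
your projects use. Python appears here only for illustration (and is
left out of its own list for that reason); if checkin has no Python,
the same idea works as a loop over the names in any shell.

```python
import shutil

# Hypothetical list of programs that could execute project code.
# python3 is omitted only because this sketch itself needs it.
TOOLS_THAT_RUN_CODE = ["node", "npm", "ruby", "gem", "perl", "gcc", "cc", "make"]


def tools_present(names=TOOLS_THAT_RUN_CODE):
    """Return the names from `names` that resolve to an executable on PATH."""
    return [name for name in names if shutil.which(name) is not None]


if __name__ == "__main__":
    found = tools_present()
    if found:
        print("warning: checkin could run project code via: " + ", ".join(found))
    else:
        print("ok: none of the listed tools are on PATH")
```

An empty result only means the listed names are not on PATH. It says
nothing about tools that are installed elsewhere or missing from the list.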
You can use git in the dev environment (and should, to merge upstream
changes and process pull requests). And you can edit files in the
checkin environment (tho better to stay with very safe edits, since
you can't test here). But you must test only on dev and
certainly never commit or push (or have keys that would allow it) from dev.
textrecv enforces the following restrictions on the data it receives:
These restrictions are intended to make it safe to work with the files
received by textrecv with ordinary shell tools without risk of system
compromise, unexpected behavior, or need to think hard about it.
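The specific restrictions are not spelled out in this text, so the
following is only a guess at the kind of check involved, not textrecv's
actual rules. It sketches two checks in the spirit of the filename and
escape-sequence concerns mentioned later: plain, relative filenames, and
file contents free of control characters. The policy details and the
names filename_is_safe and content_is_safe are assumptions for
illustration.

```python
import re

# Assumed policy: each path component starts with a letter, digit, "_" or "."
# and continues with a small set of boring characters. No spaces, no leading
# "-" (which shell tools could mistake for an option), no control characters.
SAFE_COMPONENT = re.compile(r"[A-Za-z0-9_.][A-Za-z0-9._+-]*")


def filename_is_safe(path):
    """Accept only relative paths made of plain components."""
    if not path or path.startswith("/"):
        return False
    for part in path.split("/"):
        if part in ("", ".", ".."):
            return False  # empty component or directory traversal
        if not SAFE_COMPONENT.fullmatch(part):
            return False
    return True


def content_is_safe(data):
    """Accept valid UTF-8 with no control characters except tab and newline."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    for ch in text:
        if ch in "\t\n":
            continue
        code = ord(ch)
        # Rejects ESC (0x1b) and the other C0/C1 controls, plus DEL.
        if code < 0x20 or code == 0x7F or 0x80 <= code <= 0x9F:
            return False
    return True
```

Under this assumed policy, "src/main.c" passes, while "../etc/passwd",
"-rf" and any file containing a terminal escape sequence are rejected.
Carriage returns are rejected too, which a real tool might prefer to
convert instead.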
In order to ensure data integrity, gpg2 is used to sign data sent from
the dev environment to the checkin environment. A gpg2 keypair needs to
be shared between the two environments.
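How textsend and textrecv actually invoke gpg2 is not shown here, so
the following is only a sketch of the general sign-then-verify idea
using a detached signature. The file names, the key id and the function
names are hypothetical. It also assumes the checkin keyring contains
only the shared key, since a plain verify accepts a good signature from
any key in the keyring.

```python
import subprocess

GPG = "gpg2"  # the name used in this text; some systems install it as "gpg"


def sign_command(bundle_path, signature_path, key_id):
    """Build the argv that writes a detached signature for bundle_path."""
    return [GPG, "--batch", "--yes", "--local-user", key_id,
            "--output", signature_path, "--detach-sign", bundle_path]


def verify_command(bundle_path, signature_path):
    """Build the argv that checks signature_path against bundle_path."""
    return [GPG, "--batch", "--verify", signature_path, bundle_path]


def sign(bundle_path, signature_path, key_id):
    """Run on dev: sign the bundle, raising if gpg2 reports failure."""
    subprocess.run(sign_command(bundle_path, signature_path, key_id), check=True)


def verify(bundle_path, signature_path):
    """Run on checkin: True only if gpg2 exits 0 for this signature."""
    result = subprocess.run(verify_command(bundle_path, signature_path))
    return result.returncode == 0
```

On dev this would be something like sign("bundle.tar", "bundle.tar.sig",
"KEYID") before sending, and on checkin verify("bundle.tar",
"bundle.tar.sig") before anything in the bundle is unpacked or read.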
Why use textrecv instead of the many other ways to move files over a
network? Other tools are more general-purpose (even git) and not
designed to avoid the perils of potentially malicious source code.
rsync, for instance, makes no promises about file integrity (unless
you use ssh as a transport). Scp or sftp move files with guaranteed
integrity, but require an ssh installation. It is _very_ tricky to
install ssh in such a way as to allow file movement only and prohibit
all general remote execution. At least, I never saw an easy and
foolproof way to set up scp-only mode. (Think you did it right? Did you
prevent write access to ~/.bashrc? How about to the user's crontab?
Or ~/.init/* on upstart-based systems (Ubuntu and derivatives)?)
Not that it can't be done, but it's way more difficult than it oughtta be.
No other tool prevents shenanigans with filenames or escape sequences
in file contents. (Tho git tries to avoid them in branch and ref names.)
And all of those other tools are rather complicated by comparison.
textsend and textrecv are (still fairly) small and easy to audit.
The name is terrible. I'm sorry, but I couldn't think of a better one.