In support of the Cyber Grand Challenge program, DARPA required an execution environment that not only served research but also facilitated rigorous, repeatable measurements and offered the highest levels of integrity. The intent was that the measurements published out of the program could provide a basis for years of research in automated cyber reasoning. Further, because the program was competitive and cybersecurity competitions have a preexisting culture that accepts attempts to undermine their integrity, there was a real need for a system that presented a very limited risk surface. The DARPA Experimental Cyber Research Evaluation Environment, or DECREE, was conceived to meet these needs.
DECREE is a novel environment specification consisting of a simplified set of system calls and a consistent, measurable interface to system resources (CPU, memory, etc.) for every user program. In this work, we explore many of the requirements and constraints that drove the design and development of DECREE, and we detail implementations of DECREE that have already been deployed in high-profile, public experiments. In particular, we detail the precise measures taken to ensure that two disparate implementations of the DECREE specification (Linux and FreeBSD) are indistinguishable from one another to user programs, while also pursuing the highest levels of determinism and measurement fidelity. For the rare situations in which the operating system does not provide enough capability, we describe the use of a custom hypervisor to address the deficiencies.
Originally published in BSD Magazine, Vol 12 Number 12, Issue 102, February 2018, ISSN 1898-9144.