Fri 09 December 2016
Continuing our Intel SGX series of articles, the following is a brief overview (this is the short version, honest) of the architectural features that enable an enclave to be launched and run in an untrusted environment.
Unless you have already read up on the SGX instructions and their associated architectural changes, your first question is probably 'how does the SGX hardware isolate an enclave from a compromised OS?' or some variation on that theme. The Intel SGX Explained white-paper from MIT is still the definitive guide to the SGX architecture from the ground up, so here we will just briefly touch on each feature that provides the isolation an enclave needs to run in an untrusted environment.
SGX-enabled CPUs can determine whether or not they are executing code from within an enclave by controlling entry to and exit from enclaves:
- Direct entry and exit from an enclave (via function calls and returns) can only be performed at well defined entry and exit points (using SGX instructions EENTER and EEXIT).
- Interrupts (from HW or SW or any kind of signal) that occur while a CPU is inside an enclave force an asynchronous enclave exit, so that the handler code is executed outside the enclave, followed by a re-entry into the enclave when the handler returns.
To protect the memory that an enclave uses (both code and data), the SGX hardware makes the following guarantees:
- When a CPU is outside any enclave, no matter what privilege level the software is at, a memory access using a virtual memory address that maps into an enclave results in an aborted access. An aborted write is thrown away, while an aborted read should return nonsense, but typically returns 'all-1s'.
- When a CPU is inside an enclave, a memory access using a virtual memory address that maps into a different enclave also results in an aborted access.
- Virtual memory addresses that map into an enclave cannot be remapped to point anywhere else.
- When a CPU is inside an enclave, function-local objects live on a stack that is itself mapped into the enclave.
- When a CPU is handling an interrupt from within an enclave, the CPU state at the point of interrupt is not available to the interrupt handler.
- DMA transactions to the physical memory holding the contents of an enclave are aborted.
To prevent these checks from being required on every memory access (which would be catastrophic for performance), most of the checks are added as extensions to the page table handling microcode, so that they run when a translation is loaded into the TLB; if any of the checks fail, the TLB entry maps to a special 'abort' page instead. This results in two additional requirements: a CPU's TLB must be flushed every time the CPU enters or leaves an enclave, and the architecture must maintain a map from physical page address to owning enclave id.
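As a rough illustration of the checks described above, here is a toy Python model of the decision made when a translation is loaded into the TLB. It is a sketch under stated assumptions, not a description of the real microcode: the names `resolve`, `PageOwner`, `ABORT_PAGE` and the `ranges` table are all hypothetical, and pages are represented by plain integers.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ABORT_PAGE = -1  # hypothetical stand-in for the special 'abort' page


@dataclass(frozen=True)
class PageOwner:
    enclave_id: int    # enclave that owns this physical page
    virtual_page: int  # the only virtual page it may be mapped at


def resolve(page_table: Dict[int, int],
            owners: Dict[int, PageOwner],
            ranges: Dict[int, range],
            current_enclave: Optional[int],
            virtual_page: int) -> int:
    """Return the physical page to cache in the TLB, or ABORT_PAGE.

    page_table      -- OS-controlled map: virtual page -> physical page
    owners          -- architecture-controlled map: physical page -> owner
    ranges          -- virtual page range claimed by each enclave
    current_enclave -- enclave the CPU is executing in, or None if outside
    """
    physical_page = page_table[virtual_page]
    owner = owners.get(physical_page)
    in_own_range = (current_enclave is not None
                    and virtual_page in ranges[current_enclave])
    if owner is None:
        # Ordinary memory is fine, unless an enclave address was remapped to it.
        return ABORT_PAGE if in_own_range else physical_page
    if owner.enclave_id != current_enclave:
        # Outside any enclave, or inside a different enclave.
        return ABORT_PAGE
    if owner.virtual_page != virtual_page:
        # Enclave page mapped at the wrong virtual address.
        return ABORT_PAGE
    return physical_page
```

Because the result is cached in the TLB, it is only valid for the enclave (or non-enclave) context it was computed in, which is why the model takes `current_enclave` as an input and why the real hardware has to flush the TLB on enclave entry and exit.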
To allow for the over-subscription of enclave pages, an enclave page can be copied out of the enclave so that the OS can swap the page out into long(er) term storage, then swapped back in and copied back into the enclave. However, the architecture treats the OS as an insecure channel, so as each enclave page is copied out it is encrypted for privacy, has a nonce added to uniquely identify the page eviction event, has source page metadata added so that it can be restored to the correct enclave at the correct virtual address, and finally has a MAC added so that any change the OS makes to the contents is detected when the page is copied back. To maintain page table integrity for enclave pages, the source metadata in an evicted enclave page contains a pointer to a copy of the nonce that has been securely stored in the enclave by the architecture. This allows the architecture to detect, at the point an evicted page is restored, a compromised OS attempting to corrupt the page table by replaying a page or swapping one page for another. Enclave authors are required to statically specify the initial number and relative layout of pages within an enclave; however, SGX2 allows an enclave to dynamically allocate more pages to itself.
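The eviction scheme can be sketched in a few lines of Python. This is a toy model only: `PageSwapper`, `EvictedPage` and the `slot` numbering are hypothetical names, HMAC-SHA256 stands in for whatever MAC the hardware uses, and the "cipher" is a throwaway SHA-256 keystream that must not be used for anything real. What it tries to show is how the nonce kept inside the enclave defeats replay and page swapping.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Dict


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + start.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


@dataclass(frozen=True)
class EvictedPage:
    ciphertext: bytes
    enclave_id: int    # source metadata: owning enclave
    virtual_page: int  # source metadata: virtual address to restore at
    slot: int          # source metadata: pointer to the stored nonce
    mac: bytes


class PageSwapper:
    def __init__(self) -> None:
        self._enc_key = os.urandom(32)
        self._mac_key = os.urandom(32)
        self._nonces: Dict[int, bytes] = {}  # models nonces kept in the enclave

    def _mac(self, ciphertext: bytes, enclave_id: int, virtual_page: int,
             slot: int, nonce: bytes) -> bytes:
        meta = b"|".join(str(v).encode() for v in (enclave_id, virtual_page, slot))
        return hmac.new(self._mac_key, meta + b"|" + nonce + ciphertext,
                        hashlib.sha256).digest()

    def evict(self, enclave_id: int, virtual_page: int, slot: int,
              plaintext: bytes) -> EvictedPage:
        if slot in self._nonces:
            raise ValueError("nonce slot already in use")
        nonce = os.urandom(16)  # identifies this eviction event
        self._nonces[slot] = nonce
        ciphertext = _keystream_xor(self._enc_key, nonce, plaintext)
        mac = self._mac(ciphertext, enclave_id, virtual_page, slot, nonce)
        return EvictedPage(ciphertext, enclave_id, virtual_page, slot, mac)

    def restore(self, page: EvictedPage, enclave_id: int,
                virtual_page: int) -> bytes:
        nonce = self._nonces.get(page.slot)
        if nonce is None:
            raise ValueError("no outstanding eviction for this slot")
        expected = self._mac(page.ciphertext, enclave_id, virtual_page,
                             page.slot, nonce)
        if not hmac.compare_digest(expected, page.mac):
            raise ValueError("MAC check failed")
        del self._nonces[page.slot]  # a second restore of this blob must fail
        return _keystream_xor(self._enc_key, nonce, page.ciphertext)
```

In this model, restoring the same blob twice fails because the nonce is cleared on the first restore, and restoring an older blob into a reused slot fails because the fresh nonce no longer matches the MAC.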
Because code pages use the same page table as everything else, branching from outside to code within an enclave results in the execution of the 'abort' page, so new instructions have been added to explicitly enter and exit an enclave. These new instructions only accept the entry and exit points specified by the enclave author as valid inputs. Code within an enclave uses a stack that is also within the enclave, so the new instructions and the CPU interrupt handling microcode include additional microcode to switch which stack is being used. The requirement for the stack to be in the enclave means that an enclave author must specify how many 'thread contexts' can simultaneously use the enclave; however, SGX2 allows an enclave to dynamically create additional thread contexts for threads to use.
SGX hardware treats the edge of the CPU package as the trust boundary, so to provide hardware isolation for enclave pages and associated structures, these pages are put in a region of physical memory called 'processor reserved memory' (PRM) which the on-board memory controller prevents external devices from accessing. The SGX architecture includes a memory encryption engine for protecting the contents of the PRM while it lives in the untrusted system DRAM.
Enclaves are loaded and started by untrusted code, so how can an enclave author be sure that their enclave started in the correct state and is running on a genuine SGX part? The process a remote party goes through to make these checks is called remote software attestation, and due to the untrusted nature of the environment, it is recommended that no secrets should be given to or embedded in enclave code until a secure connection has been made to that enclave and it has passed attestation.
Proving that an enclave is running on an SGX part is done by making use of a special key issued to the device called the attestation key. Attestation keys follow the EPID group signature scheme (which itself is an extension of the DAA scheme) and have the following properties:
- Attestation keys are private and unique to each SGX device. A valid attestation key is unknown to any party other than the SGX microcode of the device it is on.
- Each attestation key is a member of a group that represents the type and security version of the SGX device it is on.
- The public key of a group can verify signatures created using any attestation key that is a member of the group.
- An attestation key will be revoked if it is found in public. A list of revoked attestation keys is created so that verifiers can check if a given signature was created using a revoked key.
- If 'suspicious behaviour' can be attributed to the use of a specific but still private attestation key, then the key can be revoked without being revealed by having its signature added to the revoked signatures list. Each signature generated by an attestation key then includes a part that tries to prove that the key was not used to generate any of the signatures on the list.
Intel issues all attestation keys through its provisioning service. This service communicates with a special enclave on an SGX device called the provisioning enclave, which has privileged access to keys burned into the device at manufacture time. The keys allow the provisioning service to verify that it is connected to a genuine SGX part, and allow the enclave to encrypt the attestation key for storage on the platform. This privilege level is only extended to enclaves signed using Intel's private key (i.e. Intel's public key is in the certificate, and the certificate is valid), so Intel also provides a quoting enclave which is responsible for decrypting and using the attestation key on behalf of any enclave that can prove that it has been initialised correctly.
Proof of the initial state of an enclave is provided by the enclave hash. Every step that is taken to load and initialise the enclave modifies the hash, so if any part of it is loaded incorrectly, the hash will be wrong. The author of an enclave knows what value the hash should be, and delivers with the enclave a signed certificate that contains the expected enclave hash. The following steps are performed to get an enclave launched:
- Author writes enclave and specifies the load order for the enclave contents.
- Author creates certificate containing (among other things) the expected enclave hash and the author's public key, then signs the certificate with their private key.
- The target SGX platform creates an empty enclave (using SGX instruction ECREATE) and then loads the contents in the order the author specified (using SGX instructions EADD and EEXTEND). As the contents are loaded, the enclave hash is calculated and held in a location only accessible by the SGX microcode.
- Before an enclave can be launched (using SGX instruction EINIT), the certificate must be authenticated against the author's public key, a token must be obtained from Intel's Launch Enclave (allegedly for licensing purposes), and the enclave hash must match the expected value in the certificate.
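To see why the load order matters, the launch steps above can be modelled as a running hash that is extended by every load operation. The sketch below is a simplification under our own assumptions: the record layout fed into the hash is made up for illustration (the real layout is defined by Intel and is not reproduced here), and `measure` and `may_launch` are hypothetical helper names.

```python
import hashlib
from typing import Iterable, Tuple

Page = Tuple[int, str, bytes]  # (offset in enclave, permissions, contents)


def measure(enclave_size: int, pages: Iterable[Page]) -> str:
    """Fold each load step into a running hash, in the order performed."""
    digest = hashlib.sha256()
    # Creating the empty enclave starts the measurement.
    digest.update(b"ECREATE" + enclave_size.to_bytes(8, "little"))
    for offset, permissions, contents in pages:
        # Adding a page records where it goes and how it may be accessed.
        digest.update(b"EADD" + offset.to_bytes(8, "little") + permissions.encode())
        # Extending the measurement records what the page contains.
        digest.update(b"EEXTEND" + offset.to_bytes(8, "little") + contents)
    return digest.hexdigest()


def may_launch(measured: str, expected: str,
               certificate_ok: bool, has_launch_token: bool) -> bool:
    """All three launch conditions from the list above must hold."""
    return certificate_ok and has_launch_token and measured == expected
```

With this model, loading the same two pages in a different order, or loading a page with a single byte changed, produces a different hash, so `may_launch` returns `False` for anything other than the exact sequence the author measured.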
Once the enclave is started, to begin the attestation process the author (or other secret issuing agent) must make a secure connection to the remote enclave using an ephemeral key exchange algorithm (such as Diffie-Hellman). Ideally at this point, the enclave would authenticate that it is connected to a trusted agent, e.g. by having the author sign something ephemeral using their private key, which the enclave can verify using the public key in the certificate. Remote software attestation then consists of the following steps:
- The enclave generates a report (using SGX instruction EREPORT) containing (among other things) the enclave hash, a hash of the certificate key, and a hash of the ephemeral key being used to secure the connection to the author.
- The enclave sends this report to Intel's Quoting Enclave to be validated and signed using an attestation key. The Quoting Enclave will only sign reports that originate from an SGX enclave on the same platform.
- The enclave now sends the signed report over the secure connection to the author who must then use the Intel attestation verification service (or equivalent service) to verify that the attestation key used to sign the report was issued to genuine SGX hardware, that the attestation key has not been revoked, and that the signature is valid.
- The author must then use the contents of the report to verify the ephemeral key hash to detect MITM, verify the certificate key hash to detect repackaging, and verify the enclave hash to detect tampering.
If everything verifies correctly then attestation is complete and the author can give the enclave the go-ahead to use any sealed secrets and/or provision new secrets to the enclave over the secure channel.
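The author's final checks on the report contents can be sketched as follows. This only covers the last step, after the signature on the report has already been verified by the attestation verification service. The `Report` fields and the `check_report` function are hypothetical names chosen for this example, and the use of SHA-256 for the key hashes is an assumption.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Report:
    enclave_hash: bytes          # measurement taken while loading the enclave
    certificate_key_hash: bytes  # hash of the key that signed the certificate
    ephemeral_key_hash: bytes    # hash of the enclave's key-exchange key


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def check_report(report: Report,
                 expected_enclave_hash: bytes,
                 author_public_key: bytes,
                 enclave_ephemeral_key: bytes) -> List[str]:
    """Return a list of problems; an empty list means attestation passed."""
    problems = []
    # The key seen on the wire must be the one the enclave reported.
    if not hmac.compare_digest(report.ephemeral_key_hash,
                               sha256(enclave_ephemeral_key)):
        problems.append("ephemeral key mismatch: possible MITM")
    # The enclave must have been launched with the author's certificate.
    if not hmac.compare_digest(report.certificate_key_hash,
                               sha256(author_public_key)):
        problems.append("certificate key mismatch: possible repackaging")
    # The enclave contents must be exactly what the author built.
    if not hmac.compare_digest(report.enclave_hash, expected_enclave_hash):
        problems.append("enclave hash mismatch: possible tampering")
    return problems
```

Only when `check_report` returns an empty list should the author go on to provision secrets over the channel.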
In the next post we will take a look at the Intel SGX SDK and how you use it to access all these architectural features in order to build your own enclave.