Difficulties operating KeyLime with recent versions of tpm2_tools
Luke asked me to open this issue, even though it is not strictly a KeyLime issue. Instead, it pertains to how KeyLime obtains quotes using tpm2_quote from the Intel TPM tool kit (https://github.com/tpm2-software/tpm2-tools).
Description of environment:
- KeyLime running on an Ubuntu 18.04 or Ubuntu 20.04 substrate
- All KeyLime components (verifier, registrar, agent, all tenants) are dockerized in an Ubuntu 20.04 container. This includes all TPM2 tools; there are none on the host; everything is in the container.
- TPM access is through the -rm0 TCTI, and is accomplished through docker device mapping
- /sys/kernel/… security file system access is provided by mounting securityfs inside the container
- All TPM devices are vTPMs created by swtpm 0.2.0-1 (comes default with Ubuntu 19.10)
Expected behavior:
Normal attestation – keylime agents call tpm2_quote and return results to verifier for attestation.
Actual behavior
Every so often tpm2_quote fails in the agent, causing the agent to declare itself failed, as follows.
Exception: Command: tpm2_quote -c /var/lib/keylime/secure/tmpmyv20oki -l sha256:15,16,22+sha1:10 -q
344c6757625846466d3944363961694271646e42 -m /tmp/tmpcda6uzzm -s /tmp/tmpdfyu3y6s -o /tmp/tmp4aj8qzp8 -g sha256 -p NK8hsHW4dmjJpQVL9Ecb returned 1, expected 0, output [b'quoted:
ff54434780180022000bbbee6c297fc02cccdab2fbe632d993feb486a8d77b42d19fe8aee2d05898b4e10014344c6757625846466d3944363961694271646e4200000000000e93d8000000020000000001201706190016363600000002000b030080410004030004000020192ef11bcd958ebbfd3a426d15702729dfeaa0048928b04bcb1e75af50fe0808\n', b'signature:\n', b' alg: rsassa\n', b' sig:
5b9febd76342db583aebbe72fa32e314dab22ddb9b2cc0b4663fe317e430990d7fd17abb823049abaac195f6fe7f07fe1bb4260fe4d9cbdcf700284d04ecc46d7ca58ee5760bee57cd94194f3fcd636a51a738db92e878a0ca615f36ed22fbc00e4a03c28493a65dbbcb3cab5e738c4263fae105f101d3796da70a1a923f9fcdb13d10800cf8085f8c1fe6081e12ec20896c81084f3c1f8f0c8d80c1890adcfe87981d9f5df809ae98106898c0387f118248cc28391fc2c0bcf899353dc541cbe6bba69720b0dec7808193148fae998bf83ab2e0212a84e92344a2969d8f79ed52ca3c91f3572a46723835148f70c40157ccb40f97b3ca484fd854bdbe0f51dc\n', b'pcrs:\n', b' sha256:\n', b' 15:
0x0000000000000000000000000000000000000000000000000000000000000000\n', b' 16:
0x026F8DCE9B7FEA1E65CC07B92BA79F74889F26639131A25157A3FF17CC4BA5A7\n', b' 22:
0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF\n', b' sha1:\n', b' 10:
0xF006E2C90753E7438C6A86443BF050ECCC3C9DBF\n', b'calcDigest:
619dc2bd0c6505d49d2213e65beefc6aba6d88eb8a0411924b5409baea7b0e2e\n'],
stderr [b"ERROR: FATAL ERROR: PCR values failed to match quote's digest!\n", b'ERROR: Error validating calculated PCR composite with quote\n', b'ERROR: Unable to run tpm2_quote\n']
Steps to reproduce problem
- Install and run the KeyLime agent as described above
- Turn on IMA with a genuine white list
- Ensure IMA on the compute node is actually measuring files
- Wait for tpm2_quote to fail
A theory of the bug
A superficial look at the source code of the Intel TPM tool kit indicates that, in tpm2_quote, producing the quote and reading back the PCR registers are two separate TPM transactions. This opens up the possibility that IMA gets in between the two operations and extends PCR 10, so that the PCR values read back no longer match the digest in the quote, resulting in the error above.
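To make the suspected race concrete, here is a minimal Python sketch that needs no TPM. It assumes the quote digest is a SHA-256 hash over the concatenated values of the selected PCRs, in selection order, and that an extend replaces a PCR with the hash of its old value followed by the new measurement. The PCR contents and measured data are made up for illustration; only the selection (sha256:15,16,22 plus sha1:10) comes from the command above.

```python
import hashlib
from typing import Sequence


def extend(pcr: bytes, measurement: bytes, alg: str = "sha1") -> bytes:
    """TPM-style extend: new PCR value = H(old value || measurement digest)."""
    return hashlib.new(alg, pcr + measurement).digest()


def pcr_composite_digest(pcr_values: Sequence[bytes]) -> bytes:
    """SHA-256 over the concatenated values of the selected PCRs."""
    return hashlib.sha256(b"".join(pcr_values)).digest()


# Illustrative (made-up) PCR contents: three SHA-256 PCRs and the SHA-1 IMA PCR.
pcr15 = bytes(32)
pcr16 = hashlib.sha256(b"example").digest()
pcr22 = b"\xff" * 32
pcr10 = hashlib.sha1(b"boot aggregate").digest()

# Transaction 1: the quote is signed over the PCRs as they are at this moment.
quoted_digest = pcr_composite_digest([pcr15, pcr16, pcr22, pcr10])

# In between, IMA measures a file and extends PCR 10.
pcr10 = extend(pcr10, hashlib.sha1(b"newly measured file").digest())

# Transaction 2: the tool reads the PCRs back and recomputes the digest.
calc_digest = pcr_composite_digest([pcr15, pcr16, pcr22, pcr10])

# The two digests now differ, which is the condition tpm2_quote reports.
race_detected = quoted_digest != calc_digest
```

If this theory is right, the failure rate should grow with how often IMA measures new files on the node.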
A proposed workaround
Not yet tested, but we will update this issue as soon as we have tested it. We believe that the agent should retry tpm2_quote several times, so that a single quote failure is not propagated up to the verifier.
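A minimal sketch of what such a retry could look like, in Python. This is not KeyLime's actual code: `run_with_retry`, `RACE_MARKER`, and the `runner` parameter are hypothetical names introduced here, and the marker string is taken from the stderr output quoted above. The sketch retries only when the failure looks like the PCR mismatch, and gives up immediately on any other error.

```python
import subprocess
import time
from typing import Callable, Sequence

# Hypothetical marker: the stderr text from the failure shown in this issue.
RACE_MARKER = b"PCR values failed to match quote's digest"


def run_with_retry(
    cmd: Sequence[str],
    attempts: int = 3,
    delay: float = 0.1,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Run cmd, retrying only when the failure looks like the PCR race."""
    last = None
    for _ in range(attempts):
        last = runner(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if last.returncode == 0:
            return last
        if RACE_MARKER not in last.stderr:
            break  # a different failure; retrying is unlikely to help
        time.sleep(delay)  # give IMA a moment before asking again
    raise RuntimeError(
        "command failed with return code %d: %r" % (last.returncode, last.stderr)
    )
```

The agent would pass the same tpm2_quote argument list on every attempt, so the verifier's nonce is unchanged and the output files are simply overwritten. The `runner` parameter exists only so the retry logic can be exercised without a TPM.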
Top GitHub Comments
As per the gitter discussion, one potential solution to this would be to calculate PCR values based on event logs (rather than having tpm2_quote send them).
Changes involve:
- tpm2_quote: we only get the quote from tpm2_quote (not the PCR values)
Note: we might have to try various positions in the IMA log to find where it matches the quote digest, since the quote digest may be outdated with respect to the event logs.
Future optimization would be to only send along changes to the event logs (and have the Verifier remember where it left off) to save processing and network overhead.
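The log-replay idea, including the search over log positions and the "remember where it left off" optimization, could be sketched roughly as follows in Python. Everything here is an assumption made for illustration rather than KeyLime's implementation: the function names are hypothetical, the quote digest is taken to be SHA-256 over the concatenated selected PCR values, PCR 10 is taken from the SHA-1 bank, and the verifier is assumed to already know the values of the other selected PCRs (passed as `prefix` and `suffix`). The sketch also assumes IMA's convention that a violation is logged with an all-zero template hash but extended as all 0xFF bytes.

```python
import hashlib
from typing import Optional, Sequence, Tuple

ZERO = bytes(20)
ALL_FF = b"\xff" * 20


def ima_extend(pcr10: bytes, template_hash: bytes) -> bytes:
    """Replay one IMA log entry into a SHA-1 PCR 10 value."""
    if template_hash == ZERO:
        # Assumed IMA convention: violations are logged as zeros, extended as 0xFF.
        template_hash = ALL_FF
    return hashlib.sha1(pcr10 + template_hash).digest()


def find_log_position(
    template_hashes: Sequence[bytes],
    quote_digest: bytes,
    prefix: bytes = b"",
    suffix: bytes = b"",
    start_index: int = 0,
    start_pcr10: bytes = ZERO,
) -> Optional[Tuple[int, bytes]]:
    """Find how many log entries reproduce the digest carried in the quote.

    prefix/suffix are the concatenated values of the PCRs selected before
    and after PCR 10. Returns (entries_replayed, pcr10_value), or None if
    no position in the log matches.
    """
    pcr10 = start_pcr10
    for count in range(start_index, len(template_hashes) + 1):
        if hashlib.sha256(prefix + pcr10 + suffix).digest() == quote_digest:
            return count, pcr10
        if count < len(template_hashes):
            pcr10 = ima_extend(pcr10, template_hashes[count])
    return None
```

The returned pair can be stored by the Verifier and passed back as `start_index` and `start_pcr10` on the next attestation, so that only new log entries have to be sent and replayed.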
I completely agree with @jetwhiz – this is not a KL issue proper. The correct fix is in tpm2-tools. I have been remiss and have not yet opened the bug with them. Unless someone else offers, I can attempt to raise this issue with them.