[Skiboot] [PATCH v2 2/6] secvar/secboot_tpm: unify behavior for bank hash check and secboot header check

Wed Nov 3 02:04:43 AEDT 2021

On 11/2/21 9:16 AM, Daniel Axtens wrote:
> Eric Richter <erichte at linux.ibm.com> writes:
> 
>> As the PNOR variable space cannot be locked, the data must be integrity
>> checked when loaded to ensure it has not beeen modified by an unauthorized
>> party. In the event that a modification has been detected (i.e. hash mismatch),
>> we must not load in data that could potentially be compromised.
>>
>> However, the previous code was a bit overzealous with its reaction to detecting
>> a compromised SECBOOT partition, and also had some inconsistencies in behavior.
>>
>> Case 1: SECBOOT partition cleared.
>> .init() checks the header for the magic number and version. As neither matches,
>> will reformat the entire partition. Now, .load_bank() will pass, as the data
>> was just freshly reformatted (note: this also could trigger the bug addressed
>> in the previous patch). Only variables in the TPM will be loaded by
>> .load_bank() as the data in SECBOOT is now empty.
>>
>> Case 2: Bank hash mismatch.
>> .load_bank() panics and returns an error code, causing secvar_main() to jump
>> to the error scenario, which prevents the secvar API from being exposed.
>> os-secure-enforcing is set unconditionally, and the user will have no API to
>> manage or attempt to fix their system without issuing a key clear request.
>>
>> This patch unifies the behavior of both of these cases. Now, .init() handles
>> checking the header AND comparing the bank hash. If either check fails, the
>> SECBOOT partition will be reformatted. Variables in the TPM will still be
>> loaded in the .load_bank() step, and provided the backend stores its
>> secure boot state in the TPM, secure boot state can be preserved.
>>
>> Signed-off-by: Eric Richter <erichte at linux.ibm.com>
>> Tested-by: Nick Child <nick.child at ibm.com>
>> ---
>>  libstb/secvar/storage/secboot_tpm.c          | 30 +++++++++++++++++++-
>>  libstb/secvar/test/secvar-test-secboot-tpm.c |  6 ++++
>>  2 files changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/libstb/secvar/storage/secboot_tpm.c b/libstb/secvar/storage/secboot_tpm.c
>> index 5907ff07..932e5fd3 100644
>> --- a/libstb/secvar/storage/secboot_tpm.c
>> +++ b/libstb/secvar/storage/secboot_tpm.c
>> @@ -374,7 +374,9 @@ fail:
>>  	return rc;
>>  }
>>  
>> -static int secboot_tpm_load_variable_bank(struct list_head *bank)
>> +
>> +/* Helper to validate the current active SECBOOT bank's data against the hash stored in the TPM */
>> +static int compare_bank_hash(void)
>>  {
>>  	char bank_hash[SHA256_DIGEST_LENGTH];
>>  	uint64_t bit = tpmnv_control_image->active_bit;
>> @@ -394,6 +396,15 @@ static int secboot_tpm_load_variable_bank(struct list_head *bank)
>>  		/* Tampered pnor space detected, abandon ship */
>>  		return OPAL_PERMISSION;
>>  
>> +	return OPAL_SUCCESS;
>> +}
>> +
> 
> So in doing this you take the check out of secvar_ops.load_bank(). I am
> guessing that:
> 
>  - secboot_tpm_store_init() gets called before any calls to load_bank()
> 

Correct.

>  - there's nothing that skiboot does that might cause the contents of
>    the PNOR to change between store_init() and load_bank()?
> 

Correct.

>  - anything that's able to change the PNOR while skiboot is running this
>    early boot stuff is outside the threat model?
> 

Correct.

> Is that right?

See secvar_main() in secvar_main.c for the flow. It is one call to _init,
then one call to _load_bank() for the variable bank, then the update bank.

Side note: the entire SECBOOT partition is loaded into memory in _init().
No further reads from PNOR are made, so even if PNOR is modified, it
won't be read until the next boot (assuming it doesn't get overwritten
by a write). load_bank() is more accurately "parse_bank()".

This is of course assuming memory can't be arbitrarily changed in this
time, but that would have even worse implications than this.

> 
>> +
>> +static int secboot_tpm_load_variable_bank(struct list_head *bank)
>> +{
>> +	uint64_t bit = tpmnv_control_image->active_bit;
>> +	int rc;
>> +
>>  	rc = secboot_tpm_deserialize_from_buffer(bank, tpmnv_vars_image->vars, tpmnv_vars_size, SECVAR_FLAG_PROTECTED);
>>  	if (rc)
>>  		return rc;
>> @@ -692,8 +703,25 @@ static int secboot_tpm_store_init(void)
>>  		rc = secboot_format();
>>  		if (rc)
>>  			goto error;
>> +		goto done;
>>  	}
>>  
>> +	/* Verify the active bank's integrity by comparing against the hash in TPM.
>> +	 * Reformat if it does not match -- we do not want to load potentially
>> +	 * compromised data.
>> +	 * Ideally, the backend driver should retain secure boot state in
>> +	 * protected (TPM) storage, so secure boot state should be the same, albeit
>> +	 * without the data in unprotected (PNOR) storage.
> 
> Don't you have 3 reserved bytes in struct secboot_header? Could you use
> one and say it's only defined in the case of the tpmnv_control header?
> Or does this state already get preserved and I've just misunderstood the
> interplay between backends/storage drivers/whatever?
> 

This is fancy wordplay since this is a storage driver that isn't supposed
to care about secure boot modes, states, and variables.

I'm specifically talking about the PK. If the PK is populated with a key,
we are in os-secure-enforcing. That is the "secure boot state" that is
being preserved across a PNOR wipe by being stored in TPM NV instead.

>> +	 */
>> +	rc = compare_bank_hash();
>> +	if (rc == OPAL_PERMISSION) {
>> +		rc = secboot_format();
>> +		if (rc)
>> +			goto error;
>> +	}
>> +	else if (rc)
>> +		goto error;
>> +
>>  done:
>>  	return OPAL_SUCCESS;
>>  
>> diff --git a/libstb/secvar/test/secvar-test-secboot-tpm.c b/libstb/secvar/test/secvar-test-secboot-tpm.c
>> index 798ca281..f203e57d 100644
>> --- a/libstb/secvar/test/secvar-test-secboot-tpm.c
>> +++ b/libstb/secvar/test/secvar-test-secboot-tpm.c
>> @@ -101,10 +101,15 @@ int run_test(void)
>>  
>>  	clear_bank_list(&variable_bank);
>>  
>> +	/* Disabling check for now, since bank hash check has been moved to _init().
>> +	 * This will be re-enabled with the forthcoming rework of the tests/_init() procedure.
>> +	 */
>> +#if 0
>>  	// Tamper with pnor, hash check should catch this
>>  	secboot_image->bank[0][0] = ~secboot_image->bank[0][0];
>>  
>>  	rc = secboot_tpm_load_bank(&variable_bank, SECVAR_VARIABLE_BANK);
>> +	fprintf(stderr, "rc = %d\n", rc);
>>  	ASSERT(rc != OPAL_SUCCESS); // TODO: permission?
>>  
>>  	// Fix it back...
>> @@ -113,6 +118,7 @@ int run_test(void)
>>  	// Should be ok again
>>  	rc = secboot_tpm_load_bank(&variable_bank, SECVAR_VARIABLE_BANK);
>>  	ASSERT(rc == OPAL_SUCCESS);
>> +#endif
>>
> 
> I haven't go through the rest of the series but is the rework part of
> this series or not?
> 

No. The test case rework is next up, but it is not as urgently needed.
Currently we are testing by hand the changes in this set.

> I think I conceptually prefer just deleting the code and leaving it in
> the git history but I don't _really_ care, especially for tests...
> 
> Kind regards,
> Daniel
>