TPM: Work around buggy TPMs that block during continue self test

We've been testing an alternative TPM for our embedded products and
found random kernel boot failures due to timeouts after the continue
self test command.

This was happening randomly, and has been *very* hard to track down, but
it looks like with this chip there is some kind of race with the
tpm_tis_status() check of TPM_STS_COMMAND_READY. If things get there
'too fast' then it sees the chip is ready, or tpm_tis_ready() works.
Otherwise it takes somewhere over 400ms before the chip will return
TPM_STS_COMMAND_READY.

Adding some delay after tpm_continue_selftest() makes things reliably
hit the failure path, otherwise it is a crapshoot.

The spec says it should be returning TPM_WARN_DOING_SELFTEST, not
holding off on ready.
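
For illustration only, here is a rough user-space sketch of one way to
tolerate this behaviour: treat a timeout as "still self testing" and
retry after a delay, up to a fixed budget. It is not the driver change
itself. struct fake_tpm, fake_tpm_send(), fake_msleep() and
wait_for_selftest() are hypothetical names made up for this sketch, and
the chip is simulated with a fake clock.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a TPM that holds off TPM_STS_COMMAND_READY
 * while its self test runs, so commands time out instead of returning
 * TPM_WARN_DOING_SELFTEST. */
struct fake_tpm {
	unsigned int busy_ms;    /* how long the chip stays unresponsive */
	unsigned int elapsed_ms; /* simulated time waited so far */
};

/* Simulated command: fails with -ETIME while the chip is still busy. */
static int fake_tpm_send(struct fake_tpm *chip)
{
	assert(chip != NULL);
	return chip->elapsed_ms < chip->busy_ms ? -ETIME : 0;
}

/* Simulated sleep: only advances the fake clock. */
static void fake_msleep(struct fake_tpm *chip, unsigned int ms)
{
	chip->elapsed_ms += ms;
}

/* Retry the command every delay_ms until it stops timing out or
 * budget_ms has been spent. Returns 0 on success, -ETIME if the chip
 * never answered, -EINVAL for a zero delay. */
static int wait_for_selftest(struct fake_tpm *chip, unsigned int delay_ms,
			     unsigned int budget_ms)
{
	unsigned int loops, i;
	int rc = -ETIME;

	if (delay_ms == 0)
		return -EINVAL;

	loops = budget_ms / delay_ms;
	for (i = 0; i < loops; i++) {
		rc = fake_tpm_send(chip);
		if (rc != -ETIME)
			return rc;
		/* Timeout: assume the self test is still running. */
		fprintf(stderr,
			"TPM command timed out during continue self test\n");
		fake_msleep(chip, delay_ms);
	}
	return rc;
}
```

With a simulated chip that stays busy for 400ms, a 100ms delay and a
2000ms budget, the loop logs four timeouts and then succeeds. A chip
that never becomes ready still fails with -ETIME once the budget runs
out.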

Boot log during this event looks like this:

tpm_tis 70030000.tpm_tis: 1.2 TPM (device-id 0x3204, rev-id 64)
tpm_tis 70030000.tpm_tis: Issuing TPM_STARTUP
tpm_tis 70030000.tpm_tis: tpm_transmit: tpm_send: error -62
tpm_tis 70030000.tpm_tis: [Hardware Error]: TPM command timed out during continue self test
tpm_tis 70030000.tpm_tis: tpm_transmit: tpm_send: error -62
tpm_tis 70030000.tpm_tis: [Hardware Error]: TPM command timed out during continue self test
tpm_tis 70030000.tpm_tis: tpm_transmit: tpm_send: error -62
tpm_tis 70030000.tpm_tis: [Hardware Error]: TPM command timed out during continue self test
tpm_tis 70030000.tpm_tis: tpm_transmit: tpm_send: error -62
tpm_tis 70030000.tpm_tis: [Hardware Error]: TPM command timed out during continue self test

The other TPM vendor we use doesn't show this wonky behaviour:

tpm_tis 70030000.tpm_tis: 1.2 TPM (device-id 0xFE, rev-id 70)

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>