SATA hang on 8315E triggered by heavy flash write?
Anthony Foiani
tkil at scrye.com
Thu May 23 15:52:23 EST 2013
Shaohui --
Thanks for the quick reply! Please find my investigation and results
below.
Xie Shaohui-B21989 <B21989 at freescale.com> writes:
> 1. only update NOR for a long enough time, for ex. tens of seconds,
> see if error happens;
It seems that I can do this without any errors:
/ # flash_erase /dev/mtd1 0 0
Erasing 64 Kibyte @ 7f0000 -- 100 % complete
/ # dd if=/dev/zero of=/dev/mtd1
dd: writing '/dev/mtd1': No space left on device
16385+0 records in
16384+0 records out
8388608 bytes (8.0MB) copied, 62.399439 seconds, 131.3KB/s
> 2. only r/w SSD without NOR operation, see if error happens;
Again, no problem:
/ssd # ls -al biggie.bin
-rw-r--r-- 1 root root 2330607084 May 22 19:34 biggie.bin
/ssd # ls -alh biggie.bin
-rw-r--r-- 1 root root 2.2G May 22 19:34 biggie.bin
/ssd # time cp biggie.bin biggie2.bin
real 3m 27.55s
user 0m 2.60s
sys 2m 16.13s
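Since `time cp` reports only elapsed time and not a rate, the copy throughput can be worked out from the file size and the `real` time above. This is a minimal sketch, assuming `awk` is available on the target; the helper name `rate_mib_s` is made up here and was not part of the test run.

```shell
#!/bin/sh
# rate_mib_s BYTES SECONDS: print the average rate in MiB/s.
# Hypothetical helper for illustration only.
rate_mib_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1048576 }'
}

# biggie.bin is 2330607084 bytes; "real 3m 27.55s" is 207.55 seconds.
rate_mib_s 2330607084 207.55
```

That works out to roughly 10.7 MiB/s for the copy. Because cp reads and writes on the same SSD, the SATA link is probably carrying about twice that, versus the 131.3KB/s that dd reported for the NOR write.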
> 3. r/w SSD first and keep it run, then start to read NOR, if no
> error for a long time, then start to write NOR, see how long the
> error will happen.
Doing a NOR read during heavy SATA r/w seems to succeed, with no
errors on the console:
[window 1]
/ssd # time cp biggie.bin biggie2.bin
[window 2]
/ # dd if=/dev/mtd1 of=/dev/null
16384+0 records in
16384+0 records out
8388608 bytes (8.0MB) copied, 6.380613 seconds, 1.3MB/s
Doing a NOR write during the same copy makes SATA fail almost instantly
(within a second):
[window 1]
/ssd # time cp biggie.bin biggie2.bin
[window 2]
/ # dd if=/dev/zero of=/dev/mtd1
[console]
[ 5160.269106] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 5160.276387] ata2.00: failed command: READ DMA
[ 5160.280905] ata2.00: cmd c8/00:00:60:f3:01/00:00:00:00:00/e0 tag 0 dma 131072 in
[ 5160.280928] res 50/00:00:f0:c0:48/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
[ 5160.296386] ata2.00: status: { DRDY }
[ 5160.300195] ata2: hard resetting link
[ 5160.347858] ata2: setting speed (in hard reset)
[ 5170.439981] ata2: No Signature Update
[ 5170.611901] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 5170.618204] ata2.00: link online but device misclassified
[ 5175.623918] ata2.00: qc timeout (cmd 0xec)
[ 5175.628147] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 5175.634347] ata2.00: revalidation failed (errno=-5)
[ 5175.639373] ata2: hard resetting link
[ 5176.143847] ata2: Hardreset failed, not off-lined 0
[ 5176.155867] ata2: setting speed (in hard reset)
[ 5185.743871] ata2: No Signature Update
[ 5185.915900] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 5185.922203] ata2.00: link online but device misclassified
[ 5195.927910] ata2.00: qc timeout (cmd 0xec)
[ 5195.932140] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 5195.938342] ata2.00: revalidation failed (errno=-5)
[ 5195.943430] ata2: hard resetting link
[ 5196.443885] ata2: Hardreset failed, not off-lined 0
...
At this point, a hard reset / full power cycle is needed to recover.
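For anyone reading the log above, here is a rough decode of the hex fields. It is a sketch under two assumptions: that SStatus and SControl use the usual SATA layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8), and that the libata "cmd" line is command/feature:nsect:lbal:lbam:lbah/(five HOB bytes)/device. The helper names `decode_sreg` and `decode_lba28` are made up for illustration, and the LBA decode only applies to 28-bit commands such as READ DMA (0xc8).

```shell
#!/bin/sh
# decode_sreg HEX: split an SStatus/SControl value into its fields.
# Assumed layout: DET = bits 3:0, SPD = bits 7:4, IPM = bits 11:8.
decode_sreg() {
    v=$((0x$1))
    echo "DET=$(( v & 0xf )) SPD=$(( (v >> 4) & 0xf )) IPM=$(( (v >> 8) & 0xf ))"
}

# decode_lba28 TASKFILE: decode the "cmd" string from a libata error
# report, assuming a 28-bit LBA command.
decode_lba28() {
    old_ifs=$IFS
    IFS='/:'
    set -- $1                # split into 12 hex byte fields
    IFS=$old_ifs
    # LBA28 = low nibble of device, then lbah, lbam, lbal.
    lba=$(( ((0x${12} & 0xf) << 24) | (0x$6 << 16) | (0x$5 << 8) | 0x$4 ))
    n=$((0x$3))
    # A sector count of 0 means 256 sectors for 28-bit commands.
    if [ "$n" -eq 0 ]; then n=256; fi
    echo "cmd=0x$1 lba=$lba sectors=$n bytes=$(( n * 512 ))"
}

decode_sreg 113
decode_sreg 310
decode_lba28 "c8/00:00:60:f3:01/00:00:00:00:00/e0"
```

Read this way, SStatus 113 is DET=3 (device present, PHY communication established), SPD=1 (1.5 Gbps), IPM=1 (active), and SControl 310 is DET=0, SPD=1 (limited to 1.5 Gbps), IPM=3 (partial and slumber disabled). The failed READ DMA comes out as 256 sectors, which matches the "dma 131072 in" on the same log line. If that reading is right, the link itself re-trains after each reset and it is the IDENTIFY command (0xec) that times out.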
The board is an MPC8315ERDB derivative, and I'm running a patched
3.4.36 kernel.
I've uploaded some (possibly) relevant files to:
http://foiani.home.dyndns.org/~tony/linux/ppc-sata-issues-201305/
The directory contains a diff against 3.4.36, a device tree, and a
kernel config.
Please let me know if there is any more information that I can
contribute.
Best regards,
Anthony Foiani