I have an SSD from a PC I no longer use. I need to keep a copy of all its data for backup purposes. The problem is that dd reports "Input/output error" messages when copying from the drive. There seem to be 20-30 of them across the entire 240GB drive, so it is likely that most or all of my data is still intact.

What I’m concerned about is whether these input/output errors can cause issues in the image outside of the particular bad blocks. How does dd handle these errors? Will the failed blocks be e.g. zeroed in the output, or will they simply be missing? If they are simply missing, will the filesystem be corrupted because the location of the data after them has shifted? If so, what tool should I be using to save what can be saved?

EDIT: Thanks for the help guys. I went with ddrescue and it reports having saved 99.99% of the data. I guess there could still be significant loss if the 0.01% happens to fall on filesystem structures, but in that case maybe I can use an undeleter or similar utility to see if I can get back the files. In any case, I can work at my leisure now that I have a copy of the data on non-failing storage.

  • patatahooligan@lemmy.worldOP
    9 months ago

    Thanks for the input, guys. I consider my issue resolved.

    As for the specific question I had, dd can fill the blocks that failed to read with zeroes if you pass conv=noerror,sync. However, this puts the zeroes at the end of the block and not over the exact bit/byte that failed to read, meaning that a read error will invalidate the rest of the block.

    But the consensus across the sources I searched seems to be to use ddrescue instead of dd.
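
    To make the "zero in place" idea concrete, here is a minimal Python sketch of a copy loop that reads every block at an explicit offset and writes zeroes for whatever it could not read, so nothing after a bad block shifts. It is not how dd or ddrescue are implemented; the function name and the read_at callable are made up for illustration (on a real device read_at might wrap os.pread, which is an assumption here).

```python
import io


def copy_preserving_offsets(read_at, total_size, out, block_size=4096):
    """Copy total_size bytes to out, zero-filling anything that fails to read.

    read_at(offset, size) is a hypothetical callable that returns bytes or
    raises OSError, e.g. lambda off, n: os.pread(fd, n, off).
    Returns a list of (offset, length) ranges that were replaced by zeroes.
    """
    bad_ranges = []
    offset = 0
    while offset < total_size:
        size = min(block_size, total_size - offset)
        try:
            data = read_at(offset, size)
        except OSError:
            data = b""  # nothing usable came back for this block
        if len(data) < size:
            # Record the unread tail and pad it; because the next read uses
            # an explicit offset, later data never moves in the output.
            missing = size - len(data)
            bad_ranges.append((offset + len(data), missing))
            data += b"\x00" * missing
        out.write(data)
        offset += size
    return bad_ranges


if __name__ == "__main__":
    source = bytes(range(16))

    def flaky_read(off, n):
        # Simulated drive: the block starting at byte 4 is unreadable.
        if off == 4:
            raise OSError(5, "Input/output error")
        return source[off:off + n]

    image = io.BytesIO()
    print(copy_preserving_offsets(flaky_read, len(source), image, block_size=4))
    print(image.getvalue())
```

    A smaller block_size loses less data per error at the cost of more reads, which is the same trade-off as dd's input block size.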

    • rotopenguin@infosec.pub
      9 months ago

      There is no particular bit or byte that is wrong. The error comes from an entire page, anywhere from 128K to megabytes in size, that the drive couldn’t make sense of. A lot of error correction code had already been tried, and the overall analog values of the page had been re-read (was this a 14/16th millivolt, or a 15/16th millivolt?). That page couldn’t be made sense of, the MLC page overlaid on it couldn’t be made sense of, the TLC page overlaid on that couldn’t be made sense of, etc. Or things could be so bad that the FTL doesn’t even know which flash cells your data should be found in.

      Everything that I understand about flash storage suggests that it can’t reasonably produce little errors. You could still get small errors from a bit flip in delivery, or more likely flips in your PC’s own RAM. But the flash itself should either be very right, or very very wrong. Nothing in-between.

      • patatahooligan@lemmy.worldOP
        9 months ago

        Thanks for the explanation. I don’t really know how flash storage works. The fundamental idea of the problem I described would still apply, though, as long as the input block size for dd spans more than one page of the underlying storage.

        For example, say that exactly three pages fit in a block. If dd attempts to read pages A, B and C (ABC) and fails to read B, you would want the corresponding part zeroed in the output to preserve the offsets of all the other pages (A0C). But instead dd reads whatever it can for the entire block, then pads the rest of the block size with zeroes, effectively moving C forward (AC0). So essentially you magnify errors.
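
        The A0C versus AC0 difference can be modelled in a few lines of Python. This only simulates the behaviour described above; it does not call dd, and the 4-byte page size and function names are invented for the example.

```python
PAGE_SIZE = 4  # hypothetical page size in bytes, kept tiny for readability

# One dd input block made of three pages, as in the example above.
BLOCK = [("A", b"AAAA"), ("B", b"BBBB"), ("C", b"CCCC")]


def zero_in_place(block, bad):
    """Replace each unreadable page with zeroes at its own offset (A0C)."""
    return b"".join(
        b"\x00" * len(data) if name in bad else data for name, data in block
    )


def pad_at_end(block, bad):
    """Keep only the readable pages, then pad the block with zeroes (AC0)."""
    total = sum(len(data) for _, data in block)
    readable = b"".join(data for name, data in block if name not in bad)
    return readable.ljust(total, b"\x00")


if __name__ == "__main__":
    bad_pages = {"B"}
    in_place = zero_in_place(BLOCK, bad_pages)
    padded = pad_at_end(BLOCK, bad_pages)
    print(in_place)  # b'AAAA\x00\x00\x00\x00CCCC'
    print(padded)    # b'AAAACCCC\x00\x00\x00\x00'
    # Page C belongs at offset 8; padding at the end moves it to offset 4.
    print(in_place.find(b"CCCC"), padded.find(b"CCCC"))  # 8 4
```

        Both outputs have the same length, so the image size alone would not reveal the shift; only the position of C differs.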