Using ddrescue to save my data from my utter negligence

14 May

I believe I know stuff about technology, and this includes some knowledge about data and data management. So I know that I have to do regular backups of my home computer. And I actually do! Maybe I should do them more regularly and with a better procedure, but anyhow, I do. What I neglect, however, is my phone. After all, I have only photos, some music and funny videos there, right? Well, now my SD card is borked and I need my alarm clock melody. That’s why I’m trying to rescue it using ddrescue.

Why don’t I do backups of my phone?

Well, although I trust cloud solutions for my phone numbers, I don’t want all my data in them. So the only way I have found up to now to do this is to physically remove the SD card from my phone and copy it to my computer.

This is hard. I need to use one of these needles to remove my SIM card and my SD card, then I need to copy everything and when putting back my SIM, restart the phone and type in my PIN.

Perhaps I can find a better way, like having an app that syncs the contents from my phone to a local NAS when I’m home. I have to research this though…

And also this is possible only because my phone is ancient and has a microSD slot. If I change my phone, that’s it! No more! I’ll put this research in my TODO list.

Meanwhile let’s try to save as much information as possible from the borked card.

Just copying the data

This is funny, but it is still possible for most of the data. How? Well, the phone does not recognize the card anymore, but when I put it in an adapter and connected it to my computer under Windows, it had some trouble, but recognized it in the end!

So I tried to select all the folders and copy them to my internal drive. And it started copying! It was painfully slow though… And I needed to switch to Linux for a task, so I stopped it.

While copying, some files were corrupted, so I had to instruct Windows to just skip them. Goodbye unknown photos of churches somewhere…

A few days later I decided I’m ‘better’ than that, and I should use Linux. This did not make it faster at all!

PS: It seems this was a good call, since just trying to copy the data from corrupted media risks corrupting it even more and even making some good data inaccessible. So I would recommend imaging the drive first.

Using ddrescue

Once I decided I should be using ddrescue, I went to the Arch Linux Wiki for wisdom and tried what’s explained there.

Rescuing healthy data and creating the rescue map

To save your good data first, Arch Wiki says this command has to be executed:

$ sudo ddrescue --force -n /dev/sdx /path/to/image/file /path/to/mapfile

PS: This one is very dangerous. You can easily wipe out your entire system if you don’t know what it does, so please read A LOT before attempting it.
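To make the danger a bit more concrete, here is a minimal sketch of a guard I could have put in front of the first pass. The helper name first_pass_cmd and the /tmp paths are made up for illustration, and it only prints the command instead of running it. It leaves out --force on purpose, since that option is only needed when the output is a device or partition, which is exactly the case I want to avoid here.

```shell
# Hypothetical helper: print the first-pass command only if the target looks safe.
first_pass_cmd() {
  src=$1
  img=$2
  map=$3
  # A block device as output means a whole disk would be overwritten.
  if [ -b "$img" ]; then
    echo "refusing: $img is a block device" >&2
    return 1
  fi
  # Input and output must differ, otherwise the source would be clobbered.
  if [ "$src" = "$img" ]; then
    echo "refusing: input and output are the same path" >&2
    return 1
  fi
  # -n skips the scraping phase, as in the command above.
  echo "ddrescue -n $src $img $map"
}

# Example call with placeholder paths; it prints the command for review.
first_pass_cmd /dev/sdx /tmp/sdcard.img /tmp/sdcard.map
```

Reading the printed command once more before pasting it after sudo is the whole point of the exercise.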

And I started watching it do its job. This was a terrible mistake, since it took more than a day to finish a 16GB drive. ddrescue is slow.

There is a reason it is slow. First it went and read all the blocks of the device, even though only 80% of it was written. Then it went and read all the blocks again in reverse. And it continued 3 more times like that. It does that because it is sometimes easier to read a block from the other direction, depending on where the error is.

Then it started trimming failed blocks. Why? I don’t know. I thought it needs them to try other stuff too, but apparently not. Next time I’ll clone the entire drive first (PS: oh, this is what cloning looks like. Sigh).

At least all of this is done automatically and I didn’t touch it for a day. Here’s what it reported during this time:

(Screenshot: the logs generated by ddrescue)

It finished after a 21h 46m run. That’s excessive and I regret staying in front of it and watching it carefully.

On the other hand, I’m pretty happy I did it, since my OS is still running. ddrescue is dangerous.

Copying the corrupt data

Once this operation was done, I could continue with the next command recommended by the wiki:

$ sudo ddrescue --force -d -r3 -n /dev/sdx /path/to/image/file /path/to/mapfile

This one should reuse the mapfile which was prepared in the last ddrescue run and retry the bad blocks up to 3 times (-r3). By saying -d, it also specifies direct disk access for the operations, bypassing the kernel cache.

You can see that both commands also specify the -n, which means no scraping is going to be performed.

Anyhow, once I enter this command I get the same summary as with the previous command. Is it overwriting the image created earlier with all my data?! No, not really. There is a log directly after running ddrescue saying:

Initial status (read from mapfile)
rescued: 15920 MB, tried: 10696kB, bad-sector: 741888 B, bad areas: 1449

So we’re good.

While the previous command was slow as hell, this one is even slower. Perhaps because it focuses on the bad areas of the SD card.

It seems to me that there are 11 MB left to test and recover if possible. And the utility says it’ll take 4 days and 22 hours. I’m not sure if it is worth it at this point, but maybe without these 11 MB I don’t get any of the rest either, so I’ll continue waiting and hope there is no power outage where I am in these 5 days.

Meanwhile, what I found is that most of these 11 MB are in the non-scraped section of the report, meaning if I actually wait, perhaps they will be read at some point without errors. Or I can run ddrescue once more without the -n option and hope for the best.
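To see how the remaining bytes are split between the categories without waiting for ddrescue to print its summary, the mapfile itself can be tallied. The following is a rough sketch, assuming the usual mapfile layout: comment lines starting with #, one status line, and then one "position size status" line per block, with sizes in hex. The function name summarize_mapfile is my own, not something ddrescue provides.

```shell
# Sketch: tally the bytes in each state recorded in a ddrescue mapfile.
summarize_mapfile() {
  rescued=0
  nontried=0
  nontrimmed=0
  nonscraped=0
  bad=0
  seen_status_line=0
  while read -r pos size status rest; do
    case $pos in
      ''|'#'*) continue ;;           # skip blank lines and comments
    esac
    if [ "$seen_status_line" -eq 0 ]; then
      seen_status_line=1             # first data line is the current position/status
      continue
    fi
    # Sizes are hex constants such as 0x00000200; shell arithmetic accepts them.
    case $status in
      '+') rescued=$((rescued + $size)) ;;
      '?') nontried=$((nontried + $size)) ;;
      '*') nontrimmed=$((nontrimmed + $size)) ;;
      '/') nonscraped=$((nonscraped + $size)) ;;
      '-') bad=$((bad + $size)) ;;
    esac
  done < "$1"
  echo "rescued=$rescued non-tried=$nontried non-trimmed=$nontrimmed non-scraped=$nonscraped bad-sector=$bad"
}
```

Called as summarize_mapfile /path/to/mapfile, it prints the byte counts on one line, so it is easy to compare between runs.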

After a day or two of running, the second command finished.

Making ddrescue scrape the non-scraped data

Surely not all 11MB would be salvageable, but perhaps at least the ones in the non-scraped section of the report could be recovered.

So after the second run of ddrescue I decided to run it again. This time without skipping the scraping:

$ sudo ddrescue --force -d -r3 /dev/sdx /path/to/image/file /path/to/mapfile

This, however, is so slow that I couldn’t afford to keep the computer running at all times to let it finish. I can’t say this is perfect, but if you need to restart, you need to restart.

Luckily, ddrescue keeps its progress in the mapfile, so you can simply hit Ctrl+C and wait a bit until the file is updated.

The next time you issue the same command, it starts where it stopped.

Putting up with ddrescue’s dreadfully slow pace

The worst part of restoring your data from a corrupt drive (regardless of its technology – HDD, SSD, SD card, etc.) is the infinite waiting you have to suffer.

Hopefully, the drive isn’t too corrupt, so you can issue one command to ddrescue, make it go through the drive three times and declare the rest lost. Unfortunately, after running it several times over the course of two weeks (yes, I’m that patient and masochistic), I still have about 4MB of failing data.

This means I have saved 99.97% of my drive, but I worry that the missing 4MB are distributed in such a way that they corrupt a large portion of the files on the SD card. That’s why I’m still waiting for it.

The good thing is that ddrescue continues to recover some of this data with each sweep. The bad thing is that it is so slow that I have been running it for weeks. This is definitely what one would expect if one gives the drive for professional data recovery.

Perhaps there’s a faster way to recover the last few Kilobytes, but I still have no clue how to do it. I guess this is going to wait for the next article.

After ddrescue

The Arch wiki says after ddrescue finishes, I have to check the filesystem of the image and mount the device.

I know how to mount a device on Linux, but the given command there is not the one I use. It’s this one:

# fsck -f /dev/sdY

A quick read of its man page says it checks and repairs the filesystem. I guess the mounting is left to me afterwards. Anyhow, when I tried it, it printed out the following:

fsck from util-linux 2.38.1
e2fsck 1.47.0 (5-Feb-2023)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /home/path/to/sdcard_image

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
    e2fsck -b 32768 <device>

As it looks, fsck could not work out what kind of file system is in the image, so as far as I can tell it fell back to its default, ext2, and e2fsck then complained that it could not find an ext2 superblock. This, of course, is not what is in there, since I imaged an SD card, which most likely uses FAT32.

I guess this command needs to be performed after the image is written to a new SD card, but before that I want to check how many files I have saved. That’s why I need to mount the image.

Mounting the image file as if it is a device

Using the knowledge I have, I tried this particular command:

# mount -o loop ~/backups/sdcard-rescue1 /mnt/rescue_img

It says Linux should mount the image in the /mnt/rescue_img directory using a loop device. However, it makes Linux complain that this is the wrong file system type or something:

mount: /mnt/rescue_img: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.

Instead, a command specifically made to setup a loop device might help:

sudo losetup /dev/loop0 /home/nikola/backups/sdcard-rescue1

This does not produce any output, so I know it has succeeded. But after trying to run the mount command again, this time using /dev/loop0 as device, I get the same error. dmesg however shows me what the problem is:

[35161.785558] FAT-fs (loop0): bogus number of reserved sectors

This means the file system is corrupt and I need to fix it.

Repairing the file system

A quick search and a read of the fsck man page led me to this version of the command:

# fsck -r -t vfat /dev/loop0

The difference with the one before is that I specify the type of file system now and I request it to do a report for the device at the end. And from running it I get this:

fsck from util-linux 2.38.1
fsck.fat 4.2 (2021-01-31)
Logical sector size is zero.
/dev/loop0: status 1, rss 2352, real 0.001671, user 0.001653, sys 0.000000

Repairing what? Just mount it properly!

After running this command and trying to mount again, I got the same old message. No difference. And that’s because the file system is probably alright. The problem is that the single partition on the SD card does not start at the very beginning of the image, but at an offset. mount doesn’t like that.
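A quick way to get a hint about this is to peek at the first sector of the image. The sketch below rests on two assumptions: a FAT32 boot sector carries the text "FAT32" at byte offset 82, and both a FAT boot sector and an MBR partition table end with the bytes 55 aa at offset 510. The function name sniff_image is made up, and the result is only a guess, not a proper detection.

```shell
# Sketch: guess what the first sector of an image contains.
sniff_image() {
  img=$1
  # Bytes 510-511 hold the 55 aa boot signature on MBR and FAT boot sectors alike.
  sig=$(od -An -tx1 -j510 -N2 "$img" | tr -d ' \n')
  # A FAT32 boot sector has the label "FAT32" starting at byte offset 82.
  label=$(dd if="$img" bs=1 skip=82 count=5 2>/dev/null | tr -d '\0')
  if [ "$label" = "FAT32" ]; then
    echo "FAT32 boot sector at offset 0"
  elif [ "$sig" = "55aa" ]; then
    echo "boot signature present, but no FAT32 label: probably a partition table"
  else
    echo "no boot signature found"
  fi
}
```

If it reports a probable partition table, the file system itself sits further inside the image, which fits the error messages above.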

To check how many bytes offset I should account for, I used:

# fdisk -l /dev/loop0

I took the number in the Start column and multiplied it by the sector size that fdisk reports (usually 512 bytes). Then I tried mounting again, but not before recreating the loop device, this time with the offset:

# losetup -d /dev/loop0
# losetup -o (block_count*block_size in bytes) /dev/loop0 /home/path/to/image
# mount -t vfat /dev/loop0 /mnt/rescue_img

And the data is available!
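For the record, the offset arithmetic can be written down as a tiny helper. This is only a sketch: the start sector 8192 is an assumed example value, not the one from my card, and partition_offset is a name I made up.

```shell
# Sketch: turn a partition's start sector into a byte offset for losetup.
partition_offset() {
  start_sector=$1
  sector_size=${2:-512}   # fdisk prints the real value in its "Units" line
  echo $((start_sector * sector_size))
}

# Assumed example: a partition starting at sector 8192 with 512-byte sectors.
offset=$(partition_offset 8192 512)
echo "losetup -o $offset /dev/loop0 /home/path/to/image"
```

With these example numbers the offset is 4194304 bytes. As far as I know, mount can also do this in one step with -o loop,offset=..., but I have only used the losetup route here.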

ddrescue can only image your damaged drive

Instead of a summary I’d like to provide a small warning.

What ddrescue does is directly copy the contents of the blocks of the drive. If there’s an error, it skips the block. Then it tries the failing blocks again. This means it will copy the data that has no problems and it will rescue some of the data with problems. However it does not mean it will provide you with all your files.

If there is a large number of failing blocks in random places across the drive, there is the possibility that a significant portion of the files are not readable. To deal with this problem there are other tools you can use to attempt to reconstruct the internal structure of the corrupted files.

This wasn’t that crazy of a task, compared to other stuff I did. Only very slow. Check that one out: Can I move my boot partition? However, be warned, I wrote it several years ago and my writing style was still quite crude.