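Android Vold Notes: fsck
Background
On Android 7.1, plugging in a 128 GB exFAT USB drive crashes with a null pointer exception.
Debugging process
The crash log is as follows: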
[2020/2/24 16:11:46] [ 4374.620557@3] fsck.exfat[9092]: unhandled level 1 translation fault (11) at 0x00000010, esr 0x92000005
[2020/2/24 16:11:46] [ 4374.626886@1] pgd = ffffffc04ef33000
[2020/2/24 16:11:46] [ 4374.627742@1] [00000010] *pgd=0000000000000000
[2020/2/24 16:11:46] [ 4374.632136@1] reg value pfn reg value pfn
[2020/2/24 16:11:46] [ 4374.639519@1] r0 : 0000000000000000 -------- r1 : 0000000000000000 --------
[2020/2/24 16:11:46] [ 4374.646972@1] r2 : 00000000ffdedb28 58835 r3 : 0000000000000000 --------
[2020/2/24 16:11:46] [ 4374.654472@1] r4 : 0000000000000000 -------- r5 : 00000000ffdeddb4 58835
[2020/2/24 16:11:46] [ 4374.661977@1] r6 : 00000000ffdedb28 58835 r7 : 00000000ee6e8e80 458d7
[2020/2/24 16:11:46] [ 4374.669480@1] r8 : 00000000ee7c8008 40c84 r9 : 0000000000000001 --------
[2020/2/24 16:11:46] [ 4374.676987@1] r10: 00000000aaf1a008 574ad r11: 00000000ffdedc7f 58835
[2020/2/24 16:11:46] [ 4374.684487@1] r12: 00000000ee7c7f20 3937 r13: 00000000ffdedb08 58835
[2020/2/24 16:11:46] [ 4374.691991@1] r14: 00000000ee7c0bb3 4a2d7 pc : 00000000ee7c143c 4a2d6
[2020/2/24 16:11:46] [ 4374.699332@1] sp : 0000000000000000 --------
[2020/2/24 16:11:46] [ 4374.703727@1]
[2020/2/24 16:11:46] [ 4374.705383@1] CPU: 1 PID: 9092 Comm: fsck.exfat Tainted: G O 3.14.29-00013-g6143217-dirty #10
[2020/2/24 16:11:46] [ 4374.714703@1] task: ffffffc013f4a4c0 ti: ffffffc00cccc000 task.ti: ffffffc00cccc000
[2020/2/24 16:11:46] [ 4374.722286@1] PC is at 0xee7c143c
[2020/2/24 16:11:46] [ 4374.725578@1] LR is at 0xee7c0bb3
[2020/2/24 16:11:46] [ 4374.728836@1] vma for ee7c143c:
[2020/2/24 16:11:46] [ 4374.731935@1] ee7be000-ee7c6000 r-xp 00000000 b3:0c 1394 /system/lib/libexfat.so
[2020/2/24 16:11:46] [ 4374.739265@1] vma for ee7c0bb3:
[2020/2/24 16:11:46] [ 4374.742370@1] ee7be000-ee7c6000 r-xp 00000000 b3:0c 1394 /system/lib/libexfat.so
[2020/2/24 16:11:46] [ 4374.749701@1] pc : [<00000000ee7c143c>] lr : [<00000000ee7c0bb3>] pstate: 000f0030
[2020/2/24 16:11:46] [ 4374.757225@1] sp : 00000000ffdedb08
[2020/2/24 16:11:46] [ 4374.760656@1] x12: 00000000ee7c7f20
[2020/2/24 16:11:46] [ 4374.764181@1] x11: 00000000ffdedc7f x10: 00000000aaf1a008
[2020/2/24 16:11:46] [ 4374.769623@1] x9 : 0000000000000001 x8 : 00000000ee7c8008
[2020/2/24 16:11:46] [ 4374.775050@1] x7 : 00000000ee6e8e80 x6 : 00000000ffdedb28
[2020/2/24 16:11:46] [ 4374.780491@1] x5 : 00000000ffdeddb4 x4 : 0000000000000000
[2020/2/24 16:11:46] [ 4374.785916@1] x3 : 0000000000000000 x2 : 00000000ffdedb28
[2020/2/24 16:11:46] [ 4374.791357@1] x1 : 0000000000000000 x0 : 0000000000000000
[2020/2/24 16:11:46] [ 4374.796786@1]
[2020/2/24 16:11:46] [ 4374.796786@1] PC: 0xee7c13bc:
[2020/2/24 16:11:46] [ 4374.801363@1] 13bc 461d31ff f7fe4690 4604ea18 e0004638 46313001 ea7cf7fe 42b8b180 f810bf1c
[2020/2/24 16:11:46] [ 4374.809635@1] 13dc 292c1c01 5d01d1f4 d1f1293d 21004420 46423001 41f0e8bd bf28f003 e8bd4628
[2020/2/24 16:11:46] [ 4374.817915@1] 13fc b57081f0 4605460c f04f4620 f7fe31ff 4606e9f4 e0004628 46213001 ea58f7fe
[2020/2/24 16:11:46] [ 4374.826196@1] 141c 42a8b158 f810bf1c 292c1c01 5d82d1f4 b11a2101 d1ef2a2c 2100e000 bd704608
[2020/2/24 16:11:46] [ 4374.834475@1] 143c 31016901 47706101 b0c2b5b0 481c4604 4478460d 68006800 69299041 61281e48
[2020/2/24 16:11:46] [ 4374.842755@1] 145c dc072900 0103f10d f44f4628 f7fe7280 6928ea3c 6ae8b9c8 2f80f410 4620d004
[2020/2/24 16:11:46] [ 4374.851035@1] 147c f7fe4629 6ae8ea38 2f00f410 4620d008 22004629 f7fe2300 4628e96e e9f4f7fe
[2020/2/24 16:11:46] [ 4374.859314@1] 149c b1106a60 f7fe4620 4806ea2c 44789941 68006800 bf041a40 bdb0b042 e900f7fe
[2020/2/24 16:11:46] [ 4374.867597@1]
[2020/2/24 16:11:46] [ 4374.867597@1] LR: 0xee7c0b33:
[2020/2/24 16:11:46] [ 4374.872301@1] 0b30 2000f8dc 68009107 68109002 ee1ef7fe 21074814 44782201 48136805 4478682b
[2020/2/24 16:11:46] [ 4374.880446@1] 0b50 ee1af7fe aa076828 92014621 ee1af7fe 2102480e 2201682b f7fe4478 480cee0e
[2020/2/24 16:11:46] [ 4374.888726@1] 0b70 44789902 68006800 bf011a40 e8bdb003 b00340b0 f7fe4770 bf00ed9a 00007366
[2020/2/24 16:11:46] [ 4374.897006@1] 0b90 0000736a 00007352 00004b3e 00004b12 0000731a 460cb570 46204605 f7fe4616
[2020/2/24 16:11:46] [ 4374.905298@1] 0bb0 2000ee04 e9c64621 46284000 ee02f7fe b11e4606 46214628 ee02f7fe bd704630
[2020/2/24 16:11:46] [ 4374.913566@1] 0bd0 460cb510 f7fe6821 2000edfc 0000e9c4 6848bd10 bf122800 68086880 60486840
[2020/2/24 16:11:46] [ 4374.921846@1] 0bf0 f004b108 2000bb19 00004770 47f0e92d 4682b084 4614481f 44784689 68006800
[2020/2/24 16:11:46] [ 4374.930127@1] 0c10 f8da9003 f7fe0010 f10dedd0 46070808 7000f8c9 4434e000 46414620 f0009402
[2020/2/24 16:11:46] [ 4374.938406@1] 0c30 4606f82d d0022e01 9c02b1ae 9c02e003 282e7820 4650d0ef 464a4639 96004623
According to logcat, the block device had already been recognized; the failure happened while the file system was being mounted.
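The register dump already hints at why this looks like a null pointer access: the fault address is 0x00000010 and r0 is 0. The following is a minimal sketch, not part of the original investigation, that decodes the first halfword at PC. It assumes the code is little-endian Thumb (pstate 000f0030 has the T bit set), so the dump word 31016901 at 143c holds the halfword 0x6901 first. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the fields of a decoded instruction. */
struct thumb_ldr_imm {
    unsigned rt;      /* destination register */
    unsigned rn;      /* base register */
    unsigned offset;  /* byte offset added to the base */
};

/* The kernel dump prints 32-bit little-endian words, so the halfword
 * stored at the lower address is the low 16 bits of the word. */
static uint16_t first_halfword(uint32_t dump_word)
{
    return (uint16_t)(dump_word & 0xFFFFu);
}

/* Decode the 16-bit Thumb "LDR Rt, [Rn, #imm]" form
 * (bit layout: 01101 imm5 Rn Rt, byte offset = imm5 * 4).
 * Returns false for any other instruction. */
static bool decode_thumb_ldr_imm(uint16_t insn, struct thumb_ldr_imm *out)
{
    assert(out != NULL);
    if ((insn >> 11) != 0x0D)
        return false;
    out->offset = ((insn >> 6) & 0x1Fu) * 4u;
    out->rn = (insn >> 3) & 0x7u;
    out->rt = insn & 0x7u;
    return true;
}
```

Applied to 0x31016901, this yields a load into r1 from r0 plus 16. With r0 equal to 0, that load reads address 0x10, which matches the reported fault address.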
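Locating the problem
logcat contains a trace, so it can be resolved directly with addr2line.
The trace shows that the faulting call is made from the dirck function in fsck/main.c.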
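addr2line expects an address relative to the library, not the runtime address printed by the kernel. As a rough sketch, the offset can be derived from the "vma for" lines in the crash log (mapping start ee7be000, file offset 00000000). The helper name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a runtime code address inside a mapped library into an
 * offset within the library file. */
static uint32_t module_offset(uint32_t addr, uint32_t vma_start,
                              uint32_t vma_file_offset)
{
    /* On 32-bit ARM, bit 0 of a return address marks Thumb state and
     * is not part of the instruction address, so clear it. */
    uint32_t insn_addr = addr & ~UINT32_C(1);

    assert(insn_addr >= vma_start);
    return insn_addr - vma_start + vma_file_offset;
}
```

With the values from the log, PC 0xee7c143c maps to offset 0x343c and LR 0xee7c0bb3 maps to 0x2bb2. These could then be passed to something like `addr2line -f -e libexfat.so 0x343c`, where libexfat.so stands for an unstripped copy of the library from the build output (its location is not given here).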
The exfat code is also ported from the open-source project, and the version is fairly old: 0.9.5.
[2020/2/24 16:28:31] data/fsck.exfat /dev/block/vold/public\:8,1 <
[2020/2/24 16:28:31] exfatfsck 0.9.5
[2020/2/24 16:28:31] Checking file system on /dev/block/vold/public:8,1.
[2020/2/24 16:28:31] File system version 1.0
[2020/2/24 16:28:31] Sector size 512 bytes
[2020/2/24 16:28:31] Cluster size 128 KB
[2020/2/24 16:28:31] Volume size 60 GB
[2020/2/24 16:28:31] Used space 27 GB
[2020/2/24 16:28:31] Available space 32 GB
What fsck does
fsck is used to check and optionally repair one or more Linux filesystems. filesys can be a device name (e.g. /dev/hdc1, /dev/sdb2),
a mount point (e.g. /, /usr, /home), or an ext2 label or UUID specifier (e.g. UUID=8868abf6-88c5-4a83-98b8-bfc24057f7bd or
LABEL=root). Normally, the fsck program will try to handle filesystems on different physical disk drives in parallel to reduce the
total amount of time needed to check all of them
Reproducing the problem
Run fsck.exfat by hand to reproduce the problem:
fsck.exfat /dev/block/vold/public\:8,1
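Running the command above also triggers the problem.
At first I suspected the problem was tied to this particular USB drive, because the bug report mentioned a 128 GB exFAT drive and that biased me. The code was also unfamiliar at first read. The function is shown below; at a glance it uses a while loop to walk the nodes of a linked list. So I fell back on the old approach and added printf output first.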
static void dirck(struct exfat* ef, const char* path)
{
    struct exfat_node* parent;
    struct exfat_node* node;
    struct exfat_iterator it;
    int rc;
    size_t path_length;
    char* entry_path;

    if (exfat_lookup(ef, &parent, path) != 0)
        exfat_bug("directory `%s' is not found", path);
    if (!(parent->flags & EXFAT_ATTRIB_DIR))
        exfat_bug("`%s' is not a directory (0x%x)", path, parent->flags);

    path_length = strlen(path);
    entry_path = malloc(path_length + 1 + EXFAT_NAME_MAX);
    if (entry_path == NULL)
    {
        exfat_error("out of memory");
        return;
    }
    strcpy(entry_path, path);
    strcat(entry_path, "/");

    rc = exfat_opendir(ef, parent, &it);
    if (rc != 0)
    {
        free(entry_path);
        exfat_put_node(ef, parent);
        exfat_error("failed to open directory `%s'", path);
        return;
    }
    while ((node = exfat_readdir(ef, &it)))
    {
        exfat_get_name(node, entry_path + path_length + 1, EXFAT_NAME_MAX);
        exfat_debug("%s: %s, %"PRIu64" bytes, cluster %u", entry_path,
                IS_CONTIGUOUS(*node) ? "contiguous" : "fragmented",
                node->size, node->start_cluster);
        if (node->flags & EXFAT_ATTRIB_DIR)
        {
            directories_count++;
            dirck(ef, entry_path);
        }
        else
            files_count++;
        nodeck(ef, node);
        exfat_put_node(ef, node);
    }
    exfat_closedir(ef, &it);
    exfat_put_node(ef, parent);
    free(entry_path);
}
The original code already contains a debug print, but it is not enabled:
exfat_debug("%s: %s, %"PRIu64" bytes, cluster %u", entry_path,
node->is_contiguous ? "contiguous" : "fragmented",
node->size, node->start_cluster);
After enabling it by hand, rebuild in the Android source tree with source && lunch && mm fsck.exfat,
then run the resulting binary to experiment:
[2020/2/24 16:28:33] entry_path:/.Spotlight-V100/Store-V2/FC80D3EA-FD51-447F-BB4E-1EB5187351D0/journals.live_system/retire.5: fragmented, 0 bytes, cluster 0
[2020/2/24 16:28:33] entry_path:/.Spotlight-V100/Store-V2/FC80D3EA-FD51-447F-BB4E-1EB5187351D0/reverseStore.updates: contiguous, 2 bytes, cluster 56848
[2020/2/24 16:28:33] entry_path:/.Spotlight-V100/Store-V2/FC80D3EA-FD51-447F-BB4E-1EB5187351D0/journals.scan: contiguous, 131072 bytes, cluster 56849
[2020/2/24 16:28:33] entry_path:/.Spotlight-V100/Store-V2/FC80D3EA-FD51-447F-BB4E-1EB5187351D0/journals.scan/retire.810: fragmented, 0 bytes, cluster 0
[2020/2/24 16:28:33] entry_path:/.Spotlight-V100/Store-V2/FC80D3EA-FD51-447F-BB4E-1EB5187351D0/tmp.spotlight.loc: contiguous, 53117 bytes, cluster 56850
[2020/2/24 16:28:33] entry_path:/.Spotlight-V100/lpk.dll: contiguous, 46592 bytes, cluster 56851
[2020/2/24 16:28:33] entry_path:/.fseventsd: contiguous, 131072 bytes, cluster 56852
[2020/2/24 16:28:33] entry_path:/.fseventsd/fseventsd-uuid: contiguous, 36 bytes, cluster 56853
[2020/2/24 16:28:33] entry_path:/.fseventsd/lpk.dll: contiguous, 46592 bytes, cluster 56854
[2020/2/24 16:28:33] entry_path:/新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹新建文件夹: contiguous, 131072 bytes, cluster 56855
[2020/2/24 16:28:33] Segmentation fault
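The run stopped right at this folder, so I suspected that the folder itself was special. Its name looks too long, and the existing macro EXFAT_NAME_MAX is defined as 256. I ran a few more experiments to confirm:
1. Copy the folder to another exFAT USB drive and check whether the problem occurs: it can be reproduced.
2. Shorten the folder's path on the original USB drive and check whether the problem occurs: it cannot be reproduced.
3. Change EXFAT_NAME_MAX to 512, rebuild, and run: the problem cannot be reproduced.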
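These results are consistent with the name needing more bytes in UTF-8 than its character count suggests. The sketch below is an illustration, not code from libexfat: it counts the UTF-8 bytes needed for a name stored as UTF-16 code units, using one "新建文件夹" segment of the folder name as sample data. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one 16-bit code unit as UTF-8.
 * Surrogate halves are counted as 3 bytes each, which overestimates
 * (a surrogate pair really needs 4 bytes in total), so the result is
 * an upper bound for names containing them. */
static size_t utf8_bytes_for_unit(uint16_t unit)
{
    if (unit < 0x80)
        return 1;
    if (unit < 0x800)
        return 2;
    return 3;
}

/* Upper bound on the UTF-8 length (without terminator) of a name
 * given as UTF-16 code units. */
static size_t utf8_length(const uint16_t *name, size_t units)
{
    size_t total = 0;

    assert(name != NULL);
    for (size_t i = 0; i < units; i++)
        total += utf8_bytes_for_unit(name[i]);
    return total;
}

/* "新建文件夹" as UTF-16 code units. */
static const uint16_t new_folder[] = {
    0x65B0, 0x5EFA, 0x6587, 0x4EF6, 0x5939
};
```

Each of these five characters takes 3 bytes, so the segment needs 15 bytes. By the same arithmetic, a name of 86 or more such characters needs at least 258 bytes and no longer fits in a 256-byte name buffer.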
Looking at the latest upstream code, this problem has already been fixed there:
https://github.com/relan/exfat/releases
SHA-1: 9f1954ed36a92d02f610919a83e4ae06e10c20c3
* Fix memory leak on an error handling path in fsck.
==========================
SHA-1: e62b9699508e26f3865918bb75bef199cb5627df
* Remove buffer size argument for exfat_get_name().
The output buffer is always UTF8_BYTES(EXFAT_NAME_MAX)+1 characters. No
need to repeat this every time.
==========================
SHA-1: 088ea8a362ab77d93699b7da72d9b1abf164b59c
* Reduce the sizes of name buffers.
EXFAT_NAME_MAX is the number of 16-bit code units, not Unicode
characters. When converting to UTF-8, 3 bytes are enough to keep any
Unicode character encoded by a 16-bit code unit.
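Following the commit messages above, the path buffer in dirck can be sized for the worst case. This is only a sketch under assumptions: the UTF8_BYTES definition is inferred from the "3 bytes are enough" remark, EXFAT_NAME_MAX uses the value 256 quoted earlier in this post, and entry_path_size is a hypothetical helper, not an upstream function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXFAT_NAME_MAX 256        /* value quoted in this post */
#define UTF8_BYTES(c) ((c) * 3)   /* assumed: 3 bytes per 16-bit code unit */

/* Size of a buffer that can hold: parent path, '/', the longest
 * possible entry name in UTF-8, and the terminating NUL. */
static size_t entry_path_size(size_t path_length)
{
    /* Guard against size_t overflow before adding. */
    assert(path_length <= SIZE_MAX - UTF8_BYTES(EXFAT_NAME_MAX) - 2);
    return path_length + 1 + UTF8_BYTES(EXFAT_NAME_MAX) + 1;
}
```

For the root directory with an empty parent path, this gives 770 bytes instead of the 257 bytes allocated by the 0.9.5 code shown earlier.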