Hi Dan Carpenter, Indeed, the behavior looks incorrect. I will look at it deeper and submit a fix. Best regards, Tigran. ----- Original Message ----- > From: "Dan Carpenter" <dan.carpenter@xxxxxxxxxx> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx> > Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx> > Sent: Thursday, 28 August, 2025 13:08:13 > Subject: [bug report] pNFS/flexfiles: don't attempt pnfs on fatal DS errors > Hello Tigran Mkrtchyan, > > Commit f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS > errors") from Jun 27, 2025 (linux-next), leads to the following > Smatch static checker warning: > > fs/nfs/flexfilelayout/flexfilelayout.c:880 ff_layout_pg_init_read() > error: uninitialized symbol 'ds_idx'. > > We recently changed ff_layout_choose_ds_for_read() from returning NULL to > returning error pointers. There are two bugs. One error path in > ff_layout_choose_ds_for_read() accidentally returns success. And some > of the callers are still checking for NULL instead of error pointers. > > Btw, I'm always promoting my blog about error pointers and NULL: > https://staticthinking.wordpress.com/2022/08/01/mixing-error-pointers-and-null/ > It's not really related here, since we should not mix error pointers > and NULL but I still link to it because that's what they taught me in > my Trump Wealth Institute course on Search Engine Optimization. > > Here is what ff_layout_choose_ds_for_read() looks like: > fs/nfs/flexfilelayout/flexfilelayout.c > 758 static struct nfs4_pnfs_ds * > 759 ff_layout_choose_ds_for_read(struct pnfs_layout_segment *lseg, > 760 u32 start_idx, u32 *best_idx, > 761 bool check_device) > 762 { > 763 struct nfs4_ff_layout_segment *fls = FF_LAYOUT_LSEG(lseg); > 764 struct nfs4_ff_layout_mirror *mirror; > 765 struct nfs4_pnfs_ds *ds = ERR_PTR(-EAGAIN); > 766 u32 idx; > 767 > 768 /* mirrors are initially sorted by efficiency */ > 769 for (idx = start_idx; idx < fls->mirror_array_cnt; idx++) { > 770 mirror = FF_LAYOUT_COMP(lseg, idx); > 771 ds = nfs4_ff_layout_prepare_ds(lseg, mirror, false); > 772 if (IS_ERR(ds)) > 773 continue; > 774 > 775 if (check_device && > 776 > nfs4_test_deviceid_unavailable(&mirror->mirror_ds->id_node)) > 777 continue; > > Bug #1: If we hit this continue on the last iteration through the loop > then we return success and *best_idx is not initialized. It should be: > > if (check_device && > nfs4_test_deviceid_unavailable(&mirror->mirror_ds->id_node)) { > ds = ERR_PTR(-EINVAL); > continue; > } > > 778 > 779 *best_idx = idx; > 780 break; > 781 } > 782 > 783 return ds; > 784 } > > Bug #2: The whole rest of the call tree expects NULL and not error pointers. > For example, ff_layout_get_ds_for_read(): > > fs/nfs/flexfilelayout/flexfilelayout.c > 812 static struct nfs4_pnfs_ds * > 813 ff_layout_get_ds_for_read(struct nfs_pageio_descriptor *pgio, > 814 u32 *best_idx) > 815 { > 816 struct pnfs_layout_segment *lseg = pgio->pg_lseg; > 817 struct nfs4_pnfs_ds *ds; > 818 > 819 ds = ff_layout_choose_best_ds_for_read(lseg, pgio->pg_mirror_idx, > 820 best_idx); > 821 if (ds || !pgio->pg_mirror_idx) > ^^ > 822 return ds; > > This needs to check for error pointers. But also if pgio->pg_mirror_idx > is zero, is that a success return? I don't know... > > 823 return ff_layout_choose_best_ds_for_read(lseg, 0, best_idx); > 824 } > > fs/nfs/flexfilelayout/flexfilelayout.c > 842 static void > 843 ff_layout_pg_init_read(struct nfs_pageio_descriptor *pgio, > 844 struct nfs_page *req) > 845 { > 846 struct nfs_pgio_mirror *pgm; > 847 struct nfs4_ff_layout_mirror *mirror; > 848 struct nfs4_pnfs_ds *ds; > 849 u32 ds_idx; > 850 > 851 if (NFS_SERVER(pgio->pg_inode)->flags & > 852 (NFS_MOUNT_SOFT|NFS_MOUNT_SOFTERR)) > 853 pgio->pg_maxretrans = io_maxretrans; > 854 retry: > 855 pnfs_generic_pg_check_layout(pgio, req); > 856 /* Use full layout for now */ > 857 if (!pgio->pg_lseg) { > 858 ff_layout_pg_get_read(pgio, req, false); > 859 if (!pgio->pg_lseg) > 860 goto out_nolseg; > 861 } > 862 if (ff_layout_avoid_read_on_rw(pgio->pg_lseg)) { > 863 ff_layout_pg_get_read(pgio, req, true); > 864 if (!pgio->pg_lseg) > 865 goto out_nolseg; > 866 } > 867 /* Reset wb_nio, since getting layout segment was successful */ > 868 req->wb_nio = 0; > 869 > 870 ds = ff_layout_get_ds_for_read(pgio, &ds_idx); > 871 if (!ds) { > ^^^^^^^^^^ > This doesn't check for error pointers either. > > 872 if (!ff_layout_no_fallback_to_mds(pgio->pg_lseg)) > 873 goto out_mds; > 874 pnfs_generic_pg_cleanup(pgio); > 875 /* Sleep for 1 second before retrying */ > 876 ssleep(1); > 877 goto retry; > 878 } > 879 > --> 880 mirror = FF_LAYOUT_COMP(pgio->pg_lseg, ds_idx); > ^^^^^^ > Uninitialized. > > > regards, > dan carpenter
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature