Hi Christoph,

I am not sure who is the proper person to ask about this issue, but I see your patches in the questionable code path and I know you are always helpful, so I am asking you here. Let me describe the problem I encountered.

1, There is an 8-disk raid5 with 64K chunk size on my machine. I observe that /sys/block/md0/queue/optimal_io_size is a very large value, which isn't a reasonable size IMHO.

2, It comes from drivers/scsi/mpt3sas/mpt3sas_scsih.c,

11939 static const struct scsi_host_template mpt3sas_driver_template = {
11940         .module                         = THIS_MODULE,
11941         .name                           = "Fusion MPT SAS Host",
11942         .proc_name                      = MPT3SAS_DRIVER_NAME,
11943         .queuecommand                   = scsih_qcmd,
11944         .target_alloc                   = scsih_target_alloc,
11945         .sdev_init                      = scsih_sdev_init,
11946         .sdev_configure                 = scsih_sdev_configure,
11947         .target_destroy                 = scsih_target_destroy,
11948         .sdev_destroy                   = scsih_sdev_destroy,
11949         .scan_finished                  = scsih_scan_finished,
11950         .scan_start                     = scsih_scan_start,
11951         .change_queue_depth             = scsih_change_queue_depth,
11952         .eh_abort_handler               = scsih_abort,
11953         .eh_device_reset_handler        = scsih_dev_reset,
11954         .eh_target_reset_handler        = scsih_target_reset,
11955         .eh_host_reset_handler          = scsih_host_reset,
11956         .bios_param                     = scsih_bios_param,
11957         .can_queue                      = 1,
11958         .this_id                        = -1,
11959         .sg_tablesize                   = MPT3SAS_SG_DEPTH,
11960         .max_sectors                    = 32767,
11961         .max_segment_size               = 0xffffffff,
11962         .cmd_per_lun                    = 128,
11963         .shost_groups                   = mpt3sas_host_groups,
11964         .sdev_groups                    = mpt3sas_dev_groups,
11965         .track_queue_depth              = 1,
11966         .cmd_size                       = sizeof(struct scsiio_tracker),
11967         .map_queues                     = scsih_map_queues,
11968         .mq_poll                        = mpt3sas_blk_mq_poll,
11969 };

At line 11960, max_sectors of the mpt3sas driver is defined as 32767.
Then in drivers/scsi/scsi_transport_sas.c, at line 241 inside sas_host_setup(), shost->opt_sectors is assigned 32767 by the following code,

240         if (dma_dev->dma_mask) {
241                 shost->opt_sectors = min_t(unsigned int, shost->max_sectors,
242                                 dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
243         }

Then in drivers/scsi/sd.c, inside sd_revalidate_disk(), from the following code,

3785         /*
3786          * Limit default to SCSI host optimal sector limit if set. There may be
3787          * an impact on performance for when the size of a request exceeds this
3788          * host limit.
3789          */
3790         lim.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
3791         if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
3792                 lim.io_opt = min_not_zero(lim.io_opt,
3793                                 logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
3794         }

lim.io_opt of all my sata disks attached to the mpt3sas HBA is 32767 sectors (16776704 bytes), because of the above code block.

Then, when my raid5 array sets its queue limits, its own io_opt is 64KiB*7, while each raid component sata hard drive has an io_opt of 32767 sectors. By the calculation in block/blk-settings.c:blk_stack_limits() at line 753,

753         t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);

the resulting optimal_io_size of my raid5 array is more than 1GiB. That is too large.

I know the purpose of lcm_not_zero() is to get an I/O size that is optimal for both the raid device and the underlying component devices, but the resulting io_opt here is bigger than 1 GiB, which is too big.

Personally I feel uncomfortable that max_sectors is used as opt_sectors in sas_host_setup(), but I don't know a better way to improve it. Currently I just modified mpt3sas_driver_template's max_sectors from 32767 to 64, and observed a 5~10% sequential write performance improvement (direct io) for my raid5 device with fio. So there should be something to fix.

Can you take a look, or give me some hint for a fix?

Thanks in advance.

Coly Li

-- 
Coly Li