Re: Support for transferring sparse files via scp/sftp correctly?

Sorry for the mis-send earlier. Here’s the complete message I meant to send.

On Mar 6, 2025, at 9:38 PM, Damien Miller <djm@xxxxxxxxxxx> wrote:
> If you want this to happen, I recommend starting by figuring out what
> protocol extensions need to be made, and how to support sparse files
> on system without SEEK_DATA/HOLE - it should be pretty to do this on
> upload without these flags and without extensions.


I was inspired by this thread to add sparse file support to AsyncSSH, on OSes that support SEEK_DATA and SEEK_HOLE. It looks like I should also be able to get this to work on Windows with FSCTL_QUERY_ALLOCATED_RANGES and FSCTL_SET_SPARSE, but I haven’t gotten to that yet.
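To illustrate the SEEK_DATA/SEEK_HOLE approach, here is a minimal sketch of enumerating the data ranges of a local file. It is not taken from AsyncSSH; the function name data_ranges and its parameters are made up for this example, and it assumes a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE.

```python
import errno
import os


def data_ranges(fd, offset=0, length=None):
    """Yield (offset, length) pairs for the data regions of an open file.

    Hypothetical helper for illustration. Note that os.lseek() moves the
    file position of fd as a side effect.
    """
    size = os.fstat(fd).st_size
    end = size if length is None else min(offset + length, size)
    pos = offset
    while pos < end:
        try:
            # Find the start of the next data region at or after pos.
            start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:  # only a hole remains after pos
                break
            raise
        if start >= end:
            break
        # Find where that data region ends (EOF counts as a hole).
        stop = min(os.lseek(fd, start, os.SEEK_HOLE), end)
        yield start, stop - start
        pos = stop
```

On a filesystem that does not track holes, this should simply report the whole file as a single data range, so a caller can treat both cases the same way.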

As Darren Tucker said, the put() operation here can be made to work with any SFTP server. However, an SFTP extension is required to support this for get() or copy(), or for the case where the copy-data extension is used to copy data between files on a remote server without reading it and writing it back over the wire.
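The reason put() needs no extension is that the sender already knows where the data is and can write each range at its own offset. A rough sketch of that idea, using two local file descriptors as a stand-in for the SFTP connection (the name copy_data_ranges and its parameters are assumptions for this example; a real client would issue FXP_WRITE requests at these offsets and set the final size with something like FXP_FSETSTAT):

```python
import os


def copy_data_ranges(src_fd, dst_fd, ranges, size, block_size=65536):
    """Copy only the listed (offset, length) ranges from src_fd to dst_fd.

    Hypothetical helper: the regions between ranges are never written,
    so the destination filesystem is free to leave them as holes.
    """
    for offset, length in ranges:
        end = offset + length
        while offset < end:
            chunk = os.pread(src_fd, min(block_size, end - offset), offset)
            if not chunk:  # source is shorter than the range claims
                break
            # An SFTP client would send a write at this same offset.
            offset += os.pwrite(dst_fd, chunk, offset)
    # Extend the destination over any trailing hole so the sizes match.
    os.ftruncate(dst_fd, size)
```

Whether the skipped regions actually end up as holes depends on the receiving filesystem; the file contents are the same either way.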

I’ve defined an extension called "ranges@xxxxxxxxxxxx", modeled somewhat after FXP_READDIR, for getting the valid data ranges in a remote file. Each call can return multiple ranges, but on files with a large number of ranges you may need to send this request multiple times to get the complete list. This allows the copying to be interleaved with getting back range responses.

The request looks like the following:

    uint32          id
    string          "ranges@xxxxxxxxxxxx"
    string          handle
    uint64          offset
    uint64          length

This requests valid data ranges in the file associated with the request handle. The offset and length specify the portion of the file which the ranges should be returned for. The response looks like:

    uint32          id
    uint32          count
    repeats count times:
        uint64          offset
        uint64          length
    bool            end-of-list [optional]

The count specifies the number of ranges in the reply. The ranges are followed by an optional bool which indicates whether there are any more valid data ranges within the request’s offset and length. If there are no ranges at all within the requested portion of the file, an FXP_STATUS of FX_EOF should be sent.
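As a concrete reading of the two layouts above, here is a small sketch that packs the request fields and unpacks the reply fields using the usual SSH encodings (big-endian integers, length-prefixed strings, one-byte bool). The function names are invented for this example, the extension name is passed in by the caller because it is obscured in the archived message, and the outer SFTP framing (packet length and type byte) is deliberately left out.

```python
import struct


def ssh_string(data: bytes) -> bytes:
    """Encode an SSH string: uint32 length followed by the bytes."""
    return struct.pack(">I", len(data)) + data


def encode_ranges_request(req_id, ext_name, handle, offset, length):
    """Pack id, extension name, handle, offset and length, in that order."""
    return (struct.pack(">I", req_id) + ssh_string(ext_name) +
            ssh_string(handle) + struct.pack(">QQ", offset, length))


def decode_ranges_reply(payload):
    """Unpack id, count, the (offset, length) pairs and the optional bool.

    Returns (id, ranges, end_of_list), where end_of_list is None when
    the optional trailing byte is absent.
    """
    req_id, count = struct.unpack_from(">II", payload, 0)
    pos = 8
    ranges = []
    for _ in range(count):
        ranges.append(struct.unpack_from(">QQ", payload, pos))
        pos += 16
    end_of_list = bool(payload[pos]) if pos < len(payload) else None
    return req_id, ranges, end_of_list
```

Returning None for a missing bool keeps "not sent" distinct from "sent as false", which matters because the field is optional.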

If you don’t get all of the requested ranges in a single request, additional requests can be sent starting at just past the end of the last range previously returned.
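The follow-up requests could be driven by a loop along these lines. This is only a sketch: request_ranges is a hypothetical callable standing in for one round trip to the server, an empty list stands in for the FX_EOF status, and it assumes a true end-of-list bool means no further ranges remain (None means the bool was not sent).

```python
def collect_ranges(request_ranges, offset, length):
    """Gather all data ranges in [offset, offset + length).

    request_ranges(offset, length) is a hypothetical callable returning
    (ranges, end_of_list) for a single request.
    """
    end = offset + length
    result = []
    while offset < end:
        ranges, end_of_list = request_ranges(offset, end - offset)
        if not ranges:  # stands in for an FX_EOF status reply
            break
        result.extend(ranges)
        # Resume just past the end of the last range returned.
        last_offset, last_length = ranges[-1]
        offset = last_offset + last_length
        if end_of_list:  # assumed meaning: nothing more to fetch
            break
    return result
```

A real client would probably start copying each batch as it arrives rather than collecting the full list first, which is the interleaving mentioned earlier.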

What do you think?
-- 
Ron Frederick
ronf@xxxxxxxxxxxxx



_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev



