TAFT

Introduction

TAFT, the Transport Agnostic File Transfer protocol, provides easy programmatic access to remote files. It is intended as a substitute for the FTP protocol, which is highly complex and has many well-known problems. TAFT is a stateless, sessionless protocol that has simplicity as its main design goal.

While TAFT is not a network protocol, it is anticipated that it will be implemented atop a secure network protocol, which is almost certainly HTTPS. The HTTPS protocol already knows how to upload and download files, so the role of TAFT is merely to specify exactly how this is done.

The server and the client

Files are provided by a TAFT server and accessed by a TAFT client. The server makes available a collection of files and directories arranged in a hierarchy; the client downloads the files and lists the directories. Optionally, the server may allow the client to upload, replace, rename, move, and delete both files and directories.

The client and the server communicate by exchanging messages. The syntax and semantics of the messages are defined by the protocol, but TAFT does not define a mechanism for transmitting the messages, and this is the sense in which it is transport agnostic.

Protocol versions

Each version of the TAFT protocol is identified by a positive integer that increases by one with each revision of the protocol. To date there has only been one version of the TAFT protocol and it is version 1.

The client and the server must agree on the version of the protocol that they will use. Communication is always initiated by the client, which must tell the server which version of the protocol it wants to use. If the server does not support the version requested by the client, the server will respond with a message that indicates an error.

Public access, private access, and access keys

A TAFT server has the option of providing wide open public access or restricted private access, or both. To access the public files provided by a server, the client does not have to do anything special. To access the private files of a server, the client must supply an access key.

The access key is the only means by which the server knows the identity of the client. The access key is a shared secret of the client and the server and must be guarded as carefully as would a password or a private encryption key. If the client supplies an access key, and if the access key is known to the server and is considered by the server to be valid, the server allows the client to access the private files. The view of the file hierarchy, and the actions that the client is allowed to perform within it, are determined by the server based on the access key.

As far as the protocol is concerned, the access key is simply a non-empty string; the protocol places no restrictions on the length or the content of the string. It is recommended that the access key be generated from a secure source of random data and that its length be at least 16 characters.

The protocol does not specify how access keys are created, managed, or stored, what their lifetimes are, or how they are bound to user identities or permissions. All of these details are left up to the implementation.

Messages

The client and the server communicate by exchanging messages. A message that is sent from the client to the server is called a request, while a message that is sent from the server to the client is called a response.

Message exchange is always initiated by the client. For each request that it receives from a client, the server sends a single response back to the client. In other words, there is a one-to-one correspondence between requests and responses, and the server never sends a response without first receiving a request from the client.

Each message has two parts: the head, which is mandatory, and the body, which is optional. The head consists of a single JSON object in plain text, while the body, if present, consists of file content encoded in Base64.

In a request, the head describes what the client is asking the server to do. If the request includes a command to upload a file, then the body is present and supplies the content of the uploaded file. In a response, the head indicates the success or failure of the request, and includes any metadata asked for by the request. If the request includes a command to download a file, then the body is present and supplies the content of the downloaded file.

Requests

The head of a request includes a mandatory property named command and will usually include a property named version. Here is an example of a typical request head:

{
  "version" : 1,
  "command" : "download",
  "path"    : "/data/2022/final.csv"
}

The value of the version property is a positive integer stating the version of the protocol that the client wants to use. The value of the command property is a string containing the name of the command that the client is asking the server to perform. Each command has a list of zero or more additional properties that serve as arguments to the command, some mandatory and some optional. For example, the download command shown above requires the path property, but also has some additional optional properties.

Here is an example of a request that has both a head and a body:

{
  "version" : 1,
  "command" : "upload",
  "path"    : "/test/jabberwocky.txt"
}
4oCZVHdhcyBicmlsbGlnLCBhbmQgdGhlIHNsaXRoeSB0b3ZlcwogICAgICBEaWQgZ3lyZSBhbmQg
Z2ltYmxlIGluIHRoZSB3YWJlOgpBbGwgbWltc3kgd2VyZSB0aGUgYm9yb2dvdmVzLAogICAgICBB
bmQgdGhlIG1vbWUgcmF0aHMgb3V0Z3JhYmUuCgrigJxCZXdhcmUgdGhlIEphYmJlcndvY2ssIG15
IHNvbiEKICAgICAgVGhlIGphd3MgdGhhdCBiaXRlLCB0aGUgY2xhd3MgdGhhdCBjYXRjaCEKQmV3
YXJlIHRoZSBKdWJqdWIgYmlyZCwgYW5kIHNodW4KICAgICAgVGhlIGZydW1pb3VzIEJhbmRlcnNu
YXRjaCHigJ0KCkhlIHRvb2sgaGlzIHZvcnBhbCBzd29yZCBpbiBoYW5kOwogICAgICBMb25nIHRp
bWUgdGhlIG1hbnhvbWUgZm9lIGhlIHNvdWdodOKAlApTbyByZXN0ZWQgaGUgYnkgdGhlIFR1bXR1
bSB0cmVlCiAgICAgIEFuZCBzdG9vZCBhd2hpbGUgaW4gdGhvdWdodC4KCkFuZCwgYXMgaW4gdWZm
aXNoIHRob3VnaHQgaGUgc3Rvb2QsCiAgICAgIFRoZSBKYWJiZXJ3b2NrLCB3aXRoIGV5ZXMgb2Yg
ZmxhbWUsCkNhbWUgd2hpZmZsaW5nIHRocm91Z2ggdGhlIHR1bGdleSB3b29kLAogICAgICBBbmQg
YnVyYmxlZCBhcyBpdCBjYW1lIQo=

The body is a block of Base64-encoded data. The body follows the head, with optional white space between the two. In this example, the body is the first four verses of the Lewis Carroll poem Jabberwocky, which the upload command is asking the server to save in a file named by the path property.

The message formatting shown here is illustrative only. The JSON object can be formatted in any style, and the Base64 block can have any amount of leading, trailing, or interior white space.
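
As an illustration of how tolerant a receiver has to be about formatting, the following Python sketch splits a message into its head and body. It assumes that the whole message has already been received as a single text string; how that happens depends on the transport and is outside the protocol. The function name parse_message is hypothetical.

```python
import base64
import json


def parse_message(text):
    """Split a message into its head (a dict) and its body (bytes or None)."""
    stripped = text.lstrip()
    # raw_decode parses one JSON value and reports the index where it ended,
    # so whatever follows can be treated as the Base64 body.
    head, end = json.JSONDecoder().raw_decode(stripped)
    if not isinstance(head, dict):
        raise ValueError("the head must be a JSON object")
    # The Base64 block may contain leading, trailing, or interior
    # white space, so all of it is removed before decoding.
    encoded = "".join(stripped[end:].split())
    if not encoded:
        return head, None
    return head, base64.b64decode(encoded, validate=True)


message = '{ "version": 1, "command": "upload", "path": "/t.txt" }\n  aGVs\n  bG8=\n'
head, body = parse_message(message)
print(head["command"], body)
```

A message without a body yields None as the second element of the result.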

The commands are listed later, and each one is explained in detail.

Responses

As with a request, a response consists of a mandatory head and an optional body. The head is a plaintext JSON object that has a mandatory property named status. The value of this property is a string that is "Success" after a successful request or a short error message after a failed request.

Paths

The name of a file or directory is called a path. A path always appears as the value of a JSON property and is always a non-empty string. Here are some examples of paths:

"/"
"/README.txt"
"/rabbit.jpg"
"/data"
"/data/annual"
"/data/annual/2024"
"/data/annual/2024/Final Summary.csv"

A path begins with a slash and is followed by zero or more components separated by slashes. Each component of a path is a string of one or more characters, with all characters allowed except for the slash and the NUL character. White space before and after a path is ignored, but white space within a path is taken verbatim as part of the path. Letter case is significant.

All paths are absolute. Relative paths are not supported. There is no concept of a current directory or a working directory. Wildcards are not supported.

All path components are treated literally as the names of files or directories. Path components that have special meanings in other contexts, such as "~", "." and "..", do not receive special treatment. The only way to traverse the file hierarchy is to descend by one level for each component of the path.

Commands

This table provides an exhaustive list of the commands provided by version 1 of the protocol. Each command is explained in detail in a section of its own below.

Command   Purpose
hello     Causes the server to describe itself.
list      Lists the contents of a directory or describes a single file.
download  Downloads a file.
upload    Uploads a file.
rename    Changes the name of a file or directory.
move      Moves a file or subdirectory from one directory to another.
delete    Deletes a file.
mkdir     Creates a directory.
rmdir     Deletes a directory and its contents, recursively.

Protocol levels

The protocol defines four levels, with each level expanding upon the capabilities of the level beneath it.

Level zero. The client or server provides no meaningful services at all, just the following command:

hello

Level one. The client or server is capable of listing directories and downloading files. To be level-one compliant, the client or server must implement at least the following commands:

hello
list
download

Level two. The client or server also supports file upload. To be level-two compliant, the client or server must implement all of the commands from level one plus at least the following command:

upload

Level three. The client or server also supports the management of files and directories. To be level-three compliant, the client or server must implement all of the commands from levels one and two plus all of the following commands:

rename
move
delete
mkdir
rmdir
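
The levels can be expressed as cumulative sets of commands. The Python sketch below is one possible way to work out the level of an implementation from the commands it provides; the names LEVEL_COMMANDS and compliance_level are hypothetical and are not part of the protocol.

```python
# Commands required at each level, taken from the lists above.
LEVEL_COMMANDS = {
    0: {"hello"},
    1: {"hello", "list", "download"},
    2: {"hello", "list", "download", "upload"},
    3: {"hello", "list", "download", "upload",
        "rename", "move", "delete", "mkdir", "rmdir"},
}


def compliance_level(implemented):
    """Return the highest level whose required commands are all implemented."""
    implemented = set(implemented)
    if not LEVEL_COMMANDS[0] <= implemented:
        raise ValueError("the hello command is required at every level")
    return max(level for level, commands in LEVEL_COMMANDS.items()
               if commands <= implemented)


print(compliance_level({"hello", "list", "download"}))
```

An implementation that offers, say, rename but not upload is still only at level one, because each level requires all of the commands of the levels beneath it.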

Timestamps

When asked to list files, the server includes a timestamp as part of the description of each file or directory that is listed. Timestamps are in UTC. The server acts as the sole timekeeper. The client is never asked to supply a timestamp. This means that timestamps are never converted from local times to UTC, and there is no need to store zone information with the timestamps. The client can convert UTC times to local times for local display purposes, using the local timezone offset that was in effect at the time represented by the timestamp.
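
A client might convert a timestamp for display along the following lines. This Python sketch assumes that timestamps always have the form used in the examples in this document, with whole seconds and a trailing Z; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_taft_time(text):
    """Parse a timestamp such as "2024-11-03T18:04:33Z" as an aware UTC time."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def to_local(utc_time, tz=None):
    """Convert to tz, or to the system's local zone when tz is None."""
    return utc_time.astimezone(tz)


stamp = parse_taft_time("2024-11-03T18:04:33Z")
# A fixed offset of UTC-5 is used here only to make the example repeatable.
print(to_local(stamp, timezone(timedelta(hours=-5))).isoformat())
```

Calling to_local with no zone leaves the choice of offset to the platform's own time zone data.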

The timestamp of a regular file is the date and time at which the content of the file was created or was most recently modified. The timestamp does not change if the file is renamed, or if the file is moved to another position in the hierarchy.

The timestamp of a directory is the date and time at which any file or subdirectory directly beneath it was most recently created, deleted, or renamed.

Note that files and directories can be manipulated by the server operator without necessarily going through the protocol, and that these manipulations can cause timestamps to change.

Read the head

The first step in processing a request is to read the head, which is a JSON object in plain text. The JSON is parsed, and if there are any lexical or syntactic errors in the JSON object, the server must respond with this status:

Malformed request head

If the JSON object is parsed successfully, then analysis proceeds to check the structure and content of the head object.

Processing the command property

Every request must include in its head a property named command that specifies the action that the client wants the server to perform. The server processes this property as follows:

  1. Check that the property is present. If the head does not include a property named command then the server must stop processing the request and respond with this status:
    Missing command
  2. Check that the value is a string. If the value of the command property is not a string then the server must stop processing the request and respond with this status:
    Malformed command
  3. Strip leading and trailing white space from the string. The value of the command property is allowed to arrive with leading and/or trailing white space. If any is present then remove it now.
  4. Check that the command exists. The value of the command property is compared to the following strings:
    "hello"
    "list"
    "download"
    "upload"
    "rename"
    "move"
    "delete"
    "mkdir"
    "rmdir"
    If the value is not one of these strings then the server must stop processing the request and respond with the following status:
    No such command
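
The four steps above can be sketched in Python as follows. The function name check_command and its return convention are hypothetical. Python's str.strip is used for white space removal, which is an assumption, since the protocol does not say exactly which characters count as white space.

```python
COMMANDS = {"hello", "list", "download", "upload", "rename",
            "move", "delete", "mkdir", "rmdir"}


def check_command(head):
    """Return (command, None) on success or (None, status) on failure."""
    # Step 1: the property must be present.
    if "command" not in head:
        return None, "Missing command"
    value = head["command"]
    # Step 2: the value must be a string.
    if not isinstance(value, str):
        return None, "Malformed command"
    # Step 3: strip leading and trailing white space.
    command = value.strip()
    # Step 4: the command must be one of the known commands.
    if command not in COMMANDS:
        return None, "No such command"
    return command, None


print(check_command({"command": "  list "}))
```

The comparison in step 4 is exact, so a value such as "LIST" is reported as "No such command".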

Processing the version property

A request may include in its head an optional property named version to specify the protocol version that the client wishes to use. Most requests include this property, as it is required by every command other than hello. For a command that requires this property, the server processes the property as follows:

  1. Check that the property is present. If the head does not include a property named version then the server must stop processing the request and respond with this status:
    Missing protocol version
  2. Check that the value is a positive integer. If the value of the version property is not a number, or if the value is a number that has a non-zero fraction, or if the value is an integer that is less than 1, then the server must stop processing the request and respond with this status:
    Malformed protocol version
  3. Check that the protocol version is supported by the server. If the server does not support the protocol version given as the value of the version property, including the case that the version number is higher than any existing version of the protocol, then the server must stop processing the request and respond with this status:
    Unsupported protocol version
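
A Python sketch of these three steps is given below. It assumes a server that supports only version 1; the name check_version and its return convention are hypothetical. Two details are specific to Python: JSON true and false are parsed as bool, which is a subclass of int, and a JSON number such as 1.0 is parsed as a float that has no fraction and is therefore accepted.

```python
SUPPORTED_VERSIONS = {1}  # assumption: this server speaks only version 1


def check_version(head, supported=SUPPORTED_VERSIONS):
    """Return (version, None) on success or (None, status) on failure."""
    # Step 1: the property must be present.
    if "version" not in head:
        return None, "Missing protocol version"
    value = head["version"]
    # Step 2: the value must be a number (bool is excluded explicitly)...
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None, "Malformed protocol version"
    # ...with no fraction (this also rejects NaN and infinity)...
    if isinstance(value, float):
        if not value.is_integer():
            return None, "Malformed protocol version"
        value = int(value)
    # ...and it must be at least 1.
    if value < 1:
        return None, "Malformed protocol version"
    # Step 3: the server must support the version.
    if value not in supported:
        return None, "Unsupported protocol version"
    return value, None


print(check_version({"version": 1}))
```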

Processing the accessKey property

A request may include in its head an optional property named accessKey. The presence of this property indicates that the client is attempting to access the private files, while the absence of this property indicates that the client is attempting to access the public files. The server processes this property as follows:

  1. Determine if the property is present. Neither the presence nor the absence of the accessKey property is necessarily an error, so this step is merely one of discovery.
  2. Check if public access is prohibited. If the accessKey property is absent but the server does not allow public access then the server must stop processing the request and respond with this status:
    No public access
  3. Check if public access is allowed. If the accessKey property is absent and the server allows public access then no further processing of this property is required and the remaining steps below are not performed.
  4. Check if private access is allowed. At this point it has been determined that the accessKey property is present. If the server does not allow private access then the server must stop processing the request and respond with this status:
    No private access
  5. Check that the value is a string. If the value of the accessKey property is not a string then the server must stop processing the request and respond with this status:
    Malformed access key
  6. Do not remove white space. Unlike most other property values that arrive at the server, the value of the accessKey property retains all white space verbatim.
  7. Check that the value is not empty. If the value of the accessKey property is an empty string then the server must stop processing the request and respond with this status:
    Malformed access key
  8. Check that the access key is known. If the value of the accessKey property is not known to the server then the server must stop processing the request and respond with this status:
    Access key unknown
  9. Check that the access key is acceptable. If the server is not willing to provide service to the access key holder then the server must stop processing the request and respond with this status:
    Access key rejected
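
The decision sequence above can be sketched in Python as follows. The function check_access_key and its parameters are hypothetical: the server's policy is modeled as two flags and two sets of keys, whereas a real server is free to store and evaluate keys however it likes.

```python
def check_access_key(head, allow_public, allow_private, known_keys, rejected_keys):
    """Return None if the request may proceed, or an error status string."""
    # Steps 1 to 3: an absent key means a request for public access.
    if "accessKey" not in head:
        return None if allow_public else "No public access"
    # Step 4: the key is present, so private access must be allowed.
    if not allow_private:
        return "No private access"
    key = head["accessKey"]
    # Steps 5 to 7: the key must be a non-empty string, taken verbatim.
    if not isinstance(key, str) or key == "":
        return "Malformed access key"
    # Step 8: the key must be known to the server.
    # (A plain set lookup keeps the sketch short; it is not meant as
    # guidance on how secrets should be stored or compared.)
    if key not in known_keys:
        return "Access key unknown"
    # Step 9: the server must be willing to serve this key holder.
    if key in rejected_keys:
        return "Access key rejected"
    return None


policy = dict(allow_public=True, allow_private=True,
              known_keys={"k1-secret", "k2-secret"}, rejected_keys={"k2-secret"})
print(check_access_key({"accessKey": "k2-secret"}, **policy))
```

Because white space is never stripped from the key, " k1-secret" with a leading space is a different key from "k1-secret" and is reported as unknown.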

Processing the path property

A request may include in its head the path property to specify the location of a file or directory if required by the command. While each command may apply specific additional processing to the path property, all of them begin processing the property as follows:

  1. Determine if the property is present. If the path property is not present then the server must stop processing the request and respond with this status:
    Missing path
  2. Check that the value is a string. If the value of the path property is not a string then the server must stop processing the request and respond with this status:
    Malformed path
  3. Strip leading and trailing white space from the string. The value of the path property is allowed to arrive with leading and/or trailing white space. If any is present then remove it now.
  4. Make sure the path is well-formed. If the value of the path property is an empty string, or if it does not begin with a slash, or if it contains two or more adjacent slashes, or if it ends with a slash (other than the root path "/", which consists of a single slash), then the server must stop processing the request and respond with this status:
    Malformed path
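
A Python sketch of the common path checks follows. The function name check_path is hypothetical. Two assumptions are made: the root path "/" is accepted even though it ends with a slash, because it appears among the example paths, and a path containing the NUL character is rejected, because the description of paths excludes that character from path components.

```python
def check_path(head):
    """Return (path, None) on success or (None, status) on failure."""
    # Step 1: the property must be present.
    if "path" not in head:
        return None, "Missing path"
    value = head["path"]
    # Step 2: the value must be a string.
    if not isinstance(value, str):
        return None, "Malformed path"
    # Step 3: strip leading and trailing white space.
    path = value.strip()
    # Step 4: the path must be well-formed.
    if path == "/":
        return path, None  # assumption: the root path is valid
    if (not path.startswith("/") or path.endswith("/")
            or "//" in path or "\x00" in path):
        return None, "Malformed path"
    return path, None


print(check_path({"path": "  /data/annual/2024/Final Summary.csv "}))
```

Components such as "." and ".." pass these checks unchanged, since they are treated literally as names.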
    

Processing the path property for an existing file

hello: Learning about the server

Every server must implement at least the hello command. This command is versionless, as it must be possible to query the capabilities of the server regardless of which versions of the protocol are supported by the client or the server. Therefore, the version property, if supplied in the request head, is ignored.

The head for this command is normally just this:

{ "command": "hello" }

All other properties included in the head are ignored. Here is a typical response to the hello command:

{
  "status": "Success",
  "operator": "National Bureau of Climate Studies",
  "description": "This is the public server for climate data sets provided by the National Bureau of Climate Studies. Access is free for all users. See file /README.txt for more information.",
  "public" : 1,
  "private" : 0,
  "versions" : [1, 2, 3]
}

The status of the response is always "Success", as the hello command should never fail. All of the properties shown here are always present, but the server can produce additional properties if it wants to. The operator and description properties are self-explanatory; their values are strings, and either or both of them can be an empty string or null if the server does not wish to describe itself.

The public property tells the client whether or not the server provides public access, and, if it does, at what level of the protocol. This server provides public access at level 1, meaning that it is restricted to listing and downloading files, which is typical of public servers. The private property has the same purpose but for private access; this server offers no private access, which is the meaning of level 0.

The versions property is a non-empty array of positive integers specifying the protocol versions that the server is able to speak. When requesting any command other than hello, the client must specify one of these numbers in the version property of the request head. The versions are always listed in ascending order.
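
A client could use the versions property to settle on a protocol version before sending any other command. The Python sketch below picks the highest version that both sides support; that policy and the name choose_version are assumptions, since the protocol only requires that the client choose one of the listed versions.

```python
def choose_version(hello_head, client_versions):
    """Return the highest version both sides speak, or None if there is none."""
    if hello_head.get("status") != "Success":
        return None
    common = set(hello_head["versions"]) & set(client_versions)
    return max(common) if common else None


hello_response = {"status": "Success", "public": 1, "private": 0, "versions": [1]}
print(choose_version(hello_response, client_versions=[1]))
```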

list: Listing directories

The list command is used to list the contents of a directory and to describe single files and directories. For example:

{
  "version" : 1,
  "command" : "list",
  "path"    : "/assets/images"
}

Supposing that the directory contains two files and one directory, the response might be:

{
  "status": "Success",
  "list":
  [
    {
      "type": "file",
      "name": "bunny.jpg",
      "size": 40439,
      "time": "2024-11-03T18:04:33Z"
    },
    {
      "type": "file",
      "name": "logo-64x64.png",
      "size": 8304,
      "time": "2023-06-29T06:22:58Z"
    },
    {
      "type": "directory",
      "name": "old",
      "size": 23,
      "time": "2024-11-03T18:04:33Z"
    }
  ]
}

The list property is an array containing one element for each item in the directory. Each element of the array is an object that describes a file or a directory. Each of these objects has the same four properties:

Property  Value
type      The type of the item, which is "file" for a regular file or "directory" for a directory.
name      The name of the file or directory. Only the last component of the path is given, to avoid needless repetition.
size      The size of the file in bytes, or, in the case of a directory, the number of items in the directory.
time      The time at which the file or directory was created or most recently modified, in UTC, in ISO 8601 format.

To obtain a description of a single file, or to obtain a description of a directory itself without listing its contents, include the property self with value true.

download: Downloading a file

The most basic function of the server is to download a file to the client. The command to do this is named download and it accepts a file path as an argument:

{
  "version": 1,
  "command": "download",
  "path": "/assets/images/bunny.jpg"
}

Assuming that the file exists, the response is:

{
  "status": "Success",
  "time": "2024-11-03T18:04:33Z",
  "size": 40439,
  "hash": "6d4c3bdf350ef06c60369425890a8c03da0c0e11beb8519a50206d1216116b1f"
}
/9j/4AAQSkZJRgABAAEBLAEsAAD//gAfTEVBRCBUZWNobm9sb2dpZXMgSW5jLiBW
MS4wMQD/2wCEAAkGBwgHBgkIBwgKCgkLDhgPDg0NDh0VFhEYIh4kJCIeISEmKzcu
Jig0KSEhMEEwNDk6PT49JS5DSEM8SDc8PTsBCgoKDgwOHA8PHDsnISc7Ozs7Ozs7
Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7Ozs7O//EAaIA
AAEFAQEBAQEBAAAAAAAAAAABAgMEBQYHCAkKCwEAAwEBAQEBAQEBAQAAAAAAAAEC
AwQFBgcICQoLEAACAQMDAgQDBQUEBAAAAX0BAgMABBEFEiExQQYTUWEHInEUMoGR
oQgjQrHBFVLR8CQzYnKCCQoWFxgZGiUmJygpKjQ1Njc4OTpDREVGR0hJSlNUVVZX
WFlaY2RlZmdoaWpzdHV2d3h5eoOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0
tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4eLj5OXm5+jp6vHy8/T19vf4+foRAAIB
AgQEAwQHBQQEAAECdwABAgMRBAUhMQYSQVEHYXETIjKBCBRCkaGxwQkjM1LwFWJy
0QoWJDThJfEXGBkaJicoKSo1Njc4OTpDREVGR0hJSlNUVVZXWFlaY2RlZmdoaWpz
dHV2d3h5eoKDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXG
x8jJytLT1NXW19jZ2uLj5OXm5+jp6vLz9PX29/j5+v/AABEIAYMB9AMBEQACEQED
EQH/2gAMAwEAAhEDEQA/AOSslH2dWbqe9cUt7m6JUTAarg7gx7R4K4/ClFdQZKse
0hG64zxWbd3oaJDGck7B2rRLXUzGAEtzwKHYZcjjQLgVmOxZkJjiGPTFNO7JZWa2
AiBximpe8UtB2wmNIwelPm1E0PKlBgnOKce4upIMG2yDzXPe8insQIMxk+tdCegh
YwFwAMVMii0kQLBzycVjK7Vh2AKLmJscelC90TI3gMaoqn7w5ojK+5d9CFUMcorb
Sxmx6cuFI4JpTKiSrCsZKqPes73Vi0QsSXOOp7VcdEQ3qPCkSIn40PUOpO6lSvbn
mlHYGPRcSDHehMlodgRlvQnNTNFIVEV+TUlFiPaygDpTYo6ERjUSkY6VS0C+gGMl
xjqTzS0uGxeljVFAHAxUK9wuVol3Ng076Da1I2hAAPfdinFg3oMnjIAx1qUirksa
Fk4Harv3ItYibO/Z3olsOC1JIwGjKfnRF6CnZiF/KVY1HApczuPl0I5FbdmtN0Qt
⋮

The head contains the timestamp, the size of the file in bytes, and the SHA-256 digest of the file in hexadecimal. The content of the file follows in Base64.
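
A client can use the size and hash properties to check that a file arrived intact. The Python sketch below does this for a download of a whole file; the function name verify_download is hypothetical, and nothing here should be read as describing what the hash property covers when a file is fetched in chunks.

```python
import base64
import hashlib


def verify_download(head, body_base64):
    """Decode the body and check it against the size and hash in the head."""
    # Remove any white space from the Base64 block before decoding.
    data = base64.b64decode("".join(body_base64.split()))
    if len(data) != head["size"]:
        raise ValueError("size mismatch")
    if hashlib.sha256(data).hexdigest() != head["hash"].lower():
        raise ValueError("hash mismatch")
    return data


content = b"TAFT"
head = {"status": "Success", "size": len(content),
        "hash": hashlib.sha256(content).hexdigest()}
print(verify_download(head, base64.b64encode(content).decode("ascii")))
```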

If you are downloading a large file, you may prefer to obtain it in chunks, making one request for each chunk. This can be done for any file by including the offset and length properties in the request head.

For example, suppose we have a media file that is 5,307,294,188 bytes in size, or around 5.3 GB, and that the file is to be downloaded in chunks of 1 MiB at a time. The first request is:

{
  "version" : 1,
  "command" : "download",
  "path"    : "/movies/plan-9-from-outer-space.mpg",
  "offset"  : 0,
  "length"  : 1048576
}

The offset property is 0 to indicate the start of the file, and the length property is 1048576 to indicate that the chunk should be 1 MiB in size. For the second chunk, add the length to the offset, and you have:

  "offset": 1048576,
  "length": 1048576

About halfway through the movie there will be a request like this:

  "offset": 2462056448,
  "length": 1048576

The request for the final chunk would be:

  "offset": 5306843136,
  "length": 1048576

Each response would give you another 1 MiB of the file, until the final chunk, which will likely be undersized, and could have a size of zero:

  "size": 451052,

If the size comes back smaller than the requested length, the end of the file has been reached. If it happened that the size of the file was an exact multiple of 1 MiB, then the final request would have returned a zero size.
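
The chunking procedure amounts to a simple loop, sketched below in Python. The send parameter is a hypothetical callable that delivers a request head to the server and returns the response head together with the decoded body bytes; how it does so depends on the transport and is not defined here.

```python
def download_in_chunks(send, path, chunk_size=1048576):
    """Yield the content of a file as successive chunks of bytes."""
    offset = 0
    while True:
        head, body = send({
            "version": 1,
            "command": "download",
            "path": path,
            "offset": offset,
            "length": chunk_size,
        })
        if head["status"] != "Success":
            raise RuntimeError(head["status"])
        if body:
            yield body
        # A chunk smaller than the requested length marks the end of the file.
        if head["size"] < chunk_size:
            return
        offset += chunk_size


# A stand-in for a server, backed by bytes held in memory.
content = bytes(range(256)) * 10  # 2560 bytes


def fake_send(request):
    start = request["offset"]
    chunk = content[start:start + request["length"]]
    return {"status": "Success", "size": len(chunk)}, chunk


print([len(chunk) for chunk in download_in_chunks(fake_send, "/x", chunk_size=1024)])
```

When the file size is an exact multiple of the chunk size, the loop makes one extra request, which comes back with a size of zero and ends the download.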

upload: Uploading a file

TBD.

The list Command

PURPOSE

Lists the contents of a directory, returning the names, types, sizes, and timestamps of the files and subdirectories located within the directory whose path is specified.

PROTOCOL LEVEL

Level one.

HEAD PROPERTIES

version    Number.  Required.
command    String.  Required. Must be "list".
accessKey  String.  Optional.
path       String.  Required.
self       Boolean. Optional. Defaults to false.

BEHAVIOR

  1. See section 1.12, Parsing the head.
  2. See section 1.13, Processing the version property.
  3. See section 1.14, Processing the command property.
  4. See section 1.15, Processing the accessKey property.
  5. See section 1.20, Processing the path property for an existing directory.
  6. See section 1.31, Processing the self property.
  7. Determine which files and directories to include in the list.
    If the path property identifies a regular file, then only the named file is listed.
    If the path property identifies a directory and the self property is present and has value true, then only the named directory is listed (and not its contents).
    If the path property identifies a directory and the self property is absent or is present and has value false, then each item within the named directory is listed.
  8. Obtain the file and directory metadata.