sftp

EXPERIMENTAL

This component is experimental and therefore subject to change or removal outside of major version releases.

Consumes files from a server over SFTP.

Introduced in version 3.39.0.

# Common config fields, showing default values
input:
  label: ""
  sftp:
    address: ""
    credentials:
      username: ""
      password: ""
      private_key_file: ""
      private_key_pass: ""
    paths: []
    codec: all-bytes
    watcher:
      enabled: false
      minimum_age: 1s
      poll_interval: 1s
      cache: ""

Metadata​

This input adds the following metadata fields to each message:

- sftp_path

You can access these metadata fields using function interpolation.
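
As a rough illustration of function interpolation, the sketch below logs the remote path of each consumed file. It assumes the standard log processor and the meta interpolation function; the message text is only an example.

```yaml
pipeline:
  processors:
    # Log the remote path this message was read from.
    - log:
        level: INFO
        message: 'consumed file ${! meta("sftp_path") }'
```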

Fields​

address​

The address of the SFTP server that hosts the target files.

Type: string
Default: ""

credentials​

The credentials to use to log into the server.

Type: object

credentials.username​

The username used to connect to the SFTP server.

Type: string
Default: ""

credentials.password​

The password for the username used to connect to the SFTP server.

Type: string
Default: ""

credentials.private_key_file​

The path to the private key file for the username used to connect to the SFTP server.

Type: string
Default: ""

credentials.private_key_pass​

An optional passphrase for the private key.

Type: string
Default: ""
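
A minimal sketch of key-based authentication follows. The host, user name, key path and environment variable name are all hypothetical, and the host:port form of the address is an assumption rather than something stated above.

```yaml
input:
  sftp:
    # Hypothetical host; the host:port form is an assumption.
    address: sftp.example.com:22
    credentials:
      username: ingest                       # hypothetical user
      private_key_file: /etc/benthos/id_rsa  # hypothetical key path
      private_key_pass: "${SFTP_KEY_PASS}"   # hypothetical environment variable
    paths:
      - /upload/*.json                       # hypothetical remote path
```

For password authentication, the same layout would presumably set password and leave the two private_key fields empty.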

paths​

A list of paths to consume sequentially. Glob patterns are supported.

Type: array
Default: []
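
For instance, a list that mixes an exact path with a glob pattern might look like the fragment below. Both remote paths are hypothetical.

```yaml
# Examples
paths:
  - /upload/orders.csv      # a single named file
  - /upload/archive/*.csv   # every .csv file in the archive directory
```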

codec​

The way in which the bytes of a data source should be converted into discrete messages. Codecs are useful for specifying how large files or continuous streams of data might be processed in small chunks rather than loaded entirely into memory. It's possible to consume lines using a custom delimiter with the delim:x codec, where x is the character sequence to use as the delimiter. Codecs can be chained with /; for example, a gzip-compressed CSV file can be consumed with the codec gzip/csv.

Type: string
Default: "all-bytes"

| Option | Summary |
|---|---|
| auto | EXPERIMENTAL: Attempts to derive a codec for each file based on information such as the extension. For example, a .tar.gz file would be consumed with the gzip/tar codec. Defaults to all-bytes. |
| all-bytes | Consume the entire file as a single binary message. |
| chunker:x | Consume the file in chunks of a given number of bytes. |
| csv | Consume structured rows as comma separated values. The first row must be a header row. |
| csv:x | Consume structured rows as values separated by a custom delimiter. The first row must be a header row. The custom delimiter must be a single character, e.g. the codec "csv:\t" would consume a tab delimited file. |
| delim:x | Consume the file in segments divided by a custom delimiter. |
| gzip | Decompress a gzip file. This codec should precede another codec, e.g. gzip/all-bytes, gzip/tar, gzip/csv, etc. |
| lines | Consume the file in segments divided by linebreaks. |
| multipart | Consumes the output of another codec and batches messages together. A batch ends when an empty message is consumed. For example, the codec lines/multipart could be used to consume multipart messages where an empty line indicates the end of each batch. |
| regex:(?m)^\d\d:\d\d:\d\d | Consume the file in segments divided by a regular expression. |
| tar | Parse the file as a tar archive, and consume each file of the archive as a message. |

# Examples
codec: lines
codec: "delim:\t"
codec: delim:foobar
codec: gzip/csv

delete_on_finish​

Whether to delete files from the server once they are processed.

Type: bool
Default: false

max_buffer​

The largest token size expected when consuming delimited files.

Type: int
Default: 1000000
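
If a delimited file is expected to contain unusually long segments, the limit can be raised alongside the codec. The sketch below uses a hypothetical host and path, assumes the host:port address form, and picks ten times the default purely for illustration.

```yaml
input:
  sftp:
    address: sftp.example.com:22   # hypothetical host, host:port form assumed
    paths:
      - /logs/*.log                # hypothetical remote path
    codec: lines
    max_buffer: 10000000           # ten times the default value
```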

watcher​

An experimental mode whereby the input will periodically scan the target paths for new files and consume them. When all files are consumed, the input will continue polling for new files.

Type: object
Requires version 3.42.0 or newer

watcher.enabled​

Whether file watching is enabled.

Type: bool
Default: false

watcher.minimum_age​

The minimum period of time since a file was last updated before attempting to consume it. Increasing this period decreases the likelihood that a file will be consumed whilst it is still being written to.

Type: string
Default: "1s"

# Examples
minimum_age: 10s
minimum_age: 1m
minimum_age: 10m

watcher.poll_interval​

The interval between each attempt to scan the target paths for new files.

Type: string
Default: "1s"

# Examples
poll_interval: 100ms
poll_interval: 1s

watcher.cache​

A cache resource for storing the paths of files already consumed.

Type: string
Default: ""
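
Putting the watcher fields together, a configuration along these lines would poll once a minute and skip files modified within the last ten seconds. The host, credentials, paths and cache label are hypothetical, and the cache_resources list form of declaring a cache is an assumption about the version in use.

```yaml
input:
  sftp:
    address: sftp.example.com:22      # hypothetical host, host:port form assumed
    credentials:
      username: ingest                # hypothetical user
      password: "${SFTP_PASSWORD}"    # hypothetical environment variable
    paths:
      - /upload/*.csv                 # hypothetical remote path
    codec: csv
    watcher:
      enabled: true
      minimum_age: 10s
      poll_interval: 1m
      cache: seen_files               # must match the cache label below

# Assumed resource syntax; adjust to the form your version supports.
cache_resources:
  - label: seen_files
    memory: {}
```

An in-memory cache is lost when the process restarts, so a persistent cache type is probably the better choice if files must not be consumed twice across restarts.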