unarchive

Unarchives messages according to the selected archive format into multiple messages within a batch.

# Common config fields, showing default values
label: ""
unarchive:
  format: binary

When a message is unarchived, the resulting messages replace the original message in the batch. Messages that are selected but fail to unarchive (for example, because their format is invalid) remain unchanged in the batch but are flagged as failed, so you can apply error handling to them.
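The replacement behaviour can be sketched roughly as follows. This is only an illustration of the semantics described above, not the processor's implementation: the function names `split_json_array` and `unarchive_batch`, and the `failed` flag on each output entry, are hypothetical, and the json_array format is used purely as an example.

```python
import json


def split_json_array(raw):
    """Split a JSON array string into one JSON string per element."""
    parsed = json.loads(raw)  # raises a ValueError subclass on invalid JSON
    if not isinstance(parsed, list):
        raise ValueError("top-level JSON value is not an array")
    return [json.dumps(element) for element in parsed]


def unarchive_batch(batch, unarchive_one):
    """Replace each message with its unarchived parts.

    A message that fails to unarchive is kept unchanged and flagged
    (the `failed` key is a hypothetical stand-in for the error flag).
    """
    out = []
    for message in batch:
        try:
            parts = unarchive_one(message)
        except ValueError:
            out.append({"content": message, "failed": True})
            continue
        out.extend({"content": part, "failed": False} for part in parts)
    return out
```

For instance, calling `unarchive_batch(['[1, 2]', 'not json'], split_json_array)` yields two new messages for the first entry and leaves the second in place, marked as failed.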

For the unarchive formats that contain file information (tar, zip), a metadata field is added to each message called archive_filename with the extracted filename.
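As a rough sketch of what this looks like for a zip archive, the following uses Python's standard `zipfile` module to expand an archive into one message per file, each carrying an archive_filename metadata value. The `unarchive_zip` name and the dictionary layout of a message are assumptions made for illustration, and skipping directory entries is a choice made here rather than documented behaviour.

```python
import io
import zipfile


def unarchive_zip(blob):
    """Expand a zip archive (as bytes) into one message per contained file."""
    messages = []
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # assumption: directory entries produce no message
            messages.append({
                "content": archive.read(info),
                "metadata": {"archive_filename": info.filename},
            })
    return messages
```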

Fields

format

The unarchive format to use.

Type: string
Default: "binary"
Options: tar, zip, binary, lines, json_documents, json_array, json_map, csv.

parts

An optional array of message indexes of a batch that the processor should apply to. If left empty all messages are processed. This field is only applicable when batching messages at the input level.

Indexes can be negative, and if so the part will be selected from the end counting backwards starting from -1.

Type: array
Default: []
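The index selection described for parts can be sketched as below. The `select_parts` helper is hypothetical, and silently ignoring out-of-range indexes is an assumption of this sketch rather than documented behaviour.

```python
def select_parts(batch, parts):
    """Return the batch positions that a `parts` list would select."""
    if not parts:
        # An empty list means every message in the batch is processed.
        return list(range(len(batch)))
    selected = []
    for part in parts:
        # Negative indexes count backwards from the end, starting at -1.
        index = part if part >= 0 else len(batch) + part
        if 0 <= index < len(batch):  # assumption: out-of-range is skipped
            selected.append(index)
    return selected
```

With a batch of three messages, `parts: [-1]` selects only the last message (position 2), while `parts: [0, -2]` selects the first and second.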

Formats

tar

Extract messages from a Unix standard tape archive.

zip

Extract messages from a zip file.

binary

Extract messages from a binary blob format consisting of:

  • Four bytes containing number of messages in the batch (in big endian)
  • For each message part:
    • Four bytes containing the length of the message (in big endian)
    • The content of the message
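The layout above can be decoded with a few lines of Python. This is a minimal sketch, not the processor's own code: the function names are hypothetical, and treating the four-byte fields as unsigned integers is an assumption, since the description only specifies their size and byte order.

```python
import struct


def archive_binary(parts):
    """Encode a list of byte strings using the layout described above."""
    blob = struct.pack(">I", len(parts))  # message count, big endian
    for part in parts:
        blob += struct.pack(">I", len(part)) + part  # length, then content
    return blob


def unarchive_binary(blob):
    """Decode a binary blob back into its list of message contents."""
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    parts = []
    for _ in range(count):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        end = offset + length
        if end > len(blob):
            raise ValueError("declared message length exceeds blob size")
        parts.append(blob[offset:end])
        offset = end
    return parts
```

Encoding and then decoding a list of byte strings with these two helpers returns the original list, which is a quick way to check a producer against this layout.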

lines

Extract the lines of a message each into their own message.

json_documents

Attempt to parse a message as a stream of concatenated JSON documents. Each parsed document is expanded into a new message.
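To illustrate what a stream of concatenated JSON documents means, here is a small sketch based on Python's `json.JSONDecoder.raw_decode`, which parses one document and reports where it ended. The `split_json_documents` name is hypothetical, and allowing whitespace between documents is an assumption of this sketch.

```python
import json


def split_json_documents(text):
    """Parse back-to-back JSON documents from a single string."""
    decoder = json.JSONDecoder()
    documents = []
    position = 0
    while position < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while position < len(text) and text[position].isspace():
            position += 1
        if position >= len(text):
            break
        document, position = decoder.raw_decode(text, position)
        documents.append(document)
    return documents
```

An input such as `{"a":1}{"b":2}` followed by a newline and `[3]` would be split into three documents, each of which becomes its own message.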

json_array

Attempt to parse a message as a JSON array, and extract each element into its own message.

json_map

Attempt to parse the message as a JSON map and, for each element of the map, expand its contents into a new message. A metadata field called archive_key is added to each message, containing the relevant key from the top-level map.
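A sketch of the json_map expansion, assuming a message is modelled as a dictionary with `content` and `metadata` entries (a representation chosen only for this example). The `unarchive_json_map` name is hypothetical, and no particular ordering of the resulting messages should be inferred from it.

```python
import json


def unarchive_json_map(raw):
    """Expand each top-level key of a JSON object into its own message."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("top-level JSON value is not a map")
    return [
        {
            "content": json.dumps(value),
            "metadata": {"archive_key": key},  # key from the top-level map
        }
        for key, value in parsed.items()
    ]
```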

csv

Attempt to parse the message as a CSV file (header required) and, for each row in the file, expand its contents into a JSON object in a new message.
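The row-to-object mapping can be approximated with Python's standard `csv.DictReader`, which uses the header row as the object keys. The `unarchive_csv` name is hypothetical, and keeping every value as a string is an assumption of this sketch rather than a statement about the processor's output types.

```python
import csv
import io
import json


def unarchive_csv(raw):
    """Turn each CSV row into a JSON object keyed by the header row."""
    reader = csv.DictReader(io.StringIO(raw))
    # Assumption: values are left as strings, with no type inference.
    return [json.dumps(row) for row in reader]
```

For an input with the header `id,name` and two data rows, this produces two JSON objects, each with an `id` and a `name` field.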