unarchive

Unarchives messages according to the selected archive format into multiple messages within a batch.

# Common config fields, showing default values
label: ""
unarchive:
  format: binary

When a message is unarchived, the new messages replace the original message in the batch. Messages that are selected but fail to unarchive (for example, because the format is invalid) remain unchanged in the batch but are flagged as failed, which allows you to apply error handling to them.

For the unarchive formats that contain file information (tar, zip), a metadata field called archive_filename is added to each message, containing the extracted filename.
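As a rough illustration of the filename metadata described above, the following Python sketch expands a zip archive into one message per file entry. It is not the processor's implementation: the function name `unarchive_zip` and the dictionary shape used for a message are hypothetical, and skipping directory entries is an assumption.

```python
import io
import zipfile


def unarchive_zip(blob):
    """Expand a zip archive (bytes) into a list of message dicts.

    Each message carries the file content plus an `archive_filename`
    metadata field holding the name of the extracted entry.
    """
    messages = []
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                # Assumption: directory entries produce no message.
                continue
            messages.append(
                {
                    "content": archive.read(info),
                    "metadata": {"archive_filename": info.filename},
                }
            )
    return messages
```

A downstream step could then route or name outputs based on the archive_filename value of each message.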

Fields

format

The unarchive format to use.

Type: string
Default: "binary"
Options: tar, zip, binary, lines, json_documents, json_array, json_map, csv.

parts

An optional array of message indexes within a batch that the processor should apply to. If left empty, all messages are processed. This field is only applicable when batching messages at the input level.

Indexes can be negative, in which case the part is selected by counting backwards from the end of the batch, starting at -1.
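The index rules can be sketched in a few lines of Python. This is only an illustration of the semantics described above; the helper name `select_parts` is hypothetical, and silently skipping out-of-range indexes is an assumption rather than documented behavior.

```python
def select_parts(batch_len, parts):
    """Return the sorted batch positions targeted by `parts`.

    An empty `parts` list selects every message. Negative indexes count
    back from the end of the batch, so -1 is the last message.
    """
    if not parts:
        return list(range(batch_len))
    selected = set()
    for idx in parts:
        pos = idx + batch_len if idx < 0 else idx
        if 0 <= pos < batch_len:
            # Assumption: indexes outside the batch are skipped.
            selected.add(pos)
    return sorted(selected)
```

For a batch of four messages, `select_parts(4, [0, -1])` targets the first and last messages only.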

Type: array
Default: []

Formats

tar

Extract messages from a standard Unix tape archive.

zip

Extract messages from a zip file.

binary

Extract messages from a binary blob format consisting of:

  • Four bytes containing the number of messages in the batch (in big endian)
  • For each message part:
    • Four bytes containing the length of the message (in big endian)
    • The content of the message
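The layout above can be sketched with Python's struct module. This is a minimal illustration, not the processor's code: the function names are hypothetical, and treating both four-byte fields as unsigned 32-bit integers is an assumption, since the description only states their width and byte order.

```python
import struct


def pack_binary(messages):
    """Encode a list of bytes objects into the binary blob layout."""
    out = struct.pack(">I", len(messages))  # message count, big endian
    for msg in messages:
        out += struct.pack(">I", len(msg)) + msg  # length prefix, then content
    return out


def unpack_binary(blob):
    """Decode a binary blob back into a list of bytes objects."""
    if len(blob) < 4:
        raise ValueError("blob too short to contain a message count")
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    messages = []
    for _ in range(count):
        if offset + 4 > len(blob):
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        if offset + size > len(blob):
            raise ValueError("truncated message content")
        messages.append(blob[offset:offset + size])
        offset += size
    return messages
```

Raising on truncated input mirrors the idea that an invalid archive should be reported as a failure rather than partially expanded.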

lines

Extract the lines of a message each into their own message.

json_documents

Attempt to parse a message as a stream of concatenated JSON documents. Each parsed document is expanded into a new message.
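To make "a stream of concatenated JSON documents" concrete, here is a small Python sketch that splits such a payload into individual documents. The function name `split_json_documents` is hypothetical, and allowing optional whitespace between documents is an assumption.

```python
import json


def split_json_documents(payload):
    """Parse a string holding back-to-back JSON documents into a list."""
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    while pos < len(payload):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(payload) and payload[pos].isspace():
            pos += 1
        if pos >= len(payload):
            break
        doc, pos = decoder.raw_decode(payload, pos)
        docs.append(doc)
    return docs
```

Each returned element corresponds to one new message. A malformed document raises `json.JSONDecodeError`, which in this sketch stands in for the message being flagged as failed.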

json_array

Attempt to parse a message as a JSON array, and extract each element into its own message.

json_map

Attempt to parse the message as a JSON map and, for each element of the map, expand its contents into a new message. A metadata field called archive_key is added to each message, containing the relevant key from the top-level map.
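The following Python sketch illustrates the mapping from top-level keys to messages. It is an approximation of the described behavior: the function name `unarchive_json_map` and the dictionary shape used for a message are hypothetical.

```python
import json


def unarchive_json_map(payload):
    """Expand a JSON object into one message per top-level key."""
    parsed = json.loads(payload)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return [
        {
            "content": json.dumps(value),          # the element's contents
            "metadata": {"archive_key": key},      # the key it came from
        }
        for key, value in parsed.items()
    ]
```

This sketch emits messages in the order Python's parser yields the keys; it is safer not to rely on any particular ordering and to use archive_key to identify each message instead.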

csv

Attempt to parse the message as a CSV file (header required) and, for each row in the file, expand its contents into a JSON object in a new message.
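The row-to-object expansion can be sketched in Python as follows. The function name `unarchive_csv` is hypothetical, and keeping every cell as a JSON string (rather than inferring numbers or booleans) is an assumption of this sketch.

```python
import csv
import io
import json


def unarchive_csv(payload):
    """Turn CSV text with a header row into one JSON object string per row."""
    reader = csv.DictReader(io.StringIO(payload))
    # The header row supplies the keys; each data row becomes one object.
    return [json.dumps(row) for row in reader]
```

For the input `id,name` followed by the rows `1,foo` and `2,bar`, the sketch produces two messages, each an object with `id` and `name` keys.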