unarchive
Unarchives messages according to the selected archive format into multiple messages within a batch.
# Common config fields, showing default values
label: ""
unarchive:
  format: binary

# All config fields, showing default values
label: ""
unarchive:
  format: binary
  parts: []
When a message is unarchived, the new messages replace the original message in the batch. Messages that are selected but fail to unarchive (for example, because their content is invalid for the chosen format) remain unchanged in the batch but are flagged as failed, which allows you to apply error handling to them.
For the unarchive formats that contain file information (tar, zip), a metadata field called archive_filename is added to each message, containing the extracted filename.
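As an illustration of how file-based formats map to messages, the following is a minimal Python sketch, not the processor's own implementation. The function names unarchive_tar and build_tar and the dictionary layout used for a message are hypothetical, and skipping non-file entries such as directories is an assumption.

```python
import io
import tarfile


def unarchive_tar(payload: bytes) -> list:
    """Split a tar archive into one message per regular file."""
    messages = []
    with tarfile.open(fileobj=io.BytesIO(payload), mode="r:") as archive:
        for member in archive:
            if not member.isfile():
                continue  # assumption: directories and links produce no message
            content = archive.extractfile(member).read()
            messages.append({
                "content": content,
                # mirrors the archive_filename metadata field described above
                "metadata": {"archive_filename": member.name},
            })
    return messages


def build_tar(files: dict) -> bytes:
    """Helper for the example: pack name -> content pairs into a tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()
```

Calling unarchive_tar(build_tar({"a.txt": b"hello"})) yields a single message whose metadata carries the filename a.txt.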
Fields
format
The unarchive format to use.
Type: string
Default: "binary"
Options: tar, zip, binary, lines, json_documents, json_array, json_map, csv.
parts
An optional array of message indexes of a batch that the processor should apply to. If left empty all messages are processed. This field is only applicable when batching messages at the input level.
Indexes can be negative, in which case the part is selected by counting backwards from the end of the batch, starting from -1. For example, -1 selects the last message and -2 selects the one before it.
Type: array
Default: []
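The index rules for parts can be sketched in Python as follows. This is a hedged illustration only: the function name resolve_parts is hypothetical, and silently ignoring indexes that fall outside the batch is an assumption rather than documented behaviour.

```python
def resolve_parts(parts: list, batch_size: int) -> list:
    """Map configured part indexes to positions within a batch."""
    if not parts:
        return list(range(batch_size))  # an empty list selects every message
    selected = []
    for index in parts:
        # negative indexes count backwards from the end, starting at -1
        position = index + batch_size if index < 0 else index
        if 0 <= position < batch_size:  # assumption: out-of-range indexes are ignored
            selected.append(position)
    return selected
```

With a batch of four messages, resolve_parts([0, -1], 4) selects the first and last messages.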
Formats
tar
Extract messages from a Unix standard tape archive.
zip
Extract messages from a zip file.
binary
Extract messages from a binary blob format consisting of:
- Four bytes containing the number of messages in the batch (in big endian)
- For each message part:
  - Four bytes containing the length of the message (in big endian)
  - The content of the message
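The layout above can be sketched with Python's struct module. This is a minimal illustration, not the processor's own code: the function names encode_binary and decode_binary are hypothetical, and treating the four-byte fields as unsigned integers is an assumption.

```python
import struct


def encode_binary(parts: list) -> bytes:
    """Pack messages as: count, then (length, content) for each message."""
    out = [struct.pack(">I", len(parts))]  # ">I" = big-endian unsigned 32-bit
    for part in parts:
        out.append(struct.pack(">I", len(part)))
        out.append(part)
    return b"".join(out)


def decode_binary(blob: bytes) -> list:
    """Unpack a blob produced by encode_binary back into its messages."""
    if len(blob) < 4:
        raise ValueError("blob too short to contain a message count")
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    parts = []
    for _ in range(count):
        if offset + 4 > len(blob):
            raise ValueError("truncated length header")
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        if offset + length > len(blob):
            raise ValueError("truncated message content")
        parts.append(blob[offset:offset + length])
        offset += length
    return parts
```

The explicit bounds checks correspond to the failure case described earlier: a blob that does not match the layout cannot be unarchived and would be flagged as failed rather than partially decoded.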
lines
Extract each line of a message into its own message.
json_documents
Attempt to parse a message as a stream of concatenated JSON documents. Each parsed document is expanded into a new message.
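To show what a stream of concatenated JSON documents looks like in practice, here is a small Python sketch using the standard library's json.JSONDecoder.raw_decode. The function name split_json_documents is hypothetical, and re-serialising each document with json.dumps is a choice made for the example, not a statement about the processor's output formatting.

```python
import json


def split_json_documents(payload: str) -> list:
    """Split concatenated JSON documents into one serialised document each."""
    decoder = json.JSONDecoder()
    documents = []
    position = 0
    while True:
        # skip any whitespace separating two documents
        while position < len(payload) and payload[position].isspace():
            position += 1
        if position >= len(payload):
            break
        # raw_decode parses one document and returns where it ended
        value, position = decoder.raw_decode(payload, position)
        documents.append(json.dumps(value))
    return documents
```

Unlike json_array, the input here has no enclosing brackets or separating commas: an input such as {"a":1} {"b":2} is two complete documents placed back to back.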
json_array
Attempt to parse a message as a JSON array, and extract each element into its own message.
json_map
Attempt to parse the message as a JSON map and expand the contents of each element of the map into a new message. A metadata field called archive_key is added to each message, containing the relevant key from the top-level map.
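The json_map expansion can be sketched in Python as below. The function name unarchive_json_map and the dictionary layout used for a message are hypothetical. This sketch emits messages in the order the keys appear in the input, which is an assumption; the ordering of the resulting messages is not specified above.

```python
import json


def unarchive_json_map(payload: str) -> list:
    """Expand each top-level value of a JSON object into its own message."""
    parsed = json.loads(payload)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return [
        # the top-level key travels with the message as archive_key metadata
        {"content": json.dumps(value), "metadata": {"archive_key": key}}
        for key, value in parsed.items()
    ]
```

Only the top level is expanded: nested objects and arrays stay intact inside the content of their message.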
csv
Attempt to parse the message as a CSV file (a header row is required) and expand the contents of each row into a JSON object in a new message.
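The row-to-object mapping can be illustrated with Python's csv.DictReader, which likewise uses the header row for field names. The function name unarchive_csv is hypothetical, and leaving every value as a string (rather than inferring numbers or booleans) is an assumption of this sketch.

```python
import csv
import io
import json


def unarchive_csv(payload: str) -> list:
    """Turn each CSV row into a JSON object keyed by the header row."""
    reader = csv.DictReader(io.StringIO(payload))
    # each row is a dict of header name -> cell value (values stay strings)
    return [json.dumps(row) for row in reader]
```

An input with the header id,name and two data rows therefore produces two messages, each a JSON object with the keys id and name.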