kafka_balanced
DEPRECATED
This component is deprecated and will be removed in the next major version release. Please consider moving to alternative components.
Connects to Kafka brokers and consumes topics by automatically sharing partitions across other consumers of the same consumer group.
```yaml
# Common config fields, showing default values
input:
  label: ""
  kafka_balanced:
    addresses:
      - localhost:9092
    topics:
      - benthos_stream
    client_id: benthos_kafka_input
    consumer_group: benthos_consumer_group
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```
```yaml
# All config fields, showing default values
input:
  label: ""
  kafka_balanced:
    addresses:
      - localhost:9092
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    sasl:
      mechanism: ""
      user: ""
      password: ""
      access_token: ""
      token_cache: ""
      token_key: ""
    topics:
      - benthos_stream
    client_id: benthos_kafka_input
    rack_id: ""
    consumer_group: benthos_consumer_group
    start_from_oldest: true
    commit_period: 1s
    max_processing_period: 100ms
    group:
      session_timeout: 10s
      heartbeat_interval: 3s
      rebalance_timeout: 60s
    fetch_buffer_cap: 256
    target_version: 1.0.0
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: []
```
Offsets are managed within Kafka as per the consumer group (set via config), and partitions are automatically balanced across any members of the consumer group.
Partitions consumed by this input can be processed in parallel, utilising up to N pipeline processing threads and parallel outputs, where N is the number of partitions allocated to this consumer.
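For example, a sketch of a config that feeds this input into a multi-threaded pipeline; the thread count of 4 is illustrative, and only pays off when at least four partitions are allocated to this consumer:

```yaml
# Sketch only: parallelism is capped by the number of allocated partitions.
input:
  kafka_balanced:
    addresses:
      - localhost:9092
    topics:
      - benthos_stream
    consumer_group: benthos_consumer_group
pipeline:
  threads: 4
  processors: []
```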
The batching fields allow you to configure a batching policy which will be applied per partition. Any other batching mechanism will stall with this input due to its sequential transaction model.
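For example, a minimal sketch of a count and period based policy configured within the input, and therefore applied per partition; the values are illustrative:

```yaml
# Sketch: each partition's batch flushes at 20 messages or after 1s.
input:
  kafka_balanced:
    addresses:
      - localhost:9092
    topics:
      - benthos_stream
    batching:
      count: 20
      period: 1s
```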
Alternatives
The functionality of this input is now covered by the general kafka input.
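As a rough migration sketch, an equivalent consumer group configuration with the kafka input might look like the following; consult that input's documentation for its full field set:

```yaml
# Migration sketch: field values mirror the kafka_balanced defaults above.
input:
  kafka:
    addresses:
      - localhost:9092
    topics:
      - benthos_stream
    consumer_group: benthos_consumer_group
```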
Metadata
This input adds the following metadata fields to each message:
- kafka_key
- kafka_topic
- kafka_partition
- kafka_offset
- kafka_lag
- kafka_timestamp_unix
- All existing message headers (version 0.11+)
The field kafka_lag is the calculated difference between the high water mark offset of the partition at the time of ingestion and the current message offset.
You can access these metadata fields using function interpolation.
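For example, a hypothetical bloblang processor that copies some of these fields into the message payload; the target paths under root.kafka are illustrative:

```yaml
# Hypothetical mapping: metadata values are strings, so .number() coerces them.
pipeline:
  processors:
    - bloblang: |
        root = this
        root.kafka.topic = meta("kafka_topic")
        root.kafka.partition = meta("kafka_partition").number()
        root.kafka.lag = meta("kafka_lag").number()
```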
Fields
addresses
A list of broker addresses to connect to. If an item of the list contains commas it will be expanded into multiple addresses.
Type: array
Default: ["localhost:9092"]
```yaml
# Examples

addresses:
  - localhost:9092

addresses:
  - localhost:9041,localhost:9042

addresses:
  - localhost:9041
  - localhost:9042
```
tls
Custom TLS settings can be used to override system defaults.
Type: object
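For example, a sketch of a consumer with custom TLS settings and a root CA file; the broker address and file path are illustrative:

```yaml
# Sketch: broker address and CA path are illustrative.
input:
  kafka_balanced:
    addresses:
      - broker.internal:9093
    topics:
      - benthos_stream
    tls:
      enabled: true
      root_cas_file: ./root_cas.pem
```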
tls.enabled
Whether custom TLS settings are enabled.
Type: bool
Default: false
tls.skip_cert_verify
Whether to skip server side certificate verification.
Type: bool
Default: false
tls.enable_renegotiation
Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you're seeing the error message local error: tls: no renegotiation.
Type: bool
Default: false
Requires version 3.45.0 or newer
tls.root_cas
An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.
Type: string
Default: ""
```yaml
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```
tls.root_cas_file
An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.
Type: string
Default: ""
```yaml
# Examples

root_cas_file: ./root_cas.pem
```
tls.client_certs
A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both.
Type: array
Default: []
```yaml
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
```
tls.client_certs[].cert
A plain text certificate to use.
Type: string
Default: ""
tls.client_certs[].key
A plain text certificate key to use.
Type: string
Default: ""
tls.client_certs[].cert_file
The path to a certificate to use.
Type: string
Default: ""
tls.client_certs[].key_file
The path of a certificate key to use.
Type: string
Default: ""
sasl
Enables SASL authentication.
Type: object
sasl.mechanism
The SASL authentication mechanism; if left empty, SASL authentication is not used. Warning: SCRAM-based methods within Benthos have not received a security audit.
Type: string
Default: ""
| Option | Summary |
|---|---|
| PLAIN | Plain text authentication. NOTE: When using plain text auth it is extremely likely that you'll also need to enable TLS. |
| OAUTHBEARER | OAuth Bearer based authentication. |
| SCRAM-SHA-256 | Authentication using the SCRAM-SHA-256 mechanism. |
| SCRAM-SHA-512 | Authentication using the SCRAM-SHA-512 mechanism. |
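For example, a sketch of PLAIN authentication with TLS enabled alongside it, as the note above recommends; the broker address is illustrative and the credentials are populated from environment variables:

```yaml
# Sketch: enable TLS alongside PLAIN; credentials come from env vars.
input:
  kafka_balanced:
    addresses:
      - broker.internal:9093
    topics:
      - benthos_stream
    tls:
      enabled: true
    sasl:
      mechanism: PLAIN
      user: ${USER}
      password: ${PASSWORD}
```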
sasl.user
A PLAIN username. It is recommended that you use environment variables to populate this field.
Type: string
Default: ""
```yaml
# Examples

user: ${USER}
```
sasl.password
A PLAIN password. It is recommended that you use environment variables to populate this field.
Type: string
Default: ""
```yaml
# Examples

password: ${PASSWORD}
```
sasl.access_token
A static OAUTHBEARER access token.
Type: string
Default: ""
sasl.token_cache
Instead of using a static access_token, this allows you to query a cache resource to fetch OAUTHBEARER tokens from.
Type: string
Default: ""
sasl.token_key
Required when using a token_cache, the key to query the cache with for tokens.
Type: string
Default: ""
topics
A list of topics to consume from. If an item of the list contains commas it will be expanded into multiple topics.
Type: array
Default: ["benthos_stream"]
client_id
An identifier for the client connection.
Type: string
Default: "benthos_kafka_input"
rack_id
A rack identifier for this client.
Type: string
Default: ""
consumer_group
An identifier for the consumer group of the connection.
Type: string
Default: "benthos_consumer_group"
start_from_oldest
If an offset is not found for a topic partition, this determines whether messages are consumed from the oldest available offset; otherwise messages are consumed from the latest offset.
Type: bool
Default: true
commit_period
The period of time between each commit of the current partition offsets. Offsets are always committed during shutdown.
Type: string
Default: "1s"
max_processing_period
A maximum estimate for the time taken to process a message; this is used for tuning consumer group synchronization.
Type: string
Default: "100ms"
group
Tuning parameters for consumer group synchronization.
Type: object
group.session_timeout
A period after which a consumer of the group is removed if no heartbeats are received.
Type: string
Default: "10s"
group.heartbeat_interval
A period in which heartbeats should be sent out.
Type: string
Default: "3s"
group.rebalance_timeout
A period after which rebalancing is abandoned if unresolved.
Type: string
Default: "60s"
fetch_buffer_cap
The maximum number of unprocessed messages to fetch at a given time.
Type: int
Default: 256
target_version
The version of the Kafka protocol to use.
Type: string
Default: "1.0.0"
batching
Allows you to configure a batching policy.
Type: object
```yaml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```
batching.count
A number of messages at which the batch should be flushed. If 0, count-based batching is disabled.
Type: int
Default: 0
batching.byte_size
An amount of bytes at which the batch should be flushed. If 0, size-based batching is disabled.
Type: int
Default: 0
batching.period
A period in which an incomplete batch should be flushed regardless of its size.
Type: string
Default: ""
```yaml
# Examples

period: 1s

period: 1m

period: 500ms
```
batching.check
A Bloblang query that should return a boolean value indicating whether a message should end a batch.
Type: string
Default: ""
```yaml
# Examples

check: this.type == "end_of_transaction"
```
batching.processors
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.
Type: array
Default: []
```yaml
# Examples

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

processors:
  - merge_json: {}
```