Parquet

NanoMQ provides scalable, event-driven Parquet support. Through rules, users configure the trigger events or message topics for exchanges, and the resulting data can then be persisted in the Parquet format.

Parquet is a columnar storage format known for its efficient compression and query performance. By configuring the relevant Parquet parameters, users can control the storage method, compression algorithm, and encoding to meet specific requirements.

Example Configuration

Below are the rule settings for exchanges and the relevant configurations for Parquet persistence:

hcl
# #====================================================================
# # Exchange configuration for Embedded Messaging Queue
# #====================================================================
# # Initialize multiple MQ exchangers by giving each a different name (mq1)
exchange_client.mq1 {
	# # exchanges contains multiple MQ exchangers
	exchange {
		# # MQTT Topic for filtering messages and saving to queue
		topic = "exchange/topic1",
		# # MQ name
		name = "exchange_no1",
		# # MQ category. Only Ringbus is supported for now
		ringbus = {
			# # ring buffer name
			name = "ringbus",
			# # max length of ring buffer (msg count)
			cap = 1000,
			# # 2: RB_FULL_RETURN: When the ringbus is full, the data in the ringbus is taken out and returned to the aio
			fullOp = 2
		}
	}
}

# #====================================================================
# # Parquet configuration (Apply to Exchange/Messaging_Queue)
# #====================================================================
parquet {
	# # Parquet compress type.
	# #
	# # Value: uncompressed | snappy | gzip | brotli | zstd | lz4
	compress = uncompressed
	# # Encryption options
	encryption {
		# # Set a key retrieval metadata.
		# #
		# # Value: String
		key_id = kf
		# # Parquet encryption key.
		# #
		# # Value: String. The key must be 16, 24, or 32 bytes.
		key = "0123456789012345"
		# # Set encryption algorithm. If not set, files
		# # will be encrypted with AES_GCM_V1 (default).
		# #
		# # Value: AES_GCM_CTR_V1 | AES_GCM_V1
		type = AES_GCM_V1
	}
	# # The dir for parquet files.
	# #
	# # Value: Folder
	dir = "/tmp/nanomq-parquet"
	# # The prefix of parquet files written.
	# #
	# # Value: string
	file_name_prefix = ""
	# # Maximum rotation count of parquet files.
	# #
	# # Value: Number
	# # Default: 5
	file_count = 5
}

Configuration Items

exchange_client

  • exchange_client.<name>: Exchange client. To start multiple exchange clients, give each one a different name.
  • exchange.topic: MQTT Topic for filtering messages and saving to queue.
  • exchange.name: Exchange name.
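  • exchange.ringbus.name: Ring bus name.
  • exchange.ringbus.cap: Ring bus capacity, that is, the maximum length of the ring buffer in messages.
  • exchange.ringbus.fullOp: The operation performed when the ring bus is full. With 2 (RB_FULL_RETURN), the data in the ring bus is taken out and returned to the aio.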
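
As a sketch of the multiple-client case described above, the fragment below declares two exchange clients side by side. The first repeats the example configuration. In the second, the names mq2, exchange_no2, and ringbus2 and the topic exchange/topic2 are hypothetical and chosen only for illustration. Whether ring bus names must be unique across clients is not stated here, so distinct names are used as a precaution.

```hcl
# # First exchange client, as in the example configuration
exchange_client.mq1 {
	exchange {
		topic = "exchange/topic1",
		name = "exchange_no1",
		ringbus = {
			name = "ringbus",
			cap = 1000,
			fullOp = 2
		}
	}
}

# # Second exchange client (hypothetical name, topic, and ring bus name)
exchange_client.mq2 {
	exchange {
		topic = "exchange/topic2",
		name = "exchange_no2",
		ringbus = {
			name = "ringbus2",
			cap = 1000,
			fullOp = 2
		}
	}
}
```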

parquet

  • parquet.compress: Compression algorithm. Value: uncompressed | snappy | gzip | brotli | zstd | lz4. Default: uncompressed.
  • parquet.encryption: Encryption option.
  • parquet.encryption.key_id: Key retrieval metadata.
  • parquet.encryption.key: Encryption key. The key must be 16, 24, or 32 bytes.
  • parquet.encryption.type: Encryption algorithm. If not set, files are encrypted with AES_GCM_V1 (default). Value: AES_GCM_CTR_V1 | AES_GCM_V1.
  • parquet.dir: The folder where Parquet files are stored.
  • parquet.file_name_prefix: The prefix used for naming Parquet files.
  • parquet.file_count: The maximum rotation count of Parquet files. Default: 5.
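
To illustrate how these items combine, here is a minimal sketch of a parquet block that switches to zstd compression, a 32-byte key, and the AES_GCM_CTR_V1 algorithm. All keys and allowed values come from the example configuration above. The key string is a placeholder and must not be used in production, and the file_name_prefix value and file_count of 10 are hypothetical choices for illustration.

```hcl
parquet {
	# # One of: uncompressed | snappy | gzip | brotli | zstd | lz4
	compress = zstd
	encryption {
		# # Key retrieval metadata
		key_id = kf
		# # Placeholder 32-byte key (16, 24, or 32 bytes are accepted)
		key = "01234567890123456789012345678901"
		# # One of: AES_GCM_CTR_V1 | AES_GCM_V1
		type = AES_GCM_CTR_V1
	}
	# # Directory for the Parquet files
	dir = "/tmp/nanomq-parquet"
	# # Hypothetical prefix for the written files
	file_name_prefix = "sensor"
	# # Hypothetical rotation count (the documented default is 5)
	file_count = 10
}
```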