# 📡 Using Kafka as a Source in Ferry
Ferry lets you ingest data from a Kafka topic and deliver it to destinations such as data warehouses, databases, or APIs.
## 📌 Prerequisites
Before using Kafka as a source, ensure:
- Kafka and ZooKeeper are properly installed (e.g., via Confluent Platform).
- Kafka is running on your system (Linux/WSL recommended).
- You have a topic with data (optionally serialized with Avro, JSON, etc.).
- Schema Registry is running if you're using Avro serialization.
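Before wiring up Ferry, it can save time to confirm that the broker port is actually reachable. The sketch below is illustrative only: the `broker_reachable` helper is not part of Ferry, and `localhost:9092` is just Kafka's default listener address.

```python
import socket

def broker_reachable(broker: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to `host:port` succeeds.

    This only proves the port is open; it does not speak the Kafka protocol.
    """
    host, _, port = broker.partition(":")
    try:
        # Default to Kafka's standard port when none is given.
        with socket.create_connection((host, int(port or 9092)), timeout=timeout):
            return True
    except OSError:
        return False

print(broker_reachable("localhost:9092"))
```

If this returns `False`, revisit the startup steps below before debugging Ferry itself.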
## 🖥️ Starting Kafka Services (Linux/WSL)
Run the following commands (adjust paths as needed based on your Kafka installation directory):
```bash
# Start ZooKeeper
/mnt/c/confluent-x.y.z/bin/zookeeper-server-start /mnt/c/confluent-x.y.z/etc/kafka/zookeeper.properties

# Start Kafka Broker
/mnt/c/confluent-x.y.z/bin/kafka-server-start /mnt/c/confluent-x.y.z/etc/kafka/server.properties

# Start Schema Registry (if using Avro)
# Make sure Kafka and ZooKeeper are running first
/mnt/c/confluent-x.y.z/bin/schema-registry-start /mnt/c/confluent-x.y.z/etc/schema-registry/schema-registry.properties
```

💡 Replace `x.y.z` with your actual Confluent version (e.g., `7.9.0`).
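Once the Schema Registry is started, you can confirm it is up by querying its REST API: `GET /subjects` returns the list of registered schema subjects. A minimal sketch (the `schema_registry_subjects` helper is ours, not Ferry's; the endpoint itself is standard Schema Registry):

```python
import json
from urllib.request import urlopen

def schema_registry_subjects(url: str, timeout: float = 3.0):
    """Return the list of registered subjects, or None if the
    registry is unreachable or responds with something unexpected."""
    try:
        with urlopen(f"{url.rstrip('/')}/subjects", timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        return None

subjects = schema_registry_subjects("http://localhost:8081")
```

A fresh registry returns an empty list (`[]`); `None` means it is not reachable at that URL.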
## `source_uri` Format
To connect Ferry to a Kafka source, use the following URI format:
```plaintext
kafka://<broker-address>?group_id=<group>&security_protocol=<protocol>&sasl_mechanisms=<mechanism>&sasl_username=<user>&sasl_password=<password>&schema_registry=<schema-registry-url>
```

**Parameters:**

- `<broker-address>` – Kafka broker address (e.g., `localhost:9092`).
- `group_id` – Consumer group ID used for offset tracking.
- `security_protocol` – Security protocol (e.g., `PLAINTEXT`, `SASL_PLAINTEXT`).
- `sasl_mechanisms` – SASL mechanism (e.g., `PLAIN`, `SCRAM-SHA-256`).
- `sasl_username`, `sasl_password` – Optional; required for secured clusters.
- `schema_registry` – (Optional) URL of the Schema Registry (e.g., `http://localhost:8081`) if using Avro.
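Hand-assembling the query string is error-prone, especially once passwords or URLs need percent-encoding. The following sketch builds the URI from its parts; the `build_kafka_source_uri` helper is not part of Ferry, and `ferry_group` is a made-up consumer group ID.

```python
from urllib.parse import urlencode

def build_kafka_source_uri(broker, group_id, security_protocol="PLAINTEXT",
                           sasl_mechanisms=None, sasl_username=None,
                           sasl_password=None, schema_registry=None):
    """Assemble a Kafka source_uri, percent-encoding parameter values."""
    params = {"group_id": group_id, "security_protocol": security_protocol}
    if sasl_mechanisms:
        params["sasl_mechanisms"] = sasl_mechanisms
    if sasl_username:
        params["sasl_username"] = sasl_username
        params["sasl_password"] = sasl_password
    if schema_registry:
        params["schema_registry"] = schema_registry
    return f"kafka://{broker}?{urlencode(params)}"

uri = build_kafka_source_uri("localhost:9092", "ferry_group",
                             schema_registry="http://localhost:8081")
```

Note that `urlencode` escapes the Schema Registry URL (`http%3A%2F%2F...`), which keeps the outer URI unambiguous.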
## `source_table_name` Format
```plaintext
<topic-name>
```

Example:

```plaintext
avro_topic
```
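Kafka itself restricts topic names to ASCII letters, digits, `.`, `_`, and `-`, with a maximum length of 249 characters, so a quick client-side check can catch typos before Ferry ever connects. A hypothetical validator (not part of Ferry):

```python
import re

# Legal Kafka topic names: [A-Za-z0-9._-], 1-249 characters,
# and not the reserved names "." or "..".
TOPIC_RE = re.compile(r"^[A-Za-z0-9._-]{1,249}$")

def is_valid_topic(name: str) -> bool:
    """Return True if `name` is a legal Kafka topic name."""
    return bool(TOPIC_RE.match(name)) and name not in (".", "..")
```

For example, `is_valid_topic("avro_topic")` passes, while a name containing spaces does not.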