Verify the Sink Connector is working

A few minutes after the sink connector starts running, you will see JSON-formatted data uploaded to the msk-lab-{AccountId}-target-bucket bucket.

  • In the console, navigate to the msk-lab-{AccountId}-target-bucket bucket. By default, the Confluent S3 sink connector adds a topics/ prefix to all objects.
  • You will also see the Kafka topic name salesdb.salesdb.CUSTOMER/ as the secondary prefix for the S3 objects.
  • By default, the data is partitioned based on the Kafka partitions. Since the topic has only a single partition, you will see a partition=0/ prefix for all of the JSON objects.
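To make the prefixes above concrete, here is a small sketch of how the connector's default partitioner lays out object keys, assuming default settings (topics.dir of topics, JSON output, and the default zero-padded offset width). The helper function is illustrative only, not part of the connector:

```python
# Sketch of the object-key layout the Confluent S3 sink connector's default
# partitioner produces: <topics.dir>/<topic>/partition=<p>/<topic>+<p>+<offset>.<format>
# Assumes default settings; the helper below is illustrative, not connector code.

def s3_object_key(topic: str, kafka_partition: int, start_offset: int,
                  topics_dir: str = "topics", fmt: str = "json") -> str:
    """Build the S3 key for a batch of records starting at start_offset."""
    # Offsets are zero-padded (default width 10) so keys sort lexicographically.
    filename = f"{topic}+{kafka_partition}+{start_offset:010d}.{fmt}"
    return f"{topics_dir}/{topic}/partition={kafka_partition}/{filename}"

key = s3_object_key("salesdb.salesdb.CUSTOMER", 0, 0)
print(key)
# → topics/salesdb.salesdb.CUSTOMER/partition=0/salesdb.salesdb.CUSTOMER+0+0000000000.json
```

This key structure is why, as you browse the bucket, each level of the path narrows down first by topic and then by Kafka partition.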

Verify the target bucket stores JSON files

Where to go next?

We suggest you create an AWS Glue crawler pointing to the s3://msk-lab-{AccountId}-target-bucket/salesdb.salesdb.CUSTOMER/partition=0/ location. The crawler can infer the schema and automatically create a table in the AWS Glue Data Catalog, which lets you run SQL queries in Amazon Athena against your records in the Amazon S3 data lake.
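If you prefer the AWS CLI over the console, the crawler setup might look roughly like the following. The crawler name, IAM role, and database name are placeholders you would replace with your own, and the bucket path must match your account's actual target bucket:

```shell
# Hypothetical sketch: create a Glue crawler over the connector's output path,
# then run it once. Crawler name, role, and database name are placeholders.
aws glue create-crawler \
  --name salesdb-customer-crawler \
  --role AWSGlueServiceRole-MSKLab \
  --database-name mskcrawldb \
  --targets '{"S3Targets":[{"Path":"s3://msk-lab-ACCOUNT_ID-target-bucket/salesdb.salesdb.CUSTOMER/partition=0/"}]}'

aws glue start-crawler --name salesdb-customer-crawler
```

Once the crawler finishes, the resulting table appears in the chosen Glue database and is immediately queryable from Athena.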