Setup

Steps we will perform in this section

Run the CloudFormation template to create the VPC, the Cloud9 Bastion environment and the Apache Kafka Client EC2 instances

  • Make sure you have created an EC2 KeyPair as shown in the Prerequisites section.
    Note: Create the key pair in .pem format regardless of whether you use macOS or Windows.

  • Right click on Launch Stack and open it in a new tab to execute the CloudFormation template. You can download the CloudFormation template here.




  • Choose the EC2 KeyPair that you created in the Prerequisites step.

  • Click Next.

  • Click Next on the next page.

  • Scroll down to the Capabilities section, select the checkboxes next to "I acknowledge that AWS CloudFormation might create IAM resources with custom names" and "I acknowledge that AWS CloudFormation might require the following capability: CAPABILITY_AUTO_EXPAND", and then click Create stack.


    The stack creates:

    1. A VPC with one public subnet and three private subnets, plus the required plumbing, including a NAT Gateway.
    2. A Cloud9 environment that can be used as a jump box.
    3. Two Apache Kafka client EC2 instances.
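
If you prefer to watch the stack from a terminal instead of the console, the wait can be sketched with the AWS CLI as below. This is a minimal sketch, not part of the lab: the function name wait_for_stack is hypothetical, and it assumes the AWS CLI is installed and configured for the same account and region as the stack.

```shell
# Hypothetical helper (not part of the lab): block until a CloudFormation
# stack finishes creating, then print its final status.
# Argument 1: the stack name.
wait_for_stack() {
  local stack_name="$1"
  # "wait stack-create-complete" polls until the stack reaches
  # CREATE_COMPLETE and exits non-zero if creation fails or times out.
  aws cloudformation wait stack-create-complete --stack-name "$stack_name" || return 1
  aws cloudformation describe-stacks --stack-name "$stack_name" \
    --query 'Stacks[0].StackStatus' --output text
}
```

For example, `wait_for_stack MSKClient` would wait on the first stack, assuming you kept the default stack name MSKClient.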

Run the CloudFormation template to create the MSK cluster

  • Right click on the following link and open it in a new tab to execute the CloudFormation template. You can download the CloudFormation template here.




  • Click Next.

  • For BastionStack, specify the name of the Cloud9 Bastion CloudFormation stack that you created earlier.

  • For MSKKafkaVersion, choose 2.3.1.

  • For PCAARN, leave it blank, as we will not be using TLS mutual authentication in this lab.

  • For TLSMutualAuthentication, keep it set to false.

  • For VPCStack, enter the name of the VPC stack. To find it, go to the CloudFormation console, click on the CloudFormation stack that you created earlier (default MSKClient), open the Outputs tab, and copy the Value for the key VPCStackName.


  • Click Next.

  • Click Next on the next page.

  • Scroll down and click on Create stack.

    The stack can take up to 15 minutes to create. Wait until its status changes to CREATE_COMPLETE before you proceed.

    The stack creates:

    1. An Amazon MSK cluster that accepts both TLS and PLAINTEXT client connections.
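
The VPCStackName lookup described in the steps above can also be done from a terminal. The following is a minimal sketch under stated assumptions: stack_output is a hypothetical helper name, the AWS CLI is configured for the lab account and region, and the first stack kept its default name MSKClient.

```shell
# Hypothetical helper (not part of the lab): print the value of one output
# of a CloudFormation stack.
# Argument 1: stack name. Argument 2: output key.
stack_output() {
  local stack_name="$1" output_key="$2"
  aws cloudformation describe-stacks --stack-name "$stack_name" \
    --query "Stacks[0].Outputs[?OutputKey=='${output_key}'].OutputValue" \
    --output text
}
```

Running `stack_output MSKClient VPCStackName` should print the value that the VPCStack parameter expects.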

Get the cluster information for the Amazon MSK cluster

  • Go to the Amazon MSK console. Click on the Amazon MSK cluster that was created by CloudFormation.

  • Click on View client information on the top right side of the page under Cluster summary.


  • Click on the Copy icon under Bootstrap servers for both TLS and Plaintext, and paste each value into a text editor.

  • Click on the Copy icon under Zookeeper connect and paste the value into the same text editor. Click on Done.
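
The same connection strings can be read with the AWS CLI instead of the console. This is a hedged sketch rather than a lab step: the function names are hypothetical, and msk_cluster_arn assumes the lab cluster is the only MSK cluster in the account and region.

```shell
# Hypothetical helpers (names are illustrative, not part of the lab).

# ARN of the first MSK cluster in the current region. Assumes the lab
# cluster is the only one in the account and region.
msk_cluster_arn() {
  aws kafka list-clusters --query 'ClusterInfoList[0].ClusterArn' --output text
}

# TLS bootstrap broker string for the given cluster ARN.
msk_bootstrap_tls() {
  aws kafka get-bootstrap-brokers --cluster-arn "$1" \
    --query 'BootstrapBrokerStringTls' --output text
}

# Plaintext bootstrap broker string for the given cluster ARN.
msk_bootstrap_plaintext() {
  aws kafka get-bootstrap-brokers --cluster-arn "$1" \
    --query 'BootstrapBrokerString' --output text
}

# ZooKeeper connection string for the given cluster ARN.
msk_zookeeper_connect() {
  aws kafka describe-cluster --cluster-arn "$1" \
    --query 'ClusterInfo.ZookeeperConnectString' --output text
}
```

A typical use would be `arn=$(msk_cluster_arn)` followed by `msk_bootstrap_tls "$arn"`, `msk_bootstrap_plaintext "$arn"`, and `msk_zookeeper_connect "$arn"`.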


Set up SSH keys in the Cloud9 environment and the Amazon MSK environment variables in the KafkaClientEC2Instance

  • Go to the AWS Cloud9 console.

  • Click on MSKClient-Cloud9EC2Bastion and then click Open IDE.


  • In the Getting started section, click on Upload Files…


  • Click on Select files. Pick the EC2 pem file that you created in the Prerequisites section. Click Open. The file will be copied to the /home/ec2-user/environment directory and will also be visible in the left pane.


  • Go to the bash pane at the bottom and type in the following commands to set up the SSH environment so that you can access the Kafka Client EC2 instances.

    chmod 600 <pem file>
    eval `ssh-agent`
    ssh-add -k <pem file>
    
  • SSH to the KafkaClientEC2Instance created by the Cloud9 Bastion CloudFormation stack. Run the following commands.
    Note: If you get a message saying Are you sure you want to continue connecting (yes/no)?, type yes.

    export MSK_STACK=MSK
    export ssh_cmd=$(aws cloudformation describe-stacks --stack-name $MSK_STACK --query 'Stacks[0].Outputs[?OutputKey==`SSHKafkaClientEC2Instance`].OutputValue' --output text)
    $ssh_cmd
    
  • Enter the following commands to set up the Amazon MSK environment variables.

    cd /tmp/kafka
    export MSK_STACK=MSK
    export region=$(curl http://169.254.169.254/latest/meta-data/placement/region)
    python3 setup-env-sasl.py --stackName $MSK_STACK --region $region
    . ./setup_env
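
The region lookup above uses a plain instance metadata request. If the client instance is configured to require IMDSv2 session tokens, that request is rejected. A possible alternative, sketched here as an assumption rather than as part of the lab scripts, fetches a token first; imds_region is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the lab scripts): look up the instance's
# region through IMDSv2 by requesting a short-lived session token first.
imds_region() {
  local token
  # Step 1: request a session token that is valid for 300 seconds.
  token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 300") || return 1
  # Step 2: present the token when reading the region from the metadata service.
  curl -s -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/placement/region"
}
```

With this helper defined, `export region=$(imds_region)` could stand in for the plain curl call.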
    

AWS Glue Schema Registry

In this lab, the clickstream producer that generates mock clickstream data uses the AWS Glue Schema Registry to send Avro-encoded messages to an Amazon MSK Apache Kafka topic. The producer can accept parameters to use a specific registry and a pre-created schema. In this lab, however, it uses the default schema registry (named default-registry) with auto-registration of schemas. When we start the producer in the next section, we will take a look at the auto-registered schema.
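
Once the producer has run, the registry contents can also be inspected from a terminal. The sketch below is illustrative only: the function names are hypothetical, it assumes the AWS CLI is configured with permission to read AWS Glue, and the registry may be empty or absent until the producer has registered its schema.

```shell
# Hypothetical helpers (names are illustrative, not part of the lab).

# List the schema names in a Glue Schema Registry.
# Argument 1 (optional): registry name, defaulting to default-registry.
list_registry_schemas() {
  local registry="${1:-default-registry}"
  aws glue list-schemas --registry-id "RegistryName=${registry}" \
    --query 'Schemas[].SchemaName' --output text
}

# Print the latest definition of one schema.
# Argument 1: schema name, as reported by list_registry_schemas.
# Argument 2 (optional): registry name, defaulting to default-registry.
latest_schema_definition() {
  local schema="$1" registry="${2:-default-registry}"
  aws glue get-schema-version \
    --schema-id "SchemaName=${schema},RegistryName=${registry}" \
    --schema-version-number LatestVersion=true \
    --query 'SchemaDefinition' --output text
}
```

The schema name is not hard-coded here because the producer registers it automatically; pass whichever name `list_registry_schemas` reports to `latest_schema_definition`.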