Amazon CloudWatch Monitoring

  • This lab requires you to have completed the Cluster Creation Lab. If you haven’t, please complete it and then come back to this lab.

In this lab we will look at the basic monitoring available through the Amazon MSK service.

  1. Log in to the console of the account where your cluster is running and go to the MSK service console.

  2. Click on the name of the Amazon MSK cluster you want to monitor.

  3. Click on the Monitoring tab.

  4. You will see a simple dashboard showing metrics from your cluster:

Disk usage by broker - Indicates the percentage of disk space used on each broker. Ideally this is fairly balanced across brokers and well below 80%. Kafka doesn’t like full disks, so we will need to put alarms in place on this metric.
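
One way to put such an alarm in place is a CloudWatch alarm on the per-broker disk metric. The sketch below only builds the alarm parameters and does not call AWS. It assumes the metric is published as KafkaDataLogsDiskUsed in the AWS/Kafka namespace with "Cluster Name" and "Broker ID" dimensions. The helper name, the alarm name format, the cluster name, and the 5-minute period are illustrative choices, not values from this lab.

```python
def disk_alarm_params(cluster_name, broker_id, threshold_percent=80.0):
    """Build keyword arguments for a CloudWatch PutMetricAlarm request.

    Assumption: metric name, namespace, and dimension names follow the
    Amazon MSK default-level metrics; verify them against your cluster.
    """
    return {
        "AlarmName": f"{cluster_name}-broker-{broker_id}-disk-used",
        "Namespace": "AWS/Kafka",
        "MetricName": "KafkaDataLogsDiskUsed",
        "Dimensions": [
            {"Name": "Cluster Name", "Value": cluster_name},
            {"Name": "Broker ID", "Value": str(broker_id)},
        ],
        "Statistic": "Maximum",
        "Period": 300,           # seconds per datapoint (illustrative)
        "EvaluationPeriods": 3,  # require three consecutive breaches
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# "my-msk-cluster" is a hypothetical cluster name.
params = disk_alarm_params("my-msk-cluster", 1)

# With boto3 installed and credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

You would create one such alarm per broker, since the dashboard graph is also per broker.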

CPU (User) usage by broker - How much (user) CPU is being used. You’ll probably want to keep your brokers below peaks of about 60% in production. Daily peaks or sustained periods above this mean you likely need to expand your cluster, or you risk a performance impact in the event of a broker going away, spikes in traffic, new consumers coming online, etc.
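
To see why headroom matters, here is a rough back-of-the-envelope sketch. It assumes load is spread evenly across brokers and moves evenly onto the survivors when one broker goes away, which is a simplification. The function names and the sample peak values are hypothetical.

```python
def cpu_after_broker_loss(cpu_percent, broker_count):
    """Estimate per-broker CPU after losing one broker.

    Assumption: load is even and redistributes evenly to the survivors.
    """
    if broker_count < 2:
        raise ValueError("need at least two brokers to lose one")
    return cpu_percent * broker_count / (broker_count - 1)


def brokers_over_peak(peak_cpu_by_broker, limit_percent=60.0):
    """Return the broker IDs whose peak user CPU exceeds the limit."""
    return sorted(
        broker for broker, peak in peak_cpu_by_broker.items()
        if peak > limit_percent
    )


# Three brokers each peaking at 60% user CPU, then one goes away.
estimate = cpu_after_broker_loss(60.0, 3)

# Hypothetical peaks read off the dashboard, keyed by broker ID.
hot = brokers_over_peak({1: 42.0, 2: 71.5, 3: 58.0})
```

Under these assumptions, a three-broker cluster peaking at 60% would land at roughly 90% per broker after losing one, which leaves very little room for a traffic spike on top.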

Network RX Packets by broker - You’re looking to make sure that each broker is generally getting the same amount of data. Keep an eye out for brokers that flatline (drop to zero and stay) or tabletop (climb to a peak then don’t go above that) as you may have performance issues with your broker in these conditions.

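Network TX Packets by broker - Read this the same way as RX above: each broker should generally be sending a similar amount of data, and you should watch for brokers that flatline or tabletop.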
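
The flatline and tabletop shapes described for the RX and TX graphs can also be checked programmatically. The sketch below is one possible heuristic over a list of packet counts per interval for a single broker. The function names, window sizes, tolerance, and sample data are all assumptions made for illustration.

```python
def is_flatlined(samples, tail=3):
    """True if the last `tail` samples are all zero after earlier traffic."""
    if len(samples) <= tail:
        return False
    head, recent = samples[:-tail], samples[-tail:]
    return any(v > 0 for v in head) and all(v == 0 for v in recent)


def is_tabletop(samples, run=4, tolerance=0.02):
    """True if the series climbed and its last `run` samples sit at the peak.

    "At the peak" means within `tolerance` (as a fraction) of the maximum.
    """
    if len(samples) <= run:
        return False
    peak = max(samples)
    if peak <= 0:
        return False
    floor = peak * (1 - tolerance)
    climbed = min(samples) < floor
    return climbed and all(v >= floor for v in samples[-run:])


# Hypothetical per-interval packet counts for three brokers.
rx_flat = [1200, 1180, 1210, 0, 0, 0]
rx_table = [400, 800, 1200, 1500, 1498, 1500, 1499]
rx_healthy = [900, 1100, 950, 1300, 1000, 1150]
```

A check like this is only a starting point: a broker that is legitimately idle, or traffic that is genuinely steady, would also match, so treat a hit as a prompt to look at the graph rather than as proof of a problem.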

  5. You can change the time span of the graphs by clicking the time span bar in the top right, and turn on auto refresh by clicking the arrow beside the refresh circle.

These graphs give you a summary of the basic inputs, storage, utilization, and output of your cluster. You will probably want to set up alarms on these and look at more advanced metrics and monitoring, so on to the next step!