In this lab we will look at the simple monitoring available through the Amazon MSK service.
Log in to the Console of the account where your cluster is running and go to the MSK Service Console.
Click on the name of the Amazon MSK cluster you are interested in monitoring.
Click on the
You will see a simple dashboard that’s showing you metrics from your cluster:
Disk usage by broker - Indicates the percentage of disk space used on each broker. Ideally this is fairly balanced across brokers and well below 80%. Kafka doesn’t like full disks, so we are going to have to put some alarms in place on this data.
CPU (User) usage by broker - How much user-space CPU is being used. You’ll probably want to keep peaks on your brokers below about 60% in production. Daily peaks or sustained periods above this mean you likely need to expand your cluster; otherwise you risk a performance impact when a broker goes away, traffic spikes, new consumers come online, etc.
Network RX Packets by broker - You’re looking to make sure that each broker is generally getting the same amount of data. Keep an eye out for brokers that flatline (drop to zero and stay there) or tabletop (climb to a peak and then never go above it), as these patterns may indicate performance issues with that broker.
Network TX Packets by broker - Same as RX above.
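The flatline and tabletop patterns described for the network graphs can also be checked programmatically once you have a series of packet counts per broker. The following is a minimal sketch, not part of the MSK console or any AWS library: the function names `is_flatlined` and `is_tabletop`, the `min_run` window, and the 2% `tolerance` are all assumptions chosen for illustration, and you would need to tune them to your own traffic.

```python
from typing import Sequence


def is_flatlined(samples: Sequence[float], min_run: int = 3) -> bool:
    """Return True if the broker had traffic earlier, then dropped to zero and stayed there.

    `min_run` (an assumed default) is how many trailing zero samples count as "stayed".
    """
    if len(samples) <= min_run:
        return False  # not enough history to compare against
    head, tail = samples[:-min_run], samples[-min_run:]
    return all(v == 0 for v in tail) and any(v > 0 for v in head)


def is_tabletop(samples: Sequence[float], min_run: int = 3, tolerance: float = 0.02) -> bool:
    """Return True if the last `min_run` samples all sit at the series peak.

    A sample counts as "at the peak" when it is within `tolerance`
    (an assumed 2%) of the maximum value seen in the series.
    """
    if len(samples) < min_run:
        return False
    peak = max(samples)
    if peak <= 0:
        return False  # an all-zero series is not a plateau at a peak
    floor = peak * (1 - tolerance)
    return all(v >= floor for v in samples[-min_run:])
```

For example, `is_flatlined([120, 130, 0, 0, 0])` flags a broker that stopped receiving packets, while `is_tabletop([100, 300, 500, 500, 498])` flags one that climbed to a ceiling and stayed pinned there. A positive result is only a prompt to look at the broker more closely, not proof of a problem.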
These graphs will give you a summary of the basic inputs, storage, utilization, and output from your cluster. You probably want to set up alarms on these, and look at more advanced metrics and monitoring, so on to the next step!
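As a rough idea of what an alarm on the disk usage metric might look like, here is a minimal sketch that builds the arguments for the CloudWatch `put_metric_alarm` call in boto3. It assumes the MSK metric `KafkaDataLogsDiskUsed` in the `AWS/Kafka` namespace with the `Cluster Name` and `Broker ID` dimensions; the helper names, the alarm name format, the five-minute period, and the three evaluation periods are illustrative choices, not values from this lab. The 80% default mirrors the upper bound mentioned above, and you may well prefer to alarm earlier.

```python
def disk_alarm_params(cluster_name: str, broker_id: int,
                      threshold_percent: float = 80.0) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm (hypothetical helper)."""
    return {
        # Alarm name format is an assumption; use your own naming convention.
        "AlarmName": f"{cluster_name}-broker-{broker_id}-disk-used",
        "Namespace": "AWS/Kafka",
        "MetricName": "KafkaDataLogsDiskUsed",
        "Dimensions": [
            {"Name": "Cluster Name", "Value": cluster_name},
            {"Name": "Broker ID", "Value": str(broker_id)},
        ],
        "Statistic": "Maximum",
        "Period": 300,            # seconds per datapoint (assumed)
        "EvaluationPeriods": 3,   # consecutive breaching periods (assumed)
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


def create_disk_alarm(cluster_name: str, broker_id: int) -> None:
    """Create the alarm; needs boto3 installed and AWS credentials configured."""
    import boto3  # third-party; imported here so the builder above works without it

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**disk_alarm_params(cluster_name, broker_id))
```

Calling `create_disk_alarm("my-cluster", 1)` would create one alarm for broker 1 of a cluster named `my-cluster` (a placeholder name). Without an `AlarmActions` entry, such as an SNS topic ARN, the alarm changes state but notifies no one, so you would add one for real use.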