This is a short tutorial describing how to monitor your Kubernetes cluster's container logs using the Loki stack. Why? Because it is easier to view and filter your logs in Grafana, and to store them persistently in Loki, than to view them in a terminal.
Let’s get started! Assuming you already have Microk8s installed, enable the following addons:
You can enable an add-on by running microk8s enable, e.g. microk8s enable dns.
addons:
enabled:
dns # CoreDNS
ha-cluster # Configure high availability on the current node
metrics-server # K8s Metrics Server for API access to service metrics
storage # Storage class; allocates storage from host directory
Note: Microk8s comes with a bundled kubectl and helm3. Just run microk8s kubectl or microk8s helm3. If you want to use your host kubectl you can configure it via: microk8s config > ~/.kube/config.
Warning: Be extra careful when running the microk8s config > ~/.kube/config command because it will overwrite the old config file.
Then proceed by installing Loki. Loki stores all the logs using object storage. This is efficient, but the trade-off is that you can't do complex aggregations and searches against your data. We are going to install Loki for exploration purposes, but if you're looking for a production-ready setup, check out the loki-distributed Helm chart.
Run the following helm commands to install Loki. You may want to install helm, or use the microk8s helm3 command.
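The chart lives in Grafana's Helm repository; the commands below are a minimal sketch assuming the single-binary grafana/loki chart (chart names and layout may have changed since):

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki grafana/loki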
You should get the following pods and services by running kubectl get pods and kubectl get services:
NAME READY STATUS RESTARTS AGE
loki-0 1/1 Running 0 9m8s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 54m
loki-headless ClusterIP None <none> 3100/TCP 9m23s
loki ClusterIP 10.152.183.187 <none> 3100/TCP 9m23s
Now, we can safely install Promtail. Promtail will import all the container logs into Loki and it should work auto-magically by auto-discovering all the pods that are running inside your cluster.
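Promtail also comes from Grafana's Helm repository. A minimal install could look like the following; the lokiAddress value assumes the loki service created above, and the exact value key may differ between chart versions:

helm install promtail grafana/promtail --set "config.lokiAddress=http://loki:3100/loki/api/v1/push"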
Finally, we need to visualize the logs using Grafana. Install it by running the helm command below, then edit the service to change its type from ClusterIP to NodePort.
Changing the service type to NodePort will allow you to visit Grafana in your browser without adding an ingress.
❗❗ To use VS Code as the default editor, export the following environment variable: KUBE_EDITOR="code -w"
helm install grafana grafana/grafana
kubectl edit service/grafana
# Change spec.type to NodePort
# Grab the service's port using kubectl get services and look for 32204:
# grafana NodePort 10.152.183.84 <none> 80:32204/TCP 6d
Note: If you're on Windows, to access the service you will need to run kubectl cluster-info and use the cluster's IP address. On Linux you should be able to access http://localhost:32204.
kubectl cluster-info
Kubernetes control plane is running at https://172.20.138.170:16443
To access Grafana visit: http://172.20.138.170:32204 where 32204 is the service’s NodePort.
Grab your Grafana admin password by following the instructions from the helm notes, which are displayed after Grafana has been installed. If you don't have base64 on your OS, check out CyberChef; it can decode base64 text.
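The notes boil down to reading the admin-password key from the grafana secret (the secret is named after the Helm release, so adjust the name if yours differs):

kubectl get secret grafana -o jsonpath="{.data.admin-password}" | base64 --decode; echo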
After you've successfully logged in, head to Settings -> Data Sources and add the Loki data source. Inside the cluster, Loki is reachable at http://loki:3100 (the loki service on port 3100 from earlier).
Head back to the Explore menu and display Loki’s logs using the Loki data source in Grafana. You can click log browser to view all available values for the app label.
Promtail should now import logs into Loki and create labels dynamically for each newly created container. If you followed along, congratulations!
The purpose of this article is to get you started quickly with Home Assistant on a Raspberry Pi. It's a simple walkthrough on how to install Home Assistant and configure it so it boots with your Pi.
I will use my old Raspberry Pi 3 board.
Flashing the Raspberry Pi OS
You will need a microSD card of reasonable size (I'm using a 16GB one) and a USB adapter to connect it to your PC.
Head over to the Raspberry Pi OS website and download your preferred image; for my Home Assistant I've chosen Raspberry Pi OS with desktop and recommended software. After the download completes, unzip the file and prepare to flash it.
To flash the OS image on the SD card I will use a program called balenaEtcher.
Download it, select your OS image, select the SD card, and hit flash.
After the SD card flashing finishes, it is time to set up the Wi-Fi connection. If you're using an Ethernet cable you can skip this step; however, remember to enable SSH.
Setting up the Wi-Fi and enabling SSH
Unplug the SD card from the computer and plug it back in. You should see two new drives, D: and E:.
Open your favorite text editor and create an empty file called ssh in drive E: (the boot partition). This will enable SSH access.
Create a new file called wpa_supplicant.conf in the same boot drive and paste the following contents into it:
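A minimal wpa_supplicant.conf looks roughly like this (the country code is an example; change it to your own two-letter code):

country=US
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1

network={
    ssid="YOUR_WIFI_SSID"
    psk="YOUR_WIFI_PASSWORD"
}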
Don't forget to replace YOUR_WIFI_SSID and YOUR_WIFI_PASSWORD with the values for your Wi-Fi network.
Eject the SD card from your computer and plug it into the PI. At boot, the PI should automatically connect to your Wi-Fi network.
Installing Home Assistant Core
Find your Raspberry Pi's IP address and connect to it via SSH. You can run the command ssh pi@192.168.0.XXX. The password for the pi user should be raspberry.
After getting a shell, follow the instructions for installing Home Assistant from the official website.
Ensure that you run each command on its own line. Don't copy the entire code block directly; copy each line individually.
Starting Home Assistant on boot
If you can access the Home Assistant web GUI at http://192.168.0.XXX:8123, then the next step is to create a new systemd service so that Home Assistant starts at boot. Please replace XXX with your Raspberry Pi's IP address.
To create a new service:
Start a new shell on the Raspberry Pi and ensure that you're using the pi user. We will execute commands with sudo.
Use sudo nano /etc/systemd/system/hass.service to create a new file and paste the following contents into it:
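A sketch of the unit file, assuming the homeassistant user and the /srv/homeassistant virtualenv from the official Core install guide; adjust User and ExecStart to match your installation:

[Unit]
Description=Home Assistant
After=network-online.target

[Service]
Type=simple
User=homeassistant
ExecStart=/srv/homeassistant/bin/hass -c "/home/homeassistant/.homeassistant"

[Install]
WantedBy=multi-user.target

Then reload systemd, enable the service so it starts at boot, and check that it is running:

sudo systemctl daemon-reload
sudo systemctl enable hass
sudo systemctl start hass
sudo systemctl status hass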
If the service is running normally, everything is set up. You can safely reboot your Pi and the Home Assistant service will start after boot.
Configuring Home Assistant
When visiting the Home Assistant’s web interface for the first time, you will be prompted to create a new user. You may also download the Home Assistant application for your mobile device if you wish to track things like battery, storage, steps, location and so on, in Home Assistant.
In future articles I will show you how to configure the BME680 environmental sensor and how to activate the Apple HomeKit integration. Until then, have fun exploring the Home Assistant docs.
Further things to do:
Unattended Upgrades – Enable unattended upgrades for your Raspbian OS. This ensures that your OS is always patched and up to date.
UFW – Secure your Home Assistant server with the Uncomplicated Firewall.
Change default passwords or disable SSH login via password.
Click on mongodb-kafka-connect-mongodb-1.6.0.zip, unzip it, and copy the directory into the plugin path /usr/share/java, as defined by the CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components" environment variable.
Connect needs to be restarted to pick up the newly installed plugin.
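If you're using the docker-compose playground from the confluent-cp-community repo, the Connect worker service is presumably called connect (it matches the connect:8083 worker id we'll see later), so a restart is a one-liner:

docker-compose restart connect

Afterwards, verify that the connector plugin has been installed successfully: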
➜ bin curl -s -X GET http://localhost:8083/connector-plugins | jq | head -n 20
[
{
"class": "com.mongodb.kafka.connect.MongoSinkConnector",
"type": "sink",
"version": "1.6.0"
},
{
"class": "com.mongodb.kafka.connect.MongoSourceConnector",
"type": "source",
"version": "1.6.0"
},
Note: If you don’t have jq installed you can omit it.
Creating the topics
Before starting the connector, let's create the Kafka topics events and events.deadletter; they will be used by the connector.
To create the topics, we will need to download Confluent tools and run kafka-topics.
curl -s -O http://packages.confluent.io/archive/6.2/confluent-community-6.2.0.tar.gz
tar -xzf confluent-community-6.2.0.tar.gz
cd confluent-6.2.0/bin/
./kafka-topics --bootstrap-server localhost:9092 --list
__consumer_offsets
__transaction_state
_confluent-ksql-default__command_topic
_schemas
default_ksql_processing_log
docker-connect-configs
docker-connect-offsets
docker-connect-status
./kafka-topics --bootstrap-server localhost:9092 --create --topic events --partitions 3
Created topic events.
./kafka-topics --bootstrap-server localhost:9092 --create --topic events.deadletter --partitions 3
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues, it is best to use either, but not both.
Created topic events.deadletter.
Note: You will need Java to run the Confluent tools. If you're on Ubuntu, you can install it with sudo apt install openjdk-8-jdk.
Starting the connector 🚙
To start the connector, it is enough to make a single POST request to the connector's API with the connector's configuration.
The configuration that we will use is going to be:
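(A sketch: the connector settings mirror the description below, while the converter and dead-letter tolerance settings are assumptions for plain, schemaless JSON messages on a single-broker playground.)

curl -s -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mongo-sink-connector",
    "config": {
      "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
      "tasks.max": "1",
      "topics": "events",
      "connection.uri": "mongodb://mongodb:27017/my_events",
      "database": "my_events",
      "collection": "kafka_events",
      "key.converter": "org.apache.kafka.connect.storage.StringConverter",
      "value.converter": "org.apache.kafka.connect.json.JsonConverter",
      "value.converter.schemas.enable": "false",
      "errors.tolerance": "all",
      "errors.deadletterqueue.topic.name": "events.deadletter",
      "errors.deadletterqueue.topic.replication.factor": "1"
    }
  }'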
In short, this POST will create a new connector named mongo-sink-connector that uses the com.mongodb.kafka.connect.MongoSinkConnector Java class. It will run a single connector task that takes all the messages from the events topic and puts them into the MongoDB instance found at mongodb://mongodb:27017/my_events, into a database named my_events and a collection named kafka_events. Records that fail to be written into the database will be placed on a dead letter topic named events.deadletter; in my opinion this is better than discarding them, since we can inspect the topic to see what went wrong.
To verify that the connector is running, you can retrieve its first task's status with:
➜ bin curl -s -X GET http://localhost:8083/connectors/mongo-sink-connector/tasks/0/status | jq
{
"id": 0,
"state": "RUNNING",
"worker_id": "connect:8083"
}
Querying the Database 🗃
Now that our Kafka Connect cluster is running and is configured, all that’s left to do is POST some dummy data into Kafka and check for it in the database.
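One quick way to produce a test message, using the Confluent tools downloaded earlier, is the console producer: run it, paste a JSON document on a single line, and press Enter. The glossary sample below is the same document you'll see again in MongoDB:

./kafka-console-producer --bootstrap-server localhost:9092 --topic events
{"glossary":{"title":"example glossary","GlossDiv":{"title":"S","GlossList":{"GlossEntry":{"ID":"SGML","SortAs":"SGML","GlossTerm":"Standard Generalized Markup Language","Acronym":"SGML","Abbrev":"ISO 8879:1986","GlossDef":{"para":"A meta-markup language, used to create markup languages such as DocBook.","GlossSeeAlso":["GML","XML"]},"GlossSee":"markup"}}}}}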
That’s all! 🎉If we now connect to the database using mongosh or any other client, we can query the data.
mongosh
> use my_events
switched to db my_events
> db.kafka_events.findOne()
{
_id: ObjectId("6147242856623b0098fc756d"),
glossary: {
title: 'example glossary',
GlossDiv: {
title: 'S',
GlossList: {
GlossEntry: {
ID: 'SGML',
SortAs: 'SGML',
GlossTerm: 'Standard Generalized Markup Language',
Acronym: 'SGML',
Abbrev: 'ISO 8879:1986',
GlossDef: {
para: 'A meta-markup language, used to create markup languages such as DocBook.',
GlossSeeAlso: [ 'GML', 'XML' ]
},
GlossSee: 'markup'
}
}
}
}
}
Viewing Kafka Connect JMX Metrics
JConsole is a tool that can be used to view the JMX metrics exposed by Kafka Connect. If you installed openjdk-8, it should come bundled with it.
Start JConsole and connect to localhost:9102. If you get a warning about an insecure connection, accept the connection, and ignore it.
After you're connected, click the MBeans tab and explore 🦹♀️
Summary
Getting into Kafka and Kafka Connect can be a bit overwhelming at first. I hope that this tutorial has provided you with the necessary basics so you can continue to play and explore on your own.
Spinning up a playground for Kafka and Connect using docker-compose isn't that complicated: you can start from the confluent-cp-community repo, which gives you everything you need to get started. With a few small modifications to the docker-compose file, we've spawned a MongoDB instance and exposed the JMX metrics in Kafka Connect.
Next, we’ve installed and configured the MongoDB connector and confirmed that it works as expected.
If you have any questions let me know in the comments.
In this article I will give you some tips on how to improve the throughput of a message producer.
I had to write a Golang-based application which would consume messages from Apache Kafka and send them into a sink using HTTP JSON / HTTP Protocol Buffers.
To see if my idea worked, I started with a naïve approach in which I polled Kafka for messages and then sent each message into the sink, one at a time. This worked, but it was slow.
To better understand the system, a colleague set up Grafana and a monitoring dashboard, using the Prometheus metrics provided by the Sink. This allowed us to test various versions of the producer and observe its behavior.
Let’s explore what we can do to improve the throughput.
Request batching 📪
A very important improvement is request batching.
Instead of sending one message at a time in a single HTTP request, try to send more, if the sink allows it.
As the monitoring dashboard showed, this simple idea improved the throughput from 100 msg/sec to ~4000 msg/sec.
Batching is tricky: if your batches are too large, the receiver might be overwhelmed, or the producer might have a tough time building them; if your batches contain only a few items, you might not see an improvement. Try to choose a batch size that isn't too high and not too low either.
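A minimal sketch of the idea in Go, assuming the sink accepts a JSON array of messages at a single endpoint (the URL and payload shape are assumptions for illustration):

import (
    "bytes"
    "encoding/json"
    "net/http"
)

// sendBatch marshals a slice of messages into one JSON array and sends it in a
// single HTTP request instead of one request per message.
func sendBatch(client *http.Client, url string, batch []map[string]interface{}) error {
    body, err := json.Marshal(batch)
    if err != nil {
        return err
    }
    resp, err := client.Post(url, "application/json", bytes.NewReader(body))
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    return nil
}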
Fast JSON libraries ⏩
If you’re using HTTP and JSON then it’s a good idea to replace the standard JSON library.
There are lots of open-source JSON libraries that provide much higher performance compared to standard JSON libraries that are built in the language.
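In Go, for example, a drop-in replacement such as json-iterator/go can be swapped in with a one-line change (a sketch; measure with your own payloads before settling on a library):

import jsoniter "github.com/json-iterator/go"

// json acts as a drop-in replacement for the standard encoding/json package.
var json = jsoniter.ConfigCompatibleWithStandardLibrary

// Existing calls such as json.Marshal(v) and json.Unmarshal(data, &v) keep working.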
Partitioning
There are several partitioning strategies that you can implement, depending on your tech stack.
Kafka allows you to assign one consumer to each partition of a topic: if a topic has 3 partitions, you can run 3 consumer instances in parallel on it, all in the same consumer group, and each instance will read its own partition. I did not use this approach, since the Sink does not allow it; only one instance of the Producer runs at a time.
If you have multiple topics that you want to consume from, you can partition on the topic name or a topic name pattern by subscribing to multiple topics at once using a regex. For example, you can have 3 consumers consuming from highspeed.* and 3 consumers consuming from other.*, if each topic has 3 partitions.
Note: The standard build of librdkafka doesn't support negative lookahead regex expressions; if that's what you need, you will have to build the library from source. See issues/2769. It's easy to do, and the confluent-kafka-go client supports custom builds of librdkafka.
Negative lookahead expressions allow you to ignore some patterns, see this example for a better understanding: regex101.com/r/jZ9AEz/1
Source Parallelization 🕊
The confluent-kafka-go client allows you to poll Kafka for messages. Since polling is thread safe, it can be done in multiple goroutines or threads for a performance improvement.
package main

import (
    "fmt"
    "sync"

    "github.com/confluentinc/confluent-kafka-go/kafka"
)

func main() {
    var wg sync.WaitGroup

    c, err := kafka.NewConsumer(&kafka.ConfigMap{
        "bootstrap.servers": "localhost",
        "group.id":          "myGroup",
        "auto.offset.reset": "earliest",
    })
    if err != nil {
        panic(err)
    }

    if err := c.SubscribeTopics([]string{"myTopic", "^aRegex.*[Tt]opic"}, nil); err != nil {
        panic(err)
    }

    // Poll Kafka from several goroutines; each one reads messages independently.
    for i := 0; i < 5; i++ {
        wg.Add(1) // register the goroutine before starting it, not inside it
        go func() {
            defer wg.Done()
            for {
                msg, err := c.ReadMessage(-1)
                if err == nil {
                    fmt.Printf("Message on %s: %s\n", msg.TopicPartition, string(msg.Value))
                    // TODO: Send data through a channel to be processed by another goroutine.
                } else {
                    // The client will automatically try to recover from all errors.
                    fmt.Printf("Consumer error: %v (%v)\n", err, msg)
                }
            }
        }()
    }

    // The reader loops above run until the process is stopped.
    wg.Wait()
    c.Close()
}
Buffered Golang channels can also be used in this scenario in order to improve the performance.
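For instance, the TODO in the snippet above could push messages into a buffered channel, with a separate goroutine draining it and building batches (a sketch; the buffer and batch sizes are assumptions to be tuned):

// messages decouples the Kafka readers from the sender; the buffer absorbs bursts.
messages := make(chan *kafka.Message, 1000)

// In the reader goroutines, replace the TODO with:
//   messages <- msg

// A separate goroutine drains the channel and sends batches to the sink.
go func() {
    batch := make([]*kafka.Message, 0, 500)
    for msg := range messages {
        batch = append(batch, msg)
        if len(batch) == cap(batch) {
            // send the accumulated batch to the sink (e.g. with a sendBatch helper), then reset
            batch = batch[:0]
        }
    }
}()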
Protocol Buffers 🔷
Finally, I saw a huge performance improvement when replacing the JSON body of the request with Protocol Buffers encoded and snappy compressed data.
If your Sink supports receiving protocol buffers, then it is a good idea to try sending them instead of JSON.
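A sketch of the sending side, assuming a generated protobuf message plus the google.golang.org/protobuf and github.com/golang/snappy packages; the content headers are assumptions that depend on what your Sink expects:

import (
    "bytes"
    "net/http"

    "github.com/golang/snappy"
    "google.golang.org/protobuf/proto"
)

// sendProto marshals a protobuf message, snappy-compresses it and POSTs it to the sink.
func sendProto(client *http.Client, url string, msg proto.Message) error {
    encoded, err := proto.Marshal(msg)
    if err != nil {
        return err
    }
    compressed := snappy.Encode(nil, encoded)

    req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(compressed))
    if err != nil {
        return err
    }
    req.Header.Set("Content-Type", "application/x-protobuf")
    req.Header.Set("Content-Encoding", "snappy")

    resp, err := client.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    return nil
}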
Honorable Mention: GZIP Compressed JSON 📚
The Sink supported receiving GZIP compressed JSON, but in my case I didn’t see any notable performance improvements.
I’ve compared the RAM and CPU usage of the Producer, the number of bytes sent over the network and the message throughput. While there were some improvements in some areas, I decided not to implement GZIP compression.
It’s all about trade-offs and needs.
Conclusion
As you can see, there are several things you can do to your producers in order to make them more efficient.
Request Batching
Fast JSON Libraries
Partitioning
Source Parallelization & Buffering
Protocol Buffers
Compression
I hope you’ve enjoyed this article and learned something! If you have some ideas, please let me know in the comments.