Testing Tips: Avoid sleep in tests

Hi πŸ‘‹,

In this article I wanna show a testing tip that I’ve recently learned myself by reading Software Engineering at Google: Lessons Learned from Programming Over Time. The technique improved the way I write unit tests.

When I’m writing bigger unit tests, I have execute something in the background, like for example publishing a message to a message broker, wait for the message to be published and then consume it to test that what I published is correct.

When waiting for the message to be published or any other operation that required waiting in tests I used to call a sleep function, for a second or two, this is decent for few tests but if your tests grow then this approach does not scale well. Imagine if you’re having 50 tests and each test sleeps for one second, it would take at least 50 seconds to run the test suite, which is a lot of wasted time.

The better approach is to use a timeout and polling, you can poll at every millisecond to see if your test has done what you wanted to do instead of sleeping, this will improve the tests and reduce the execution time by a lot!

Let’s demonstrate this will a small example using the Golang programming language, I’m not going to use any external dependencies to demonstrate this technique but you can apply it everywhere you’re calling something that blocks or if you need to wait for something.

What we’re going to test is a simple struct with a method that blocks and modifies a field.

import (
	"math/rand"
	"time"
)

type SystemUnderTest struct {
	Result string
}

func (s *SystemUnderTest) SetResult() {
	go func() {
		time.Sleep(time.Duration(rand.Intn(3000)) * time.Millisecond)
		s.Result = "the_result"
	}()
}

func (s *SystemUnderTest) GetData() string {
	time.Sleep(time.Duration(rand.Intn(3000)) * time.Millisecond)
	return "the_data"
}

This is the not ideal way of testing it:

// A not very ideal way to test SetResult
func Test_SystemUnderTest_SetResult_NotIdeal(t *testing.T) {
	sut := SystemUnderTest{}
	sut.SetResult()

	time.Sleep(4 * time.Second)

	if sut.Result != "the_result" {
		t.Fatalf("Result not equal, want %s got %s", "the_result", sut.Result)
	}
}

SetResults takes between 0 to 3 seconds to run, since we’re waiting for the result we’re sleeping for 4 seconds.

=== RUN   Test_SystemUnderTest_SetResult_NotIdeal
--- PASS: Test_SystemUnderTest_SetResult_NotIdeal (4.00s)
PASS

A better way is to write a simple loop and poll for the result:

// A better way of testing the code
func Test_SystemUnderTest_SetResult(t *testing.T) {
	sut := SystemUnderTest{}
	sut.SetResult()

	passedMilliseconds := 0
	for {
		if passedMilliseconds > 4000 {
			t.Fatalf("timeout reached")
		}
		passedMilliseconds += 1
		time.Sleep(1 * time.Millisecond)
		if sut.Result != "" {
			break
		}
	}
	if sut.Result != "the_result" {
		t.Fatalf("Result not equal, want %s got %s", "the_result", sut.Result)
	}
}

Writing a loop and polling for the result will make the test more complex but it will execute faster. In this case the benefits outweigh the downsides.

=== RUN   Test_SystemUnderTest_SetResult
--- PASS: Test_SystemUnderTest_SetResult (2.08s)
PASS

If the language permits we can also use channels, let’s change the following function that returns a result after a random amount of time and test it.

func Test_SystemUnderTest_GetData(t *testing.T) {
	sut := SystemUnderTest{}

	timeoutTicker := time.NewTicker(5 * time.Second)
	result := make(chan string)

	// Get result when ready
	go func() {
		result <- sut.GetData()
	}()

	select {
	case <-timeoutTicker.C:
		t.Fatal("timeout reached")
	case actual := <-result:
		if actual != "the_data" {
			t.Fatalf("Data not equal, want: %s, got %s", "the_data", actual)
		}
	}
}

We avoided writing a loop with the use of a ticker and select.

In another case you may need to test HTTP calls on the local machine or any other library. Look for timeout options.

Go’s HTTP library let’s you specify a custom timeout for every call you make:

	client := http.Client{
		Timeout: 50 * time.Millisecond,
	}
	response, err := client.Get("http://localhost:9999/metrics")
	...

In Conclusion

Avoid the use of sleep in tests, try polling for the result instead or check if the blocking functions have parameters or can be configured to stop the execution after a timeout period.

Thanks for reading and I hope you’ve enjoyed this article! 🍻

Improving the throughput of a Producer βœˆ

Hello πŸ‘‹,

In this article I will give you some tips on how to improve the throughput of a message producer.

I had to write a Golang based application which would consume messages from Apache Kafka and send them into a sink using HTTP JSON / HTTP Protocol Buffers.

To see if my idea works, I started using a naΓ―ve approach in which I polled Kafka for messages and then send each message into the sink, one at a time. This worked, but it was slow.

To better understand the system, a colleague has setup Grafana and a dashboard for monitoring, using Prometheus metrics provided by the Sink. This allowed us to test various versions of the producer and observe it’s behavior.

Let’s explore what we can do to improve the throughput.

Request batching πŸ“ͺ

A very important improvement is request batching.

Instead of sending one message at a time in a single HTTP request, try to send more, if the sink allows it.

As you can see in the image, this simple idea improved the throughput from 100msg/sec to ~4000msg/sec.

Batching is tricky, if your batches are large the receiver might be overwhelmed, or the producer might have a tough time building them. If your batches contain a few items you might not see an improvement. Try to choose a batch number which isn’t too high and not to low either.

Fast JSON libraries ⏩

If you’re using HTTP and JSON then it’s a good idea to replace the standard JSON library.

There are lots of open-source JSON libraries that provide much higher performance compared to standard JSON libraries that are built in the language.

See:

The improvements will be visible.

Partitioning πŸ–‡

There are several partitioning strategies that you can implement. It depends on your tech stack.

Kafka allows you to assign one consumer to one partition, if you have 3 partitions in a topic then you can run 3 consumer instances in parallel from that topic, in the same consumer group, this is called replication, I did not use this as the Sink does not allow it, only one instance of the Producer is running at a time.

If you have multiple topics that you want to consume from, you can partition on the topic name or topic name pattern by subscribing to multiple topics at once using regex. You can have 3 consumers consuming from highspeed.* and 3 consumer consuming from other.*. If each topic has 3 partitions.

Note: The standard build of librdkafka doesn’t support negative lookahead regex expressions, if that’s what you need you will need to build the library from source. See issues/2769. It’s easy to do and the confluent-kafka-go client supports custom builds of librdkafka.

Negative lookahead expressions allow you to ignore some patterns, see this example for a better understanding: regex101.com/r/jZ9AEz/1

Source Parallelization πŸ•Š

The confluent-kafka-go client allows you to poll Kafka for messages. Since polling is thread safe, it can be done in multiple goroutines or threads for a performance improvement.

import (
	"fmt"
	"github.com/confluentinc/confluent-kafka-go/kafka"
)

func main() {
    var wg sync.WaitGroup
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost",
		"group.id":          "myGroup",
		"auto.offset.reset": "earliest",
	})

	if err != nil {
		panic(err)
	}
   
	c.SubscribeTopics([]string{"myTopic", "^aRegex.*[Tt]opic"}, nil)
    for i := 0; i < 5; i++ {
        go func() {
            wg.Add(1)
            defer wg.Done()
	        for {
	    	    msg, err := c.ReadMessage(-1)
		        if err == nil {
		    	    fmt.Printf("Message on %s: %s\n", msg.TopicPartition, string(msg.Value))
                    // TODO: Send data through a channel to be processed by another goroutine.
    		    } else {
			        // The client will automatically try to recover from all errors.
			        fmt.Printf("Consumer error: %v (%v)\n", err, msg)
		        }
	        }
        }()
    }
    wg.Wait()

	c.Close()
}

Buffered Golang channels can also be used in this scenario in order to improve the performance.

Protocol Buffers πŸ”·

Finally, I saw a huge performance improvement when replacing the JSON body of the request with Protocol Buffers encoded and snappy compressed data.

If your Sink supports receiving protocol buffers, then it is a good idea to try sending it instead of JSON.

Honorable Mention: GZIP Compressed JSON πŸ“š

The Sink supported receiving GZIP compressed JSON, but in my case I didn’t see any notable performance improvements.

I’ve compared the RAM and CPU usage of the Producer, the number of bytes sent over the network and the message throughput. While there were some improvements in some areas, I decided not to implement GZIP compression.

It’s all about trade-offs and needs.

Conclusion

As you could see, there are several things you can do to your producers in order to make them more efficient.

  • Request Batching
  • Fast JSON Libraries
  • Partitioning
  • Source Parallelization & Buffering
  • Protocol Buffers
  • Compression

I hope you’ve enjoyed this article and learned something! If you have some ideas, please let me know in the comments.

Thanks for reading! πŸ˜€