Aiven PostgreSQL plans are now larger and faster with local SSDs

Aiven PostgreSQL plans are now larger

As part of our ongoing efforts to improve our service according to client feedback, we've increased the disk space for all Aiven PostgreSQL plans.

Depending on plan type, we've increased disk space anywhere from 12% to 75%. Check out the table below to see improvements according to plan type:

Plan                           Previous storage   Improved storage
Startup/Business/Premium-4     50 GB              80 GB
Startup/Business/Premium-8     100 GB             175 GB
Startup/Business/Premium-16    200 GB             350 GB
Startup/Business/Premium-32    400 GB             700 GB
Startup/Business/Premium-64    800 GB             1000 GB
Startup/Business/Premium-120   1200 GB            1400 GB
Startup/Business/Premium-160   1600 GB            1800 GB
Startup/Business/Premium-240   2400 GB            2800 GB

New Aiven PostgreSQL service plans will come with the increased disk space while existing service plans will receive the increases during their next upgrades.

We can also provide even greater disk space upon request. If you're interested, feel free to contact our sales team for more information!

For those of you who are running your PostgreSQL services on Amazon Web Services (AWS) and Google Cloud Platform (GCP), the news is even better.

Aiven supports local SSDs in AWS and GCP

In fact, we are the first DBaaS to offer such support. 

Specifically, we will use PCIe NVMe local SSDs starting from our Startup/Business/Premium-8 plans in GCP and Startup/Business/Premium-16 plans in AWS.

So why is this important?

Our initial tests have demonstrated up to a 400% increase in performance when using local SSDs over network-based SSDs. 

To learn more, view our benchmarking presentation that we gave at this year's PostgreSQL Conference Europe.

Currently, only AWS and GCP support local SSDs, but we hope to extend this to other cloud providers once they offer the required support.

We will follow up later with a longer blog post that goes into more detail.

Increase your PostgreSQL performance with Aiven

We are constantly striving to be the first to offer features and updates that will markedly improve the performance of the services you run with us.

That is why we made sure to be the first to provide local SSDs in our GCP and AWS plans, as well as release production-ready PostgreSQL 10.

If you are a current client, we'd like to thank you for working with us to improve industry standards. If you aren't a client yet, give us a try!

Trying Aiven is free and comes with no commitments. Start your trial today and receive $10 worth of test credits.  


Aiven talks shop with Paf at second Apache Kafka meetup

In August, Aiven organized its first Apache Kafka meetup in Helsinki to discuss hot topics surrounding Kafka.

Due to its popularity, we decided to make it an ongoing event and held our second meetup yesterday.

This time, around 30 people gathered at Lifeline Ventures' office in downtown Helsinki, and there were two presenters:
  1. Niklas Nylund of Paf, an operator of slot machines, lotteries, poker, casino games and betting, both online and in casinos.
  2. Heikki Nousiainen, our CTO.

First up was Nylund, who discussed the ins and outs of how his team brought in Kafka to act as a data bus for their real-time analytics needs.

Kafka reduces the spaghetti  

Paf collects Change Data Capture (CDC) events from its databases and sends them to Kafka; the events are then consumed and imported into Kudu for analytic work.

Although Kafka has its quirks, Nylund is satisfied with how Paf has been able to use it to streamline their architecture.

Or, as he vividly put it, "Reducing the spaghetti."

Check out his presentation slides here to get a thorough understanding of Paf's integration process from beginning to end.

Kafka Connect simplifies integration

Second up was our CTO, who discussed the Kafka Connect framework and how to use it to transfer data between Kafka and other systems, in this case PostgreSQL and Elasticsearch.

Using Python code to interact with the services, Nousiainen was able to demonstrate how easy it was to push and pull data between Kafka and the external systems.

With a large number of available connectors from the Kafka community, integrating Kafka with other systems can be quite straightforward.

This, in turn, lets you realize quick benefits and migrate towards real-time stream analytics with a Kafka-centric architecture.

Check out Nousiainen's presentation slides to get a better idea of the use cases for Kafka Connect.

Join the next Kafka discussion  

As the transition from a monolithic to microservices architecture continues, the use case for integrating Kafka as a streaming platform will only strengthen. 

This is evidenced by the growing attendance at our events, where developers, Kafka users or not, gather to learn what Kafka is, what its use cases are, and the best practices for implementing it.

We are planning another Aiven Kafka meetup for the December/January timeframe, so join the Helsinki Apache Kafka Meetup group to get the details once we finalize the plans. See you soon!


Aiven is the first to offer PostgreSQL 10

Get Aiven PostgreSQL 10
We've got great news: PostgreSQL 10 is now available at Aiven on all major clouds!

That means that you can now access it on AWS, Google Cloud, Azure, UpCloud and DigitalOcean. Worldwide.

As with every PostgreSQL release, many of your older queries will simply run faster because of the many performance enhancements. But, why is PostgreSQL 10 significant?

As the 28th major update of the past 30 years, its primary focus is on improving the distribution of massive amounts of data across many nodes...let's look at some specifics.

PostgreSQL 10: the specifics 

Improved support for parallel queries 

Now, many more scan types are supported and can benefit from parallelization.

Depending on your query, newly added scan types such as parallel index scan and bitmap heap scan can speed it up immensely.

Also, merge joins are now a supported parallel join type in addition to other join types already supported in the previous release, such as hash joins and nested loop joins.
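As a quick illustration, here's a sketch using psycopg2 (with a placeholder connection string and a hypothetical measurements table, not from this post) of how to verify that a query takes advantage of the new parallel plans:

import psycopg2

# Placeholder DSN; substitute your own Aiven PostgreSQL 10 service URI
conn = psycopg2.connect("postgres://avnadmin:secret@pg10.example.aivencloud.com:20986/defaultdb?sslmode=require")
cur = conn.cursor()
# Allow up to four parallel workers per gather node for this session
cur.execute("SET max_parallel_workers_per_gather = 4")
# Look for "Parallel Seq Scan", "Parallel Index Scan" or "Parallel Bitmap
# Heap Scan" nodes in the resulting plan
cur.execute("EXPLAIN ANALYZE SELECT count(*) FROM measurements WHERE value > 100")
for (line,) in cur.fetchall():
    print(line)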

Declarative partitioning support

While you could create partitioning schemes in past versions of PostgreSQL by directly using constraints, inheritance and triggers, you can now use simple declarative definitions to create your partitioning setup in PostgreSQL 10.

Even better, the performance of the new partitioning code is vastly improved over the older methods.
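As a sketch of what that looks like (the table and connection details are hypothetical), driven from Python with psycopg2:

import psycopg2

conn = psycopg2.connect("postgres://avnadmin:secret@pg10.example.aivencloud.com:20986/defaultdb?sslmode=require")
cur = conn.cursor()
# The parent table declares the partitioning scheme...
cur.execute("""
    CREATE TABLE measurements (
        logdate date NOT NULL,
        value   numeric
    ) PARTITION BY RANGE (logdate)
""")
# ...and each partition declares the range of rows it accepts
cur.execute("""
    CREATE TABLE measurements_2017_q4 PARTITION OF measurements
        FOR VALUES FROM ('2017-10-01') TO ('2018-01-01')
""")
conn.commit()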

Hash indexes

PostgreSQL 10 brings crash-safe hash index support that also performs far better than before.

Now, you can consider using hash indexes to increase performance when your queries just need to check for equivalence.
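Creating one only requires choosing the index method; a minimal sketch with a hypothetical sessions table:

import psycopg2

conn = psycopg2.connect("postgres://avnadmin:secret@pg10.example.aivencloud.com:20986/defaultdb?sslmode=require")
cur = conn.cursor()
# Hash indexes only support equality lookups, but are compact and fast for them
cur.execute("CREATE INDEX sessions_token_idx ON sessions USING hash (token)")
conn.commit()
# Equality predicates like this one can now use the hash index
cur.execute("SELECT user_id FROM sessions WHERE token = %s", ("abc123",))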

Native logical replication

PostgreSQL 10 now brings proper support for logical replication in PostgreSQL itself.

Logical replication allows replication between different PostgreSQL versions, finally allowing for zero-downtime upgrades to future versions.

You can also migrate data to and from environments where you don't have access to streaming replication.
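Here's a rough sketch of the moving parts, with placeholder service URIs: the source database publishes a set of tables, and the destination subscribes to that publication.

import psycopg2

# Publisher side: the source database announces the tables to replicate
src = psycopg2.connect("postgres://avnadmin:secret@pg10-source.example.aivencloud.com:20986/defaultdb?sslmode=require")
src.autocommit = True
src.cursor().execute("CREATE PUBLICATION device_pub FOR TABLE measurements")

# Subscriber side: CREATE SUBSCRIPTION cannot run inside a transaction
# block, so autocommit is required here
dst = psycopg2.connect("postgres://avnadmin:secret@pg10-dest.example.aivencloud.com:20986/defaultdb?sslmode=require")
dst.autocommit = True
dst.cursor().execute(
    "CREATE SUBSCRIPTION device_sub "
    "CONNECTION 'host=pg10-source.example.aivencloud.com port=20986 "
    "dbname=defaultdb user=avnadmin password=secret sslmode=require' "
    "PUBLICATION device_pub"
)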

Easily test PostgreSQL 10 with Aiven today

With Aiven, you can create a full copy of the data within your existing PostgreSQL service as a new, separate PostgreSQL 10 service.

By forking, you can keep your existing PostgreSQL services as-is while testing the latest version for compatibility with your applications.

And don't worry, it won't negatively affect the performance of your source service; it just provides an easy and efficient way to test PostgreSQL 10. So, let's get started.

Team Aiven

P.S. For a full list of features please see the full PostgreSQL 10 release notes.


Stanford SSI uses Aiven InfluxDB in high-altitude research

        Aiven goes to near space with InfluxDB

        A new day in space development 

        Historically, space development has remained the sole purview of government agencies due to its astronomical cost: that has changed. 

        Today, new technologies are reducing those costs, allowing smaller groups with big ambitions to take small steps and giant leaps in pushing space development forward. 

        One such group is Stanford SSI, a project-based student group that covers everything from high-altitude balloon platforms, to cube satellites, all the way through to rockets. 

        In fact, they recently broke the world record for the longest duration flight by a latex balloon with the launch of SSI-52...a pretty big achievement for a group only established in 2013. 

        What does Aiven have to do with balloons? 

        These aren’t your everyday balloons, but sophisticated platforms carrying complex scientific payloads into near-space and over tremendous distances: the stakes are a little higher. 

        Stanford SSI uses its ValBal balloon platform to reach altitudes as high as 120,000 feet while testing cutting-edge electronics and mechanics. 

        Their tests produce a massive amount of data, data that needs to be sent back. Kai Marshland, their Operations Lead, explains it best, 

        “...we want to store a highly variable set of data, analyze it over time, and manipulate it with minimal latency...Aerospace demands the utmost in quality, and that’s exactly what Aiven provides.” 

        In short, Aiven’s capabilities provide an ideal fit. For our part, we think it’s pretty cool to test our technology in demanding, high-altitude research flights. 

        Aiven tests InfluxDB at the edge of space 

For their latest launch, SSI-59, on September 30, 2017, Stanford SSI used our InfluxDB service for their Database as a Service needs.

        The goal for this launch? SSI-59 will use the same ValBal platform to test lighter and more efficient mechanics and avionics that should increase the platform’s endurance. 

        But most importantly, it’s the communications system that they are most excited to test, which Kai describes as “Revolutionary.” Here’s why, 

        "Having high-bandwidth communications means that we no longer have to worry about recovering the payload over the vast areas ValBal can fly over." 

        For instance, it will allow them to fly a radar glaciology payload over Greenland to measure the thickness of its ice sheets and transmit the data it collects almost instantaneously back to where they launched the payload from...no more worrying about recovering the platform.

        We operate in the cloud, but reach for the stars 

        Just one of many, SSI-59 is part of a long-term aim to redefine the high-altitude balloon research world, one that will provide a better understanding of the planet we live on. 

This is the first time that Stanford SSI is using our technology for their data needs, and it will provide an excellent use case for demonstrating our technology's capabilities.

But, the idea that our technology can be applied to space development is the most thrilling of all. After all, Aiven may be a database cloud service provider, but we don't mind flying above the clouds.


        Kafka Users and Access Control

We're happy to announce user- and topic-level access controls for the Aiven Kafka service. You can now create multiple users, each with separate access credentials, and control produce and consume privileges on a per-user and per-topic basis.

        Both users and access control lists can be managed on the Aiven Console under the Users tab on the service details page.

        Managing users

All users, along with their user-specific access certificates and keys, are listed on the Users tab. The password is used with the Kafka REST service.

        You can add users with the Add service user... button or remove existing users with Remove...

The Reset password... button on the right resets the Kafka REST password and also revokes and recreates the access key and certificate for that user.

        Managing Access Control Lists

        Access Control Lists manage user privileges to consume from or produce to a topic. 

Users can either be explicit users or user masks with the wildcard characters * and ?. The asterisk matches any string of characters; the question mark matches any single character in its place.

        Similarly, topics can be specified as explicit topics as well as wildcard matches.
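These are the familiar shell-style globbing rules. As a quick illustration of the matching semantics, using Python's fnmatch as an analogy (not Aiven's actual implementation):

from fnmatch import fnmatchcase

print(fnmatchcase("metrics-writer-1", "metrics-*"))  # True: * matches any string
print(fnmatchcase("node-7", "node-?"))               # True: ? matches one character
print(fnmatchcase("node-42", "node-?"))              # False: two characters remain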

Grants can be Produce, Consume, or Full Access for both.

By default, all configured users are allowed to both produce and consume on all topics. You can delete ACL entries on a row-by-row basis.

        Give Aiven services a whirl

        Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

        Go to https://aiven.io/ to get started!

        Team Aiven


        Kafka Connect Preview

We're delighted to announce the public preview of Kafka Connect support for Aiven Kafka. During the preview, Kafka Connect is available at no extra cost as part of all Aiven Kafka Business and Premium plans. We're launching with support for the Elasticsearch connector, and will soon follow with S3 and other connectors.

        Kafka Connect

Kafka Connect is a framework for linking Kafka with other services. It makes it simple to define and configure connectors that reliably and scalably stream data between different systems. Kafka Connect provides a standard API for integration and automatically handles offset management and workload distribution.

        You can define and configure individual connectors via the Kafka Connect REST interface.

        Case example - IoT Device Shadow

A customer of ours is using Aiven Kafka for capturing telemetry from a fleet of IoT devices. For that purpose, Aiven Kafka has proven to be a scalable and flexible pipeline for capturing and distributing traffic for processing.

During the past month, we've worked together to support a new use case: maintaining a "device shadow", or latest state update, in Elasticsearch. This copy allows developers to query and access device states regardless of whether the devices are currently online and connected.

We built this new pipeline with Kafka Connect and the Elasticsearch connector. You can follow these steps to set up a similar pipeline.

        Getting started: Launching Kafka and Elasticsearch services

Create an Aiven Kafka service and the topics for the incoming traffic. In this example, we'll be using the Business-4 plan and 16 partitions to accommodate the client load.

        First we'll launch a Kafka cluster from the Aiven web console. This cluster will receive the state updates from the IoT devices. A fairly low-spec cluster will work for this use case and we will launch it in one of the AWS regions:

        Next, we'll create a Kafka topic for our data under the Topics tab.

We chose 16 partitions in this example, but you should select a number that matches your workload. A larger number allows for higher throughput, but on the other hand increases resource usage on both the cluster and the consumer side. Contact us if you're unsure; we can help you find a suitable plan.

        We will also need an Elasticsearch cluster for the device shadow data. We'll choose a three-node cluster with 4 GB memory in each node. Make note of the Elasticsearch Service URL, which we'll use with the Kafka Connector configuration in the next steps.

We'll need to enable Kafka Connect by clicking the "Enable" button next to it in the service view. We also make a note of the Kafka Connect access URL, which we will need in the following steps.

        Setting up the pipeline with scripts

We'll be using a couple of Python code snippets to configure our data pipeline. We've downloaded the project CA certificate and the Kafka access certificate and key as ca.pem, service.cert and service.key to a local directory from the Kafka service view.

        You can refer to startup guides for both Aiven Kafka and Aiven Elasticsearch for details on setting up the environment.

        Here's our first snippet named query_connector_plugins.py for finding out the available connector plugins:

import requests
AIVEN_KAFKA_CONNECT_URL = "https://avnadmin:m9jyevsaehezqs36@gadget-kafka.htn-aiven-demo.aivencloud.com:22142"
# List the connector plugins available on this Kafka Connect service
response = requests.get("{}/connector-plugins".format(AIVEN_KAFKA_CONNECT_URL))
print(response.text)
Running the script lists the available connector plugins:
        $ python3 query_connector_plugins.py

        To get started with the pipeline configuration, we'll pre-create an Elasticsearch index with a schema to meet our needs with script name create_elastic_index.py:

import json
import requests
AIVEN_ELASTICSEARCH_URL = "https://avnadmin:in9zvfjaio32m0qy@gadget-elastic.htn-aiven-demo.aivencloud.com:24185"
mapping = {
    "settings": {
        "number_of_shards": 16
    },
    "mappings": {
        "kafka-connect-gadget-telemetry": {
            "properties": {
                "location": {"type": "string"},
                "temperature": {"type": "integer"},
                "timestamp": {"type": "date"}
            }
        }
    }
}
# Create the index (named after the topic; the request wiring is
# reconstructed from the truncated original) with the mapping above
response = requests.put(
    "{}/gadget-telemetry".format(AIVEN_ELASTICSEARCH_URL),
    headers={"content-type": "application/json"},
    data=json.dumps(mapping),
)
print(response.text)
        Next, we'll run the script and the Elasticsearch index is created:

$ python3 create_elastic_index.py
{
    "acknowledged" : true,
    "shards_acknowledged" : true
}
Here's how we create and configure the actual Elasticsearch connector, linking our telemetry topic and Elasticsearch, with a script named create_es_connector.py:

import requests
import json
AIVEN_KAFKA_CONNECT_URL = "https://avnadmin:m9jyevsaehezqs36@gadget-kafka.htn-aiven-demo.aivencloud.com:22142"
AIVEN_ELASTICSEARCH_URL = "https://avnadmin:in9zvfjaio32m0qy@gadget-elastic.htn-aiven-demo.aivencloud.com:24185"
connector_create_request = {
    "name": "gadget-es-sink",
    "config": {
        "connection.url": AIVEN_ELASTICSEARCH_URL,
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "tasks.max": 3,
        "topics": "gadget-telemetry",
        "type.name": "kafka-connect-gadget-telemetry"  # This points to the created ES mapping
    }
}
# Post the connector definition to the Kafka Connect REST API
response = requests.post(
    "{}/connectors".format(AIVEN_KAFKA_CONNECT_URL),
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector_create_request),
)
print(response.text)
We then create and enable the connector by running the script:

        $ python3 create_es_connector.py
        {"name":"gadget-es-sink","config":{"topics":"gadget-telemetry", "type.name":"kafka-connect-gadget-telemetry", "tasks.max":"3", "connector.class":"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "connection.url":"https://avnadmin:in9zvfjaio32m0qy@gadget-elastic.htn-aiven-demo.aivencloud.com:24185", "name":"gadget-es-sink"}, "tasks":[]}

        Next, we're going to send some simulated telemetry data to test everything out:

from kafka import KafkaProducer
import datetime
import json
import random
AIVEN_KAFKA_URL = "gadget-kafka.htn-aiven-demo.aivencloud.com:22144"
LOCATIONS = ["arizona", "california", "nevada", "utah"]
# TLS settings reconstructed from the certificate files downloaded above
producer = KafkaProducer(
    bootstrap_servers=AIVEN_KAFKA_URL,
    security_protocol="SSL",
    ssl_cafile="ca.pem",
    ssl_certfile="service.cert",
    ssl_keyfile="service.key",
)
for i in range(10):
    device_name = "gadget_{}".format(i)
    telemetry = {
        "location": random.choice(LOCATIONS),
        "temperature": random.randint(40, 120),
        "timestamp": datetime.datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    key = device_name.encode("utf-8")
    payload = json.dumps(telemetry).encode("utf-8")
    producer.send("gadget-telemetry", key=key, value=payload)
# Wait for all messages to be sent
producer.flush()
print("Done, sent {} messages".format(i + 1))
        $ python3 submit_telemetry.py
        Done, sent 10 messages

        Exploring the data with Kibana

All of our Elasticsearch plans include integrated Kibana, which can be a handy tool for exploring and visualizing the data. We can easily verify that our telemetry is flowing all the way to our Elasticsearch instance.

        Clicking the Kibana link under the Elasticsearch service information page opens a view to Kibana. We are greeted with a configuration page where we enter the name of our Elasticsearch index created in one of the earlier steps:

The default Discover view lists our sample data entries. Since we're using keyed messages and each entry is replaced by the latest entry for its key, the timeline view only shows the timestamp of the last update.

        Accessing data in Elasticsearch

        The real value of the new pipeline is realized with the ability to query for device information from Elasticsearch. In the Elasticsearch example query script (query_elasticsearch.py) below, we'll query for all devices that last reported from Arizona:

import requests
import json
AIVEN_ELASTICSEARCH_URL = "https://avnadmin:in9zvfjaio32m0qy@gadget-elastic.htn-aiven-demo.aivencloud.com:24185"
# Query body reconstructed: match devices whose last report came from Arizona
query = {"query": {"match": {"location": "arizona"}}}
response = requests.get("{}/gadget-telemetry/_search?pretty=true".format(AIVEN_ELASTICSEARCH_URL),
                        headers={"content-type": "application/json"}, data=json.dumps(query))
print(response.text)

Running the script shows the list of active gadgets in the target region:

$ python3 query_elasticsearch.py
{
    "took" : 7,
    "timed_out" : false,
    "_shards" : {
        "total" : 4,
        "successful" : 4,
        "failed" : 0
    },
    "hits" : {
        "total" : 2,
        "max_score" : 1.3862944,
        "hits" : [
            {
                "_index" : "gadget-telemetry",
                "_type" : "kafka-connect-gadget-telemetry",
                "_id" : "gadget_2",
                "_score" : 1.3862944,
                "_source" : {
                    "temperature" : 114,
                    "location" : "arizona",
                    "timestamp" : "2017-12-06T13:55:01Z"
                }
            },
            {
                "_index" : "gadget-telemetry",
                "_type" : "kafka-connect-gadget-telemetry",
                "_id" : "gadget_5",
                "_score" : 1.2039728,
                "_source" : {
                    "temperature" : 45,
                    "location" : "arizona",
                    "timestamp" : "2017-12-06T13:55:01Z"
                }
            }
        ]
    }
}

        The above example is easily extended to query data by a certain temperature threshold, location or time of the last update. Or, if we want to check the state of a single device, we now have the latest state available by its ID.
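For instance, a temperature-threshold query could look like the following sketch, reusing the connection details from query_elasticsearch.py above:

import json
import requests
AIVEN_ELASTICSEARCH_URL = "https://avnadmin:in9zvfjaio32m0qy@gadget-elastic.htn-aiven-demo.aivencloud.com:24185"
# All devices whose latest report shows a temperature of 100 or more
query = {"query": {"range": {"temperature": {"gte": 100}}}}
response = requests.get("{}/gadget-telemetry/_search?pretty=true".format(AIVEN_ELASTICSEARCH_URL),
                        headers={"content-type": "application/json"}, data=json.dumps(query))
print(response.text)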


In this example, we built a simple telemetry pipeline with Kafka, Kafka Connect and Elasticsearch. We used the Elasticsearch connector, which is the first connector we support with Aiven Kafka. We'll follow up with the S3 connector shortly, with others to come.

Get in touch if we can help you with your business requirements!

        Give Aiven services a whirl

        Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

        Go to https://aiven.io/ to get started!

        Team Aiven


Larger Aiven PostgreSQL and Aiven Kafka plans in Azure

        New larger Aiven PostgreSQL plans in Azure

We're happy to announce the immediate availability of the larger 120 GB and 160 GB Aiven PostgreSQL plan tiers in Azure. These plans come with 1.2 TB and 1.6 TB of storage capacity, respectively, and are available in the following regions: Australia East, Canada Central, Canada East, East Asia, East US 2, Japan West, North Central US, North Europe, UK South, UK West and West US.

Plan           Dedicated VMs   CPUs per VM †   Memory per VM †   Storage per VM †
Startup-120    1               16              120 GB            1200 GB
Startup-160    1               32              160 GB            1600 GB
Business-120   2               16              120 GB            1200 GB
Business-160   2               32              160 GB            1600 GB
Premium-120    3               16              120 GB            1200 GB
Premium-160    3               32              160 GB            1600 GB

† Actual amounts may vary slightly between different cloud providers.

        New larger Aiven Kafka plans in Azure

We've added larger 32 GB and 64 GB plan tiers to our Aiven Kafka offerings in Azure, with increased total storage, higher core counts and more memory. These plans are immediately available in all Azure regions.
Plan          Cluster nodes   CPU per VM †   Memory per VM †   Total Storage †   Data Retention
Business-32   3               8              32 GB             4200 GB           12 weeks
Business-64   3               16             64 GB             6000 GB           18 weeks
Premium-32    5               8              32 GB             8000 GB           20 weeks
Premium-64    5               16             64 GB             10000 GB          30 weeks

† Actual amounts may vary slightly between different cloud providers.

        Give Aiven services a whirl

        Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

        Go to https://aiven.io/ to get started!

        Team Aiven


        Aiven Kafka 0.10.2 now available

We're making Kafka 0.10.2 available to all our customers today; for existing customers, the updates will be rolled out starting next week.

        Improved stability and enhancements

        Apache Kafka version 0.10.2 brings many enhancements and bug fixes to Kafka.

Kafka 0.10.2 is also groundbreaking for those who have been using the official Java client. Before the 0.10.2 release, newer client versions could not talk to older cluster versions. The new Java client that ships with 0.10.2 can finally talk to clusters running 0.10.0 and 0.10.1 as well as the latest 0.10.2.

The new release also comes with lots of other useful bug fixes. Previously, for example, if you deleted a topic, recreated it and tried consuming from it, Kafka would "remember" the old commit offsets. That has now been fixed:
        • [KAFKA-2000] - Delete consumer offsets from kafka once the topic is deleted
Another thing people have been asking us about: since Aiven Kafka does not give out direct access to Zookeeper, they have not been able to use the Kafka Streams functionality. That situation has now improved on the Apache Kafka side with the removal of the direct Zookeeper dependency:
        • [KAFKA-4060] - Remove ZkClient dependency in Kafka Streams
If you're interested in using the Kafka Streams functionality with Aiven Kafka, please contact us and we'll get you set up with our beta program for enabling it.

        Try the new Aiven Kafka 0.10.2 for free

        Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

        Go to https://aiven.io/ to get started!

        Team Aiven


        Benchmarking Kafka Performance Part 1: Write Throughput

We have offered a fully managed Kafka service for some time now, and we are quite often asked just how many messages you can pipe through a given service plan tier on a selected cloud. So here's a benchmark we conducted to give you a rough idea of just how well Apache Kafka performs in the public cloud.

        This is the first post in a series that explores Kafka performance on multiple public cloud providers.

        What is Kafka?

        Apache Kafka is a high-performance open-source stream processing platform for collecting and processing large numbers of messages in real-time. It enables you to accept streaming data such as website click streams, events, transactions or other telemetry in real-time and at scale, and serve it downstream to stream processing applications.

Kafka is distributed by design, for both scalability and fault tolerance. Adding nodes to tackle growing loads is fairly straightforward, and automatic replication of the data over more than one node maintains availability when nodes fail.

        The basic concepts in Kafka are producers and consumers.

A producer is an application that generates data solely to provide it to some other application.

An example of a producer application could be a web server that produces "page hits" recording when a web page was accessed, from which IP address, which page it was, and how long the server took to render it.

        On the consumer side there could be multiple systems interested in the same page hit data stream:
        • A time series database that is used to plot the total number of page hits over time
        • A reporting application collecting summaries of the pages accessed and sending them to a data warehouse database system
        • A DDoS detection system trying to find abnormal access patterns
        • A rate limiting monitor counting the number of hits from a specific source address
        • And so on...
Kafka suits these kinds of applications very well: it provides a method of getting the data out of the hands of the producing application quickly and safely. Once the producer has written a message to Kafka, it can be sure that its part of the job is done. The producer application does not need to know how the data is used and by which applications; it just stores it in Kafka and moves on.

On the consumer side, a powerful feature of Kafka is that it allows multiple consumers to read the same messages. In our web page hit example above, each consumer application gets its own read cursor to the data and can process the messages at its own pace, all without causing any performance issues or delays for the producer application.
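As a small sketch of that model (the broker address and topic name here are hypothetical), consumers started with different group_id values each receive every message, with independent offsets:

from kafka import KafkaConsumer

# Each distinct group_id gets its own read cursor into the topic; run this
# script twice with different group_id values and both copies receive all
# of the messages
consumer = KafkaConsumer(
    "page-hits",
    bootstrap_servers="kafka.example.com:9092",
    group_id="timeseries-loader",
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.offset, message.key, message.value)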

        Here's what it roughly looks like:

        The Zookeeper cluster is a critical piece in keeping Kafka healthy and up and running. It maintains Kafka's metadata and most importantly, a consensus between the Kafka nodes of who is doing what.

        Aiven Kafka as a Service

Aiven Kafka is a fully managed service based on the Apache Kafka technology. Our aim is to make it as easy as possible to use Kafka clusters with the least amount of operational effort possible. We handle the Kafka and Zookeeper setup and operations for you, so you can focus on value-adding application logic instead of infrastructure maintenance. Aiven Kafka services can be launched in minutes, and we'll ensure they remain operational, well-performing, up-to-date and secure at all times. Nodes are automatically distributed evenly across the available availability zones in order to minimize the impact of losing any one of the zones.

        Aiven Kafka is available in Amazon Web Services, Microsoft Azure, Google Cloud Platform, UpCloud and DigitalOcean with a total coverage of 53 cloud regions. In this performance comparison we ran the benchmark on all of these except DigitalOcean, where our Kafka offering is limited by the available plans.

        Each Kafka service used in these tests is a regular Aiven-provided service with no alterations to its default settings.

        Benchmark Setup

In this first Kafka benchmark post, we set out to estimate maximum write throughput rates for various Aiven Kafka plan tiers in different clouds. We wanted to use a typical customer message size and standard tools for producing load. We also generated the load from separate systems over the network to mimic actual customer workloads as closely as possible.

        High-level view of the test setup, a single Aiven Kafka service with five nodes, distributed evenly over the availability zones:

We picked a message size of 512 bytes for our tests. Based on our experience, one of the most typical payloads is a JSON-encoded message ranging between 100 bytes and 10 kilobytes in size.

In these tests, we use a single topic with a partition count matching the node count of each Aiven plan tier. For more complex topic/partition setups, Aiven actively balances the placement of partitions, trying to achieve a "perfect" distribution; with just a single partition per node, this is rather simple here. We set the replication factor to one (1) for this test, meaning each message resides on only a single Kafka node.

        Apache Kafka version used was

For load generation, we chose librdkafka and the rdkafka_performance tool from its provided examples. We are using default settings for the most part, but bumped the single-request timeout up to 60 seconds, as we expected the Kafka brokers to be under extreme load and request processing to take longer than under a normal, healthy load level. Also, since Aiven Kafka services are offered only over encrypted TLS connections, we included the required configuration, namely the certificates and keys.

librdkafka defaults to a maximum batch size of 10000 messages or a maximum request size of one million bytes per request, whichever is met first. In these tests, we did not employ compression.

        producer.props configuration:
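(The original file listing hasn't been preserved in this post; the following is a representative sketch based on the settings described above.)

# Representative sketch only: TLS certificates from the Aiven service page
# plus a raised request timeout, reconstructed from the description above
security.protocol=ssl
ssl.ca.location=ca.pem
ssl.certificate.location=service.cert
ssl.key.location=service.key
request.timeout.ms=60000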

We ran several instances of rdkafka_performance on multiple VMs hosted on a different cloud provider from the one being tested, so all of the test load came from the internet through the nodes' public network interfaces.

        We kept increasing the number of instances until we could find the saturation point and the maximum message rates for each plan.

        Each rdkafka_performance instance was started on the command line with:

          rdkafka_performance -P -s 512 -t target-topic -X file=producer.props

        Benchmark Results

The first set of tests was run on an Aiven Kafka Business-4 plan, which is a three-node cluster and a common starting point for many of our customers. Each node in this plan has 4 gigabytes of RAM, a single CPU core and 200 gigabytes of disk, providing a total of 600 gigabytes of raw Kafka storage capacity in the cluster.

        Write performance (3 nodes @ 4 GB RAM, 1 CPU, 200 GB disk each):

        On UpCloud, we hit 200,000 messages per second. Azure and Google plans saturated at 120,000 and 130,000 messages per second and the Amazon deployment reached 50,000 messages per second.

The performance is pretty respectable. The performance on Amazon is a bit behind the others because of the node types available, and we will be looking at ways to optimize that in the future. As you will see in the next graph for the test with the bigger plan, the AWS performance is already more in line with the other providers.

        Next, we tested three node clusters but with larger underlying instances using the Business-8 plan. This plan has nodes with 8 gigabytes of RAM, two CPU cores and 400 gigabytes of disk per node, i.e. all the primary resources are doubled when compared to the Business-4 plan. This test indicates how well Kafka scales vertically with increased resources.

        Write performance (3 nodes @ 8 GB RAM, 2 CPU, 400 GB disk each):

        We see a nice increase in performance, with 320,000 messages per second on UpCloud, 205,000 on Azure, 170,000 on Google and 160,000 messages per second on AWS.

        In the last test, we wanted to verify how well Kafka scales horizontally. With this test, we went from the Business plan tier to the Premium tier, which bumps the node count from three to five, while keeping the node specs otherwise identical. Also the test setup was updated to utilize a partition count of five (vs. three) for this test.

        Write performance (5 nodes @ 8 GB RAM, 2 CPU, 400 GB disk each):

The results here are solid for Kafka: a two-thirds increase in the number of nodes resulted in a matching two-thirds increase in write performance. Awesome!

        Aiven Kafka Premium-8 on UpCloud handled 535,000 messages per second, Azure 400,000, Google 330,000 and Amazon 280,000 messages / second.

        Benchmark Conclusions

        Apache Kafka performs just as well as we expected and scales nicely with added resources and increased cluster size. We welcome you to benchmark your own workloads with Aiven and to share your results.

We utilize Kafka as a message broker within Aiven, and use it as a medium for piping all of our telemetry metrics and logs. We are happy with our technical choice, and can recommend Apache Kafka for handling all kinds of streaming data.

        Find out more about Aiven Kafka at https://aiven.io/kafka.


        Bigger Aiven Kafka plans and introducing Amazon VPC peering

        New larger Aiven Kafka plans available

We've added larger 32 GB and 64 GB plan tiers to our Kafka offerings, with increased core counts and more memory. These offerings are available in the Business and Premium Kafka plans in all Google Cloud Platform, Amazon Web Services and UpCloud regions.

Plan          Cluster nodes   CPU   Memory   Total Storage   Data Retention
Startup-2     3               1     2 GB     90 GB           1 week
Business-4    3               1     4 GB     600 GB          4 weeks
Business-8    3               2     8 GB     1200 GB         6 weeks
Business-16   3               4     16 GB    2400 GB         8 weeks
Business-32   3               8     32 GB    4200 GB         12 weeks
Business-64   3               16    64 GB    6000 GB         18 weeks
Premium-4     5               1     4 GB     1000 GB         6 weeks
Premium-8     5               2     8 GB     2000 GB         10 weeks
Premium-16    5               4     16 GB    4000 GB         14 weeks
Premium-32    5               8     32 GB    8000 GB         20 weeks
Premium-64    5               16    64 GB    10000 GB        30 weeks

        If you need further horizontal scale, please contact us for custom Kafka plans.

        Schema Registry now available in Kafka plans

We've added support for Confluent's Kafka Schema Registry, which allows you to store your Kafka message schemas in a centralized registry. The feature is available in our Business and Premium level Aiven Kafka plans, and it also hooks up automatically with our existing Kafka REST service.

You can easily enable these features from the web console.

Note that if you're using these with the Java client, you will want to apply this small patch to your Schema Registry client; it has been suggested for inclusion in the main Schema Registry project.
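As a quick, hedged example of what talking to the registry looks like (the service URL and subject name are placeholders), you can register a schema over the registry's REST API with a few lines of Python:

import json
import requests

SCHEMA_REGISTRY_URL = "https://avnadmin:secret@my-kafka.example.aivencloud.com:13044"  # placeholder
schema = {
    "type": "record",
    "name": "telemetry",
    "fields": [{"name": "temperature", "type": "int"}],
}
# Register the schema under a subject; the registry returns its schema id
response = requests.post(
    "{}/subjects/gadget-telemetry-value/versions".format(SCHEMA_REGISTRY_URL),
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    data=json.dumps({"schema": json.dumps(schema)}),
)
print(response.json())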

        AWS VPC peering available

        We've launched support for Amazon Web Services Virtual Private Cloud (VPC) peering. With VPC peering, you can link Aiven services directly into your own VPC networks. You can access the services with private IP addresses as if they were in the same network.

        VPC peering support is now available for all service types and Startup, Business and Premium plans in all Amazon Web Services regions.

        Please contact us if you want to try out our VPC support for yourself.

        Give Aiven services a whirl

        Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

        Go to https://aiven.io/ to get started!

        Team Aiven


        Aiven Elasticsearch 5 available with easy upgrade from 2.x

        We're making Elasticsearch 5 available to all our customers today, including a simple single-click upgrade path for our existing Elasticsearch 2 users.

        Improved performance and usability

Elasticsearch 5 (version 5.2.2 at the moment) and its accompanying new Kibana version (also 5.2.2) bring many useful new features, ranging from a developer console in Kibana that lets you test your queries against the REST API, to much-improved indexing performance in Elasticsearch.

However, perhaps the most exciting features are the performance improvements, which range from around 25% to 80% faster performance, depending on the usage scenario.

You can find out more about the new improvements from the Elasticsearch blog, which has a separate story about each of 5.0, 5.1 and 5.2. The full release notes are also available here: 5.0, 5.1, 5.2.

        Easy upgrade from Elasticsearch 2.x

        Upgrading your existing Aiven Elasticsearch 2 service is easy and will only take a moment.

        First, open your service's overview page in our web console:

        We can see that the currently running version is 2.4.2 and there is an upgrade button beside it. Let's press the button!

Clicking the Upgrade button in the confirmation dialog will immediately start the upgrade. Unlike most Aiven software upgrades, this one is performed in place on the running service nodes, i.e. it is not the usual rolling-forward upgrade that we provide. This allowed us to squeeze the upgrade downtime down to the absolute minimum required.

        NOTE: After clicking the Upgrade button it is no longer possible to downgrade back to version 2.x with the chosen service. Please test your application carefully with Elasticsearch 5 before committing to the upgrade.

        Confirming the upgrade pops up a banner that stays on until the upgrade is complete. Typically the upgrade takes from seconds to a couple of minutes, depending on the number of indexes in the Elasticsearch database.

        Once the upgrade is complete, the yellow banner disappears and the new version number is updated in the service information:

        And we are all done with just a couple of clicks!

To upgrade, you may either create a new service or upgrade your current Elasticsearch cluster to the latest version.

        Try the new Aiven Elasticsearch 5 for free

        Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

        Go to https://aiven.io/ to get started!

        Team Aiven


        Handling the AWS US-EAST-1 outage

        The ongoing outage in AWS US-EAST-1 (N. Virginia) affected a number of Aiven users, but the combination of our automation and manual actions taken by the operations team has resolved the issues for all Aiven users.

Our 24/7 monitoring alerted us to the outage, initially thought to be limited to S3 but later revealed to affect more AWS resources, including EBS volumes and the launching of new EC2 instances. A number of Aiven services running on affected EBS volumes in affected availability zones were migrated online to new instances in AWS US-EAST-2 (Ohio), ensuring service availability while the N. Virginia region experiences issues.

There were also a number of Aiven services running on EC2 instances that continued to operate normally, but where access to S3 for backups started timing out and failing. The affected services have been updated to store backups in the Ohio region until the N. Virginia region recovers.

This ensures that all Aiven services have proper backups during the outage. It won't be possible to spin up new instances in N. Virginia or perform a PostgreSQL point-in-time recovery from a backup location stored only in N. Virginia while the outage is ongoing, but once AWS has resolved the issue, Aiven services will be restored automatically.

        Update: Please see our help site for more details about the outage.

        Team Aiven


        Lower PostgreSQL pricing in AWS

We're happy to announce pricing changes for Aiven PostgreSQL plans in Amazon Web Services.

The prices of Aiven PostgreSQL in AWS have been updated to take the latest AWS price cuts into account, lowering the price of various services by 10 to 30%. The highly available Business plans have seen the largest cuts in pricing. The new prices are effective February 1st for all current and new customers.

        Please check out our new pricing for more detail and sign up for a free trial!

        Team Aiven