2018-01-22

Introducing Aiven Elasticsearch 6 and new, bigger plans

We are excited to announce that Elasticsearch 6 is now available at Aiven!

Not only that, but we've drastically increased the disk sizes of our plans and added new plans with even more nodes.

But, before we get into the larger plans, let's look at the benefits that Elasticsearch 6 will bring to users.


Elasticsearch 6


Faster recovery times


In Elasticsearch 6.0, the concept of sequence IDs has been introduced, allowing for operations-based recovery.

Previously, when a node restarted, its shards had to be compared against the primary and all of the differing segments transferred to the restarted node again.

Now, Elasticsearch can simply replay, from the log, the operations that occurred while the node was down, which results in much improved recovery times.

Much smaller index sizes


1. Sparse fields are now stored much more efficiently

Previously, if only some of your documents had a given field defined, every document stored in the Elasticsearch index paid a price for that field.

Elasticsearch 6's new Lucene index version has optimized this away, resulting in exceptionally large space savings for sparse indexes.

2. Removal of the _all field

Previously, all the values in an Elasticsearch document were duplicated into a special _all field within the index entry for easier searching.

Elasticsearch 6.0 removes that duplication, which results in large index size decreases, further improving performance.
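
Cross-field searches still work without _all: in 6.x, a query_string query targets all eligible fields by default. A minimal sketch with the official Python client (the endpoint and index name are illustrative):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("https://demo-es.example.com:9200")  # hypothetical endpoint

    # with no _all field, query_string searches across all eligible fields by default
    res = es.search(index="logs", body={
        "query": {"query_string": {"query": "connection timeout"}}
    })
    print(res["hits"]["total"])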

3. Faster queries with index sorting

If all of your queries use a certain sort order, you can now define it at index creation time so that the actual sorting is done as documents are indexed, resulting in large query-time savings at runtime.
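
The sort order is declared as an index setting when the index is created; a minimal sketch with the official Python client (the index, field and endpoint are illustrative):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("https://demo-es.example.com:9200")  # hypothetical endpoint

    # segments are stored pre-sorted by timestamp, so queries sorting the same
    # way can terminate early instead of sorting at query time
    es.indices.create(index="events", body={
        "settings": {"index.sort.field": "timestamp", "index.sort.order": "desc"},
        "mappings": {"doc": {"properties": {"timestamp": {"type": "date"}}}}
    })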

How to upgrade to Elasticsearch 6.0


Existing users can now upgrade to this version at will by simply clicking the Upgrade button on the Aiven Elasticsearch service's dashboard page.

Please note: if you have unapplied maintenance updates, you will have to apply them before you can upgrade to Elasticsearch 6.

Elasticsearch disk storage size improvements


Based on your feedback, we've considerably increased the plan sizes. Additionally, we've introduced 7-node Premium-7x-* plans as a regular offering.

We've also renamed our Elasticsearch Premium plans to be in line with our plan naming for Kafka: the previous Premium plans have now been renamed to Premium-5x-*.


Plan             Previous storage    Improved storage
Startup-4        32 GB               80 GB
Startup-8        64 GB               175 GB
Startup-16       128 GB              300 GB
Startup-32       256 GB              700 GB
Business-4       96 GB               240 GB ‡
Business-8       192 GB              525 GB ‡
Business-16      384 GB              1050 GB ‡
Business-32      768 GB              2100 GB ‡
Premium-5x-4
Premium-5x-8     160 GB              875 GB ‡
Premium-5x-16    640 GB              1750 GB ‡
Premium-5x-32    1280 GB             3500 GB ‡
‡ Total storage size is the combined storage size available in the plan. Usable disk space depends on the replication factor being used.


New plans with more nodes:


Plan             Storage
Premium-7x-8     1125 GB ‡
Premium-7x-16    2450 GB ‡
Premium-7x-32    4900 GB ‡
‡ Total storage size is the combined storage size available in the plan. Usable disk space depends on the replication factor being used.

If you need plans even bigger than this, we can create larger custom plans on request. Feel free to contact sales if you'd like to buy a custom plan.

Start or upgrade to Elasticsearch 6.0 today 


Aiven clients can now experience the benefits of Elasticsearch 6.0 with larger plan offerings. If you are a current client, simply upgrade your plan by clicking the button within your console.

If you aren't, sign up today and test us out with $10 of free credits. Trying Aiven is always free and comes with no commitments. 





2018-01-04

Aiven statement on Meltdown and Spectre vulnerabilities

Aiven is aware of recently disclosed research regarding side-channel analysis of speculative execution on modern computer processors. This analysis has revealed several different vulnerabilities which have been named Meltdown and Spectre by the researchers and are tracked as CVE-2017-5715, CVE-2017-5753, and CVE-2017-5754.

Aiven will perform the necessary actions to protect your data and services from these vulnerabilities. These actions are implemented as automatic or scheduled maintenance tasks, require no user intervention and result in no impact on availability of the services.

Full details and the latest updates are recorded here: http://help.aiven.io/incident-reports/aiven-statement-on-meltdown-and-spectre-vulnerabilities

2017-12-21

PostgreSQL Performance in AWS, GCP, Azure, DO and UpCloud

I recently gave a talk about PostgreSQL performance in different clouds and under different settings at the pgconf.eu conference in Warsaw.

The talk compared the results of three benchmark tests for the just-released PostgreSQL 10, running in five cloud infrastructures in two regions each:

  • Amazon Web Services (eu-west-1 and eu-central-1)
  • Google Cloud Platform (europe-west2 and europe-west3)
  • Microsoft Azure (UK South and Germany Central)
  • DigitalOcean (LON1 and FRA1)
  • UpCloud (uk-lon1 and de-fra1)

We selected these specific regions because we were giving the talk at a European conference and all of the vendors operate data centers in England and Germany.

We also chose to run all benchmarks in two different regions to see whether we would find noticeable differences between the sites operated by these vendors.

The first benchmark compared the following two instance sizes on all clouds, utilizing the network-backed, freely scalable disks that are available in each cloud:

  • 4 vCPU / 16GB RAM
  • 16 vCPU / 64GB RAM

The second compared network and local SSD performance in AWS and GCP cloud infrastructures with the same instance sizes as the first.

For this post, we will be taking an in-depth look at the two benchmark tests while covering some important considerations for identifying and correcting performance issues in PostgreSQL setups.

4 vCPU / 16 GB RAM / network disks benchmark


Although it is impossible to get VMs with the exact same specifications in every cloud, we provisioned similar setups in all clouds:

  • Amazon Web Services
    • m4.xlarge: 4 vCPU; 16 GB RAM
    • 350 GB gp2 EBS volume, no provisioned IOPS

  • Google Cloud Platform
    • n1-standard-4: 4 vCPU; 15 GB RAM
    • 350 GB PD-SSD

  • Microsoft Azure
    • Standard DS3 v2: 4 vCPU; 14 GB RAM
    • 350 GB P20

  • DigitalOcean
    • 16GB: 4 vCPU; 16 GB RAM
    • 350 GB block storage

  • UpCloud
    • 4CPUx16GB: 4 vCPU; 16 GB RAM
    • 350 GB MAXIOPS

PostgreSQL 10.0 was running on top of a Linux 4.13.15 kernel in each cloud. To replicate a typical production setup, the disks utilized LUKS full-disk encryption and WAL archiving was enabled to include the overhead of backups in the tests.

The test was performed using the venerable pgbench tool that comes bundled with PostgreSQL.

pgbench was run on another VM in the same cloud, running 16 clients for one hour against a data set roughly three times the size of available memory, meaning the data would not fit in cache and the impact of I/O would be visible.
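
For reference, a run along these lines can be driven with a small harness like the following (our own sketch, not the exact tooling used in the talk; the host name and scale factor are illustrative):

    import subprocess

    HOST = "pg-bench-target"  # hypothetical address of the PostgreSQL server VM

    # scale 3000 generates roughly 45 GB of data, about three times 16 GB of RAM
    subprocess.run(["pgbench", "-i", "-s", "3000", "-h", HOST, "postgres"], check=True)

    # 16 concurrent clients for one hour (3600 seconds)
    subprocess.run(["pgbench", "-c", "16", "-j", "4", "-T", "3600", "-h", HOST, "postgres"],
                   check=True)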


Google Cloud and UpCloud lead this benchmark, showing that there's quite a bit of variation between the different cloud providers.

The performance of the big three providers is the most stable across regions, while DigitalOcean and UpCloud show larger differences between regions: the hardware or density of those regions may vary.

16 vCPU / 64 GB RAM / network disks benchmark


The test run was repeated on the 16x64 setup:

  • Amazon Web Services
    • m4.4xlarge: 16 vCPU; 64 GB RAM
    • 1 TB gp2 EBS volume, no provisioned IOPS

  • Google Cloud Platform
    • n1-standard-16: 16 vCPU; 60 GB RAM
    • 1 TB PD-SSD

  • Microsoft Azure
    • Standard DS5 v2: 16 vCPU; 56 GB RAM
    • 1 TB P30

  • DigitalOcean
    • 64GB: 16 vCPU; 64 GB RAM
    • 1 TB block storage

  • UpCloud
    • Custom: 16 vCPU; 60 GB RAM
    • 1 TB MAXIOPS 

Again, we used PostgreSQL 10.0, Linux 4.13.15 and pgbench for the benchmark, this time with 64 parallel clients on the pgbench host and a dataset three times the host RAM.


As with the prior setup, Google Cloud and UpCloud lead the pack with DigitalOcean's inter-region performance demonstrating widely different characteristics.

However, Amazon and Azure have a smaller performance gap, and UpCloud's inter-region performance gap is almost nonexistent.

Additionally, the actual transactions-per-second count hasn't increased much from the test run with the smaller instance types. This is mostly because the data set size and the number of clients increased along with the instance size.

 

General and cloud provider performance considerations


Data access latency has always been one of the most important factors in database performance and often is the most important single factor.

In broad terms, this means that the more of your hot data you can fit in your fastest data storage system, the better.

Data is typically fetched from the following systems, listed below from fastest to slowest with each system typically being an order of magnitude slower than the preceding system:

CPU caches < RAM < Local disks < Network disks

All vendors use Intel Xeon CPUs of various types in their clouds and you can usually identify the type of CPU from the vendor's home page or simply by spawning a virtual machine and looking at /proc/cpuinfo.
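
For example, this quick Python snippet prints the CPU models reported by a freshly spawned Linux VM:

    # equivalent to: grep 'model name' /proc/cpuinfo | sort -u
    with open("/proc/cpuinfo") as f:
        models = {line.split(":", 1)[1].strip()
                  for line in f if line.startswith("model name")}
    print(models)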

Different providers use different hypervisors, which may account for some of the differences, but most of the variation between these vendors most likely lies in their I/O subsystems.

AWS 


AWS has a number of different volume types available in its Elastic Block Storage (EBS) ranging from large & cheap spinning rust to highly performant SSDs with provisioned IOPS.

The most common disk types for database workloads are probably the "general purpose" gp2 type, which was used in our tests, and the "provisioned IOPS" io1 type.

The general purpose disk works just fine for most use cases and provides guaranteed performance at a reasonable price, scaling with disk size from 100 IOPS to 10k IOPS.
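
As a concrete illustration of the gp2 sizing rule (3 IOPS per GB, with the floor and ceiling as documented at the time of writing):

    def gp2_iops(size_gb):
        # gp2 provides 3 IOPS per GB, floored at 100 and capped at 10,000
        return max(100, min(10000, 3 * size_gb))

    print(gp2_iops(350))   # 1050 IOPS for the 350 GB volumes in the small setup
    print(gp2_iops(1000))  # 3000 IOPS for the 1 TB volumes in the large setup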

While you can get up to 32k IOPS with the provisioned IOPS volume type, it comes with a hefty price.

AWS also provides a number of instance types with fixed local storage ("instance storage") which we'll cover later.

Google Cloud (GCP) 


GCP has two options for its scalable network storage: SSD ("pd-ssd") and spinning disk ("standard").

The performance of Google Cloud's disks scales up automatically as volume size increases, i.e. the bigger the volume, the more bandwidth and IOPS you get.

GCP doesn't have provisioned IOPS disks at the moment, but the number of IOPS available by default usually exceeds what's available on Elastic Block Storage.

GCP allows attaching up to 8 local SSDs to most instance types, which we'll also cover later.

Microsoft Azure 


Azure has a number of different options for network attached storage with default storage based on spinning disks that are affordable, but rather slow.

To utilize SSDs in Azure, you must switch from the standard "D" type instances to "DS" and select the layout of disks to use: disks come in fixed sizes ranging from 128 GB to 4 TB in size with each tier offering different IOPS and bandwidth.

We used the "P20" (512GB) and "P30" (1TB) disks in our benchmarks which currently appear to have the best performance per dollar spent of all Azure disks.

Azure also has a number of instance types with instance local storage, but we have not yet included them in our benchmarks.

DigitalOcean 


DigitalOcean keeps VM options simple: there are a number of VM sizes to select from, all of which come with a certain amount of vCPUs, RAM and local (SSD) disk. You can additionally attach a network disk ("block storage") of any size to the VM.

All DigitalOcean VMs have local disks, but they are too small to be usable in these benchmarks.

UpCloud 


UpCloud has a proprietary storage system called MAXIOPS, which is used for all disk resources. MAXIOPS allows attaching a number of disks of up to 1 TB each to a VM.

We believe MAXIOPS is based on SSD arrays connected over InfiniBand, but UpCloud has not given a detailed view into its storage technology so far.

UpCloud doesn't offer any VMs with local storage.

Considerations for running PostgreSQL on ephemeral disks


As mentioned, a number of the cloud infrastructure vendors offer instance storage that we believe to be much faster (potentially an order of magnitude faster) than the default network-attached storage, so we wanted to include it in our benchmarks.

It is important to note that we can't rely on instance storage for data durability because local disks are ephemeral and data on them will be lost in case the VM is shut down or crashes.

Therefore, other means must be used to ensure data durability across node failures.  Luckily PostgreSQL comes with multiple approaches for this:

  • Replication to a hot standby node
  • Incremental backup of data as it's written (WAL streaming)

We utilized WAL streaming for backups in this benchmark: all data is streamed to a durable cloud storage system for recovery purposes as it's written.
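
On a self-managed server, the rough shape of enabling continuous WAL archiving looks like the sketch below; the archive script here is purely hypothetical, and a managed service such as Aiven configures all of this for you:

    import psycopg2

    conn = psycopg2.connect("host=localhost dbname=postgres user=postgres")  # hypothetical DSN
    conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
    cur = conn.cursor()
    cur.execute("ALTER SYSTEM SET wal_level = 'replica'")
    cur.execute("ALTER SYSTEM SET archive_mode = 'on'")  # takes effect after a restart
    # 'archive_wal_segment' stands in for whatever script uploads segments to object storage
    cur.execute("ALTER SYSTEM SET archive_command = 'archive_wal_segment %p %f'")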

 

16 GB RAM / local vs. network disks benchmark


As with the network disk test, we were unable to provision VMs with identical specifications from the selected cloud infrastructure providers.

Specifically, AWS has a limited set of instance types with local disks, unlike Google, which allows attaching local disks to most instance types. The following VM configurations were used in this test run:


  • Amazon Web Services
    • i3.large: 2 vCPU; 15 GB RAM
    • 350 GB NVMe disk (max 475 GB)

  • Google Cloud Platform
    • n1-standard-4: 4 vCPU; 15 GB RAM
    • 350 GB NVMe disk (max 3 TB)

Again, we used PostgreSQL 10.0, Linux 4.13.15 and pgbench for the benchmark, this time with 16 parallel clients on the pgbench host and a dataset three times the host RAM.

The following graph compares the performance of PostgreSQL 10 running in AWS and GCP with local and network disks:


We were a bit surprised by the lack of differences between local and network disks. Apparently, the bottleneck is not in disk IO in this case.

 

64 GB RAM / local vs. network disks benchmark


We repeated the local disk benchmark with 64 GB instances configured as follows:

  • Amazon Web Services
    • i3.2xlarge: 8 vCPU; 61 GB RAM
    • 1 TB NVMe disk (max 1.9 TB)

  • Google Cloud Platform
    • n1-standard-16: 16 vCPU; 60 GB RAM
    • 1 TB NVMe disk (max 3 TB)

Again, we used PostgreSQL 10.0, Linux 4.13.15 and pgbench for the benchmark, this time with 64 parallel clients on the pgbench host and a dataset three times the host RAM.

The following graph compares the performance of PostgreSQL 10 running in AWS and GCP with local and network disks:


The results speak for themselves:  the local NVMe-backed instances blow away the network-backed instances.

 

Summary


A great many factors affect PostgreSQL performance, and each production workload differs from the next and, most importantly, from the workloads used in these benchmarks.

Our advice is to try to identify the bottlenecks in performance and tune the database configuration, workload or (virtual) hardware configuration to match the requirements.

In a cloud PostgreSQL-as-a-Service system such as ours, the PostgreSQL system parameters are typically automatically configured to match typical workloads and it's easy to try out different virtual hardware configurations.

If you decide to roll your own PostgreSQL setup, be sure to pay special attention to backups and data durability when using local disks.

 

Next steps


My conference slides are available in our GitHub repository. They include a set of benchmarks of different PostgreSQL-as-a-Service platforms on AWS (Aiven, RDS and AWS Aurora), comparing the performance of Aiven PostgreSQL backed by NVMe SSDs to RDS PostgreSQL on EBS volumes and AWS Aurora with its proprietary storage backend.

We will be publishing more benchmarks at regular intervals in the future comparing different workloads and infrastructure setups. We'll also include more database-as-a-service providers in our regular benchmarks, so stay tuned!

 

Launch the top-performing PostgreSQL service in minutes


As detailed in a previous blog post, PostgreSQL 10 with local disks is now available for all Aiven users. Please sign up for our free trial and have your top-performing PostgreSQL instance available for use in minutes.

Trying Aiven is always free and comes with no commitments.



2017-12-19

Aiven Kafka 1.0 Is Now Ready

After seven years of development, Apache Kafka released version 1.0.0 this past November.

Neha Narkhede, CTO of Confluent, stated that it represents the "completeness of our vision."

That's a pretty big statement, and it reflects that 1.0.0 is not merely a version change, but something bigger.

Let's look at Aiven's take on some of the highlights of the version release.

Enhanced metrics support


You can get more information about the health of your Kafka service because of the increased granularity of the available data.

This will help Kafka gain more traction in enterprise deployments as there's better visibility into its inner workings. Aiven Kafka will be making these metrics available to you in the upcoming weeks.

Support for Java 9


Java 9 brings much-improved performance for communication over TLS: improvements of up to 250% have been reported. Aiven will provide support for it later in 2018.

Notable bug fixes 


1. Speed increase of broker startup after unclean shutdown

By reducing unnecessary snapshot file deletions, restarts after an unclean shutdown are up to an order of magnitude faster than before.

This improves recovery times in situations where things have already gone wrong.

2. Improved memory usage during partition reassignment

In 0.11, memory usage would spike during partition reassignment. Now, memory is used more efficiently for partition reassignments during rebalances within Aiven Kafka.

Why both are important: improved resource management equals improved stability and performance. 

Aiven ensures that Apache Kafka 1.0 is ready for production use


As excited as we are for new releases, they are rarely ready for production use out of the box and Kafka 1.0 was no different.

Most notably, there was an issue surrounding memory leakage that needed to be resolved.

After much work from the community, we have now made Apache Kafka 1.0 available on the Aiven platform.

Get Aiven Kafka 1.0 today


All newly created Aiven Kafka services will be running on 1.0 automatically and we will be upgrading current client Kafka services in January.

If you are a current client and wish to upgrade to Kafka 1.0 earlier, feel free to do so by clicking the maintenance upgrade button within your customer panel.

Not a client of Aiven and want to try out Kafka? We'll give you $10 of credits to try it out for free! Simply click the button below and get started today!


2017-11-09

Aiven PostgreSQL plans are now larger and faster with local SSDs



Aiven PostgreSQL plans are now larger


As part of our ongoing efforts to improve our service according to client feedback, we've increased the disk space for all Aiven PostgreSQL plans.

Depending on plan type, we've increased disk space anywhere from 12% to 75%. Check out the table below to see improvements according to plan type:


Plan                            Previous storage    Improved storage
Startup/Business/Premium-4      50 GB               80 GB
Startup/Business/Premium-8      100 GB              175 GB
Startup/Business/Premium-16     200 GB              350 GB
Startup/Business/Premium-32     400 GB              700 GB
Startup/Business/Premium-64     800 GB              1000 GB
Startup/Business/Premium-120    1200 GB             1400 GB
Startup/Business/Premium-160    1600 GB             1800 GB
Startup/Business/Premium-240    2400 GB             2800 GB



New Aiven PostgreSQL service plans will come with the increased disk space while existing service plans will receive the increases during their next upgrades.

We can also provide even greater disk space upon request. If you're interested, feel free to contact our sales for more information!

For those of you who are running your PostgreSQL services on Amazon Web Services (AWS) and Google Cloud Platform (GCP), the news is even better.


Aiven supports local SSDs in AWS and GCP


In fact, we are the first DBaaS to offer such support. 

Specifically, we will use PCIe NVMe local SSDs starting from our Startup/Business/Premium-8 plans in GCP and Startup/Business/Premium-16 plans in AWS.

So why is this important?

Our initial tests have demonstrated up to a 400% increase in performance when using local SSDs over network-based SSDs. 

To learn more, view our benchmarking presentation that we gave at this year's PostgreSQL Conference Europe.

Currently, only AWS and GCP support local SSDs, but we hope to extend this to other cloud providers once they offer the required capabilities.

We will be following up later on with a longer form blog post that will go into more detail. 


Increase your PostgreSQL performance with Aiven


We are constantly striving to be the first to offer features and updates that will markedly improve the performance of the services you run with us.

That is why we made sure to be the first to provide local SSDs in our GCP and AWS plans, as well as release production-ready PostgreSQL 10.

If you are a current client, we'd like to thank you for working with us to improve industry standards. If you aren't a client yet, give us a try!

Trying Aiven is free and comes with no commitments. Start your trial today and receive $10 worth of test credits.  


2017-10-19

Aiven talks shop with Paf at second Apache Kafka meetup

In August, Aiven created an Apache Kafka meetup in Helsinki to discuss hot topics surrounding Kafka.

Due to its popularity, we decided to make it an ongoing event and held our second meetup yesterday.

This time, around 30 people gathered at Lifeline Ventures' office in downtown Helsinki, and there were two presenters:
  1. Niklas Nylund of Paf, an operator of slots, lotteries, poker and casino games as well as betting both online and in casinos.
  2. Heikki Nousiainen, our CTO
First up was Nylund to discuss the ins and outs of how his team brought Kafka in to act as a data bus for their real-time analytics needs.

Kafka reduces the spaghetti  


Paf collects change data capture (CDC) events from its databases and sends them to Kafka; the events are then consumed and imported into Kudu for analytics work.

Although Kafka has its quirks, Nylund is satisfied with how Paf has been able to use it to streamline their architecture.

Or, as he vividly put it, "Reducing the spaghetti."

Check out his presentation slides here to get a thorough understanding of Paf's integration process from beginning to end.

Kafka Connect simplifies integration


Second up was our CTO, who discussed the Kafka Connect framework and how to use it to transfer data between Kafka and other systems, in this case PostgreSQL and Elasticsearch.

Using Python code to interact with the services, Nousiainen was able to demonstrate how easy it was to push and pull data between Kafka and the external systems.
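
As a flavor of what that looks like (our own sketch rather than the demo code; the service endpoint and certificate paths are placeholders), producing a message to an SSL-secured Kafka service from Python takes only a few lines:

    from kafka import KafkaProducer  # kafka-python client

    producer = KafkaProducer(
        bootstrap_servers="kafka-demo.example.aivencloud.com:12345",
        security_protocol="SSL",
        ssl_cafile="ca.pem",
        ssl_certfile="service.cert",
        ssl_keyfile="service.key",
    )
    producer.send("clicks", b'{"user": "alice", "page": "/pricing"}')
    producer.flush()  # block until the message has been delivered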

With a large number of available connectors from the Kafka community, integrating Kafka with other systems can be quite straightforward.

This in turn allows for quick wins and migration towards real-time stream analytics with a Kafka-centric architecture.

Check out Nousiainen's presentation slides to get a better idea of the use cases for Kafka Connect.

Join the next Kafka discussion  


As the transition from a monolithic to microservices architecture continues, the use case for integrating Kafka as a streaming platform will only strengthen. 

This is evidenced by the increase in attendance of our events where many developers, be they users of Kafka or not, are gathering to learn more about what Kafka is, its use cases, and best practices for implementing it.

We are planning another Aiven Kafka meetup for the December/January timeframe, so join the Helsinki Apache Kafka Meetup group to get the details when we finalize the plans. We'll see you soon!

2017-10-11

Aiven is the first to offer PostgreSQL 10

We've got great news: PostgreSQL 10 is now available at Aiven on all major clouds!

That means that you can now access it on AWS, Google, Azure, UpCloud, and DigitalOcean clouds. Worldwide.

As with every PostgreSQL release, many of your older queries will simply run faster because of the many performance enhancements. But, why is PostgreSQL 10 significant?

As the 28th major update of the past 30 years, its primary focus is on improving the distribution of massive amounts of data across many nodes...let's look at some specifics.


PostgreSQL 10: the specifics 


Improved support for parallel queries 


Now, many more scan types are supported and can benefit from parallelization.

Depending on your query, newly added scan types such as parallel index scan and parallel bitmap heap scan can speed it up immensely.

Also, merge joins are now a supported parallel join type in addition to other join types already supported in the previous release, such as hash joins and nested loop joins.
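
To check whether a given query actually benefits, you can inspect its plan for parallel nodes; a minimal psycopg2 sketch (the connection details and table are made up):

    import psycopg2

    conn = psycopg2.connect("host=localhost dbname=demo user=postgres")  # hypothetical DSN
    cur = conn.cursor()
    cur.execute("SET max_parallel_workers_per_gather = 4")  # allow up to 4 workers per query
    cur.execute("EXPLAIN ANALYZE SELECT count(*) FROM big_table WHERE value > 42")
    for (line,) in cur.fetchall():
        print(line)  # look for Gather, Parallel Seq Scan or Parallel Index Scan nodes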


Declarative partitioning support


While you could create partitioning schemes by directly using constraints, inheritance and triggers in past versions of PostgreSQL...

...you can now use simple definitions to create your partitioning setup with PostgreSQL 10.

Even better, the performance of the new partitioning code is vastly improved over older methods.
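
For illustration, here is what a simple range-partitioned table looks like in PostgreSQL 10 (a minimal psycopg2 sketch; the table name and ranges are made up):

    import psycopg2

    conn = psycopg2.connect("host=localhost dbname=demo user=postgres")  # hypothetical DSN
    cur = conn.cursor()
    # the parent table only declares the partitioning scheme...
    cur.execute("""
        CREATE TABLE measurements (
            ts    timestamptz NOT NULL,
            value numeric
        ) PARTITION BY RANGE (ts)
    """)
    # ...and each partition declares the range of values it covers
    cur.execute("""
        CREATE TABLE measurements_2017_q4 PARTITION OF measurements
            FOR VALUES FROM ('2017-10-01') TO ('2018-01-01')
    """)
    conn.commit()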


Hash indexes


PostgreSQL 10 brings crash-safe hash index support that also performs far better than before.

Now, you can consider using hash indexes to increase performance when your queries just need to check for equivalence.
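
For example, an equality-only lookup table is a natural fit (again a made-up schema via psycopg2):

    import psycopg2

    conn = psycopg2.connect("host=localhost dbname=demo user=postgres")  # hypothetical DSN
    cur = conn.cursor()
    # hash indexes suit pure equality lookups, such as fetching a session by its token
    cur.execute("CREATE TABLE sessions (token text, data jsonb)")
    cur.execute("CREATE INDEX sessions_token_idx ON sessions USING hash (token)")
    cur.execute("SELECT data FROM sessions WHERE token = %s", ("abc123",))
    conn.commit()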


Native logical replication


PostgreSQL 10 now brings proper support for logical replication in PostgreSQL itself.

Logical replication allows replication between different PostgreSQL versions, finally allowing for zero-downtime upgrades to future versions.

You can also migrate data to and from environments where you don't have access to streaming replication.
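
The moving parts are a publication on the source server and a subscription on the destination; roughly (with hypothetical hosts and tables):

    import psycopg2

    # on the publishing (source) server
    src = psycopg2.connect("host=old-pg10 dbname=app user=postgres")
    src.autocommit = True
    src.cursor().execute("CREATE PUBLICATION app_pub FOR TABLE users, orders")

    # on the subscribing (destination) server; CREATE SUBSCRIPTION cannot run
    # inside a transaction block, hence autocommit
    dst = psycopg2.connect("host=new-pg dbname=app user=postgres")
    dst.autocommit = True
    dst.cursor().execute(
        "CREATE SUBSCRIPTION app_sub "
        "CONNECTION 'host=old-pg10 dbname=app user=replicator' "
        "PUBLICATION app_pub"
    )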


Easily test PostgreSQL 10 with Aiven today


With Aiven, you can create a full copy of the data within your existing PostgreSQL service as a new, separate PostgreSQL 10 service.

By forking, you can keep your existing PostgreSQL services as-is while testing the latest version for compatibility with your applications.

And don't worry, it won't negatively affect the performance of your source service; it just provides an easy and efficient way to test PostgreSQL 10. So, let's get started.




Cheers,
Team Aiven

P.S. For a full list of features, please see the full PostgreSQL 10 release notes.