2016-12-28

Aiven PostgreSQL read-only replicas

We are happy to announce that we have enabled read-only replica access to all of our PostgreSQL plans that have one or more standby nodes. Utilizing the standby server nodes for read-only queries allows you to scale your database reads by moving some of the load away from the master server node to the otherwise mostly idle replica nodes.

What are master and standby nodes?

The PostgreSQL master node is the primary server node that processes SQL queries, makes the necessary changes to the database files on disk and returns the results to the client application.

PostgreSQL standby nodes replicate the changes from the master node (which is why they are also called "replicas") and try to maintain an up-to-date copy of the same database files that exist on the master.

Standby nodes are useful for multiple reasons:
  • Having another physical copy of the data in case of hardware/software/network failures
  • Having a standby node typically reduces the data loss window in disaster scenarios
  • In case of failures, restoring the database back to operation is quick via a controlled failover, as the standby is already installed, running and in sync with the data
  • Standby nodes can be used for read-only queries to reduce the load on the master server

What is the difference between having 0, 1 or even 2 standby nodes?

Aiven offers PostgreSQL plans with different number of standby server nodes in each:
  • Hobbyist and Startup plans have just a single master node and no standby nodes
  • Business plans have one master node and one standby node
  • Premium plans have one master node and two standby nodes
The difference between the plans is primarily the behavior during failure scenarios. There are many bad things that can happen to cloud servers (or any server in general): hardware failures, disk system crashes, network failures, power failures, software errors, running out of memory, operator mistakes, fires, floods and so on.

Single node plans are most prone to data loss during failures. For example, if the server power suddenly goes out, some of the latest database changes may not have made it out from the server into backups. The size of the data loss window is dependent on the backup method used.

Single node plans are also the slowest to recover back to operation from failures. When the server virtual machine fails, it takes time to launch a new virtual machine and to restore it from the backups. The restore time can be anything from a couple of minutes to several hours, the primary factor being the size of the database that needs to be restored.

Adding a "hot" standby node helps with both of the above issues: the data loss window can be much smaller as the master is streaming out the data changes in real-time to the standby as they happen. The "lag" between the master and standby is typically very low, from tens of bytes to hundreds of bytes of data.

Recovery from failure is also much faster, as the standby node is already up and running, just waiting for the signal to be promoted to master so that it can replace the old failed master.

What about having two standby nodes? What is the point in that?

The added value of having a second standby node is that even during recovery from (single-node) failures, there are always two copies of the data on two different nodes. If another failure strikes after a failover, when there is just a single master node running, we again risk losing some of the latest changes written to the database. It takes time to rebuild a new standby node and get it back in sync after a failover when there is a lot of data in the database, and it often makes sense to protect the data over this time period by having another replica. This is especially important when the database size is large and recreating a replacement node for the failed one can take hours.

Using standby nodes for read-only queries


Standby nodes are also useful for distributing the load away from the master server. In Aiven the replica nodes can be accessed by using the separate "Read-only replica URL" visible in the Aiven web console:



Using the replica URL in a database client application will connect to one of the available replica server nodes. Previously, replica node access was only available in our Premium plans (master + two standbys), but now we have enabled it in our Business plans (master + one standby) as well.

So if you have had high CPU usage on the master node of your Startup plan, it may be worthwhile to look into increasing your read throughput by moving to a plan with replica servers and using them for reads. In addition, with a Business plan the service will also have better high-availability characteristics, since there is a standby to fail over to.

A good thing to note is that since the replication used in Aiven PostgreSQL is asynchronous, there is a small replication lag involved. What this means in practice is that if you do an INSERT on the master, it takes a while (usually much less than a second) for the change to be propagated to the standby and become visible there.

Replica Usage

To start using your read-only replica, copy the Read-only replica URL from the console and connect to it with your client:

$ psql postgres://avnadmin:foo@replica.demopg.demoprj.aivencloud.com:10546/defaultdb?sslmode=require
psql (9.6.1, server 9.6.1)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

defaultdb=>

After this you can run any read-only query without slowing down the master.

While connected, PostgreSQL can also tell you whether you're connected to a master or a standby node. To check, run:

defaultdb=> SELECT * FROM pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

If it returns TRUE, you're connected to a replica; if it returns FALSE, you're connected to the master server.
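
As a concrete example, application code can simply open one connection to the Service URL for writes and another to the Read-only replica URL for reads. Below is a minimal sketch using Python and psycopg2; the URLs, credentials and the "events" table are placeholders, and the replication lag caveat above applies to read-after-write cases:

import psycopg2

# Placeholder URLs: use the "Service URL" and "Read-only replica URL"
# shown for your own service in the Aiven web console.
MASTER_URL = "postgres://avnadmin:foo@demopg.demoprj.aivencloud.com:10546/defaultdb?sslmode=require"
REPLICA_URL = "postgres://avnadmin:foo@replica.demopg.demoprj.aivencloud.com:10546/defaultdb?sslmode=require"

write_conn = psycopg2.connect(MASTER_URL)
read_conn = psycopg2.connect(REPLICA_URL)

# Writes always go to the master...
with write_conn, write_conn.cursor() as cur:
    cur.execute("INSERT INTO events (payload) VALUES (%s)", ("hello",))

# ...while read-only queries can be served by the standby. Because the
# replication is asynchronous, the INSERT above may not be visible here
# for a brief moment.
with read_conn, read_conn.cursor() as cur:
    cur.execute("SELECT count(*) FROM events")
    print(cur.fetchone()[0])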

Try PostgreSQL 9.6 for free in Aiven

Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

Go to https://aiven.io/ to get started!

Cheers,
Team Aiven

2016-12-21

Aiven PostgreSQL connection pooling

We're happy to announce that Aiven PostgreSQL now has support for connection pools. Connection pooling allows you to maintain very large numbers of connections to a database while keeping the server resource usage low.

Aiven PostgreSQL connection pooling utilizes PGBouncer for managing the database connections, and each pool can handle up to 5000 database client connections. Unlike when connecting directly to the PostgreSQL server, each client connection does not require a separate backend process on the server. PGBouncer automatically interleaves the client queries and only uses a limited number of actual backend connections, leading to lower resource usage on the server and better total performance.

Why connection pooling?


Eventually a high number of backend connections becomes a problem with PostgreSQL as the resource cost per connection is quite high due to the way PostgreSQL manages client connections. PostgreSQL creates a separate backend process for each connection and the unnecessary memory usage caused by the processes will start hurting the total throughput of the system at some point. Also, if each connection is very active, the performance can be affected by the high number of parallel executing tasks.

It makes sense to have enough connections so that each CPU core on the server has something to do (each connection can only utilize a single CPU core [1]), but a hundred connections per CPU core may be too much. All this is workload specific, but often a good number of connections to have is in the ballpark of 3-5 times the CPU core count.

[1] PostgreSQL 9.6 introduced limited parallelization support for running queries in parallel on multiple CPU cores.

Without a connection pooler the database connections are handled directly by PostgreSQL backend processes, one process per connection:





Adding a PGBouncer pooler that utilizes fewer backend connections frees up server resources for more important uses, such as disk caching:




Many frameworks and libraries (ORMs, Django, Rails, etc.) support client-side pooling, which solves much the same problem. However, when there are many distributed applications or devices accessing the same database, a client-side solution is not enough.

Connection pooling modes


Aiven PostgreSQL supports three different operational pool modes: "session", "transaction" and "statement".

  • The "session" pooling mode means that once a client connection is granted access to a PostgreSQL server-side connection, it can hold it until the client disconnects from the pooler. After this the server connection will be returned back into the connection pooler's free connection list to wait for its next client connection. Client connections will be accepted (at TCP level), but their queries will only proceed once another client disconnects and frees up its backend connection back into the pool. This mode can be helpful in some cases for providing a wait queue for incoming connections while keeping the server memory usage low, but has limited usefulness under most common scenarios due to the slow recycling of the backend connections.
  • The "transaction" pooling mode on the other hand allows each client connection to take their turn in using a backend connection for the duration of a single transaction. After the transaction is committed, the backend connection is returned back into the pool and the next waiting client connection gets to reuse the same connection immediately. In practise this provides quick response times for queries as long as the typical transaction execution times are not excessively long. This is the most commonly used PGBouncer mode and also the Aiven PostgreSQL default pooling mode.
  • The third operational pooling mode is "statement" and it is similar to the "transaction" pool mode, except that instead of allowing a full transaction to be run, it cycles the server side connections after each and every database statement (SELECT, INSERT, UPDATE, DELETE statements, etc.). Transactions containing multiple SQL statements are not allowed in this mode. This mode is sometimes used, for example, when running specialized sharding front-end proxies.
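
One practical consequence of the "transaction" mode is that clients should not rely on session-level state (SET commands, session-level prepared statements, temporary tables and the like) persisting between transactions, since the next transaction may be served by a different backend connection. Here's a minimal sketch of the recommended usage pattern with Python and psycopg2; the pool host, port, name and credentials are placeholders:

import psycopg2

# Connect to the PGBouncer pool port instead of the regular PostgreSQL
# server port; the pool name is used as the database name.
conn = psycopg2.connect(
    host="demopg.demoprj.aivencloud.com",  # placeholder host
    port=10547,                            # placeholder pool port
    dbname="mypool",                       # placeholder pool name
    user="avnadmin",
    password="foo",
    sslmode="require",
)

# Keep each unit of work within a single transaction: the backend
# connection is only yours for the duration of that transaction.
with conn:
    with conn.cursor() as cur:
        cur.execute("SELECT now()")
        print(cur.fetchone()[0])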

How to get started with Aiven PostgreSQL connection pooling


First you need an Aiven PostgreSQL service; for the purposes of this tutorial we assume you have already created one. A quick Getting Started guide is available that walks you through the service creation part.

This is the overview page for our PostgreSQL service in the Aiven web console. You can connect directly to the PostgreSQL server using the settings described next to "Connection parameters" and "Service URL", but note that these connections will not utilize PGBouncer pooling.



Clicking the "Pools" tab opens a list of PGBouncer connection pools defined for the service. Since this service was launched, there are no pools defined yet:

 
 


To add a new pool click on the "Add pool" button:


The pool settings are:
  • Pool name: Allows you to name your connection pool. This will also become the "database" or "dbname" connection parameter for your pooled client connections.
  • Database: Allows you to choose which database to connect to. Each pool can only connect to a single database.
  • Username: Selects which database username to use when connecting to the backend database.
  • Pool mode: Refers to the pooling mode described in more detail earlier in this article.
  • Pool size: How many PostgreSQL server connections the pool can use at a time.

For the purposes of this tutorial we'll name the pool "mypool", set the pool size to 1 and the pool mode to "statement". Confirming the settings by clicking "Add pool" will immediately create the pool, and the pool list is updated:


Clicking the "Info" button next to the pool information shows you the database connection settings for this pool. Note that PGBouncer pools are available under a different port number from the regular unpooled PostgreSQL server port. Both pooled and unpooled connections can be used at the same time.



Verifying the connection pool


We can use the psql command-line client to verify that the pooling works as expected:

From terminal #1:

$ psql <pool-uri>

From terminal #2:

$ psql <pool-uri>

Now we have two open client connections to the PGBouncer pooler. Let's verify that each connection is able to access the database:

Terminal #1:

mypool=> SELECT 1;
 ?column?
──────────
        1
(1 row)


Terminal #2:

mypool=> SELECT 1;
 ?column?
──────────
        1
(1 row)


Both connections respond as they should. Now let's check how many connections there are to the PostgreSQL backend database:

Terminal #1:

mypool=> SELECT COUNT(*) FROM pg_stat_activity WHERE usename = 'avnadmin';
 count
-------
     1
(1 row)

As we can see from the pg_stat_activity output, the two psql sessions use the same PostgreSQL server database connection.

Summary


The more client connections you have to your database, the more useful connection pooling becomes. Aiven PostgreSQL makes using connection pooling an easy task and migrating from non-pooled connections to pooled connections is just a matter of gradually changing your client-side connection database name and port number!
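
For instance, if your application builds its connection settings from configuration, the switch can look like the following sketch (the values are placeholders; the pooled port and pool name are shown in the "Info" view of your pool):

import psycopg2

# Direct (unpooled) connection parameters -- placeholders:
direct = {
    "host": "demopg.demoprj.aivencloud.com",
    "port": 10546,           # regular PostgreSQL server port
    "dbname": "defaultdb",
    "user": "avnadmin",
    "password": "foo",
    "sslmode": "require",
}

# Pooled connection: only the port and the database name change.
pooled = dict(direct, port=10547, dbname="mypool")

conn = psycopg2.connect(**pooled)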

Try PostgreSQL 9.6 for free in Aiven


Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

Go to https://aiven.io/ to get started!

Cheers,
Team Aiven

2016-12-20

New AWS regions, Kafka topic management and PostgreSQL connection pooling

We're happy to announce the immediate availability of two new Amazon Web Services regions.  The new regions are in London and Montreal (Canada Central).  All Aiven services are available in the two new regions which brings the total number of available Aiven regions to 51!

Kafka topic management


As mentioned in our previous blog post, we've rolled out a new Kafka topic management interface for our web console.  The new console is immediately available to all current and new Kafka users.

PostgreSQL connection pooling


Connection pooling, also mentioned in the previous blog post, is now available in the web console.  Connection pooling is available in all of our Startup, Business and Premium plans.  The feature will be covered in depth in an upcoming blog post.

Try the new clouds and features for free


Remember that trying out Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer allows you to try out our new clouds and features, plus any existing clouds and services already available in Aiven.

Go to https://aiven.io/ to get started!

Cheers,
Team Aiven

2016-11-30

Aiven in Google Tokyo & other updates

We're happy to announce that we've recently added the new Google Cloud Northeast Asia (Tokyo) region to Aiven.  All Aiven services are now available in the new region which is our fourth supported cloud in Japan and 12th cloud in Asia.

We've also been busy enhancing our Kafka service with new topic management APIs and a user interface, which we hope to roll out during the next couple of weeks.  On the PostgreSQL side we've just finished the development of our connection pooling system, and PGBouncer-based connection pooling will be available in Aiven after our next service update.  The use cases and benefits of these new features will be covered in upcoming blog posts.

The rest of the week the Aiven founding team is busy meeting new and old customers and partners at the Slush startup conference here in Helsinki, Finland.

Cheers,
Team Aiven

2016-10-01

PostgreSQL 9.6 now available in AWS, Azure, Google Cloud with Aiven

PostgreSQL 9.6 is now available in Aiven, bringing the latest and most advanced PostgreSQL release to all the major clouds around the world: all Amazon Web Services, DigitalOcean, Google Cloud, Microsoft Azure and UpCloud regions are supported.



9.6 is the latest version of PostgreSQL incorporating a year's development effort and introducing major improvements to handling large amounts of data with new features like parallel query and reducing the need to continuously vacuum unmodified tables.

Aiven also lets you fork an existing PostgreSQL 9.5 database into a new PostgreSQL 9.6 service. We're also planning to introduce in-place one-click upgrades of existing PostgreSQL 9.5 databases in the near future.

Try PostgreSQL 9.6 for free in Aiven

Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

Go to https://aiven.io/ to get started!

Cheers,
Team Aiven

2016-09-23

User management improvements and new Azure regions launched

In this week's updates to Aiven we've greatly enhanced project membership management as well as launched Aiven in five new Azure regions.

 

Project membership improvements

Project membership management improvements make it easier to share the ownership and management responsibilities of an Aiven project between multiple users in one organization.  You can now promote other team members to project administrators, allowing them to invite more members to the project and to adjust billing settings.

We've also updated project invite functionality to allow inviting users who haven't yet registered their Aiven accounts.  Such users are invited to sign up to Aiven and once they've signed up the console will display pending invitations for them.  Pending invitations are shown for all users and projects in the console.

These improvements make it easier for multiple users to start collaborating in Aiven.


New Azure regions launched

We're happy to announce immediate availability of Aiven in five new Azure regions:  All Aiven services are now available in Azure Japan West (Osaka), Japan East (Tokyo), East Asia (Hong Kong), Southeast Asia (Singapore) and Brazil South (São Paulo) regions.


Pricing updates

To reflect the cost differences between different cloud providers and regions, we have adjusted the pricing of new services in the Amazon Web Services and Azure clouds.  Pricing in most AWS regions has been increased for new plans; the prices for current Aiven users will stay unchanged until the end of 2016.

We have no plans for changing Google Cloud, DigitalOcean or UpCloud pricing in the foreseeable future.

We'd also like to remind you about Aiven's unique feature allowing seamless migrations between cloud providers, making it possible to migrate services to different clouds if needed.


Trying Aiven is free, no credit card required

Our free trial program is still open: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans.

Go to https://aiven.io/ to get started!

We value your feedback

We are always interested in ways of making our service better. Please send your feedback and suggestions via email, Facebook, LinkedIn or using our support system.

2016-09-15

Aiven brings easy, powerful hosted databases to Microsoft Azure

We are proud to announce that Aiven is now available in the Microsoft Azure cloud!

The services available initially at launch in Microsoft Azure are Aiven PostgreSQL, Aiven Redis, Aiven Elasticsearch, Aiven Kafka, Aiven InfluxDB and Aiven Grafana.

Microsoft Azure is a leading global cloud provider and what makes them special is their high number of data centers around the world, currently totaling 30+.

All Aiven services will be available in all generally available Azure regions, bringing nineteen new cloud regions to Aiven and setting the total number of supported data centers to 47, making us the cloud database provider with the widest geographic availability in the world!

The first batch of new cloud regions immediately available are from Azure North America and Europe. Azure Asia and South America regions will follow soon and will be available in the upcoming weeks.

Here's an updated world map showing our supported data center locations. The new Azure regions are the light blue ones:



Microsoft Azure provides numerous services from computing resources to higher level services like machine learning. See azure.microsoft.com for more information about their services.

The new Aiven regions that are immediately available are:

  • United States
    • Iowa - Azure: Central US
    • Virginia - Azure: East US
    • Virginia - Azure: East US 2
    • Illinois - Azure: North Central US
    • Texas - Azure: South Central US
    • California - Azure: West US
    • Washington - Azure: West US 2
    • Wyoming - Azure: West Central US
  • Canada
    • Ontario - Azure: Canada Central
    • Quebec - Azure: Canada East
  • Europe
    • Ireland - Azure: North Europe
    • Netherlands - Azure: West Europe
    • England - Azure: UK South
    • Wales - Azure: UK West
  • Asia
    • Hong Kong - Azure: East Asia (available in the coming weeks)
    • Singapore - Azure: Southeast Asia (available in the coming weeks)
    • Tokyo, Japan - Azure: Japan East (available in the coming weeks)
    • Osaka, Japan - Azure: Japan West (available in the coming weeks)
  • South America
    • Brazil - Azure: Brazil South (available in the coming weeks)
New services can be launched in these regions today, and using the Aiven zero-downtime migration it is also possible to easily migrate existing services to Azure!

All of the Aiven services offer worry-free fully automated DBaaS hosting, including offsite backups, automatic failure recovery and hardened security.

We will continue to expand our Database-as-a-Service offering in both cloud and region support and by adding more services. We are always looking for feedback on what to improve so feel free to let us know if you have ideas on what you'd love to see us support next.

Go to aiven.io to get started! Free $10 credits at registration, no credit card required. Services are billed by the hour.

Last but not least, we would like to thank all of our customers who participated in our beta testing phase!


Cheers,
    Aiven team


2016-08-18

Getting started with Aiven Kafka

Apache Kafka is a high-throughput publish-subscribe message broker, perfect for providing messaging for microservices. Aiven offers fully managed Kafka-as-a-Service in Amazon Web Services, Google Compute Engine, DigitalOcean and UpCloud (Microsoft Azure support is coming during 2016 Q3!).

Apache Kafka


Apache Kafka is a popular open-source publish-subscribe message broker. Kafka is distributed by design and offers scalability, high-throughput and fault-tolerance. Kafka excels in streaming data workloads, ingesting and serving hundreds of megabytes per second from thousands of clients.

Apache Kafka deployment example



Apache Kafka was originally developed by LinkedIn and open sourced in 2011.

Kafka is often used as a central piece in analytics, telemetry and log aggregation workloads, where it is used to capture and distribute event data at very high data rates. It can act as a communications hub for microservices for distributing work over a large cluster of nodes.

At Aiven, we use Kafka as a message bus between our cluster nodes as well as for delivering telemetry, statistics and logging data. Kafka's guarantees for message delivery and fault-tolerance allow us to simplify and decouple service components.

What is Aiven Kafka


Aiven Kafka is our fully managed Kafka service. We take care of the deployment and operational burden of running your own Kafka service, and make sure your cluster stays available, healthy and always up-to-date. We ensure your data remains safe by encrypting it both in transit and at rest on disk.

We offer multiple different plan types with different cluster sizing and capacity, and charge only based on your actual use on an hourly basis. Custom plans for deployments that are larger or have specific needs can also be requested. Aiven also makes it possible to migrate between the plans with no downtime to address changes in your requirements.

Below, I'll walk you through setting up and running with your first Aiven Kafka service.


Getting started with Aiven Kafka


Creating an Aiven Kafka service is easy: just select the correct service type from the drop-down menu in the new service creation dialog. You'll have the option of selecting three or five node cluster plans with the storage sizing of your choice. The larger node count allows for higher throughput or larger replication factors for mission-critical data. If unsure, pick a three node cluster; you can always change the selected plan at a later time.



All Aiven services are offered over SSL encrypted connections for your protection. With Kafka, you're also required to perform client authentication with the service certificates we provide. You can find and download these keys and certificates in the connection parameters section of the service details page: the access key and certificate, plus the CA certificate you can use to verify the Aiven endpoint. Store these locally; we'll refer back to them in the code examples below (ca.crt, client.crt, client.key).




Finally, you can create the topics you'd like to use under the topics tab on the service details page. In Kafka terms, topics are logical channels that you send messages to and read them from. Topics themselves are divided into one or more partitions. Partitions can be used to handle larger read/write rates, but do note that Kafka's ordering guarantees are only valid within one partition.

When creating a topic, you can select the number of partitions, the number of replicas and how many hours the messages are retained in the Kafka logs before deletion. You can also increase the number of partitions at a later time.



That's it! The service is up and running and ready to capture and distribute your messages. The Aiven team will take care of the operational burden of running your cluster and ensure it remains available and healthy at all times. To utilize the service, we've included code examples in Python and Node.js below. Just make sure to replace the value of bootstrap_servers below with the service URL from the service details page. Also, verify that the SSL settings below point to the actual key and certificate files downloaded earlier.

Accessing Aiven Kafka in Python


Producing messages (the Kafka term for sending them):

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="getting-started-with-kafka.htn-aiven-demo.aivencloud.com:17705",
    security_protocol="SSL",
    ssl_cafile="ca.crt",
    ssl_certfile="client.crt",
    ssl_keyfile="client.key",
)

for i in range(1, 4):
    message = "message number {}".format(i)
    print("Sending: {}".format(message))
    producer.send("demo-topic", message.encode("utf-8"))

# Wait for all messages to be sent
producer.flush()

Consuming, or receiving, the same messages:

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "demo-topic",
    bootstrap_servers="getting-started-with-kafka.htn-aiven-demo.aivencloud.com:17705",
    client_id="demo-client-1",
    group_id="demo-group",
    security_protocol="SSL",
    ssl_cafile="ca.crt",
    ssl_certfile="client.crt",
    ssl_keyfile="client.key",
)

for msg in consumer:
    print("Received: {}".format(msg.value))

Output from the producer above:

$ python kafka-producer.py
Sending: message number 1
Sending: message number 2
Sending: message number 3

And the consuming side:

$ python kafka-consumer.py
Received: message number 1
Received: message number 2
Received: message number 3
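
As noted earlier, Kafka's ordering guarantees only hold within a single partition. If per-entity ordering matters, a common pattern is to give related messages the same key, so that the default partitioner always places them in the same partition. Here's a sketch reusing the producer settings from above; the key and message contents are just examples:

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="getting-started-with-kafka.htn-aiven-demo.aivencloud.com:17705",
    security_protocol="SSL",
    ssl_cafile="ca.crt",
    ssl_certfile="client.crt",
    ssl_keyfile="client.key",
)

# Messages that share a key are hashed to the same partition and are
# therefore consumed in the order they were produced.
for i in range(1, 4):
    producer.send(
        "demo-topic",
        key=b"device-42",
        value="reading number {}".format(i).encode("utf-8"),
    )

# Wait for all messages to be sent
producer.flush()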

 

Accessing Aiven Kafka in Node.js


Here's a Node.js example utilizing the node-rdkafka module:

var Kafka = require('node-rdkafka');

var producer = new Kafka.Producer({
    'metadata.broker.list': 'getting-started-with-kafka.htn-aiven-demo.aivencloud.com:17705',
    'security.protocol': 'ssl',
    'ssl.key.location': 'client.key',
    'ssl.certificate.location': 'client.crt',
    'ssl.ca.location': 'ca.crt',
    'dr_cb': true
});

producer.connect();

producer.on('ready', function() {
    var topic = producer.Topic('demo-topic', {'request.required.acks': 1});
    producer.produce({
        message: new Buffer('Hello world!'),
        topic: topic,
    }, function(err) {
        if (err) {
            console.log('Failed to send message', err);
        } else {
            console.log('Message sent successfully');
        }
    });
});

And the consuming side:

var Kafka = require('node-rdkafka');

var consumer = new Kafka.KafkaConsumer({
    'metadata.broker.list': 'getting-started-with-kafka.htn-aiven-demo.aivencloud.com:17705',
    'group.id': 'demo-group',
    'security.protocol': 'ssl',
    'ssl.key.location': 'client.key',
    'ssl.certificate.location': 'client.crt',
    'ssl.ca.location': 'ca.crt',
});

var stream = consumer.getReadStream('demo-topic');

stream.on('data', function(data) {
    console.log('Got message:', data.message.toString());
});



Trying Aiven is free, no credit card required


Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans. The offer works for all of our services: PostgreSQL, Redis, InfluxDB, Grafana, Elasticsearch and Kafka!

Go to https://aiven.io/ to get started!


Cheers,

    Team Aiven

2016-07-22

Backing up tablespaces and streaming WAL with PGHoard

We've just released a new version of PGHoard, the PostgreSQL cloud backup tool we initially developed for Aiven and later open sourced.

Version 1.4.0 comes with the following new features:
  • Support for PostgreSQL 9.6 beta3
  • Support for backing up multiple tablespaces
  • Support for StatsD and DataDog metrics collection
  • Basebackup restoration now shows download progress
  • Experimental new WAL streaming mode walreceiver, which reads the write-ahead log data directly from the PostgreSQL server using the streaming replication protocol
  • New status API in the internal REST HTTP server
Please see our previous blog post about PGHoard for more information about the tool and a guide for deploying it.

Backing up multiple tablespaces

This is the first version of PGHoard capable of backing up multiple tablespaces. Multiple tablespaces require using the new local-tar backup option for reading files directly from the disk instead of streaming them using pg_basebackup, as pg_basebackup doesn't currently allow streaming multiple tablespaces without writing them to the local filesystem.

The current version of PGHoard can utilize the local-tar backup mode only on a PG master server; PostgreSQL versions prior to 9.6 don't allow users to run the necessary control commands on a standby server without using the pgespresso extension. pgespresso also required fixes, which we contributed, to support multiple tablespaces - once a fixed version has been released we'll add support for it to PGHoard.

The next version of PGHoard, due out by the time of PostgreSQL 9.6 final release, will support local-tar backups from standby servers, natively when running 9.6 and using the pgespresso extension when running older versions with the latest version of the extension.

A future version of PGHoard will support backing up and restoring PostgreSQL basebackups in parallel mode when using the local-tar mode.  This will greatly reduce the time required for setting up a new standby server or restoring a system from backups.

Streaming replication support

This version adds experimental support for reading PostgreSQL's write-ahead log directly from the server using the streaming replication protocol which is also used by PostgreSQL's native replication and related tools such as pg_basebackup and pg_receivexlog. The functionality currently depends on an unmerged psycopg2 pull request which we hope to see land in a psycopg2 release soon.

While the walreceiver mode is still experimental, it has a number of benefits over other methods of backing up the WAL and allows implementing new features in the future: temporary, uncompressed files as written by pg_receivexlog are no longer needed, saving disk space and I/O, and incomplete WAL segments can be archived at specified intervals or, for example, whenever a new COMMIT appears in the WAL stream.

New contributors

The following people contributed their first patches to PGHoard in this release:
  • Brad Durrow
  • Tarvi Pillessaar

PGHoard in Aiven.io

We're happy to talk more about PGHoard and help you set up your backups with it.  You can also sign up for a free trial of our Aiven.io PostgreSQL service where PGHoard will take care of your backups.


Cheers,
Team Aiven

2016-07-19

New, bigger InfluxDB plans now available

We're happy to announce the immediate availability of new, bigger InfluxDB plans in Aiven. The new plans allow you to store up to 750 gigabytes of time-series data in a fully-managed InfluxDB database.

InfluxDB can be used to store time-series data from various data sources using data collection tools like Telegraf. The collected data is typically operating system and application metric data like CPU utilization and disk space usage, but we've also, for example, helped set up InfluxDB to host time-series data for an industrial manufacturing line, where our Grafana service is used for data visualization.

Our InfluxDB Startup-4 plan, available in all AWS, Google Cloud, UpCloud and DigitalOcean regions, was expanded to 16 gigabytes of storage space, and we've announced all-new Startup-8, 16, 32 and 64 plans, available in all AWS, Google Cloud and UpCloud regions, with CPU counts ranging from 1 to 16, RAM from 4 to 64 gigabytes and storage space between 50 and 750 gigabytes.

Trying Aiven is free, no credit card required

Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans.

Go to https://aiven.io/ to get started!


Cheers,

    Team Aiven

2016-07-13

Aiven Kafka now publicly available!

In a world filled with microservices, we're delighted to announce yet another expansion of the Aiven service portfolio: Aiven Kafka. Aiven Kafka adds streaming data capabilities in the form of a distributed commit log. For the last three months we've been offering Apache Kafka in private beta and now we're making it publicly available!


Aiven Kafka is a service that can be used to ingest and read back large quantities of log event data. This allows you to write your whole event stream durably in a fire-hose-like fashion and then process it at your leisure. Kafka is used in some of the largest companies on the planet for many mission-critical workloads. Besides using it for streaming data, you can also use it as a message broker connecting your myriad services with each other.

Historically, Kafka itself and especially its reliance on Apache ZooKeeper have meant that setting it up requires considerable time and effort, as well as skilled staff to maintain and operate it. Aiven Kafka now makes it trivially easy to have your own managed Kafka cluster.

The easy streaming log service for your microservices

Our Web Console allows you to launch Aiven Kafka in any of our supported clouds and regions with a couple of clicks. All Aiven services are available in all Amazon Web Services, Google Cloud, DigitalOcean and UpCloud regions allowing you to launch services near you in minutes.



Aiven Kafka is a first-class service in Aiven, meaning we'll take care of fault-tolerance, monitoring and maintenance operations on your behalf. In case you need to get more performance out of your Kafka cluster, you can simply expand your cluster by selecting a bigger plan and all your data will be automatically migrated to beefier nodes without any downtime.

Our startup Kafka plan

If you want to try out Kafka on a modestly powered three node cluster and don't need Kafka REST, our Startup-2 plan will get you started. You can easily upgrade to a larger plan later if needed.
  • Startup-2: 1 CPU, 2 GB RAM, 30 GB SSD at $200 / month ($0.274 / hour)

Our three node business Kafka plans

Our Business plans are three node clusters which are deployed alongside Kafka REST to allow the use of HTTP REST calls for interacting with Kafka.
  • Business-4: 1 CPU, 4 GB RAM, 200 GB SSD at $500 / month ($0.685 / hour)
  • Business-8: 2 CPU, 8 GB RAM, 400 GB SSD at $1000 / month ($1.370 / hour)
  • Business-16: 4 CPU, 16 GB RAM, 800 GB SSD at $2000 / month ($2.740 / hour)
 

Highly-available five node premium Kafka plans

If you want an even higher level of reliability and performance, our Premium Aiven Kafka plans are made for this. They all come with five (or more for custom plans) Kafka broker nodes.
  • Premium-4: 1 CPU, 4 GB RAM, 200 GB SSD at $800 / month ($1.096 / hour)
  • Premium-8: 2 CPU, 8 GB RAM, 400 GB SSD at $1600 / month ($2.192 / hour)
  • Premium-16: 4 CPU, 16 GB RAM, 800 GB SSD at $3200 / month ($4.384 / hour)
Also, if you need larger or otherwise customized plans, please don't hesitate to contact us.

 

Trying Aiven is free, no credit card required

Remember that trying Aiven is free: you will receive US$10 worth of free credits at sign-up which you can use to try any of our service plans.

Go to https://aiven.io/ to get started!

We value your feedback

We are always interested in ways of making our service better. Please send your feedback and suggestions via email, Facebook, LinkedIn or using our support system.

 Cheers,

   Aiven Team

2016-06-17

Even bigger PostgreSQL plans now available

We've just launched the new 64, 120 and 160 style PostgreSQL plans in multiple clouds. These new plans allow you to run larger and larger PostgreSQL instances in the cloud. The new plans are available in our Startup, Business and Premium flavors, supporting various levels of high availability. The number after the plan flavor designates the RAM available for database use in the plan. CPU count and storage also grow with the larger plans, giving you more resources to run your transactions.


All of the new plans are available in both Amazon Web Services and Google Cloud, and the 64 and 120 plans are also available in UpCloud. We hope we can offer bigger plans in DigitalOcean in the near future as well.

The pricing for our new plans is available on our PostgreSQL service page.  Remember that trying out Aiven is free: you'll receive US$10 worth of free credits at sign-up, which allows you to run one of our huge new plans for a few hours, or a small instance for a couple of weeks.

Cheers,
Team Aiven

2016-05-12

Help test PostgreSQL 9.6 via Aiven

The first beta of the upcoming PostgreSQL 9.6 major release was announced today with a number of important new features such as parallel queries, enhanced foreign data wrappers and various performance improvements for large databases.

To make it easier to test the new beta as well as to validate your application's compatibility with PostgreSQL 9.6, we've added support for it in Aiven.  When creating a new service in the console you can now select "PostgreSQL 9.6 Beta" as your service type, but note that this is a beta version and there are no guarantees about data durability.

When the PostgreSQL 9.6 final release comes out - expected this September - we'll provide you with one-click, minimal-downtime upgrade functionality allowing you to upgrade your current PostgreSQL 9.5 production databases to the new version with little effort.  We'll also make it possible to "fork" a 9.5 production database into a new 9.6 database, allowing you to perform further validation of the new version without disrupting the production system.

Test PostgreSQL 9.6 in Aiven for free

You can sign up for a free trial of the Aiven Cloud Database service at https://aiven.io/ and try out PostgreSQL 9.6 beta using the US$10 worth of free credits we provide you at sign-up.  Aiven PostgreSQL is available in all Amazon Web Services, Google Cloud, DigitalOcean and UpCloud regions.


Cheers,
Team Aiven

2016-04-28

PostgreSQL cloud backups with PGHoard

PGHoard is the cloud backup and restore solution we're using in Aiven. We started PGHoard development in early 2015 when the Aiven project was launched as a way to provide real-time streaming backups of PostgreSQL to a potentially untrusted cloud object storage.

PGHoard has an extensible object storage interface, which currently works with the following cloud object stores:
  • Amazon Web Services S3
  • Google Cloud Storage
  • OpenStack Swift
  • Ceph's RADOSGW utilizing either the S3 or Swift drivers 
  • Microsoft Azure Storage (currently experimental)

Data integrity

PostgreSQL backups consist of full database backups (basebackups) plus write-ahead logs and related metadata (WAL). Both basebackups and WAL are required to create and restore a consistent database.

PGHoard handles both the full, periodic backups (driving pg_basebackup) and the streaming of the database's write-ahead log.  Constantly streaming WAL as it's generated allows PGHoard to restore a database to any point in time since the oldest basebackup was taken.  This is used to implement Aiven's Database Forks and Point-in-time Recovery as described in our PostgreSQL FAQ.

To save disk space and reduce the amount of data that needs to be sent over the network (potentially incurring extra costs), backups are compressed by default using Google's Snappy, a fast compression algorithm with a reasonable compression ratio. LZMA (a slower algorithm with a very high compression ratio) is also supported.

To protect backups from unauthorized access and to ensure their integrity, PGHoard can also transparently encrypt and authenticate the data using RSA, AES and SHA256.  Each basebackup and WAL segment gets a unique random AES key which is encrypted with RSA.  HMAC-SHA256 is used for file integrity checking.
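
To illustrate the general idea, here is a simplified sketch of this kind of envelope encryption and integrity checking using the Python cryptography library. It is not PGHoard's actual file format, and a real implementation would use separate keys for encryption and for the HMAC:

import os

from cryptography.hazmat.primitives import hashes, hmac
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# The RSA public key is all the backup daemon needs; the private key is
# only required when restoring.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
public_key = private_key.public_key()

def encrypt_blob(data):
    # A fresh random AES key per file, wrapped with the RSA public key.
    aes_key = os.urandom(32)
    wrapped_key = public_key.encrypt(
        aes_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    # Encrypt the (already compressed) file contents with AES in CTR mode.
    nonce = os.urandom(16)
    encryptor = Cipher(algorithms.AES(aes_key), modes.CTR(nonce)).encryptor()
    ciphertext = encryptor.update(data) + encryptor.finalize()
    # HMAC-SHA256 over the encrypted contents for integrity checking.
    mac = hmac.HMAC(aes_key, hashes.SHA256())
    mac.update(nonce + ciphertext)
    return wrapped_key, nonce, ciphertext, mac.finalize()

wrapped_key, nonce, ciphertext, digest = encrypt_blob(b"example WAL segment")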

Restoration is key

As noted in the opening paragraph, PGHoard is a backup and restore tool: backups are largely useless unless they can be restored.  Experience tells us that backups, even if set up at some point, are usually not restorable unless restore is routinely tested, but experience also shows that backup restoration is rarely practiced unless it's easy to do and automate.

This is why PGHoard also includes tooling to restore backups, allowing you to create new master or standby databases from the object store archives.  This makes it possible to set up a new database replica with a single command, which first restores the database basebackup from object storage and then sets up PostgreSQL's recovery.conf to fetch the remaining WAL files from the object storage archive and optionally connect to an existing master server after that.

Preparing PostgreSQL for PGHoard

First, we will need to create a replication user account. We'll just use the psql command-line client for this:

postgres=# CREATE USER backup WITH REPLICATION PASSWORD 'secret';
CREATE ROLE


We also need to allow this new user to make connections to the database. In PostgreSQL this is done by editing the pg_hba.conf configuration file and adding a line something like this:

host  replication  backup  127.0.0.1/32  md5

We'll also need to ensure our PostgreSQL instance is configured to allow WAL replication out from the server and that it has the appropriate wal_level setting. We'll open postgresql.conf and edit or add the following settings:

max_wal_senders = 2  # minimum two with pg_receivexlog mode!
wal_level = archive  # 'hot_standby' or 'logical' are also ok


Finally, since we have modified PostgreSQL configuration files, we'll need to restart PostgreSQL to take the new settings into use by running "pg_ctl restart", "systemctl restart postgresql" or "service postgresql restart", etc., depending on the Linux distribution being used.  Note that it's not enough to "reload" PostgreSQL in case the WAL settings were changed.

Now we are ready on the PostgreSQL side and can move on to PGHoard.

Installing PGHoard

PGHoard's source distribution includes packaging scripts for Debian, Fedora and Ubuntu.  Instructions for building distribution specific packages can be found in the PGHoard README.  As PGHoard is a Python package it can also be installed on any system with Python 3 by running "pip3 install pghoard".

Taking backups with PGHoard

PGHoard provides a number of tools that can be launched from the command-line:
  • pghoard - The backup daemon itself, can be run under systemd or sysvinit
  • pghoard_restore - Backup restoration tool
  • pghoard_archive_sync - Command for verifying archive integrity
  • pghoard_create_keys - Backup encryption key utility
  • pghoard_postgres_command - Used as PostgreSQL's archive_command and restore_command
First, we will launch the pghoard daemon to start taking backups. pghoard requires a small JSON configuration file that contains the settings for the PostgreSQL connection and for the target backup storage. We'll name the file pghoard.json:

{
    "backup_location": "./metadata",
    "backup_sites": {
        "example-site": {
            "nodes": [
                {
                    "host": "127.0.0.1",
                    "password": "secret",
                    "port": 5432,
                    "user": "backup"
                }
            ],
            "object_storage": {
                "storage_type": "local",
                "directory": "./backups"
            }
        }
    }
}


In the above file we just list where pghoard keeps its local working directory (backup_location), our PostgreSQL connection settings (nodes) and where we want to store the backups (object_storage). In this example we'll just write the backup files to a local disk instead of remote cloud object storage.

Then we just need to run the pghoard daemon and point it to our configuration file:

$ pghoard --short-log --config pghoard.json
DEBUG   Loading JSON config from: './pghoard.json', signal: None
INFO    pghoard initialized, own_hostname: 'ohmu1', cwd: '/home/mel/backup'
INFO    Creating a new basebackup for 'example-site' because there are currently none
INFO    Started: ['/usr/bin/pg_receivexlog', '--status-interval', '1', '--verbose', '--directory', './metadata/example-site/xlog_incoming', '--dbname', "dbname='replication' host='127.0.0.1' port='5432' replication='true' user='backup'"], running as PID: 8809
INFO    Started: ['/usr/bin/pg_basebackup', '--format', 'tar', '--label', 'pghoard_base_backup', '--progress', '--verbose', '--dbname', "dbname='replication' host='127.0.0.1' port='5432' replication='true' user='backup'", '--pgdata', './metadata/example-site/basebackup_incoming/2016-04-28_0'], running as PID: 8815, basebackup_location: './metadata/example-site/basebackup_incoming/2016-04-28_0/base.tar'
INFO    Compressed 16777216 byte file './metadata/example-site/xlog_incoming/000000010000000000000025' to 805706 bytes (4%), took: 0.056s
INFO    'UPLOAD' transfer of key: 'example-site/xlog/000000010000000000000025', size: 805706, took 0.003s
INFO    Ran: ['/usr/bin/pg_basebackup', '--format', 'tar', '--label', 'pghoard_base_backup', '--progress', '--verbose', '--dbname', "dbname='replication' host='127.0.0.1' port='5432' replication='true' user='backup'", '--pgdata', './metadata/example-site/basebackup_incoming/2016-04-28_0'], took: 0.331s to run, returncode: 0
INFO    Compressed 16777216 byte file './metadata/example-site/xlog_incoming/000000010000000000000026' to 797357 bytes (4%), took: 0.057s
INFO    'UPLOAD' transfer of key: 'example-site/xlog/000000010000000000000026', size: 797357, took 0.011s
INFO    Compressed 80187904 byte file './metadata/example-site/basebackup_incoming/2016-04-28_0/base.tar' to 15981960 bytes (19%), took: 0.335s
INFO    'UPLOAD' transfer of key: 'example-site/basebackup/2016-04-28_0', size: 15981960, took 0.026s



PGHoard automatically connected to the PostgreSQL database server, noticed that we didn't have any backups and immediately created a new basebackup and started the real-time streaming of WAL files (which act as incremental backups). Each file stored in the backups was first compressed to optimize the transfer and storage costs.

As long as you keep PGHoard running, it will make full backups using the default schedule (once per 24 hours) and continuously stream WAL files.
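
If the default schedule doesn't suit your needs, the basebackup interval and the number of basebackups to retain can be set per backup site in the configuration file. The snippet below sketches what that could look like inside the "example-site" block of pghoard.json; treat the option names as assumptions and check the PGHoard README for the exact, current names:

            "basebackup_interval_hours": 24,
            "basebackup_count": 2,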

Looking at the contents of the "backups" directory, we see that our backups now contain a full database backup plus a couple of WAL files, and some metadata for each of the files:

$ find backups/ -type f
backups/example-site/xlog/000000010000000000000025
backups/example-site/xlog/000000010000000000000025.metadata
backups/example-site/xlog/000000010000000000000026
backups/example-site/xlog/000000010000000000000026.metadata
backups/example-site/basebackup/2016-04-28_0
backups/example-site/basebackup/2016-04-28_0.metadata


Available backups can be listed with the pghoard_restore tool:

$ pghoard_restore list-basebackups --config pghoard.json
Available 'example-site' basebackups:

Basebackup                                Backup size    Orig size  Start time
----------------------------------------  -----------  -----------  --------------------
example-site/basebackup/2016-04-28_0            15 MB        76 MB  2016-04-28T06:40:46Z


Looks like we are all set. Now let's try a restore!

Restoring a backup

Restoring a backup is a matter of running a single command:

$ pghoard_restore get-basebackup --config pghoard.json --target-dir restore-test
Found 1 applicable basebackup

Basebackup                                Backup size    Orig size  Start time
----------------------------------------  -----------  -----------  --------------------
example-site/basebackup/2016-04-28_0            15 MB        76 MB  2016-04-28T06:40:46Z
    metadata: {'compression-algorithm': 'snappy', 'start-wal-segment': '000000010000000000000026', 'pg-version': '90406'}

Selecting 'example-site/basebackup/2016-04-28_0' for restore
Basebackup complete.
You can start PostgreSQL by running pg_ctl -D restore-test start
On systemd based systems you can run systemctl start postgresql
On SYSV Init based systems you can run /etc/init.d/postgresql start


The pghoard_restore command automatically chooses the latest available backup, downloads, unpacks (and decompresses and decrypts, when those options are used) it to the specified target directory. The end result will be a complete PostgreSQL data directory (e.g. something like /var/lib/postgresql/9.5/main or /var/lib/pgsql/data, depending on the distro), ready to be used by a PostgreSQL instance.

There are more command-line options for more detailed control over the restoration process, for example restoring to a particular point in time or transaction (PITR) or choosing whether the restored database will be acting as a master or a standby.

Backup encryption

In order to encrypt our backups, we'll need to create an encryption key pair. PGHoard provides a handy command for automatically creating a key pair and storing it into our configuration file:

$ pghoard_create_keys --key-id example --config pghoard.json
Saved new key_id 'example' for site 'example-site' in 'pghoard.json'
NOTE: The pghoard daemon does not require the 'private' key in its configuration file, it can be stored elsewhere to improve security


Note that in most cases you will want to extract the private key away from the configuration file and store it safely elsewhere away from the machine that makes the backups. The pghoard daemon only needs the encryption public key during normal operation. The private key is only required by the restore tool and the daemon while restoring a backup.

Uploading backups to the cloud

Sending backups to an object storage in the cloud is simple: we just need the cloud's access credentials, and we'll modify the object_storage section of pghoard.json:

            "object_storage": {
                "aws_access_key_id": "XXX",
                "aws_secret_access_key": "XXX",
                "bucket_name": "backups",
                "region": "eu-central-1",
                "storage_type": "s3"
            }


Now when we restart pghoard, the backups are sent to AWS S3 in Frankfurt:

$ pghoard --short-log --config pghoard.json
DEBUG   Loading JSON config from: './pghoard.json', signal: None
INFO    pghoard initialized, own_hostname: 'ohmu1', cwd: '/home/mel/backup'
INFO    Started: ['/usr/bin/pg_receivexlog', '--status-interval', '1', '--verbose', '--directory', './metadata/example-site/xlog_incoming', '--dbname', "dbname='replication' host='127.0.0.1' port='5432' replication='true' user='backup'"], running as PID: 8001
INFO    Creating a new basebackup for 'example-site' because there are currently none
INFO    Started: ['/usr/bin/pg_basebackup', '--format', 'tar', '--label', 'pghoard_base_backup', '--progress', '--verbose', '--dbname', "dbname='replication' host='127.0.0.1' port='5432' replication='true' user='backup'", '--pgdata', './metadata/example-site/basebackup_incoming/2016-04-28_1'], running as PID: 8014, basebackup_location: './metadata/example-site/basebackup_incoming/2016-04-28_1/base.tar'
INFO    Ran: ['/usr/bin/pg_basebackup', '--format', 'tar', '--label', 'pghoard_base_backup', '--progress', '--verbose', '--dbname', "dbname='replication' host='127.0.0.1' port='5432' replication='true' user='backup'", '--pgdata', './metadata/example-site/basebackup_incoming/2016-04-28_1'], took: 0.350s to run, returncode: 0
INFO    Compressed and encrypted 16777216 byte file './metadata/example-site/xlog_incoming/000000010000000000000027' to 799445 bytes (4%), took: 0.406s
INFO    Compressed and encrypted 16777216 byte file './metadata/example-site/xlog_incoming/000000010000000000000028' to 797784 bytes (4%), took: 0.137s
INFO    Compressed and encrypted 80187904 byte file './metadata/example-site/basebackup_incoming/2016-04-28_1/base.tar' to 15982372 bytes (19%), took: 0.417s
INFO    'UPLOAD' transfer of key: 'example-site/xlog/000000010000000000000028', size: 797784, took 0.885s
INFO    'UPLOAD' transfer of key: 'example-site/xlog/000000010000000000000027', size: 799445, took 1.104s
INFO    'UPLOAD' transfer of key: 'example-site/basebackup/2016-04-28_1', size: 15982372, took 4.911s



The restore tool works the same way regardless of where the backups are stored:

$ pghoard_restore list-basebackups --config pghoard.json
Available 'example-site' basebackups:

Basebackup                                Backup size    Orig size  Start time
----------------------------------------  -----------  -----------  --------------------
example-site/basebackup/2016-04-28_1            15 MB        76 MB  2016-04-28T09:39:37Z



PostgreSQL 9.2+ and Python 3.3+ required

Today we released PGHoard version 1.2.0 with support for Python 3.3 and PostgreSQL 9.2, plus enhanced support for handling network outages.  These features were driven by external users; in Aiven we always use the latest PostgreSQL versions (9.5.2 at the time of writing) and access object storages near the database machines.


PGHoard in Aiven.io

We're happy to talk more about PGHoard and help you set up your backups with it.  You can also sign up for a free trial of our Aiven.io PostgreSQL service where PGHoard will take care of your backups.


Cheers,
Team Aiven

2016-04-26

SSL-enabled custom domains in Aiven Grafana and Kibana


We've just rolled out a new feature in Aiven: custom domains with valid SSL certificates for web frontends.

Last week we introduced valid SSL certificates from Let's Encrypt for our Grafana and Kibana (Elasticsearch frontend) services in Aiven.  This allows your browser to immediately recognize and trust the web frontend services you launch from Aiven, which we host at domains like grafana.my-project.aivencloud.com.  Previously the services' certificates were signed by Aiven's own CA, which caused web browsers to display a warning.


With the launch of the Custom Domains feature today, you can create a CNAME in your own domain pointing to your Aiven service.  Once the CNAME is set up you can register it in Aiven using our Web Console and we'll automatically set up a valid SSL certificate for it.  This way you can set up secure services like grafana.example.com and search.example.com in just a few clicks in our console.

Try SSL-enabled custom domains for free

You can sign up for a free trial of our services at Aiven.io and try all of our services with US$10 worth of free credits, including ones with SSL certificates for custom domains.

The SSL certificate feature is available in all Startup, Business and Premium plans for Grafana and Elasticsearch.  If the Let's Encrypt project lifts its SSL certificate creation limits we may be able to provide this service also for Hobbyist plans in the future.

We value your feedback

We are always interested in ways of making our service better. Please send your feedback and suggestions via email, Facebook, LinkedIn or using our support system.

Cheers,

Team Aiven

2016-04-19

Monitoring, metrics collection and visualization using InfluxDB and Grafana

In addition to providing the Aiven service, our crew also does a fair amount of software consulting in the cloud context. A very common topic we are asked to help with is metrics collection and monitoring. Here's a walkthrough of how to use InfluxDB and Grafana for one kind of solution to the problem. We offer both as managed Aiven services for quick and easy adoption.

Case-example

As an example, here's a dashboard screenshot from a pilot project we recently built for a customer:



This particular instance is used to monitor the health of a quality assurance system on an industrial manufacturing line. The system being monitored uses IP-based video cameras coupled with IO triggers to record a JPEG image of the artifacts passing through various processing steps. The resulting dashboard allows verifying at a single glance that the system is working properly.

At the top of the dashboard you'll see a simple reading from the device's temperature sensor. Any large deviation from the norm would be a good early warning of an oncoming hardware fault:




The next plotted metric is the size of the JPEG-compressed image from the imaging device:




Interestingly, this relatively simple metric reveals a lot about the health of both the sensor and any lenses and lighting sources involved. Due to the nature of JPEG encoding, the compressed frame size varies slightly even in rather static scenes, so it makes a good quick overall indicator that the component is running fine and returning up-to-date content.

The two graphs at the bottom track the last update time from each of the camera feeds and each of the IO trigger services respectively:
  

 
Here, we expect each of the cameras to update several times a second. The IO triggers are interrogated in a long-poll mode with a timeout of 15 seconds. These intervals yield natural upper limits for monitoring and alerting purposes. In fact, the left-hand side of the graphs shows two readings that correlate with temporary network glitches.


Building blocks

The visualization and dashboard tool shown above is Grafana. The underlying storage for the telemetry data is InfluxDB. In this case, we utilize Telegraf as a local StatsD-compatible collection point for capturing and transmitting the data securely into the InfluxDB instance. And finally, we use a number of taps and sensors across the network that feed the samples to Telegraf using StatsD client libraries in Node.js, Python and Java, depending on the component.

In this project we are using the Aiven InfluxDB and Aiven Grafana hosted services, but any other InfluxDB / Grafana setup should work more or less the same way.

InfluxDB - The metrics database

We start by launching an InfluxDB service in Aiven:



The service is automatically launched in a minute or so.

InfluxDB is a time-series database with some awesome features:
  • Adaptive compression algorithms allow storing huge numbers of data points
  • Individual metrics data points can be tagged with key=value tags and queried based on them
  • Advanced query language allows queries whose output data requires little or no post-processing
  • It is FAST!


Telegraf - The metrics collector

Next we will need the connection parameters for our InfluxDB instance. The information required (hostname, username, password, etc.) for connecting our Telegraf collection agent to InfluxDB can be found on the service overview page:



We typically run a single Telegraf container per environment. In order to make Telegraf talk to our InfluxDB and to accept StatsD input, we will need to modify its configuration file telegraf.conf a little bit and add the following sections:


    [outputs]
        [outputs.influxdb]
            url = "https://teledb.htn-aiven-demo.aivencloud.com:21950"
            database = "dbb253c1e025704a4494f3f65412b70e30"
            username = "usr2059f5ef88fb46e49bd1f5fd0d464d80"
            password = "password_goes_here"
            ssl_ca = "/etc/telegraf/htn-aiven-demo-ca.crt"
            precision = "s"

    [inputs]
        [inputs.statsd]
            service_address = "127.0.0.1:8125"
            delete_gauges = true
            delete_counters = true
            delete_sets = false
            delete_timings = true
            percentiles = [90]
            allowed_pending_messages = 10000
            percentile_limit = 1000

We want our InfluxDB connection to be secure against man-in-the-middle attacks, so we have included the service's CA certificate in the configuration file. This allows the Telegraf client to verify the identity of the InfluxDB server it connects to. The certificate can be downloaded from the Aiven web console:



Here's an example StatsD code blob for a Node.js component:
    // Send a single gauge reading to the local Telegraf StatsD listener
    var statsd = require('node-statsd');
    var statsd_client = new statsd({
        host: '<telegraf_ip>',
        port: 8125,
    });
    // The metric name "image_size" carries the sensor id as an
    // InfluxDB-style tag (",source=30") that Telegraf knows how to parse
    statsd_client.gauge('image_size,source=30', 48436,
        function(error, bytes) {
            // The UDP datagram has been handed off; close the client socket
            statsd_client.close();
        }
    );
The StatsD UDP protocol uses a super simple textual message format, and sending a metric takes only a few CPU cycles, so even a server with a high request throughput can transmit metrics for every processed request without hurting overall performance. The StatsD receiver in Telegraf parses the incoming metric messages and consolidates them, typically writing data into the metrics database at a much slower pace. This really helps keep both the source software's and the metrics database's load levels under control.
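
To show just how simple the wire format is, here is a minimal Python sketch of ours (not part of the project described above) that sends the same gauge as a single raw StatsD datagram, using only the standard library:

    import socket

    # One gauge sample in StatsD's plain-text format; Telegraf's statsd input
    # also parses the InfluxDB-style ",source=30" tag embedded in the name.
    message = b"image_size,source=30:48436|g"

    # Fire-and-forget UDP datagram to the local Telegraf agent, matching the
    # service_address in the [inputs.statsd] section above.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(message, ("127.0.0.1", 8125))
    sock.close()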

In the above code sample, we use Telegraf's StatsD extension for tagging support with the source=30 parameter. This handy little feature is what allows us to easily slice and display the collected metrics by each sensor, or just plot all metrics regardless of the source sensor. This is one of the killer features of InfluxDB and Telegraf!
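
As a rough illustration of that slicing, here is a sketch of a per-sensor query against InfluxDB. It uses the influxdb Python client library, which is not part of the setup described above, and it reuses the illustrative connection values from the Telegraf configuration; the exact measurement and field names ("image_size" and "value") depend on Telegraf's statsd settings:

    from influxdb import InfluxDBClient  # pip install influxdb

    # Connection details come from the Aiven service overview page; the values
    # below are the same illustrative ones used in telegraf.conf above.
    client = InfluxDBClient(
        host="teledb.htn-aiven-demo.aivencloud.com",
        port=21950,
        username="usr2059f5ef88fb46e49bd1f5fd0d464d80",
        password="password_goes_here",
        database="dbb253c1e025704a4494f3f65412b70e30",
        ssl=True,
        verify_ssl=True,
    )

    # Average JPEG size per camera over the last hour, sliced by the "source" tag
    result = client.query(
        'SELECT MEAN("value") FROM "image_size" '
        'WHERE time > now() - 1h GROUP BY "source"'
    )
    print(result.raw)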

OK, so now we are transmitting metrics from our application through the Telegraf daemon to our InfluxDB database. Next up is building a Grafana dashboard that visualizes the collected data.

Grafana - The dashboard

We launch our Grafana from the Aiven console by creating a new service:



Normally, an InfluxDB instance needs to be added manually as a data source in Grafana; however, in this case we can skip that step, as InfluxDB and Grafana services launched under the same project in Aiven are automatically configured to talk to each other.

We like Grafana a lot because it makes it simple to define visually appealing yet useful graphs, and it integrates well with InfluxDB. Grafana has a user-friendly query builder specifically for InfluxDB, and with a little practice it takes little time to conjure fabulous charts from almost any source data.

The Grafana web URL, username and password are available on the service overview page:




Opening Grafana in the browser, logging in with the credentials from above and defining a simple graph with an InfluxDB query editor... PROFIT!



That's it for now. Getting metrics delivered from your application to a pretty dashboard doesn't take much effort nowadays!

What next?

We use Telegraf, InfluxDB and Grafana rather extensively in our own Aiven monitoring infrastructure. However, we have added a couple more components, such as Apache Kafka, to the stack; that is a topic for an upcoming blog post. Stay tuned! :-)


Hosted InfluxDB and Grafana at Aiven.io

InfluxDB and Grafana are available as part of our Aiven.io service; you can sign up for a free trial at aiven.io.

Have fun monitoring your apps!

Cheers,

Team Aiven