Tuesday, April 18, 2017

Continuent -- Product Management Newsletter -- April 2017

Tungsten Release 5.1.0
We are nearly ready with our new 5.1.0 release. It is a y-release, and therefore contains some new functionality that we want to share. In particular:
  • A simplified method for sharing tpm diag information
  • Some minor usability improvements to the thl command
  • The first phase in improving our filters, including a standard JS library for the JS filters, and upstream filter requirements (for example, pkey in heterogeneous appliers)
  • Numerous improvements to our core Hadoop support and associated tools
  • Some further fixes to tpm, net-ssh, and improvements to tpm and Ruby compatibility

The code is going through QA right now, so we expect to release in April. For more information, join us for the webinar on the 19th April; details are at the bottom of this newsletter!

Percona Live 2017 - Continuent Talks

Continuent is a Diamond Sponsor for Percona Live in Santa Clara, CA from the 24th-27th April. We have five different sessions if you would like to come along and meet us, or learn more about our products:

  • Continuent is back! But what does Continuent do Anyway? Tuesday, April 25 - 9:20 AM - 9:45 AM, Eero Teerikorpi, CEO and Continuent Customers
  • Real-time Data Loading from MySQL and Oracle into Analytics/Big Data Tuesday, April 25 - 3:50 PM - 4:40 PM at Room 210, MC Brown, VP Products
  • Keynote Panel Wednesday, April 26 - 9:25 AM - 9:45 AM at Main Scenario, MC Brown, VP Products
  • Multi-Site, Multi-Master Done Right Wednesday, April 26 - 1:00 PM - 1:50 PM at Room 210, Matt Lang, Director of Professional Services - Americas
  • Spread the Database Love with Heterogeneous Replication Thursday, April 27 - 3:00 PM - 3:50 PM at Ballroom B, MC Brown, VP Products

We will of course be doing our best to share the info and presentations after the conference, but if you are near Santa Clara, or are visiting the conference anyway, please drop by the booth and meet the team.

Continuent Product Release Webinar on May 3, 2017 at 9am PST/1pm EST/5pm BST

We are holding another product release webinar on the 19th April where we will look back at some of the issues and challenges in 4.0.7, how we are approaching 5.1.0, and our new release schedule.

We’ll also give you a sneak peek at 5.1.0 features and a look forward a few months to the upcoming releases in this quarter, and maybe even some experimental items.

Also, of course, you’ll get a chance to ask questions.

Improving Replicator Status and Usability

One of my own personal frustrations with the replicator (and, at different times, clustering) is that when we are executing large transactions, or trying to track down the error behind a specific transaction, it can be very difficult to see what is going on and why the replicator seems ‘stuck’. Furthermore, the status output from the trepctl command is not very useful here, since it buries the few statistics you really want under a lot of information you don’t need.

So, we’ve done two things that will appear in an upcoming release (probably 5.2.0). Getting information about the THL is the first step, and the best way to do that is to extend the usability of the thl command. There are a few different elements to this, so first we’ve added more ways to select the THL entries. In 5.1.0, we’ve already added the more user-friendly ‘-from’ and ‘-to’ options as aliases for ‘-low’ and ‘-high’:

$ thl list -from 45 -to 65
In 5.2.0 we’ve added -first and -last, and these both default to showing one single transaction (either the first or last stored), or you can supply a number and get the first or last N transactions:

$ thl list -last 5
This helps pin down a transaction more easily, and makes it quicker to view a failing transaction instead of doing ‘thl list -low 575’, then ‘thl list 588’, etc. to get the one you really want.

These positioning requirements also work with other options, so if you extracted or applied a bad or corrupted single transaction (for example, because of a disk full error) and need to remove the last one, you can do:

$ thl purge -last
...without first having to work out which one is the last one!
From a large-transaction standpoint, we’ve also added a summary view of the size of transactions. So when you have a massive single transaction (thousands, hundreds of thousands or millions of rows) over one or more sequences, you can now check the transaction size without trying to judge from the output of ‘thl list’. For example, with a single massive transaction you can get a summary to understand the overall size:

$ thl list -sizes
1384 121 2017-04-07 10:09:31.0 125 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 19072 (152 avg rows per chunk)
1384 122 2017-04-07 10:09:32.0 123 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 19216 (156 avg rows per chunk)
1384 123 2017-04-07 10:09:32.0 123 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 19266 (156 avg rows per chunk)
1384 124 2017-04-07 10:09:32.0 90 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 14209 (157 avg rows per chunk)
Event total: 15421 chunks 0 bytes in SQL statements 2401906 rows

So if you run trepctl status and see that transaction 1384 is being processed, but the replicator does not appear to be doing anything, at least you now know that with 2.4 million rows it’s going to take a while to apply.
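If you want to total these figures in a script rather than by eye, the lines are regular enough to parse. The following is a minimal Python sketch, not part of the product: it assumes the line layout matches the sample output shown here (which is still early beta and may change), and the name summarize_sizes is purely illustrative.

```python
import re

# Sample lines copied from the `thl list -sizes` output shown above.
SAMPLE = """\
1384 121 2017-04-07 10:09:31.0 125 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 19072 (152 avg rows per chunk)
1384 122 2017-04-07 10:09:32.0 123 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 19216 (156 avg rows per chunk)
1384 123 2017-04-07 10:09:32.0 123 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 19266 (156 avg rows per chunk)
1384 124 2017-04-07 10:09:32.0 90 chunks SQL 0 bytes (0 avg bytes per chunk) Rows 14209 (157 avg rows per chunk)
Event total: 15421 chunks 0 bytes in SQL statements 2401906 rows
"""

# Assumed layout: <seqno> <second column> <date> <time> <N> chunks SQL <B> bytes (...) Rows <R> (...)
LINE_RE = re.compile(
    r"^(?P<seqno>\d+) \d+ \S+ \S+ (?P<chunks>\d+) chunks "
    r"SQL (?P<sql_bytes>\d+) bytes .*? Rows (?P<rows>\d+) "
)


def summarize_sizes(text):
    """Sum chunks and rows per sequence number; lines that do not match are skipped."""
    totals = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. the "Event total" summary line
        entry = totals.setdefault(
            int(match["seqno"]), {"lines": 0, "chunks": 0, "rows": 0}
        )
        entry["lines"] += 1
        entry["chunks"] += int(match["chunks"])
        entry["rows"] += int(match["rows"])
    return totals


if __name__ == "__main__":
    for seqno, entry in summarize_sizes(SAMPLE).items():
        print(seqno, entry)
```

Only the four lines quoted above are included in the sample, so the totals it produces cover those lines and not the whole 2.4 million row transaction.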

And, of course, the new position options work too:

$ thl list -sizes -last
...gets you the size of your last transaction.

In terms of the overall replicator status, because of the way the replicator works, it’s difficult to get information mid-transaction. The replicator is a state machine, and status is only reported at the end of the transaction. However, we’ve added a simple indication that the replicator is processing and that we know it’s still working by showing how long we’ve currently been applying an event:

$ trepctl status -name tasks
timeInCurrentEvent    : 379.218

It’s not hugely descriptive, but you can see how long we’ve been processing the current event, and know we haven’t hit an error (or the status would be offline). Match that with a quick ‘thl list -sizes’ on the current event and you can make a quick ‘oh, it’s just a big transaction’ determination.
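For monitoring scripts, that single value is easy to pick out of the status text. Here is a small Python sketch, assuming the output keeps the ‘name : value’ layout shown above; the function names and the 300 second threshold are my own illustrative choices, not anything built into trepctl.

```python
def time_in_current_event(status_text):
    """Return timeInCurrentEvent in seconds, or None if the field is absent."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "timeInCurrentEvent":
            return float(value.strip())
    return None


def is_long_running(status_text, threshold_seconds=300.0):
    """True when the current event has been applying longer than the threshold.

    The threshold is an arbitrary example value; pick one that suits your workload.
    """
    seconds = time_in_current_event(status_text)
    return seconds is not None and seconds > threshold_seconds


if __name__ == "__main__":
    sample = "timeInCurrentEvent    : 379.218"
    print(time_in_current_event(sample), is_long_running(sample))
```

A long-running event flagged this way is a prompt to go and check the transaction size, not an error in itself.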

Two other areas of the trepctl command will also be improved in 5.2.0. First, we’ve built a simplified status command that provides the key information, along with sizes and units, and simplified overall output depending on the current replicator state. This is a bit more human-readable and contains the key information you need to identify the current state:

$ trepctl qs
State: Online for 4398.38s, running for 5879.502s
Latency: 0.531s from DB commit time on ubuntu into THL
        2149.685s since last database commit
Sequence: 19193 last applied, (1362-19193 stored)

Of course if something goes wrong, you get more information:

$ trepctl qs
State: Faulty (Offline) for 3.06s because SEQNO 18012 did not apply, reason:
   Event application failed: seqno=18012 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master

This is a quicker way to see the important information without a lot of excessive detail and static settings, like the current version or timezone setting, which don’t change between installs.

In terms of understanding what the replicator is doing, we’ve improved the statistical output that you could previously get from trepctl status -name tasks and -name stages. The output from these is not very clear; in particular, it doesn’t explain what each stage is actually doing, or indeed what the counters mean. The ‘applyTime’, for example, is actually the total time spent applying since the replicator last went online, not the time for the last apply, and not the time since the replicator was started.

In place of that we have unified the output and made the content clearer:

$ trepctl perf
Statistics since last put online 1222.798s ago
Stage                |      Seqno |     Latency |      Events |  Extraction |   Filtering |    Applying |       Other |       Total
binlog-to-q          |      18009 |      0.315s |        1987 |    829.958s |      0.599s |     30.914s |      0.094s |    861.565s
                                          Avg time per Event |      0.418s |      0.000s |      0.000s |      0.016s
                                           Filters in stage | colnames,pkey,pkey
q-to-thl             |      18009 |      0.315s |        1987 |    734.173s |      2.011s |    124.850s |      0.523s |    861.557s
                                          Avg time per Event |      0.369s |      0.001s |      0.000s |      0.063s
                                           Filters in stage | enumtostring,settostring

Now it’s much easier to see what is configured, the correct sequence, and whether, say, filter execution time is what is ultimately affecting your latency.
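As a rough sanity check on how to read those numbers, the sketch below takes the stage totals from the sample output and works out, per phase, the time per event and the share of the stage total. It assumes the per-event figure is simply phase time divided by event count, which is consistent with the extraction and filtering values in the sample but is my reading of the output and not a documented formula; the dictionary layout and function name are illustrative only.

```python
# Totals copied from the sample `trepctl perf` output above (times in seconds).
STAGES = {
    "binlog-to-q": {
        "events": 1987,
        "extraction": 829.958,
        "filtering": 0.599,
        "applying": 30.914,
        "other": 0.094,
    },
    "q-to-thl": {
        "events": 1987,
        "extraction": 734.173,
        "filtering": 2.011,
        "applying": 124.850,
        "other": 0.523,
    },
}


def phase_breakdown(stage):
    """Return per-phase average seconds per event and share of the stage total."""
    events = stage["events"]
    phases = {name: secs for name, secs in stage.items() if name != "events"}
    total = sum(phases.values())
    return {
        name: {
            "avg_per_event": round(secs / events, 3) if events else 0.0,
            "share": secs / total if total else 0.0,
        }
        for name, secs in phases.items()
    }


if __name__ == "__main__":
    for stage_name, stage in STAGES.items():
        print(stage_name)
        for phase, figures in phase_breakdown(stage).items():
            print(f"  {phase:<10} {figures['avg_per_event']:.3f}s/event "
                  f"{figures['share']:.1%} of stage time")
```

In this sample, filtering accounts for well under one percent of either stage, so filters would not be the first place to look for latency.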

These are still in rough-cut, early beta status, but we would welcome any feedback you have on the output and anything else you’d like to see in it.

Kafka Applier for Tungsten Replicator

We know that many of our customers are interested in a Kafka applier, and some in a Kafka extractor, and the requests and interest are increasing substantially.

We have already done some work on this in terms of understanding the development load, and the basics of how the Kafka applier (in particular) would work. There are some important considerations when looking at the applier and the replicator.

Fundamentally, the replicator expects to replicate entire transactions between systems. That’s fine if each transaction only updates one or two smaller rows in a table, but what happens with larger updates? Depending on the environment and use case, individual transactions can be huge.

Current thoughts go along these lines:
  • Have a configurable full transaction/individual row selection in the applier
  • Allow large single transactions to be split into multiple messages, automatically if possible
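To make the splitting idea concrete, here is one possible shape, sketched in Python. Nothing here reflects a decided design or any existing Tungsten or Kafka API: the message fields (seqno, part, parts, rows), the JSON encoding, and the 1000 row limit are all assumptions chosen for illustration.

```python
import json


def transaction_to_messages(seqno, rows, max_rows_per_message=1000):
    """Yield JSON strings, each carrying at most max_rows_per_message rows.

    Every message records its position (part) and the total number of parts,
    so a consumer could tell when it has seen the whole transaction.
    An empty transaction yields no messages.
    """
    if max_rows_per_message < 1:
        raise ValueError("max_rows_per_message must be at least 1")
    parts = (len(rows) + max_rows_per_message - 1) // max_rows_per_message
    for part, start in enumerate(range(0, len(rows), max_rows_per_message)):
        yield json.dumps(
            {
                "seqno": seqno,
                "part": part,
                "parts": parts,
                "rows": rows[start:start + max_rows_per_message],
            }
        )


if __name__ == "__main__":
    # Hypothetical transaction of 2,500 single-column rows.
    example_rows = [{"id": i} for i in range(2500)]
    for message in transaction_to_messages(1384, example_rows):
        decoded = json.loads(message)
        print(decoded["part"], decoded["parts"], len(decoded["rows"]))
```

Setting the limit to 1 gives the individual-row mode, and a very large limit gives the full-transaction mode, so a single setting could cover both options above.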

Other questions:
  • What formats do we want to write the data in: JSON, CSV?
  • Should we batch items (as we do with our BigData items), or just push a continuous stream?
  • Is any kind of rate-throttling required?
  • What kinds of tags/topics should we use?

There are many other questions still to answer, but if you have thoughts and want to share, please let MC (mc.brown@continuent.com) know.

Wednesday, February 15, 2017

Continuent Product Management Newsletter, February 2017

Welcome to the first newsletter in 2017 for Continuent. 

Hope everybody has had a good start to the new year. We did at Continuent, starting off with another developer meeting while we pin down our plans for the coming year.

In this Continuent Product Newsletter we cover:
General Information
  • London Developer Meeting
  • Continuent Release Schedule
  • New Replicator Appliers
  • Ask Continuent Anything -- on Thursday, 23rd Feb at noon PST/3pm EST
New Releases
  • Release 5.0.1 Coming Soon
  • Release 4.0.7 and Net::SSH

London Developer Meeting

We had a customer meeting in London in January where we got the opportunity to spend a couple of hours with one of our customers and understand their needs and concerns. That was useful, partly because they love our product (which we obviously like to hear!) but also because their ideas for improvements match the changes we are planning. It also gave the dev management team the opportunity to discuss our medium- to long-term goals a little more.

Continuent Release Schedule

One of the many decisions we made at the London meeting was to change our release schedule. Our new schedule is designed to get fixes, updates, and new features into customer hands as quickly as possible. To achieve that, we’ve started by specifying our release levels first. They follow the traditional x.y.z model, where:

  • Z-releases contain bug fixes and minor improvements
  • Y-releases contain all previous bug fixes and mid-level functionality improvements and new features
  • X-releases will contain major new functionality, for example a new replicator source, cluster solution, or major improvements to core components.
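In version-number terms, the level of a release is just the leftmost component that changed. A minimal Python sketch of that rule follows; the function name is illustrative, and it assumes plain three-part numeric versions such as 5.1.0.

```python
def release_level(previous, current):
    """Return 'X', 'Y', or 'Z' depending on which version component changed first."""
    prev = tuple(int(part) for part in previous.split("."))
    curr = tuple(int(part) for part in current.split("."))
    if len(prev) != 3 or len(curr) != 3:
        raise ValueError("expected versions in x.y.z form")
    if curr <= prev:
        raise ValueError("current version must be newer than previous")
    if curr[0] != prev[0]:
        return "X"
    if curr[1] != prev[1]:
        return "Y"
    return "Z"


if __name__ == "__main__":
    print(release_level("4.0.6", "4.0.7"))  # Z
    print(release_level("5.0.1", "5.1.0"))  # Y
    print(release_level("4.0.7", "5.0.0"))  # X
```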

In line with that basic overview, we’re aiming for the following schedule:

  • Z-releases will happen every 6 weeks
  • Y-releases will happen every 3 months
  • X-releases will happen at least once a year

We’ll also be stricter about our version support process. All releases will be supported for a maximum of two years from each Z-release. All existing customers and versions will be supported up until the end of this year (2017). After that, we’ll start enforcing the time limits. Since we will have a new major version out every year, a customer should never be more than two years, and therefore two major versions, behind the current release.

More importantly overall, of course, it should mean that our customers can plan their updates knowing when the next release will come out. It should also mean that if you are waiting for a fix after reporting a bug, and we haven’t given you an explicit patch, you should have to wait a maximum of six weeks before the next release.
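For planning purposes, the two time limits above can be turned into dates. The Python sketch below does that under simple assumptions: the six-week interval and two-year support window are taken from the schedule above, while the release date used in the example is hypothetical and the function names are my own.

```python
from datetime import date, timedelta

Z_RELEASE_INTERVAL = timedelta(weeks=6)  # target gap between Z-releases
SUPPORT_YEARS = 2                        # maximum support window per Z-release


def latest_next_z_release(last_z_release):
    """Latest expected date of the next Z-release under the six-week target."""
    return last_z_release + Z_RELEASE_INTERVAL


def support_ends(z_release_date):
    """Date on which the two-year support window for a Z-release closes."""
    try:
        return z_release_date.replace(year=z_release_date.year + SUPPORT_YEARS)
    except ValueError:
        # Released on 29 February and the target year is not a leap year.
        return z_release_date.replace(year=z_release_date.year + SUPPORT_YEARS, day=28)


if __name__ == "__main__":
    released = date(2017, 3, 1)  # hypothetical release date, for illustration only
    print(latest_next_z_release(released))
    print(support_ends(released))
```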

New Replicator Appliers

Continuent currently offers the following heterogeneous replication solutions:
  • Extraction of the real-time transaction feed from MySQL, AWS RDS, and Oracle
  • Applying in real time into the following databases: Oracle, MySQL, Hadoop, Redshift, and Vertica

One of the other topics that came up at the meeting was the need for additional appliers into a variety of other databases and targets, including, but not limited to, Kafka, Flume, Couchbase, MongoDB (we support it already, but improvements have been requested) and some improvements to our Hadoop applier environment. We’re open to further requests and suggestions and I’m collecting information right now. If you want to discuss it, please contact me (mc.brown@continuent.com) directly.

Online Q&A -- Ask Continuent Anything

Eric Stone, our COO, will be hosting an ‘Ask Continuent Anything’ session on Thursday, 23rd Feb at noon PST/3pm EST.

You can ask us anything here: technical, business, product specifics, product requests, even history lessons. We’ll try to answer any questions you have, and if we can’t provide an answer, we’ll follow up with you after the session. We’ll include a summary of those questions in the next newsletter.

Please join the meeting from your computer, tablet or smartphone:

You can also dial in using your phone.
United States: +1 (571) 317-3116; France: +33 (0) 170 950 590; Germany: +49 (0) 692 5736 7206; United Kingdom: +44 (0) 20 3713 5011
Access Code: 548-812-605

Continuent Release 4.0.7 and Net::SSH

Release 4.0.7 should be out within a week of this newsletter and is going through final testing right now. We’ll announce it accordingly when it’s ready.

Before the end of last year, we released 4.0.6 which contained many fixes, but most importantly tried to address the issues with compatibility of the Ruby Net::SSH modules used both during staging installs and within some of the command-line tools.

Unfortunately, this proved to be more complex than we expected. To summarize (from the Net::SSH docs):

  • Ruby 1.8.x is supported up until the net-ssh 2.5.1 release.
  • Ruby 1.9.x is supported up until the net-ssh 2.9.x release.
  • Current net-ssh releases require Ruby 2.0 or later.

This presents a problem when we are trying to work with a wide range of Ruby versions, made more complex by the fact that even relatively new releases of Linux distributions still ship with old (and outdated) versions of Ruby. We originally bundled Net::SSH to make the distribution easier to work with. Unfortunately, customers using a distribution that includes, say, Ruby 1.9.x as standard (Ubuntu, including 16.10 LTS) then have compatibility issues.

To address that, in both 4.0.7 and 5.0.1 we are no longer including Net::SSH. Instead, you’ll need to install the right Net::SSH version according to your installed Ruby and Linux distribution for the Tungsten tools to use. Full instructions will be included in the release notes.
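Until the release notes are available, the compatibility summary above can be expressed as a simple lookup. This Python sketch only encodes the three rules quoted from the Net::SSH docs; the function name and the wording of the returned strings are illustrative, and it is no substitute for the instructions in the release notes.

```python
def net_ssh_constraint(ruby_version):
    """Map a Ruby version string (e.g. '1.9.3') to the net-ssh range quoted above.

    Returns None for Ruby versions older than 1.8, which the summary does not cover.
    """
    major, minor = (int(part) for part in ruby_version.split(".")[:2])
    if major >= 2:
        return "current net-ssh releases"
    if (major, minor) == (1, 9):
        return "net-ssh up to 2.9.x"
    if (major, minor) == (1, 8):
        return "net-ssh up to 2.5.1"
    return None


if __name__ == "__main__":
    for version in ("1.8.7", "1.9.3", "2.3.1"):
        print(version, "->", net_ssh_constraint(version))
```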

Continuent Release 5.0.1 Coming Soon

As per the above release schedule, and our promise in the last product newsletter, release 5.0.1 is going through final testing right now and we hope to release it within a week of this newsletter. It contains a variety of fixes and improvements from the 5.0.0 release, while making it simpler and easier to install and upgrade from previous versions.  

NOTE: Release 5.0.1 will also be our standard expected release for all customers going forward.  

Wednesday, January 4, 2017

Continuent Product Management Newsletter, December 2016

Welcome everybody to the new Continuent. In case you have somehow missed it, Continuent has been split off from VMware and is now its own independent company.

This newsletter is the first of a regular, monthly communication giving you the inside information on what is going on here at Continuent with respect to product development.

In this Continuent December 2016 Newsletter we cover:
  • Re-introduction
  • Webinar from December 7th: recording available
  • Continuent Road Map
  • Customer Advisory Board and other customer interactions
  • Release 4.0.6 is here!


The creation of a new Continuent Ltd also means that we have full control over our products and features, and we also know that our customers and users would like to know a lot more about what we are working on, what will come up in the future, and better information on when products will be released, and updated.

As VP/Products, I (MC Brown) will be keeping you up to date. Many of you I have met or spoken to over the years, whether during sales calls, conferences, or support calls. Some of you I have yet to meet or speak to, but I’m sure we will have lots of opportunities to do so over the coming months and years.

In the same way as the old Continuent (and the group within VMware), we are a well-spread team, which has the advantage of helping us provide 24/7 support coverage. It also means that we like to get together regularly for some face-to-face time and meetings so that we can agree on our plans, and that’s what we did when the company first started.

Road Map Meeting

Back in October, our brand new team met up in Boston, MA, USA to kick off the company, choose our new path, and discuss the pain points and issues most affecting our customers and what we would focus on in the short, medium, and longer term.

We discussed some really good ideas as part of that meeting, and we also built up a tentative plan for what we are doing. I’ll be describing and discussing that in more detail in the coming months, but here are some of the highlights and themes that came out of our discussions:

  • We hate to do this, and hopefully it will be the last time, but we’re renaming the products. In short, we have two products:
    • Tungsten Replicator (previously known as VMware Replication for MySQL, Oracle, and Data Analytics)
    • Tungsten Clustering (previously known as Continuent Tungsten and VMware Clustering for MySQL)
  • Release v4.0.6; this release has been in production for some time, but obviously during the changes at VMware earlier this year, we didn’t release. This is our top priority (see below).
  • Release v5.0.1; this release will include our new Oracle replication capability (as did v5.0.0) within the replicator, and end-to-end security improvements for replication and clustering. We’ll be simplifying this, and removing ‘security by default’.
  • We’ve agreed to some medium-term goals for the product:
    • Ensure we have full MySQL 5.7 support as quickly as possible, including support for new datatypes
    • Move to a shorter, and fixed, release schedule, so we get more releases, and more frequent fixes and improvements.  
    • Support for Java 8
    • New OS version support (such as CentOS 7, SUSE and others)
    • Reduce the amount of ‘noise’ in the logs so that it’s easier to identify real issues
    • Improve the quality of debug information so that we can fix and resolve bugs quicker

Furthermore, we will of course be keeping you as customers more in the loop on exactly what we are doing.

Regular Talks, Customer Advisory Board, and Customer Portal

A few things are going to change in the short term - specifically, all of our customers will get an email like this from me every month letting you know what is going on. I’m going to try and make the emails as informative and lightweight as possible.

In addition to this, I will personally be doing three more things over the coming months:
  • Calling and speaking to some of you directly. I know some of you have already had health check calls, and these will continue, but I also want to speak to as many of our top and key customers as I can. As time goes on, we’ll repeat the process as new products and functionality arrive and work continues. Your needs will change, and we want to work with you to achieve the results you need.
  • Setting up a customer panel where a selection of our customers will be able to talk to us, and to each other, about the way they use our software, what they want and need for the future, and how we can help them with their replicating and clustering needs.
  • Setting up a new customer-focused website where you can all get info on what’s happening, discuss things, and get more detail on the features and functionality we are building.

The aim here is not to overload you with information, but we do specifically want you to be more involved with the company and product features and have more freedom of communication with us so we can get the products and functionality you want.

Webinar - Continuent Tungsten 4.0.6

On the 8th December, I held a Webinar where I went over the new features and functionality of our first release as the new Continuent, v4.0.6. It includes some of the information I’ve covered in this email, as well as more detail on the specifics of the release and our upcoming roadmap and other plans.

If you’d like to watch the Webinar again, we’ve made a version available for download and viewing here: http://continuent-files.s3.amazonaws.com/Continuent_Release_Launch-20161208.mp4

The release includes a number of important bug fixes and features, including:

  • Basic MySQL 5.7 support; this means we now operate correctly against MySQL 5.7, including during installation. But this release specifically does not support the new MySQL 5.7 data types (JSON, or virtual/generated columns). We’re working on adding full support for this right now.
  • Fixes to installation that address some minor compatibility issues with MySQL 5.7
  • Fixes to the replicator in terms of reading and processing binary logs
  • Added more intensive checks for non-InnoDB tables and MySQL features

Thank You

Thanks for reading!

If you have any questions, please don’t hesitate to get in contact with me directly at mc.brown@continuent.com.


Coming Up

In the next Continuent January 2017 Newsletter we will cover:
  • Details on Continuent 5.0.x Release
  • Improved Documentation
  • Continuent Online Training series
  • Continuent to expand to offer Remote-DBA services