How to do online configuration changes in MySQL NDB Cluster (Part I)

In this blog, we will discuss how to perform cluster configuration changes while the cluster is up and processing transactions (online).

In MySQL NDB Cluster, configuration data is parsed and distributed by the management server (MGMD) nodes. Users supply an input text file (commonly named config.ini) that describes the cluster topology, resource usage limits and other parameters. The MGMD nodes parse this file, combine it with built-in defaults and serve the resulting configuration to the other node types (data nodes, API nodes) when they connect.
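
The layering of built-in defaults under the user's file can be pictured with a small lookup chain. This is only a sketch of the idea, not how MGMD is implemented: the function name, the two built-in values and the host name are illustrative assumptions.

```python
from collections import ChainMap

# Illustrative built-in defaults only; the real set is defined inside MGMD.
BUILT_IN_DEFAULTS = {"NoOfReplicas": "2", "RequireEncryptedBackup": "0"}


def effective_config(node_section, default_section, built_in=BUILT_IN_DEFAULTS):
    """Return the values one data node would be served.

    Lookup order: the node's own [ndbd] section, then [ndbd default],
    then the built-in defaults.
    """
    return dict(ChainMap(node_section, default_section, built_in))


# Hypothetical sections, as they might appear in config.ini.
ndbd_default = {"NoOfReplicas": "2", "DataMemory": "2G"}
ndbd_node = {"NodeId": "33", "HostName": "data-host-1"}

cfg = effective_config(ndbd_node, ndbd_default)
print(cfg["DataMemory"])              # taken from [ndbd default]
print(cfg["RequireEncryptedBackup"])  # falls through to the built-in default
```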

Reasons for changing configuration might include:

- Increasing resource usage limits (DataMemory, IndexMemory, buffers)
- Adding new configuration parameters, e.g. to enable a new feature
- Removing a configuration parameter that is unsupported in a lower version before a downgrade, e.g. to disable a feature

MySQL Cluster nodes pick up configuration changes when they are restarted, so it is generally necessary to restart each node in turn to roll the new configuration out across the cluster. MySQL Cluster’s high availability features allow the cluster to continue processing queries and transactions during this rollout.

Let’s create a MySQL NDB Cluster with the following environment:
  • MySQL NDB Cluster version (Latest GA version)
  • 1 Management node
  • 4 Data nodes
  • 1 MySQL server
  • Configuration slots for up to 4 additional API nodes
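
A config.ini for this environment might look like the text embedded in the sketch below. The host names and node IDs are placeholders rather than the values used in this post, and the small helper is only there to confirm the node counts.

```python
from collections import Counter

# Hypothetical config.ini; host names and node IDs are placeholders.
CONFIG_INI = """\
[ndbd default]
NoOfReplicas=2

[ndb_mgmd]
NodeId=1
HostName=mgm-host

[ndbd]
NodeId=33
HostName=data-host-1

[ndbd]
NodeId=34
HostName=data-host-2

[ndbd]
NodeId=35
HostName=data-host-3

[ndbd]
NodeId=36
HostName=data-host-4

[mysqld]
NodeId=50
HostName=sql-host

# Four empty slots for additional API nodes
[api]
[api]
[api]
[api]
"""


def count_sections(text):
    """Count how many times each [section] header appears in the text."""
    counts = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            counts[line[1:-1].strip().lower()] += 1
    return counts


counts = count_sections(CONFIG_INI)
print(counts["ndb_mgmd"], counts["ndbd"], counts["mysqld"], counts["api"])
```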

Steps to do online configuration change:

Whichever configuration parameter needs to change, the procedure is the same.

Any change to the cluster configuration file (config.ini) takes effect only after the affected nodes are restarted, but not every parameter needs all node types to be restarted. A parameter that relates mainly to data nodes needs a management node restart followed by a restart of all data nodes; no MySQL server restart is needed. A parameter that relates to the MySQL server needs a management node restart, a restart of all data nodes, and then a restart of all MySQL servers.

Any parameter that is not defined under the [ndbd] or [ndbd default] section of the cluster configuration file (config.ini) can be treated as a MySQL server or API parameter. In other words, any parameter defined under a section such as [mysqld], [api], [tcp], [tcp default] or [shm] can be considered a MySQL configuration parameter. In that case, the MySQL servers must be restarted after the management node and all data nodes have finished restarting.
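
The rule of thumb above can be written down as a small lookup. This sketch only mirrors the rule as stated in this post; the function name and the labels it returns are made up for illustration.

```python
# Sections the post treats as data node sections.
DATA_NODE_SECTIONS = {"ndbd", "ndbd default"}


def nodes_to_restart(section):
    """Return the node types to restart, in order, for a change in `section`."""
    name = section.strip().strip("[]").strip().lower()
    plan = ["management nodes", "data nodes"]
    if name not in DATA_NODE_SECTIONS:
        # Everything else is treated as a MySQL server / API parameter.
        plan.append("MySQL servers")
    return plan


print(nodes_to_restart("[ndbd default]"))
print(nodes_to_restart("[mysqld]"))
```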

Let’s start the cluster with the existing configuration as described above.

As an example, say we plan to add a new configuration parameter, ‘RequireEncryptedBackup=1’, to the configuration file (config.ini). We will follow the online process below to add it.

The ‘RequireEncryptedBackup’ parameter ensures that when a user takes a backup, a password must be passed to the backup command to encrypt the backup files. At present, this parameter is not set in the configuration file, as shown below:

Let’s first verify that, without this configuration parameter, we can take a backup without passing a password to the backup command:

From the above image, we can see that the backup command ran without any password being passed to it. This confirms that the parameter ‘RequireEncryptedBackup’ is not yet set. Now let’s add this parameter to the configuration file (config.ini).
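
In practice the file is usually edited by hand, but the change can be sketched as a small text edit. The sketch assumes the parameter goes under [ndbd default], since this post treats it as a data node parameter; the helper name and the sample file content are hypothetical.

```python
def set_ndbd_default(text, name, value):
    """Return config text with name=value set in the [ndbd default] section.

    An existing setting is updated in place; otherwise the line is added
    directly below the section header (the section is created if missing).
    """
    lines = text.splitlines()
    header = None   # index of the [ndbd default] header line, once found
    section = None  # name of the section the current line belongs to
    for i, raw in enumerate(lines):
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            if section == "ndbd default" and header is None:
                header = i
        elif section == "ndbd default" and "=" in line and not line.startswith(("#", ";")):
            key = line.split("=", 1)[0].strip()
            if key.lower() == name.lower():
                lines[i] = f"{name}={value}"
                return "\n".join(lines) + "\n"
    if header is None:
        lines += ["", "[ndbd default]"]
        header = len(lines) - 1
    lines.insert(header + 1, f"{name}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical starting point.
before = "[ndbd default]\nNoOfReplicas=2\n\n[ndbd]\nNodeId=33\n"
after = set_ndbd_default(before, "RequireEncryptedBackup", "1")
print(after)
```

Running the helper a second time leaves the text unchanged, so it is safe to apply to a file that already has the setting.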

Since this is not a MySQL configuration parameter, we need to restart only the management node and all the data nodes. There is no harm in restarting a MySQL server as well, though.

The cluster restart sequence is shown below:

- Restart the management node with the --reload option.
- Restart all the data nodes one by one (rolling restart), with or without the --initial option depending on the ‘restart type’ of the configuration parameter.
- Restart all MySQL servers if a MySQL server related configuration parameter was added or changed.
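
The first two steps of this sequence can be sketched as a list of commands. This post stops and starts each process by hand; the sketch instead uses the RESTART command of the ndb_mgm client for the data nodes, which is an alternative. The node IDs and file name are placeholders, and a real ndb_mgmd invocation may need further options (for example --configdir) to match how it was originally started.

```python
def restart_commands(mgm_id, data_node_ids, config_file, initial=False):
    """Build the commands for a management node reload followed by a
    rolling restart of the data nodes. Nothing is executed here."""
    cmds = [
        # Stop the management node, then start it again with --reload.
        ["ndb_mgm", "-e", f"{mgm_id} STOP"],
        ["ndb_mgmd", "--reload", f"--config-file={config_file}"],
    ]
    # -i asks for an initial node restart (restart type IN).
    flag = " -i" if initial else ""
    for node_id in data_node_ids:
        # Each data node must be fully started before the next is restarted.
        cmds.append(["ndb_mgm", "-e", f"{node_id} RESTART{flag}"])
    return cmds


# Placeholder IDs: management node 1, data nodes 33 to 36.
for cmd in restart_commands(1, [33, 34, 35, 36], "config.ini"):
    print(" ".join(cmd))
```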

Management servers store the parsed NDB Cluster configuration data internally rather than re-reading the cluster configuration file each time the management server is started.

With the --reload option, the management server is forced to check its internal data store against the cluster configuration file and to reload the configuration if it finds that the configuration file does not match the internal state.

With the --initial option, all internal state is deleted and the management server is forced to re-read the global configuration file and construct a new state.

To learn more about rolling restarts, please refer to the MySQL NDB Cluster documentation.

To learn more about cluster restart types, please refer to the MySQL NDB Cluster documentation.

A few points to note about the restart sequence:

1) If two management nodes are running in the cluster, stop one management node and start it with the --reload option. Once it has finished, stop the second management node and start it with the --reload option.

2) If there are many data nodes in multiple nodegroups, it is possible to restart several data nodes at once. This can reduce the time taken to roll out a configuration change, but cluster redundancy is reduced to a greater extent during the process. For example, with NoOfReplicas=2 and 6 data nodes, we can restart 3 data nodes at once (one in each nodegroup).

3) Most configuration changes require a normal node restart (N), involving recovery from logs and checkpoints. Some configuration changes require an initial node restart (IN), where logs and checkpoints are discarded and the node recovers from live nodes in its nodegroup. All data nodes must be restarted according to the restart type of the configuration parameter, i.e. N (node restart) or IN (initial node restart).

4) If the restart type of a configuration parameter is S (system restart) or IS (initial system restart), it is not possible to change the configuration online. A system restart involves restarting all nodes together, from logs and checkpoints. An initial system restart involves restarting all nodes together with no prior state, so a backup and restore is required to retain the cluster data. Few parameters have these restart types.

5) It is possible to change multiple parameters with one rolling restart; there is no need to do a separate rolling restart for each changed parameter.
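
Points 2 to 4 can be combined into a small planning sketch: given a parameter's restart type and the nodegroup layout, it either returns batches with one node per nodegroup or refuses because the change cannot be made online. The function name and the node IDs are hypothetical.

```python
from itertools import zip_longest

# Restart types that can be rolled out online -> whether --initial is needed.
NEEDS_INITIAL = {"N": False, "IN": True}


def plan_rolling_restart(restart_type, nodegroups):
    """Return (needs_initial, batches) for an online configuration change.

    Each batch holds at most one node from every nodegroup, so every
    nodegroup keeps at least one running replica while a batch restarts
    (assuming NoOfReplicas >= 2).
    """
    rtype = restart_type.upper()
    if rtype in ("S", "IS"):
        raise ValueError(f"restart type {rtype} needs a system restart; not possible online")
    if rtype not in NEEDS_INITIAL:
        raise ValueError(f"unknown restart type {restart_type!r}")
    batches = [
        [node for node in batch if node is not None]
        for batch in zip_longest(*nodegroups)
    ]
    return NEEDS_INITIAL[rtype], batches


# NoOfReplicas=2 with 6 data nodes: three nodegroups of two nodes each.
initial, batches = plan_rolling_restart("N", [[1, 2], [3, 4], [5, 6]])
print(initial, batches)
```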

Step 1: Let’s stop the management node:

Start the management node with the --reload option (as we have added a new configuration parameter):

Step 2: Let’s stop and start all the data nodes:

Since the management node has been restarted, let’s stop and start all the data nodes in turn.
From the below image, we can see that one node from each nodegroup is stopped.

Let’s restart both the stopped nodes without the --initial option:

As discussed before, data nodes can be restarted either one by one, i.e. as a ‘rolling restart’, or by shutting down one node from each nodegroup at a time. Here the data nodes were restarted without the --initial option.

Let’s check the cluster status:

Let’s do the same for the other two nodes, i.e. IDs 34 and 36, and check the cluster status:
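
Before moving on to the next batch, each restarted node should be back in the started state. The check can be sketched by parsing the text printed by `ndb_mgm -e "ALL STATUS"`. The sample text below is an assumed illustration of that output, not a capture from this cluster, and the version strings are placeholders.

```python
import re

# Assumed shape of `ndb_mgm -e "ALL STATUS"` output; not a real capture.
SAMPLE_STATUS = """\
Node 33: started (mysql-8.0.x ndb-8.0.x)
Node 34: starting (Last completed phase 4) (mysql-8.0.x ndb-8.0.x)
Node 35: started (mysql-8.0.x ndb-8.0.x)
Node 36: not connected
"""


def node_states(status_text):
    """Map node ID -> state word(s), e.g. 'started' or 'not connected'."""
    states = {}
    for raw in status_text.splitlines():
        match = re.match(r"Node (\d+): ([^(]+)", raw.strip())
        if match:
            states[int(match.group(1))] = match.group(2).strip()
    return states


def all_started(status_text, node_ids):
    """True only if every listed node reports the 'started' state."""
    states = node_states(status_text)
    return all(states.get(node_id) == "started" for node_id in node_ids)


print(node_states(SAMPLE_STATUS))
print(all_started(SAMPLE_STATUS, [33, 35]))
```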

Now that the cluster has been restarted with the new configuration parameter, let’s verify it by taking one more backup. This time, the backup command must be given a password to encrypt the backup; otherwise it will fail.

From the above image, we can see that the backup failed with an error because no password was passed to the backup command.
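
With the parameter enabled, the backup command has to carry a password. A minimal sketch of building that command for the ndb_mgm client follows; the helper name and the password are placeholders, and passing a password on a command line may expose it to other users on the host, so treat this as an illustration only.

```python
def backup_command(password=None):
    """Build an ndb_mgm invocation that starts a backup.

    With a password, the ENCRYPT PASSWORD='...' clause is added to the
    START BACKUP statement; without one, a plain START BACKUP is built.
    """
    statement = "START BACKUP"
    if password is not None:
        if "'" in password:
            raise ValueError("this sketch does not handle quotes in the password")
        statement += f" ENCRYPT PASSWORD='{password}'"
    return ["ndb_mgm", "-e", statement]


print(backup_command())                # expected to fail once RequireEncryptedBackup=1
print(backup_command("placeholder1"))  # encrypted backup
```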

So we have successfully added a new configuration parameter to a running cluster without shutting down the cluster.

In the above process, we did not restart the MySQL server(s), as the configuration parameter we added is not related to the MySQL server.

If the configuration parameter is related to the MySQL server and many MySQL servers are running in the NDB Cluster, the user should do a rolling restart of all the MySQL servers, i.e. stop and start one MySQL server at a time. These restarts are much faster than data node restarts.

We have taken just one parameter (RequireEncryptedBackup) as an example; the same process should be followed for any parameter that is added to or changed in the configuration file.

This concludes our online configuration changes discussion.

To Be Continued ...

