Table Partitioning In MySQL NDB Cluster And What’s New (Part II)

Whats new in NDB Cluster 7.5 version

In this version, users have more flexible ways of table partitioning rather than the default way thru ldm. Now user can partition the table either by node or by ldm. There are 4 different ways of table partitioning supported, these are:

  • FOR_RP_BY_NODE
  • FOR_ RA_BY_NODE
  • FOR_RP_BY_LDM (Default)
  • FOR_RA_BY_LDM
    • FOR_RA_BY_LDM_X_2
    • FOR_RA_BY_LDM_X_3
    • FOR_RA_BY_LDM_X_4
From the above RA is for Read from any replica i.e either from Primary replica or backup replica and RP is for Read from Primary replica only. The above options user can give either thru create table or from alter table sql statement in the COMMENT section like below.

mysql> create table t1(col1 int unsigned not null primary key AUTO_INCREMENT, col2 int unsigned not null, col3 char(255)) engine=ndbcluster
COMMENT="NDB_TABLE=PARTITION_BALANCE=FOR_RP_BY_NODE";

OR

mysql> alter table t1 COMMENT="NDB_TABLE=PARTITION_BALANCE=FOR_RA_BY_NODE";

The FOR_RA and FOR_RP exist because:


- For most tables, the primary replicas has more ‘work’ associated with it than the backup replicas, so for balance, we want all LDMs to have an equal share of primary and backup replicas.

- For tables where the backup replica is read, it may be that the backup replica has more of a balance of the ‘work’ associated with it (as it can service reads), so it may be that we get balance without needing every LDM to have a primary replica. In this case we need fewer partitions e.g.
(Number of partitions = number of LDMs * number of data nodes / number of replicas).

The BY_NODE variants exist for tables where it is not considered necessary to split the table data across multiple LDMs within a node. Probably this would mostly be e.g. for fully replicated tables.


FOR_RP_BY_NODE:

In this scheme, one partition will be created per node. So in a cluster with 4 nodes, with this partition type, the total number of partition would be number of nodes multiply by number of partitions i.e. 4 * 1 = 4. Let’s look at the below images:

Let’s create a database ‘test1’, a table ‘t1’ with FOR_RP_BY_NODE table option and then insert few rows into it.


Let’s check the partition count from ndb_desc:


From the above image, we can see that PartitionCount: 4 and PartitionBalance: FOR_RP_BY_NODE.

FOR_RA_BY_NODE:

In this partition type, one partition will be created per node group. Read can happen from any of the replicas. So in a cluster with 4 data nodes, 2 replicas i.e. with 2 NodeGroups. the total number of partition would be total number of partitions multiply by total number of groups i.e. 1 * 2 = 2. Lets look at the below images:

Let’s alter the table ‘t1’ that we have created above to a new table option FOR_RA_BY_NODE.


Let’s check the partition count from ndb_desc:


From the above image, we can see that PartitionCount: 2 and PartitionBalance: FOR_RA_BY_NODE.

FOR_RP_BY_LDM:

In this partition type, only one partition will be created per LDM in each node. With a 4 data nodes cluster and 4 LDMs per node then the total number of partitions would be 4 * 4 = 16 partitions i.e. (number of LDMs per node * total number of data nodes). This is the default partition scheme in NDB cluster 7.5 version.

Let’s alter the table ‘t1’ that we have created above to a new table option FOR_RP_BY_LDM.


Let’s check the partition count from ndb_desc:


From the above image, we can see that PartitionCount: 16 and PartitionBalance: FOR_RP_BY_LDM.

FOR_RA_BY_LDM:

In this partition type, only one partition is created per LDM in each nodegroup. So the total number of partitions would be number of LDMs multiplied by number of nodegroups. With a 4 data nodes cluster, 4 LDMs per node with 2 replicas i.e. with 2 NodeGroups, then the total number of partitions would be 2 * 4 = 8.

Let’s alter the table ‘t1’ that we have created above to a new table option FOR_RA_BY_LDM.


Let’s check the partition count from ndb_desc:


From the above image, we can see that PartitionCount: 8 and PartitionBalance: FOR_RA_BY_LDM.

FOR_RA_BY_LDM_X_2:


In this partition type, there will be 2 partitions per LDM in each nodegroup. With a 4 data nodes cluster, 4 LDMs per node with 2 replicas i.e. with 2 NodeGroups, then the total number of partitions would be 2 * 4 * 2 = 16.

FOR_RA_BY_LDM_X_3:

In this partition type, there will be 3 partitions per LDM in each nodegroup. With a 4 data nodes cluster, 4 LDMs per node with 2 replicas i.e. with 2 NodeGroups, then the total number of partitions would be 2 * 4 * 3 = 24.

FOR_RA_BY_LDM_X_4:

In this partition type, there will be 4 partitions per LDM in each nodegroup. With a 4 data nodes cluster, 4 LDMs per node with 2 replicas i.e. with 2 NodeGroups, then the total number of partitions would be 2 * 4 * 4 = 32.

As stated before, even though NDB cluster supporting various types of partitioning scheme, unless application demand, user should always follow automatic (implicit) partitioning.

To know more details on these new variables, please refer to the cluster reference manual here.

This concludes the second part of the table partitioning.

                                            
                                                                                                          To be continued ...


Comments

Popular posts from this blog

MySQL Cluster Self Healing And No Single Point Of Failure Feature

MySQL NDB Cluster Backup & Restore In An Easy Way

Cluster Installation On Physical Hosts: