Table partitioning in MySQL NDB Cluster and what’s new (Part III)
What's new in NDB Cluster 7.5 (Contd.)
In Cluster 7.5 the READ_BACKUP and FULLY_REPLICATED table features were added. Both are designed to improve read performance and scalability, and can be set on a per-table basis. They are fully implemented inside MySQL Cluster, and tables using them support all of the normal MySQL Cluster features: secondary unique and ordered indexes, foreign keys, disk-resident columns, replication, etc. The read performance improvements require no special effort to take advantage of: MySQL Cluster automatically chooses the most efficient way to execute reads, whether they are issued over NdbApi or via SQL, and whether they execute in mysqld or are pushed down for parallel execution in the data nodes.
READ_BACKUP:
Prior to 7.5, all Committed Read operations were routed to a primary replica of the table or index fragment to be read. The READ_BACKUP feature enables Committed Read operations to be routed to any replica (primary or backup role) of the table or index fragment to be read. This can improve read latency due to improved read locality, and can improve read throughput due to replica parallelism. The trade-off is an increase in commit latency for writing transactions. Note that this feature does not directly affect the partitioning of a table, but rather affects how transactions are executed on the table.
Setting the READ_BACKUP option on a table increases the theoretical maximum aggregate read throughput on the table by a factor of NoOfReplicas (e.g. doubled for NoOfReplicas=2, tripled for NoOfReplicas=3), as all replicas can be used for reads.
Setting the READ_BACKUP option on a table increases the fraction of table data that can be read locally from 1/nodes to 1/nodegroups, i.e. multiplies it by NoOfReplicas. Reading data locally avoids communication delays and bottlenecks, which can improve transaction latency and throughput.
For a 1 nodegroup cluster, all table and index data can be read locally. For an n>1 nodegroup cluster, 1/n th of the table and index data can be read locally.
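The local-read arithmetic described above can be sketched as follows (a minimal illustration; the function name is ours, not part of any NDB API):

```python
def local_read_fraction(nodes: int, replicas: int, read_backup: bool) -> float:
    """Fraction of table data readable locally on any one data node."""
    nodegroups = nodes // replicas
    # Without READ_BACKUP, only primary fragments are readable locally: 1/nodes.
    # With READ_BACKUP, every local replica (primary or backup) is readable:
    # 1/nodegroups.
    return 1.0 / nodegroups if read_backup else 1.0 / nodes

# 4 data nodes, NoOfReplicas=2 -> 2 node groups
print(local_read_fraction(4, 2, False))  # 0.25
print(local_read_fraction(4, 2, True))   # 0.5
# 1 node group: everything is readable locally with READ_BACKUP
print(local_read_fraction(2, 2, True))   # 1.0
```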
The READ_BACKUP feature was added in version 7.5.4, where users can choose whether to enable it depending on the application's requirements, i.e. read-intensive or write-intensive workloads. From cluster version 8.0.19 onwards, READ_BACKUP is enabled by default.
We can enable READ_BACKUP either through a CREATE TABLE statement or through an ALTER TABLE command, like below:
create table t1(col1 int unsigned not null primary key AUTO_INCREMENT, col2 int unsigned not null, col3 char(255), col4 char(255)) engine=ndbcluster
COMMENT="NDB_TABLE=READ_BACKUP=1";
OR
alter table t1 ALGORITHM=INPLACE,COMMENT="NDB_TABLE=READ_BACKUP=1";
To demonstrate how READ_BACKUP and FULLY_REPLICATED work, we will run some SQL queries on a table and then check the values of two NDB statistics counters, Ndb_api_read_row_count_session and Ndb_api_trans_local_read_row_count_session. We will then compare these values across query runs to find out how many rows were read from which replicas. The first variable (Ndb_api_read_row_count_session) shows the total number of rows returned from the table by a SELECT query, while the second (Ndb_api_trans_local_read_row_count_session) shows how many of those rows were returned from the data node on which the transaction was coordinated.
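As a sketch, the two counters can be inspected from a mysql client connected to an SQL node like this (assuming the example table t1 from above; the counters are session-scoped):

```sql
-- Run the query whose reads we want to account for
SELECT SUM(col2) FROM t1;

-- Inspect the two session counters discussed above
SHOW SESSION STATUS LIKE 'Ndb_api%read_row_count_session';
```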
Let's create an NDB Cluster with the environment below:
- NDB Cluster version (7.5.21)
- 1 Management server
- 4 Data node servers
- 1 mysqld server
- 4 API nodes
- NoOfReplicas = 2
After finishing inserting rows into table ‘t1’, let’s check the values of Ndb_api_trans_local_read_row_count_session and Ndb_api_read_row_count_session, as shown below:
Let’s run a query, i.e. a sum over all the rows, and then check the values of these two status variables again:
From the above image, we can see that the total number of rows returned is ‘16384’, i.e. the total row count of table ‘t1’, while the local reads value shows ‘4000’.
With a 4-data-node cluster with 4 LDMs each, the total number of partitions is 16 (number of data nodes × number of LDMs per node). Each data node holds 4 primary partitions and 4 backup partitions. So the above ‘4000’ reads come from the 4 primary partitions of a single data node.
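The partition layout and the expected local-read count can be sketched with the values from this example (illustrative arithmetic only):

```python
# Values from the example cluster above
data_nodes = 4
ldms_per_node = 4   # LDM threads per data node
replicas = 2        # NoOfReplicas

partitions = data_nodes * ldms_per_node             # 16 partitions in total
primary_per_node = partitions // data_nodes         # 4 primary fragments per node
backup_per_node = primary_per_node * (replicas - 1) # 4 backup fragments per node

rows = 16384
# Without READ_BACKUP, only the coordinator node's primary fragments are local:
expected_local = rows // data_nodes  # 4096, close to the observed ~4000
print(partitions, primary_per_node, backup_per_node, expected_local)
```

The observed local read counts (‘4000’, ‘4067’) hover around this 4096 figure because rows are hash-distributed across partitions rather than split exactly evenly.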
Let’s run one more query and check the values ‘rows_read’ and ‘local_read’.
From the above image, we can see ‘rows_read’ as ‘16384’ and ‘local_read’ as ‘4067’. So even across multiple query runs, the number of rows returned from the local primary partitions stays more or less the same.
Let’s alter the table to enable READ_BACKUP and perform the same operations to see any difference.
Now let’s run the same query and check the values of ‘rows_read’ and ‘local_read’.
From the above, the rows_read value is the same as before, while the local_read value has roughly doubled, i.e. ‘8214’. That means reads are now returned from both replicas, i.e. the primary and the backup replica.
Let’s run one more operation and check the values:
From the above image we can see the same result as before. So with READ_BACKUP enabled, reads can be served from any of the replicas.
FULLY_REPLICATED:
The effect of enabling the FULLY_REPLICATED feature on a table is to have all of the table and index fragments available locally on every data node. The FULLY_REPLICATED feature implies the READ_BACKUP feature, so this means that with the FULLY_REPLICATED option all table and index data can be read locally using Committed Read on every data node.
For a 1 nodegroup cluster, all table and index fragments are available locally in any case, so FULLY_REPLICATED has no effect beyond setting READ_BACKUP.
For n > 1 nodegroup clusters, FULLY_REPLICATED means that all data can be read locally on every data node.
This means that the theoretical aggregate maximum read throughput increases by NoOfNodes over a READ_PRIMARY table, and by NoOfNodeGroups over a READ_BACKUP table.
As with READ_BACKUP, the performance tradeoff here is write performance, as writes must now modify data on every data node in the system.
Additionally, FULLY_REPLICATED tables must store more copies of the table’s data (multiplied by NoOfNodeGroups) when compared to non fully replicated tables.
The FULLY_REPLICATED option is useful for tables that are read often relative to their write rate, and whose data volume is not dominant in a schema. Dimension tables in a star schema can be good candidates for the FULLY_REPLICATED feature, allowing local parallel joins with fact table data in larger, non fully replicated tables.
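As a sketch of this pattern (the table and column names here are hypothetical, not from the original example), a small, read-mostly dimension table can be made fully replicated while the larger fact table keeps the default distribution:

```sql
-- Hypothetical star schema: the small dimension table is fully replicated,
-- the large fact table keeps the default partitioning.
CREATE TABLE dim_product (
  product_id INT UNSIGNED NOT NULL PRIMARY KEY,
  product_name VARCHAR(100) NOT NULL
) ENGINE=ndbcluster COMMENT="NDB_TABLE=FULLY_REPLICATED=1";

CREATE TABLE fact_sales (
  sale_id BIGINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
  product_id INT UNSIGNED NOT NULL,
  amount DECIMAL(10,2) NOT NULL
) ENGINE=ndbcluster;

-- Joins against dim_product can then read the dimension data locally on
-- every data node while the fact table is scanned in parallel.
SELECT f.product_id, SUM(f.amount)
FROM fact_sales f JOIN dim_product d USING (product_id)
GROUP BY f.product_id;
```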
Users can enable the FULLY_REPLICATED option either through a CREATE TABLE statement or through an ALTER TABLE command, like below:
create table t1(col1 int unsigned not null primary key AUTO_INCREMENT, col2 int unsigned not null, col3 char(255), col4 char(255)) engine=ndbcluster
COMMENT="NDB_TABLE=FULLY_REPLICATED=1";
OR
alter table t1 algorithm=copy, comment="NDB_TABLE=FULLY_REPLICATED=1";
When no partition balance is specified during table creation, the default partition balance is used, i.e. FOR_RP_BY_LDM.
So in a cluster with 4 data nodes and 2 replicas, i.e. with 2 node groups, 4 LDMs per node, and FULLY_REPLICATED enabled, the total number of partitions would be (number of data nodes × LDMs per node / number of replicas) = 4 × 4 / 2 = 8.
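The partition-count arithmetic for this fully replicated case can be sketched as follows (values taken from the text):

```python
# Values from the example cluster above
data_nodes = 4
ldms_per_node = 4
replicas = 2
nodegroups = data_nodes // replicas  # 2 node groups

# Partition count for the fully replicated table as computed in the text:
partitions = data_nodes * ldms_per_node // replicas
print(partitions)  # 8, matching the PartitionCount reported by ndb_desc
```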
Let’s create a table ‘t1’ with the FULLY_REPLICATED table option and insert some rows into it.
Let’s check the partition count and table option from ndb_desc:
From the above image, we can see PartitionCount: 8, PartitionBalance: FOR_RP_BY_LDM (the default) and Table options: readbackup, fullyreplicated.
Let’s revisit the example we discussed in the READ_BACKUP case and find out whether anything differs in the FULLY_REPLICATED case.
Let’s alter the same table ‘t1’ to enable FULLY_REPLICATED:
Let’s perform the same operations again:
From the above image, we can see that the ‘local_read’ value is the same as ‘rows_read’, i.e. each node has a full copy of the table data.
Let’s run it one more time to see whether ‘local_read’ changes:
From the above image we can see the result is the same as before. So with a FULLY_REPLICATED table, all the table data is available on every data node.
To know more details about these new table options, please refer to the NDB Cluster reference manual.
This concludes the third part of the table partitioning.
To be continued ....