MySQL Cluster a 5 9s (99.999%) Database

MySQL Cluster:

MySQL Cluster is a real time, ACID-compliant transactional database. It is a combination of MySQL server and the NDB storage engine. Data is cached in memory and durably stored on disk. Indexed columns are always kept in memory while non-indexed columns can be kept in memory or disk. It was mainly designed for telecom databases with 99.999% availability and high performance. Unlike simpler sharded systems, MySQL Cluster transactions can transparently query and update all data in the system.

Key features of MySQL Cluster:

  • MySQL Cluster is designed using a shared nothing architecture
  • Support for large database sizes
    • With all columns kept in memory, can store upto few terabytes
    • With some columns kept on disk, can store upto few petabytes
  • Supports read and write transactions:
    • during upgrade, scaling out, adding new columns/index to the
      tables, backup, add new nodes, add/drop FK etc
  • Survives multiple node failures while writes happen.
  • Support for automated node failure detection and automated 
    recovery after node failure.
  • Support for 144 data nodes (version 8.0.18 or later).
  • Support for multiple levels of replication:
    • Synchronous replication inside the cluster (replica)
    • Asynchronous replication between cluster (Source -> Replica)
  • Support for both SQL and NoSQL (NDB API) i.e. in both ways user
    applications can interact with the same data.
  • Support for online scaling with no down time, i.e. cluster can be scaled 
    while transactions are going on.
  • Support for automatic data partitioning based on cluster architecture
    selected by the user.

Cluster architectural diagram:

Components of MySQL cluster in nutshell:

From the above architectural diagram, one can see that there are three 
types of nodes exist. These are:
  • Management node(s)
  • Data nodes
  • API nodes

Management nodes:

This node has a number of functions including:
  • Handling the cluster configuration file called ‘config.ini’ and serving
    this info to other nodes on request.
  • Serves cluster address and port information to clients.
  • Gathers and records aggregated cluster event logs. 
  • Provides a cluster status query and management service, available to
    users via a Management client tool.
  • Acts as an arbitrator in certain network partition scenarios.
Management nodes are essential for bootstrapping a system and 
managing it in operation, but they are not critical to transaction
processing, which requires only Data nodes and Api nodes.

Since this server has limited responsibilities it does not need a lot of
resources to run.

Data nodes:

These are the heart of MySQL cluster storing the actual data and indexes,
and processing distributed transactions. Almost all of the cluster 
functionality are implemented by these nodes. Data nodes are logically
grouped into nodegroups. All data nodes in a nodegroup (up to four) 
contain the same data, kept in sync at all times. Different nodegroups
contain different data. This allows a single cluster to provide high
avilability and scale out of data storage and transaction processing
capacity. Data nodes are internally scalable and can make good use of
machines with large resources like CPU, RAM, Disk capacity etc.

API nodes:

Each API node connects to all of the data nodes. API nodes are the point of access to the system for user transactions. User transactions are defined and sent to data nodes which process them and send results back. The most commonly used API node is MySQL server (mysqld) which allows SQL access to data stored in the cluster.
There are a number of different API node interfaces MySQL cluster supports like C++ NDB API, ClusterJ (for java applications), Node.js (for java script applications) etc.

Use cases of MySQL Cluster:
MySQl cluster was initially designed for telecom databases. After years of 
improvement, it is now used in many other areas like online gaming, 
authentication services, online banking services, payment services, 
fraud detection and many more.


NDB is known for its high performance:
  • 20M writes/second
  • 200M reads/second
  • 1B updates/minute in benchmarks


Like every database, MySQl Cluster has some limitations which includes:
  • Only READ COMMITTED transaction isolation level is supported
  • No support for save point 
  • GTID (Global Transaction Identifiers) not supported
  • No schema changes (DDL) allowed during data node restart
For more information about MySQL NDB Cluster, Please check the official
documentation site.


Popular posts from this blog

MySQL Cluster Self Healing And No Single Point Of Failure Feature

Cluster Installation On Physical Hosts:

MySQL NDB Cluster Backup & Restore In An Easy Way