MySQL Cluster is a high-availability, high-redundancy version of MySQL
adapted for the distributed computing environment. It uses the NDB
Cluster storage engine to enable running several MySQL servers in a cluster.
This storage engine is available in binary releases beginning with MySQL-Max 4.1.3. Beginning with MySQL 4.1.10a, it is also available in RPMs compatible with most modern Linux distributions. (Note that both the mysql-server and mysql-max RPMs must be installed in order to have MySQL Cluster capability.)
The operating systems on which MySQL Cluster is currently available are Linux, Mac OS X, and Solaris. We are working to make Cluster run on all operating systems supported by MySQL, including Windows.
This chapter represents a work in progress, and its contents are subject to revision as MySQL Cluster continues to evolve. Additional information regarding MySQL Cluster can be found on the MySQL AB Web site at http://www.mysql.com/products/cluster/.
You may wish to make use of two additional online resources provided by MySQL AB:
Answers to some commonly asked questions about Cluster may be found in the section 16.9 MySQL Cluster FAQ. If you are new to MySQL Cluster, you may also find our Developer Zone article How to set up a MySQL Cluster for two servers to be helpful.
MySQL Cluster is a technology which enables clustering of in-memory databases in a shared-nothing system. The shared-nothing architecture allows the system to work with very inexpensive hardware, and without any specific requirements on hardware or software. It also does not have any single point of failure, because each component has its own memory and disk.
MySQL Cluster integrates the standard MySQL server with an in-memory clustered storage engine called NDB. In our documentation, the term NDB refers to the part of the setup that is specific to the storage engine, whereas MySQL Cluster refers to the combination of MySQL and the new storage engine.
A MySQL Cluster consists of a set of computers, each running a number of processes including MySQL servers, storage nodes for NDB Cluster, management servers, and (possibly) specialised data access programs. The relationship of these components in a cluster is shown here:
All these programs work together to form MySQL Cluster. When data is stored in the NDB Cluster storage engine, the tables are stored in the storage nodes for NDB Cluster. Such tables are directly accessible from all other MySQL servers in the cluster. Thus, in a payroll application storing data in a cluster, if one application updates the salary of an employee, all other MySQL servers that query this data can see this change immediately.
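For example, in such a payroll application an update made through one SQL node is immediately visible to queries issued through any other SQL node. (The payroll.employee table and its columns below are hypothetical, used only to illustrate the point.)

mysql> UPDATE payroll.employee SET salary = salary * 1.05 WHERE emp_id = 1042;   # issued on one SQL node
mysql> SELECT salary FROM payroll.employee WHERE emp_id = 1042;                  # issued on another SQL node; returns the new salary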
The data stored in the storage nodes for MySQL Cluster can be mirrored; the cluster can handle failures of individual storage nodes with no other impact than that a small number of transactions are aborted due to losing the transaction state. Since transactional applications are expected to handle transaction failure, this should not be a source of problems.
By bringing MySQL Cluster to the Open Source world, MySQL makes clustered data management with high availability, high performance, and scalability available to all who need it.
NDB is an in-memory storage engine offering high-availability and data-persistence features.
The NDB storage engine can be configured with a range of failover and load-balancing options, but it is easiest to start with the storage engine at the cluster level. MySQL Cluster's NDB storage engine contains a complete set of data, dependent only on other data within the cluster itself.
We will now describe how to set up a MySQL Cluster consisting of an NDB storage engine and some MySQL servers.
The cluster portion of MySQL Cluster is currently configured independently of the MySQL servers. In a MySQL Cluster, each part of the cluster is considered to be a node.
Note: In many contexts, the term "node" is used to indicate a computer, but when discussing MySQL Cluster it means a process. There can be any number of nodes on a single computer, for which we use the term cluster host.
There are three types of cluster nodes, and in a minimal MySQL Cluster configuration, there will be at least three nodes, one of each of these types:
The management (MGM) node: The role of this type of node is to manage the other nodes within the MySQL Cluster, such as providing configuration data, starting and stopping nodes, running backup, and so forth. Because this node type manages the configuration of the other nodes, a node of this type should be started first, before any other node. An MGM node is started with the command ndb_mgmd.

The storage node: This type of node stores the cluster's data. A storage node is started with the command ndbd.

The SQL node: This is a node that accesses the cluster data. In the case of MySQL Cluster, an SQL node is a traditional MySQL server that uses the NDB Cluster storage engine. An SQL node is typically started with the command mysqld --ndbcluster, or simply by using mysqld with ndbcluster added to `my.cnf'.
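As a rough sketch of the startup sequence (assuming the binaries are in the shell's path and that the management node's configuration file is in the default location used later in this chapter), the three node types are typically started like this:

shell> ndb_mgmd -f /var/lib/mysql-cluster/config.ini    # on the management host
shell> ndbd                                             # on each storage node host
shell> mysqld --ndbcluster &                            # on each SQL node host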
Configuration of a cluster involves configuring each individual node in the cluster and setting up individual communication links between nodes. MySQL Cluster is currently designed with the intention that storage nodes are homogenous in terms of processor power, memory space, and bandwidth. In addition, in order to provide a single point of configuration, all configuration data for the cluster as a whole is located in one configuration file.
The management server (MGM node) manages the cluster configuration file and the cluster log. Each node in the cluster retrieves the configuration data from the management server, and so requires a way to determine where the management server resides. When interesting events occur in the storage nodes, the nodes transfer information about these events to the management server, which then writes the information to the cluster log.
In addition, there can be any number of cluster client processes or applications. These are of two types:
This section is a "How-To" in which we will describe the basics for how to plan, install, configure, and run a viable MySQL Cluster. Unlike the example in section 16.4 MySQL Cluster Configuration, the result of following the guidelines and procedures outlined below should be a usable MySQL Cluster which meets minimum requirements for availability and safeguarding of data.
In this section, we will cover: hardware and software requirements; networking issues; installation of MySQL Cluster; configuration issues; starting, stopping, and restarting the cluster; loading of a sample database; and performing queries.
This How-To makes the following assumptions:
Node | IP Address |
Management (MGM) node | 192.168.0.10 |
MySQL server (SQL) node | 192.168.0.20 |
Storage (NDBD) node "A" | 192.168.0.30 |
Storage node "B" | 192.168.0.40 |
While we refer to a Linux operating system in this How-To, the instructions and procedures that we provide here should be easily adaptable to either Solaris or Mac OS X. We also assume that you already know how to perform a minimal installation and configuration of the operating system with networking capability, or that you are able to obtain assistance in this elsewhere if needed.
We discuss MySQL Cluster hardware, software, and networking requirements in somewhat greater detail in the next section. (See section 16.3.1 Hardware, Software, and Networking.)
One of the strengths of MySQL Cluster is that it can be run on commodity hardware and has no unusual requirements in this regard, other than for large amounts of RAM, due to the fact that all live data storage is done in memory. (Note that this is subject to change, and that we intend to implement disk-based storage in a future MySQL Cluster release.) Naturally, multiple and faster CPUs will enhance performance. Memory requirements for Cluster processes are relatively small.
The software requirements for Cluster are also modest. Host operating systems
do not require any unusual modules, services, applications, or configuration to
support MySQL Cluster. For Mac OS X or Solaris, the standard installation is
sufficient. For Linux, a standard, "out of the box" installation should be all
that is necessary. The MySQL software requirements are simple: all that is
needed is a production release of MySQL-max 4.1.3 or newer; you must use the
-max
version of MySQL in order to have Cluster support. It is not
necessary to compile MySQL yourself merely to be able to use Cluster. In this
How-To, we will assume that you are using the -max
binary
appropriate to your operating system, available via the MySQL software downloads
page at http://dev.mysql.com/downloads.
For inter-node communication, Cluster supports TCP/IP networking in any standard topology, and the minimum expected for each host is a standard 100 Mbps Ethernet card, plus a switch, hub, or router to provide network connectivity for the cluster as a whole. We strongly recommend that a MySQL Cluster be run on its own subnet which is not shared with non-Cluster machines for the following reasons:
It is also possible to use the high-speed Scalable Coherent Interface (SCI) with MySQL Cluster, but this is not a requirement. See section 16.7 Using High-Speed Interconnects with MySQL Cluster for more about this protocol and its use with MySQL Cluster.
Each MySQL Cluster host computer running storage or SQL nodes must have
installed on it a MySQL-max binary. For management nodes, it is not necessary
to install the MySQL server binary, but you do have to install the MGM server
daemon and client binaries (ndb_mgmd
and ndb_mgm
,
respectively). In this section, we will cover the steps necessary to install
the correct binaries for each type of Cluster node.
As of this writing, the most recent production version for Linux was MySQL 4.1.10a; if a more recent version is available, you should use that instead, and substitute the version number in the following instructions as appropriate. MySQL AB provides precompiled binaries, and there is generally no need to compile these yourself. (If you do require a custom binary, see section 2.8.3 Installing from the Development Source Tree.) Therefore, the first step in the installation process for each cluster host is to download the file `mysql-max-4.1.10a-pc-linux-gnu-i686.tar.gz' from the MySQL downloads area, and we will assume that you have placed it in each machine's `/var/tmp' directory.
RPMs are also available for both 32-bit and 64-bit Linux platforms; as of MySQL 4.1.10a, the -max binaries installed by the RPMs support the NDB Cluster storage engine. If you choose to use these rather than the binary files, be aware that you must install both the -server and -max packages on all machines that are to host cluster nodes. (See the section on installing MySQL on Linux for more information about installing MySQL using the RPMs.) After installing from RPM, you will still need to configure the cluster as discussed in section 16.3.3 Configuration.
Note: After completing the installation, do not yet start any of the binaries. We will show you how to do so following the configuration of all nodes.
On each of the 3 machines designated to host storage or SQL nodes, perform the following steps as the system root user:
Check your `/etc/passwd' and `/etc/group' files (or use whatever tools your operating system provides for managing users and groups) to see whether there are already a mysql group and mysql user on the system, as some OS distributions create these as part of the operating system installation process. If these are not already present, then create a new mysql user group, then add a mysql user to this group:

groupadd mysql
useradd -g mysql mysql
cd /var/tmp
tar -xzvf mysql-max-4.1.10a-pc-linux-gnu-i686.tar.gz -C /usr/local/bin
ln -s /usr/local/bin/mysql-max-4.1.10a-pc-linux-gnu-i686 mysql
cd mysql
scripts/mysql_install_db --user=mysql
chown -R root .
chown -R mysql data
chgrp -R mysql .

Note that the data directory on each machine hosting a storage node is `/usr/local/mysql/data'. We will make use of this piece of information when we configure the management node. (See section 16.3.3 Configuration.)
cp support-files/mysql.server /etc/rc.d/init.d/
chmod +x /etc/rc.d/init.d/mysql.server
chkconfig --add mysql.server

Here we use Red Hat's chkconfig for creating links to the startup scripts; use whatever means is appropriate for this purpose on your operating system/distribution, such as update-rc.d on Debian.
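On a Debian-based system, for example, the equivalent step might look like this (assuming the script has been copied to `/etc/init.d/mysql.server'):

cp support-files/mysql.server /etc/init.d/
chmod +x /etc/init.d/mysql.server
update-rc.d mysql.server defaults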
Remember that the steps listed above must be performed separately for each machine on which a storage or SQL node is to reside.
For the MGM (management) node, it is not necessary to install the
mysqld
executable, only the binaries for the MGM server and client,
which can be found in the downloaded -max
archive. Again we assume that
you have placed this file in `/var/tmp'. As root (that is, after using
su root
or your system's equivalent for temporarily assuming the
system root account's privileges), perform the following
steps to install ndb_mgmd
and ndb_mgm
on the Cluster
management node host:
Extract the ndb_mgm and ndb_mgmd executables from the archive into a suitable directory such as `/usr/local/bin':
cd /var/tmp
tar -zxvf mysql-max-4.1.10a-pc-linux-gnu-i686.tar.gz /usr/local/bin '*/bin/ndb_mgm*'
cd /usr/local/bin
chmod +x ndb_mgm*
In section 16.3.3 Configuration, we will create and write configuration files for all of the nodes in our example Cluster.
For our 4-node, 4-host MySQL Cluster, we will need to write 4 configuration files, 1 per node/host.
Each storage node and SQL node requires a `my.cnf' file that provides two pieces of information: a connectstring telling the node where to find the MGM node, and a line telling the MySQL server on that host to enable the NDB Cluster storage engine. The management node requires a `config.ini' file describing the cluster as a whole.
The `my.cnf' file needed for the storage nodes is fairly simple. The configuration file should be located in the `/etc' directory and can be edited (and created if necessary) using any text editor, for example:
vi /etc/my.cnf
For each storage node and SQL node in our example setup, `my.cnf' should look like this:
[MYSQLD]                        # Options for mysqld process:
ndbcluster                      # run NDB engine
ndb-connectstring=192.168.0.10  # location of MGM node

[MYSQL_CLUSTER]                 # Options for ndbd process:
ndb-connectstring=192.168.0.10  # location of MGM node
After entering the above, save this file and exit the text editor. Do this for the machines hosting storage node "A", storage node "B", and the SQL node.
The first step in configuring the MGM node is to create the directory in which the configuration file can be found and then to create the file itself. For example (running as root):
mkdir /var/lib/mysql-cluster
cd /var/lib/mysql-cluster
vi config.ini
We show vi
being used here to create the file, but any text editor
should work just as well.
For our representative setup, the `config.ini' file should read as follows:
[NDBD DEFAULT]    # Options affecting ndbd processes on all storage nodes:
NoOfReplicas=2    # Number of replicas
DataMemory=80M    # How much memory to allocate for data storage
IndexMemory=52M   # How much memory to allocate for index storage
                  # For DataMemory and IndexMemory, we have used the
                  # default values. Since the "world" database takes up
                  # only about 500KB, this should be more than enough for
                  # this example Cluster setup.

[TCP DEFAULT]     # TCP/IP options:
portnumber=2202   # This is the default; however, you can use any
                  # port that is free for all the hosts in the cluster
                  # Note: It is recommended beginning with MySQL 5.0 that
                  # you do not specify the portnumber at all and simply allow
                  # the default value to be used instead

[NDB_MGMD]                      # Management process options:
hostname=192.168.0.10           # Hostname or IP address of MGM node
datadir=/var/lib/mysql-cluster  # Directory for MGM node logfiles

[NDBD]                          # Options for storage node "A":
                                # (one [NDBD] section per storage node)
hostname=192.168.0.30           # Hostname or IP address
datadir=/usr/local/mysql/data   # Directory for this storage node's datafiles

[NDBD]                          # Options for storage node "B":
hostname=192.168.0.40           # Hostname or IP address
datadir=/usr/local/mysql/data   # Directory for this storage node's datafiles

[MYSQLD]                        # SQL node options:
hostname=192.168.0.20           # Hostname or IP address
datadir=/usr/local/mysql/data   # Directory for SQL node's datafiles
                                # (additional mysqld connections can be
                                # specified for this node for various
                                # purposes such as running ndb_restore)
(NOTE: The "world" database can be downloaded from http://dev.mysql.com/doc/ where it can be found listed under "Examples".)
Once all the configuration files have been created and these minimal options have been specified, you are ready to proceed with starting the cluster and verifying that all processes are running. We discuss how this is done in section 16.3.4 Initial Startup.
For more detailed information about the available MySQL Cluster configuration parameters and their uses, see section 16.4.4 Configuration File and section 16.4 MySQL Cluster Configuration. For configuration of MySQL Cluster as relates to making backups, see section 16.6.4.4 Configuration for Cluster Backup.
Note: The default port for Cluster management nodes is 1186; the default port for storage nodes is 2202.
Starting the cluster is not very difficult once it has been configured. Each cluster node process must be started separately, and on the host where it resides. While it is possible to start the nodes in any order, it is recommended that the management node be started first, followed by the storage nodes, and then finally by any SQL nodes:
On the machine hosting the management node, issue the following command from the system shell to start the MGM node process:

shell> ndb_mgmd -f /var/lib/mysql-cluster/config.ini

Note that ndb_mgmd must be told where to find its configuration file, using the -f or --config-file option. (See section 16.5.3 ndb_mgmd, the Management Server Process for details.)
On each of the storage node hosts, run the following command to start the ndbd process:

shell> ndbd --initial

Note that it is very important to use the --initial parameter only when starting ndbd for the first time, or when restarting after a backup/restore operation or a configuration change. This is because this parameter will cause the node to delete any files created by earlier ndbd instances needed for recovery, including the recovery log files.
On the machine hosting the SQL node, run the mysqld startup command as shown here:
shell> mysqld &
If all has gone well, and the cluster has been set up correctly, the cluster
should now be operational. You can test this by invoking the ndb_mgm
management node client; the output should look like what is shown below:
shell> ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.0.30  (Version: 4.1.11, Nodegroup: 0, Master)
id=3    @192.168.0.40  (Version: 4.1.11, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.0.10  (Version: 4.1.11)

[mysqld(SQL)]   1 node(s)
id=4   (Version: 4.1.11)
You may see some slight differences depending upon the exact version of MySQL that you are using.
Note: If you are using an older version of MySQL, you may see the SQL node referenced as `[mysqld(API)]'. This reflects an older usage which is now deprecated.
You should now be ready to work with databases, tables, and data in MySQL Cluster. See section 16.3.5 Loading Sample Data and Performing Queries for a brief discussion.
Working with data in MySQL Cluster is not much different from working with data in MySQL without clustering. There are two points to keep in mind when doing so:
Tables must be created with the ENGINE=NDB or ENGINE=NDBCLUSTER option, or altered after creation (using ALTER TABLE) to use the NDB Cluster storage engine in order to have them replicated in the cluster. If you are importing tables from an existing database using the output of mysqldump, you can open the SQL script(s) in a text editor and add this option to any table creation statements, or replace any existing ENGINE (or TYPE) option(s) with one of these. For example, suppose that you have the sample world database on another MySQL server (that doesn't support MySQL Cluster), and you wish to export the definition for the City table:
shell> mysqldump --add-drop-table world City > city_table.sql

The resulting `city_table.sql' file will contain this table creation statement (and the INSERT statements necessary to import the table data):
DROP TABLE IF EXISTS City;
CREATE TABLE City (
  ID int(11) NOT NULL auto_increment,
  Name char(35) NOT NULL default '',
  CountryCode char(3) NOT NULL default '',
  District char(20) NOT NULL default '',
  Population int(11) NOT NULL default '0',
  PRIMARY KEY (ID)
) TYPE=MyISAM;
INSERT INTO City VALUES (1,'Kabul','AFG','Kabol',1780000);
INSERT INTO City VALUES (2,'Qandahar','AFG','Qandahar',237500);
INSERT INTO City VALUES (3,'Herat','AFG','Herat',186800);
# (remaining INSERT statements omitted)

You will need to make sure that MySQL uses the NDB storage engine for this table. There are two ways that this can be accomplished. One of these is, before importing the table into the Cluster database, to modify its definition so that it reads (still using City as an example):
DROP TABLE IF EXISTS City;
CREATE TABLE City (
  ID int(11) NOT NULL auto_increment,
  Name char(35) NOT NULL default '',
  CountryCode char(3) NOT NULL default '',
  District char(20) NOT NULL default '',
  Population int(11) NOT NULL default '0',
  PRIMARY KEY (ID)
) ENGINE=NDBCLUSTER;
INSERT INTO City VALUES (1,'Kabul','AFG','Kabol',1780000);
INSERT INTO City VALUES (2,'Qandahar','AFG','Qandahar',237500);
INSERT INTO City VALUES (3,'Herat','AFG','Herat',186800);
# (etc.)

This will need to be done for the definition of each table that is to be part of the clustered database. The easiest way to accomplish this is simply to do a search-and-replace on the `world.sql' file and replace all instances of
TYPE=MyISAM
with ENGINE=NDBCLUSTER
. If you do not wish to modify
the file, you can also use ALTER TABLE
; see below for particulars.
Assuming that you have already created a database named world on
the SQL node of the cluster, you can then use the mysql
command-line
client to read `city_table.sql', and create and populate the
corresponding table in the usual manner:
shell> mysql world < city_table.sql

It is very important to keep in mind that the above command must be executed on the host where the SQL node is running -- in this case, on the machine with the IP address 192.168.0.20. To create a copy of the world database on the SQL node, save the file to `/usr/local/mysql/data', then run
shell> cd /usr/local/mysql/data
shell> mysql world < world.sql

Of course, the SQL script must be readable by the mysql user. If you save the file to a different location, adjust the above accordingly. Running
SELECT
queries on the SQL node is no different than running
them on any other instance of a MySQL server. To run queries from the command
line, you first need to log in to the MySQL Monitor in the usual way:
shell> mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 4.1.9-max

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

If you did not modify the
ENGINE=
clauses in the table definitions prior
to importing the SQL script, then you should at this point run the following
commands:
mysql> USE world;
mysql> ALTER TABLE City ENGINE=NDBCLUSTER;
mysql> ALTER TABLE Country ENGINE=NDBCLUSTER;
mysql> ALTER TABLE CountryLanguage ENGINE=NDBCLUSTER;

Note that we simply use the default root account with its empty password here. Of course, in a production setting, you should always follow the standard security precautions for installing a MySQL server, including the setting of a strong root password and creation of a user account with only those privileges required to accomplish necessary tasks by that user. For more information about these, see section 5.5 The MySQL Access Privilege System. It is worth taking into account that Cluster nodes do not make use of the MySQL privileges system when accessing one another, and setting or changing MySQL user accounts (including the root account) has no effect on interaction between nodes, only on applications accessing the SQL node.

Selecting a database and running a
SELECT
query against a table in
that database is also accomplished in the usual manner, as is exiting the
MySQL Monitor:
mysql> USE world;
mysql> SELECT Name, Population FROM City ORDER BY Population DESC LIMIT 5;
+-----------+------------+
| Name      | Population |
+-----------+------------+
| Bombay    |   10500000 |
| Seoul     |    9981619 |
| São Paulo |    9968485 |
| Shanghai  |    9696300 |
| Jakarta   |    9604900 |
+-----------+------------+
5 rows in set (0.34 sec)

mysql> \q
Bye

shell>

Applications using MySQL can use standard APIs. It is important to remember that your application must access the SQL node, and not the MGM or storage nodes. This brief example shows how we might execute the same query as above using PHP 5's
mysqli
extension running on a Web server elsewhere on
the network:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <title>SIMPLE mysqli SELECT</title>
</head>
<body>
<?php
  # connect to SQL node:
  $link = new mysqli('192.168.0.20', 'root', '', 'world');
  # parameters for mysqli constructor are:
  #   host, user, password, database
  if( mysqli_connect_errno() )
    die("Connect failed: " . mysqli_connect_error());
  $query = "SELECT Name, Population FROM City ORDER BY Population DESC LIMIT 5";
  # if no errors...
  if( $result = $link->query($query) )
  {
?>
<table border="1" width="40%" cellpadding="4" cellspacing ="1">
  <tbody>
  <tr>
    <th width="10%">City</th>
    <th>Population</th>
  </tr>
<?
    # then display the results...
    while($row = $result->fetch_object())
      printf("<tr>\n  <td align=\"center\">%s</td><td>%d</td>\n</tr>\n",
             $row->Name, $row->Population);
?>
  </tbody>
</table>
<?
  # ...and verify the number of rows that were retrieved
  printf("<p>Affected rows: %d</p>\n", $link->affected_rows);
  # free the result set
  $result->close();
  }
  else
    # otherwise, tell us what went wrong
    echo mysqli_error($link);
  # free the mysqli connection object
  $link->close();
?>
</body>
</html>

We assume that the process running on the Web server can reach the IP address of the SQL node. In a similar fashion, you can use the MySQL C API, Perl-DBI, Python-mysql, or MySQL AB's Connectors to perform the tasks of data definition and manipulation just as you would normally with MySQL.
To shut down the cluster simply enter the following in a shell on the machine hosting the MGM node:
shell> ndb_mgm -e shutdown
This will cause the ndb_mgm, ndb_mgmd, and any ndbd processes to terminate gracefully. Any SQL nodes can be terminated using mysqladmin shutdown or other means.
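For example, assuming the usual security precautions have been taken, an SQL node can be stopped from its own host like this:

shell> mysqladmin -u root -p shutdown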
To restart the cluster, simply run these commands:
shell> ndb_mgmd -f /var/lib/mysql-cluster/config.ini
shell> ndbd

Remember not to invoke this command with the --initial option when restarting an NDBD node normally.
shell> mysqld &
For information on making backups, see section 16.6.4.2 Using The Management Server to Create a Backup.
To restore the cluster from backup requires the use of the ndb_restore
command. This is covered in section 16.6.4.3 How to Restore a Cluster Backup.
More information on configuring MySQL Cluster can be found in section 16.4 MySQL Cluster Configuration.
A MySQL server that is part of a MySQL Cluster differs in only one respect from a normal (non-clustered) MySQL server, in that it employs the NDB Cluster storage engine. This engine is also referred to simply as NDB, and the two forms of the name are synonymous.
In order to avoid unnecessary allocation of resources, the server is configured by default with the NDB storage engine disabled. To enable NDB, you will need to modify the server's `my.cnf' configuration file, or start it with the --ndbcluster option.
Since the MySQL server is a part of the cluster, it will also need to know how
to access an MGM node in order to obtain the cluster configuration data. The
default behavior is to look for the MGM node on localhost
. However,
should you need to specify its location elsewhere, this can be done in
`my.cnf' or on the MySQL server command line. Before the NDB storage
engine can be used, at least one MGM node must be operational, as well as any
desired data nodes.
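For example (a sketch only; substitute the actual address of your MGM node), both settings can be given on the server command line:

shell> mysqld --ndbcluster --ndb-connectstring=192.168.0.10:1186 &

Placing the equivalent lines in the [mysqld] section of `my.cnf', as shown elsewhere in this chapter, has the same effect.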
NDB, the Cluster storage engine, is available in binary distributions beginning with MySQL-Max 4.1.3 for Linux, Mac OS X, and Solaris. It is not yet supported on Windows, but we intend to make it available for win32 and other platforms in the near future.
If you choose to build from a source tarball or the MySQL 4.1 BitKeeper tree, be
sure to use the --with-ndbcluster option when running
configure
. You can also use the BUILD/compile-pentium-max
build script. Note that this script includes OpenSSL, so you must have or obtain
OpenSSL to build successfully; otherwise you will need to modify
compile-pentium-max
to exclude this requirement. Of course, you can
also just follow the standard instructions for compiling your own binaries, then
perform the usual tests and installation procedure. See section 2.8.3 Installing from the Development Source Tree.
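For example, a minimal source build enabling Cluster support might look like this (a sketch only; add whatever other configure options you normally use):

shell> ./configure --with-ndbcluster --prefix=/usr/local/mysql
shell> make
shell> make install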
In the next few sections, we assume that you are already familiar with installing MySQL, and here we cover only the differences between configuring MySQL Cluster and configuring MySQL without clustering. (See section 2 Installing MySQL if you require more information about the latter.)
You will find Cluster configuration easiest if you already have all management and storage nodes running; this is likely to be the most time-consuming part of the configuration. Editing the `my.cnf' file is fairly straightforward, and this section covers only the differences from configuring MySQL without clustering.
In order to familiarise you with the basics, we will describe the simplest possible configuration for a functional MySQL Cluster. After this, you should be able to design your desired setup from the information provided in the other relevant sections of this chapter.
First, you need to create a configuration directory such as `/var/lib/mysql-cluster', by executing the following command as the system root user:
shell> mkdir /var/lib/mysql-cluster
In this directory, create a file named `config.ini' with the following
information, substituting appropriate values for HostName
and
DataDir
as necessary for your system.
# file "config.ini" - showing minimal setup consisting of 1 storage node, # 1 management server, and 3 MySQL servers. # The empty default sections are not required, and are shown only for # the sake of completeness. # Storage nodes must to provide a hostname but MySQL Servers are not required # to do so. # If you don't know the hostname for your machine, use localhost. # The DataDir parameter also has a default value, but it is recommended to # set it explicitly. # Note: DB, API, and MGM are aliases for NDBD, MYSQLD, and NDB_MGMD # respectively. DB and API are deprecated and should not be used in new # installations. [NDBD DEFAULT] NoOfReplicas= 1 [MYSQLD DEFAULT] [NDB_MGMD DEFAULT] [TCP DEFAULT] [NDB_MGMD] HostName= myhost.example.com [NDBD] HostName= myhost.example.com DataDir= /var/lib/mysql-cluster [MYSQLD] [MYSQLD] [MYSQLD]
You can now start the management server as follows:
shell> cd /var/lib/mysql-cluster
shell> ndb_mgmd
Then start a single DB node by running ndbd
. When starting
ndbd
for a given DB node for the very first time, you should use the
--initial
option as shown here:
shell> ndbd --initial
For subsequent ndbd
starts, you will generally not want to
use this option:
shell> ndbd
This is because the --initial
option will delete all existing data and
log files (as well as all table metadata) for this storage node and create new
ones. One exception to this rule is when restarting the cluster and restoring
from backup after adding new storage nodes.
By default, ndbd
will look for the management server at
localhost
on port 1186. (Prior to MySQL 4.1.8, the default port
was 2200.)
Note: If you have installed MySQL from a binary tarball, you will need
to specify the path of the ndb_mgmd
and ndbd
servers
explicitly. (Normally, these will be found in `/usr/local/mysql/bin'.)
Finally, go to the MySQL data directory (usually `/var/lib/mysql' or `/usr/local/mysql/data'), and make sure that the `my.cnf' file contains the option necessary to enable the NDB storage engine:
[mysqld] ndbcluster
You can now start the MySQL server as usual:
shell> mysqld_safe --user=mysql &
Wait a moment to make sure the MySQL server is running properly. If you see the
notice mysql ended
, check the server's `.err' file to find out what
went wrong.
If all has gone well so far, you now can start using the cluster:
shell> mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 4.1.7

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SHOW ENGINES;
+------------+---------+------------------------------------------------+
| Engine     | Support | Comment                                        |
+------------+---------+------------------------------------------------+
...
| NDBCLUSTER | DEFAULT | Clustered, fault-tolerant, memory-based tables |
| NDB        | YES     | Alias for NDBCLUSTER                           |
...

mysql> USE test;
Database changed

mysql> CREATE TABLE ctest (i INT) ENGINE=NDBCLUSTER;
Query OK, 0 rows affected (0.09 sec)

mysql> SHOW CREATE TABLE ctest \G
*************************** 1. row ***************************
       Table: ctest
Create Table: CREATE TABLE `ctest` (
  `i` int(11) default NULL
) ENGINE=ndbcluster DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
To check that your nodes were set up properly, start the management client as shown:
shell> ndb_mgm
You can then use the SHOW
command from within the management client in
order to obtain a report on the cluster's status:
NDB> SHOW
Cluster Configuration
---------------------
[ndbd(NDB)]     1 node(s)
id=2    @127.0.0.1  (Version: 3.5.3, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1  (Version: 3.5.3)

[mysqld(API)]   3 node(s)
id=3    @127.0.0.1  (Version: 3.5.3)
id=4 (not connected, accepting connect from any host)
id=5 (not connected, accepting connect from any host)
At this point, you have successfully set up a working MySQL Cluster.
You can now store data in the cluster by using any table created with
ENGINE=NDBCLUSTER
or its alias ENGINE=NDB
.
Configuring MySQL Cluster requires working with two files: `my.cnf', which specifies options for the MySQL Cluster executables, and `config.ini', which is read only by the MySQL Cluster management server (ndb_mgmd) and describes the configuration of the cluster as a whole.
We are continuously making improvements in Cluster configuration and attempting to simplify this process. While we strive to maintain backwards compatibility, there may be times when we introduce an incompatible change. In such cases we will try to let Cluster users know in advance if a change is not backwards compatible. If you find such a change that we have not documented, please use our Bugs Database to report it.
In order to support MySQL Cluster, you will need to update `my.cnf' as shown in the example below.
Beginning with MySQL 4.1.8, some simplifications were made in `my.cnf', including new sections for the Cluster executables. However, these should not be confused with the sections occurring in `config.ini' files. As always, you may also specify these parameters when invoking the executables from the command line.
# my.cnf
# example additions to my.cnf for MySQL Cluster
# (valid from 4.1.8)

# enable ndbcluster storage engine, and provide connectstring for
# management server host (default port is 1186)
[mysqld]
ndbcluster
ndb-connectstring=ndb_mgmd.mysql.com

# provide connectstring for management server host (default port: 1186)
[ndbd]
connect-string=ndb_mgmd.mysql.com

# provide connectstring for management server host (default port: 1186)
[ndb_mgm]
connect-string=ndb_mgmd.mysql.com

# provide location of cluster configuration file
[ndb_mgmd]
config-file=/etc/config.ini
(For more information on connectstrings, see
section 16.4.4.2 The MySQL Cluster connectstring
.)
# my.cnf
# example additions to my.cnf for MySQL Cluster
# (will work on all versions)

# enable ndbcluster storage engine, and provide connectstring for management
# server host to the default port 1186
[mysqld]
ndbcluster
ndb-connectstring=ndb_mgmd.mysql.com:1186
Also starting with MySQL 4.1.8, the `my.cnf' file supports a separate
[mysql_cluster]
section for settings to be read by and affecting all
executables in the cluster:
# cluster-specific settings
[mysql_cluster]
ndb-connectstring=ndb_mgmd.mysql.com:1186
Currently the configuration file is in INI format, and is named
`config.ini' by default. It is read by ndb_mgmd
at
startup and it can be placed anywhere. Its location and name are
specified by using --config-file=[<path>]<filename>
on the
command line with ndb_mgmd
. If the configuration file is not
specified, ndb_mgmd
will by default try to read a
`config.ini' located in the current working directory.
Default values are defined for most parameters, and can also be specified in
`config.ini'. To create a default value section, simply add the word
DEFAULT
to the section name. For example, storage nodes are configured
using [NDBD]
sections. If all storage nodes use the same data memory
size, and this is not the same as the default size, create an [NDBD
DEFAULT]
section containing a DataMemory
line to specify the default
data memory size for all data nodes.
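For example, to set a data memory size of 512MB for every storage node in the cluster (the figure here is purely illustrative), you could include lines such as these:

[NDBD DEFAULT]
DataMemory=512M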
The INI format consists of sections preceded by section headings (surrounded by square brackets), followed by the appropriate parameter names and values. One deviation from the standard format is that the parameter name and value can be separated by a colon (`:') as well as the equals sign (`='); another is that sections are not uniquely identified by name. Instead, unique entries (such as two different nodes of the same type) are identified by a unique ID.
At a minimum, the configuration file must define the computers and nodes involved in the cluster and on which computers these nodes are located. An example of a simple configuration file for a cluster consisting of one management server, two storage nodes and two MySQL servers is shown below:
# file "config.ini" - 2 data nodes and 2 SQL nodes # This file is placed in the startup directory of ndb_mgmd (the management # server) # The first MySQL Server can be started from any host. The second can be started # only on the host mysqld_5.mysql.com [NDBD DEFAULT] NoOfReplicas= 2 DataDir= /var/lib/mysql-cluster [NDB_MGMD] Hostname= ndb_mgmd.mysql.com DataDir= /var/lib/mysql-cluster [NDBD] HostName= ndbd_2.mysql.com [NDBD] HostName= ndbd_3.mysql.com [MYSQLD] [MYSQLD] HostName= mysqld_5.mysql.com
There are six different sections in this configuration file:
[COMPUTER]: Defines the cluster hosts.

[NDBD]: Defines the cluster's storage nodes.

[MYSQLD]: Defines the cluster's MySQL server nodes.

[MGM|NDB_MGMD]: Defines the management server node in the cluster.

[TCP]: Defines TCP/IP connections between nodes in the cluster, with TCP/IP being the default connection protocol.

[SHM]: Defines shared-memory connections between nodes. Prior to MySQL 4.1.9, this type of connection was available only in binaries that were built using the --with-ndb-shm option. Beginning with MySQL 4.1.9-max, it is enabled by default, but should still be considered experimental.
Note that each node has its own section in the `config.ini'. For instance, since this cluster has two storage nodes, the configuration file contains two sections defining these nodes.
One can define DEFAULT
values for each section. As of MySQL
4.1.5, all parameter names are case-insensitive.
With the exception of the MySQL Cluster management server (ndb_mgmd
),
each node making up a MySQL Cluster requires a connectstring which
points to the management server's location. This is used in establishing a
connection to the management server as well as in performing other tasks
depending on the node's role in the cluster. The syntax for a
connectstring is as follows:
<connectstring> := [<nodeid-specification>,]<host-specification>[,<host-specification>]
<nodeid-specification> := nodeid=<id>
<host-specification> := <host>[:<port>]
<id> is an integer larger than 1 identifying a node in config.ini.
<port> is an integer referring to a regular Unix port.
<host> is a string containing a valid Internet host address.
example 1 (long):  "nodeid=2,myhost1:1100,myhost2:1100,192.168.0.3:1200"
example 2 (short): "myhost1"
All nodes will use localhost:1186
as the default connectstring value if
none is provided. If <port>
is omitted from the connectstring, the
default port is 1186. (Note: Prior to MySQL 4.1.8, the default port was
2200.) This port should always be available on the network, since it has been
assigned by IANA for this purpose (see
http://www.iana.org/assignments/port-numbers for details).
By listing multiple <host-specification>
values, it is possible to
designate several redundant management servers. A cluster node will attempt to
contact successive management servers on each host in the order specified, until
a successful connection has been established.
There are a number of different ways to specify the connectstring: each Cluster executable has its own command-line option by which the location of the management server can be given at startup, and the connectstring can also be set for all of the cluster's executables at once by placing it in a [mysql_cluster] section in the management server's `my.cnf' file.
The recommended method for specifying the connectstring is to set it on the command line or in the `my.cnf' file for each executable.
The [COMPUTER] section has no real significance other than serving as a way to avoid the need to define host names for each node in the system.
All parameters mentioned here are required.
[COMPUTER]Id
[COMPUTER]HostName
The [NDB_MGMD]
section (or its alias [MGM]
) is used to configure
the behavior of the management server. Either the ExecuteOnComputer
or
HostName
parameter must be present. All other parameters in the
following can be omitted and, if so, will assume their default values.
[NDB_MGMD]Id
[NDB_MGMD]ExecuteOnComputer: This refers to one of the computers (hosts) defined in a [COMPUTER] section.
[NDB_MGMD]PortNumber
[NDB_MGMD]LogDestination: This parameter specifies where to send cluster logging information. There are three options in this regard, CONSOLE, SYSLOG, and FILE:
CONSOLE
outputs the log to stdout
:
CONSOLE
SYSLOG
sends the log to a syslog
facility, possible values
being one of auth
, authpriv
, cron
, daemon
,
ftp
, kern
, lpr
, mail
, news
, syslog
,
user
, uucp
, local0
, local1
, local2
,
local3
, local4
, local5
, local6
, or local7
.
Note: Not every facility is necessarily supported by every
operating system.
SYSLOG:facility=syslog
FILE
pipes the cluster log output to a regular file on the same
machine. The following values can be specified:
filename
: The name of the logfile.
maxsize
: The maximum size to which the file can grow before logging
rolls over to a new file. When this occurs, the old logfile is renamed by
appending `.x' to the filename, where x
is the next number not
yet used with this name.
maxfiles
: The maximum number of logfiles.
FILE:filename=cluster.log,maxsize=1000000,maxfiles=6

It is possible to specify multiple log destinations as shown here, using a semicolon-delimited string:
CONSOLE;SYSLOG:facility=local0;FILE:filename=/var/log/mgmd

The default value for the
FILE
parameter is
FILE:filename=ndb_<id>_cluster.log,maxsize=1000000,maxfiles=6
,
where <id>
is the ID of the node.
[NDB_MGMD]ArbitrationRank: This parameter is used to define which nodes can act as arbitrators. ArbitrationRank can take one of the following values:
0
: The node will never be used as an arbitrator.
1
: The node has high priority; that is, it will be preferred as an
arbitrator over low-priority nodes.
2
: Indicates a low-priority node which will be used as an arbitrator only if
a node with a higher priority is not available for that purpose.
Normally, the management server should be configured as an arbitrator by setting its ArbitrationRank to 1 (the default value) and that of all SQL nodes to 0.
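In terms of the example cluster used earlier in this chapter, that recommendation might be expressed in `config.ini' roughly as follows (a sketch only):

[NDB_MGMD]
HostName=192.168.0.10
ArbitrationRank=1

[MYSQLD]
HostName=192.168.0.20
ArbitrationRank=0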
[NDB_MGMD]ArbitrationDelay
[NDB_MGMD]DataDir: This sets the directory where output files from the management server are placed. (Log files can be placed elsewhere using the FILE parameter for [NDB_MGMD]LogDestination as discussed previously in this section.)
The [NDBD]
section is used to configure the behavior of the storage
nodes. There are many parameters which control buffer sizes, pool
sizes, timeouts, and so forth. The only mandatory parameters are:
ExecuteOnComputer
or HostName
.
NoOfReplicas
These need to be defined in the [NDBD DEFAULT]
section.
Most storage node parameters are set in the [NDBD DEFAULT]
section. Only
those parameters that are explicitly stated as being settable locally
are allowed to be changed in the [NDBD]
section. HostName
,
Id
and ExecuteOnComputer
must be defined in the local
[NDBD]
section.
The Id
value (that is, the storage node identifier) can be allocated
on the command line when the node is started or in the configuration file.
For each parameter it is possible to use k
, M
, or G
as a
suffix to indicate units of 1024, 1024*1024, or 1024*1024*1024. (For example,
100k
means 100 * 1024 = 102400.) Parameters and values are currently
case-sensitive.
[NDBD]Id
[NDBD]ExecuteOnComputer: This refers to one of the computers (hosts) defined in a [COMPUTER] section.
[NDBD]HostName
ExecuteOnComputer
. It defines the hostname of the computer on which the storage node is to reside. Either this parameter or ExecuteOnComputer
is required.
[NDBD]ServerPort
[NDBD]NoOfReplicas
This global parameter can be set only in the [NDBD DEFAULT] section, and
defines the number of replicas for each table stored in the cluster. This
parameter also specifies the size of node groups. A node group is a set of nodes
that all store the same information.
Node groups are formed implicitly. The first node group is formed by the set of
storage nodes with the lowest node IDs, the next node group by the set of
the next lowest node identities, and so on. By way of example, assume that we
have 4 storage nodes and that NoOfReplicas
is set to 2. The four storage
nodes have node IDs 2, 3, 4 and 5. Then the first node group is formed from
nodes 2 and 3, and the second node group by nodes 4 and 5. It is important to
configure the cluster in such a manner that nodes in the same node groups are
not placed on the same computer, as in this situation a single hardware failure
would cause the entire cluster to crash.
If no node IDs are provided then the order of the storage nodes will be the
determining factor for the node group. Whether or not explicit assignments are
made, they can be viewed in the output of the management client's SHOW
command.
There is no default value for NoOfReplicas
; the maximum possible value is
4.
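As an illustration of the example just described (the hostnames here are hypothetical), a `config.ini' fragment placing the two node groups on four separate hosts might look like this:

[NDBD DEFAULT]
NoOfReplicas=2

[NDBD]
HostName=ndb-host-a.example.com   # node ID 2, node group 0

[NDBD]
HostName=ndb-host-b.example.com   # node ID 3, node group 0

[NDBD]
HostName=ndb-host-c.example.com   # node ID 4, node group 1

[NDBD]
HostName=ndb-host-d.example.com   # node ID 5, node group 1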
[NDBD]DataDir
[NDBD]FileSystemPath
The default value is the directory specified by DataDir. Note: This directory must exist before
the ndbd
process is initiated.
The recommended directory hierarchy for MySQL Cluster includes
`/var/lib/mysql-cluster', under which a directory for the node's filesystem
is created. This subdirectory contains the node ID. For example, if the node ID
is 2, then this subdirectory is named `ndb_2_fs'.
[NDBD]BackupDataDir
DataMemory
and IndexMemory
are parameters specifying the size of
memory segments used to store the actual records and their indexes. In setting
values for these, it is important to understand how DataMemory
and
IndexMemory
are used, as they usually need to be updated in order to
reflect actual usage by the cluster:
[NDBD]DataMemory
DataMemory
is used to store both the the actual
records and indexes. Each record is currently of fixed size. (Even
VARCHAR
columns are stored as fixed-width columns.) There is a 16-byte
overhead on each record; an additional amount for each record is incurred
because it is stored in a 32KB page with 128 byte page overhead (see below).
There is also a small amount wasted per page due to the fact that each record
is stored in only one page. The maximum record size is currently 8052 bytes.
The memory space defined by DataMemory
is also used to store ordered
indexes, which use about 10 bytes per record. Each table row is represented in
the ordered index. A common error among users is to assume that all indexes are
stored in the memory allocated by IndexMemory
, but this is not the case:
only primary key and unique hash indexes use this memory; ordered indexes use
the memory allocated by DataMemory
. However, creating a primary key or
unique hash index also creates an ordered index on the same keys, unless you
specify USING HASH
in the index creation statement. This can be verified
by running ndb_desc -d db_name table_name
in the
management client.
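For example, the following (hypothetical) table gets only a hash index on its primary key because of the USING HASH option; without that option, an ordered index on the same key would also be created:

mysql> CREATE TABLE hashed_example (
           id INT NOT NULL,
           val INT,
           PRIMARY KEY USING HASH (id)
       ) ENGINE=NDBCLUSTER;

You can then inspect the indexes actually created by running ndb_desc -d db_name hashed_example, as mentioned above.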
The memory space allocated by DataMemory
consists of 32KB pages, which
are allocated to table fragments. Each table is normally partitioned into the
same number of fragments as there are storage nodes in the cluster. Thus, for
each node, there are the same number of fragments as are set in
NoOfReplicas
. Once a page has been allocated, it is currently not
possible to return it to the pool of free pages, except by deleting the table.
Performing a node recovery also will compress the partition because all records
are inserted into empty partitions from other live nodes.
The DataMemory
memory space also contains UNDO information: For each
update, a copy of the unaltered record is allocated in the DataMemory
.
There is also a reference to each copy in the ordered table indexes. Unique hash
indexes are updated only when the unique index columns are updated, in which
case a new entry in the index table is inserted and the old entry is deleted
upon commit. For this reason, it is also necessary to allocate enough memory to
handle the largest transactions performed by applications using the cluster. In
any case, performing a few large transactions holds no advantage over using many
smaller ones, for the following reasons:
The default value for DataMemory is 80MB; the minimum is 1MB. There is no
maximum size, but in reality the maximum size has to be adapted so that the
process does not start swapping when the limit is reached. This limit is
determined by the amount of physical RAM available on the machine and by the
amount of memory that the operating system may commit to any one process. 32-bit
operating systems are generally limited to 2-4GB per process; 64-bit operating
systems can use more. For large databases, it may be preferable to use a 64-bit
operating system for this reason. In addition, it is also possible to run more
than one ndbd
process per machine, and this may prove advantageous on
machines with multiple CPUs.
[NDBD]IndexMemory
As an example of how data and index memory are used, consider the following table definition:

CREATE TABLE example (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  PRIMARY KEY(a),
  UNIQUE(b)
) ENGINE=NDBCLUSTER;

There are 12 bytes overhead (having no nullable columns saves 4 bytes of overhead) plus 12 bytes of data per record. In addition we have two ordered indexes on columns a and b consuming about 10 bytes each per record. There is a primary key hash index on the base table with roughly 29 bytes per record. The unique constraint is implemented by a separate table with b as primary key and a as a column. This table will consume another 29 bytes of index memory per record in the example table as well as 12 bytes of overhead, plus 8 bytes of record data.

Thus, for one million records, we need 58 MB of index memory to handle the hash indexes for the primary key and the unique constraint. We also need 64 MB for the records of the base table and the unique index table, plus the two ordered index tables.

You can see that hash indexes take up a fair amount of memory space; however, they provide very fast access to the data in return. They are also used in MySQL Cluster to handle uniqueness constraints. Currently the only partitioning algorithm is hashing and ordered indexes are local to each node. Thus ordered indexes cannot be used to handle uniqueness constraints in the general case.

An important point for both
IndexMemory
and DataMemory
is that the
total database size is the sum of all data memory and all index memory for each
node group. Each node group is used to store replicated information, so if there
are four nodes with 2 replicas, then there will be two node groups. Thus, the
total data memory available is 2*DataMemory
for each storage node.
It is highly recommended that DataMemory
and IndexMemory
be set
to the same values for all nodes. Since data is distributed evenly over all
nodes in the cluster the maximum amount of space available is no better than
that of the smallest node in the cluster.
DataMemory
and IndexMemory
can be changed, but decreasing either
of these can be risky; doing so can easily lead to a node or even an entire
Cluster that is unable to restart due to there being insufficient memory space.
Increasing these values should be quite okay, but it is recommended that such
upgrades are performed in the same manner as a software upgrade, beginning with
an update of the configuration file, then restarting the management server
followed by restarting each storage node in turn.
Updates do not increase the amount of index memory used. Inserts take effect
immediately; however, rows are not actually deleted until the transaction is
committed.
The default value for IndexMemory
is 18MB. The minimum is 1MB.
The next three parameters which we discuss are important because they affect the
number of parallel transactions and the sizes of transactions that can be
handled by the system. MaxNoOfConcurrentTransactions
sets the number of
parallel transactions possible in a node; MaxNoOfConcurrentOperations
sets the number of records that can be in update phase or locked simultaneously.
Both of these parameters (especially MaxNoOfConcurrentOperations
) are
likely targets for users setting specific values and not using the default
value. The default value is set for systems using small transactions, in order
to ensure that these do not use excessive memory.
[NDBD]MaxNoOfConcurrentTransactions
[NDBD]MaxNoOfConcurrentOperations
MaxNoOfConcurrentOperations
will always be used to
calculate the number of operation records in the transaction co-ordinator
portion of the node.
It is also important to have an idea of the memory requirements for operation
records. In MySQL 4.1, these consume about 1KB per record. This figure is
expected to be reduced in MySQL 5.x.
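As an illustration only (these figures are not recommendations), an application known to run large transactions might raise both limits for all storage nodes in `config.ini':

[NDBD DEFAULT]
MaxNoOfConcurrentTransactions=8192
MaxNoOfConcurrentOperations=100000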
[NDBD]MaxNoOfLocalOperations
By default, this parameter is calculated from MaxNoOfConcurrentOperations, which fits systems with many simultaneous
transactions, none of them being very large. If there is a need to handle one
very large transaction at a time and there are many nodes, then it is a good
idea to override the default value by explicitly specifying this parameter.
The next set of parameters is used to determine temporary storage when executing a query which is part of a Cluster transaction. All records are released when the query is completed and the cluster is waiting for the commit or rollback.
The default values for these parameters are adequate for most situations. However, users with a need to support transactions involving large numbers of rows or operations may need to increase these to enable better parallelism in the system, while users whose applications require relatively small transactions can decrease the values in order to save memory.
[NDBD]MaxNoOfConcurrentIndexOperations
For queries using a unique hash index, a separate, temporary pool of operation records is used during the query's execution phase; this parameter sets the size of that pool. The pool for normal operation records is sized by MaxNoOfConcurrentOperations.
The default value of this parameter is 8192. Only in rare cases of extremely
high parallelism using unique hash indexes should it be necessary to increase
this value. Using a smaller value is possible and can save memory if the DBA is
certain that a high degree of parallelism is not required for the cluster.
[NDBD]MaxNoOfFiredTriggers
The default value of MaxNoOfFiredTriggers is 4000, which is sufficient
for most situations. In some cases it can even be decreased if the DBA feels
certain the need for parallelism in the cluster is not high.
A record is created when an operation is performed that affects a unique hash
index. Inserting or deleting a record in a table with unique hash indexes or
updating a column that is part of a unique hash index fires an insert or a
delete in the index table. The resulting record is used to represent this index
table operation while waiting for the original operation that fired it to
complete. This operation is short lived but can still require a large number of
records in its pool for situations with many parallel write operations on a base
table containing a set of unique hash indexes.
[NDBD]TransactionBufferMemory
A related compile-time buffer, ZATTRBUF_FILESIZE (found in
`ndb/src/kernel/blocks/Dbtc/Dbtc.hpp') set to 4000*128 bytes (500KB). A
similar buffer for key info, ZDATABUF_FILESIZE
(also in `Dbtc.hpp')
contains 4000*16 = 62.5KB of buffer space. Dbtc
is the module which
handles transaction co-ordination.
There are also two compile-time buffers in the Dblqh module (in
`ndb/src/kernel/blocks/Dblqh/Dblqh.hpp') which affect reads and updates.
These include ZATTRINBUF_FILESIZE
, set by default to 10000*128 bytes
(1250KB) and ZDATABUF_FILE_SIZE
, set by default to 10000*16 bytes
(roughly 156KB) of buffer space. To date, there have been neither any reports
from users nor any results from our own extensive tests suggesting that either
of these compile-time limits should be increased.
The default value for TransactionBufferMemory
is 1MB.
[NDBD]MaxNoOfConcurrentScans
This parameter is used to control the number of parallel scans that can be performed in the cluster. The cluster should be able to sustain MaxNoOfConcurrentScans scans concurrently from all nodes in the
cluster.
Scans are actually performed in two cases. The first of these cases occurs when
no hash or ordered indexes exists to handle the query, in which case the query
is executed by performing a full table scan. The second case is encountered when
there is no hash index to support the query but there is an ordered index. Using
the ordered index means executing a parallel range scan. Since the order is kept
on the local partitions only, it is necessary to perform the index scan on all
partitions.
The default value of MaxNoOfConcurrentScans
is 256. The maximum value
is 500.
This parameter specifies the number of scans possible in the transaction
co-ordinator. If the number of local scan records is not provided, it is
calculated as the product of MaxNoOfConcurrentScans
and the number
of storage nodes in the system.
[NDBD]MaxNoOfLocalScans
[NDBD]BatchSizePerLocalScan
This parameter is used to calculate the number of lock records needed to handle many concurrent scan operations; it has a strong connection to the ScanBatchSize defined in the SQL nodes.
[NDBD]LongMessageBuffer
[NDBD]NoOfFragmentLogFiles
In scenarios requiring a great many writes, NoOfFragmentLogFiles may need to be set as high as 300 or even higher in
order to provide sufficient space for REDO logs.
If checkpointing is slow and there are so many writes to the database that the log files are full and the log tail cannot be cut without jeopardising recovery, then all updating transactions will be aborted with internal error code 410 (Out of log file space temporarily). This
condition will prevail until a checkpoint has completed and the log tail can be
moved forward.
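As an illustration only (the value of 300 is simply the figure mentioned above, not a recommendation for any particular workload), such a setting could be placed in the [NDBD DEFAULT] section of `config.ini' so that it applies to all storage nodes:
[NDBD DEFAULT]
# Illustrative value: more REDO log space for update-heavy workloads
NoOfFragmentLogFiles=300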
[NDBD]MaxNoOfSavedMessages
The next set of parameters defines pool sizes for metadata objects. It is necessary to define the maximum number of attributes, tables, indexes, and trigger objects used by indexes, events, and replication between clusters.
[NDBD]MaxNoOfAttributes
[NDBD]MaxNoOfTables
For each attribute having a BLOB data type, an extra table is used to store most of the BLOB data. These tables must also be taken into account when defining the total number of tables.
The default value of this parameter is 128. The minimum is 8 and the
maximum is 1600. Each table object consumes approximately 20KB per node.
[NDBD]MaxNoOfOrderedIndexes
[NDBD]MaxNoOfUniqueHashIndexes
By default, an ordered index is also created for each unique index; to avoid this, specify the USING HASH option when defining the unique index.
The default value is 64. Each index consumes approximately 15KB per node.
[NDBD]MaxNoOfTriggers
[NDBD]MaxNoOfIndexes
This parameter is deprecated; you should use MaxNoOfOrderedIndexes and MaxNoOfUniqueHashIndexes instead.
This parameter is used only by unique hash indexes. There needs to be one record
in this pool for each unique hash index defined in the cluster.
The default value of this parameter is 128.
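A minimal sketch of how these metadata pools might be raised in `config.ini' follows; the values shown are purely illustrative and must be sized for your own schema:
[NDBD DEFAULT]
MaxNoOfAttributes=5000
MaxNoOfTables=256
MaxNoOfOrderedIndexes=256
MaxNoOfUniqueHashIndexes=128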
The behavior of storage nodes is also affected by a set of parameters taking boolean values. These parameters can be specified as TRUE by setting them equal to 1 or Y, and as FALSE by setting them equal to 0 or N.
[NDBD]LockPagesInMainMemory
[NDBD]StopOnError
[NDBD]Diskless
One use for this feature is when running ndbd on a diskless computer.
This feature makes the entire cluster operate in diskless mode. We plan to add
the capability to employ diskless mode on a per-table basis in a future
(5.0 or 5.1) release of MySQL Cluster.
Currently, when this feature is enabled, backups are performed but backup data
is not actually stored. In a future release we hope to make diskless backup a
separate, configurable parameter.
This feature is enabled by setting Diskless
to either 1 or
y. Diskless
is disabled by default.
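For example, to run every storage node in diskless mode, the parameter could be set in the [NDBD DEFAULT] section of `config.ini' (it can equally be placed in an individual [NDBD] section):
[NDBD DEFAULT]
Diskless=1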
[NDBD]RestartOnErrorInsert
There are a number of parameters specifying timeouts and intervals between various actions in Cluster storage nodes. Most of the timeout values are specified in milliseconds. Any exceptions to this will be mentioned below.
[NDBD]TimeBetweenWatchDogCheck
[NDBD]StartPartialTimeout
[NDBD]StartPartitionedTimeout
If the cluster is ready to start after waiting StartPartialTimeout milliseconds but is still in a possibly partitioned state, the cluster waits until this timeout has also passed.
The default timeout is 60000 milliseconds (60 seconds).
[NDBD]StartFailureTimeout
[NDBD]HeartbeatIntervalDbDb
[NDBD]HeartbeatIntervalDbApi
This parameter otherwise functions in a manner similar to HeartbeatIntervalDbDb.
The default interval is 1500 milliseconds (1.5 seconds). This interval can vary between individual storage nodes because each storage node watches the MySQL servers connected to it, independently of all other storage nodes.
[NDBD]TimeBetweenLocalCheckpoints
Setting TimeBetweenLocalCheckpoints to 6 or less means that local checkpoints
will be executed continuously without pause, independent of the cluster's
workload.
[NDBD]TimeBetweenGlobalCheckpoints
[NDBD]TimeBetweenInactiveTransactionAbortCheck
[NDBD]TransactionInactiveTimeout
[NDBD]TransactionDeadlockDetectionTimeout
[NDBD]NoOfDiskPagesToDiskAfterRestartTUP
Setting NoOfDiskPagesToDiskAfterRestartTUP to a value of 20 means writing 1.6MB
of data pages to disk each second during a local checkpoint. This value
includes the writing of UNDO log records for data pages; that is, this
parameter handles the limitation of writes from data memory. UNDO log
records for index pages are handled by the parameter
NoOfDiskPagesToDiskAfterRestartACC
. (See the entry for IndexMemory
for information about index pages.)
In short, this parameter specifies how quickly local checkpoints will be
executed, and operates in conjunction with NoOfFragmentLogFiles
,
DataMemory
, and IndexMemory
.
The default value is 40 (3.2MB of data pages per second).
[NDBD]NoOfDiskPagesToDiskAfterRestartACC
This parameter uses the same units as NoOfDiskPagesToDiskAfterRestartTUP and acts in a similar fashion, but
limits the speed of writing index pages from index memory.
The default value of this parameter is 20 index memory pages per second (1.6MB
per second).
[NDBD]NoOfDiskPagesToDiskDuringRestartTUP
This parameter is used in a fashion similar to NoOfDiskPagesToDiskAfterRestartTUP and NoOfDiskPagesToDiskAfterRestartACC, only it does so with regard to local
checkpoints executed in the node when a node is restarting. As part of all node
restarts a local checkpoint is always performed. During a node restart it is
possible to write to disk at a higher speed than at other times, because fewer
activities are being performed in the node.
This parameter covers pages written from data memory.
The default value is 40 (3.2MB per second).
[NDBD]NoOfDiskPagesToDiskDuringRestartACC
As with NoOfDiskPagesToDiskAfterRestartTUP and NoOfDiskPagesToDiskAfterRestartACC, values for this parameter are
expressed in terms of 8KB pages written per 100 milliseconds (80KB/second).
The default value is 20 (1.6MB per second).
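Restating the defaults just described as they would appear in `config.ini' (a sketch only; these are the default values, so listing them explicitly changes nothing):
[NDBD DEFAULT]
# Data and index pages written per 100 milliseconds during local checkpoints
NoOfDiskPagesToDiskAfterRestartTUP=40
NoOfDiskPagesToDiskAfterRestartACC=20
# Corresponding limits during a node restart
NoOfDiskPagesToDiskDuringRestartTUP=40
NoOfDiskPagesToDiskDuringRestartACC=20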
[NDBD]ArbitrationTimeout
Several configuration parameters corresponding to former compile-time parameters were introduced in MySQL 4.1.5. These enable the advanced user to have more control over the resources used by node processes and to adjust various buffer sizes at need.
These buffers are used as front-ends to the file system when writing log records to disk. If the node is running in diskless mode, then these parameters can be set to their minimum values without penalty due to the fact that disk writes are "faked" by the NDB storage engine's filesystem abstraction layer.
[NDBD]UndoIndexBuffer
If this buffer is too small, the NDB storage engine issues an internal error: Index UNDO buffers overloaded.
[NDBD]UndoDataBuffer
If this buffer is too small, the NDB storage engine issues an internal error: Data UNDO buffers overloaded.
[NDBD]RedoBuffer
If this buffer is too small, the NDB storage engine issues an internal error: REDO log buffers overloaded.
In managing the cluster, it is very important to be able to control the number
of log messages sent for various event types to stdout
. There are 16
possible event levels (numbered 0 through 15). Setting event reporting for a
given event category to level 15 means all event reports in that category are
sent to stdout
; setting it to 0 means that there will be no event reports
made in that category.
By default, only the startup message is sent to stdout, with the remaining event reporting level defaults being set to 0. The reason for this is that these messages are also sent to the management server's cluster log.
An analogous set of levels can be set for the management client to determine which event levels to record in the cluster log.
[NDBD]LogLevelStartup
[NDBD]LogLevelShutdown
[NDBD]LogLevelStatistic
[NDBD]LogLevelCheckpoint
[NDBD]LogLevelNodeRestart
[NDBD]LogLevelConnection
[NDBD]LogLevelError
[NDBD]LogLevelInfo
The parameters discussed in this section define memory buffers set aside for execution of online backups.
[NDBD]BackupDataBufferSize
Once the backup data buffer has been filled to the level specified by BackupWriteSize (see below), the pages are sent to disk. While flushing
data to disk, the backup process can continue filling this buffer until it runs
out of space. When this happens, the backup process pauses the scan and waits
until some disk writes have completed and have thus freed up memory so that
scanning may continue.
The default value for this parameter is 2MB.
[NDBD]BackupLogBufferSize
[NDBD]BackupMemory
This parameter is simply the sum of BackupDataBufferSize and BackupLogBufferSize.
The default value is 2MB + 2MB = 4MB.
[NDBD]BackupWriteSize
The [MYSQLD]
sections in the `config.ini' file define the behavior
of the MySQL servers (SQL nodes) used to access cluster data. None of the
parameters shown is required. If no computer or host name is provided, then any
host can use this SQL node.
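A small sketch of such [MYSQLD] sections is shown here (the node IDs and hostname are hypothetical): the first SQL node is tied to a particular host, while the second may connect from any host.
[MYSQLD]
Id=4
HostName=sql1.example.com

[MYSQLD]
Id=5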
[MYSQLD]Id
[MYSQLD]ExecuteOnComputer
This refers to a machine (host) defined in a [COMPUTER] section of the configuration file.
[MYSQLD]ArbitrationRank
[MYSQLD]ArbitrationDelay
[MYSQLD]BatchByteSize
For queries that fetch records in batches, the proper batch size can be set both in terms of the number of records (BatchSize) and in terms of bytes (BatchByteSize). The actual batch size will be limited by both parameters.
The speed at which queries are performed can vary by more than 40% depending
upon how this parameter is set. In future releases, the MySQL Server will make
educated guesses on how to set parameters relating to batch size, based on the
query type.
This parameter is measured in bytes and by default is equal to 32KB.
[MYSQLD]BatchSize
[MYSQLD]MaxScanBatchSize
TCP/IP is the default transport mechanism for establishing connections in MySQL Cluster. It is actually not necessary to define connections because Cluster automatically sets up a connection between each of the storage nodes, between each storage node and all MySQL server nodes, and between each storage node and the management server.
It is only necessary to define a connection in order to override the default connection parameters. In that case it is necessary to define at least NodeId1, NodeId2, and the parameters to change.
It is also possible to change the default values for these parameters by setting
them in the [TCP DEFAULT]
section.
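A minimal sketch of such an override is shown here; the node IDs and the buffer value are hypothetical:
[TCP]
NodeId1=2
NodeId2=3
SendBufferMemory=2M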
[TCP]NodeId1
[TCP]NodeId2
To identify a connection between two nodes, the node identifiers of both must be provided as NodeId1 and NodeId2 in the [TCP] section of the configuration file.
[TCP]SendBufferMemory
[TCP]SendSignalId
[TCP]Checksum
[TCP]PortNumber
This parameter may also be set in the [TCP DEFAULT] section. However, it should no longer be used; use the ServerPort parameter instead.
[TCP]ReceiveBufferMemory
Beginning with MySQL 4.1.9-max, MySQL Cluster will attempt to use the shared
memory transporter and configure it automatically where possible, chiefly where
more than one node runs concurrently on the same cluster host. (In previous
versions of MySQL Cluster, shared memory segments were supported only when the
-max
binary was built using --with-ndb-shm
.) When explicitly
defining shared memory as the connection method it is necessary to define at
least NodeId1
, NodeId2
and ShmKey
. All other parameters
have default values that will work out fine in most cases.
Note: SHM support should still be considered experimental in MySQL 4.1.
[SHM]NodeId1
[SHM]NodeId2
To identify a connection between two nodes, node identifiers must be provided for both of them as NodeId1 and NodeId2.
[SHM]ShmKey
[SHM]ShmSize
The size of the shared memory segment used for the connection is defined by ShmSize. The default value is 1MB.
[SHM]SendSignalId
[SHM]Checksum
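A minimal sketch of a shared memory connection definition (the node IDs, key, and segment size shown are hypothetical):
[SHM]
NodeId1=3
NodeId2=4
ShmKey=1234
ShmSize=2M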
Using SCI transporters to connect nodes in a MySQL Cluster is supported only
when the MySQL-max binaries are built using
--with-ndb-sci=/your/path/to/SCI
. The path should point to a
directory that contains at a minimum `lib' and `include' directories
containing SISCI libraries and header files.
In addition, SCI requires specialised hardware.
It is strongly recommended to use SCI Transporters only for communication
between ndbd
processes. In addition, using SCI Transporters means that
the ndbd
processes never sleep. For this reason, SCI Transporters
should be used only on machines having at least 2 CPUs dedicated for use by
ndbd
processes. There should be at least 1 CPU per ndbd process, with
at least 1 CPU left in reserve to handle operating system activities.
[SCI]NodeId1
[SCI]NodeId2
To identify a connection between two nodes, node identifiers must be provided for both of them as NodeId1 and NodeId2.
[SCI]Host1SciId0
This identifies the SCI node ID on the first Cluster node (as identified by NodeId1).
[SCI]Host1SciId1
[SCI]Host2SciId0
This identifies the SCI node ID on the second Cluster node (as identified by NodeId2).
[SCI]Host2SciId1
[SCI]SharedBufferSize
ndbd
process.
[SCI]SendLimit
[SCI]SendSignalId
[SCI]Checksum
When Checksum is enabled, checksums are calculated for all messages before
they are placed in the send buffer. This feature prevents messages from being
corrupted while waiting in the send buffer. It also serves as a check against
data being corrupted during transport.
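A minimal sketch of an SCI connection definition (the Cluster node IDs and SCI node IDs are hypothetical; the SCI node IDs must match those assigned to the cards with sciconfig, as described later in this chapter):
[SCI]
NodeId1=2
NodeId2=3
Host1SciId0=8
Host2SciId0=12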
Understanding how to manage MySQL Cluster requires a knowledge of four essential processes. In the next few sections of this chapter, we cover the roles played by these processes in a cluster, how to use them, and what startup options are available for each of them.
ndbd
, the Storage Engine Node Process
ndb_mgmd
, the Management Server Process
ndb_mgm
, the Management Client Process
mysqld
is the traditional MySQL server process. To be used with MySQL
Cluster, mysqld
needs to be built with support for the NDB Cluster
storage engine, as it is in the precompiled -max
binaries available from
mysql.com for MySQL versions 4.1.3 and newer.
If the mysqld
binary has been built in such a manner, the NDB Cluster
storage engine is still disabled by default. You can use either of two possible
options for enabling the NDB Cluster storage engine:
Use the --ndbcluster startup option when starting mysqld, or insert a line containing ndbcluster in the [mysqld] section of your `my.cnf' file.
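For example, the `my.cnf' entries might look like this (the connectstring shown assumes the management server host used elsewhere in this chapter):
[mysqld]
ndbcluster
ndb-connectstring=ndb_mgmd.mysql.com:1186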
An easy way to verify that your server is running with the NDB Cluster storage
engine enabled is to issue the command SHOW ENGINES
in the MySQL Monitor
(mysql
). You should see the value YES
in the row listing
NDBCLUSTER
; if you see NO
in this row (or if there is no such row
displayed in the output), you are not running an NDB-enabled version of
mysqld
. If you see DISABLED
in this row, then you need to
enable it in either one of the two ways shown above.
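For example, from the mysql client:
mysql> SHOW ENGINES;
The row for NDBCLUSTER should show YES in the Support column.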
In order to read cluster configuration data, the MySQL server requires three pieces of information: its own cluster node ID, the hostname or IP address of the management server, and the port on which it can connect to the management server.
Beginning with MySQL 4.1.5, node IDs can be dynamically allocated, in which case there is no need to specify the node ID explicitly.
The mysqld
parameter ndb-connectstring
is used to specify the
connectstring either on the command line when starting mysqld
or in
`my.cnf'. The connectstring contains the hostname/IP and port where the
management server can be found.
In the following example, ndb_mgmd.mysql.com
is the host where the
management server resides, and the management server listens for cluster
messages on port 1186:
shell> mysqld --ndb-connectstring=ndb_mgmd.mysql.com:1186
See section 16.4.4.2 The MySQL Cluster connectstring
for more info on connectstrings.
Given this information, the MySQL server will be a full participant in the
cluster. (We sometimes refer to a mysqld
process running in this
manner as an SQL node.) It will be fully aware of all cluster storage nodes
as well as their status, and will establish connections to all storage nodes. In
this case, it is able to use any storage node as a transaction coordinator and
to access storage node data for reading and updating.
ndbd
, the Storage Engine Node Process
ndbd
is the process that is used to handle all the data in tables
using the NDB Cluster storage engine. This is the process that empowers a storage
node to accomplish distributed transaction handling, node recovery,
checkpointing to disk, online backup, and related tasks.
In a MySQL Cluster, a set of ndbd
processes co-operate in handling
data. These processes can execute on the same computer (host) or on different
computers. The correspondence between storage nodes and Cluster hosts is completely configurable.
In MySQL versions prior to 4.1.5, each ndbd
process should be started
in a separate directory, the reason for this being that ndbd
generated
a set of log files in its starting directory. In MySQL 4.1.5, this behavior was
changed such that these files are placed in the directory specified by
DataDir
in the configuration file. Thus ndbd
can be started
from anywhere.
These log files are listed below. <NodeID>
is the node's unique identifier.
For example, `ndb_2_error.log' is the error log generated by the storage
node whose node ID is 2.
`ndb_<NodeID>_error.log' contains records of all crashes which the referenced ndbd process has encountered. Each record in this file contains a
brief error string and a reference to a trace file for this crash. A typical
entry in this file might appear as shown here:
Date/Time: Saturday 31 January 2004 - 00:20:01
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: DbtupFixAlloc.cpp
Object of reference: DBTUP (Line: 173)
ProgramName: NDB Kernel
ProcessID: 14909
TraceFile: ndb_2_trace.log.2
***EOM***
`ndb_<NodeID>_out.log' contains any data output by the ndbd process. This file is created only if ndbd
is
started as a daemon, which is the default behavior beginning with MySQL
4.1.5. This file was named `node<NodeID>.out' in versions 4.1.3
and 4.1.4.
`ndb_<NodeID>.pid' contains the process ID of the ndbd
process when started as a daemon. It also functions as a lock
file to avoid the starting of nodes with the same identifier.
There is also a debug version of ndbd
, where it is possible
to trace all incoming, outgoing, and internal messages with their data in the
ndbd
process.
It is recommended not to use a directory mounted through NFS because in some environments this can cause problems whereby the lock on the pid-file remains in effect even after the process has terminated.
When starting ndbd
it may also be necessary to specify the hostname of
the management server and the port on which it is listening. Optionally, one may
also specify the node ID that the process is to use.
shell> ndbd --connect-string="nodeid=2;host=ndb_mgmd.mysql.com:1186"
(See section 16.4.4.2 The MySQL Cluster connectstring
for additional information about this
issue.)
When ndbd
starts, it actually initiates two processes. The first of
these is called the "angel process"; its only job is to discover when the
execution process has been completed, and then to restart the
ndbd
process if it is configured to do so. Thus, if you attempt to
kill ndbd
via the Unix kill
command, it is necessary to kill
both processes. A more proper way to terminate an ndbd
process is to
use the management client and stop the process from there.
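For example, to stop the storage node with node ID 2 (the node ID is hypothetical) from the management client:
ndb_mgm> 2 STOP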
The execution process uses one thread for reading, writing, and scanning data,
as well as all other activities. This thread is implemented asynchronously so
that it can easily handle thousands of concurrent activities. In addition, a
watch-dog thread supervises the execution thread to make sure that it does not
hang in an endless loop. A pool of threads handles file I/O, with each thread
able to handle one open file. Threads can also be used for transporter
connections by the transporters in the ndbd
process. In a system
performing a large number of operations, including updates, the ndbd
process can consume up to 2 CPUs if permitted to do so. For a machine with many
CPUs it is recommended to use several ndbd
processes which belong to
different node groups.
ndb_mgmd
, the Management Server Process
The management server is the process that reads the cluster configuration file and distributes this information to all nodes in the cluster that request it. It also maintains a log of cluster activities. Management clients can connect to the management server and check the cluster's status.
As of MySQL 4.1.5, it is no longer necessary to specify a connectstring when starting the management server. However, if you are using more than one management server, a connectstring should be provided and each node in the cluster should specify its node ID explicitly.
The following files are created or used by ndb_mgmd
in its starting
directory. From MySQL 4.1.5, the log and PID files are placed in the
DataDir
as specified in the configuration file. In the list that
follows, <NodeID> is the unique node identifier.
`ndb_<NodeID>_out.log' is used for stdout
and stderr
when running the management
server as a daemon.
ndb_mgm
, the Management Client Process
The management client process is actually not needed to run the cluster. Its
value lies in providing a set of commands for checking the cluster's status,
starting backups, and performing other administrative functions. The management
client accesses the management server using a C API that advanced users can also
employ for programming dedicated management processes which can perform
tasks similar to those performed by ndb_mgm
.
When starting the management client, it is necessary to supply the hostname and port of the management server as in the example below. The default values are localhost and 1186 (prior to MySQL 4.1.8, the default Cluster port was 2200).
shell> ndb_mgm localhost 1186
Additional information about using ndb_mgm
can be found in
section 16.5.5.4 Command Options for ndb_mgm
and section 16.6.1 Commands in the Management Client.
mysqld
ndbd
ndb_mgmd
ndb_mgm
All MySQL Cluster executables (except for mysqld
) take the following
options as of 4.1.8. If you are using an earlier version, please read carefully,
as we have made changes in some of these switches in order to make them
consistent with one another as well as with mysqld
. (For example,
the -V
switch was -v
in earlier versions.) You can also use the
-?
option to see what is supported in your version.
-?, --usage, --help
-V, --version
Prints the version number of the ndbd
process. The version number is
the MySQL Cluster version number. The version number is relevant because not
all versions can be used together, and at startup the MySQL Cluster processes
verify that the versions of the binaries being used can co-exist in the same
cluster. This is also important when performing an online software upgrade of
MySQL Cluster (see Software Upgrade of MySQL Cluster
).
-c connect_string, --connect-string
Sets the connectstring to the management server as a command option. (Note that ndb_mgmd
does not
accept the -c option before MySQL 5.0, as this switch specifies the
configuration file in previous versions). Available with ndb_mgm
from
MySQL 4.1.8 onwards.
shell> ndbd --connect-string="nodeid=2;host=ndb_mgmd.mysql.com:1186"
--debug[=options]
This option can be used only for versions compiled with debugging enabled. It is used to enable output from debug calls in the same manner as for the mysqld process.
mysqld
--ndbcluster
--skip-ndbcluster
--ndb-connectstring=connect_string
ndbd
For some common options see section 16.5.5 Command Options for MySQL Cluster Processes.
-d, --daemon
Instructs ndbd
to execute as a daemon process. From MySQL 4.1.5 on,
this is the default behavior.
--nodaemon
Instructs ndbd
not to start as a daemon process. Useful when
ndbd
is being debugged and one wants output redirected to the screen.
--initial
Instructs ndbd
to perform an initial start. An initial start erases
any files created for recovery purposes by earlier instances of ndbd
.
It also re-creates recovery log files. Note that on some operating systems
this process can take a substantial amount of time.
An --initial
start is to be used only the very first time that the
ndbd
process is started, as it removes all files from the Cluster
filesystem and re-creates all REDO log files. The exceptions to this rule
are:
ndbd
.
In earlier versions, it was possible to use -i
for this
option. This shortcut was removed to prevent this option from being used by
mistake.
--nostart
Instructs ndbd
not to start automatically. When this option is used,
ndbd
connects to the management server, obtains configuration data
from it, and initialises communication objects. However, it will not actually
start the execution engine until specifically requested to do so by the
management server. This can be accomplished by issuing the proper command to the
management client.
ndb_mgmd
For some common options see section 16.5.5 Command Options for MySQL Cluster Processes.
-f filename, --config-file=filename, -c filename
Tells the management server which file to use as its configuration file; the default is `config.ini'. The -f shortcut is available beginning with MySQL Cluster 4.1.8. The -c shortcut became obsolete in MySQL Cluster 5.0.
-d, --daemon
Instructs ndb_mgmd
to start as a daemon. This is the default behavior.
--nodaemon
Instructs ndb_mgmd not to start as a daemon.
ndb_mgm
For some common options see section 16.5.5 Command Options for MySQL Cluster Processes.
[host_name [port_num]]
The default hostname is localhost
and the default port is 1186 (this was port 2200
prior to version 4.1.8).
--try-reconnect=number
Managing a MySQL Cluster involves a number of tasks, the first of which is to configure and start MySQL Cluster. This is covered in section 16.4 MySQL Cluster Configuration and section 16.5 Process Management in MySQL Cluster.
The following sections cover the management of a running MySQL Cluster.
There are essentially two methods of actively managing a running MySQL Cluster.
The first of these is through the use of commands entered into the management
client whereby cluster status can be checked, log levels changed, backups
started and stopped and nodes stopped and started. The second method involves
studying the contents of the cluster log `ndb_<NodeID>_cluster.log'
in the management server's DataDir
directory. (Recall that
<NodeID> represents the unique identifier of the node whose activity is
being logged.) The cluster log contains event reports generated by
ndbd
. It is also possible to send cluster log entries to a Unix system
log.
In addition to the central configuration file, a cluster may also be controlled
through a command line interface available through the management client
ndb_mgm
. This is the primary administrative interface to a running
cluster.
The management client has the following basic commands.
In the listing that follows, <id> denotes either a database node ID or the
keyword ALL
, which indicates that the command should be applied to all
of the cluster's storage nodes.
HELP
SHOW
<id> START
<id> STOP
<id> RESTART [-N] [-I]
<id> STATUS
ENTER SINGLE USER MODE <id>
EXIT SINGLE USER MODE
Exits single user mode, allowing all SQL nodes (that is, all running mysqld processes) to access the database.
QUIT
SHUTDOWN
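As a brief illustration (node ID 2 is hypothetical), a short management client session might look like this:
ndb_mgm> SHOW
ndb_mgm> 2 STATUS
ndb_mgm> 2 RESTART
ndb_mgm> QUIT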
Commands for the event logs are given in the next section; commands for creating backups and restoring from backup are provided in a separate section on these topics.
In this section, we discuss the types of event logs provided by MySQL Cluster, and the types of events that are logged.
MySQL Cluster provides two types of event log. These are the cluster log, which includes events generated by all cluster nodes, and node logs, which are local to each storage node.
Output generated by cluster event logging can have multiple destinations
including a file, the management server console window, or syslog
. Output
generated by node event logging is written to the storage node's console window.
Both sorts of event logs can be set to log different subsets of events.
Note: The cluster log is the log recommended for most uses, since it provides logging information for an entire cluster in a single file. Node logs are intended to be used only during application development, or for debugging application code.
Each reportable event can be distinguished according to 3 different criteria:
Category: one of STARTUP, SHUTDOWN, STATISTICS, CHECKPOINT, NODERESTART, CONNECTION, ERROR, or INFO.
Priority: a numeric value, where lower numbers indicate more important events.
Severity level: one of ALERT, CRITICAL, ERROR, WARNING, INFO, or DEBUG.
Both the cluster log and the node log can be filtered on these properties.
The following management commands are related to the cluster log:
CLUSTERLOG ON
CLUSTERLOG OFF
CLUSTERLOG INFO
<id> CLUSTERLOG <category>=<threshold>
CLUSTERLOG FILTER <severity>
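For example, the following raises the STATISTICS threshold to 15 on all storage nodes and then toggles filtering of ALERT-level events (a sketch based on the syntax shown above):
ndb_mgm> ALL CLUSTERLOG STATISTICS=15
ndb_mgm> CLUSTERLOG FILTER ALERT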
The following table describes the default setting (for all storage nodes) of the cluster log category threshold. If an event has a priority with a value lower than or equal to the priority threshold, then it is reported in the cluster log.
Note that events are reported per storage node, and that the threshold can be set to different values on different nodes.
Category | Default threshold (All storage nodes) |
STARTUP | 7 |
SHUTDOWN | 7 |
STATISTICS | 7 |
CHECKPOINT | 7 |
NODERESTART | 7 |
CONNECTION | 7 |
ERROR | 15 |
INFO | 7 |
Thresholds are used to filter events within each category. For example, a
STARTUP
event with a priority of 3 is not logged unless the threshold
for STARTUP
is changed to 3 or lower. Only events with priority 3 or
lower are sent if the threshold is 3.
Event severity levels are shown below. (Note: These correspond to
UNIX syslog
levels, except for LOG_EMERG
and LOG_NOTICE
,
which are not used or mapped.)
1 | ALERT | A condition that should be corrected immediately, such as a corrupted system database |
2 | CRITICAL | Critical conditions, such as device errors or insufficient resources |
3 | ERROR | Conditions that should be corrected, such as configuration errors |
4 | WARNING | Conditions that are not errors, but that might require handling |
5 | INFO | Informational messages |
6 | DEBUG | Debugging messages used for NDB Cluster development |
Event severity levels can be turned on or off. If a severity level is turned on, then all events with a priority less than or equal to the category thresholds are logged. If the severity level is turned off then no events belonging to that severity level are logged.
An event report as reported in the event logs has the following format:
<datetime> [<string>] <severity> -- <message>
.
For example:
09:19:30 2005-04-24 [NDB] INFO -- Node 4 Start phase 4 completed
All reportable events are discussed in this section, ordered by category and severity level within each category.
CONNECTION Events
These are events associated with connections between Cluster nodes.
Event | Priority | Severity Level | Description |
DB nodes connected | 8 | INFO | Storage nodes connected |
DB nodes disconnected | 8 | INFO | Storage nodes disconnected |
Communication closed | 8 | INFO | SQL or storage node connection closed |
Communication opened | 8 | INFO | SQL or storage node connection opened |
CHECKPOINT Events
The logging messages shown below are associated with checkpoints.
(Note: GCP = Global Checkpoint, LCP = Local Checkpoint.)
Event | Priority | Severity Level | Description |
LCP stopped in calc keep GCI | 0 | ALERT | LCP stopped |
Local checkpoint fragment completed | 11 | INFO | LCP on a fragment has been completed |
Global checkpoint completed | 10 | INFO | GCP finished |
Global checkpoint started | 9 | INFO | Start of GCP: REDO log is written to disk |
Local checkpoint completed | 8 | INFO | LCP completed normally |
Local checkpoint started | 7 | INFO | Start of LCP: data written to disk |
Report undo log blocked | 7 | INFO | UNDO logging blocked; buffer near overflow |
STARTUP Events
The following events are generated in response to the startup of a node or of the cluster, and indicate its success or failure. They also provide information relating to the progress of the startup process, including information concerning logging activities.
Event | Priority | Severity Level | Description |
Internal start signal received STTORRY | 15 | INFO | Blocks received after completion of restart |
Undo records executed | 15 | INFO | |
New REDO log started | 10 | INFO | GCI keep X, newest restorable GCI Y |
New log started | 10 | INFO | Log part X, start MB Y, stop MB Z |
Node has been refused to be included into the cluster | 8 | INFO | Node cannot be included in cluster due to misconfiguration, inability to establish communication, or other problem |
DB node neighbours | 8 | INFO | Shows neighbouring storage nodes |
DB node start phase X completed | 4 | INFO | A storage node start phase has been completed |
Node has been successfully included into the cluster | 3 | INFO | Displays the node, managing node, and dynamic ID |
DB node start phases initiated | 1 | INFO | NDB Cluster nodes starting |
DB node all start phases completed | 1 | INFO | NDB Cluster nodes started |
DB node shutdown initiated | 1 | INFO | Shutdown of storage node has commenced |
DB node shutdown aborted | 1 | INFO | Unable to shut down storage node normally |
NODERESTART Events
The following events are generated when restarting a node, and relate to the success or failure of the node restart process.
Event | Priority | Severity Level | Description |
Node failure phase completed | 8 | ALERT | Reports completion of node failure phases |
Node has failed, node state was X | 8 | ALERT | Reports that a node has failed |
Report arbitrator results | 2 | ALERT | There are 8 different possible results for arbitration attempts: • Arbitration check failed - less than 1/2 nodes left • Arbitration check succeeded - node group majority • Arbitration check failed - missing node group • Network partitioning - arbitration required • Arbitration succeeded - affirmative response from node X • Arbitration failed - negative response from node X • Network partitioning - no arbitrator available • Network partitioning - no arbitrator configured |
Completed copying a fragment | 10 | INFO | |
Completed copying of dictionary information | 8 | INFO | |
Completed copying distribution information | 8 | INFO | |
Starting to copy fragments | 8 | INFO | |
Completed copying all fragments | 8 | INFO | |
GCP takeover started | 7 | INFO | |
GCP takeover completed | 7 | INFO | |
LCP takeover started | 7 | INFO | |
LCP takeover completed (state = X) | 7 | INFO | |
Report whether an arbitrator is found or not | 6 | INFO | There are 7 different possible outcomes when seeking an arbitrator: • Management server restarts arbitration thread [state=X] • Prepare arbitrator node X [ticket=Y] • Receive arbitrator node X [ticket=Y] • Started arbitrator node X [ticket=Y] • Lost arbitrator node X - process failure [state=Y] • Lost arbitrator node X - process exit [state=Y] • Lost arbitrator node X <error msg>[state=Y] |
STATISTICS Events
The following events are of a statistical nature. They provide information such as numbers of transactions and other operations, amount of data sent or received by individual nodes, and memory usage.
Event | Priority | Severity Level | Description |
Report job scheduling statistics | 9 | INFO | Mean internal job scheduling statistics |
Sent number of bytes | 9 | INFO | Mean number of bytes sent to node X |
Received # of bytes | 9 | INFO | Mean number of bytes received from node X |
Report transaction statistics | 8 | INFO | Number of: transactions, commits, reads, simple reads, writes, concurrent operations, attribute info, aborts |
Report operations | 8 | INFO | Number of operations |
Report table create | 7 | INFO | |
Memory usage | 5 | INFO | Data and index memory usage (80%, 90% and 100%) |
ERROR Events
These events relate to Cluster errors and warnings, and generally indicate that a major malfunction or failure has occurred.
Event | Priority | Severity | Description |
Dead due to missed heartbeat | 8 | ALERT | Node X declared dead due to missed heartbeat |
Transporter errors | 2 | ERROR | |
Transporter warnings | 8 | WARNING | |
Missed heartbeats | 8 | WARNING | Node X missed heartbeat #Y |
General warning events | 2 | WARNING |
INFO Events
These events provide general information about the state of the cluster and activities associated with Cluster maintenance, such as logging and heartbeat transmission.
Event | Priority | Severity | Description |
Sent heartbeat | 12 | INFO | Heartbeat sent to node X |
Create log bytes | 11 | INFO | Log part, log file, MB |
General info events | 2 | INFO |
Single user mode allows the database administrator to restrict access to the database system to a single MySQL server (SQL node). When entering single user mode all connections to all other MySQL servers are closed gracefully and all running transactions are aborted. No new transactions are allowed to be started.
Once the cluster has entered single user mode, only the designated SQL node is
granted access to the database. You can use the all status
command to
see when the cluster has entered single user mode.
Example:
ENTER SINGLE USER MODE 5
After this command has executed and the cluster has entered single user mode, the SQL node whose node ID is 5 becomes the cluster's only permitted user.
The node specified in the command above must be a MySQL Server node; an attempt to specify any other type of node will be rejected.
Note: When the above command is invoked, all transactions running on the designated node are aborted, the connection is closed, and the server must be restarted.
The command EXIT SINGLE USER MODE
changes the state of the cluster's
storage nodes from single user mode to normal mode. MySQL Servers waiting for a
connection (that is, for the cluster to become ready and available), are now
allowed to connect. The MySQL Server denoted as the single-user SQL node
continues to run (if it is connected), during and after the state change.
Example:
EXIT SINGLE USER MODE
The recommended way to handle a node failure when running in single user mode is to do either of the following:
Finish all single user mode transactions, issue the EXIT SINGLE USER MODE command, and then restart the cluster's storage nodes;
or
restart the storage nodes prior to entering single user mode.
This section describes how to create a backup and how to restore the database from a backup at a later time.
A backup is a snapshot of the database at a given time. The backup consists of three main parts:
Each of these is saved on all nodes participating in the backup. During backup each node saves these three parts into three files on disk:
In the table above, <BackupId> stands for the backup identifier and <NodeId> is the unique identifier for the node creating the file.
Before starting a backup, make sure that the cluster is properly configured for performing one. (See section 16.6.4.4 Configuration for Cluster Backup.)
To create a backup using the management server involves the following steps:
Start the management client (ndb_mgm).
Execute the command START BACKUP.
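For example, from the management client prompt:
ndb_mgm> START BACKUP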
To abort a backup already in progress, execute the command ABORT BACKUP <BackupId> in the management client.
Note: If there is no backup with ID <BackupId> running when it is aborted, the management server will not make any explicit response. However, there will be a line in the cluster log indicating that an invalid abort command was sent.
The cluster restoration program is implemented as a separate command-line utility,
ndb_restore
, which reads the files created by the backup and inserts
the stored information into the database. The restore program must be executed
once for each set of backup files, that is, as many times as there were database
nodes running when the backup was created.
The first time you run the restoration program, you also need to restore the
metadata; in other words, you must re-create the database tables. (Note that the
cluster should have an empty database when starting to restore a backup.) The
restore program acts as an API to the cluster and therefore requires a free
connection in order to connect to the cluster. This can be verified with the
ndb_mgm
command SHOW
. The switch -c
<connectstring>
may be used to locate the MGM node (see section 16.4.4.2 The MySQL Cluster connectstring
for information on connectstrings). The backup files must
be present in the directory given as an argument to the restoration program.
It is possible to restore a backup to a database with a different configuration
than it was created from. For example, suppose that a backup with backup ID
12, created in a cluster with two database nodes having the node IDs
2 and 3 is to be restored to a cluster with four nodes. Then
ndb_restore
must be run twice - once for each database node in the
cluster where the backup was taken.
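A sketch of what those two invocations might look like is shown below. The -n (node ID), -b (backup ID), -m (restore metadata), and -r (restore data) options and the backup directory argument reflect typical ndb_restore usage, but should be checked against the options supported by your version; the host, IDs, and path are taken from the example above or are hypothetical.
shell> ndb_restore -c ndb_mgmd.mysql.com:1186 -n 2 -b 12 -m -r /path/to/backup
shell> ndb_restore -c ndb_mgmd.mysql.com:1186 -n 3 -b 12 -r /path/to/backup
Note that the metadata is restored only with the first invocation.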
Note: For rapid restoration, the data may be restored in parallel, provided that there is a sufficient number of cluster connections available. However, the data files must always be applied before the logs.
There are four configuration parameters essential for backup:
BackupDataBufferSize
BackupLogBufferSize
BackupMemory
BackupWriteSize
More detailed information about these parameters can be found in section 16.4 MySQL Cluster Configuration.
If an error code is returned when issuing a backup request, the most likely cause is insufficient memory or insufficient disk space. You should check that there is enough memory allocated for the backup. Also check that there is enough space on the hard drive partition of the backup target.
Currently (as of MySQL 4.1.10a) NDB does not support repeatable reads, which can cause problems with the restore process. While the backup process is "hot", restoring a MySQL Cluster from backup is not a 100% "hot" process. This is due to the fact that, for the duration of the restore process, running transactions get non-repeatable reads from the restored data. This means that the state of the data is inconsistent while the restore is in progress.
Even before design of NDB Cluster began in 1996, it was evident that one of the major problems to be encountered in building parallel databases would be communication between the nodes in the network. For this reason, NDB Cluster was designed from the very beginning to allow for the use of different transporters.
Currently the code base includes 4 different transporters. Most users today employ TCP/IP over Ethernet since it is ubiquitous. This is also by far the best-tested transporter in MySQL Cluster.
We are working to make sure that communication with the ndbd
process
is made in as large chunks as possible since this will benefit all types of data
transmission.
For users who desire it, it is also possible to use cluster interconnects to enhance performance even further. There are two ways to achieve this: either a custom transporter can be designed to handle this case, or one can use socket implementations that bypass the TCP/IP stack to one extent or another. We have experimented with both of these techniques using the SCI (Scalable Coherent Interface) technology developed by Dolphin.
In this section we will show how one can use a cluster configured for normal TCP/IP communication to use SCI Sockets instead. This documentation is based on SCI Sockets version 2.3.0 as of 1 October 2004.
Any machines with which you wish to use SCI Sockets need to be equipped with SCI cards.
It is possible to use SCI Sockets with any version of MySQL Cluster. No special builds are needed since it uses normal socket calls which are already available in MySQL Cluster. However, SCI Sockets are currently supported only on Linux 2.4 and 2.6 kernels. SCI Transporters work on additional operating systems, although we have verified these only with Linux 2.4 to date.
There are essentially four requirements for SCI Sockets. First, it is necessary to build the SCI Socket libraries. Second, the SCI Socket kernel libraries need to be installed. Third, one or two configuration files need to be installed. Finally, the SCI Socket kernel library needs to be enabled either for the entire machine or for the shell where the MySQL Cluster processes are started. This process needs to be repeated for each machine in the cluster where you plan to use SCI Sockets for inter-node communication.
Two packages need to be retrieved to get SCI Sockets working. The first of these builds the libraries on which SCI Sockets are built; the second contains the SCI Socket libraries themselves. Currently, these are available only in source code format. The latest versions of these packages at the time of this writing were `DIS_GPL_2_5_0_SEP_10_2004.tar.gz' and `SCI_SOCKET_2_3_0_OKT_01_2004.tar.gz'. You should be able to find them (or possibly newer versions) at http://www.dolphinics.no/support/downloads.html.
Once you have obtained the library packages, the next step is to unpack them into appropriate directories, with the SCI Sockets library unpacked into a directory below the DIS code. Next, you need to build the libraries. This example shows the commands used on Linux/x86 to perform this task:
shell> tar xzf DIS_GPL_2_5_0_SEP_10_2004.tar.gz
shell> cd DIS_GPL_2_5_0_SEP_10_2004/src/
shell> tar xzf ../../SCI_SOCKET_2_3_0_OKT_01_2004.tar.gz
shell> cd ../adm/bin/Linux_pkgs
shell> ./make_PSB_66_release
It is possible to build these libraries for some 64-bit processors. To build the
libraries for Opteron CPUs using the 64-bit extensions, run
make_PSB_66_X86_64_release
rather than make_PSB_66_release
;
if the build is made on an Itanium machine, then you should use
make_PSB_66_IA64_release
. The X86-64 variant should work for Intel
EM64T architectures but this has not yet (to our knowledge) been tested.
Once the build process is complete, the compiled libraries will be found in a zipped tar file with a name along the lines of `DIS-<operating-system>-time-date'. It is now time to install the package in the proper place. In this example we will place the installation in `/opt/DIS'. (Note: You will most likely need to run the following as the root user.)
shell> cp DIS_Linux_2.4.20-8_181004.tar.gz /opt/
shell> cd /opt
shell> tar xzf DIS_Linux_2.4.20-8_181004.tar.gz
shell> mv DIS_Linux_2.4.20-8_181004 DIS
Now that all the libraries and binaries are in their proper place we need to ensure that the SCI cards have proper node IDs within the SCI address space.
It is also necessary to decide on the network structure before proceeding. There are three types of network structures which can be used in this context:
Each of these topologies has its own method for providing node IDs. We discuss each of them in brief.
A simple ring uses node IDs which are non-zero multiples of 4: 4, 8, 12,...
The next possibility uses SCI switches. An SCI switch has 8 ports, each of which can support a ring. It is necessary to make sure that different rings use different node ID spaces. In a typical configuration, the first port uses node IDs below 64 (4 - 60), the next 64 node IDs (68 - 124) are assigned to the next port, and so on, with node IDs 452 - 508 being assigned to the eighth port.
Two- and three-dimensional torus network structures take into account where each node is located in each dimension, incrementing by 4 for each node in the first dimension, by 64 in the second dimension, and (where applicable) by 1024 in the third dimension. See Dolphin's Web site for more thorough documentation.
In our testing we have used switches, although most large cluster installations use the 2-D/3-D torus structures. The advantage provided by switches is that, with dual SCI cards and dual switches, it is possible to build relatively easily a redundant network where the average failover time on the SCI network is on the order of 100 microseconds. This is supported by the SCI transporter in MySQL Cluster and is also under development for the SCI Socket implementation.
Failover for the 2-D/3-D torus is also possible but requires sending out new routing indexes to all nodes. However, this requires only 100 milliseconds or so to complete and should be acceptable for most high-availability cases.
By placing cluster storage nodes properly within the switched architecture, it is possible to use 2 switches to build a structure whereby 16 computers can be interconnected and no single failure can hinder more than one computer. With 32 computers and 2 switches it is possible to configure the cluster in such a manner that no single failure can cause the loss of more than two nodes; in this case it is also possible to know which pair of nodes will be affected. Thus, by placing the two in separate node groups, it is possible to build a safe MySQL Cluster installation.
To set the node ID for an SCI card use the following command in the
`/opt/DIS/sbin' directory. In this example -c 1
refers to the
number of the SCI card (this will always be 1 if there is only 1 card in the
machine), -a 0
refers to adapter 0; and 68
is the node ID:
shell> ./sciconfig -c 1 -a 0 -n 68
If you have multiple SCI cards in the same machine, you can determine which card has which slot by issuing the following command:
shell> ./sciconfig -c 1 -gsn
This will give you the SCI card's serial number. Then repeat this procedure for
-c 2
and so on, for each card in the machine. Once you have matched each
card with a slot, you can set node IDs for all cards.
After installing the necessary libraries and binaries, and setting the SCI node IDs, the next step is to set up the mapping from hostnames (or IP addresses) to SCI node IDs. This is done in the SCI sockets configuration file, which should be saved as `/etc/sci/scisock.conf'. In this file, the SCI node ID will be mapped to the hostname or IP address to communicate with, through the proper SCI card. Here is a very simple example of such a configuration file:
#host           #nodeId
alpha           8
beta            12
192.168.10.20   16
It is also possible to limit this configuration so that it applies only to a subset of the available ports for these hosts. An additional configuration file `/etc/sci/scisock_opt.conf' can be used to accomplish this, as shown here:
#-key                  -type   -values
EnablePortsByDefault   yes
EnablePort             tcp     2200
DisablePort            tcp     2201
EnablePortRange        tcp     2202 2219
DisablePortRange       tcp     2220 2231
With the configuration files in place, the drivers can be installed. First the low-level drivers and then the SCI socket driver need to be installed:
shell> cd DIS/sbin/
shell> ./drv-install add PSB66
shell> ./scisocket-install add
If desired, the installation can be checked by invoking a script which verifies that all nodes in the SCI socket configuration files are accessible:
shell> cd /opt/DIS/sbin/
shell> ./status.sh
If you discover an error and need to change the SCI socket configuration, it is
necessary to use the ksocketconfig program
to accomplish this task:
shell> cd /opt/DIS/util
shell> ./ksocketconfig -f
To make sure that SCI sockets are actually being used, you can use the test
program latency_bench
. Using this utility's server component, clients
can connect to the server to test the latency of the connection; determining
whether or not SCI is enabled should be fairly simple from observing the
latency. (Note: Before using latency_bench
, it is necessary
to set the LD_PRELOAD
environment variable as shown later in this
section.)
To set up a server, use the following:
shell> cd /opt/DIS/bin/socket
shell> ./latency_bench -server
To run a client, use latency_bench
again, except this time with the
-client
option:
shell> cd /opt/DIS/bin/socket
shell> ./latency_bench -client hostname_of_server
SCI socket configuration should now be complete and MySQL Cluster ready to use both SCI Sockets and the SCI transporter (see section 16.4.4.9 MySQL Cluster SCI Transport Connections).
The next step in the process is to start MySQL Cluster. To enable usage of SCI
Sockets it is necessary to set the environment variable LD_PRELOAD
before
starting ndbd
, mysqld
, and ndb_mgmd
. This variable
should point to the kernel library for SCI Sockets.
To start ndbd
in a bash shell, do the following:
bash-shell> export LD_PRELOAD=/opt/DIS/lib/libkscisock.so
bash-shell> ndbd
In a tcsh environment the same thing can be accomplished with:
tcsh-shell> setenv LD_PRELOAD /opt/DIS/lib/libkscisock.so
tcsh-shell> ndbd
Note: MySQL Cluster can use only the kernel variant of SCI Sockets.
The ndbd
process has a number of simple constructs which are used to
access the data in a MySQL Cluster. We have created a very simple benchmark to
check the performance of each of these and the effects which various
interconnects have on their performance.
There are four access methods:
ndbd
. Such requests also
benefit from batching.
When no index exists for a lookup, a full table scan is performed. This is sent as a single request to an ndbd process, which
then divides the table scan into a set of parallel scans on all cluster
ndbd
processes. In future versions of MySQL Cluster, an SQL node will
be able to filter some of these scans.
To check the base performance of these access methods we have developed a set of
benchmarks. One such benchmark, testReadPerf
, tests simple and batched
primary and unique key accesses. This benchmark also measures the setup cost of
range scans by issuing scans returning a single record. There is also a variant
of this benchmark which uses a range scan to fetch a batch of records.
In this way, we can test the cost of single key access and single record scan accesses, and measure the impact of the communication media used, on base access methods.
In our tests, we ran the base benchmarks for both a normal transporter using TCP/IP sockets and a similar setup using SCI sockets. The figures reported below are for small accesses of 20 records per access. The difference between serial and batched access decreases by a factor of 3 to 4 when using 2 kB records instead. SCI Sockets were not tested with 2 kB records. Tests were performed on a cluster with 2 storage nodes running on 2 dual-CPU machines equipped with AMD MP1900+ processors.
Access type:         TCP/IP sockets     SCI Socket
Serial pk access:    400 microseconds   160 microseconds
Batched pk access:   28 microseconds    22 microseconds
Serial uk access:    500 microseconds   250 microseconds
Batched uk access:   70 microseconds    36 microseconds
Indexed eq-bound:    1250 microseconds  750 microseconds
Index range:         24 microseconds    12 microseconds
We also performed another set of tests to check the performance of SCI Sockets vis-à-vis that of the SCI transporter, and both of these as compared with the TCP/IP transporter. All these tests used primary key accesses either serially and multi-threaded, or multi-threaded and batched.
The tests showed that SCI sockets were about 100% faster than TCP/IP. The SCI
transporter was faster in most cases compared to SCI sockets. One notable case
occurred with many threads in the test program, which showed that the SCI
transporter did not perform very well when used for the mysqld
process.
Our overall conclusion was that, for most benchmarks, using SCI sockets improves
performance by approximately 100% over TCP/IP, except in rare instances when
communication performance is not an issue. This can occur when scan filters
make up most of processing time or when very large batches of primary key
accesses are achieved. In that case the CPU processing in the ndbd
processes becomes a fairly large part of the overhead.
Using the SCI transporter instead of SCI Sockets is only of interest in
communicating between ndbd
processes. Using the SCI transporter is
also only of interest if a CPU can be dedicated to the ndbd
process
since the SCI transporter ensures that this process will never go to sleep. It
is also important to ensure that the ndbd
process priority is set in
such a way that the process does not lose priority due to running for an
extended period of time, as can be done by locking processes to CPU's in Linux
2.6. If such a configuration is possible, then the ndbd
process will
benefit by 10-70% compared to using SCI sockets. (The larger figures will be
seen when performing updates and probably on parallel scan operations as well.)
There are several other optimized socket implementations for computer clusters, including Myrinet, Gigabit Ethernet, Infiniband and the VIA interface. We have tested MySQL Cluster so far only with SCI sockets. We also include documentation above on how to set up SCI sockets using ordinary TCP/IP for MySQL Cluster.
In this section, we provide a listing of known limitations in MySQL Cluster release 4.1 when compared to features available when using the MyISAM and InnoDB storage engines. Currently there are no plans to address these in coming releases of 4.1; however, we will supply fixes for these issues in MySQL 5.0 and subsequent releases. If you check the Cluster category in the MySQL bugs database at http://bugs.mysql.com, you can find known bugs which (if marked 4.1) we intend to correct in upcoming releases of MySQL 4.1.
DataMemory
and
IndexMemory
, respectively).
MaxNoOfConcurrentOperations
. Note
that bulk loading, TRUNCATE TABLE
, and ALTER TABLE
are handled
as special cases by running multiple transactions, and so are not subject to
this limitation.
MaxNoOfOrderedIndexes
.
Because Cluster table rows are of fixed width, for a table with VARCHAR
fields
containing only relatively small values, more memory and disk space will be
required when using the NDB storage engine than would be for the same table
and data using the MyISAM engine. We are working to rectify this issue in
MySQL 5.1.
The Records in range
statistic is not supported, resulting in
non-optimal query plans in some cases. Employ USE INDEX
or FORCE
INDEX
as a workaround.
A unique hash index created with USING HASH
cannot be used for
accessing a table if NULL
is given as part of the key.
The only supported transaction isolation level is READ_COMMITTED
. (InnoDB
supports READ_COMMITTED
, REPEATABLE_READ
, and
SERIALIZABLE
.) See MySQL Cluster Backup Troubleshooting for
information on how this can affect backup/restore of Cluster databases.
ALTER TABLE
is not fully locking when running multiple MySQL
servers (no distributed table lock).
CREATE DATABASE db_name;
statement on each additional MySQL
server that accesses the same MySQL Cluster. (As of MySQL 5.0.2 you may also use
CREATE SCHEMA db_name;
.) Once this has been done for a given
MySQL server, that server should be able to detect the database tables without
error.
mysql
or other clients that may be accessing the cluster's
SQL nodes.
ALTER TABLE
or CREATE INDEX
. (However,
you can import or create a table that uses a different storage engine, then
convert it to NDB using ALTER TABLE tbl_name ENGINE=NDBCLUSTER;
.)
This listing is intended to be complete with respect to the conditions set forth at the beginning of this section. You can report any discrepancies that you encounter to the MySQL bugs database at http://bugs.mysql.com/. If we do not plan to fix the problem in MySQL 4.1, we will add it to the list above.
ndb_mgmd
; the management client used to control MySQL
Cluster via the MGM node is ndb_mgm
.
ndbd
.
mysqld
) started with the --ndbcluster
option.
SHOW ERRORS
or SHOW
WARNINGS
immediately upon being notified of the error or warning condition.
These can also be displayed in MySQL Query Browser.
perror --ndb error-code
.
SHOW VARIABLES LIKE 'have_%';
or SHOW ENGINES;
. (See
section 5.1.2 The mysqld-max
Extended MySQL Server for more information.)
Linux users, please note that NDB is not included in the RPMs; you
should use the binaries supplied as .tar.gz
archives in the
MySQL Downloads area instead. You can
also obtain NDB support by compiling the -max
binaries from source, but
it is not necessary to do so simply to use MySQL Cluster.
(SizeofDatabase * NumberOfReplicas * 1.1) / NumberOfStorageNodes
To calculate the memory requirements more exactly requires determining, for each table in the cluster database, the storage space required per row (see section 11.5 Column Type Storage Requirements for details), and multiplying this by the number of rows. You must also remember to account for any column indexes as follows:
Each primary key or hash index created for an NDB table requires storage for the hash index on each record; this storage is allocated from IndexMemory.
Each ordered index requires additional storage per record, allocated from DataMemory.
An ordered index is created by default for each primary key and unique index, unless the index is defined USING HASH. In other words, if created without USING HASH, a primary key or unique index on a Cluster table
will take up 35 bytes (plus the size of the key) per record in MySQL 4.1, and
31-35 bytes per record in MySQL 5.0.
Note that creating MySQL Cluster tables with USING HASH
for all primary
keys and unique indexes will generally cause table updates to run more
quickly. This is due to the fact that less memory is required (since no
ordered indexes are created), and that less CPU must be utilised (since fewer
indexes must be read and possibly updated).
USING HASH
.
We often see questions from users who report that, when they're trying to
populate a Cluster database, the loading process terminates prematurely and an
error message like this one is observed:
ERROR 1114: The table 'my_cluster_table' is full
When this occurs, the cause is very likely to be that your setup does not provide sufficient RAM for all table data and all indexes, including the primary key required by NDB. It is also worth noting that all storage nodes should have the same amount of RAM, as no storage node in a cluster can use more memory than the least amount available to any individual storage node. In other words, if there are three computers hosting Cluster storage nodes, with two of these having three gigabytes of RAM available to store Cluster data, and one having only one GB RAM, then each storage node can devote only one GB for Cluster.
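As a quick worked example of the rough formula given earlier (the figures are purely illustrative): with a 2GB database, two replicas, and four storage nodes, each storage node would need roughly
(2GB * 2 * 1.1) / 4 = 1.1GB
of RAM available to the NDB storage engine, before accounting for indexes and per-record overhead.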
FULLTEXT
indexing is not currently (MySQL 4.1.9) supported by the NDB
storage engine. We are working to add this capability in a future release.
ndb_mgmd
and ndb_mgm
processes.
--initial
startup option
FULLTEXT
indexes and prefix indexes are not supported. Only complete
columns may be indexed.
ENGINE=NDB
or ENGINE=NDBCLUSTER
.
ArbitrationRank
configuration parameter for MySQL Server and management
server nodes. (See section 16.4.4.4 Defining the MySQL Cluster Management Server for details.) It should
also be noted that the role of arbitrator does not in and of itself impose any
heavy demands upon the host so designated, and thus the arbitrator host does not
need to be particularly fast or to have extra memory especially for this
purpose.
ENGINE=NDBCLUSTER
) have only fixed-width
rows. This means that (for example) each record containing a VARCHAR(255)
column will require 256 bytes of storage for that column, regardless of the
size of the data stored therein. This issue is expected to be fixed in MySQL
5.1.
See section 16.8 Cluster Limitations in MySQL 4.1 for more information about these
issues.
ndb_mgmd
command.
ndbd
command.
mysqld_safe --user=mysql &
.
ndb_mgm
on the machine housing the MGM node.
shell> ndb_mgm -e shutdown
This will cause the ndb_mgm, ndb_mgmd, and any ndbd
processes to terminate gracefully. MySQL servers running as
Cluster SQL nodes can be stopped using mysqladmin shutdown
.
For more information, see section 16.6.1 Commands in the Management Client and
section 16.3.6 Safe Shutdown and Restart.
mysqld
must use a different
TCP/IP port.
The following terms are useful to an understanding of MySQL Cluster or have specialised meanings when used in relation to it.
shmop(2)
is a good place to begin obtaining additional information
about this topic.
ndbd
: The data node daemon (runs a data node process)
ndb_mgmd
: The management server daemon (runs a management
server process)
ndb_mgm
: The management client (provides an interface for
executing management commands)
ndb_waiter
: Used to verify status of all nodes in a cluster
ndb_restore
: Restores cluster data from backup