MySQL - MySQL Basic, Expression & Operators, Working with Tables

MySQL Basic

MySQL Data types

MySQL knows about several data types-that is, general categories in which values can be represented. These include numbers, string values, temporal values such as dates and times, and the NULL value.

Numeric Values

Numbers are values such as 48 or 193.62. MySQL understands numbers specified as integers (with no fractional part) or floating-point values (with a fractional part). Integers can be specified in decimal or hexadecimal format.

An integer consists of a sequence of digits with no decimal point. In numeric contexts, an integer can be specified as a hexadecimal constant and is treated as a 64-bit integer. The syntax for specifying hexadecimal values is given in the next section, "String (Character) Values," because they are treated as strings by default.

A floating-point number consists of a sequence of digits, a decimal point, and another sequence of digits. One sequence of digits or the other can be empty, but not both.

MySQL understands scientific notation. This is indicated by immediately following an integer or floating-point number with 'e' or 'E', a sign character ('+' or '-'), and an integer exponent. 1.34E+12 and 43.27e-1 are numbers in legal scientific notation. The number 1.34E12 is also legal even though it is missing a sign character before the exponent, but only as of MySQL 3.23.26. Prior to that version, a sign character is required.

Hexadecimal numbers cannot be used in scientific notation: The 'e' that begins the exponent part is also a legal hex digit and thus would be ambiguous. Any number can be preceded by a minus sign ('-') to indicate a negative value.

String (Character) Values

Strings are values, such as 'Madison, Wisconsin', or 'patient shows improvement'. You can use either single or double quotes to surround a string value. The ANSI SQL standard specifies single quotes, so statements written using them are more portable to other database engines.

Several escape sequences are recognized within strings and can be used to indicate special characters. Each sequence begins with a backslash character ('\') to signify a temporary escape from the usual rules for character interpretation. Note that a NUL byte is not the same as the NULL value; NUL is a zero-valued byte, whereas NULL is the absence of a value.

Sequence	Meaning
\0	NUL (ASCII 0)
\'	Single quote
>\"	Double quote
\b	Backspace
\n	Newline (linefeed)
\r	Carriage return
\t	Tab
\\	Backslash
\Z	Ctrl-Z (Windows EOF character)

To include either kind of quote character within a string, you can do one of three things:

. Double the quote character if the string is quoted using the same character:
. 'I can''t'
. "He said, ""I told you so."""
. Quote the string with the other quote character; in this case, you do not double the quote characters within the string:
. "I can't"
. 'He said, "I told you so."'

. Escape the quote character with a backslash; this works regardless of the quote characters used to quote the string:
. 'I can\'t'
. "I can\'t"
. "He said, \"I told you so.\""
. 'He said, \"I told you so.\"'

Hexadecimal constants can be used to specify string values. There are two different syntaxes for such constants. The first consists of '0x' followed by one or more hexadecimal digits ('0' through '9' and 'a' through 'f'). For example, 0x0a is 10 decimal, and 0xffff is 65535 decimal. Non-decimal hex digits can be specified in uppercase or lowercase, but the leading '0x' cannot be given as '0X'.

That is, 0x0a and 0x0A are legal, but 0X0a and 0X0A are not. In string context, pairs of hexadecimal digits are interpreted as ASCII codes, converted to characters, and the result is used as a string. In numeric context, a hexadecimal constant is treated as a number. The following statement illustrates both uses:

mysql> SELECT 0x616263, 0x616263+0;

+----------+------------+
      | 0x616263 | 0x616263+0 |
+----------+------------+
| abc      |    6382179 |
+----------+------------+

As of MySQL 4.0, string values can also be specified using the ANSI SQL notation X'val', where val consists of pairs of hexadecimal digits. As with 0x notation, such values are interpreted as strings but can be used as numbers in a numeric context:

mysql> SELECT X'616263', X'616263'+0;

  +-----------+-------------+
  | X'616263' | X'616263'+0 |
+-----------+-------------+
| abc       |     6382179 |
+-----------+-------------+

Unlike 0x notation, the leading 'X' is not case sensitive:

mysql> SELECT X'61', x'61';

+-------+-------+
| X'61' | x'61' |
+-------+-------+
| a     | a     |
+-------+-------+

From MySQL 4.1 and later, string values can be specified to lie within a particular character set. Before that, string values are interpreted using the server's default character set.

Date and Time (Temporal) Values

Dates and times are values such as '2002-06-17' or '12:30:43'. MySQL also understands combined date/time values, such as '2002-06-17 12:30:43'. Take special note of the fact that MySQL represents dates in year-month-day order.

This often surprises newcomers to MySQL, although this format is the ANSI SQL standard (also known as ISO 8601 format). You can display date values any way you want by using the DATE_FORMAT() function, but the default display format lists the year first, and input values must be specified with the year first.

The NULL Value

NULL is something of a "typeless" value. Generally, it's used to mean "no value," "unknown value," "missing value," "out of range," "not applicable," "none of the above," and so on. You can insert NULL values into tables, retrieve them from tables, and test whether a value is NULL. However, you cannot perform arithmetic on NULL values; if you try, the result is NULL.

MySQL Column type

Each table in a database is made up of one or more columns. When you create a table using a CREATE TABLE statement, you specify a type for each column.
A column type is more specific than a data type, which is just a general category, such as "number" or "string."

A column type precisely characterizes the kind of values a given table column can contain, such as SMALLINT or VARCHAR(32).
MySQL's column types are the means by which you describe what kinds of values a table's columns contain, which in turn determines how MySQL treats those values.

For example, if you have numeric values, you can store them using a numeric or a string column type, but MySQL will treat the values somewhat differently depending on how you store them.
Each column type has several characteristics:

. What kind of values you can store in it
. How much space values take up, and whether the values are fixed-length (all values of the type taking the same amount of space) or variable-length (the amount of space depending on the particular value being stored)

. How values of the type are compared and sorted
. Whether the type allows NULL values
. Whether the type can be indexed

The following discussion surveys MySQL's column types briefly in a broad overview and then describes in more detail the properties that characterize each type.

Overview of Column Types

MySQL provides column types for values from all the general data type categories except the NULL value. NULL spans all types in the sense that the property of whether a column can contain NULL values is treated as a type attribute.

MySQL has numeric column types for both integer and floating-point values. Integer columns can be signed or unsigned. A special attribute allows sequential integer column values to be generated automatically, which is useful in applications that require a series of unique identification numbers.

Numeric Column Types
Type Name	Meaning
TINYINT	A very small integer
SMALLINT	A small integer
MEDIUMINT	A medium-sized integer
INT	A standard integer
BIGINT	A large integer
FLOAT	A single-precision floating-point number
DOUBLE	A double-precision floating-point number
DECIMAL	A floating-point number, represented as a string

MySQL string column types are shown in a table below. Strings can hold anything, even arbitrary binary data such as images or sounds. Strings can be compared according to whether or not they are case sensitive. In addition, you can perform pattern matching on strings. (Actually, in MySQL you can perform pattern matching on any column type, but it's most often done with string types.)

String Column Types
Type Name	Meaning
CHAR	A fixed-length character string
VARCHAR	A variable-length character string
TINYBLOB	A very small BLOB (binary large object)
BLOB	A small BLOB
MEDIUMBLOB	A medium-sized BLOB
LONGBLOB	A large BLOB
TINYTEXT	A very small text string
TEXT	A small text string
MEDIUMTEXT	A medium-sized text string
LONGTEXT	A large text string
ENUM	An enumeration; column values may be assigned one enumeration member
SET	A set; column values may be assigned multiple set members

MySQL date and time types are shown below, where CC, YY, MM, DD hh, mm, and ss represent century, year, month, day, hour, minute, and second. For temporal values, MySQL provides types for dates and times (either combined or separate) and timestamps (a special type that allows you to track when changes were last made to a record). There is also a type for efficiently representing year values when you don't need an entire date.

Date and Time Column Types
Type Name	Meaning
DATE	A date value, in 'CCYY-MM-DD' format
TIME	A time value, in 'hh:mm:ss' format
DATETIME	A date and time value, in 'CCYY-MM-DD hh:mm:ss' format
TIMESTAMP	A timestamp value, in CCYYMMDDhhmmss format
YEAR	A year value, in CCYY format

Creating Tables

To create a table, issue a CREATE TABLE statement and specify a list of the columns that make up the table. Each column has a name and a type, and various attributes can be associated with each type. The following example creates a table named my_test_tbl containing three columns named f, c, and i:

CREATE TABLE my_test_tbl
(
name VARCHAR(25) NOT NULL DEFAULT 'none'
);

The syntax for declaring a column is as follows:

col_name col_type [col_attributes] [general_attributes]

Numeric Column Types

MySQL's numeric column types fall into two general classifications:

. Integer types are used for numbers that have no fractional part, such as 1, 43, -3, 0, or -798432. You can use integer columns for data represented by whole numbers, such as weight to the nearest pound, height to the nearest inch, number of stars in a galaxy, number of people in a household, or number of bacteria in a petri dish.

. Floating-point types are used for numbers that may have a fractional part, such as 3.14159, -.00273, -4.78, or 39.3E+4. You can use floating-point column types for values that may have a fractional part or that are extremely large or small. Some types of data you might represent as floating-point values are average crop yield, distances, money values, unemployment rates, or stock prices.

Integer types are the simplest. Floating-point types are more complex, particularly because their behavior has changed at certain points in MySQL's development.

Floating-point values can be assigned to integer columns but will be rounded to the nearest integer. Conversely, integer values can be assigned to floating-point columns. They are treated as floating-point values with a fractional part of zero.

Type Specification	Range
TINYINT[(M)]	Signed values: –128 to 127 (–27 to 27 – 1)
	Unsigned values: 0 to 255 (0 to 28 – 1)
SMALLINT[(M)]	Signed values: –32768 to 32767 (–215 to 215 – 1)
	Unsigned values: 0 to 65535 (0 to 216 – 1)
MEDIUMINT[(M)]	Signed values: –8388608 to 8388607 (–223 to 223 – 1)
	Unsigned values: 0 to 16777215 (0 to 224 – 1)
INT[(M)]	Signed values: –2147683648 to 2147483647 (–231 to 231 – 1)
	Unsigned values: 0 to 4294967295 (0 to 232 – 1)
BIGINT[(M)]	Signed values: –9223372036854775808 to 9223372036854775807 (–263 to 263 – 1)
	Unsigned values: 0 to 18446744073709551615 (0 to 264 – 1)
FLOAT[(M,D)]	Minimum non-zero values: ±1.175494351E–38
	Maximum non-zero values: ±3.402823466E+38
DOUBLE[(M,D)]	Minimum non-zero values: ±2.2250738585072014E–308
	Maximum non-zero values: ±1.7976931348623157E+308
DECIMAL([M[,D]])	Varies; range depends on M and D

Integer Column Types

Type Specification	Storage Required
TINYINT[(M)]	1 byte
SMALLINT[(M)]	2 bytes
MEDIUMINT[(M)]	3 bytes
INT[(M)]	4 bytes
BIGINT[(M)]	8 bytes
FLOAT[(M,D)]	4 bytes
DOUBLE[(M,D)]	8 bytes
DECIMAL([M[,D]])	M bytes (MySQL < 3.23), M+2 bytes (MySQL 3.23)

MySQL provides five integer types: TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT. INTEGER is a synonym for INT. These types vary in the range of values they can represent and in the amount of storage space they require. (Types with a larger range require more storage.) Integer columns can be declared as UNSIGNED to disallow negative values; this shifts the range for the column upward to begin at 0.

When you declare an integer column, you can specify an optional display size M. If given, M should be an integer from 1 to 255. It represents the number of characters used to display values for the column. For example, MEDIUMINT(4) specifies a MEDIUMINT column with a display width of 4.

If you declare an integer column without an explicit width, a default width is assigned. The defaults are the lengths of the "longest" values for each type. Note that displayed values are not chopped to fit within M characters. If the printable representation of a particular value requires more than M characters, MySQL displays the full value.

The display size M for an integer column is related only to the number of characters used to display column values. It has nothing to do with the number of bytes of storage space required. For example, BIGINT values require 8 bytes of storage regardless of the display width.

It is not possible to magically cut the required storage space for a BIGINT column in half by declaring it as BIGINT(4). Nor does M have anything to do with the range of values allowed. If you declare a column as INT(3), that will not restrict it to a maximum value of 999.

The following statement creates a table to illustrate the default values of M and D for integer column types:

create TABLE my_test_tbl2
(
i tinyint,
i_u tinyint unsigned,
i_s smallint,
i_s_u smallint unsigned,
i_m mediumint,
i_m_u mediumint unsigned,
i_r int,
i_r_u int unsigned,
i_b bigint,
i_b_u bigint unsigned
);

If you issue a DESCRIBE my_test_tbl2 statement after creating the table, the number following each type name shows the value that MySQL uses by default in the absence of an explicit display width specifier:

Floating-Point Column Types

MySQL provides three floating-point types: FLOAT, DOUBLE, and DECIMAL. Synonymous types are DOUBLE PRECISION and REAL for DOUBLE, and NUMERIC for DECIMAL. Ranges for these types differ from ranges for integer types in the sense that there is not only a maximum value a floating-point type can represent, but also a minimum non-zero value.

The minimum values provide a measure of how precise the type is, which is often important for recording scientific data. (There are, of course, corresponding negative maximum and minimum values.)

Floating-point types can be declared as UNSIGNED, although not until MySQL 4.0.2 for FLOAT and DOUBLE. Unlike the integer types, declaring a floating-point type UNSIGNED doesn't shift the type's range upward, it merely eliminates the negative end.

For each floating-point type, you can specify a maximum display size M and the number of decimal places D. The value of M should be from 1 to 255. The value of D can be from 0 to 30, but should be no more than M-2. (If you're more familiar with ODBC terms, M and D correspond to the ODBC concepts of "precision" and "scale.")

For FLOAT and DOUBLE, M and D are optional. If they are omitted, these types are treated as follows:

. Prior to MySQL 3.23.6, FLOAT and DOUBLE are treated as FLOAT(10,2) and DOUBLE(16,4) with stored values rounded to 2 and 4 decimals, respectively.
. For MySQL 3.23.6 and later, FLOAT and DOUBLE are stored to the full precision allowed by your hardware.

For DECIMAL, M and D may or may not be optional, depending on your version of MySQL:

. Prior to MySQL 3.23.6, M and D are required for DECIMAL columns.
. For MySQL 3.23.6 and later, if D is omitted, it defaults to 0. If M is omitted as well, it defaults to 10. In other words, the following equivalences hold:
. DECIMAL = DECIMAL(10) = DECIMAL(10,0)
. DECIMAL(n) = DECIMAL(n,0)

FLOAT(p) syntax is also allowed for ODBC compatibility. However, the precise behavior of columns specified using this syntax is somewhat complicated:

. Prior to MySQL 3.23, the allowable values of p are 4 and 8, indicating the number of bytes of storage per value. FLOAT(4) and FLOAT(8) are treated as FLOAT(10,2) and DOUBLE(16,4) with stored values rounded to 2 and 4 decimals, respectively.

. For MySQL 3.23.0 to 3.23.5, the allowable values of p are still 4 and 8 and indicate the number of bytes of storage, but FLOAT(4) and FLOAT(8) are treated as single-precision and double-precision columns with values stored to full hardware precision.

. For MySQL 3.23.6 and later, p can range from 0 to 53 and indicates the minimum number of bits of precision required for stored values. For p values from 0 to 24, the column is treated as single-precision. For values from 25 to 53, the column is treated as double-precision.

More confusing still is that MySQL allows FLOAT4 and FLOAT8 as synonyms, but what they are synonyms for depends on your version of MySQL:

. Prior to MySQL 3.23.6, FLOAT4 and FLOAT8 are equivalent to FLOAT(10,2) and DOUBLE(16,4).
. For MySQL 3.23.6 and later, FLOAT4 and FLOAT8 are equivalent to FLOAT and DOUBLE.

If you carefully compare these equivalences to those for FLOAT(4) and FLOAT(8), you'll see that FLOAT4 and FLOAT8 are not quite the same as FLOAT(4) and FLOAT(8), although you might have expected them to be.

Checking How MySQL Treats a Type Specification

If you're not sure how your version of MySQL will treat a given floating-point column specification, try the following. Create a table that contains a column defined the way you're wondering about and then use DESCRIBE to see how MySQL reports the type. For example, in MySQL 3.23.0, if you create a column using FLOAT4, you'd see the following:

mysql> CREATE TABLE test (flt FLOAT4);
mysql> DESCRIBE test;

 +-------+-------------+------+-----+---------+-------+
 | Field | Type| Null | Key | Default | Extra     |
 +-------+-------------+------+-----+---------+-------+
  | flt | float(10,2) | YES | | NULL | |
 +-------+-------------+------+-----+---------+-------+

In MySQL 3.23.6 and later, you'd see the following instead:

mysql> CREATE TABLE tbl (flt FLOAT4);
mysql> DESCRIBE tbl;

+-------+-------+------+-----+---------+-------+
| Field | Type  | Null | Key | Default | Extra |
+-------+-------+------+-----+---------+-------+
flt     | float | YES  |     | NULL    |       |
+-------+-------+------+-----+---------+-------+

The lack of a (M,D) indicator in the latter case indicates that values are stored to the full precision allowed by the hardware. This technique actually works to see how MySQL treats any column definition, but I have found it most useful for floating-point types.

Numeric Column Type Attributes

The ZEROFILL attribute can be specified for all numeric types. It causes displayed values for the column to be padded with leading zeros to the display width. You can use ZEROFILL when you want to make sure column values always display using a given number of digits.

Actually, it's more accurate to say "a given minimum number of digits" because values wider than the display width are displayed in full without being chopped. You can see this by issuing the following statements:

mysql> DROP TABLE IF EXISTS my_test_tbl;
mysql> CREATE TABLE my_test_tbl (my_zerofill INT(5) ZEROFILL);
mysql> INSERT INTO my_test_tbl VALUES(1),(100),(10000),(1000000);
mysql> SELECT my_zerofill FROM my_test_tbl;

+-------------+
| my_zerofill |
+-------------+
|       00001 |
|       00100 |
|       10000 |
|     1000000 |
+-------------+

Note that the final value, which is wider than the column's display width, is displayed in full.

The UNSIGNED attribute disallows negative values. It is most often used with integer types. Making an integer column UNSIGNED doesn't change the "size" of the underlying data type's range; it just shifts the range upward. Consider this table specification:

CREATE TABLE my_test_tbl
(
itiny TINYINT,
itiny_u TINYINT UNSIGNED
);

itiny and itiny_u are both TINYINT columns with a range of 256 values but differ in the particular allowable values. The range of itiny is -128 to 127, whereas the range of itiny_u is shifted up, resulting in a range of 0 to 255.

UNSIGNED is useful for integer columns into which you plan to store information that doesn't take on negative values, such as population counts or attendance figures. If you use a signed column for such values, you use only half of the column type's range.

By making the column UNSIGNED, you effectively double your range. If you use the column for sequence numbers, it will take twice as long to run out of values if you make it UNSIGNED.

You can also specify UNSIGNED for floating-point columns, although the effect is slightly different than for integer columns. The range does not shift upward; instead, the upper end remains unchanged and the lower end becomes zero.

A precaution to observe is that you should not use UNSIGNED with FLOAT or DOUBLE columns prior to MySQL 4.0.2. In earlier versions, MySQL allows these types to be declared as UNSIGNED, but doing so may result in unpredictable column behavior. (This prohibition does not apply to DECIMAL.)

One other attribute, AUTO_INCREMENT, can be specified for integer column types only. Use the AUTO_INCREMENT attribute when you want to generate unique identifiers or values in a series. When you insert NULL into an AUTO_INCREMENT column, MySQL generates the next sequence value and stores it in the column.

Normally, unless you take steps to cause otherwise, AUTO_INCREMENT values begin at 1 and increase by 1 for each new row. The sequence can be affected if you delete rows from the table. This depends on the table type, which determines whether or not sequence values are reused.

You can have at most one AUTO_INCREMENT column in a table. The column should be NOT NULL, and it should be declared as a PRIMARY KEY or as a UNIQUE key. Also, because sequence values are always positive, you normally declare the column UNSIGNED as well. For example, you can declare an AUTO_INCREMENT column in any of the following ways:

CREATE TABLE ai1 (i INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY);
CREATE TABLE ai2 (i INT UNSIGNED AUTO_INCREMENT NOT NULL, PRIMARY KEY (i));
CREATE TABLE ai3 (i INT UNSIGNED AUTO_INCREMEN
T NOT NULL, UNIQUE (i));

It is always allowable to declare an AUTO_INCREMENT column explicitly NOT NULL, as shown. However, for versions 3.23 and later, MySQL treats AUTO_INCREMENT columns as NOT NULL automatically.

Following the attributes just described, which are specific to numeric columns, you can also specify the general attributes NULL or NOT NULL. If you do not specify NULL or NOT NULL for a numeric column, the default is NULL. You can also specify a default value using the DEFAULT attribute. If you do not specify a default value, one is chosen automatically. For all numeric column types, the default is NULL for columns that may contain NULL, and 0 otherwise.

The following table contains three INT columns, having default values of -1, 1, and NULL:

CREATE TABLE t
(
i1 INT DEFAULT -1,
i2 INT DEFAULT 1,
i3 INT DEFAULT NULL
);

String Column Types

MySQL provides several string types to hold character data. Strings are often used for values like the following:

'N. Bertram, et al.'
'Pencils (no. 2 lead)'
'123 Elm St.'
'Monograph Series IX'

But strings are actually "generic" types in a sense because you can use them to represent any value. For example, you can use string types to hold binary data, such as images or sounds, or output from gzip, should you want to store compressed data.

Table below shows the types provided by MySQL for declaring string-valued columns and the maximum size and storage requirements of each type. For variable-length column types, the amount of storage taken by a value varies from row to row and depends on the length of the values actually stored in the column. This length is represented by L in the table.

The extra bytes required in addition to L are the number of bytes needed to store the length of the value. MySQL handles variable-

length values by storing both the content of the value and its length. These extra bytes are treated as an unsigned integer. Notice the correspondence between a variable-length type's maximum length, the number of extra bytes required for that type, and the range of the unsigned integer type that uses the same number of bytes.

For example, MEDIUMBLOB values can be up to 224-1 bytes long and require 3 bytes to record the result. The 3-byte integer type MEDIUMINT has a maximum unsigned value of 224-1. That's not a coincidence.

String Column Types
Type Specification	Maximum Size	Storage Required
CHAR[(M)]	M bytes	M bytes
VARCHAR(M)	M bytes	L+1 bytes
TINYBLOB, TINYTEXT	28 – 1 bytes	L+1 bytes
BLOB, TEXT	216 – 1 bytes	L+2 bytes
MEDIUMBLOB, MEDIUMTEXT	224 – 1 bytes	L+3 bytes
LONGBLOB, LONGTEXT	232 – 1 bytes	L+4 bytes
ENUM('value1','value2',...)	65535 members	1 or 2 bytes
SET('value1','value2',...)	64 members	1, 2, 3, 4, or 8 bytes

For ENUM and SET, the column definition includes a list of legal values. Attempting to store a value other than those causes the value to be converted to '' (the empty string). For the other string types, values that are too long are chopped to fit. But string types range from very small to very large, with the largest type able to hold nearly 4GB of data, so you should be able to find something long enough to avoid truncation of your information.

The effective maximum column size is actually imposed by the maximum packet size of the client/server communication protocol. This value is 16MB prior to MySQL 4, and 1GB for MySQL 4 and later.

ENUM and SET values are stored internally as numbers. Values for the other string types are stored as a sequence of bytes and treated either as bytes or characters, depending on whether the type holds binary or non-binary strings:

. A binary string is treated as a generic sequence of bytes, without respect to any character set. BLOB columns hold binary values, as do CHAR and VARCHAR columns if they are declared with the BINARY attribute.

. A non-binary string is treated as a sequence of characters and interpreted with respect to the properties of a particular character set. TEXT columns hold non-binary strings, as do CHAR and VARCHAR columns if they are declared without the BINARY attribute. For a single-byte character set, each character takes one byte.

For multi-byte character sets, characters can take more than one byte. In MySQL 4.1 and later, columns can be assigned character sets individually. Prior to MySQL 4.1, the server's default character set is used to interpret character strings.

Use of a character set causes non-binary strings to be compared and sorted using the character set's collating sequence. By contrast, a binary string has no character set and thus no collating sequence. This results in some differences in the way binary and non-binary strings are interpreted:

. Binary strings are processed byte-by-byte in comparisons based only on the underlying numeric value of each byte. One implication of this property is that binary values are case sensitive, because the lowercase and uppercase versions of a given letter have different numeric codes.

. Non-binary strings are processed character-by-character in comparisons using the character set collating sequence. For most character sets, uppercase and lowercase versions of a given letter have the same collating value, which means that non-binary string comparisons are not case sensitive. Similar characters with different accents also may have the same collating value. For example, 'E' and 'É' compare as the same character in the latin1 character set.

There are a few character sets that do treat uppercase and lowercase as having different collating values and that distinguish between accent marks: cp1521csas, cp1527ltlvcsas, latin1csas, maccecsas, and macromancsas. Note that these character set names each end with csas, which means "case sensitive, accent sensitive."

." They're something of a special case, so although elsewhere in this book I discuss non-binary strings as not case sensitive, keep in mind that these character sets exist as exceptions to the rule.

The distinction between characters and bytes can be seen easily by considering the length of a string containing multi-byte characters. For example, in MySQL 4.1 and later, you can use the CONVERT() function to generate a string in any available character set. The following statement creates @s as a string using ucs2, a character set that uses two bytes to encode each character:

mysql> SET @s = CONVERT('ABC' USING ucs2);

What is the "length" of the string @s? It depends. If you measure with CHAR_LENGTH(), which is multi-byte aware, you get the length in characters. If you measure with LENGTH(), which is not multi-byte aware, you get the length in bytes:

mysql> SELECT CHAR_LENGTH(@s), LENGTH(@s);

+-----------------+------------+
| CHAR_LENGTH(@s) | LENGTH(@s) |
+-----------------+------------+
|               3 |          6 |
+-----------------+------------+

A binary string has no character set and is treated simply as a sequence of individual bytes. Consequently, the length of the string is the same whether measured in characters or bytes:

mysql> SET @s = BINARY CONVERT('ABC' USING ucs2);
mysql> SELECT CHAR_LENGTH(BINARY @s),
LENGTH(BINARY @s);

+------------------------+-------------------+
| CHAR_LENGTH(BINARY @s) |  LENGTH(BINARY @s)|
+------------------------+-------------------+
|                      6 |                 6 |
+------------------------+-------------------+

The difference between lengths in characters and in bytes is significant for interpreting the meaning of string column types. For example, a column declaration of VARCHAR(20) doesn't really mean "20 characters maximum," it means "as many characters as will fit in 20 bytes." For single-byte character sets, the two are the same because the number of characters is the same as the number of bytes. But, for a multi-byte character set, the number of characters can be many less than 20.

The CHAR and VARCHAR Column Types

CHAR and VARCHAR are the most commonly used string types. The difference between them is that CHAR is a fixed-length type and VARCHAR is a variable-length type. Values in a CHAR(M) column each take M bytes; shorter values are right-padded with spaces when they are stored.

(Trailing spaces are stripped off on retrieval, however.) Values in a VARCHAR(M) column are stored using only as many bytes as necessary, plus one byte to record the length. Trailing spaces are stripped from VARCHAR values when they are stored; this differs from the ANSI SQL standard for VARCHAR values. (A VARCHAR type for which trailing spaces are not stripped may be introduced in a future version of MySQL.)

CHAR and VARCHAR columns can be declared with a maximum length M from 1 to 255. M is optional for CHAR and defaults to 1 if missing. Beginning with MySQL 3.23, CHAR(0) is also legal. CHAR(0) is useful as a placeholder when you want to declare a column but don't want to allocate space for it if you're not sure yet how wide to make it.

You can use ALTER TABLE to widen the column later. A CHAR(0) column can also be used to represent on/off values if you allow it to be NULL. Values in such a column can have two values-NULL or the empty string. A CHAR(0) column takes very little storage space in the table-only a single bit. As of MySQL 4.0.2, VARCHAR(0) is allowable as well, but it's treated as CHAR(0).

Keep two general principles in mind when choosing between CHAR and VARCHAR column types:

. If your values are all the same length, VARCHAR actually will use more space due to the extra byte required to record the length of values. On the other hand, if your values vary in length, VARCHAR columns have the advantage of taking less space. A CHAR(n) column always takes n bytes, even if it is empty or NULL.

. If your values don't vary much in length, CHAR is a better choice than VARCHAR if you're using MyISAM or ISAM tables. For such table types, tables with fixed-length rows can be processed more efficiently than tables with variable-length rows.

With a few limited exceptions, you cannot mix CHAR and VARCHAR within the same table. MySQL will even change columns from one type to another, depending on the circumstances. (This is something that other databases do not do.) The principles that apply are as follows:

. Table rows are fixed-length only if all the columns in the table are fixed-length types.
. If even a single column has a variable length, table rows become variable-length as well. . If table rows are variable-length, any fixed-length columns in the column may as well be converted to variable-length equivalents when that will save space.

What this means is that if you have VARCHAR, BLOB, or TEXT columns in a table, you cannot also have CHAR columns; MySQL silently converts them to VARCHAR. Suppose you create a table as follows:

CREATE TABLE my_test_tbl
(
c1 CHAR(10),
c2 VARCHAR(10)
);

If you issue a DESCRIBE query, the output is as follows:

mysql> DESCRIBE my_test_tbl;

+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c1    | varchar(10) | YES  |     | NULL    |       |
| c2    | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

Notice that the presence of the VARCHAR column causes MySQL to convert c1 to VARCHAR as well. If you try using ALTER TABLE to convert c1 to CHAR, it won't work. The only way to convert a VARCHAR column to CHAR is to convert all VARCHAR columns in the table at the same time:

mysql> ALTER TABLE my_test_tbl MODIFY c1 CHAR(10), MODIFY c2 CHAR(10);
mysql> DESCRIBE my_test_tbl;

+-------+----------+------+-----+---------+-------+
  | Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| c1    | char(10) | YES  |     | NULL    |       |
| c2    | char(10) | YES  |     | NULL    |       |
+-------+----------+------+-----+---------+-------+

The BLOB and TEXT column types are variable-length like VARCHAR, but they have no fixed-length equivalent, so you cannot use CHAR columns in the same table as BLOB or TEXT columns. Any CHAR column will be converted to VARCHAR.

The exception to non-mixing of fixed- and variable-length columns is that CHAR columns shorter than four characters are not converted to VARCHAR. For example, MySQL will not change the CHAR column in the following table to VARCHAR:

CREATE TABLE my_test_tbl
(
c1 CHAR(2),
c2 VARCHAR(10)
);

You can see this from the output of DESCRIBE:

mysql> DESCRIBE my_test_tbl;

+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c1    | char(2)     | YES  |     | NULL    |       |
| c2    | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

The reason columns shorter than four characters are not converted is that, on average, any savings you might gain by not storing trailing spaces are offset by the extra byte needed in a VARCHAR column to record the length of each value.

In fact, if all your columns are short, MySQL will convert any that you declare as VARCHAR to CHAR. MySQL does this because the conversion will decrease storage requirements on average and, for MyISAM and ISAM tables, will improve performance by making table rows fixed-length. Suppose you create a table with the following specification:

CREATE TABLE my_test_tbl
(
c0 VARCHAR(0),
c1 VARCHAR(1),
c2 VARCHAR(2),
c3 VARCHAR(3)
);

DESCRIBE reveals that MySQL silently changes all the VARCHAR columns to CHAR:

mysql> DESCRIBE my_test_tbl;

+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c0    | char(0) | YES  |     | NULL    |       |
| c1    | char(1) | YES  |     | NULL    |       |
| c2    | char(2) | YES  |     | NULL    |       |
| c3    | char(3) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+

The BLOB and TEXT Column Types

A "BLOB" is a binary large object-basically, a container that can hold anything you want to toss into it, and that you can make about as big as you want. In MySQL, the BLOB type is really a family of types (TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB), which are identical except for the maximum amount of information they can hold ().

BLOB columns are useful for storing data that may grow very large or that can vary widely in size from row to row. Some examples are word-processing documents, images and sounds, compound data, and news articles. MySQL also has a family of TEXT types (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT). These are similar to the corresponding BLOB types, except that they are associated with a character set and operations on TEXT columns take character set into account.

(For MySQL 4.1 and later, this is the character set assigned to the TEXT column itself. Prior to 4.1, it is the server's default character set.) This results in the general differences between binary and non-binary strings that were described earlier. For example, in comparison and sorting operations, BLOB values are case sensitive and TEXT values are not.

BLOB or TEXT columns sometimes can be indexed, depending on the table type you're using:

. MyISAM tables support BLOB and TEXT indexing (for MySQL 3.23.2 and later), as do BDB tables. However, you must specify a prefix size to be used for the index. This avoids creating index entries that might be huge and thereby defeat any benefits to be gained by that index. The exception is that no prefix is specified for FULLTEXT indexes on TEXT columns, because FULLTEXT searches are based on the entire content of the indexed columns.

. ISAM, HEAP, and InnoDB tables do not support BLOB and TEXT indexes.

BLOB or TEXT columns may require special care:

. Due to the typical large variation in the size of BLOB and TEXT values, tables containing them are subject to high rates of fragmentation if many deletes and updates are done.

. If you're using very large values, you may need to tune the server to increase the value of the max_allowed_packet parameter.

The ENUM and SET Column Types

ENUM and SET are special string column types for which values must be chosen from a fixed (predefined) list of allowable strings. The primary difference between them is that ENUM column values must consist of exactly one member of the list of values, whereas SET column values can contain any or all members of the list. In other words, ENUM is used for values that are mutually exclusive, whereas SET allows multiple choices from the list.

The ENUM column type defines an enumeration. ENUM columns can be assigned values consisting of exactly one member chosen from a list of values specified at table-creation time. You can define an enumeration to have up to 65,535 members. Enumerations are commonly used to represent category values.

For example, values in a column declared as ENUM('N','Y') can be either 'N' or 'Y'. Or you can use ENUM for such things as answers to multiple-choice questions in a survey or questionnaire, or available sizes or colors for a product:

employees ENUM('less than 100','100-500','501-1500','more than 1500') color ENUM('red','green','blue','black') size ENUM('S','M','L','XL','XXL')

If you are processing selections from Web pages, you can use an ENUM to represent the option that a visitor to your site chooses from a set of mutually exclusive radio buttons on a page. For example, if you run an online pizza ordering service, an ENUM can be used to represent the type of crust a customer orders:

crust ENUM('thin','regular','pan style','deep dish')

If enumeration categories represent counts, it's important to choose your categories properly when you create the enumeration. For example, when recording white blood cell counts from a laboratory test, you may group the counts into categories as follows:

wbc ENUM('0-100','101-300','>300')

When a test result comes in as an exact count, you can record the value in the wbc column terms of the category into which the count falls. But you cannot recover the original count if you decide you want to convert the column from a category-based ENUM to an integer column based on exact count. (If you really need the exact count, use an integer column instead.)

The SET type is similar to ENUM in the sense that when you create a SET column, you specify a list of legal set members. But unlike ENUM, each column value can consist of any number of members from the set. The set can have up to 64 members. You can use a SET when you have a fixed set of values that are not mutually exclusive, as they are in an ENUM column. For example, you might use a SET to represent options available for an automobile:

SET('luggage rack','cruise control','air conditioning','sun roof')

Then, particular SET values would represent those options actually ordered by customers:

'cruise control,sun roof'
'luggage rack,air conditioning'
'luggage rack,cruise control,air conditioning'
'air conditioning' ''

The final value shown (the empty string) means that the customer ordered no options. This is a legal SET value. SET column values are represented as a single string. If a value consists of multiple set members, the members are separated in the string by commas. Obviously, this means you shouldn't use a string containing a comma as a SET member.

The way you declare the legal value list for an ENUM or SET column is significant in several ways:

. The list determines the possible legal values for the column, as has already been discussed.

. You can insert ENUM or SET values in any lettercase, but the lettercase of the strings specified in the column declaration determines the lettercase of column values when they are retrieved later. For example, if you have an ENUM('Y','N') column and you store 'y' and 'n' in it, the values are displayed as 'Y' and 'N' when you retrieve them. This does not affect comparison or sorting behavior because ENUM and SET columns are not case sensitive.

. The order of values in an ENUM declaration is the order used for sorting. The order of values in a SET declaration also determines sort order, although the relationship is more complicated because column values can contain multiple set members.

. The order of values in a SET declaration determines the order in which set members appear when SET column values consisting of multiple members are displayed.

ENUM and SET are classified as string types because enumeration and set members are specified as strings when you create columns of these types. However, the members are stored internally as numbers and you can operate on them as such. This means that ENUM and SET types are more efficient than other string types because they often can be handled using numeric operations rather than string operations. It also means that ENUM and SET values can be used in either string or numeric contexts.

ENUM members in the column declaration are numbered sequentially beginning with 1. (0 is reserved by MySQL for the error member, which is represented in string form by the empty string.) The number of enumeration values determines the storage size of an ENUM column. One byte can represent 256 values, two bytes can represent 65,536 values.

(Compare this to the ranges of the one-byte and two-byte integer types TINYINT UNSIGNED and SMALLINT UNSIGNED.) Thus, the maximum number of enumeration members is 65,536 (counting the error member) and the storage size depends on whether or not there are more than 256 members.

You can specify a maximum of 65,535 (not 65,536) members in the ENUM declaration because MySQL reserves a spot for the error member as an implicit member of every enumeration. When you assign an illegal value to an ENUM column, MySQL assigns the error member instead.

The following is an example you can try using the mysql client. It demonstrates that you can retrieve ENUM values in either string or numeric form (which shows the numeric ordering of enumeration members and also that the NULL value has no number in the ordering):

mysql> CREATE TABLE e_table (e ENUM('jane','fred','will','marcia'));
mysql> INSERT INTO e_table -> VALUES('jane'),('fred'),('will'),('marcia'),(''),(NULL);
mysql> SELECT e, e+0, e+1, e*3 FROM e_table;

+--------+------+------+------+
| e      | e+0  | e+1  | e*3  |
+--------+------+------+------+
| jane   |    1 |    2 |    3 |
| fred   |    2 |    3 |    6 |
| will   |    3 |    4 |    9 |
| marcia |    4 |    5 |   12 |
|        |    0 |    1 |    0 |
| NULL   | NULL | NULL | NULL |
+--------+------+------+------+

You can compare ENUM members either by name or number:

mysql> SELECT e FROM e_table WHERE e='will';

+------+
| e    |
+------+
| will |
+------+

mysql> SELECT e FROM e_table WHERE e=3;

+------+
| e    |
+------+
| will |
+------+

It is possible to declare the empty string as a legal enumeration member. It will be assigned a non-zero numeric value, just as any other member listed in the declaration would be. However, using an empty string may cause some confusion because that string is also used for the error member that has a numeric value of 0.

In the following example, assigning the illegal enumeration value 'x' to the ENUM column causes the error member to be assigned. This is distinguishable from the empty string member only when retrieved in numeric form:

mysql> CREATE TABLE t (e ENUM('a','','b'));
mysql> INSERT INTO t VALUES('a'),(''),('b'),('x');
mysql> SELECT e, e+0 FROM t;

+------+------+
| e    | e+0  |
+------+------+
| a    |    1 |
|      |    2 |
| b    |    3 |
|      |    0 |
+------+------+

The numeric representation of SET columns is a little different than for ENUM columns. Set members are not numbered sequentially. Instead, each member corresponds to an individual bit in the SET value. The first set member corresponds to bit 0, the second member corresponds to bit 1, and so on. A numeric SET value of 0 corresponds to the empty string.

SET members are maintained as bit values. Eight set values per byte can be stored this way, so the storage size for a SET column is determined by the number of set members, up to a maximum of 64 members. SET values take 1, 2, 3, 4, or 8 bytes for set sizes of 1 to 8, 9 to 16, 17 to 24, 25 to 32, and 33 to 64 members.

The representation of a SET as a set of bits is what allows a SET value to consist of multiple set members. Any combination of bits can be turned on in the value, so the value can consist of any combination of the strings in the SET declaration that correspond to those bits.

The following is an example that shows the relationship between the string and numeric forms of a SET column; the numeric value is displayed in both decimal and binary form:

mysql> CREATE TABLE s_table (s SET('jane','fred','will','marcia'));
mysql> INSERT INTO s_table -> VALUES('jane'),('fred'),('will'),('marcia'),(''),(NULL);
mysql> SELECT s, s+0, BIN(s+0) FROM s_table;

+--------+------+----------+
| s      | s+0  | BIN(s+0) |
+--------+------+----------+
| jane   |    1 | 1        |
| fred   |    2 | 10       |
| will   |    4 | 100      |
| marcia |    8 | 1000     |
|        |    0 | 0        |
| NULL   | NULL | NULL     |
+--------+------+----------+

If you assign a value containing substrings that are not listed as set members to a SET column, those strings drop out and the column is assigned a value consisting of the remaining substrings. When you assign values to SET columns, the substrings don't need to be listed in the same order that you used when you declared the column.

However, when you retrieve the value later, members will be listed in declaration order. Suppose you declare a SET column to represent furniture items using the following declaration:

SET('table','lamp','chair')

If you assign a value of 'chair,couch,table' to this column, two things happen. First, 'couch' drops out because it's not a member of the set. Second, when you retrieve the value later, it appears as 'table,chair'. This occurs because MySQL determines which bits correspond to each substring of the value to be assigned and turns them on in the stored value.

The fact that MySQL reorders members in a SET value means that if you search for values using a string, you must list members in the proper order. If you insert 'chair,table' and then search for 'chair,table' you won't find the record; you must look for it as 'table,chair'.

Sorting and indexing of ENUM and SET columns is done according to the internal (numeric) values of column values. The following example might appear to be incorrect otherwise because the values are not sorted in alpha numeric order:

mysql> SELECT e FROM e_table ORDER BY e;

+--------+
| e      |
+--------+
| NULL   |
|        |
| jane   |
| fred   |
| will   |
| marcia |
+--------+

The placement of the NULL value depends how your version of MySQL sorts NULL values.

If you have a fixed set of values and you want them to sort in a particular order, you can exploit the ENUM sorting properties. Represent the values as an ENUM column in a table and list the enumeration values in the column declaration in the order that you want them to be sorted.

Suppose you have a table representing personnel for a sports organization, such as a football team, and that you want to sort output by personnel position so that it comes out in a particular order, such as the coaches, assistant coaches, quarterbacks, running backs, receivers, linemen, and so on. Define the column as an ENUM and list the enumeration elements in the order that you want to see them. Sort operations on that column will automatically come out in the order you specify.

For cases where you want an ENUM to sort in regular lexical order, you can convert the column to a non-ENUM string by using CONCAT() and sorting the result:

mysql> SELECT CONCAT(e) AS e_str FROM e_table ORDER BY e_str;

+--------+
| e_str  |
+--------+
| NULL   |
|        |
| fred   |
| jane   |
| marcia |
| will   |
+--------+

CONCAT() doesn't change the displayed values but has the side effect in this query of performing an ENUM-to-string conversion that alters their sorting properties.

String Column Type Attributes

The BINARY attribute can be specified for the CHAR and VARCHAR types to cause column values to be treated as binary strings (that is, as a string of bytes rather than as a string of characters). A common use for this is to cause column values to be case sensitive.

In MySQL 4.1 and later, you can specify a CHARACTER SET charset attribute for CHAR, VARCHAR, and TEXT columns. charset should be a valid character set name. The character set may differ among columns. For example, the following table contains latin1_de (German), utf8 (Unicode), and sjis (Japanese) columns:

CREATE TABLE my_test_tbl
(
c1 CHAR(10) CHARACTER SET latin1_de,
c2 VARCHAR(40) CHARACTER SET utf8,
t MEDIUMTEXT CHARACTER SET sjis
);

In versions of MySQL for which individual columns may be assigned character sets, DESCRIBE output will show that information:

mysql> DESCRIBE my_test_tbl;

Binary strings do not have character sets, so the CHARACTER SET attribute is not applicable to CHAR BINARY, VARCHAR BINARY, or any of the BLOB types. Character sets cannot be assigned to ENUM or SET columns, either, because values in such columns are represented numerically.

Character sets can be designated at the column, table, database, or server level, so when you create a character column, MySQL determines which character set to assign to it by trying the following rules in order:

1. If the column definition includes a character set, use that set.
2. Otherwise, if the table definition includes a table-level character set other than DEFAULT, use that set.
3. Otherwise, if the database has been assigned a character set other than DEFAULT, use that set.
4. Otherwise, use the server's default character set.

In other words, MySQL searches up through the levels at which character sets can be specified until it finds an explicit character set and then uses that for the column's set. The server always has a default character set, so the search process is guaranteed to terminate at the server level even if no character set is specified explicitly at any of the lower levels.

Suppose the server's character set is greek and that the current database has a character set of DEFAULT. The following CREATE TABLE statement specifies no character set at either the column or table level:

CREATE TABLE t (c CHAR(10));

The database has no explicit set either, so MySQL searches all the way up to the server level to find a character set (greek) to use for the column c. You can verify that with DESCRIBE:

mysql> DESCRIBE t;

+-------+------------------------------+------+-----+---------+-------+
| Field | Type                         | Null | Key | Default | Extra |
+-------+------------------------------+------+-----+---------+-------+
| c     | char(10) character set greek | YES  |     | NULL    |       |
+-------+------------------------------+------+-----+---------+-------+

The next statement specifies a table-level character set, so MySQL searches only up to that level to determine that the character set for column c should be czech:

CREATE TABLE t (c CHAR(10)) CHARACTER SET czech;

Again, you can verify that with DESCRIBE:

mysql> DESCRIBE t;

+-------+------------------------------+------+-----+---------+-------+
| Field | Type                         | Null | Key | Default | Extra |
+-------+------------------------------+------+-----+---------+-------+
| c     | char(10) character set czech | YES  |     | NULL    |       |
+-------+------------------------------+------+-----+---------+-------+

An empty value is different than a missing value, so don't make the mistake of thinking that you can force a string column to contain non-empty values by declaring it NOT NULL. If you require string values to be non-empty, that is a constraint you must enforce from within your own applications.

You can also specify a default value using the DEFAULT attribute for all string column types except the BLOB and TEXT types. If you don't specify a default value, one is chosen automatically. The default is NULL for columns that may contain NULL.

For columns that may not contain NULL, the default is the empty string except for ENUM, where the default is the first enumeration member. (For SET, the default when the column cannot contain NULL is actually the empty set, but that is equivalent to the empty string.)

Date and Time Column Types

MySQL provides several column types for temporal values-DATE, DATETIME, TIME, TIMESTAMP, and YEAR

Date and Time Column Types
Type Specification	Range
DATE	'1000-01-01' to '9999-12-31'
TIME	'-838:59:59' to '838:59:59'
DATETIME	'1000-01-01 00:00:00' to '9999-12-31 23:59:59'
TIMESTAMP[(M)]	19700101000000 to sometime in the year 2037
YEAR[(M)]	1901 to 2155 for YEAR(4), and 1970 to 2069 for YEAR(2)

Date and Time Column Type Storage Requirements
Type Specification	Storage Required
DATE	3 bytes (4 bytes prior to MySQL 3.22)
TIME	3 bytes
DATETIME	8 bytes
TIMESTAMP	4 bytes
YEAR	1 byte

Each date and time type has a "zero" value that is stored when you insert a value that is illegal for the type. This value is also the default value for date and time columns that are declared NOT NULL.

Date and Time Type "Zero" Values
Type Specification	Zero Value
DATE	'0000-00-00'
TIME	'00:00:00'
DATETIME	'0000-00-00 00:00:00'
TIMESTAMP	00000000000000
YEAR	0000

MySQL always represents dates with the year first, in accordance with the ANSI SQL and ISO 8601 specifications. For example, December 3, 2004 is represented as '2004-12-03'. MySQL does allow some leeway in the way it allows input dates to be specified.

For example, it will convert two-digit year values to four digits, and you need not supply a leading zero digit for month and day values that are less than 10. However, you must specify the year first and the day last. Formats that you may be more used to, such as '12/3/99' or '3/12/99', will be interpreted incorrectly.

Time values are returned in the time zone local to the server; MySQL doesn't make any time zone adjustments for the values that it returns to the client.

The DATE, TIME, and DATETIME Column Types

The DATE, TIME, and DATETIME types hold date, time, and combined date and time values. The formats are 'CCYY-MM-DD', 'hh:mm:ss', and 'CCYY-MM-DD hh:mm:ss', where CC, YY, MM, DD hh, mm, and ss represent century, year, month, day, hour, minute, and second.

. For the DATETIME type, the date and time parts are both required; if you assign a DATE value to a DATETIME column, MySQL automatically adds a time part of '00:00:00'. (Conversely, if you assign a DATETIME value to a DATE column, MySQL discards the time part.)

MySQL treats the time in DATETIME and TIME values slightly differently. For DATETIME, the time part represents a time of day. A TIME value, on the other hand, represents elapsed time-that's why the range for TIME columns is so great and why negative values are allowed.

One thing to watch out for when inserting TIME values into a table is that if you use a "short" (not fully qualified) value, it may not be interpreted as you expect.

For example, you'll probably find that if you insert '30' and '12:30' into a TIME column, one value will be interpreted from right to left and the other from left to right, resulting in stored values of '00:00:30' and '12:30:00'. If you consider '12:30' to represent a value of "12 minutes, 30 seconds," you should specify it in fully qualified form as '00:12:30'.

The TIMESTAMP Column Type

TIMESTAMP columns represent values in CCYYMMDDhhmmss format, with a range from 19700101000000 to sometime in the year 2037. The range is tied to UNIX time, where the first day of 1970 is "day zero," also known as "the epoch." The beginning of 1970 determines the lower end of the TIMESTAMP range. The upper end of the range corresponds to the four-byte limit on UNIX time, which can represent values into the year 2037.

The TIMESTAMP type is so called because it has some special properties for recording when a row is created or modified:

. If you insert a NULL into any TIMESTAMP column, the column value is set automatically to the current date and time.

. The current date and time are also used if you create or update a row without assigning an explicit value to the column, but only for the first TIMESTAMP column in a row.

. For any TIMESTAMP column, you can update its value to the current timestamp by setting it to NULL, or you can defeat timestamping by inserting an explicit date and time value into the column rather than NULL.

A TIMESTAMP column declaration can include a specification for a maximum display width M. Odd values of M are treated as the next higher even number.

TIMESTAMP Display Formats
Type Specification	Display Format
TIMESTAMP(14)	CCYYMMDDhhmmss
TIMESTAMP(12)	YYMMDDhhmmss
TIMESTAMP(10)	YYMMDDhhmm
TIMESTAMP(8)	CCYYMMDD
TIMESTAMP(6)	YYMMDD
TIMESTAMP(4)	YYMM
TIMESTAMP(2)	YY

The display width for TIMESTAMP columns has nothing to do with storage size or the values stored internally. TIMESTAMP values are always stored in 4 bytes and used in calculations to full 14-digit precision, regardless of the display width. To see this, suppose you declare a table as follows and then insert some rows into it and retrieve them:

mysql> CREATE TABLE my_test_tbl (ts TIMESTAMP(8), i INT);
mysql> INSERT INTO my_test_tbl VALUES(20020801120000,3);
mysql> INSERT INTO my_test_tbl VALUES(20020801120001,2);
mysql> INSERT INTO my_test_tbl VALUES(20020801120002,1);
mysql> INSERT INTO my_test_tbl VALUES(20020801120003,0);
mysql> SELECT * FROM my_test_tbl ORDER BY ts, i;

+----------+------+
| ts       | i    |
+----------+------+
| 20020801 |    3 |
| 20020801 |    2 |
| 20020801 |    1 |
| 20020801 |    0 |
+----------+------+

On the face of it, the rows produced by the SELECT statement appear to be sorted in the wrong order-the values in the first column are all the same, so it seems the sort should order the rows according to the values in the second column.

This apparently anomalous result is due to the fact that MySQL is using the full 14-digit values inserted into the TIMESTAMP column for sorting. These values are all distinct, so they entirely determine the sort order of the result.

MySQL has no column type that can be set to the current date and time when a record is created and that remains immutable thereafter. If you want to achieve that, you can do it two ways:

. Use a TIMESTAMP column. When you create a new record, set the column to NULL to initialize it to the current date and time:

. INSERT INTO tbl_name (ts_col, ...) VALUES(NULL, ...); Whenever you update the record thereafter, explicitly set the column to the value it already has. Assigning an explicit value defeats the timestamping mechanism because it prevents the column's value from being automatically updated: UPDATE tbl_name SET ts_col=ts_col WHERE ... ;

. Use a DATETIME column. When you create a record, initialize the column to NOW():

. INSERT INTO tbl_name (dt_col, ...) VALUES(NOW(), ...); Whenever you update the record thereafter, leave the column alone: UPDATE tbl_name SET ... anything BUT dt_col here ... WHERE ... ;

If you want to use TIMESTAMP columns to maintain both a time-created value and a last-modified value, you can do so by using one TIMESTAMP for the time-modified value and a second TIMESTAMP for the time-created value.

Make sure the time-modified column is the first TIMESTAMP, so that it's set when the record is created or changed. Make the time-created column the second TIMESTAMP, and initialize it to NOW() when you create new records. That way, its value will reflect the record creation time and will not change after that.

The YEAR Column Type

YEAR is a one-byte column type used for efficient representation of year values. A YEAR column declaration can include a specification for a display width M, which should be either 4 or 2. If M is omitted from a YEAR declaration, the default is 4. YEAR(4) has a range of 1901 to 2155.

YEAR(2) has a range of 1970 to 2069, but only the last two digits are displayed. You can use the YEAR type when you want to store date information but only need the year part of the date, such as year of birth, year of election to office, and so forth. When you do not need a full date value, YEAR is much more space-efficient than other date types.

TINYINT has the same storage size as YEAR (one byte), but not the same range. To cover the same range of years as YEAR by using an integer type, you would need a SMALLINT, which takes twice as much space. If the range of years you need to represent coincides with the range of the YEAR type, YEAR is more space-efficient than SMALLINT.

Another advantage of YEAR over an integer column is that MySQL will convert two-digit values into four-digit values for you using MySQL's usual year-guessing rules. For example, 97 and 14 become 1997 and 2014. However, be aware that inserting the numeric value 00 into a four-digit YEAR column will result in the value 0000 being stored, not 2000. If you want a value of 00 to convert to 2000, you must specify it in string form as '00'.

Date and Time Column Type Attributes

There are no attributes that are specific to the date and time column types. The general attributes NULL or NOT NULL can be specified for any of the date and time types. If you don't specify either of them, NULL is the default. You can also specify a default value using the DEFAULT attribute. If you don't specify a default value, one is chosen automatically.

The default is NULL for columns that may contain NULL. Otherwise, the default is the "zero" value for the type. TIMESTAMP columns are special; the default for the first such column in a table is the current date and time and the "zero" value for any others.

Note that because default values must be constants. you cannot use a function such as NOW() to supply a value of "the current date and time" as the default for a DATETIME column. To achieve that result, set the column value explicitly to NOW() whenever you create a new record or else use a TIMESTAMP column (assuming that the special properties of TIMESTAMP are suitable for your purposes).

Working with Date and Time Columns

MySQL tries to interpret date and time values in a variety of formats, including both string and numeric forms

Date and Time Type Input Formats
Type	Allowable Formats
DATETIME, TIMESTAMP	'CCYY-MM-DD hh:mm:ss'
	'YY-MM-DD hh:mm:ss'
	'CCYYMMDDhhmmss'
	'YYMMDDhhmmss'
	CCYYMMDDhhmmss
	YYMMDDhhmmss
DATE	'CCYY-MM-DD'
	'YY-MM-DD'
	'CCYYMMDD'
	'YYMMDD'
	CCYYMMDD
	YYMMDD
TIME	'hh:mm:ss'
	'hhmmss'
	hhmmss
YEAR	'CCYY'
	'YY'
	CCYY
	YY

In addition, for the string formats that include delimiters, you need not specify two digits for month, day, hour, minute, or second values that are less than 10. The following are all equivalent:

'2012-02-03 05:04:09'
'2012-2-03 05:04:09'
'2012-2-3 05:04:09'
'2012-2-3 5:04:09'
'2012-2-3 5:4:09'
'2012-2-3 5:4:9'

Note that values with leading zeroes may be interpreted differently depending on whether they are specified as strings or numbers. The string '001231' will be seen as a six-digit value and interpreted as '2000-12-31' for a DATE and as '2000-12-31 00:00:00' for a DATETIME.

On the other hand, the number 001231 will be seen as 1231 after the parser gets done with it and then the interpretation becomes problematic. This is a case where it's best to supply a string value '001231' or else use a fully qualified value if you are using numbers (that is, 20001231 for DATE and 200012310000 for DATETIME).

In general, you can freely assign values between the DATE, DATETIME, and TIMESTAMP types, although there are certain restrictions to keep in mind:

. If you assign a DATETIME or TIMESTAMP value to a DATE, the time part is discarded.

. If you assign a DATE value to a DATETIME or TIMESTAMP, the time part of the resulting value is set to zero ('00:00:00').

. The types have different ranges. In particular, TIMESTAMP has a more limited range (1970 to 2037), so, for example, you cannot assign a pre-1970 DATETIME value to a TIMESTAMP and expect reasonable results. Nor can you assign values that are far in the future to a TIMESTAMP.

Interpretation of Ambiguous Year Values

For all date and time types that include a year part (DATE, DATETIME TIMESTAMP, YEAR), MySQL handles values that contain two-digit years by converting them to four-digit years. This conversion is performed according to the following rules:

. Year values from 00 to 69 become 2000 to 2069.
. Year values from 70 to 99 become 1970 to 1999.

You can see the effect of these rules most easily by assigning different two-digit values to a YEAR column and then retrieving the results. This will also demonstrate something you should take note of:

mysql> CREATE TABLE y_table (y YEAR);
mysql> INSERT INTO y_table VALUES(68),(69),(99), (00);
mysql> SELECT * FROM y_table;

+------+
| y    |
+------+
| 2068 |
| 2069 |
| 1999 |
| 0000 |
+------+

Notice that 00 was converted to 0000, not to 2000. That's because as a number, 00 is the same as 0 and is a perfectly legal value for the YEAR type. If you insert a numeric zero, that's what you get. To get 2000 using a value that does not contain the century, insert the string '0' or '00'.

You can make sure MySQL sees a string and not a number by inserting YEAR values using CONCAT(). This function returns a string result uniformly regardless of whether its argument is a string or a number.

In any case, keep in mind that the rules for converting two-digit to four-digit year values provide only a reasonable guess. There is no way for MySQL to be certain about the meaning of a two-digit year when the century is unspecified. If MySQL's conversion rules don't produce the values that you want, the solution is to provide unambiguous data with four-digit years.

Choosing Column Type

When you choose a numeric type, consider the range of values you need to represent and choose the smallest type that will cover the range. Choosing a larger type wastes space, leading to tables that are unnecessarily large and that cannot be processed as efficiently as if you had chosen a smaller type.

For integer values, TINYINT is the best if the range of values in your data is small, such as a person's age or number of siblings. MEDIUMINT can represent millions of values and can be used for many more types of values, at some additional cost in storage space.

BIGINT has the largest range of all but requires twice as much storage as the next smallest integer type (INT) and should be used only when really necessary. For floating-point values, DOUBLE takes twice as much space as FLOAT.

Unless you need exceptionally high precision or an extremely large range of values, you can probably represent your data at half the storage cost by using FLOAT.

Every numeric column's range of values is determined by its type. If you attempt to insert a value that lies outside the column's range, truncation occurs; MySQL clips the value to the appropriate endpoint of the range and uses the result. No truncation occurs when values are retrieved.

Value truncation occurs according to the range of the column type, not the display width. For example, a SMALLINT(3) column has a display width of 3 and a range from -32768 to 32767.

The value 12345 is wider than the display width but within the range of the column, so it is inserted without clipping and retrieved as 12345. The value 99999 is outside the range, so it is clipped to 32767 when inserted. Subsequent retrievals retrieve the value 32767.

In general, values assigned to a floating-point column are rounded to the number of decimals indicated by the column specification. If you store 1.23456 in a FLOAT(8,1) column, the result is 1.2. If you store the same value in a FLOAT(8,4) column, the result is 1.2346.

This means you should declare floating-point columns with a sufficient number of decimals to give you values as precise as you require. If you need accuracy to thousandths, don't declare a type with only two decimal places.

The DECIMAL type is a floating-point type, but it differs from FLOAT and DOUBLE in that DECIMAL values actually are stored as strings and have a fixed number of decimals.

The significance of this fact is that DECIMAL values are not subject to roundoff error the way that FLOAT and DOUBLE columns are-a property that makes DECIMAL especially applicable to currency calculations.

The corresponding tradeoff is that DECIMAL values are not as efficient as floating-point values stored in native format that the processor can operate on directly.

How D Affects the Range of DECIMAL(M,D)
Type Specification	Range (for MySQL < 3.23)	Range (for MySQL >3.23)
DECIMAL(4,0)	–999 to 9999	–9999 to 99999
DECIMAL(4,1)	–9.9 to 99.9	–999.9 to 9999.9
DECIMAL(4,2)	–.99 to 9.99	–99.99 to 999.99

The maximum possible range for DECIMAL is the same as for DOUBLE, but the effective range is determined by the values of M and D. If you vary M and hold D fixed, the range becomes larger as M becomes larger.

How M Affects the Range of DECIMAL(M,D)

Type Specification	Range (for MySQL < 3.23)	Range (for MySQL
DECIMAL(4,1)	–9.9 to 99.9	–999.9 to 9999.9
DECIMAL(5,1)	–99.9 to 999.9	–9999.9 to 99999.9
DECIMAL(6,1)	–999.9 to 9999.9	–99999.9 to 999999.9

The range for a given DECIMAL type depends on your version of MySQL. As of MySQL 3.23, DECIMAL values are handled according to the ANSI specification, which states that a type of DECIMAL(M,D) must be able to represent any value with M digits and D decimal places.

For example, DECIMAL(4,2) must be able to represent values from -99.99 to 99.99. Because the sign character and decimal point must still be stored, this requires an extra two bytes, so DECIMAL(M,D) values for MySQL 3.23 and later use M+2 bytes. For DECIMAL(4,2), six bytes are needed for the "widest" value (-99.99).

At the positive end of the range, the sign byte is not needed to hold a sign character, so MySQL uses it to extend the range beyond that required by the ANSI specification. In other words, for DECIMAL(4,2), the maximum value that can be stored in the six bytes available is 999.99.

There are two special conditions that reduce the DECIMAL storage requirement of M+2 bytes to a lesser value:

. If D is 0, DECIMAL values have no fractional part and no byte need be allocated to store the decimal point. This reduces the required storage by one byte.

. If a DECIMAL column is UNSIGNED, no sign character need be stored, also reducing the required storage by one byte.

For versions of MySQL prior to 3.23, DECIMAL values are represented in a slightly different fashion. A DECIMAL(M,D) column is stored using M bytes per value, and the sign character and decimal point (if needed) are included in the M bytes.

Thus, for a type DECIMAL(4,2), the range is -.99 to 9.99 because those cover all the possible 4-character values. If D is 0, no decimal point need be stored, and the byte usually used for that purpose can be used to store another digit. The effect is to extend the range of the column by an extra order of magnitude.

Working with Sequences

Many applications need to use unique numbers for identification purposes. The requirement for unique values occurs in a number of contexts: membership numbers, sample or lot numbering, customer IDs, bug report or trouble ticket tags, and so on.

MySQL's mechanism for providing unique numbers is through AUTO_INCREMENT columns that allow you to generate sequential numbers automatically.

However, AUTO_INCREMENT columns are handled somewhat differently for the various table types that MySQL supports, so it's important to understand not only the general concepts underlying the AUTO_INCREMENT mechanism, but also the differences between table types.

This section describes how AUTO_INCREMENT columns work so that you can use them effectively without running into the traps that sometimes surprise people. It also describes how you can generate a sequence without using an AUTO_INCREMENT column.

Till version 3.23 of MySQL, the only table type available was ISAM. After that, additional table types were introduced-the MyISAM and HEAP types first, and the BDB and InnoDB types later. The discussion here indicates how each table type behaves with respect to AUTO_INCREMENT columns.

Behavior AUTO_INCREMENT for different table types

AUTO_INCREMENT for ISAM Tables

AUTO_INCREMENT columns in ISAM tables behave as follows:

. Inserting NULL into an AUTO_INCREMENT column causes MySQL to automatically generate the next sequence number and insert that value into the column. AUTO_INCREMENT sequences begin at 1, so the first record inserted into the table gets a sequence column value of 1 and subsequent records get values of 2, 3, and so forth. Each automatically generated value will be one more than the current maximum value stored in the column.

. Inserting 0 into an AUTO_INCREMENT column has the same effect as inserting NULL. However, this is not guaranteed to be true indefinitely, so it's better to insert NULL.

. Inserting a row without specifying an explicit value for the AUTO_INCREMENT column is the same as inserting NULL into the column.

. If you insert a record and specify a non-NULL, non-zero value for the AUTO_INCREMENT column, one of two things will happen. If a record already exists with that value, an error occurs because values in AUTO_INCREMENT columns must be unique. If a record does not exist with that value, the record is inserted and the sequence continues with the next value after that for subsequent rows. In other words, you can "bump up" the counter by inserting a record with a sequence value greater than the current counter value.

Bumping up the counter can result in gaps in the sequence, but you can also exploit this behavior to generate a sequence that begins at a value higher than 1. Suppose you create an ISAM table with an AUTO_INCREMENT column, but you want the sequence to begin at 1000 rather than at 1. To achieve this, insert a "fake" record with a value of 999 in the AUTO_INCREMENT column. Records inserted subsequently will be assigned sequence numbers beginning with 1000, after which you can delete the fake record.

(Why might you want to begin a sequence with a value higher than 1? One reason is to make sequence numbers all have the same number of digits. If you're generating customer ID numbers, and you expect never to have more than a million customers, you could begin the series at 1,000,000.

You'll be able to add well over a million customer records before the digit count for customer ID values changes. Other reasons for not beginning a sequence at 1 might have nothing to do with technical considerations.

For example, if you were assigning membership numbers, you might want to begin a sequence at a number higher than 1 to forestall political squabbling over who gets to be member number 1-by making sure there isn't any such number. Hey, it happens. Sad, but true.)

. If you delete the record containing the largest value in an AUTO_INCREMENT column, that value is reused the next time you generate a new value. This is a consequence of the principle that for ISAM tables, each new automatically generated value is one larger than the current maximum value stored in the column. Another consequence is that if you delete all the records in the table, all values are reused, so the sequence starts over beginning at 1.

. If you use UPDATE to set an AUTO_INCREMENT column to a value that already exists in another row, a "duplicate key" error occurs. If you update the column to a value larger than any existing column value, the sequence continues with the next number after that for subsequent records.

. If you use REPLACE to update a record based on the value of the AUTO_INCREMENT column, the AUTO_INCREMENT value does not change. If you use REPLACE to update a record based on the value of some other PRIMARY KEY or UNIQUE index, the AUTO_INCREMENT column will be updated with a new sequence number if you set it to NULL.

. The value of the most recent automatically generated sequence number is available by calling the LAST_INSERT_ID() function. This allows you to reference the AUTO_INCREMENT value in other statements without knowing what the value is.

LAST_INSERT_ID() is tied to AUTO_INCREMENT values generated during the current server session; it is not affected by AUTO_INCREMENT activity associated with other clients. If no AUTO_INCREMENT value has been generated during the current session, LAST_INSERT_ID() returns 0.

The AUTO_INCREMENT mechanism for ISAM forms the basis for understanding sequence behavior for the other table types. Those types implement behavior that for the most part is similar to that just described, so keep the preceding discussion in mind as you read on.

AUTO_INCREMENT for MyISAM Tables

MyISAM tables offer the most flexibility for sequence handling. The MyISAM storage format introduces some features that address some of the shortcomings of ISAM tables:

. With ISAM tables, values deleted from the top of the sequence are reused. If you delete the record with the highest sequence number, the new record added gets the same sequence value as the deleted record. This results in sequences that are not strictly monotonic, which is a problem should you need to guarantee that no record be given a number that has been used before.

With MyISAM, the values in an automatically generated series are strictly increasing and are not reused. If the maximum value is 143 and you delete the record containing that value, MySQL still generates the next value as 144.

. ISAM sequences always begin at 1 unless you use the fake-record technique mentioned earlier to start the sequence at a higher value. With MyISAM tables, you can specify the initial value explicitly by using an AUTO_INCREMENT = n option in the CREATE TABLE statement.

The following example creates a MyISAM table with an AUTO_INCREMENT column named seq that begins at 1,000,000:

. CREATE TABLE my_test_tbl4
. ( . tid INT UNSIGNED AUTO_INCREMENT NOT NULL,
. PRIMARY KEY (tid)
. ) TYPE = MYISAM AUTO_INCREMENT = 1000000;

Note: A table can have only one AUTO_INCREMENT column, so there is never any ambiguity about the column to which the terminating AUTO_INCREMENT = n option applies, even if the table has multiple columns (as most tables do).

. You can change the current sequence counter for an existing MyISAM table with ALTER TABLE. If the sequence currently stands at 1000, the following statement will cause the next number generated to be 2000:

ALTER TABLE my_test_tbl4 AUTO_INCREMENT = 2000;

If you want to reuse values that have been deleted from the top of the sequence, you can do that, too. The following statement will set the counter down as far as possible, causing the next number to be one larger than the current maximum sequence value:

ALTER TABLE my_test_tbl4 AUTO_INCREMENT = 1;

In addition to overcoming the weaknesses of ISAM sequence handling, the MySQL table handler as of MySQL 3.23.5 supports the use of composite (multiple-column) indexes for creating multiple independent sequences within the same table.

To use this feature, create a multiple-column PRIMARY KEY or UNIQUE index that includes an AUTO_INCREMENT column as its last column. For each distinct key in the leftmost column or columns of the index, the AUTO_INCREMENT column will generate a separate sequence of values.

. For example, you might use a table named bugs for tracking bug reports of several software projects, where the table is declared as follows:

CREATE TABLE associate_query
(
associate_name VARCHAR(20) NOT NULL,
tid INT UNSIGNED AUTO_INCREMENT NOT NULL,
description VARCHAR(100),
PRIMARY KEY (associate_name, tid)
) TYPE = MYISAM;

Here, the associate_name column identifies the associate name and the description column contains the problem description. The tid column is an AUTO_INCREMENT column; by creating an index that ties it to the associate_name column, you can generate an independent series of sequence numbers for each query.

AUTO_INCREMENT for HEAP Tables

HEAP tables do not support the AUTO_INCREMENT mechanism prior to MySQL 4.1. As of 4.1, AUTO_INCREMENT columns are allowed and behave as follows:

. The initial sequence value can be set with an AUTO_INCREMENT = n option in the CREATE TABLE statement and can be modified after table creation time using that option with ALTER TABLE.
. Values that are deleted from the top of the sequence are not reused.
. Composite indexes cannot be used to generate multiple independent sequences within a table.

AUTO_INCREMENT for BDB Tables

The BDB table handler manages AUTO_INCREMENT columns as follows:

. The initial sequence value cannot be set with an AUTO_INCREMENT = n option in the CREATE TABLE statement. Nor can it be modified using that option with ALTER TABLE.
. Values that are deleted from the top of the sequence are reused.
. Composite indexes can be used to generate multiple independent sequences within a table.

AUTO_INCREMENT for InnoDB Tables

The InnoDB table handler manages AUTO_INCREMENT columns as follows:

. The initial sequence value cannot be set with an AUTO_INCREMENT = n option in the CREATE TABLE statement. Nor can it be modified using that option with ALTER TABLE.
. Values that are deleted from the top of the sequence are not reused.
. Composite indexes cannot be used to generate multiple independent sequences within a table.

AUTO_INCREMENT issues to be considered

You should keep the following points in mind to avoid being surprised when you use AUTO_INCREMENT columns:

. AUTO_INCREMENT is not a column type; it's a column type attribute. Furthermore, AUTO_INCREMENT is an attribute intended for use only with integer types. Versions of MySQL earlier than 3.23 are sloppy in enforcing this constraint and will let you declare a column type such as CHAR with the AUTO_INCREMENT attribute. However, only the integer types work correctly as AUTO_INCREMENT columns.

. The primary purpose of the AUTO_INCREMENT mechanism is to allow you to generate a sequence of positive integers, so you should declare AUTO_INCREMENT columns to be UNSIGNED. This also has the advantage of giving you twice as many sequence numbers before you hit the upper end of the column type's range.

It is possible under some circumstances to generate sequences of negative values using an AUTO_INCREMENT column. But this is an unsupported use of AUTO_INCREMENT and the results are not guaranteed.

My own experiments indicate somewhat inconsistent behavior between versions with regard to negative sequences, so even if you achieve the results you want with one version of MySQL, that may change if you upgrade to a newer version.

(In other words, attempting to use AUTO_INCREMENT for anything but a sequence of positive integers can result in unpredictable behavior. You have been warned!)

. Don't be fooled into thinking that adding AUTO_INCREMENT to a column declaration is a magic way of getting an unlimited sequence of numbers. It's not; AUTO_INCREMENT sequences are always bound by the range of the underlying column type. For example, if you use a TINYINT column, the maximum sequence number is 127. When you reach that limit, your application will begin to fail with "duplicate key" errors. If you use TINYINT UNSIGNED instead, you'll reach the limit at 255.

. Clearing a table's contents entirely may reset a sequence to begin again at 1, even for table types that normally to not reuse AUTO_INCREMENT values. This can occur for either of the following statements:

. DELETE FROM tbl_name;
. TRUNCATE TABLE tbl_name;

The sequence reset occurs due to the way MySQL optimizes a complete table erasure operation: It tosses the data rows and indexes and recreates the table from scratch rather than deleting individual rows.

This causes all sequence number information to be lost. If you want to delete all records but preserve the sequence information, you can suppress this optimization by using DELETE with a WHERE clause that is always true:

DELETE FROM tbl_name WHERE 1;

This forces MySQL to evaluate the condition for each row and thus delete every row individually.

Forcing Non-Reuse of Sequence Values

What can you do to maintain a strictly increasing series of values for table types that reuse values that are deleted from the top of a sequence? One solution is to maintain a separate table that you use only for generating AUTO_INCREMENT values and from which you never delete records.

That way, the values in the table are never reused. When you need to generate a new record in your main table, first insert a NULL into the sequence number table. Then insert the record into your main table using the value of LAST_INSERT_ID() for the column that you want to contain a sequence number:

INSERT INTO ai_tbl SET ai_col = NULL; INSERT INTO main_tbl SET id=LAST_INSERT_ID() ... ;

Adding a Sequence Number Column to a Table

Suppose you create a table and put some information into it:

mysql> CREATE TABLE test_tbl (ch CHAR(10));
mysql> INSERT INTO test_tbl VALUES('a'),('b'),('c');
mysql> SELECT * FROM test_tbl;

+------+
| ch    |
+------+
| a    |
| b    |
| c    |
+------+

Then you decide that you want to include a sequence number column in the table. To do this, issue an ALTER TABLE statement to add an AUTO_INCREMENT column, using the same kind of type definition that you'd use with CREATE TABLE:

mysql> ALTER TABLE test_tbl ADD i INT AUTO_INCREMENT NOT NULL PRIMARY KEY; mysql> SELECT * FROM test_tbl;

+------+---+
| ch   | i |
+------+---+
| a    | 1 |
| b    | 2 |
| c    | 3 |
+------+---+

Note how MySQL has assigned sequence values to the AUTO_INCREMENT column automatically. You need not do so yourself.

Resequencing an Existing Column

If a table already has an AUTO_INCREMENT column but you want to renumber it to eliminate gaps in the sequence that may have resulted from row deletions, the easiest way to do it is to drop the column and then add it again. When MySQL adds the column, it will assign new sequence numbers automatically, as shown in the previous example.

Note how MySQL has assigned sequence values to the AUTO_INCREMENT column automatically. You need not do so yourself.

Resequencing an Existing Column

mysql> CREATE TABLE test_tbl (ch CHAR(10), i INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
mysql> INSERT INTO test_tbl (ch) -> VALUES('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k');
mysql> DELETE FROM t WHERE ch IN('a','d','f','g','j');
mysql> SELECT * FROM test_tbl;

+------+----+
| ch   | i  |
+------+----+
| b    |  2 |
| c    |  3 |
| e    |  5 |
| h    |  8 |
| i    |  9 |
| k    | 11 |
+------+----+

The following ALTER TABLE statement drops the column and then adds it again:

mysql> ALTER TABLE test_tbl
-> DROP i,
-> ADD i INT UNSIGNED AUTO_INCREMENT NOT NULL,
-> AUTO_INCREMENT = 1;
mysql> SELECT * FROM test_tbl;

+------+---+
| ch    | i |
+------+---+
| b    | 1 |
| c    | 2 |
| e    | 3 |
| h    | 4 |
| i    | 5 |
| k    | 6 |
+------+---+

1. For a MyISAM table (or a HEAP table as of MySQL 4.1), you can use a value other than 1 to begin the sequence at a different value. For other table types, just omit the AUTO_INCREMENT clause, because they do not allow the initial value to be specified this way. The sequence will begin at 1.

Note that although it's easy to resequence a column, there is usually very little reason to do so. MySQL doesn't care whether a sequence has holes in it, nor do you gain any performance efficiencies by resequencing.

Generating Sequences Without AUTO_INCREMENT

Another method for generating sequence numbers doesn't use an AUTO_INCREMENT column at all. Instead, it uses an alternate form of the LAST_INSERT_ID() function that takes an argument. (This form was introduced in MySQL 3.22.9.) If you insert or update a column using LAST_INSERT_ID(expr), the next call to LAST_INSERT_ID() with no argument returns the value of expr.

In other words, expr is treated as though it had been generated as an AUTO_INCREMENT value. This allows you to generate a sequence number and then retrieve it later in your session, confident that the value will not have been affected by the activity of other clients.

One way to use this strategy is to create a single-row table containing a value that is updated each time you want the next value in the sequence. For example, you can create and initialize the table as follows:

CREATE TABLE test_table5 (tid INT UNSIGNED NOT NULL);
INSERT INTO test_table5 VALUES(0);

These statements set up seq_table with a single row containing a seq value of 0. To use the table, generate the next sequence number and retrieve it as follows:

UPDATE test_table5 SET tid = LAST_INSERT_ID(tid+1);
SELECT LAST_INSERT_ID();

The UPDATE statement retrieves the current value of the tid column and increments it by 1 to produce the next value in the sequence. Generating the new value using LAST_INSERT_ID(tid+1) causes it to be treated like an AUTO_INCREMENT value, which allows it to be retrieved by calling LAST_INSERT_ID() without an argument.

. LAST_INSERT_ID() is client-specific, so you get the correct value even if other clients have generated other sequence numbers in the interval between the UPDATE and the SELECT.

Other uses for this method are to generate sequence values that increment by a value other than 1 or that are negative. For example, the following statement can be executed repeatedly to generate a sequence of numbers that increase by 100 each time:

UPDATE test_table5 SET tid = LAST_INSERT_ID(tid+100);

Repeating the following statement will generate a sequence of decreasing numbers:

UPDATE test_table5 SET tid = LAST_INSERT_ID(tid-1);

You can also use this technique to generate a sequence that begins at an arbitrary value by setting the seq column to an appropriate initial value.

The preceding discussion describes how to set up a counter using a table with a single row. That's okay for a single counter, but if you want several of them, creating one table per counter leads to needless multiplication of tables.

For example, suppose you have a Web site and you want to put some "this page has been accessed n times" counters in several pages. You probably don't want to set up a separate counter table for every page that has a counter.

One way to avoid creating multiple counter tables is to create a single table with two columns. One column holds a counter value; the other holds a name that uniquely identifies each counter. You can still use the LAST_INSERT_ID() function, but you determine which row it applies to by using the counter name. The table looks like this:

CREATE TABLE ctr
(
pg_name VARCHAR(255) BINARY NOT NULL,
PRIMARY KEY (name),
value INT UNSIGNED
);

The name column is a string so that you can name a counter whatever you want, and it's declared as a PRIMARY KEY to prevent duplicate names. This assumes that applications using the table agree on the names they'll be using.

For Web counters, uniqueness of counter names is ensured simply by using the pathname of each page within the document tree as its counter name. The BINARY attribute causes pathname values to be treated as case sensitive. (Omit it if your system has pathnames that are not case sensitive.)

To use the ctr table, insert a row corresponding to each page for which you need a counter. For example, to set up a new counter for the site's home page, do the following:

INSERT INTO ctr (pg_name,value) VALUES('index.php' ,0);

That initializes a counter named 'index.php' with a value of zero. To generate the next sequence value for the page, use its pathname to look up the correct counter value and increment it with

LAST_INSERT_ID(expr) and then retrieve the value with LAST_INSERT_ID(): UPDATE ctr SET value = LAST_INSERT_ID(value+1) WHERE name = 'index.php'; SELECT LAST_INSERT_ID();
An alternative approach is to increment the counter without using LAST_INSERT_ID(): UPDATE ctr SET value = value+1 WHERE name = 'index.php' ;
SELECT value FROM ctr WHERE name = 'index.php' ;

However, that doesn't work correctly if another client increments the counter after you issue the UPDATE and before you issue the SELECT. You could solve that problem by using a transaction or by putting LOCK TABLES and UNLOCK TABLES around the two statements to block other clients while you're using the counter.

Because its value is client-specific, you always get the value you inserted, not the one from some other client, and you don't have to complicate the code with transactions or locks to keep other clients out.

Character Set Support

Character-based data values are interpreted with respect to a given character set, which determines the allowable characters that can be used. Character sets also have a collating (sorting) order, which affects many types of operations on character values:

. Comparisons: <, <=, =, <>, >=, and >.
. Sorting: ORDER BY, MIN(), MAX().
. Grouping: GROUP BY, DISTINCT.

The character set also affects other aspects of server operation, such as which characters can be used in database, table, and column names, because names normally are constructed from the alphanumeric characters in the server's default character set.

The level of character set support available to you depends on your version of MySQL. Prior to MySQL 4.1, the server operates using a single character set at a time. As of MySQL 4.1, the server can support multiple character sets simultaneously, and character sets can be specified at the server, database, table, column, or string constant level.

For example, if you want a table's columns to use latin1 by default, but also to include a Hebrew column and a Greek column, you can do that. You can also find out what character sets are available or convert data from one character set to another.

This section describes how to use the character sets that are supported by your server.

Character Set Support prior to MySQL 4.1

Prior to MySQL 4.1, data values in MySQL have no explicit character set. Instead, string constants and column values are interpreted with respect to the server's character set.

By default, this is the character set selected when the server was built (usually latin1), but the built-in value can be overridden at runtime with the --default-character-set option. This is very simple but quite limiting. For example, you cannot have a table that stores values using different character sets for different columns.

The single-character-set model also can lead to index-related problems if you change the server's character set after having already created tables and loaded character data into them. These problems occur due to the fact that index values are stored in sorted order, with the order for character columns being determined by the collating sequence of the character set that happens to be in force at the time the index entries are created.

Some character sets have different collating sequences than others, so if you load a table while the server is using one character set and then reconfigure the server to use a different set, it's possible that the index entries will no longer be in the correct order with respect to the collating sequence of the new character set. Worse, if you add new rows to the table, the index that was initially created using the sort order of the original character set will be updated using the order of the new set. Consequently, index-based queries may not work correctly.

Character Set Support in MySQL 4.1 and Later

Character set support has been revised considerably in MySQL 4.1 to provide the following features:

. Support for using multiple character sets simultaneously

. The ability to specify character sets at the server, database, table, column, and string constant level, not just at the server level:
o An ALTER DATABASE statement for database character set assignment
o CREATE TABLE and ALTER TABLE clauses for table- and column-level character set assignment

. Functions and operators for converting individual values from one character set to another or for determining the character set of a value

. A COLLATE operator for treating values in one character set as having the collating order of another character set

. A SHOW CHARACTER SET statement to list all the character sets the server knows about

. Automatic index reordering when character set changes occur
. Unicode support, provided by the utf8 and ucs2 character sets
. Many other new character sets

You cannot mix character sets within a string or use different character sets for different rows of a given column. However, by using a Unicode character set (which represents the encodings for many languages within a single character set), you may be able to implement multi-lingual support of the type you desire.

Specifying Character Sets

Character sets can be assigned at several levels, from the default used by the server down to the set used for individual strings:

. The server's default character set is built in at compile time, and you can override it at server startup time by using a --default-character-set option.
. To specify a default character set for a database, use the following statement:

. ALTER DATABASE db_name DEFAULT CHARACTER SET charset; charset is the name of a supported character set, or DEFAULT. A value of DEFAULT indicates that the database has no explicit character set; in this case, the server makes database-level character set decisions by referring to the server's default character set.

. To specify a default character set for a table, use a CHARACTER SET table option at table creation time:
. CREATE TABLE tbl_name (...) CHARACTER SET = charset; charset is the name of a supported character set, or DEFAULT. A value of DEFAULT tells the server to make table-level character set decisions by referring to the database character set.

. Columns in a table can be assigned a character set explicitly with a CHARACTER SET attribute. For example:

. c CHAR(10) CHARACTER SET
charset In this case, the charset value must be the name of a supported character set; it cannot be DEFAULT. However, you can omit the CHARACTER SET attribute entirely, in which case the table-level character set is used. Column types for which a character set can be given are CHAR and VARCHAR (if declared without the BINARY attribute) and the TEXT types.

. String constants can be converted to a given character set using the following notation, where charset is the name of a supported char acter set:

The following examples produce strings in the latin1_de and utf8 character sets:

_latin1_de 'abc' _utf8 'def'

This notation works only for literal quoted strings, not for hexadecimal constants, string expressions, or column values. However, any string can be converted to a designated character set using the CONVERT() function:

SELECT CONVERT(str USING charset);

It's also possible to sort values from given character set using the collating sequence for a different set by using the COLLATE operator. For example, if c is a latin1 column but you want to order it using latin1_de sorting rules, do this:

SELECT c FROM t ORDER BY c COLLATE latin1_de;

Determining What Character Sets Are In Use

Character set support in MySQL 4.1 and up includes statements for obtaining information at several levels:

. At the server level, you can find out which character sets are available using the following query:
. SHOW CHARACTER SET; To determine what the server's default character set is, issue the following query: SHOW VARIABLES LIKE 'character_set';

. The database-level character set for a given database can be obtained as follows:
. SHOW CREATE DATABASE db_name; If the statement output doesn't indicate a character set, the database's character set has been never been set or has been set explicitly to DEFAULT.

. A table's character set can be discovered two ways:
. SHOW CREATE TABLE tbl_name;
. SHOW TABLE STATUS LIKE 'tbl_name';
. Individual character set assignments for a table's columns are displayed by each of the following statements:
. DESCRIBE tbl_name;
. SHOW COLUMNS FROM tbl_name;

. SHOW CREATE TABLE tbl_name;
. To determine what character set is associated with a string, string expression, or column value, use the CHARSET() function:
. SELECT CHARSET(str);

Unicode Support

One of the reasons there are so many character sets is that different encodings have been developed for different languages. This presents several problems. For example, a given character that is common to several languages might be represented by different numeric values in different encodings.

Also, different languages require different numbers of bytes to represent characters. The Latin-1 character set is small enough that every character fits in a single byte, but some languages, such as those used in Japan and China, contain so many characters that they require multiple bytes per character.

The goal of Unicode is to provide a unified character-encoding system within which all languages can be represented in a consistent manner. In MySQL, Unicode support is provided through two character sets:

. UTF-8 is a variable-length format in which characters are represented using from one to four characters. (UTF is an abbreviation for UCS Transformation Format, where UCS is itself an abbreviation for Universal Character Set.) The utf8 character set in MySQL does not include any four-byte characters, although support for them may be added in the future.

. The other Unicode character set in MySQL is UCS2. The ucs2 set represents each character using two bytes, most significant byte first. This character set does not represent characters that require more than two bytes.

Expression & Operators

MySQL Expressions

MySQL allows you to write expressions that include constants, function calls, and references to table columns.

These values can be combined using different kinds of operators, such as arithmetic or comparison operators, and terms of an expression can be grouped with parentheses. Expressions occur most commonly in the output column list and WHERE clause of SELECT statements. For example, the following:

SELECT
CONCAT(last_name, ', ', first_name),
(YEAR(death) - YEAR(birth)) - IF(RIGHT(death,5) < RIGHT(birth,5),1,0)
FROM president
WHERE
birth > '1900-1-1' AND DEATH IS NOT NULL;

Each column selected represents an expression, as does the content of the WHERE clause. Expressions also occur in the WHERE clause of DELETE and UPDATE statements, the VALUES() clause of INSERT statements, and so on.

When MySQL encounters an expression, it evaluates it to produce a result. For example, (4*3)/(4-2) evaluates to the value 6. Expression evaluation may involve type conversion, such as when MySQL converts the number 960821 into a date '1996-08-21' if the number is used in a context requiring a DATE value.

Writing Expressions

An expression can be as simple as a single constant:

0 Numeric constant
'abc' String constant

Expressions can use function calls. Some functions take arguments (values inside the parentheses) and some do not. Multiple arguments should be separated by commas.

NOW() Function with no arguments
STRCMP('abc','def') Function with two arguments
STRCMP( 'abc', 'def' ) Spaces around arguments are legal
STRCMP ('abc','def') Space after function name is illegal

If there is a space after the function name, the MySQL parser may interpret the function name as a column name. (Function names are not reserved words, and you can use them for column names if you want.) The usual result is a syntax error.

You can use table column values in expressions. In the simplest case, when the table to which a column belongs is clear from context, a column reference can be given simply as the column name. Only one table is named in each of the following SELECT statements, so the column references are unambiguous, even though the same column names are used in each statement:

SELECT last_name, first_name FROM associate;
SELECT last_name, first_name FROM member;

If it's not clear which table should be used, column names can be preceded by the table name. If it's not clear which database should be used, the table name can be preceded by the database name. You can also use these more-specific forms in unambiguous contexts if you simply want to be more explicit:

SELECT
associate.last_name, president.first_name,
member.last_name, member.first_name
FROM president, member
WHERE associate.last_name = member.last_name;
SELECT sampdb.student.name FROM sampdb.student;

Finally, you can combine all these kinds of values (constants, function calls, and column references) to form more complex expressions.

Arithmatic Operators

MySQL includes several kinds of operators that can be used to combine terms of expressions. Arithmetic operators, include the usual addition, subtraction, multiplication, and division operators, as well as the modulo operator.

Arithmetic is performed using BIGINT (64-bit) integer values for +, -, and * when both operands are integers, as well as for / and % when the operation is performed in a context where the result is expected to be an integer.

Otherwise, DOUBLE is used. Be aware that if an integer operation involves large values such that the result exceeds 64-bit range, you will get unpredictable results. (Actually, you should try to avoid exceeding 63-bit values; one bit is needed to represent the sign.)

Arithmetic Operators
+, -, *, /	basic arithmetic
%	mod (remainder of integer division)
DIV	alternative division operator (since MySQL 4.1)
MOD	alternative modulo operator (since MySQL 4.1)

Logical Operators

Logical operators, evaluate expressions to determine whether they are true (non-zero) or false (zero). It is also possible for a logical expression to evaluate to NULL if its value cannot be ascertained (for example, 1 AND NULL is of indeterminate value).

MySQL allows the C-style &&, ||, and ! operators as alternative forms of AND, OR, and NOT. Note in particular the || operator; ANSI SQL specifies || as the string concatenation operator, but in MySQL it signifies a logical OR operation. If you want the ANSI behavior for ||, start the server with the --ansi or --sql-mode=PIPES_AS_CONCAT option.

If you use the following expression, expecting it to perform string concatenation, you may be surprised to discover that it returns the number 0:

'abc' || 'def'

'abc' and 'def' are converted to integers for the operation, and both turn into 0. In MySQL, you must use CONCAT('abc','def') to perform string concatenation:

CONCAT('abc','def') 'abcdef'

Logical Operators
Operator	Syntax	Meaning
AND, &&	a AND b, a && b	Logical intersection; true if both operands are true
OR, \|\|	a OR b, a \|\| b	Logical union; true if either operand is true
XOR	a XOR b	Logical exclusive-OR; true if exactly one operand is true
NOT, !	NOT a, !a	Logical negation; true if operand is false

Bit Operators

Bit operators, perform bitwise intersection, union and exclusive-OR where each bit of the result is evaluated as the logical AND, OR, or exclusive-OR of the corresponding bits of the operands.

(The XOR and ^ exclusive-OR operators are not available until MySQL 4.0.2.) You can also perform bit shifts left or right. Bit operations are performed using BIGINT (64-bit) integer values.

Bit Operators
Operator	Syntax	Meaning
&	a & b	Bitwise AND (intersection); each bit of result is set if corresponding bits of both operands are set
\|	a \| b	Bitwise OR (union); each bit of result is set if corresponding bit of either operand is set
^	a ^ b	Bitwise exclusive-OR; each bit of result is set only if exactly one corresponding bit of the operands is set
<<	a << b	Left shift of a by b bit positions
>>	a >> b	Right shift of a by

Comparison Operators

Comparison operators, include operators for testing relative magnitude or lexical ordering of numbers and strings as well as operators for performing pattern matching and for testing NULL values. The <=> operator is MySQL-specific and was introduced in MySQL 3.23.

Comparison Operators
Operator	Syntax	Meaning
=	a = b	True if operands are equal
<=>	a <=> b	True if operands are equal (even if NULL)
!=, <>	a != b, a <> b	True if operands are not equal
<	a < b	True if a is less than b
<=	a <= b	True if a is less than or equal to b
>=	a >= b	True if a is greater than or equal to b
>	a > b	True if a is greater than b
IN	a IN (b1, b2, ...)	True if a is equal to any of b1, b2, …
BETWEEN	a BETWEEN b AND C	True if a is between the values of b and c, inclusive
NOT BETWEEN	a NOT BETWEEN b AND C	True if a is not between the values of band c, inclusive
LIKE	a LIKE b	SQL pattern match; true if a matches b
NOT LIKE	a NOT LIKE b	SQL pattern match; true if a does not match b
REGEXP	a REGEXP b	Regular expression match; true if a matches b
NOT REGEXP	a NOT REGEXP b	Regular expression match; true if a does not match b
IS NULL	a IS NULL	True if operand is NULL
IS NOT NULL	a IS NOT NULL	True if operand is not NULL

Comparison operators return 1 if the comparison is true and 0 if the comparison is false. You can compare numbers or strings. Operands are converted as necessary according to the following rules:

. Other than for the <=> operator, comparisons involving NULL values evaluate as NULL. (<=> is like =, except that the value of the expression NULL <=> NULL is true.)

. If both operands are strings, they are compared lexically as strings. Binary strings are compared on a byte-by-byte basis using the numeric value of each byte. Comparisons for non-binary strings are performed character by character, using the collating sequence of the character set in which the strings are expressed. If the strings have different character sets (as is possible as of MySQL 4.1), the comparison might not yield meaningful results. A comparison between a binary and a non-binary string is treated as a comparison of binary strings.

. If both operands are integers, they are compared numerically as integers.

. As of MySQL 3.23.22, hexadecimal constants are compared as numbers. Before that, hex constants that are not compared to a number are compared as binary strings.

. If either operand is a TIMESTAMP or DATETIME value and the other is a constant, the operands are compared as TIMESTAMP values. This is done to make comparisons work better for ODBC applications.

. If none of the preceding rules apply, the operands are compared numerically as floating-point values. Note that this includes the case of comparing a string and a number. The string is converted to a number, which results in a value of 0 if the string doesn't look like a number. For example, '14.3' converts to 14.3,but 'L4.3' converts to 0.

The following comparisons illustrate these rules:

2 < 12 1

'2' < '12' 0

'2' < 12 1

The first comparison involves two integers, which are compared numerically. The second comparison involves two strings, which are compared lexically. The third comparison involves a string and a number, so they are compared as floating-point values.

Operator Precedence

Operators have varying levels of precedence. The levels are shown in the following list, from highest to lowest. Operators on the same line have the same precedence. Operators at a given precedence level are evaluated left to right. Operators at a higher precedence level are evaluated before operators at a lower precedence level.

BINARY COLLATE
NOT !
^
XOR
- (unary minus)
~ (unary bit negation)
* / %
+ -
<< >>
&
|
< <= = <=> != <> >= > IN IS LIKE REGEXP RLIKE
BETWEEN CASE WHEN THEN ELSE
AND &&
OR
|| :=

The unary operators (unary minus, unary bit negation, NOT, and BINARY) bind more tightly than the binary operators. That is, they group with the immediately following term in an expression, not with the rest of the expression as a whole.

-2+3 =1

-(2+3) = -5

Type Conversion

Whenever a value of one type is used in a context that requires a value of another type, MySQL performs extensive type conversion automatically according to the kind of operation you're performing. Type conversion can occur for any of the following reasons:

. Conversion of operands to a type appropriate for evaluation of an operator
. Conversion of a function argument to a type expected by the function
. Conversion of a value for assignment into a table column that has a different type

You can also perform explicit type conversion using a cast operator or function.

The following expression involves implicit type conversion. It consists of the addition operator + and two operands, 1 and '2':

1 + '2'

The operands are of different types (number and string), so MySQL converts one of them to make them the same type. But which one should it change? In this case, + is a numeric operator; MySQL wants the operands to be numbers and converts the string '2' to the number 2. Then it evaluates the expression to produce the result 3.

Here's another example. The CONCAT() function concatenates strings to produce a longer string as a result. To do this, it interprets its arguments as strings, no matter what type they are. If you pass it a bunch of numbers, CONCAT() will convert them to strings and then return their concatenation:

CONCAT(1,2,3)= '123'

If the call to CONCAT() is part of a larger expression, further type conversion may take place. Consider the following expression and its result:

REPEAT('X',CONCAT(1,2,3)/10)= 'XXXXXXXXXXXX'

CONCAT(1,2,3) produces the string '123'. The expression '123'/10 is converted to 123/10 because division is an arithmetic operator. The result of this expression would be 12.3 in floating-point context, but REPEAT() expects an integer repeat count, so an integer division is performed to produce 12. Then REPEAT('X',12) produces a string result of 12 'X' characters.

A general principle to keep in mind is that MySQL attempts to convert values to the type required by an expression rather than generating an error. Depending on the context, it will convert values of each of the three general categories (numbers, strings, or dates and times) to values in any of the other categories.

However, values can't always be converted from one type to another. If a value to be converted to a given type doesn't look like a legal value for that type, the conversion fails. Conversion to numbers of things like 'abc' that don't look like numbers results in a value of 0.

Conversion to date or time types of things that don't look like a date or time result in the "zero" value for the type. For example, converting the string 'abc' to a date results in the "zero" date '0000-00-00'. On the other hand, any value can be treated as a string, so it's generally not a problem to convert a value to a string.

MySQL also performs more minor type conversions. If you use a floating-point value in an integer context, the value is converted (with rounding). Conversion in the other direction works as well; an integer can be used without problem as a floating-point number.

Hexadecimal constants are treated as strings unless the context clearly indicates a number. In string contexts, each pair of hexadecimal digits is converted to a character and the result is used as a string. The following examples illustrate how this works:

0x61= 'a'

0x61 + 0 =97

X'61'= 'a'

X'61' + 0 =97

CONCAT(0x61)= 'a'

CONCAT(0x61 + 0) ='97'

CONCAT(X'61')= 'a'

CONCAT(X'61' + 0) ='97'

In comparisons, treatment of hexadecimal constants depends on your version of MySQL. From MySQL 3.23.22 and later, hex constants in comparisons are treated as numbers:

0x0a = '\n' =0
0xaaab < 0xab= 0
0xaaab > 0xab= 1
0x0a = 10= 1

Some operators force conversion of the operands to the type expected by the operator, no matter what the type of the operands is. Arithmetic operators are an example of this; they expect numbers and the operands are converted accordingly:

3 + 4 =7
'3' + 4= 7
'3' + '4' =7

In string-to-number conversion, it's not enough for a string simply to contain a number somewhere. MySQL doesn't look through the entire string hoping to find a number, it looks only at the beginning; if the string has no leading numeric part, the conversion result is 0.

'1973-2-4' + 0
1973 '12:14:01' + 0= 12
'23-skidoo' + 0= 23
'-23-skidoo' + 0= -23
'carbon-14' + 0= 0

The logical and bit operators are even stricter than the arithmetic operators. They want the operators to be not only numeric, but to be integers, and type conversion is performed accordingly. This means that a floating-point number, such as 0.3, is not considered true, even though it's non-zero; that's because the result is 0 when it's converted to an integer. In the following expressions, the operands are not considered true until they have a value of at least 1.

When evaluating comparisons, MySQL converts operands as necessary according to the following rules:

. Other than for the <=> operator, comparisons involving NULL values evaluate as NULL. (<=> is like =, except that NULL <=> NULL is true.)

. If both operands are strings, they are compared lexically as strings. Binary strings are compared on a byte-by-byte basis using the numeric value of each byte. Comparisons for non-binary strings are performed character-by-character using the collating sequence of the character set in which the strings are expressed. If the strings have different character sets (as is possible as of MySQL 4.1), the comparison may not yield meaningful results. A comparison between a binary and a non-binary string is treated as a comparison of binary strings.

. If both operands are integers, they are compared numerically as integers.
. As of MySQL 3.23.22, hexadecimal constants are compared as numbers. Before that, hex constants that are not compared to a number are compared as binary strings.

. If either operand is a TIMESTAMP or DATETIME value and the other is a constant, the operands are compared as TIMESTAMP values. This is done to make comparisons work better for ODBC applications.
. Otherwise, the operands are compared numerically as floating-point values. Note that this includes the case of comparing a string and a number. The string is converted to a number, which results in a value of 0 if the string doesn't look like a number. For example, '14.3' converts to 14.3, but 'L4.3' converts to 0.

Conversion of Out-of-Range or Illegal Values

The basic principle is this: Garbage in, garbage out. If you don't verify your data first before storing it, you may not like what you get. Having said that, the following are some general principles that describe MySQL's handling of out-of-range or otherwise improper values:

. For numeric or TIME columns, values that are outside the legal range are clipped to the nearest endpoint of the range and the resulting value is stored.

. For date and time columns other than TIME, values that are outside the range for the type may be converted to the "zero" value, NULL, or some other value. (In other words, the results are unpredictable.)

. For string columns other than ENUM or SET, strings that are too long are truncated to fit the maximum length of the column. Assignments to an ENUM or SET column depend on the values that are listed as legal in the column definition. If you assign to an ENUM column a value that is not listed as an enumeration member, the error member is assigned instead (that is, the empty string that corresponds to the zero-valued member). If you assign to a SET column a value containing substrings that are not listed as set members, those strings drop out and the column is assigned a value consisting of the remaining members.

. For date or time columns, illegal values are converted to the appropriate "zero" value for the.

These conversions are reported as warnings for ALTER TABLE, LOAD DATA, UPDATE, INSERT INTO ... SELECT, and multiple-row INSERT statements. In the mysql client, this information is displayed in the status line that is reported for a query.

In a programming language, you may be able to get this information by some other means. If you're using the MySQL C or PHP APIs, you can invoke the mysql_info() function. With the Perl DBI API, you can use the mysql_info attribute of your database connection. The information provided is a count of the number of warnings.

Working with Tables

MySQL Table Types

MySQL supports multiple table handlers, each of which implements a table type that has a specific set of properties or characteristics. The table types actually available to you will depend on your version of MySQL, how it was configured at build time, and the options with which it was started.

The current table type handlers and the versions in which they are first available are listed in the following table:

Table Type	MySQL Version
ISAM	All versions
MyISAM	3.23.0
MERGE	3.23.25
HEAP	3.23.0
BDB	3.23.17/3.23.34a
InnoDB	3.23.29/3.23.34a

Two version numbers are listed for BDB and InnoDB. The first number indicates when the table type appeared in binary distributions, the second when it became available in source distributions. MRG_MyISAM and BerkeleyDB are synonyms for MERGE and BDB. (From 3.23.29 through 3.23.36, the InnoDB table type was known as Innobase; thereafter, InnoDB is the preferred name, though Innobase is recognized as a synonym.)

Because MySQL can be configured in different ways, it's quite possible that a server for a given version of MySQL will not support all table types available in that version. The general characteristics of MySQL's table types are described below:

ISAM Tables

The ISAM handler manages tables that use the indexed sequential access method. The ISAM storage format is the original MySQL table type and is the only one available prior to Version 3.23. The ISAM handler has since been superceded by the MyISAM handler; MyISAM tables are the preferred general replacement because they have fewer limitations.

The ISAM type is still available but is considered pretty much obsolete. Support for it will fade over time. (ISAM table support has been omitted from the embedded server now, for example, and probably will disappear entirely in MySQL 5.)

MyISAM Tables

The MyISAM storage format is the default table type in MySQL as of version 3.23, unless the server has been configured otherwise.

. Tables can be larger than for the ISAM storage method if your operating system itself allows large file sizes.

. Table contents are stored in machine-independent format. This means you can copy tables directly from one machine to another, even if they have different architectures.

. Relative to ISAM tables, MyISAM relaxes several indexing constraints.

. MyISAM format provides better key compression than does ISAM format. Both formats use compression when storing runs of successive similar string index values, but MyISAM also can compress runs of similar numeric index values because numeric values are stored with the high byte first. (Index values tend to vary faster in the low-order bytes, so high-order bytes are more subject to compression.) To enable numeric compression, use the PACK_KEYS=1 option when creating a table.

. MyISAM has more capable AUTO_INCREMENT handling than is available for other table types.

. For improved table integrity checking, each MyISAM table has a flag that is set when the table is checked by the server or by myisamchk. MyISAM tables also have a flag indicating whether a table was closed properly. If the server shuts down abnormally or the machine crashes, the flag can be used to detect tables that need to be checked. This can be done automatically at server startup time by specifying the --myisam-recover option.

. The MyISAM handler supports full text searching through the use of FULLTEXT indexes.

MERGE Tables

MERGE tables are a means for grouping multiple MyISAM tables into a single logical unit. By querying a MERGE table, you are in effect querying all the constituent tables. One advantage of this is that you can in effect exceed the maximum table size allowed by the file system for individual MyISAM tables.

The tables that make up a MERGE table must all have the same structure. This means the columns in each table must be defined with the same names and types in the same order, and the indexes must be defined in the same way and in the same order. It is allowable to mix compressed and uncompressed tables.

A MERGE table cannot refer to tables in a different database.

HEAP Tables

The HEAP storage format uses tables that are stored in memory and that have fixed-length rows, two characteristics that make them very fast. HEAP tables are temporary in the sense that they disappear when the server terminates.

However, in contrast to temporary tables created with CREATE TEMPORARY TABLE, HEAP tables are visible to other clients. Several constraints apply to HEAP tables that allow them to be handled more simply and thus more quickly:

. Indexes are used only for comparisons performed with the = and <=> operators. This is due to the use of hashed indexes, which are very fast for equality comparisons but not for range searches with comparison operators such as < or >. Indexes also are not used in ORDER BY clauses for this reason.

. You cannot have NULL values in indexed columns prior to MySQL 4.0.2.

. AUTO_INCREMENT columns cannot be used prior to MySQL 4.1.

. BLOB and TEXT columns cannot be used. Because rows are stored using fixed-length format, you cannot use variable length column types such as BLOB and TEXT. VARCHAR is allowed but is treated internally as the corresponding CHAR type.

BDB Tables

BDB tables are managed by the Berkeley DB handler developed by Sleepycat. The BDB handler offers these features:

. Transaction-safe tables with commit and rollback
. Automatic recovery after a crash
. Page-level locking for good concurrency performance under query mix conditions that include both retrievals and updates

InnoDB Tables

InnoDB tables are the most recent table type added to MySQL. They are managed by the InnoDB handler developed by Innobase Oy. The InnoDB handler offers the following features:

. Transaction-safe tables with commit and rollback.
. Automatic recovery after a crash.
. Foreign key support, including cascaded delete.

. Row-level locking for good concurrency performance under query mix conditions that include both retrievals and updates.

. InnoDB tables are managed within a separate tablespace rather than by using table-specific files like the other table types. The tablespace can consist of multiple files and can include raw partitions. The InnoDB handler, in effect, treats the tablespace as a virtual file system within which it manages the contents of all InnoDB tables.

. Tables can exceed the size allowed by the file system for individual files through use of multiple files or raw partitions in the tablespace.

Table Representation on Memory

Every table, no matter its format, is represented on disk by a file that contains the table's format (that is, its definition). This file has a basename that is the same as the table name and a .frm extension.

For most table types, a table's contents are stored on disk using other files that are unique to the table. The exceptions are for HEAP and InnoDB tables, for which the .frm file is the only one that is uniquely associated with a given table.

(HEAP table contents are stored in memory. InnoDB table contents are managed within the InnoDB tablespace in common with other InnoDB tables, not within files specific to a particular table.) The various table types use files with the following extensions:

Table Type	Files on Disk
ISAM	.frm (definition), .ISD (data), .ISM (indexes)
MyISAM	.frm (definition), .MYD (data), .MYI (indexes)
MERGE	.frm (definition), .MRG (list of constituent MyISAM table names)
HEAP	.frm (definition)
BDB	.frm (definition), .db (data and indexes)
InnoDB	.frm (definition)

For any given table, the files specific to it are located in the directory that represents the database to which the table belongs.

Portability Characterstics

Any table is portable to another server in the sense that you can dump it into a text file with mysqldump, move the file to the machine where the other server runs, and load the file to recreate the table.

Portability as described in this section means that you can directly copy the files that represent the table on disk to another machine, install them into a database directory, and expect the MySQL server there to be able to use the table.

Of course, HEAP tables do not satisfy this definition because their contents are stored in memory, not on disk. Of the other table types, some are portable and some are not:

. ISAM tables are stored in a machine-dependent format, so they are portable only between machines that have identical hardware characteristics.

. BDB tables are not portable because the location of the table is encoded into the table's .db file. This makes a BDB table location-specific within the file system of the machine on which the table was created. (That's the conservative view of BDB portability.

I have experimented with BDB files in various ways, such as by moving them between database directories, renaming the files to use a different basename, and so on. I have not observed ill effects. But presumably it's better to play it safe and move BDB tables by dumping them with mysqldump and re-creating them on the destination machine by reloading the dump file.)

. MyISAM and InnoDB tables are stored in machine-independent format and are portable, assuming that your processor uses two's-complement integer arithmetic and IEEE floating-point format. Unless you have some kind of oddball machine, neither of these conditions should present any real issues.

In practice, you're probably most likely to see portability-compromising variation in hardware if you're using an embedded server built for a special-purpose device, as these sometimes will use processors that have non-standard operating characteristics.

. MERGE tables are portable as long as their constituent MyISAM files are portable.

In essence, the portability requirements for MyISAM and InnoDB tables are that they either contain no floating-point columns or that both machines use the same floating-point storage format.

"Floating-point" means FLOAT and DOUBLE here. DECIMAL columns are stored as strings, which are portable.

Note that for InnoDB, portability must be assessed at the tablespace level, not at the table level. The InnoDB handler stores the contents of all InnoDB tables within the tablespace rather than within table-specific files.

Consequently, it's the InnoDB tablespace files that are or are not portable, not individual InnoDB tables. This means that the floating-point portability constraint applies if any InnoDB table uses floating-point columns.

Regardless of a table type's general portability characteristics, you should not attempt to copy table or tablespace files to another machine unless the server has been shut down cleanly.

You cannot assume the integrity of your tables if you perform a copy after an unclean shutdown; they may be in need of repair or there may be transaction information still stored in a table handler's log files that needs to be applied or rolled back to bring tables up to date.

Similarly, if the server is running and actively updating tables, the table contents on disk will be in flux and the associated files will not yield usable table copies. In the case of a running server, you may be able to tell it to leave the tables alone while you copy them.

Creating a Table

Databases store data in tables. So what are these tables? In simplest terms, tables consist of rows and columns. Each column defines data of a particular type. Rows contain individual records.

Consider the following:

emp_id	Name	Age	Country	Email
11001	Rakesh	22	India	raka@mkdtutorials.com
11002	Vishwa	25	USA	vishwa@mkdtutorials.com
11003	Naresh Thapa	48	Nepal	nareshthapa@yahoo.com
11004	Pushpa	23	Pakistan	pushpa@pakmail.com

The table above contains five columns that store the employee id, name, age, country and email. Each row contains data for one individual. This is called a record. To find the country and email of Naresh Thapa , you'd first pick the name from the first column and and then look in the third and fourth columns of the same row.

A database can have many tables; it is tables, that contain the actual data. Hence, we can segregate related (or unrelated) data in different tables. For our employees database we'll have one table that stores company details of the employees. The other table would contain personal information. Let's make the first table.

The SQL command for creating tables looks complex when you view it for the first time. Don't worry if you get confused, we'll be discussing this in more detail in later sessions.

CREATE TABLE employee_details
(
emp_id int unsigned not null auto_increment primary key,
name varchar(20),
age int,
country varchar(30),
email varchar(60)
)TYPE=MYISAM AUTO_INCREMENT=11000;

Note: In MySQL, commands and column names are not case-sensitive; however, table and database names might be sensitive to case depending on the platform (as in Linux). You can thus, use create table instead of CREATE TABLE.

The CREATE TABLE keywords are followed by the name of the table we want to create, employee_details. Each line inside the parenthesis represents one column. These columns store the employee id, name, age, country of origin and emails of our employees and are given descriptive names emp_id, name, age, country and email, respectively.

Each column name is followed by the column type. Column types define the type of data the column is set to contain. In our example, columns, name, country and email would contain small text strings, so we set the column type to varchar, which means varriable characters.

The maximum number of characters for varchar columns is specified by a number enclosed in parenthesis immediately following the column name. Columns age would contain numbers (integers), so we set the column type to int. Our first column (emp_id) contains an employee id. Its column type looks really mean, yeh?. Let's break it down.

int: specifies that the column type is an integer (a number). unsigned: determines that the number will be unsigned (positive integer). not null: specifies that the value cannot be null (empty); that is, each row in the column would have a value.

auto_increment: When MySQl comes across a column with an auto_increment attribute, it generates a new value that is one greater than the largest value in the column. Thus, we don't need to supply values for this column, MySQL generates it for us! Also, it follows that each value in this column would be unique. (We'll discuss the benefits of having unique values very shortly). primary key: helps in indexing the column that help in faster searches. Each value has to be unique.

Why have a column with unique values?

Our company MKDTutorials.com has grown tremendously over the past two years. We've recruited thousands. Don't you think there is a fair chance that two employees might have the same name? Now, when that happens, how can we distinguish the records of these two employees unless we give them unique identification numbers? If we have a column with unique values, we can easily distinguish the two records. The best way to assign unique numbers is to let MySQL do it!

Using a database

We've already created our employees database. Now let's start the mysql client program and select our database. Once at the mysql prompt, issue the command:

SELECT DATABASE();

The system responds with

mysql> SELECT DATABASE();
or mysql>show databases;

+------------+
| DATABASE() |
+------------+
|            |
+------------+

1 row in set (0.01 sec)

The above shows that no database has been selected. Actually, everytime we work with mysql client, we have to specify which database we plan to use. There are several ways of doing it.

Specifying the database name at the start; type the folowing at the system prompt:

mysql employees (under Windows)
mysql employees -u manish -p (under Linux)

Specifying the database with the USE statement at the mysql prompt:

mysql>USE employees;

Specifying the database with \u at the mysql prompt:

mysql>\u employees;

It's necessary to specify the database we plan to use, else MySQL will throw an error.

Provisional Table Creation

To create a table only if it doesn't already exist, use CREATE TABLE IF NOT EXISTS. This feature is available as of MySQL 3.23.0. You can use it for an application that makes no assumptions about whether a table that it needs has been set up in advance.

The application can go ahead and attempt to create the table as a matter of course. The IF NOT EXISTS modifier is particularly useful for scripts that you run as batch jobs with mysql. In this context, a regular CREATE TABLE statement doesn't work very well.

The first time the job runs, it creates the table, but the second time an error occurs because the table already exists. If you use IF NOT EXISTS, there is no problem. The first time the job runs, it creates the table, as before. For the second and subsequent times, table creation attempts are silently ignored without error. This allows the job to continue processing as if the attempt had succeeded.

Temporary Tables

You can use CREATE TEMPORARY TABLE to create temporary tables that disappear automatically when your session ends. This is handy because you don't have to bother issuing a DROP TABLE statement explicitly to get rid of the table, and the table doesn't hang around if your session terminates abnormally.

For example, if you have a canned query in a batch file that you run with mysql and decide not to wait for it to finish, you can kill the script in the middle with impunity and the server will remove any temporary tables that the script creates.

A temporary table is visible only to the client that creates the table. The name can be the same as that of an existing permanent table. This is not an error, nor does the existing permanent table get clobbered.

Instead, the permanent table becomes hidden (inaccessible) while the temporary table exists. Suppose you create a temporary table in the sampdb database named member. The original member table becomes hidden, and references to member refer to the temporary table.

If you issue a DROP TABLE member statement, the temporary table is removed and the original member table "reappears." If you simply disconnect from the server without dropping the temporary table, the server automatically drops it for you.

The next time you connect, the original member table is visible again. (The original table also reappears if you rename a temporary table that hides it to have a different name. If the temporary table's new name happens to be that of another permanent table, that table becomes hidden while the temporary table has its name.)

The name-hiding mechanism works only to one level. That is, you cannot create two temporary tables with the same name. A TEMPORARY table can be created with a particular storage format by using a TYPE option. (Prior to MySQL 3.23.54, a MERGE table cannot be TEMPORARY.)

Prior to MySQL 3.23.2, TEMPORARY is unavailable, so there are no true temporary tables except in the sense that you consider them temporary in your own mind. You must remember to drop such a table yourself.

If you forget, the table hangs around until you notice and remove it. Table persistence also occurs if an application creates a table but exits early due to an error before it can drop the table.

Creating Tables the Other way

One of the key concepts of relational databases is that everything is represented as a table of rows and columns, and the result of every SELECT is also a table of rows and columns.

In many cases, the "table" that results from a SELECT is just an image of rows and columns that scroll off the top of your display as you continue working. But sometimes it is desirable to save a query result in another table so that you can refer to it later.

As of MySQL 3.23.0, you can do that easily. Use a CREATE TABLE ... SELECT statement to cause a new table to spring into existence on-the-fly to hold the result of an arbitrary SELECT query. You can do this in a single step without having to know or specify the data types of the columns you're retrieving.

This makes it exceptionally easy to create a table fully populated with the data you're interested in, ready to be used in further queries. For example, the following statement creates a new table named student_f that consists of information for all female students in the student table:

CREATE TABLE student_f SELECT * FROM student WHERE sex = 'f';

To copy an entire table, omit the WHERE clause:

CREATE TABLE new_tbl_name SELECT * FROM tbl_name;

Or, to create an empty copy, use a WHERE clause that always evaluates to false:

CREATE TABLE new_tbl_name SELECT * FROM tbl_name WHERE 0;

Creating an empty copy of a table is useful if you want to load a data file into the original table using LOAD DATA, but you're not sure if you have the options for specifying the data format quite right.

You don't want to end up with malformed records in the original table if you don't get the options right the first time! Using an empty copy of the original table allows you to experiment with the LOAD DATA options for specifying column and line delimiters until you're satisfied your input records are being interpreted properly.

After you're satisfied, you can load the file into the original table. Do that either by rerunning the LOAD DATA statement with the original table name or by copying the data into it from the copy:

INSERT INTO orig_tbl SELECT * FROM copy_tbl;

You can combine CREATE TEMPORARY TABLE with SELECT to retrieve a table's contents into a temporary copy of itself:

CREATE TEMPORARY TABLE mytbl SELECT * FROM mytbl;

That allows you to modify the contents of mytbl without affecting the original, which can be useful when you want to try out some queries that modify the contents of the table, but you don't want to change the original table.

To use pre-written scripts that use the original table name, you don't need to edit them to refer to a different table; just add the CREATE TEMPORARY TABLE statement to the beginning of the script. The script will create a temporary copy and operate on the copy, and the server will delete the copy when the script finishes.

Note: One caution to observe here is that some clients, such as mysql, attempt to reconnect to the server automatically if the connection drops. Should this happen when you're working with the temporary table, it will be dropped and the queries executed subsequent to reconnecting will use the original table. Keep this in mind if you have an unreliable network.

To create a table as an empty copy of itself, use a WHERE clause that is never true in conjunction with CREATE TEMPORARY TABLE ... SELECT:

CREATE TEMPORARY TABLE mytbl SELECT * FROM mytbl WHERE 0;

Creating a table on-the-fly from the results of a SELECT statement is a powerful capability, but there are several issues to consider when doing this.

With CREATE TABLE ... SELECT, you should use aliases as necessary to provide reasonable column names. When you create a table by selecting data into it, the column names are taken from the columns that you are selecting.

If a column is calculated as the result of an expression, the "name" of the column is the text of the expression. Prior to MySQL 3.23.6, the following statement will fail outright, because expressions aren't legal as column names:

mysql> create table my_test_tbl6 select PI();
ERROR 1166: Incorrect column name 'PI()'

From 3.23.6 on, column naming rules are relaxed, so the statement will succeed but create a table with an unusual column name:

mysql> create table my_test_tbl6 select PI();
mysql> SELECT * FROM mytbl;

+-----------+
| PI()      |
+-----------+
| 3.141593 |
+-----------+

That's unfortunate, because the column name can be referred to directly only by enclosing it within backticks:

mysql> SELECT `PI()` FROM my_test_tbl6;

+----------+
| PI()     |
+----------+
| 3.141593|
+----------+

To provide a column name that is easier to work with when selecting an expression, use an alias:

mysql> CREATE TABLE my_test_tbl7 SELECT PI() AS mycol;
mysql> SELECT mycol FROM my_test_tbl7;

+----------+
| mycol    |
+----------+
| 3.141593|
+----------+

A related snag occurs if you select columns from different tables that have the same name. Suppose tables t1 and t2 both have a column c and you want to create a table from all combinations of rows in both tables. The following statement will fail because it attempts to create a table with two columns named c:

mysql> CREATE TABLE t3 SELECT * FROM t1, t2;
ERROR 1060: Duplicate column name 'c'

You can provide aliases to specify unique column names in the new table:

mysql> CREATE TABLE t3 SELECT t1.c AS c1, t2.c AS c2 FROM t1, t2;

Another thing to watch out for is that characteristics of the original table that are not reflected in the selected data will not be incorporated into the structure of the new table.

For example, creating a table by selecting data into it does not automatically copy any indexes from the original table, because result sets are not themselves indexed. Similarly, column attributes such as AUTO_INCREMENT or the default value may not be carried into the new table.

. (Newer versions do better than older ones.) In some cases, you can force specific attributes to be used in the new table by invoking the CAST() function, which is available as of MySQL 4.0.2.

The following CREATE TABLE ... SELECT statement forces the columns produced by the SELECT to be treated as INT UNSIGNED, DATE, and CHAR BINARY, which you can verify with DESCRIBE:

mysql> CREATE TABLE mytbl SELECT
-> CAST(1 AS UNSIGNED) AS i,
-> CAST(CURDATE() AS DATE) AS d,
-> CAST('Hello, world' AS BINARY) AS c;
mysql> DESCRIBE mytbl; ;

+-------+-----------------+------+-----+------------+-------+
| Field | Type            | Null | Key | Default    | Extra |
+-------+-----------------+------+-----+------------+-------+
| i     | int(1) unsigned |      |     | 0          |       |
| d     | date            |      |     | 0000-00-00 |       |
| c     | char(12) binary |      |     |            |       |
+-------+-----------------+------+-----+------------+-------+

You can apply CAST() to column values retrieved from other tables as well. The allowable cast types are BINARY (binary string), DATE, DATETIME, TIME, SIGNED, SIGNED INTEGER, UNSIGNED, and UNSIGNED INTEGER.

As of MySQL 4.1, it's possible to provide even more information about the types that you want the columns in the new table to have by giving explicit definitions for them. Columns in the table are matched with the selected columns by name, so provide aliases for the selected columns if necessary to cause them to match up properly:

mysql> CREATE TABLE mytbl (i INT UNSIGNED, d DATE, c CHAR(20) BINARY)
-> SELECT -> 1 AS i,
-> CURDATE() AS d,
-> 'Hello, world' AS c;
mysql> DESCRIBE mytbl;

+-------+------------------+------+-----+---------+-------+
| Field | Type             | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| i     | int(10) unsigned | YES  |     | NULL    |       |
| d     | date             | YES  |     | NULL    |       |
| c     | char(20) binary  | YES  |     | NULL    |       |
+-------+------------------+------+-----+---------+-------+

Note that this allows you to create character columns that have a different width than that of the longest value in the result set. Also note that the Null and Default attributes of the columns are different for this example than for the previous one. You could provide explicit declarations for those attributes as well if necessary.

Prior to MySQL 3.23, CREATE TABLE ... SELECT is unavailable. If you want to save the results of a SELECT in a table for use in further queries, you must make special arrangements in advance:

1. Run a DESCRIBE or SHOW COLUMNS query to determine the types of the columns in the tables from which you want to capture information.

2. Issue an explicit CREATE TABLE statement to create the table into which you want to save the SELECT results. The statement should specify the names and types of the columns that the SELECT will retrieve.

3. After creating the table, issue an INSERT INTO ... SELECT query to retrieve the results and insert them into the table.

Clearly, compared to CREATE TABLE ... SELECT, this involves a lot of ugly messing around.

Merge Tables

MERGE tables are essentially a virtual union of several existing MyISAM tables all of which exhibit identical column definitions.

A MERGE table composed of several tables can have some advantages over a single, large, MyISAM table, such as a higher read speed (if the tables are distributed over several hard drives) or a circumvention of the maximum file size in a number of older operating systems (for example, 2 gigabytes for Linux; 2.2 for 32-bit processors).

Among the disadvantages are that it is impossible to insert data records into MERGE tables (that is, INSERT does not function). Instead, INSERT must be applied to one of the subtables.

In the meanwhile, since most modern operating systems support files of arbitrary size, as well as RAID (that is, the division of a file system on several hard disks), MERGE tables play a subordinate role in practice.

MERGE table type, available in MySQL 3.23.25 and up, provides a way to perform queries on a set of tables simultaneously by treating them all as a single logical unit.

MERGE can be applied to a collection of MyISAM tables that all have identical structure. Suppose you have a set of individual log tables that contain log entries on a year-by-year basis and that each are defined like this, where CCYY represents the century and year:

CREATE TABLE log_CCYY
(
dt DATETIME NOT NULL,
info VARCHAR(100) NOT NULL,
INDEX (dt)
) TYPE = MYISAM;

If the current set of log tables includes log_1999 and log_2000 you can set up a MERGE table that maps onto them like this:

CREATE TABLE log_all
(
dt DATETIME NOT NULL,
info VARCHAR(100) NOT NULL,
INDEX (dt)
) TYPE = MERGE UNION = (log_1999, log_2000);

The TYPE option must be MERGE, and the UNION option lists the tables to be included in the MERGE table. After the table has been set up, you query it just like any other table, but the queries will refer to all the constituent tables at once.

The following query determines the total number of rows in all the log tables:

SELECT COUNT(*) FROM log_all;

This query determines how many log entries there are per year:

SELECT YEAR(dt) AS y, COUNT(*) AS entries FROM log_all GROUP BY y;

Besides the convenience of being able to refer to multiple tables without issuing multiple queries, MERGE tables offer some other nice features:

. A MERGE table can be used to create a logical entity that exceeds the allowable size of individual MyISAM tables.

. You can include compressed tables in the collection. For example, after a given year comes to an end, you wouldn't be adding any more entries to the corresponding log file, so you could compress it with myisampack to save space. The MERGE table will continue to function as before.

. Operations on MERGE tables are similar to UNION operations. UNION is unavailable prior to MySQL 4, but MERGE tables can be used in some cases as a workaround.

MERGE tables also support DELETE and UPDATE operations. INSERT is trickier, because MySQL needs to know which table to insert new records into. As of MySQL 4.0.0, MERGE table definitions can include an INSERT_METHOD option with a value of NO, FIRST, or LAST to indicate that INSERT is forbidden or that records should be inserted into the first or last table named in the UNION option.

For example, the following definition would cause an INSERT into log_all to be treated as an INSERT into log_2003, the last table named in the UNION option:

CREATE TABLE log_all
(
dt DATETIME NOT NULL,
info VARCHAR(100) NOT NULL,
INDEX (dt)
) TYPE = MERGE UNION = (log_1999, log_2000, log_2001, log_2002, log_2003) INSERT_METHOD = LAST;

Dropping Tables

Dropping a table is much easier than creating it because you don't have to specify anything about its contents. You just have to name it:

DROP TABLE tbl_name;

MySQL extends the DROP TABLE statement in some useful ways. First, you can drop several tables by specifying them all on the same statement:

DROP TABLE tbl_name1, tbl_name2, ... ;

Second, if you're not sure whether or not a table exists, but you want to drop it if it does, you can add IF EXISTS to the statement. This causes MySQL not to complain or issue an error if the table or tables named in the statement don't exist:

DROP TABLE IF EXISTS tbl_name;

IF EXISTS is particularly useful in scripts that you use with the mysql client. By default, mysql exits when an error occurs, and it is an error to try to remove a table that doesn't exist. For example, you might have a setup script that creates tables that you use as the basis for further processing in other scripts.

In this situation, you want to make sure the setup script has a clean slate when it begins. If you use a regular DROP TABLE at the beginning of the script, it would fail the first time because the tables have never been created. If you use IF EXISTS, there is no problem. If the tables are there, they are dropped; if not, the script continues anyway.

Indexing Tables

Indexes are the primary means of speeding up access to the contents of your tables, particularly for queries that involve joins on multiple tables.

Table Type Indexing Characteristics

MySQL provides quite a bit of flexibility in the way you can construct indexes:

. You can index single columns or construct composite indexes from combinations of columns.

. An index can be allowed to contain duplicate values or required to contain only unique values.

. You can have more than one index on a table if you want to be able to look up a values quickly from different columns of a table.

. For string column types other than ENUM or SET, you may elect to index a prefix of a column, that is, only the leftmost n bytes. (In fact, for BLOB and TEXT columns, you cannot set up an index unless you do specify a prefix length.)

Prefixes can be up to 255 bytes. If the column is mostly unique within the first n bytes, you usually won't sacrifice performance, and may well improve it. Indexing a column prefix rather than the entire column can make an index much smaller and faster to access.

Not all table types offer all indexing features. The following table summarizes the indexing properties of the various table types.

(The table does not include the MERGE type because MERGE tables are created from MyISAM tables and have similar indexing characteristics.)

Index Characteristic	ISAM	MyISAM	HEAP	BDB	InnoDB
NULL values allowed	No	Yes	As of 4.0.2	Yes	Yes
Columns per index	16	16	16	16	16
Indexes per table	16	32	32	31	32
Maximum index row size (bytes)	256	500	500	500/1024	500/1024
Index column prefixes allowed	Yes	Yes	Yes	Yes	No
BLOB/TEXT indexes allowed	No	Yes (255 bytes max)	No	Yes (255 bytes max)	No

Two numbers are shown for the BDB and InnoDB index row sizes. For these table types, the size is 500 bytes up through 4.0.3 and 1024 bytes thereafter.

The table illustrates some of the reasons why MyISAM storage format generally is to be preferred to the ISAM format that it succeeds. MyISAM relaxes several of the indexing constraints that apply to ISAM tables.

For example, with MyISAM tables, you can index columns that contain NULL values, you can index BLOB and TEXT columns, and you can have a larger number of indexes per table.

One implication of the differences in indexing characteristics for the various table types is that, depending on your version of MySQL, you may simply not be able to index certain columns. For example, you can use only ISAM tables if your MySQL is older than 3.23, which means you can't index a column if you want it to be able to contain NULL values.

Conversely, if you require an index to have certain properties, you may not be able to use certain types of tables. If you need to index a BLOB column, for example, you must use a MyISAM or BDB table.

If you have an existing table of one type but would like to convert it to another type that has more suitable indexing characteristics, use ALTER TABLE to change the type. Suppose you have MySQL 3.23 or later but have older tables that were originally created as ISAM tables.

You can easily convert them to MyISAM storage format using ALTER TABLE, which allows you to take advantage of MyISAM's superior indexing features:

ALTER TABLE tbl_name TYPE = MYISAM;

Creating & Dropping Index

MySQL can create several types of index:

. A regular (non-unique) index. This gives you indexing benefits but allows duplicates.

. A unique index. This disallows duplicate values. For a single-column index, this ensures that the column contains no duplicate values. For a multiple-column (composite) index, it ensures that no combination of values in the columns is duplicated among the rows of the table.

. A FULLTEXT index, used when you want to perform full text searches. This index type is supported only for MyISAM tables.

You can create indexes for a new table when you use CREATE TABLE, or add indexes to existing tables with CREATE INDEX or ALTER TABLE.

CREATE INDEX was introduced in MySQL 3.22, but you can use ALTER TABLE if your version of MySQL is older than that. (MySQL maps CREATE INDEX statements onto ALTER TABLE operations internally.)

ALTER TABLE is the more versatile than CREATE INDEX. You can use it to create a regular index, a UNIQUE index, a PRIMARY KEY, or a FULLTEXT index :

ALTER TABLE tbl_name ADD INDEX index_name (index_columns);
ALTER TABLE tbl_name ADD UNIQUE index_name (index_columns);
ALTER TABLE tbl_name ADD PRIMARY KEY (index_columns);
ALTER TABLE tbl_name ADD FULLTEXT (index_columns);

tbl_name is the name of the table to add the index to, and index_columns indicates which column or columns should be indexed. If the index consists of more than one column, separate the names by commas.

The index name index_name is optional, so you can leave it out and MySQL will pick a name based on the name of the first indexed column.

ALTER TABLE allows you to specify multiple table alterations in a single statement, so you can create several indexes at the same time. (This is faster than adding them one at a time with individual statements.)

To require that an index contain only unique values, create the index as a PRIMARY KEY or a UNIQUE index. The two types of index are very similar. In fact, a PRIMARY KEY is just a UNIQUE index that has the name PRIMARY. Two differences between the types of index are:

. A table can contain only one PRIMARY KEY because you can't have two indexes with the name PRIMARY. You can place multiple UNIQUE indexes on a table, although it's somewhat unusual to do so.

. A PRIMARY KEY cannot contain NULL values, whereas a UNIQUE index can. If a UNIQUE can contain NULL values, it usually can contain multiple NULL values. The reason for this is that it is not possible to know whether one NULL represents the same value as another, so they cannot be considered equal. (BDB tables are an exception-a BDB table allows only one NULL value within a UNIQUE index.)

CREATE INDEX can add a regular, UNIQUE, or FULLTEXT index to a table, but not a PRIMARY KEY:

CREATE INDEX index_name ON tbl_name (index_columns);
CREATE UNIQUE INDEX index_name ON tbl_name (index_columns);
CREATE FULLTEXT INDEX index_name ON tbl_name (index_columns);

tbl_name, index_name, and index_columns have the same meaning as for ALTER TABLE. Unlike ALTER TABLE, the index name is not optional with CREATE INDEX, and you cannot create multiple indexes with a single statement.

To create indexes for a new table when you issue a CREATE TABLE statement, the syntax is similar to that used for ALTER TABLE, but you specify the index-creation clauses as part of the column specification list:

CREATE TABLE tbl_name
(
... column declarations ... INDEX index_name (index_columns),
UNIQUE index_name (index_columns),
PRIMARY KEY (index_columns),
FULLTEXT index_name (index_columns),
...
);

As with ALTER TABLE, the index name is optional in CREATE TABLE statements for each INDEX, UNIQUE, and FULLTEXT clause; MySQL will pick an index name if you leave it out.

As a special case, you can create a single-column PRIMARY KEY by adding PRIMARY KEY to the end of a column declaration. As of MySQL 3.23, you can do the same for a UNIQUE index. For example, this statement:

CREATE TABLE my_test_tbl8
(
num_col INT NOT NULL PRIMARY KEY,
char_col CHAR(10) NOT NULL UNIQUE
);
is equivalent to the following one:
CREATE TABLE my_test_tbl9
(
num_col INT NOT NULL,
char_col CHAR(10) NOT NULL,
PRIMARY KEY (num_col),
UNIQUE (char_col)
);

Each of the preceding table-creation examples have specified NOT NULL for the indexed columns. For ISAM tables (and for HEAP tables prior to MySQL 4.0.2), that's a requirement because you cannot index columns that may contain NULL values.

For other table types, indexed columns can be NULL as long as the index is not a PRIMARY KEY.

To index a prefix of a string column (the leftmost n bytes of column values), the syntax for naming the column in the index definition is col_name(n) rather than simply col_name. For example, the following statement creates a table with two CHAR columns but uses only the first 10 bytes from each in the index created from those columns.

CREATE TABLE my_test_tbl10
(
name CHAR(30) NOT NULL,
address CHAR(60) NOT NULL,
INDEX (name(10),address(10))
);

Index prefixes are supported for ISAM, MyISAM, HEAP, and BDB tables, but not for InnoDB tables.

Prefix lengths, just like column lengths, refer to bytes rather than characters. The two are the same for single-byte character sets, but not for multi-byte character sets. MySQL will store into an index value as many complete characters as will fit.

For example, if an index prefix is 5 bytes long and a column value consists of 2-byte characters, the index value will contain 2 characters, not 2.5 characters. In some circumstances, you may find it not only desirable but necessary to index a column prefix rather than the entire column:

. Prefixes are necessary for BLOB or TEXT columns in any table type that allows those column types to be indexed. The prefix may be up to 255 bytes long.

. The length of index rows is equal to the sum of the length of the index parts of the columns that make up the index. If this length exceeds the allowable length of index rows, you can make the index "narrower" by indexing a column prefix. Suppose a MyISAM table contains two CHAR(255) columns named c1 and c2, and you want to create an index based on both of them.

The length of an index row in this case would be 255+255, which exceeds the MyISAM limit of 500 bytes per index row. However, you can create the index by indexing a shorter part of one or both columns.

Indexing a prefix of a column constrains that changes that you can make to the column later. You cannot shorten the column to a length less than the prefix length without dropping the index and re-creating it using a shorter length for the indexed part of a column.

If you index the first 30 bytes of a 40-byte CHAR column but then discover that you never store more than 20 bytes in the column, you might decide to save space in the table by changing the column to be only 20 bytes wide. In this case, you must drop the index first before making the column narrower. Then you can add the index again, indexing 20 or fewer bytes.

Columns in FULLTEXT indexes do not have prefixes. If you specify a prefix length for a FULLTEXT index, it will be ignored.

Dropping Indexes

To drop an index, use either a DROP INDEX or an ALTER TABLE statement. Like the CREATE INDEX statement, DROP INDEX was introduced in MySQL 3.22, is handled internally as an ALTER TABLE statement and cannot be used to affect a PRIMARY KEY. The syntax for index-dropping statements looks like this:

DROP INDEX index_name ON tbl_name;
ALTER TABLE tbl_name DROP INDEX index_name;
ALTER TABLE tbl_name DROP PRIMARY KEY;

The first two statements are equivalent. The third is used only for dropping a PRIMARY INDEX; it is unambiguous because a table can have only one such key. If no index was created explicitly as a PRIMARY KEY but the table has one or more UNIQUE indexes, MySQL drops the first of them.

Indexes can be affected if you drop columns from a table. If you drop a column that is a part of an index, the column is removed from the index as well. If all columns that make up an index are dropped, the entire index is dropped.

Altering Table Structure

ALTER TABLE is a versatile statement in MySQL, and you can use it to do many things. We've already seen some of its capabilities. You can also use ALTER TABLE to rename tables, add or drop columns, change column types, and more. In this section, we'll cover some of the other features it offers.

ALTER TABLE is useful when you find that the structure of a table no longer reflects what you want to do with it. You may want to use the table to record additional information, or perhaps it contains information that has become superfluous.

Maybe existing columns are too small, or perhaps you've declared them larger than it turns out you need and you'd like to make them smaller to save space and improve query performance. Or maybe you just typed in the table's name incorrectly when you issued the CREATE TABLE statement. The following are some examples:

. You're running a research project. You assign case numbers to research records using an AUTO_INCREMENT column. You didn't expect your funding to last long enough to generate more than about 50,000 records, so you made the column type SMALLINT UNSIGNED, which holds a maximum of 65,535 unique values.

However, the funding for the project was renewed, and it looks like you may generate another 50,000 records. You need to make the type bigger to accommodate more case numbers.

. Size changes can go the other way, too. Maybe you created a CHAR(255) column but now recognize that no value in the table is more than 100 characters long. You can shorten the column to save space.

. You want to convert a table to another type to take advantage of features offered by that type. For example, an ISAM table won't allow NULL values in indexed columns. If you really need to index a column that contains NULL, you can convert it to be a MyISAM table.

The syntax for ALTER TABLE is as follows:

ALTER TABLE tbl_name action, ... ;

Each action specifies a modification you want to make to the table. Some database engines allow only a single action in an ALTER TABLE statement, but MySQL allows multiple actions; just separate the actions by commas.

This extension to ALTER TABLE is useful because some types of table modifications cannot be performed with single-action statements.

For example, it's impossible to change all the VARCHAR columns to CHAR columns by changing them one at a time. You must change them all at once.

The following examples show some of the capabilities of ALTER TABLE:

. Renaming a table. Use a RENAME clause that specifies the new table name:

ALTER TABLE tbl_name RENAME TO new_tbl_name;

Another way to rename tables is with RENAME TABLE, available as of MySQL 3.23.23. The syntax looks like this:

RENAME TABLE old_name TO new_name;

One thing that RENAME TABLE can do that ALTER TABLE cannot is rename multiple tables in the same statement. For example, you can swap the names of two tables like this:

RENAME TABLE t1 TO tmp, t2 TO t1, tmp TO t1;

If you qualify a table name with a database name, you can move a table from one database to another by renaming it. Either of the following statements move the table t from the sampdb database to the test database:

ALTER TABLE sampdb.t RENAME TO test.t;
RENAME TABLE sampdb.t TO test.t;

You cannot rename a table to use a name that already exists, however.

. Changing a column type. To change a column type, you can use either a CHANGE or MODIFY clause. Suppose the column in a table mytbl is SMALLINT UNSIGNED and you want to change it to MEDIUMINT UNSIGNED. Do so using either of the following commands:

. ALTER TABLE mytbl MODIFY i MEDIUMINT UNSIGNED;
. ALTER TABLE mytbl CHANGE i i MEDIUMINT UNSIGNED;

Why is the column named twice in the command that uses CHANGE? Because one thing that CHANGE can do that MODIFY cannot is to rename the column in addition to changing the type.

If you had wanted to rename i to j at the same time you changed the type, you'd do so like this:

ALTER TABLE mytbl CHANGE i j MEDIUMINT UNSIGNED;

The important thing with CHANGE is that you name the column you want to change and then specify a complete column declaration, which includes the column name. You must include the name in the declaration, even if it's the same as the old name.

As of MySQL 4.1, you can assign character sets to individual columns, so it's possible to use the CHARACTER SET attribute in a column's definition to change its character set:

ALTER TABLE t MODIFY c CHAR(20) CHARACTER SET ucs2;

An important reason for changing column types is to improve query efficiency for joins that compare columns from two tables. A comparison is quicker when the columns are both the same type. Suppose you're running a query like the following:

SELECT ... FROM t1, t2 WHERE t1.name = t2.name;

If t1.name is CHAR(10) and t2.name is CHAR(15), the query won't run as quickly as if they were both CHAR(15). You can make them the same by changing t1.name using either of the following commands:

ALTER TABLE t1 MODIFY name CHAR(15);
ALTER TABLE t1 CHANGE name name CHAR(15);

Prior to MySQL 3.23, it's essential that joined columns be of the same type, or indexes will not be used for the comparison and the join will run more slowly.

But even from 3.23 and later, when indexes can be used in joins between dissimilar column types, a query will still be faster if the types are identical.

. Converting a table from variable-length rows to fixed-length rows. Suppose you have a table chartbl that was created like this:

CREATE TABLE chartbl (name VARCHAR(40), address VARCHAR(80));

The table contains VARCHAR columns and you want to convert them to CHAR columns to see what kind of performance improvements you get. (If the table uses ISAM or MyISAM storage format, fixed-length rows generally can be processed more quickly than variable-length rows.)

The problem here is that you need to change the columns all at once in the same ALTER TABLE statement. You can't do them one at a time or the attempt will be ineffective.

(Try changing just one of them and then run DESCRIBE chartbl; you'll find that the columns are still defined as VARCHAR!) The reason for this is that if you change a single column at a time, MySQL notices that the table still contains variable-length columns and reconverts the changed column back to VARCHAR to save space. To deal with this, change all the VARCHAR columns at the same time:

ALTER TABLE chartbl MODIFY name CHAR(40), MODIFY address CHAR(80);

Now DESCRIBE will show that the table contains CHAR columns. It's exactly this type of operation that makes it important that ALTER TABLE support multiple actions in the same statement.

There is a caveat to be aware of when you want to convert a table as just shown: BLOB and TEXT types are variable-length types with no fixed-length equivalent.

The presence of any BLOB or TEXT columns in a table will defeat any attempt to convert it to fixed-length row format because even one variable-length column in a table causes it to have variable-length rows.

. Converting a table from fixed-length rows to variable-length rows. Suppose you discover that chartbl is indeed faster with fixed-length rows. On the other hand, it takes more storage than you'd like, so you decide to convert it back to its original form to save space.

Converting a table in this direction is much easier. You only need to change one CHAR column to VARCHAR and MySQL will convert the other CHAR columns automatically. To convert the chartbl table, either of the following statements will do:

. ALTER TABLE chartbl MODIFY name VARCHAR(40);
. ALTER TABLE chartbl MODIFY address VARCHAR(80);
. Converting a table type. To convert a table from one storage format to another, use a TYPE clause to change the table's type:
. ALTER TABLE tbl_name TYPE = tbl_type;

tbl_type is a type specifier such as ISAM, MYISAM, HEAP, BDB, or INNODB (lettercase does not matter).

Changing table types can be useful when you upgrade your MySQL installation to a newer version that provides additional table-handling features. For example, if you upgrade from a pre-3.23 version of MySQL to 3.23 or later, your older tables will be in ISAM format. To change them to MyISAM format, use the following statement for each one:

ALTER TABLE tbl_name TYPE = MYISAM;

Doing this allows you to take advantages of the capabilities that MyISAM offers than ISAM does not. For example, MyISAM tables are machine independent, so you can move them to other machines by copying table files directly, even if the machines have different hardware architectures.

Another reason to change a table type is to make it transaction-safe. Suppose you have a MyISAM table and discover that an application that uses it needs to perform transactional operations, including rollback in case failures occur.

MyISAM tables do not support transactions, but you can make the table transaction-safe by converting it to a BDB or InnoDB table:

ALTER TABLE tbl_name TYPE = BDB;
ALTER TABLE tbl_name TYPE = INNODB;

ALTER TABLE is useful in many ways, but there are circumstances under which you should not use it. The following are two examples:

. HEAP tables are held in memory and disappear when the server exits. It is not a good idea to convert a table to type HEAP if you require the table to last beyond server shutdown.

. If you use a MERGE table to group a collection of MyISAM tables together, you should avoid using ALTER TABLE to modify any of the MyISAM tables unless you make the same change to all of them, and to the MERGE table as well.

The proper functioning of a MERGE table depends on its having the same structure as all of its constituent MyISAM tables.

Foreign Keys & Referential Integrity

A foreign key relationship allows you to declare that an index in one table is related to an index in another and allows you to place constraints on what may be done to the table containing the foreign key. The database enforces the rules of this relationship to maintain referential integrity.

For example, the commission table in the sampdb sample database contains an associate_id column, which we use to relate commission records to associates in the associate table. When we created these tables in earlier , we did not set up any explicit relationship between them.

Were we to do so, we would declare commission.associate_id to be a foreign key for the associate.associate_id column. That prevents a record from being entered into the commission table unless it contains an associate_id value that exists in the associate table.

(In other words, the foreign key prevents entry of commissions for non-existent associates.) We could also set up a constraint such that if an associate is deleted from the associate table, all corresponding records for the associate in the commission table should be deleted automatically as well.

This is called cascaded delete because the effect of the delete cascades from one table to another.

Foreign keys help maintain the consistency of your data, and they provide a certain measure of convenience. Without foreign keys, you are responsible for keeping track of inter-table dependencies and maintaining their consistency from within your applications. In many cases, doing this isn't really that much work.

It amounts to little more than adding a few extra DELETE statements to make sure that when you delete a record from one table, you also delete the corresponding records in any related tables.

But if your tables have particularly complex relationships, you may not want to be responsible for implementing these dependencies in your applications. Besides, if the database engine will perform consistency checks for you, why not let it?

Foreign key support in MySQL is provided by the InnoDB table handler. This section describes how to set up InnoDB tables to define foreign keys, and how foreign keys affect the way you use tables. But first, it's necessary to define some terms:

. The parent is the table that contains the original key values.

. The child is the related table that refers to key values in the parent.

. Parent table key values are used to associate the two tables. Specifically, the index in the child table refers to the index in the parent. Its values must match those in the parent or else be set to NULL to indicate that there is no associated parent table record.

The index in the child table is known as the foreign key—

that is, the key that is foreign (external) to the parent table but contains values that point to the parent. A foreign key relationship can be set up to disallow NULL values, in which case all foreign key values must match a value in the parent table.

InnoDB enforces these rules to guarantee that the foreign key relationship stays intact with no mismatches. This is called referential integrity.

« Previous Next Chapter »

SHARE THIS PAGE

MySQL - MySQL Basic, Expression & Operators, Working with Tables

0 Comments:

Post a Comment