CHAR/VARCHAR and BINARY/VARBINARY in MySQL

This article was last updated on May 1, 2015, more than a year ago. If any of its content is out of date, please let me know. Thanks!

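From the official documentation:

==VARBINARY values are compared byte by byte, so comparisons are case-sensitive; an index on a VARBINARY column is generally faster than one on a VARCHAR column, because no collation rules are applied; and the stored data is never modified (no padding, no stripping).==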

The BINARY and VARBINARY types are similar to CHAR and VARCHAR, except that they contain binary strings rather than nonbinary strings. That is, they contain byte strings rather than character strings. This means that they have no character set, and sorting and comparison are based on the numeric values of the bytes in the values.

The permissible maximum length is the same for BINARY and VARBINARY as it is for CHAR and VARCHAR, except that the length for BINARY and VARBINARY is a length in bytes rather than in characters.

The BINARY and VARBINARY data types are distinct from the CHAR BINARY and VARCHAR BINARY data types. For the latter types, the BINARY attribute does not cause the column to be treated as a binary string column. Instead, it causes the binary collation for the column character set to be used, and the column itself contains nonbinary character strings rather than binary byte strings. For example, CHAR(5) BINARY is treated as CHAR(5) CHARACTER SET latin1 COLLATE latin1_bin, assuming that the default character set is latin1. This differs from BINARY(5), which stores 5-byte binary strings that have no character set or collation. For information about differences between nonbinary string binary collations and binary strings, see Section 10.1.7.6, "The _bin and binary Collations".
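The distinction can be checked directly. This is a minimal sketch, not taken from the manual: the table name bin_attr is made up for illustration, and latin1 is spelled out explicitly so that the result does not depend on the server's default character set.

```sql
-- c1 is a nonbinary CHAR column that merely uses the _bin collation;
-- c2 is a true binary (byte string) column.
CREATE TABLE bin_attr (
  c1 CHAR(5) CHARACTER SET latin1 BINARY,
  c2 BINARY(5)
);
INSERT INTO bin_attr VALUES ('a', 'a');

SELECT CHARSET(c1), COLLATION(c1), CHARSET(c2), COLLATION(c2)
FROM bin_attr;
-- c1: latin1 / latin1_bin
-- c2: binary / binary
```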

If strict SQL mode is not enabled and you assign a value to a BINARY or VARBINARY column that exceeds the column's maximum length, the value is truncated to fit and a warning is generated. For cases of truncation, you can cause an error to occur (rather than a warning) and suppress insertion of the value by using strict SQL mode. See Section 5.1.7, "Server SQL Modes".
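A rough sketch of the two behaviours follows. The table name trunc_demo is hypothetical, and STRICT_ALL_TABLES is used here only as one way of turning strict mode on for the session.

```sql
CREATE TABLE trunc_demo (b VARBINARY(3));

-- Non-strict mode: the value is cut to 3 bytes and a warning is raised.
SET SESSION sql_mode = '';
INSERT INTO trunc_demo VALUES ('abcdef');
SHOW WARNINGS;
SELECT HEX(b) FROM trunc_demo;   -- 616263, i.e. 'abc'

-- Strict mode: the same statement is rejected and no row is inserted.
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
INSERT INTO trunc_demo VALUES ('abcdef');
```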

When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 on insert, and no trailing bytes are removed on select. All bytes are significant in comparisons, including ORDER BY and DISTINCT operations. 0x00 bytes and spaces are different in comparisons, with 0x00 < space.

Example: For a BINARY(3) column, 'a ' becomes 'a \0' when inserted. 'a\0' becomes 'a\0\0' when inserted. Both inserted values remain unchanged when selected.

For VARBINARY, there is no padding on insert and no bytes are stripped on select. All bytes are significant in comparisons, including ORDER BY and DISTINCT operations. 0x00 bytes and spaces are different in comparisons, with 0x00 < space. (In other words, unlike BINARY, VARBINARY stores the value exactly as it was given.)
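As a small illustration of the VARBINARY rules, assuming a throwaway table called varbin_demo (the name is made up), the three values below stay distinct and sort by byte value:

```sql
CREATE TABLE varbin_demo (v VARBINARY(3));
INSERT INTO varbin_demo VALUES ('a'), ('a\0'), ('a ');

SELECT HEX(v), LENGTH(v), v = 'a'
FROM varbin_demo
ORDER BY v;
-- 61    1  1   (stored as given, no padding)
-- 6100  2  0   (the trailing 0x00 is significant)
-- 6120  2  0   (the trailing space is significant; 0x00 sorts before space)
```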

For those cases where trailing pad bytes are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad bytes will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a\0' causes a duplicate-key error.

You should consider the preceding padding and stripping characteristics carefully if you plan to use the BINARY data type for storing binary data and you require that the value retrieved be exactly the same as the value stored. The following example illustrates how 0x00-padding of BINARY values affects column value comparisons:
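The padding effect can be reproduced with a sketch along the lines of the one in the MySQL manual. The table name bin_pad is illustrative.

```sql
CREATE TABLE bin_pad (c BINARY(3));
INSERT INTO bin_pad SET c = 'a';

SELECT HEX(c), c = 'a', c = 'a\0\0' FROM bin_pad;
-- HEX(c)      -> 610000  (padded with two 0x00 bytes on insert)
-- c = 'a'     -> 0       (the pad bytes take part in the comparison)
-- c = 'a\0\0' -> 1
```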

If the value retrieved must be the same as the value specified for storage with no padding, it might be preferable to use VARBINARY or one of the BLOB data types instead.

==

The CHAR and VARCHAR types are similar, but differ in the way they are stored and retrieved. They also differ in maximum length and in whether trailing spaces are retained.

The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters.

The length of a CHAR column is fixed to the length that you declare when you create the table. The length can be any value from 0 to 255. When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.
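The effect of PAD_CHAR_TO_FULL_LENGTH on retrieval can be seen with a short sketch. The table name char_pad is made up, and note that newer MySQL releases mark this SQL mode as deprecated.

```sql
CREATE TABLE char_pad (c CHAR(10));
INSERT INTO char_pad VALUES ('xy');

-- Default behaviour: trailing pad spaces are removed on retrieval.
SET SESSION sql_mode = '';
SELECT c, CHAR_LENGTH(c) FROM char_pad;   -- 'xy', 2

-- With the mode enabled, the full padded value is returned.
SET SESSION sql_mode = 'PAD_CHAR_TO_FULL_LENGTH';
SELECT c, CHAR_LENGTH(c) FROM char_pad;   -- 'xy        ', 10
```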

Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used. See Section D.10.4, “Limits on Table Column Count and Row Size”.

In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

If strict SQL mode is not enabled and you assign a value to a CHAR or VARCHAR column that exceeds the column's maximum length, the value is truncated to fit and a warning is generated. For truncation of nonspace characters, you can cause an error to occur (rather than a warning) and suppress insertion of the value by using strict SQL mode. See Section 5.1.7, "Server SQL Modes".

For VARCHAR columns, trailing spaces in excess of the column length are truncated prior to insertion and a warning is generated, regardless of the SQL mode in use. For CHAR columns, truncation of excess trailing spaces from inserted values is performed silently regardless of the SQL mode.

VARCHAR values are not padded when they are stored. Trailing spaces are retained when values are stored and retrieved, in conformance with standard SQL.

The following table illustrates the differences between CHAR and VARCHAR by showing the result of storing various string values into CHAR(4) and VARCHAR(4) columns (assuming that the column uses a single-byte character set such as latin1).

Value        CHAR(4)   Storage Required   VARCHAR(4)   Storage Required
''           '    '    4 bytes            ''           1 byte
'ab'         'ab  '    4 bytes            'ab'         3 bytes
'abcd'       'abcd'    4 bytes            'abcd'       5 bytes
'abcdefgh'   'abcd'    4 bytes            'abcd'       5 bytes

The values shown as stored in the last row of the table apply only when not using strict mode; if MySQL is running in strict mode, values that exceed the column length are not stored, and an error results.

If a given value is stored into the CHAR(4) and VARCHAR(4) columns, the values retrieved from the columns are not always the same because trailing spaces are removed from CHAR columns upon retrieval. The following example illustrates this difference:
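A sketch in the spirit of the manual's example, using an illustrative table named vc, wraps each retrieved value in parentheses to make the trailing spaces visible:

```sql
CREATE TABLE vc (v VARCHAR(4), c CHAR(4));
INSERT INTO vc VALUES ('ab  ', 'ab  ');

SELECT CONCAT('(', v, ')'), CONCAT('(', c, ')') FROM vc;
-- VARCHAR column: (ab  )   trailing spaces kept
-- CHAR column:    (ab)     trailing spaces removed on retrieval
```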

Values in CHAR and VARCHAR columns are sorted and compared according to the character set collation assigned to the column.

All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces. Comparison in this context does not include the LIKE pattern-matching operator, for which trailing spaces are significant. For example:
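The following sketch follows the manual's example. The table name names is illustrative, and latin1 is specified explicitly as an assumption, so that the column is guaranteed to use a pad-space collation whatever the server default is.

```sql
CREATE TABLE names (myname CHAR(10) CHARACTER SET latin1);
INSERT INTO names VALUES ('Monty');

-- The = operator ignores trailing spaces under a pad-space collation.
SELECT myname = 'Monty', myname = 'Monty  ' FROM names;
-- 1, 1

-- LIKE treats trailing spaces as significant.
SELECT myname LIKE 'Monty', myname LIKE 'Monty  ' FROM names;
-- 1, 0
```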

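This is true for all MySQL versions covered by the documentation quoted here, and is not affected by the server SQL mode. (MySQL 8.0 later introduced NO PAD collations, such as utf8mb4_0900_ai_ci, for which trailing spaces are significant in comparisons.)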

Note

For more information about MySQL character sets and collations, see Section 10.1, “Character Set Support”. For additional information about storage requirements, see Section 11.7, “Data Type Storage Requirements”.

For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.
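A minimal sketch of the duplicate-key case, assuming a hypothetical table uniq_demo whose column uses the pad-space collation latin1_swedish_ci:

```sql
CREATE TABLE uniq_demo (
  v VARCHAR(4) CHARACTER SET latin1 COLLATE latin1_swedish_ci,
  UNIQUE KEY (v)
);

INSERT INTO uniq_demo VALUES ('a');    -- succeeds
INSERT INTO uniq_demo VALUES ('a ');   -- fails with a duplicate-key error,
                                       -- because 'a ' compares equal to 'a'
```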

==

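To sum up: when a value is stored in a CHAR column, MySQL pads it with trailing spaces on storage, but strips trailing spaces (including any that were part of the original value) before operating on it (display, comparison, and so on). The advantage of VARCHAR is that it stores exactly what you give it, with no extra processing (although 'a ' and 'a' are still treated as equal in comparisons).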

====

Why do many people now use VARBINARY instead of VARCHAR?
Search keywords:
  • MySQL varbinary vs varchar
  • MySQL varbinary
Reference links:
Reference answers:

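The binary comparisons will be more efficient, since it won't involve all of the code that's there to deal with collations. (That is, binary comparison is cheaper, so an indexed VARBINARY column is faster than an indexed VARCHAR column.)

Binary strings have no character set or collation.

BINARY stores binary strings: it holds bytes rather than characters, and has no character set.

BINARY(8) stores 8 bytes (for example, 8 single-byte characters), occupying 8 bytes in total.

Comparison is done byte by byte rather than character by character, which is simpler and faster.

Character comparison is case-insensitive under the usual case-insensitive (_ci) collations, whereas BINARY comparison is case-sensitive. BINARY values are padded at the end with \0 rather than with spaces.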
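The difference in case sensitivity can be sketched as follows. The table name cmp_demo is made up, and the VARCHAR column is given the case-insensitive collation latin1_swedish_ci explicitly as an assumption.

```sql
CREATE TABLE cmp_demo (
  vc VARCHAR(8) CHARACTER SET latin1 COLLATE latin1_swedish_ci,
  vb VARBINARY(8)
);
INSERT INTO cmp_demo VALUES ('abc', 'abc');

SELECT vc = 'ABC', vb = 'ABC' FROM cmp_demo;
-- vc = 'ABC' -> 1  (compared by collation, case-insensitive)
-- vb = 'ABC' -> 0  (compared byte by byte: 0x61 differs from 0x41)
```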

Notice: Unless otherwise noted, articles on ixyzero.com are original. When reposting, please link back to this article. Thanks!
https://ixyzero.com/blog/archives/2118.html

4 comments on "CHAR/VARCHAR and BINARY/VARBINARY in MySQL"

  1. Differences between CHAR and VARCHAR in storage and retrieval
    https://paper.tuisec.win/detail/117fc6990f3cc61
    https://toutiao.io/posts/5v7b23/preview

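    Do you really know how CHAR and VARCHAR differ when storing and reading data?

    A few conclusions up front:
    1. On storage, CHAR always pads the value with spaces to the full length, whether or not the inserted data already had trailing spaces.
    2. On storage, VARCHAR does not pad the value, but trailing spaces that the user deliberately included are stored as given and are not removed.
    3. On retrieval, CHAR always strips trailing spaces, even ones that were part of the inserted value (unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled).
    4. On retrieval, VARCHAR returns exactly what was stored (trailing spaces present at storage time are kept, not stripped as with CHAR).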

  2. Mysql Error 1264: Out of range value for column int(11) at row 1
    https://stackoverflow.com/questions/14284494/mysql-error-1264-out-of-range-value-for-column

    One cause of this error is that the value being inserted exceeds the range of the column's type.

    The value 3172978990 is greater than 2147483647 – the maximum value for INT – hence the error. MySQL integer types and their ranges are listed here.

    Also note that the (10) in INT(10) does not define the "size" of an integer. It specifies the display width of the column. This information is advisory only.

    To fix the error, change your datatype to VARCHAR. Phone and Fax numbers should be stored as strings. See this discussion.

    # Option 1: change the column type from INT(10) to BIGINT
    ALTER TABLE T_PERSON MODIFY mobile_no BIGINT;

    # Option 2
    Use a VARCHAR type for the column instead.

    What is the size of column of int(11) in mysql in bytes?
    https://stackoverflow.com/questions/5634104/what-is-the-size-of-column-of-int11-in-mysql-in-bytes

    An INT will always be 4 bytes no matter what length is specified.

    TINYINT = 1 byte (8 bit)
    SMALLINT = 2 bytes (16 bit)
    MEDIUMINT = 3 bytes (24 bit)
    INT = 4 bytes (32 bit)
    BIGINT = 8 bytes (64 bit).

    The length just specifies how many characters to display when selecting data with the mysql command line client.

    ... and the maximum value will be 2147483647 (Signed) or 4294967295 (Unsigned)
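A rough sketch of the point about display width: the table name int_demo is hypothetical, strict mode is enabled here so that the out-of-range case is reported as an error rather than clipped, and newer MySQL releases deprecate the display-width syntax.

```sql
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
CREATE TABLE int_demo (a INT(3), b INT(11));

-- The number in parentheses does not limit the value: both columns
-- accept the full signed 32-bit range.
INSERT INTO int_demo VALUES (2147483647, 2147483647);

-- A value beyond that range is rejected with an out-of-range error (1264).
INSERT INTO int_demo VALUES (3172978990, 1);
```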
