Encoding

MySQL: #1071 - Specified key was too long; max key length is 767 bytes

Submitted by Druss on Fri, 2015-04-03 01:33

Here I was simply creating a MySQL (5.5) table when suddenly up pops the following error:

#1071 - Specified key was too long; max key length is 767 bytes

After a little trial and error, I found that one of my VARCHAR fields was being used for a UNIQUE index, and MySQL was telling me that the index key took up too many bytes. Reducing the length of this field from its initial 512 to 256 and then to 255 did not help; it still complained. However, reducing it further to 128 fixed the issue!
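The arithmetic behind the limit can be sketched roughly as follows. This assumes the index covers a single VARCHAR column and that the 767-byte cap from the error message applies to the whole key. The per-character figures are the maximum widths of MySQL's latin1, utf8 and utf8mb4 character sets, and the function name `max_indexable_chars` is made up for illustration.

```python
# Byte cap quoted in the #1071 error message.
MAX_KEY_BYTES = 767

# Maximum bytes one character can occupy in each MySQL character set.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def max_indexable_chars(charset, max_key_bytes=MAX_KEY_BYTES):
    """Longest VARCHAR length whose worst-case size fits in the key.

    Hypothetical helper: MySQL reserves space for the widest possible
    character, so the limit is an integer division.
    """
    return max_key_bytes // MAX_BYTES_PER_CHAR[charset]


for charset in MAX_BYTES_PER_CHAR:
    print(charset, max_indexable_chars(charset))
# latin1 767
# utf8 255
# utf8mb4 191
```

Under these assumptions a VARCHAR(255) would only just fit with utf8, so a failure at 255 hints at a 4-byte character set or at other columns sharing the index. That is a guess, since the table definition is not shown here.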

MySQL encoding error: Warning (Code 1366): Incorrect string value: '\xE9, a <...' for column 'body' at row 3

Submitted by Druss on Sun, 2013-06-02 13:10

While performing a CSV import recently, I ran into the following error messages:

Warning (Code 1366): Incorrect string value: '\xE9, a <...' for column 'body' at row 3
Warning (Code 1366): Incorrect string value: '\xE6. He ...' for column 'body' at row 24
Warning (Code 1366): Incorrect string value: '\xE9, and...' for column 'body' at row 26

The first message was triggered by the accented é in the word protegé in the input; the rest of that field was not imported. The other warnings were triggered in the same way.
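A likely explanation, though not one confirmed here, is that the CSV was saved as Latin-1 (or Windows-1252) while MySQL expected UTF-8. In Latin-1 the byte 0xE9 is é, but on its own it is not a valid UTF-8 sequence. The sample bytes below are made up to mirror the first warning.

```python
# Made-up sample resembling the text in the first warning.
raw = b"proteg\xe9, a"

# Interpreting the bytes as UTF-8 fails at the lone 0xE9 byte.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8:", valid_utf8)  # valid UTF-8: False

# Interpreting the same bytes as Latin-1 recovers the intended text.
text = raw.decode("latin-1")
print(text)  # protegé, a

# Re-encoded as UTF-8, é becomes the two bytes 0xC3 0xA9.
print(text.encode("utf-8"))  # b'proteg\xc3\xa9, a'
```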

Changing a file's encoding using Vim

Submitted by Druss on Sun, 2013-06-02 13:07

During imports and the like, it's imperative that every step uses the same encoding/character set. If a text file is not in the preferred encoding, we can use Vim to convert it when saving, as follows:

:set fileencoding=utf8
:w

Or, if you want to save it to a different file and leave the current file unchanged:

:w ++enc=utf-8 newfile.txt
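For batch jobs where opening each file in Vim is impractical, the same conversion can be sketched in Python. This is only an illustration: the function name `reencode` is hypothetical, and it assumes you already know the source file's encoding.

```python
def reencode(src, dst, src_encoding, dst_encoding="utf-8"):
    """Copy src to dst, converting the text from one encoding to another.

    newline="" turns off newline translation, so CRLF/LF line endings
    pass through unchanged.
    """
    with open(src, "r", encoding=src_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as outfile:
        outfile.write(text)
```

For example, `reencode("input.csv", "input-utf8.csv", "latin-1")` would write a UTF-8 copy and leave the original untouched (both file names are placeholders).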

Setting up Unicode support for PuTTY

Submitted by Druss on Tue, 2011-09-27 23:34

I work extensively on a Windows desktop. However, I often SSH into Linux servers, and I do so using PuTTY, a free and open source client. Everything works peachy. However, I recently had occasion to work extensively with some Unicode source data, and at times I suspected encoding issues in the data because they were not being displayed correctly on my screen.

Finding out the character set of a file in Linux

Submitted by Druss on Tue, 2011-09-27 22:21

It is often important, especially when dealing with databases and such, that files are stored in the correct character set. Failure to do so can result in illegible displays or even data corruption. Checking the character set of a file in Linux can be accomplished using the file command:

Jubal@Stranger:$ file migrate1.csv
migrate1.csv: Little-endian UTF-16 Unicode English text, with CRLF, LF line terminators
Jubal@Stranger:$ file migrate2.csv
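
Where the file command is not available, a much cruder check is to look for a byte order mark (BOM) at the start of the file. The sketch below only recognises files that begin with a BOM, which many UTF-8 files do not, so it is no substitute for file; the name `sniff_bom` is made up for illustration.

```python
import codecs

# UTF-32 must be tested before UTF-16: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(head):
    """Return an encoding name if the bytes start with a known BOM, else None."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None


# Bytes as they might appear at the start of a little-endian UTF-16 file.
print(sniff_bom(b"\xff\xfeh\x00i\x00"))  # utf-16-le
```

In practice you would pass it the first few bytes of a file opened in binary mode, for example `sniff_bom(open(path, "rb").read(4))`.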

MySQL charset issues while importing data using LOAD DATA INFILE

Submitted by Druss on Tue, 2011-09-27 20:47

Earlier today, I was banging my head against the wall trying to import some data in a CSV file into MySQL. While my imports have gone well thus far, this time around I was dealing with data involving lots of strange diacritics, runic squiggles and other manners of gibberish that make the world as fun as it can be. In other words, I was dealing with Unicode.
