While performing a CSV import recently, I ran into the following warnings:
Warning (Code 1366): Incorrect string value: '\xE9, a <...' for column 'body' at row 3
Warning (Code 1366): Incorrect string value: '\xE6. He ...' for column 'body' at row 24
Warning (Code 1366): Incorrect string value: '\xE9, and...' for column 'body' at row 26
The first warning was triggered by the accented é in the word protegé in the input: the byte \xE9 is é in ISO-8859-1, but it is not a valid UTF-8 sequence on its own. The rest of the field was not imported. The other warnings were triggered in the same way.
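To make the cause concrete, here is a minimal Python sketch of what happens to that byte under each encoding. The sample bytes are reconstructed from the first warning and the word protegé, and the helper name try_decode is hypothetical; the assumption is that the file was ISO-8859-1 (Latin-1).

```python
# Sample bytes reconstructed from the first warning; assumed to be ISO-8859-1.
raw = b"proteg\xe9, a"


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"cannot decode as {encoding}: {exc.reason}"


# In ISO-8859-1 every byte maps to one character, so 0xE9 becomes "é".
as_latin1 = try_decode(raw, "iso-8859-1")

# In UTF-8, 0xE9 announces a multi-byte sequence, but the bytes that
# follow are plain ASCII, so decoding fails at that position.
as_utf8 = try_decode(raw, "utf-8")

print(as_latin1)
print(as_utf8)
```

This mirrors the warning: a consumer expecting UTF-8 reads the field up to the offending byte and then cannot continue.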
The problem here is the mixing and matching of encodings that can happen during imports. While I am usually quite careful to keep everything in UTF-8 during imports and exports, I appear to have missed a spot, which, in this case, was the encoding of the CSV file. I confirmed this by inspecting the file's type information, which reported:
bad.csv: ISO-8859 English text, with very long lines, with CRLF line terminators
Once I used Vim to change the encoding of the file to UTF-8, this changed to:
good.csv: UTF-8 Unicode English text, with very long lines, with CRLF line terminators
A Windows trick is to open the text file in Notepad and save it as a new UTF-8 file via File -> Save As.
Once this change was made, my import went through swimmingly.
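For a scripted alternative to the Vim and Notepad approaches, the conversion could look roughly like the Python sketch below. The function name convert_to_utf8 is hypothetical, and the source encoding is assumed to be ISO-8859-1; the file type reported above only says "ISO-8859", so the exact variant may need adjusting.

```python
def convert_to_utf8(src: str, dst: str, source_encoding: str = "iso-8859-1") -> None:
    """Re-encode a text file as UTF-8, leaving line endings untouched.

    source_encoding is an assumption; change it if the file uses another
    ISO-8859 variant or a Windows code page.
    """
    # newline="" disables newline translation, so CRLF terminators survive.
    with open(src, "r", encoding=source_encoding, newline="") as infile, \
         open(dst, "w", encoding="utf-8", newline="") as outfile:
        for line in infile:
            outfile.write(line)
```

With the file names from this post, the call would be convert_to_utf8("bad.csv", "good.csv"). Writing to a new file rather than overwriting the original keeps the source around in case the assumed encoding turns out to be wrong.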