It is often important, especially when dealing with databases and such, that files are stored in the correct character set. Failure to do so can result in illegible displays or even data corruption. Checking the character set of a file in Linux can be accomplished using the
Jubal@Stranger:$ file migrate1.csv
migrate1.csv: Little-endian UTF-16 Unicode English text, with CRLF, LF line terminators
Jubal@Stranger:$ file migrate2.csv
migrate2.csv: UTF-8 Unicode (with BOM) English text, with CRLF line terminators
In the above example, we are told that
migrate1.csv file is a UTF-16 file with mixed line terminators as well as other information. Similarly, the second file,
migrate2.csv is a standard UTF-8 document.
Hope this helps!