Last Updated on 15/11/2020
This tutorial shows you how to quickly check and convert file encodings (charsets) on Unix-based operating systems, such as Linux distros and macOS.
Check your file encoding
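To check the current file encoding, use the command below, replacing <filename> with the desired file. The uppercase -I flag shown here is the macOS spelling; on most Linux distributions the equivalent flag is lowercase -i.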
file -I <filename>
Example:
file -I test.csv
test.csv: text/plain; charset=iso-8859-1
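If you only need the charset itself (for example, inside a script), the following is a minimal sketch that uses the --mime-encoding long option, which should behave the same on both Linux and macOS. The helper name detect_charset and the file sample.txt are made up for illustration.

```shell
# Hypothetical helper: print only the charset that file(1) detects.
# -b (brief) omits the leading "filename:" prefix from the output.
detect_charset() {
  file -b --mime-encoding "$1"
}

# Create a small ASCII-only sample file and inspect it.
printf 'plain ascii text\n' > sample.txt
detect_charset sample.txt
```

Keep in mind that file guesses the encoding from the bytes it sees, so treat the result as a strong hint rather than a guarantee.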
Convert your file encoding
Now that you know the encoding of your file, you can convert the source file to a new one with the desired encoding. To do so, run the command below, replacing the parameters <source_encoding>, <destination_encoding>, <source_file>, and <destination_file>.
iconv -f <source_encoding> -t <destination_encoding> <source_file> > <destination_file>
Example:
iconv -f iso-8859-1 -t utf-8 test.csv > new_test.csv
The output file new_test.csv has the contents of the old file but with the desired encoding:
file -I new_test.csv
new_test.csv: text/plain; charset=utf-8
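One caveat with the redirect: the shell creates (or truncates) the destination file before iconv runs, so a failed conversion can leave a partial file behind, and redirecting to the source file itself would wipe it. The sketch below is one possible way to guard against that by writing to a temporary file first. The function name convert_encoding and the sample file names are assumptions for illustration only.

```shell
# Hypothetical wrapper: convert via a temporary file so that a failed
# conversion never leaves a half-written destination file behind.
convert_encoding() {
  src_enc=$1
  dst_enc=$2
  src=$3
  dst=$4
  tmp="$dst.tmp.$$"   # temporary name next to the destination

  if iconv -f "$src_enc" -t "$dst_enc" "$src" > "$tmp"; then
    mv "$tmp" "$dst"  # replace the destination only on success
  else
    rm -f "$tmp"
    echo "conversion failed: $src" >&2
    return 1
  fi
}

# Sample input: "café" in ISO-8859-1 (octal 351 is the byte 0xE9, "é").
printf 'caf\351\n' > latin1.txt
convert_encoding iso-8859-1 utf-8 latin1.txt utf8.txt
```

Because the destination is only replaced after iconv succeeds, the same wrapper can also be called with the source and destination set to the same file.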
Useful tip
Files with charset US-ASCII are already valid UTF-8, because UTF-8 encodes every ASCII character as the same single byte. So if you try to convert from US-ASCII to UTF-8, the output file is byte-for-byte identical to the input and will still be reported as US-ASCII, since no conversion is necessary.
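You can check this yourself by comparing the bytes of the two files. The snippet below is a small sketch; the file names ascii.csv and ascii_utf8.csv are invented for the example.

```shell
# Create an ASCII-only file and "convert" it to UTF-8.
printf 'id,name\n1,alice\n' > ascii.csv
iconv -f us-ascii -t utf-8 ascii.csv > ascii_utf8.csv

# cmp -s is silent and exits with status 0 when the files are identical.
if cmp -s ascii.csv ascii_utf8.csv; then
  echo "identical bytes"
else
  echo "files differ"
fi
```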
References
- Unix & Linux Stack Exchange – Why did this file not convert to UTF-8 when using iconv?