This tutorial shows how to quickly check and convert file character encodings on Unix-based operating systems, such as Linux distributions and macOS.
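To check the current encoding of a file, run the command below, replacing <filename> with the desired file. (The -I flag is the macOS spelling; most Linux distributions use lowercase -i instead, and the long form --mime works on both.)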
file -I <filename>
Example:
file -I test.csv
test.csv: text/plain; charset=iso-8859-1
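If you only need the charset name (for example, to use it in a script), a small wrapper can help. The sketch below assumes your version of file supports the -b and --mime-encoding options; detect_charset and sample_utf8.txt are hypothetical names chosen for illustration.

```shell
# detect_charset: hypothetical helper that prints only the charset name.
detect_charset() {
    # -b drops the "filename:" prefix; --mime-encoding prints just the charset.
    file -b --mime-encoding "$1"
}

# Create a small sample file containing "café" encoded as UTF-8 (bytes C3 A9).
printf 'caf\303\251\n' > sample_utf8.txt

# Print the detected charset of the sample file.
detect_charset sample_utf8.txt
```

Keep in mind that file guesses the encoding from the bytes it reads, so the result is a heuristic rather than a guarantee.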
Now that you know the encoding of your file, you can convert it into a new file with the desired encoding. To do so, run the command below, replacing the parameters <source_encoding>, <destination_encoding>, <source_file>, and <destination_file>.
iconv -f <source_encoding> -t <destination_encoding> <source_file> > <destination_file>
Example:
iconv -f iso-8859-1 -t utf-8 test.csv > new_test.csv
The output file new_test.csv has the contents of the old file but with the desired encoding:
file -I new_test.csv
new_test.csv: text/plain; charset=utf-8
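The two steps (detect, then convert) can be combined into one helper. This is a minimal sketch, not a definitive implementation: to_utf8, latin1.csv, and latin1_utf8.csv are hypothetical names, and it assumes that the charset name printed by file is one that your iconv accepts.

```shell
# to_utf8: hypothetical helper; converts SRC to UTF-8 and writes the result to DST.
to_utf8() {
    src=$1
    dst=$2
    # Ask file for the charset name only (assumes -b and --mime-encoding exist).
    charset=$(file -b --mime-encoding "$src") || return 1
    case $charset in
        utf-8|us-ascii)
            # Already valid UTF-8: a plain copy is enough.
            cp "$src" "$dst" ;;
        binary|unknown-8bit)
            echo "cannot detect a text encoding for $src" >&2
            return 1 ;;
        *)
            iconv -f "$charset" -t utf-8 "$src" > "$dst" ;;
    esac
}

# Sample input: "café" encoded as ISO-8859-1 (byte E9 for "é").
printf 'caf\351\n' > latin1.csv
to_utf8 latin1.csv latin1_utf8.csv
```

Always write to a different output file: redirecting the output with > onto the source file truncates it before iconv has read it.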
US-ASCII is a subset of UTF-8, so every valid US-ASCII file is already a valid UTF-8 file. If you convert a file from US-ASCII to UTF-8, the output is byte-for-byte identical to the input and file still reports it as US-ASCII, since no conversion is necessary.
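You can check this behavior yourself with a quick experiment. The file names ascii.csv and ascii_utf8.csv below are hypothetical, and the sketch assumes your version of file supports -b and --mime-encoding.

```shell
# Create a file that contains only ASCII characters.
printf 'id,name\n1,test\n' > ascii.csv

# "Convert" it from US-ASCII to UTF-8.
iconv -f us-ascii -t utf-8 ascii.csv > ascii_utf8.csv

# cmp exits with status 0 when the two files are byte-for-byte identical.
cmp ascii.csv ascii_utf8.csv && echo "identical"

# The converted file is still reported with the narrower charset.
file -b --mime-encoding ascii_utf8.csv
```

The file only starts being reported as UTF-8 once it contains at least one non-ASCII character.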