is an easy-to-use Unicode conversion application that batch-converts documents between ANSI, Unicode, and non-Unicode encodings.
It supports Unicode (UTF-8/UTF-16/UTF-7/UTF-32), Chinese Simplified GBK, Chinese Traditional BIG5, Japanese SHIFT-JIS, Japanese EUC-JP,
Korean EUC-KR, French ISO-8859-1, Thai ISO-8859-11, and other character set encodings, and it can process thousands of files within several minutes.
It can convert non-Unicode encodings to Unicode: for example, Chinese Simplified GBK or Chinese Traditional BIG5 to Unicode, Japanese to Unicode, French to UTF-8, Thai to UTF-8, ISO-8859-1 to Unicode, and ANSI to Unicode.
It can also convert from Unicode to non-Unicode encodings: for example, Unicode to GBK, Unicode to BIG5, Unicode to Japanese, Unicode to ISO-8859-1, and Unicode to ANSI.
It can also convert between Unicode encodings: for example, UTF-16 to UTF-8 or UTF-8 to UTF-16.
Unicode Converter performs plain text conversion only. For example, it can convert .txt, .php, .xml, .html, and other text files from ANSI to Unicode/UTF-8 encoding. Unicode Converter is not a file format converter! For example, it cannot convert PDF to text files, Word to HTML files, or anything else like that.
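To illustrate what a plain text encoding conversion involves, here is a minimal Python sketch. It is not how Unicode Converter is implemented; the function name `convert_bytes` is hypothetical, and the encoding names are Python codec names, not the program's encoding codes. The idea is simply to decode the bytes using the source encoding and encode the resulting text in the destination encoding.

```python
def convert_bytes(data: bytes, source_encoding: str, dest_encoding: str) -> bytes:
    """Re-encode raw bytes from one text encoding to another (hypothetical helper)."""
    text = data.decode(source_encoding)   # bytes -> Unicode text
    return text.encode(dest_encoding)     # Unicode text -> bytes

# Non-Unicode to Unicode: Chinese Simplified GBK to UTF-8.
gbk_bytes = "中文".encode("gbk")
utf8_bytes = convert_bytes(gbk_bytes, "gbk", "utf-8")

# Between Unicode encodings: UTF-16 to UTF-8.
utf16_bytes = "héllo".encode("utf-16")
utf8_from_utf16 = convert_bytes(utf16_bytes, "utf-16", "utf-8")

# Unicode to non-Unicode: UTF-8 to ISO-8859-1.
latin1_bytes = convert_bytes("é".encode("utf-8"), "utf-8", "iso-8859-1")
```

Only the byte representation of the characters changes; the text itself stays the same, which is why this kind of conversion applies to plain text files and not to formats such as PDF or Word.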
Who would need Unicode Converter, and for what purpose?
* People who have text files with unknown text encodings, or who receive emails or files that don't display properly because the text encoding is incompatible with their system.
* People who have a vast number of text files in an older non-Unicode format that need to be upgraded to Unicode, or people whose files need to be converted from Unicode to an older format for legacy systems.
* People who want to convert a file's newline formatting to or from DOS (CR/LF), Unix (LF), or Mac (CR) newline format.
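The newline conversion mentioned in the last item can be sketched in a few lines of Python. This is an illustration only, with a hypothetical function name `convert_newlines`; it normalizes every line ending to LF first and then rewrites it in the requested style.

```python
DOS = "\r\n"   # CR/LF
UNIX = "\n"    # LF
MAC = "\r"     # CR

def convert_newlines(text: str, newline: str) -> str:
    """Rewrite all line endings in text to the given newline style."""
    # Handle CR/LF before lone CR so a DOS ending is not counted twice.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", newline)

mixed = "first\r\nsecond\rthird\nfourth"
as_unix = convert_newlines(mixed, UNIX)
as_dos = convert_newlines(mixed, DOS)
as_mac = convert_newlines(mixed, MAC)
```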
How can I run the program from the command line?
tec <source file path> <-de:destination encode code> [-dp:destination file path] [-dn:destination newline code] [-se:source encode code] [-is] [-iw:include words] [-ew:exclude words] [-b2b] [-b2p:bak to path] [-nb]
|source file path||The path and files to be converted. This parameter is required.
For example, "d:\source\*.txt" (use quotes when paths contain spaces).
|-de:destination encoding code||Destination encoding code. This parameter is required.
You can get the full code list from the graphical interface; please see the following red frame.
For example, -de:41 means the destination encoding is UTF-8.
|-dp:"destination file path"||Destination file path. For example, -dp:"d:\dest"
If this parameter is omitted, the converted file is written to the source file path and the source file is overwritten. (Use quotes when the destination path contains spaces.)
|-dn:destination newline code||Destination newline code. You can get the full code list from the graphical interface; please see the following red frame.
For example, -dn:0 means DON'T convert the newline format. With this setting, the source file's newline formatting is preserved in the destination file, altered only as needed to satisfy the requirements of the destination encoding.
If this parameter is omitted, it is the same as -dn:0.
|-se:source encoding code||Source encoding code. You can get the full code list from the graphical interface; please see the following red frame.
For example, -se:0 means TEC will automatically determine the source file's encoding format.
If this parameter is omitted, it is the same as -se:0, i.e., automatic detection of the source file's encoding format.
|-is||Include sub-folders. If specified, source files contained in sub-folders of the source path folder will be converted as well.|
|-iw:include words||Include words in source file name or file path. Only files whose names include the specified words are converted. Used for wildcard or entire-folder source paths.
For example, -iw:test;2010*.log means convert only those source files whose file names include "test" or are of the form "2010*.log".
|-ew:exclude words||Exclude words in source file name or file path. Only files whose names do not include the specified words are converted. Used for wildcard or entire-folder source paths.
For example, with -ew:.bak, source files with an extension of .bak will be excluded.
|-b2b||Back up files to .bak. Used when the destination path is the same as the source path.|
|-b2p:"bak to path"||Back up files to the specified path. Used when the destination path is the same as the source path.
For example, -b2p:"d:\bak". Source files will be backed up to the folder "d:\bak". (Use quotes when the path contains spaces.)
|-nb||Don't back up source files.|
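As a rough illustration of how the -iw and -ew filters could behave, here is a Python sketch. It is based only on the descriptions above and is not TEC's actual matching code. The function names, the case-insensitive comparison, and the rule that a word containing * or ? is treated as a wildcard pattern while any other word is a plain substring are all assumptions.

```python
from fnmatch import fnmatchcase

def matches_any(file_name: str, words: str) -> bool:
    """Return True if file_name matches any of the ';'-separated words."""
    name = file_name.lower()  # assumption: matching ignores case
    for word in words.lower().split(";"):
        if not word:
            continue
        if "*" in word or "?" in word:
            # Assumption: wildcard words must match the whole file name.
            if fnmatchcase(name, word):
                return True
        elif word in name:
            # Assumption: plain words match anywhere in the file name.
            return True
    return False

def should_convert(file_name: str, include_words: str = "", exclude_words: str = "") -> bool:
    """Apply the include filter first, then the exclude filter."""
    if include_words and not matches_any(file_name, include_words):
        return False
    if exclude_words and matches_any(file_name, exclude_words):
        return False
    return True
```

With these assumptions, `should_convert("2010-01.log", include_words="test;2010*.log")` is true, while `should_convert("index.php.bak", exclude_words=".bak")` is false.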
Command Line Example 1:
tec "c:\source file\*.php" -de:41 -b2b -is
This converts *.php files in the "c:\source file\" folder and its sub-folders to UTF-8 encoding, and backs up the original source files to the same folder with the .bak file extension.
Command Line Example 2:
tec "c:\source file\*.php" -de:-2 -dn:1 -dp:"d:\dest file\" -is -ew:.bak
This converts *.php files in "c:\source file" and its sub-folders to UTF-8 (no BOM) encoding with Unix newline format. The destination is the folder "d:\dest file", and .bak files in the source path are not converted.
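If you want to launch such commands from a script, the argument list can be assembled programmatically. The Python sketch below only builds the list, using the switches documented above; the helper name `build_tec_args` is hypothetical. It assumes that `tec` is on the PATH and that the order of the switches does not matter. How TEC handles the quoting applied by a process launcher on Windows has not been verified here.

```python
def build_tec_args(source, dest_encoding, dest_path=None, newline_code=None,
                   include_subfolders=False, exclude_words=None, backup_to_bak=False):
    """Build a tec argument list from the documented switches (hypothetical helper)."""
    args = ["tec", source, f"-de:{dest_encoding}"]
    if dest_path is not None:
        args.append(f"-dp:{dest_path}")
    if newline_code is not None:
        args.append(f"-dn:{newline_code}")
    if include_subfolders:
        args.append("-is")
    if exclude_words:
        args.append(f"-ew:{exclude_words}")
    if backup_to_bak:
        args.append("-b2b")
    return args

# Arguments corresponding to Command Line Example 1 and Example 2.
example_1 = build_tec_args(r"c:\source file\*.php", 41,
                           include_subfolders=True, backup_to_bak=True)
example_2 = build_tec_args(r"c:\source file\*.php", -2, dest_path=r"d:\dest file",
                           newline_code=1, include_subfolders=True, exclude_words=".bak")

# The list could then be passed to subprocess.run(example_2, check=True).
```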
Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. There are several Unicode encodings; the most popular are UTF-8 and UTF-16. UTF-8 is a variable-length character encoding in which all basic Latin character codes are identical to ASCII.
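A short Python sketch makes these properties concrete. The helper names are illustrative; the byte counts follow from the UTF-8 and UTF-16 encodings themselves.

```python
def code_point(ch: str) -> int:
    """The unique Unicode number assigned to a character."""
    return ord(ch)

def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

def utf16_length(ch: str) -> int:
    """Number of bytes the character occupies in UTF-16 (without a BOM)."""
    return len(ch.encode("utf-16-le"))

for ch in ["A", "é", "中", "😀"]:
    print(f"U+{code_point(ch):04X}", utf8_length(ch), utf16_length(ch))
# U+0041 1 2
# U+00E9 2 2
# U+4E2D 3 2
# U+1F600 4 4

# Basic Latin text produces the same bytes in ASCII and in UTF-8.
ascii_compatible = "Hello".encode("ascii") == "Hello".encode("utf-8")
```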
I have more questions - who should I write to?
Please send your additional questions to email@example.com.
Western European (ISO-8859-1)
Central European (ISO-8859-2)
Baltic (old) (ISO-8859-4)
Western European with Euro (ISO-8859-15)
Windows Thai (CP 874)
Japanese SHIFT-JIS (CP932)
Chinese simplified GBK (CP936)
Korean EUC-KR (CP949)
Chinese traditional BIG5 (CP950)
Windows Central European (CP 1250)
Windows Cyrillic (CP 1251)
Windows Western European (CP 1252)
Windows Greek (CP 1253)
Windows Turkish (CP 1254)
Windows Hebrew (CP 1255)
Windows Arabic (CP 1256)
Windows Baltic (CP 1257)
Windows/DOS OEM (CP 437)
Unicode 7 bit (UTF-7)
Unicode 8 bit (UTF-8)
Unicode 8 bit (UTF-8) NO BOM
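The last two entries differ only in the byte order mark (BOM), an optional three-byte signature at the start of a UTF-8 file. The Python sketch below shows the difference using Python's own codecs. It assumes, from the entry names alone, that the plain "UTF-8" option writes a BOM and the "NO BOM" option does not; the helper name `has_utf8_bom` is hypothetical.

```python
import codecs

text = "héllo"

with_bom = text.encode("utf-8-sig")   # UTF-8 preceded by the BOM bytes EF BB BF
without_bom = text.encode("utf-8")    # UTF-8 with no BOM

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)

# Decoding with "utf-8-sig" strips the BOM when it is present.
round_trip = with_bom.decode("utf-8-sig")
```

Some tools, such as PHP interpreters or Unix shell scripts, can be confused by a leading BOM, which is a common reason for choosing the no-BOM variant.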