Question: I'm looking for a utility that converts text files from ANSI (native-language code pages) to UTF-8 from PHP, so that we can easily convert documents to UTF-8 in batches. I know of several tools and libraries that can do this, but I want to know whether it is possible with just a PHP program. Thanks in advance; I hope I can find a solution on GoFunNow.
Answer: For what you describe, you could try the free trial of Text Encoding Converter. It converts multiple ANSI/UTF-8/Unicode plain text documents to and from any encoding, and it can be used together with C#, VB .NET, MS Visual Basic, Borland Delphi, VBA (MS Office products such as Access), C and C++ via its command line interface. More information is available on the software's homepage. The steps below show how to use it.
Step 1. Download the free trial from the Text Encoding Converter homepage.
When the download finishes, install the software. You will find tec.exe in the installation path. Finally, copy tec.exe to wherever you want to bundle it.
Step 2. Convert ANSI to UTF-8 from PHP
tec <source file path> <-de:destination encoding code> [-dfn:destination file name] [-dp:destination file path] [-dn:destination newline code] [-se:source encoding code] [-is] [-iw:include words] [-ew:exclude words] [-b2b] [-b2p:bak to path] [-nb] [-sametime]
Please note: add quotes when an argument contains spaces.
|source file path||The path and files to be converted. This parameter is required.
For example, "d:\source\*.txt" (use quotes when paths contain spaces)
|-de:destination encoding code||Destination encoding code. This parameter is required.
You can get the full code list from the graphical interface (see the following red frame).
For example, -de:41 means the destination encoding is UTF-8.
|-dp:destination file path||Destination file path. For example, -dp:"d:\dest files"
If this parameter is omitted, the source file is converted in the same file path and the source file is overwritten. (Use quotes when the destination path contains spaces.)
|-dfn:destination file name||Destination file name. For example, -dfn:"d:\dest files\dest1.csv"
|-dn:destination newline code||Destination newline code. You can get the full code list from the graphical interface (see the following red frame).
For example, -dn:0 means the newline format is NOT converted. With this setting, the source file's newline formatting is preserved in the destination file, altered only as needed to satisfy the requirements of the destination encoding code.
If this parameter is omitted, it is the same as -dn:0.
|-se:source encoding code||Source encoding code. You can get the full code list from the graphical interface (see the following red frame).
For example, with -se:0, TEC automatically determines the source file's encoding format.
If this parameter is omitted, it is the same as -se:0, i.e. auto detection of the source file's encoding format.
|-is||Include sub-folders; If specified, source files contained in sub folders of the source path folder will be converted as well.|
|-iw:include words||Include words in source file name or file path. Only convert files whose names include the specified words. Used for wild card or entire folder source paths.
For example, -iw:test;2010*.log means convert only those source files whose file names include "test" or are of the form "2010*.log".
|-ew:exclude words||Exclude words in source file name or file path. Only convert files whose names do not include the specified words. Used for wild card or entire folder source paths.
For example, with -ew:.bak, source files with an extension of .bak will be excluded.
|-b2b||Back up files to .bak. Used when the destination path is the same as the source path.|
|-b2p:"bak to path"||Back up files to the specified path. Used when the destination path is the same as the source path.
For example, -b2p:"d:\bak". Source files will be backed up to the folder "d:\bak". (Use quotes when the path contains spaces.)
|-nb||Don't back up source files|
|-sametime||The destination file gets the same file time stamp as the source file, so that you will not lose track of the dates files were changed|
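The parameters above can be assembled from PHP with a small helper. This is only a sketch: `quoteIfNeeded` and `buildTecCommand` are hypothetical names introduced here, not part of Text Encoding Converter, and the quoting simply follows the note above (wrap an argument in double quotes when it contains spaces).

```php
<?php
// Hypothetical helpers: build a tec command line from the documented switches.
// Quoting follows the note above: wrap an argument in quotes when it contains spaces.
function quoteIfNeeded(string $arg): string
{
    return (strpos($arg, ' ') !== false) ? '"' . $arg . '"' : $arg;
}

function buildTecCommand(string $appPath, string $source, array $options, array $flags = []): string
{
    $parts = [$appPath, quoteIfNeeded($source)];
    foreach ($options as $name => $value) {
        // e.g. 'de' => '41' becomes -de:41
        $parts[] = quoteIfNeeded('-' . $name . ':' . $value);
    }
    foreach ($flags as $flag) {
        // value-less switches such as -is, -b2b, -nb, -sametime
        $parts[] = '-' . $flag;
    }
    return implode(' ', $parts);
}

echo buildTecCommand('tec', 'c:\source file\*.php', ['de' => '41'], ['b2b', 'is']), PHP_EOL;
// prints: tec "c:\source file\*.php" -de:41 -b2b -is
```

An option whose value contains spaces is quoted as a whole argument, for example "-dfn:c:\dest files\dest1.php".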
Command Line Example 1:
tec "c:\source file\*.php" -de:41 -b2b -is
This will convert *.php files in the "c:\source file\" folder and its sub-folders to the UTF-8 encoding format, and it will back up the original source files to the same folder using the .bak file extension.
Command Line Example 2:
tec "c:\source file\*.php" -de:-2 -dn:1 -dp:"d:\dest file" -is -ew:.bak -sametime
This will convert *.php files in "c:\source file" and its sub-folders to the UTF-8 (no BOM) encoding format and to the Unix newline format. The destination path is the folder "d:\dest file", .bak files in the source path are not converted, and each destination file gets the same file time stamp as its source file.
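Example 2 asks for UTF-8 without a BOM. To confirm which variant a converted file actually is, you can check whether its data starts with the three-byte UTF-8 byte order mark (EF BB BF). A minimal sketch; `hasUtf8Bom` is a hypothetical helper name, not part of tec or PHP.

```php
<?php
// Hypothetical helper: reports whether a byte string starts with the UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// For a converted file, pass its contents, e.g. hasUtf8Bom(file_get_contents($path)).
var_dump(hasUtf8Bom("\xEF\xBB\xBFhello")); // bool(true)
var_dump(hasUtf8Bom("hello"));             // bool(false)
```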
Command Line Example 3:
tec "c:\source files\source1.php" -de:41 "-dfn:c:\dest files\dest1.php"
This will convert the source1.php file in the "c:\source files\" folder to the dest1.php file in the "c:\dest files" folder. The destination encoding format is UTF-8.
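<?php
$sAppPath = 'c:\tec\tec.exe'; // example path; it can't include spaces, otherwise the app will not run
// source and destination file names can include spaces, but must be wrapped in ""
$sSourceFileName = 'c:\source files\source1.php';
$sDestFileName = 'c:\dest files\dest1.php';
$sCommand = $sAppPath.' "'.$sSourceFileName.'" "-dfn:'.$sDestFileName.'" -de:41';
echo exec($sCommand, $sOutput);
if (!file_exists($sDestFileName)) { // assumed check: no destination file means the conversion failed
    echo sprintf("Error, can't convert %s to %s.", $sSourceFileName, $sDestFileName);
} else {
    echo sprintf("convert %s to %s ok", $sSourceFileName, $sDestFileName);
}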
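As for whether this is possible with PHP alone: when the source code page is known, PHP's built-in `iconv()` can convert a string to UTF-8 without an external tool. The following is a minimal sketch, assuming the source text is Windows-1252 (a common "ANSI" code page; adjust it for your language) and that the iconv extension is available. `convertAnsiToUtf8` is a hypothetical name, and unlike tec's -se:0 nothing here auto-detects the source encoding.

```php
<?php
// Hypothetical helper: converts a string from an assumed ANSI code page to UTF-8.
function convertAnsiToUtf8(string $text, string $sourceEncoding = 'Windows-1252'): string
{
    $converted = iconv($sourceEncoding, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Can't convert from $sourceEncoding to UTF-8.");
    }
    return $converted;
}

// "café" in Windows-1252: the single byte 0xE9 is "é".
$utf8 = convertAnsiToUtf8("caf\xE9");
echo bin2hex($utf8), PHP_EOL; // 636166c3a9
```

For a batch, the same function could be applied to each file returned by `glob()`, reading with `file_get_contents()` and writing with `file_put_contents()`.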