Johan van Mol .org
The Unicode Workflow
Tuesday, 11 April 2006
Article Index
The Unicode Workflow
Input
Application
Output
Sample code

Sample code

Here is an example of a script in PHP and Perl that puts much of what we covered into practice. It takes input from a form and sends it out as an e-mail in ISO-8859-1 or UTF-8 format.
To be fully Unicode-Workflow-compliant you should save the source code in UTF-8 encoding. But since I don't know how your browser will download this script, or whether you feel like changing your editor's preferences, I used HTML entities in the string literals, so you don't have to worry about that.
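
The general shape of such a script can be sketched in PHP as follows. This is a minimal sketch and not the original listing: the function name `build_mail`, its parameters and the sample text are hypothetical, and it assumes that the form data arrives as UTF-8 and that the mbstring extension is available. Text stays in UTF-8 inside the application and is converted only at output time.

```php
<?php
// Hypothetical helper for illustration; not the article's original script.
// Assumes $subject and $body are UTF-8 and that mbstring is loaded.
function build_mail(string $subject, string $body, string $charset = 'UTF-8'): array
{
    if ($charset !== 'UTF-8') {
        // Convert from the internal UTF-8 representation only on output.
        $subject = mb_convert_encoding($subject, $charset, 'UTF-8');
        $body    = mb_convert_encoding($body, $charset, 'UTF-8');
    }

    // RFC 2047 encoded-word, so a non-ASCII subject survives transport.
    // Fine for short subjects; long ones must be split into several words.
    $encodedSubject = '=?' . $charset . '?B?' . base64_encode($subject) . '?=';

    // Tell the receiving client which charset the body uses.
    $headers = "MIME-Version: 1.0\r\n"
             . "Content-Type: text/plain; charset=$charset\r\n"
             . "Content-Transfer-Encoding: 8bit";

    return [$encodedSubject, $body, $headers];
}

// An HTML entity keeps this source file pure ASCII; it is decoded to UTF-8.
$text = html_entity_decode('Caf&eacute;', ENT_QUOTES, 'UTF-8');
[$mailSubject, $mailBody, $mailHeaders] = build_mail($text, $text, 'ISO-8859-1');
```

The three returned values could then be passed to `mail()` together with a recipient address. Building the message separately from sending it makes it easy to check the encoded result before anything leaves the server.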

 



Comments
Pretty good advice, though I don't like and don't recommend using HTML entities. Good reading, thanks.

PS: I still get a kick out of how hard PHP makes i18n work; ColdFusion is a lot cleaner when it comes to this sort of thing.

  Posted by PaulH, Whose homepage is http://www.sustainableGIS.com/blog/cfg11n/ on Wednesday, 12 April 2006 at 6:18

Personally, I'd amend the first step of your workflow to read:

* Whenever data is entered in your application, convert it to UTF-8 if it's not, and then normalize the data to your preferred Unicode normalization form.

  Posted by AndrewC, on Friday, 19 May 2006 at 5:03

Great article!

I'm a rusty - very rusty web dev and I had never had to import dynamic text to Flash... Since my content was in French... I had loads of trouble with accents. The Unicode Workflow fixed my problems. Thank you!

  Posted by Chris, on Saturday, 03 June 2006 at 11:36

Thanks a LOT for this tutorial! It was great reading and it helped a lot.
  Posted by Gregor, on Tuesday, 20 June 2006 at 10:37

You have written a wonderful article about Unicode.
  Posted by bari, on Tuesday, 01 August 2006 at 2:48

Yep, unfortunately there are still browsers in use that do not support Unicode. It is really tricky if you are working with ASP/ASP.NET. It does things you can't control, and that's why it sucks so badly.
  Posted by anonymous email, Whose homepage is http://www.anonymousspeech.com on Friday, 27 October 2006 at 11:47

Wonderful article, and very in-depth too! I still tend to prefer HTML entities in HTML files and such, but your article fills many gaps in my knowledge of text encoding. This truly is a reference.


You might still want to add, though, that PHP files should be saved without a byte order mark, because the BOM is sent as output before any headers and can break script-generated headers.

  Posted by Bram Esposito, Whose homepage is http://www.patpitiee.be on Friday, 10 November 2006 at 5:21

Thank you very much for your help
  Posted by valery, on Tuesday, 19 December 2006 at 1:06

Hotmail and Yahoo UTF-8 problem solved:

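$char = 'UTF-8';
// Take the first label of the recipient's domain, e.g. "hotmail" from user@hotmail.com.
$e = explode('@', $toAddress);
$e = explode('.', $e[1]);
$e = strtolower($e[0]);
if ($e == 'hotmail' || $e == 'yahoo') {
   // Fall back to ISO-8859-1 for these providers.
   $fromName = utf8_decode($fromName);
   $subject  = utf8_decode($subject);
   $message  = utf8_decode($message);
   $char = 'ISO-8859-1';
}

// Double quotes are needed so that \r\n and $char are interpolated.
$headers  = "MIME-Version: 1.0\r\n";
$headers .= "Content-type: text/html; charset=$char\r\n";
// The sender address that followed the name appears to be missing from this comment.
$headers .= 'From: ' . $fromName . ' ';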

mail($toAddress, $subject, $message, $headers);

  Posted by ff, Whose homepage is http://shoppingP.com on Monday, 25 December 2006 at 9:14

Thanks very much. I have been looking for precisely this information. And it ain't easy to find in any tongue.
  Posted by Mark Solomon, Whose homepage is http://hanged.man.tripod.com/majorarcanum on Wednesday, 11 April 2007 at 11:27

thanks for the info - good stuff!
  Posted by tag hag, on Monday, 30 April 2007 at 6:11

Really helpful. I found the function at http://uk2.php.net/manual/en/function.imap-8bit.php#61216 worked fine for sending email subjects, but one change needed making:

Change:
$sLine = implode( '=' . chr(13).chr(10), $aMatch[0] ); // add soft crlf's

to
$sLine = '=?utf-8?Q?' . implode( "?=\r\n\t=?utf-8?Q?", $aMatch[0] ) . '?='; // one encoded-word per folded line

  Posted by Rob, on Friday, 04 May 2007 at 10:28

Very nice concept. It did not go too far into depth, but it was very well explained. A lot better explained than what you will find on mIRC, that's for sure.
  Posted by Tyler Dewitt, Whose homepage is http://www.dewittsmedia.com on Sunday, 21 October 2007 at 2:37

While converting PHP to ASP, I came across

utf8_decode($aUsers[$i]).

What is the ASP replacement for PHP's utf8_decode() function? I searched all over the web but couldn't find a solution. Could you help me?

  Posted by srikanth dhondi, on Thursday, 14 February 2008 at 4:17
