as the previouly posted version of this function doesn't handle UTF-8 characters, I simply tried to replace ucfirst to mb_convert_case, but then any previous case foldings were lost while looping through delimiters.
So I decided to do an mb_convert_case on the input string (it also deals with words is uppercase wich may also be problematic when doing case-sensitive search), and do the rest of checking after that.
As with mb_convert_case, words are capitalized, I also added lowercase convertion for the exceptions, but, for the above mentioned reason, I left ucfirst unchanged.
Now it works fine for utf-8 strings as well, except for string delimiters followed by an UTF-8 character ("Mcádám" is unchanged, while "mcdunno's" is converted to "McDunno's" and "ökrös-TÓTH éDUa" in also put in the correct form)
I use it for checking user input on names and addresses, so exceptions list contains some hungarian words too.
<?php
function titleCase($string, $delimiters = array(" ", "-", ".", "'", "O'", "Mc"), $exceptions = array("út", "u", "s", "és", "utca", "tér", "krt", "körút", "sétány", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI", "XII", "XIII", "XIV", "XV", "XVI", "XVII", "XVIII", "XIX", "XX", "XXI", "XXII", "XXIII", "XXIV", "XXV", "XXVI", "XXVII", "XXVIII", "XXIX", "XXX" )) {
$string = mb_convert_case($string, MB_CASE_TITLE, "UTF-8");
foreach ($delimiters as $dlnr => $delimiter){
$words = explode($delimiter, $string);
$newwords = array();
foreach ($words as $wordnr => $word){
if (in_array(mb_strtoupper($word, "UTF-8"), $exceptions)){
$word = mb_strtoupper($word, "UTF-8");
}
elseif (in_array(mb_strtolower($word, "UTF-8"), $exceptions)){
$word = mb_strtolower($word, "UTF-8");
}
elseif (!in_array($word, $exceptions) ){
$word = ucfirst($word);
}
array_push($newwords, $word);
}
$string = join($delimiter, $newwords);
}return $string;
}
?>