(PHP 4 >= 4.2.0, PHP 5, PHP 7)
mb_eregi_replace — Replace regular expression with multibyte support ignoring case
string mb_eregi_replace ( string $pattern , string $replace , string $string [, string $option = "msri" ] )
Scans string for matches to pattern, then replaces the matched text with replacement.
pattern
The regular expression pattern. Multibyte characters may be used. The case will be ignored.
replace
The replacement text.
string
The searched string.
option
option has the same meaning as in mb_ereg_replace().
The resultant string or FALSE on error.
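A minimal sketch of a typical call, assuming the script works with UTF-8 text. The helper name replace_word_ci() is hypothetical and only wraps mb_eregi_replace() to show the case-insensitive match and a check of the return value:

```php
<?php
// Sketch only: replace_word_ci() is a hypothetical helper, not part of mbstring.
mb_regex_encoding('UTF-8'); // assumption: patterns and subjects are UTF-8

function replace_word_ci($word, $replacement, $text)
{
    // Case is ignored, so "привет", "Привет" and "ПРИВЕТ" all match.
    $result = mb_eregi_replace($word, $replacement, $text);

    // Anything other than a string signals an error; fall back to the input.
    return is_string($result) ? $result : $text;
}

echo replace_word_ci('привет', 'hello', 'Привет, ПРИВЕТ, привет'), "\n";
```

Falling back to the unchanged input on error is a design choice for this sketch; real code may prefer to report the failure.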
Version | Description |
---|---|
7.1.0 | The e modifier has been deprecated. |
Note:
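The internal encoding or the character encoding specified by mb_regex_encoding() will be used as the character encoding for this function.
Never use the e modifier when working on untrusted input. No automatic escaping will happen (as known from preg_replace()). Not taking care of this will most probably create remote code execution vulnerabilities in your application.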
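One way to avoid the e modifier altogether is mb_ereg_replace_callback(), which passes each match to a function instead of evaluating a code string. The following is a sketch under assumptions: uppercase_matches() is a hypothetical helper, the text is UTF-8, and the pattern comes from trusted code.

```php
<?php
// Sketch only: uppercase_matches() is a hypothetical helper name.
mb_regex_encoding('UTF-8'); // assumption: patterns and subjects are UTF-8

function uppercase_matches($pattern, $text)
{
    // $pattern is assumed to come from trusted code, not from user input.
    return mb_ereg_replace_callback(
        $pattern,
        function (array $m) {
            // $m[0] is the whole match; no code string is ever evaluated.
            return mb_strtoupper($m[0], 'UTF-8');
        },
        $text,
        'i' // ignore case, as mb_eregi_replace() does
    );
}

echo uppercase_matches('мир', 'Привет, мир'), "\n";
```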
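To highlight words in multibyte text:
<?php
$s = 'Алабала';
$f = 'а';
// preg_quote() escapes regex metacharacters in the search term;
// the u modifier makes PCRE treat pattern and subject as UTF-8.
echo preg_replace('/(' . preg_quote($f, '/') . ')/iu', '<b>$1</b>', $s);
?>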
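The same highlighting can be sketched with mb_eregi_replace() itself, using the \1 backreference in the replacement. highlight_ci() is a hypothetical helper; the sketch assumes UTF-8 text and a search term that contains no regular expression metacharacters, since it is inserted into the pattern as-is.

```php
<?php
// Sketch only: highlight_ci() is a hypothetical helper name.
mb_regex_encoding('UTF-8'); // assumption: patterns and subjects are UTF-8

function highlight_ci($needle, $text)
{
    // The parentheses capture the match; \1 re-inserts it with its original case.
    // $needle is assumed to contain no regex metacharacters.
    return mb_eregi_replace('(' . $needle . ')', '<b>\1</b>', $text);
}

echo highlight_ci('а', 'Алабала'), "\n";
// prints: <b>А</b>л<b>а</b>б<b>а</b>л<b>а</b>
```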
Transliterator from Cyrillic to Latin letters for UTF-8 text:
<?php
// Assumes mb_regex_encoding() is set to UTF-8.
function do_translit($st) {
    $replacement = array(
        "й"=>"i","ц"=>"c","у"=>"u","к"=>"k","е"=>"e","н"=>"n",
        "г"=>"g","ш"=>"sh","щ"=>"sh","з"=>"z","х"=>"x","ъ"=>"'",
        "ф"=>"f","ы"=>"i","в"=>"v","а"=>"a","п"=>"p","р"=>"r",
        "о"=>"o","л"=>"l","д"=>"d","ж"=>"zh","э"=>"ie","ё"=>"e",
        "я"=>"ya","ч"=>"ch","с"=>"c","м"=>"m","и"=>"i","т"=>"t",
        "ь"=>"'","б"=>"b","ю"=>"yu",
        "Й"=>"I","Ц"=>"C","У"=>"U","К"=>"K","Е"=>"E","Н"=>"N",
        "Г"=>"G","Ш"=>"SH","Щ"=>"SH","З"=>"Z","Х"=>"X","Ъ"=>"'",
        "Ф"=>"F","Ы"=>"I","В"=>"V","А"=>"A","П"=>"P","Р"=>"R",
        "О"=>"O","Л"=>"L","Д"=>"D","Ж"=>"ZH","Э"=>"IE","Ё"=>"E",
        "Я"=>"YA","Ч"=>"CH","С"=>"C","М"=>"M","И"=>"I","Т"=>"T",
        "Ь"=>"'","Б"=>"B","Ю"=>"YU",
    );
    foreach ($replacement as $i => $u) {
        // Use the case-sensitive mb_ereg_replace() here: mb_eregi_replace() would let
        // each lower-case key also match its upper-case letter, so the upper-case
        // entries of the map would never be applied.
        $st = mb_ereg_replace($i, $u, $st);
    }
    return $st;
}
?>
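For a plain character map like this one, a regular expression is not strictly needed. The sketch below swaps in strtr(), which applies the whole map in a single pass and compares bytes exactly, so upper and lower case stay distinct. translit_strtr() and the shortened map are hypothetical and for illustration only.

```php
<?php
// Sketch only: translit_strtr() is a hypothetical helper name.
function translit_strtr($text, array $map)
{
    // strtr() with an array replaces every key in one pass. It works on bytes,
    // which is safe for UTF-8 because each key is a complete character sequence.
    return strtr($text, $map);
}

// Shortened, illustrative map; a real one would list the full alphabet.
$map = array(
    'П' => 'P', 'п' => 'p', 'р' => 'r', 'и' => 'i',
    'в' => 'v', 'е' => 'e', 'т' => 't',
);

echo translit_strtr('Привет', $map), "\n"; // prints: Privet
```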
When looking for a way to strip newlines from a multibyte UTF-8 string, I ended up at this function, only to discover later that POSIX regular expressions don't "do" newlines, so I couldn't strip them. The patterns I tried were \r\n, \\r\\n, (\\r\\n) and (\\r|\\n), and none of them gave a result.
Since I wanted something simple, along the lines of an mb_nl2br(), I wrote this little recursive function for UTF-8:
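<?php
// Replaces every occurrence of $find in $str (modified by reference), UTF-8 aware.
function mb_str_replace($find, $replace, &$str, $offset = 0)
{
    if ($find === '') { return; } // nothing to search for
    $i = mb_strpos($str, $find, $offset, "UTF-8");
    if ($i === false) { return; }
    $str = mb_substr($str, 0, $i, "UTF-8") . $replace
         . mb_substr($str, $i + mb_strlen($find, "UTF-8"), null, "UTF-8");
    // Continue after the inserted text, so a $replace that contains $find
    // (e.g. "\n" => "<br />\n") cannot recurse forever.
    mb_str_replace($find, $replace, $str, $i + mb_strlen($replace, "UTF-8"));
}
?>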
Note: only moderate unit testing was done. "UTF-8" can be changed to other encodings.
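For UTF-8 in particular, newlines can also be removed with the byte-oriented str_replace(), because the bytes of \r and \n never occur inside a multibyte UTF-8 sequence. A minimal sketch, where strip_newlines_utf8() is a hypothetical helper and the input is assumed to be UTF-8 (other multibyte encodings may not share this property):

```php
<?php
// Sketch only: strip_newlines_utf8() is a hypothetical helper name.
function strip_newlines_utf8($text)
{
    // "\r\n" is listed first so a Windows line ending is removed as one unit.
    return str_replace(array("\r\n", "\r", "\n"), '', $text);
}

echo strip_newlines_utf8("при\r\nвет\n"), "\n"; // prints: привет
```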