It should be noted that $offset is a **character offset**, not a **byte offset**. This means that most other PHP string functions that deal with lengths and offsets (strlen, strpos, preg_match with PREG_OFFSET_CAPTURE, etc.) use and return values unsuitable for this method if used with multibyte strings (like UTF-8 strings).
Byte offsets can be converted to character offsets with mb_strlen:
<?php
function char_offset($string, $byte_offset, $encoding = null)
{
$substr = substr($string, 0, $byte_offset);
return mb_strlen($substr, $encoding ?: mb_internal_encoding());
}
?>