This function doesn't always produce the expected results if the needle isn't UTF-8 but you are searching for it in a UTF-8 string. This won't be a concern for most people, but it could be an issue if you are mixing old and new data, especially data read from a file.
Here's an "mb_*"-esque function that searches the string character by character instead:
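<?php
// Multibyte-aware counterpart to str_contains(); $encoding defaults to the internal encoding.
function mb_str_contains(string $haystack, string $needle, ?string $encoding = null): bool {
    // mb_substr_count() rejects an empty needle, so handle it first;
    // like str_contains(), an empty needle always counts as found.
    return $needle === '' || mb_substr_count($haystack, $needle, $encoding ?: mb_internal_encoding()) > 0;
}
?>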
I used mb_substr_count() instead of mb_strpos() because mb_strpos() will still match partial characters, as it's doing a binary (byte-level) search rather than comparing whole characters.
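To see what a partial-character match looks like at the byte level, here is a minimal sketch. It uses the byte-oriented strpos() rather than mb_strpos(), since strpos() is documented to work on bytes; the variable names are only illustrative.
<?php
// "漢字はユニコード" as raw UTF-8 bytes; every character here is 3 bytes long.
$haystack = hex2bin('e6bca2e5ad97e381afe383a6e3838be382b3e383bce38389');
// The first two bytes of a three-byte character: not valid UTF-8 on its own.
$partial = hex2bin('e383');

// The needle is a truncated UTF-8 sequence.
var_dump(mb_check_encoding($partial, 'UTF-8'));   // bool(false)

// Byte offset of the first match, right after the 9 bytes of 漢字は.
$offset = strpos($haystack, $partial);
var_dump($offset);                                // int(9)

// The full character at that offset; the needle only covers its first two bytes.
var_dump(bin2hex(substr($haystack, $offset, 3))); // string(6) "e383a6"
?>
So a byte-level search reports a hit in the middle of ユ even though the needle is not a complete character.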
We can compare str_contains() with the mb_str_contains() function suggested above:
<?php
$string = hex2bin('e6bca2e5ad97e381afe383a6e3838be382b3e383bce38389');
$contains = hex2bin('e383');
$contains2 = hex2bin('e383bc');
echo " = Haystack: ".var_export($string, true)."\r\n";
echo " = Needles:\r\n";
echo " + Windows-1252 characters\r\n";
echo " - Results:\r\n";
echo " > str_contains: ".var_export(str_contains($string, $contains), true)."\r\n";
echo " > mb_str_contains: ".var_export(mb_str_contains($string, $contains), true)."\r\n";
echo " + Valid UTF-8 character\r\n";
echo " - Results:\r\n";
echo " > str_contains: ".var_export(str_contains($string, $contains2), true)."\r\n";
echo " > mb_str_contains: ".var_export(mb_str_contains($string, $contains2), true)."\r\n";
echo "\r\n";
?>
Output:
= Haystack: '漢字はユニコード'
= Needles:
+ Windows-1252 characters
- Results:
> str_contains: true
> mb_str_contains: false
+ Valid UTF-8 character
- Results:
> str_contains: true
> mb_str_contains: true
It's not completely foolproof, however. For instance, a Windows-1252 needle made of the bytes E3 83 89 (ãƒ‰ in that encoding) is byte-for-byte identical to the UTF-8 encoding of ド, so it will still match ド from the above string. So it's still best to convert the parameters to the same encoding first. But if the character set isn't known or can't be detected and you have no choice but to deal with dirty data, this is probably the simplest solution.
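When the needle's encoding is known, the conversion suggested above could look like the following sketch. It assumes the needle is Windows-1252 and the haystack is UTF-8; the helper name str_contains_converted() is hypothetical, not part of PHP.
<?php
// Hypothetical helper: convert the needle to UTF-8, then do a plain byte search.
// Assumes $needleEncoding is the needle's real encoding and $haystack is UTF-8.
function str_contains_converted(string $haystack, string $needle, string $needleEncoding): bool {
    $needleUtf8 = mb_convert_encoding($needle, 'UTF-8', $needleEncoding);
    return str_contains($haystack, $needleUtf8);
}

$haystack = hex2bin('e6bca2e5ad97e381afe383a6e3838be382b3e383bce38389');

// Bytes E3 83 89 are "ド" in UTF-8, but three unrelated characters in Windows-1252.
$needle = hex2bin('e38389');

var_dump(str_contains($haystack, $needle));                            // bool(true): the bytes coincide
var_dump(str_contains_converted($haystack, $needle, 'Windows-1252')); // bool(false)
var_dump(str_contains_converted($haystack, $needle, 'UTF-8'));        // bool(true)
?>
The result depends entirely on which encoding you declare for the needle, which is why this only helps when that encoding is actually known.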