A function I wrote last night is fairly flexible about detecting whitespace, and even accounts for the pesky non-breaking spaces and zero-width spaces higher up in the Unicode range.
The benefit is being able to isolate and identify specific Unicode code points based on the range they fall in.
<?php
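// Returns TRUE if the first character of $string is a recognised whitespace character.
// * This includes non-breaking spaces, zero-width spaces, and any code point of 32 or below.
// * $string: UTF-8 string to examine. If it is longer than one character,
//   only the initial character is examined.
function is_whitespace($string){
    // Return FALSE if passed an empty string.
    if($string == "") return FALSE;

    // Decode the full code point of the first character. ord() would only see
    // the first byte (0-255), so mb_ord() (PHP 7.2+, mbstring extension) is used.
    $char = mb_ord($string, 'UTF-8');

    // Return FALSE if the first character is not valid UTF-8.
    if($char === FALSE) return FALSE;

    // Control characters and the ASCII space (U+0000 to U+0020)
    if($char < 33) return TRUE;

    // U+2000 to U+200F: typographic spaces, zero-width spaces and joiners, directional marks
    if($char > 8191 && $char < 8208) return TRUE;

    // U+2028 to U+202F: line and paragraph separators, directional formatting, narrow no-break space
    if($char > 8231 && $char < 8240) return TRUE;

    // Additional characters
    switch($char){
        case 160:  // Non-Breaking Space (U+00A0)
        case 8287: // Medium Mathematical Space (U+205F)
            return TRUE;
    }
    return FALSE;
}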
?>
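For anyone wanting to check which code point a character actually has, here is a minimal sketch, assuming the input is UTF-8 and that the mbstring extension and PHP 7.2+ are available. The helper name codepoint_of() is hypothetical and only for illustration. It also shows why ord() on its own cannot reach values above 255: it only reads the first byte of a multi-byte character.

```php
<?php
// Hypothetical helper (illustration only): returns the Unicode code point of
// the first character of a UTF-8 string, or NULL if it cannot be decoded.
function codepoint_of(string $string): ?int
{
    // mb_ord() rejects empty input, so handle it up front.
    if ($string === '') {
        return null;
    }
    $cp = mb_ord($string, 'UTF-8');
    // mb_ord() returns FALSE when the first character is not valid UTF-8.
    return $cp === false ? null : $cp;
}

var_dump(codepoint_of("\u{00A0}")); // int(160), non-breaking space
var_dump(codepoint_of("\u{200B}")); // int(8203), zero-width space
var_dump(codepoint_of("\u{205F}")); // int(8287), medium mathematical space
var_dump(ord("\u{00A0}"));          // int(194), first UTF-8 byte only
```

The "\u{...}" escapes need PHP 7.0 or later; on older versions the raw UTF-8 bytes would have to be written out instead.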