Since I feel this is rather vague and unhelpful, I thought I'd make a post detailing the mechanics of glob's pattern matching (I'll loosely call it glob's "regex").
glob uses two special symbols that act like a blend between a meta-character and a quantifier. These two characters are * and ?.
The ? matches 1 of any character except a /
The * matches 0 or more of any character except a /
If it helps, think of * as the PCRE equivalent of .* and ? as the PCRE equivalent of the dot (.), except that neither of them will match a /.
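To make that comparison concrete, here is a rough sketch that translates a glob pattern into a PCRE pattern. glob_to_regex() is a name I made up for illustration, not a PHP function, and it only understands * and ?; everything else is treated as a literal character.

```php
<?php
// Hypothetical helper: translate a glob pattern that uses only * and ?
// into an anchored PCRE pattern. Character classes and braces are NOT handled.
function glob_to_regex(string $pattern): string
{
    $regex = '';
    foreach (str_split($pattern) as $char) {
        if ($char === '*') {
            $regex .= '[^/]*';                // zero or more characters, never a slash
        } elseif ($char === '?') {
            $regex .= '[^/]';                 // exactly one character, never a slash
        } else {
            $regex .= preg_quote($char, '~'); // anything else is a literal
        }
    }
    return '~^' . $regex . '$~D';
}

// Try the translated pattern against a few names.
$regex = glob_to_regex('a?.php');
foreach (['a.php', 'ab.php', 'abc.php'] as $name) {
    echo $name, ': ', preg_match($regex, $name) ? 'match' : 'no match', "\n";
}
```

This is only meant to show the mapping; real glob implementations have extra rules (hidden files, escaping, and so on) that this sketch ignores.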
Note: * and ? function independently of the preceding character. For instance, if you run glob("a*.php") on the following list of files, every file starting with an 'a' will be returned. This is what * itself matches in each case:
a.php // * matches nothing
aa.php // * matches the second 'a'
ab.php // * matches 'b'
abc.php // * matches 'bc'
b.php // * matches nothing, because the starting 'a' fails
bc.php // * matches nothing, because the starting 'a' fails
bcd.php // * matches nothing, because the starting 'a' fails
It does not match just a.php and aa.php as a 'normal' regex would, because it matches 0 or more of any character, not the character/class/group before it.
Executing glob("a?.php") on the same list of files will only return aa.php and ab.php because as mentioned, the ? is the equivalent of pcre's dot, and is NOT the same as pcre's ?, which would match 0 or 1 of the previous character.
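If you want to try this yourself, the following sketch builds the file list above in a scratch directory and runs both patterns. make_sandbox() and glob_names() are helper names I made up for this example, and it assumes the system temp directory is writable and its path contains no glob special characters.

```php
<?php
// Hypothetical helper: create a scratch directory containing empty files
// with the given names, so patterns can be tried without touching real files.
function make_sandbox(array $names): string
{
    $dir = sys_get_temp_dir() . '/glob_demo_' . uniqid('', true);
    mkdir($dir);
    foreach ($names as $name) {
        touch($dir . '/' . $name);
    }
    return $dir;
}

// Hypothetical helper: run glob() inside $dir and return the sorted base names.
function glob_names(string $dir, string $pattern): array
{
    $matches = glob($dir . '/' . $pattern) ?: []; // glob() may return false
    $names = array_map('basename', $matches);
    sort($names);
    return $names;
}

$dir = make_sandbox(['a.php', 'aa.php', 'ab.php', 'abc.php', 'b.php', 'bc.php', 'bcd.php']);

$star     = glob_names($dir, 'a*.php'); // 'a' followed by zero or more characters
$question = glob_names($dir, 'a?.php'); // 'a' followed by exactly one character

print_r($star);
print_r($question);

// Clean up the scratch directory.
array_map('unlink', glob($dir . '/*') ?: []);
rmdir($dir);
```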
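glob's regex also supports character classes and negative character classes, using the syntax [] and [^]. It will match any one character inside [] or match any one character that is not in [^]. Note that the manual documents [!...] as the negation syntax; as far as I know, [^...] works where the underlying C library accepts it (glibc on Linux does), so use [!...] if you need the pattern to be portable.
With the same list above, executing glob("[ab]*.php") will return (all of them):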
a.php // [ab] matches 'a', * matches nothing
aa.php // [ab] matches 'a', * matches 2nd 'a'
ab.php // [ab] matches 'a', * matches 'b'
abc.php // [ab] matches 'a', * matches 'bc'
b.php // [ab] matches 'b', * matches nothing
bc.php // [ab] matches 'b', * matches 'c'
bcd.php // [ab] matches 'b', * matches 'cd'
glob("[ab].php") will return a.php and b.php
glob("[^a]*.php") will return:
b.php // [^a] matches 'b', * matches nothing
bc.php // [^a] matches 'b', * matches 'c'
bcd.php // [^a] matches 'b', * matches 'cd'
glob("[^ab]*.php") will return nothing because the character class will fail to match on the first character.
You can also use ranges of characters inside the character class by having a starting and ending character with a hyphen in between. For example, [a-z] will match any one lowercase letter from a to z, [0-9] will match any one digit, etc.
glob also supports limited alternation with {n1,n2,...}. Do not put spaces after the commas unless the file names actually contain them, because a space inside the braces is treated as a literal character. You have to specify GLOB_BRACE as the 2nd argument for glob in order for it to work. So for example, if you executed glob("{a,b,c}.php", GLOB_BRACE) on the following list of files:
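Here is a small sketch of character classes and ranges on the same file list. make_sandbox() and glob_names() are made-up helper names, the temp directory is assumed to be writable, and I use the [!...] form of negation because I am not certain [^...] is accepted on every platform.

```php
<?php
// Hypothetical helper: scratch directory containing empty files with the given names.
function make_sandbox(array $names): string
{
    $dir = sys_get_temp_dir() . '/glob_demo_' . uniqid('', true);
    mkdir($dir);
    foreach ($names as $name) {
        touch($dir . '/' . $name);
    }
    return $dir;
}

// Hypothetical helper: run glob() inside $dir and return the sorted base names.
function glob_names(string $dir, string $pattern): array
{
    $matches = glob($dir . '/' . $pattern) ?: []; // glob() may return false
    $names = array_map('basename', $matches);
    sort($names);
    return $names;
}

$dir = make_sandbox(['a.php', 'aa.php', 'ab.php', 'abc.php', 'b.php', 'bc.php', 'bcd.php']);

$single  = glob_names($dir, '[ab].php');    // exactly one character: 'a' or 'b'
$negated = glob_names($dir, '[!a]*.php');   // first character is anything but 'a'
$range   = glob_names($dir, '[a-b]?.php');  // 'a' or 'b', then exactly one character
$none    = glob_names($dir, '[!ab]*.php');  // nothing in this list starts with another letter

print_r($single);
print_r($negated);
print_r($range);
print_r($none);

// Clean up the scratch directory.
array_map('unlink', glob($dir . '/*') ?: []);
rmdir($dir);
```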
a.php
b.php
c.php
all 3 of them would be returned. Note: using alternation with single characters like that is the same thing as just doing glob("[abc].php"). A more interesting example would be glob("te{xt,nse}.php", GLOB_BRACE) on:
tent.php
text.php
test.php
tense.php
text.php and tense.php would be returned from that glob.
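A runnable sketch of that last example follows. GLOB_BRACE is not defined on every platform, so this version checks for it and otherwise falls back to one glob() call per alternative; that fallback is my own workaround, not something glob does for you. make_sandbox() is a made-up helper and the temp directory is assumed to be writable.

```php
<?php
// Hypothetical helper: scratch directory containing empty files with the given names.
function make_sandbox(array $names): string
{
    $dir = sys_get_temp_dir() . '/glob_demo_' . uniqid('', true);
    mkdir($dir);
    foreach ($names as $name) {
        touch($dir . '/' . $name);
    }
    return $dir;
}

$dir = make_sandbox(['tent.php', 'text.php', 'test.php', 'tense.php']);

if (defined('GLOB_BRACE')) {
    // Alternation: "te" followed by either "xt" or "nse".
    $matches = glob($dir . '/te{xt,nse}.php', GLOB_BRACE) ?: [];
} else {
    // Fallback where GLOB_BRACE is unavailable: one glob() per alternative.
    $matches = array_merge(
        glob($dir . '/text.php') ?: [],
        glob($dir . '/tense.php') ?: []
    );
}

// Brace results are not guaranteed to come back in one sorted run, so sort them here.
$names = array_map('basename', $matches);
sort($names);
print_r($names);

// Clean up the scratch directory.
array_map('unlink', glob($dir . '/*') ?: []);
rmdir($dir);
```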
glob's regex does not offer any kind of quantification of a specified character or character class or alternation. For instance, if you have the following files:
a.php
aa.php
aaa.php
ab.php
abc.php
b.php
bc.php
with pcre regex you can do ~^a+\.php$~ to return
a.php
aa.php
aaa.php
This is not possible with glob. If you are trying to do something like this, you can first narrow it down with glob, and then get exact matches with a full flavored regex engine. For example, if you wanted all of the php files in the previous list whose names consist of nothing but one or more 'a', you can do this:
<?php
$files = array();
$list = glob("a*.php") ?: array(); // glob() can return false on error
foreach ($list as $l) {
    // keep only names made up entirely of 'a' characters before the extension
    if (preg_match('~^a+\.php$~', $l)) {
        $files[] = $l;
    }
}
?>
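The same narrow-then-filter idea can also be sketched with preg_grep(), which applies the regex to a whole array at once. The sandbox helper is a made-up name, the temp directory is assumed to be writable, and the regex is applied to base names because glob is given a directory prefix here.

```php
<?php
// Hypothetical helper: scratch directory containing empty files with the given names.
function make_sandbox(array $names): string
{
    $dir = sys_get_temp_dir() . '/glob_demo_' . uniqid('', true);
    mkdir($dir);
    foreach ($names as $name) {
        touch($dir . '/' . $name);
    }
    return $dir;
}

$dir = make_sandbox(['a.php', 'aa.php', 'aaa.php', 'ab.php', 'abc.php', 'b.php', 'bc.php']);

// Step 1: glob narrows the list to names starting with 'a'.
$candidates = array_map('basename', glob($dir . '/a*.php') ?: []);

// Step 2: PCRE keeps only names made up entirely of 'a' before ".php".
// preg_grep() preserves the original keys, so re-index with array_values().
$files = array_values(preg_grep('~^a+\.php$~D', $candidates));
sort($files);
print_r($files);

// Clean up the scratch directory.
array_map('unlink', glob($dir . '/*') ?: []);
rmdir($dir);
```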
glob also does not support lookbehinds, lookaheads, atomic groupings, capturing, or any of the 'higher level' regex features.
glob does not support 'shorthand' meta-characters like \w or \d.