preg_replace

(PHP 4, PHP 5, PHP 7)

preg_replace - Perform a regular expression search and replace

Description

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit [, int &$count ]] )

Searches subject for matches to pattern and replaces them with replacement.

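Parameters

pattern

The pattern to search for. It can be either a string or an array of strings.

The /e modifier makes preg_replace() perform the normal substitution of references in the replacement parameter and then evaluate the result as PHP code. Tip: make sure that replacement constitutes a valid PHP code string, otherwise PHP will report a parse error at the line containing preg_replace(). The /e modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0; use preg_replace_callback() instead.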

replacement

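The string or an array of strings to replace with. If this parameter is a string and the pattern parameter is an array, all patterns will be replaced by that string. If both pattern and replacement are arrays, each pattern will be replaced by its replacement counterpart. If the replacement array has fewer elements than the pattern array, any extra patterns will be replaced by an empty string.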
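The array rules above can be sketched as follows. The patterns and the input string are made up for illustration and are not part of the manual.

```php
<?php
// Illustrative values only.
$patterns     = array('/apple/', '/banana/', '/cherry/');
$replacements = array('APPLE', 'BANANA'); // one element fewer than $patterns

// '/cherry/' has no replacement counterpart, so it is replaced with ''.
$partial = preg_replace($patterns, $replacements, 'apple banana cherry');
echo $partial, "\n"; // APPLE BANANA  (with a trailing space)

// A string replacement is applied to every pattern in the array.
$all = preg_replace($patterns, 'fruit', 'apple banana cherry');
echo $all, "\n"; // fruit fruit fruit
?>
```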

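replacement may contain references of the form \\n or (since PHP 4.0.4) $n. Every such reference will be replaced by the text captured by the n-th parenthesized pattern. n can be from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern. To use a backslash in replacement, it must be doubled ("\\\\" PHP string).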
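A minimal sketch of the two reference forms and of the doubled backslash, using a made-up subject string:

```php
<?php
$subject = 'hello world'; // sample input, not from the manual

// $n form: swap the two captured words.
$swapDollar = preg_replace('/(\w+) (\w+)/', '$2 $1', $subject);
echo $swapDollar, "\n"; // world hello

// \\n form: in a single-quoted PHP string, '\\2' is the two characters \2.
$swapSlash = preg_replace('/(\w+) (\w+)/', '\\2 \\1', $subject);
echo $swapSlash, "\n"; // world hello

// $0 refers to the text matched by the whole pattern.
$wrapped = preg_replace('/\w+/', '[$0]', $subject);
echo $wrapped, "\n"; // [hello] [world]

// A literal backslash in the result needs "\\\\" in the PHP string.
$slashed = preg_replace('/ /', '\\\\', $subject);
echo $slashed, "\n"; // hello\world
?>
```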

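When working with a replacement pattern where a backreference is immediately followed by another number (i.e. a literal digit placed immediately after a matched pattern), the familiar \\1 notation cannot be used for the backreference. \\11, for example, would leave preg_replace() unable to tell whether you want the \\1 backreference followed by a literal 1, or the \\11 backreference. In this case the solution is to use ${1}1. This creates an isolated $1 backreference, leaving the 1 as a literal.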

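When using the e modifier, this function escapes some characters (namely ', ", \ and NULL) in the strings that replace the backreferences. This is done to ensure that no syntax errors arise from using backreferences inside single or double quotes (e.g. 'strlen(\'$1\')+strlen("$2")'). Make sure you are familiar with PHP's string syntax so that you know exactly how the string will be interpreted.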

subject

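The string or an array of strings to search and replace.

If subject is an array, the search and replace is performed on every entry of subject, and the return value is an array as well.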
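For instance, with an array subject (the values here are hypothetical), each entry is processed and an array comes back:

```php
<?php
$subjects = array('a1b2', 'c3'); // sample input, not from the manual

// Every digit in every entry is replaced; the return value is an array.
$result = preg_replace('/\d/', '#', $subjects);

echo implode(', ', $result), "\n"; // a#b#, c#
?>
```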

limit

The maximum number of replacements for each pattern in each subject string. Defaults to -1 (no limit).
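A short sketch of how limit behaves, using a made-up subject. Note that the limit is counted separately for each pattern:

```php
<?php
// At most two replacements of 'a'.
$first = preg_replace('/a/', 'X', 'banana', 2);
echo $first, "\n"; // bXnXna

// With several patterns, each pattern gets its own budget of 1.
$each = preg_replace(array('/a/', '/n/'), array('X', 'Y'), 'banana', 1);
echo $each, "\n"; // bXYana
?>
```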

count

If specified, this variable will be filled with the number of replacements done.

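Return Values

preg_replace() returns an array if the subject parameter is an array, or a string otherwise.

If matches are found, the new subject will be returned, otherwise subject will be returned unchanged, or NULL if an error occurred.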

Changelog

Version	Description
5.1.0	Added the `count` parameter
4.0.4	Added the '$n' form for the `replacement` parameter
4.0.2	Added the `limit` parameter

Examples

Example #1 Using backreferences followed by numeric literals


<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

The above example will output:

April1,2003

Example #2 Using indexed arrays with preg_replace()


<?php
$string = 'The quick brown fox jumped over the lazy dog.';
$patterns[0] = '/quick/';
$patterns[1] = '/brown/';
$patterns[2] = '/fox/';
$replacements[2] = 'bear';
$replacements[1] = 'black';
$replacements[0] = 'slow';
echo preg_replace($patterns, $replacements, $string);
?>

The above example will output:

The bear black slow jumped over the lazy dog.

By ksorting patterns and replacements, we get what we wanted.


<?php
ksort($patterns);
ksort($replacements);

echo preg_replace($patterns, $replacements, $string);
?>

The above example will output:

The slow black bear jumped over the lazy dog.

Example #3 Replacing several values


<?php
$patterns = array ('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/',
                   '/^\s*{(\w+)}\s*=/');
$replace = array ('\3/\4/\1\2', '$\1 =');
echo preg_replace($patterns, $replace, '{startDate} = 1999-5-27');
?>

The above example will output:

$startDate = 5/27/1999

Example #4 Using the 'e' modifier


<?php
preg_replace("/(<\/?)(\w+)([^>]*>)/e", 
             "'\\1'.strtoupper('\\2').'\\3'", 
             $html_body);
?>

This would capitalize all HTML tags in the input text.
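On PHP versions without the e modifier, the same effect can be approximated with preg_replace_callback(). This is only a sketch: the sample value of $html_body is an assumption, since the example above does not define it.

```php
<?php
$html_body = '<p class="x">Hello <b>world</b></p>'; // assumed sample input

$result = preg_replace_callback(
    '/(<\/?)(\w+)([^>]*>)/',
    function ($m) {
        // $m[1] is '<' or '</', $m[2] the tag name, $m[3] the rest of the tag.
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $html_body
);

echo $result, "\n"; // <P class="x">Hello <B>world</B></P>
?>
```

Because the callback receives the captured text as ordinary PHP strings, no quoting or escaping of the backreferences is involved.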

Example #5 Strip whitespace

This example strips excess whitespace from a string.


<?php
$str = 'foo   o';
$str = preg_replace('/\s\s+/', ' ', $str);
// This will be 'foo o' now
echo $str;
?>

Example #6 Using the count parameter


<?php
$count = 0;

echo preg_replace(array('/\d/', '/\s/'), '*', 'xp 4 to', -1, $count);
echo $count; //3
?>

The above example will output:

xp***to
3

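Notes

Note:
When using arrays with pattern and replacement, the keys are processed in the order they appear in the array. This is not necessarily the same as the numerical index order. If you use indexes to identify which pattern should be replaced by which replacement, you should perform a ksort() on each array prior to calling preg_replace().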

See Also

preg_match() - Perform a regular expression match
preg_replace_callback() - Perform a regular expression search and replace using a callback
preg_split() - Split string by a regular expression

User Contributed Notes 9 notes

arkani at iol dot pt

15 years ago


Because I search for this a lot:

The following should be escaped if you are trying to match that character

\ ^ . $ | ( ) [ ]
* + ? { } ,

Special Character Definitions
\ Quote the next metacharacter
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation
() Grouping
[] Character class
* Match 0 or more times
+ Match 1 or more times
? Match 1 or 0 times
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match at least n but not more than m times
More Special Character Stuff
\t tab (HT, TAB)
\n newline (LF, NL)
\r return (CR)
\f form feed (FF)
\a alarm (bell) (BEL)
\e escape (think troff) (ESC)
\033 octal char (think of a PDP-11)
\x1B hex char
\c[ control char
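\l lowercase next char (think vi) (Perl only, not supported by PCRE)
\u uppercase next char (think vi) (Perl only, not supported by PCRE)
\L lowercase till \E (think vi) (Perl only, not supported by PCRE)
\U uppercase till \E (think vi) (Perl only, not supported by PCRE)
\E end case modification (think vi); in PCRE, \E only ends a \Q quote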
\Q quote (disable) pattern metacharacters till \E
Even More Special Characters
\w Match a "word" character (alphanumeric plus "_")
\W Match a non-word character
\s Match a whitespace character
\S Match a non-whitespace character
\d Match a digit character
\D Match a non-digit character
\b Match a word boundary
\B Match a non-(word boundary)
\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end
\z Match only at end of string
\G Match only where previous m//g left off (Perl wording; in PCRE, at the position where the current match attempt started)
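Rather than escaping the characters listed above by hand, preg_quote() can do it for you. A small sketch, with made-up search text:

```php
<?php
$needle  = 'price: $5.00 (approx.)';               // literal text to find (sample)
$subject = 'The price: $5.00 (approx.) is final.'; // sample input

// preg_quote() escapes regex metacharacters; the second argument
// additionally escapes the delimiter in use ('/' here).
$pattern = '/' . preg_quote($needle, '/') . '/';

$result = preg_replace($pattern, '[redacted]', $subject);
echo $result, "\n"; // The [redacted] is final.
?>
```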

nik at rolls dot cc

11 years ago


To split Pascal/CamelCase into Title Case (for example, converting descriptive class names for use in human-readable frontends), you can use the below function:

<?php
function expandCamelCase($source) {
  return preg_replace('/(?<!^)([A-Z][a-z]|(?<=[a-z])[^a-z]|(?<=[A-Z])[0-9_])/', ' $1', $source);
}
?>

Before:
  ExpandCamelCaseAPIDescriptorPHP5_3_4Version3_21Beta
After:
  Expand Camel Case API Descriptor PHP 5_3_4 Version 3_21 Beta

bublifuk at mailinator dot com

6 years ago


A delimiter can be any ASCII non-alphanumeric, non-backslash, non-whitespace character:  !"#$%&'*+,./:;=?@^_`|~-  and  ({[<>]})
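For example, picking a delimiter that does not occur in the pattern saves escaping. A quick sketch with a made-up URL:

```php
<?php
$url = 'http://example.com/a/b'; // sample input

// With '/' as the delimiter, every slash in the pattern must be escaped.
$a = preg_replace('/^http:\/\//', 'https://', $url);

// With '#' as the delimiter, no escaping is needed.
$b = preg_replace('#^http://#', 'https://', $url);

// Bracket-style delimiters use the matching closing bracket.
$c = preg_replace('{^http://}', 'https://', $url);

echo $a, "\n", $b, "\n", $c, "\n"; // https://example.com/a/b three times
?>
```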

ismith at nojunk dot motorola dot com

17 years ago


Be aware that when using the "/u" modifier, if your input text contains any bad UTF-8 code sequences, then preg_replace will return NULL (which prints as an empty string), regardless of whether there were any matches.

This is due to the PCRE library returning an error code if the string contains bad UTF-8.
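One way to detect this case is to check the result against NULL and ask preg_last_error() what went wrong. A minimal sketch with a deliberately broken input string:

```php
<?php
$bad = "abc\xFFdef"; // the byte 0xFF never occurs in valid UTF-8

$result = preg_replace('/b/u', 'B', $bad);
$error  = preg_last_error(); // error code of the preg_replace() call above

if ($result === null && $error === PREG_BAD_UTF8_ERROR) {
    echo "Subject is not valid UTF-8\n";
}
?>
```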

php-comments-REMOVE dot ME at dotancohen dot com

16 years ago


Below is a function for converting Hebrew final characters to their
normal equivalents should they appear in the middle of a word.
The \b assertion does not treat Hebrew letters as part of a word,
so I had to work around that limitation.

<?php

$text="עברית מבולגנת";

function hebrewNotWordEndSwitch ($from, $to, $text) {
   // Replace $from with $to when it is followed by another Hebrew letter
   $text=
    preg_replace('/'.$from.'([א-ת])/u',$to.'$1',$text);
   return $text;
}

do {
   $text_before=$text;
   $text=hebrewNotWordEndSwitch("ך","כ",$text);
   $text=hebrewNotWordEndSwitch("ם","מ",$text);
   $text=hebrewNotWordEndSwitch("ן","נ",$text);
   $text=hebrewNotWordEndSwitch("ף","פ",$text);
   $text=hebrewNotWordEndSwitch("ץ","צ",$text);
}   while ( $text_before!=$text );

print $text; // עברית מסודרת!

?>

The do-while is necessary for multiple instances of letters, such
as "אנני" which would start off as "אןןי". Note that there's still the
problem of acronyms with gershiim but that's not a difficult one
to solve. The code is in use at http://gibberish.co.il which you can
use to translate wrongly-encoded Hebrew, transliterate, and perform some
other Hebrew-related functions.

To ensure that there will be no regular characters at the end of a
word, just convert all regular characters to their final forms, then
run this function. Enjoy!

me at perochak dot com

13 years ago


If you would like to remove a tag along with the text inside it, use the following code.

<?php
// The lazy .+? stops at the first closing tag
$string = preg_replace('/<tag>.+?<\/tag>/i', '', $string);
?>

Example:

<?php
$string = '<span class="normalprice">55 PKR</span>';
$string = preg_replace('/<span class="normalprice">.+?<\/span>/i', '', $string);
?>

This results in an empty string.

<?php
$string = 'My String <span class="normalprice">55 PKR</span>';
$string = preg_replace('/<span class="normalprice">.+?<\/span>/i', '', $string);
?>

This results in "My String " (with a trailing space).

sternkinder at gmail dot com

17 years ago


From what I can see, the problem is that if you simply substitute all 'A's with 'T's, you can't tell afterwards which 'T's to substitute with 'A's. This can be solved, for instance, by first replacing all 'A's with another character (for instance '_' or whatever you like), then replacing all 'T's with 'A's, and then replacing all '_'s (or whatever character you chose) with 'T's:

<?php
$dna = "AGTCTGCCCTAG";
echo str_replace(array("A","G","C","T","_","-"), array("_","-","G","A","T","C"), $dna); //output will be TCAGACGGGATC
?>

Although I don't know how transliteration in Perl works (though I remember that it is kind of similar to the UNIX command "tr"), I would suggest the following function for "switching" single chars:

<?php
// $unused_char must not occur in $subject or in $switch_table
function switch_chars($subject,$switch_table,$unused_char="_") {
    foreach ( $switch_table as $_1 => $_2 ) {
        $subject = str_replace($_1,$unused_char,$subject);
        $subject = str_replace($_2,$_1,$subject);
        $subject = str_replace($unused_char,$_2,$subject);
    }
    return $subject;
}

echo switch_chars("AGTCTGCCCTAG", array("A"=>"T","G"=>"C")); //output will be TCAGACGGGATC
?>
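For single-character swaps like this, PHP's built-in strtr() is probably the closest thing to "tr". It translates each character once, so no placeholder character is needed. A sketch using the same DNA string:

```php
<?php
$dna = "AGTCTGCCCTAG";

// Character-list form: A->T, G->C, C->G, T->A in a single pass.
$complement = strtr($dna, "AGCT", "TCGA");
echo $complement, "\n"; // TCAGACGGGATC

// Array form, equivalent for this input.
$complement2 = strtr($dna, array("A" => "T", "T" => "A", "G" => "C", "C" => "G"));
echo $complement2, "\n"; // TCAGACGGGATC
?>
```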

mail at johanvandemerwe dot nl

5 years ago


Sample for replacing bracketed short-codes

The short-codes used here are for educational purposes only; they could be shorter, as in 'i' for 'italic' or 'b' for 'bold'.

Sample text
----
This sample shows how to have [italic]italic[/italic], [bold]bold[/bold] and [underline]underlined[/underline] and [strikethrough]striked[/strikethrough] text.

with this function:

<?php
function textDecoration($html)
{
    // Group 1 is the tag name (reused as \1 to match the closing tag),
    // group 2 is the text between the tags.
    $patterns = [
        '/\[(italic)\](.*?)\[\/\1\]/',
        '/\[(bold)\](.*?)\[\/\1\]/',
        '/\[(underline)\](.*?)\[\/\1\]/'
    ];

    $replacements = [
        '<i>$2</i>',
        '<strong>$2</strong>',
        '<u>$2</u>'
    ];

    return preg_replace($patterns, $replacements, $html);
}

$html = textDecoration($html);

echo $html; // or return
?>

results in:
----
This sample shows how to have <i>italic</i>, <strong>bold</strong> and <u>underlined</u> and [strikethrough]striked[/strikethrough] text.

Notice!
There is no [strikethrough]striked[/strikethrough] fallback in the patterns and replacements arrays, so that short-code is left untouched.

razvan_bc at yahoo dot com

2 years ago


How do you strip all comments from code without removing the line endings (CRLF = \r\n, or CR = \r) of each line?

<?php
$txt_target=<<<t1
this;//    dsdsds
    nope
    
/*
    ok
    */
is;huge
/*text bla*/
    /*bla*/
 
t1;

/*
=======================================================================
expected result:
=======================================================================
this;
    nope

is;huge
=======================================================================
visualizing in a hex viewer .. to_check_with_a_hex_viewer.txt ...
 t  h  i  s  ; LF TAB n  o  p  e CR LF CR LF  i  s  ;  h  u  g  e CR LF
74 68 69 73 3b 0a 09 6e 6f 70 65 0d 0a 0d 0a 69 73 3b 68 75 67 65 0d 0a
I used F3 (viewer + options 3: hex) in mythical TOTAL COMMANDER!
=======================================================================
*/

echo '<hr><pre>';
echo  $txt_target;
echo '</pre>';

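//  single-line '//' comments (this also matches '//' inside strings and URLs)
$txt_target = preg_replace('![ \t]*//.*[ \t]*!', '', $txt_target);

//  /* block comments */, including any '/' inside them, plus trailing whitespace
$txt_target = preg_replace('/\/\*.*?\*\/\s*/s', '', $txt_target);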
echo '<hr><pre>';
echo  $txt_target;
echo '</pre><hr>';

file_put_contents('to_check_with_a_hex_viewer.txt',$txt_target);

?>
