Saturday, September 03, 2011

Regular Expression

Metacharacters

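Special characters used to describe what to match; they can be combined.
Symbol  Description
. Matches any single character (by default, any character except a newline)
^ At the start of a pattern, anchors the match to the beginning of the string; as the first character inside [], it negates the character class
$ Anchors the match to the end of the string
\s Matches any whitespace character
\d Matches any digit
\w Matches any "word" character (letter, digit, or underscore)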
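
As a quick illustration of the table above, here is a minimal PHP sketch that tries each metacharacter with preg_match. The sample strings and the $checks array are made up for this example; preg_match returns 1 when the pattern matches and 0 when it does not.

```php
<?php
// Each entry maps a short description to the result of one preg_match call.
$checks = [
    'dot matches any single character' => preg_match('/^a.c$/', 'abc'),
    'caret anchors at the start'       => preg_match('/^Hello/', 'Hello world'),
    'dollar anchors at the end'        => preg_match('/world$/', 'Hello world'),
    'caret inside brackets negates'    => preg_match('/^[^0-9]+$/', 'abc'),
    'backslash-s matches whitespace'   => preg_match('/a\sb/', "a\tb"),
    'backslash-d matches a digit'      => preg_match('/^\d\d$/', '42'),
    'backslash-w matches word chars'   => preg_match('/^\w+$/', 'user_01'),
];

// Every check above is expected to match.
foreach ($checks as $label => $result) {
    echo $label, ': ', $result === 1 ? 'match' : 'no match', "\n";
}
```

Patterns are written in single quotes so that PHP passes sequences like \s and \d to the regex engine unchanged.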

Quantifiers
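Specify how many times the preceding item may repeat.
Symbol  Description
* The preceding item can appear zero or more times
+ The preceding item can appear one or more times
? The preceding item can appear zero or one time
() Defines a sub-pattern. The text matched by each sub-pattern is stored in the match array, and sub-patterns are numbered in order so they can be referenced later.
For example, /The h([0-9]) means Title (\1)/ matches 'The h1 means Title 1', 'The h2 means Title 2' ...
[] Matches any one of the characters inside the brackets. Use - for a range of consecutive characters, for example /[a-z]/, /[0-9]/. Note that [] stands for a single character: /[abc]/ means 'a' or 'b' or 'c', not 'abc'.
/ab[cd]e/
matches both abce and abde

/ab[c-e\d]/
matches abc, abd, abe, and ab followed by any digit
{n,m} The preceding item can appear at least n times and no more than m times.
The maximum can be omitted ({n,}) to set a minimum with no upper limit. For a maximum with no minimum, write {0,m}, since not every engine accepts {,m}. The two cannot both be omitted.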
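
The quantifiers, sub-patterns, and character classes above can be tried out with a short PHP sketch like the following. The test strings and variable names are invented for illustration only.

```php
<?php
// Quantifiers: each call returns 1 (match) or 0 (no match).
$zeroOrMore = preg_match('/^ab*c$/', 'ac');      // * allows zero b's
$oneOrMore  = preg_match('/^ab+c$/', 'ac');      // + needs at least one b
$optional   = preg_match('/^colou?r$/', 'color'); // ? makes the u optional
$bounded    = preg_match('/^\d{2,4}$/', '12345'); // five digits exceed {2,4}

// Sub-pattern with a backreference: \1 must repeat what ([0-9]) captured.
$pattern = '/The h([0-9]) means Title (\1)/';
preg_match($pattern, 'The h2 means Title 2', $m);  // $m[1] and $m[2] hold '2'
$mismatch = preg_match($pattern, 'The h1 means Title 2'); // digits differ

// Character class: keep only the strings that fit /ab[c-e\d]/ exactly.
$candidates = ['abc', 'abd', 'abe', 'ab7', 'abf', 'abcd'];
$classHits = array_values(preg_grep('/^ab[c-e\d]$/', $candidates));

var_dump($zeroOrMore, $oneOrMore, $optional, $bounded);
var_dump($m, $mismatch, $classHits);
```

The ^ and $ anchors are added here so that each pattern has to cover the whole string; without them, 'abcd' would also count as containing a match for /ab[c-e\d]/.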


Example

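Validation

`^[a-zA-Z0-9]+$` Letters and digits only, no special characters. The leading ^ and the + are needed: `[a-zA-Z0-9]$` on its own only checks the last character.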
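
A minimal sketch of how the alphanumeric check could be wrapped in PHP. The function name isAlphanumeric is hypothetical, and the use of \z instead of $ is a choice made for this example, not something the pattern above requires.

```php
<?php
// Hypothetical helper: true only if $value is made up entirely of
// ASCII letters and digits (and is not empty).
function isAlphanumeric(string $value): bool
{
    // \z anchors at the very end of the string; a plain $ would also
    // accept a single trailing newline.
    return preg_match('/^[a-zA-Z0-9]+\z/', $value) === 1;
}

var_dump(isAlphanumeric('abc123'));   // bool(true)
var_dump(isAlphanumeric('abc-123'));  // bool(false)
var_dump(isAlphanumeric(''));         // bool(false)

// For comparison: without ^ and +, only the last character is checked.
var_dump(preg_match('/[a-zA-Z0-9]$/', '!!!a'));  // int(1)
```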

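Extracting values

  • Get an HTML tag's content or an attribute
    // Get the text between an opening and a closing tag
    function getTextBetweenTags($string, $tagname) {
        $tag = preg_quote($tagname, '/');
        // [^>]* skips any attributes; (.*?) is lazy, so it stops at the first closing tag
        $pattern = "/<$tag\b[^>]*>(.*?)<\/$tag>/is";
        if (preg_match($pattern, $string, $matches)) {
            return $matches[1];
        }
        return false;
    }
    
    // Get an attribute value from an HTML tag
    function getAttribute($attrib, $tag) {
        // The value may be quoted ('...' or "...") or unquoted
        $re = '/' . preg_quote($attrib, '/') . '=([\'"])?((?(1).+?|[^\s>]+))(?(1)\1)/is';
        if (preg_match($re, $tag, $match)) {
            return urldecode($match[2]);
        }
        return false;
    }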
    
  • Host-related
    // get host name from URL (http or https)
    preg_match('@^(?:https?://)?([^/]+)@i',
        "http://www.php.net/index.html", $matches);
    $host = $matches[1];
    
    // get last two segments of host name
    preg_match('/[^.]+\.[^.]+$/', $host, $matches);
    echo "domain name is: {$matches[0]}\n";

  • Replacing
    Convert bracket tags such as [b] and [i] to HTML tags (PHP)
    $subjects['body'] = "[b]Make Me Bold![/b]";
    $subjects['subject'] = "[i]Make Me Italics![/i]";
    $regex[] = "@\[b\](.*?)\[/b\]@i";
    $regex[] = "@\[i\](.*?)\[/i\]@i";
    $replacements[] = "<b>$1</b>";
    $replacements[] = "<i>$1</i>";
    $results = preg_replace($regex, $replacements, $subjects);
    var_dump($results);
    
    // output ---
    array(2) {
      ["body"]=>
      string(20) "<b>Make Me Bold!</b>"
      ["subject"]=>
      string(23) "<i>Make Me Italics!</i>"
    }
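
When capturing text between delimiters, as the extraction and replacement examples above do, it matters whether the capture is greedy, (.*), or lazy, (.*?). The following sketch shows the difference on a made-up string with two tags on one line.

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* runs on to the LAST closing tag in the string.
preg_match('/<b>(.*)<\/b>/', $html, $greedy);

// Lazy: .*? stops at the FIRST closing tag.
preg_match('/<b>(.*?)<\/b>/', $html, $lazy);

// preg_match_all with the lazy form collects the content of every tag.
preg_match_all('/<b>(.*?)<\/b>/', $html, $all);

var_dump($greedy[1]); // "one</b> and <b>two"
var_dump($lazy[1]);   // "one"
var_dump($all[1]);    // "one", "two"
```

For anything beyond simple, well-formed snippets, an HTML parser is usually a safer choice than a regular expression.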


  • Validation


