Professional PHP

PHP Programming, Web Development, PHP Advocacy and PHP Best Practices.

Two preg_replace Escaping Gotchas

November 13th, 2005

preg_replace is a major workhorse function in PHP. Unfortunately, there are some less-than-obvious issues with using it properly. Here are two:

First, the e modifier causes the replacement value of preg_replace (including backreferences) to be evaluated as PHP code. This is a powerful capability. If you’ve ever seen an SQL injection, this sounds dangerous. It would be, too, but PHP automatically escapes single quotes, double quotes, backslashes and NUL characters in any backreference before building the string to evaluate. So this is safe:

$input = '" . die() . "';
var_dump(preg_replace('|^(.*)$|e', '"\1"', $input));
// output: string(13) "" . die() . ""

However, if you use double quotes inside your replacement string, as in this example, then variable parsing is still active. This can lead to problems with syntax errors or with PHP variable values inserted into the string:

$password = 'secret';
$input = '$password';
var_dump(preg_replace('|^(.*)$|e', '"\1"', $input));
// output: string(6) "secret"

So obviously, we want to be careful to use single quotes to avoid variable parsing. However, we aren’t done with single quotes. preg_replace doesn’t know which quote style you use, so it escapes both of them. That means that if your input actually does contain a quote, your quote will gain an unwanted slash. How many times have you seen bad PHP code do that?

$input = '"';
var_dump(preg_replace('|^(.*)$|e', "'\\1'", $input));
// output: string(2) "\""
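
To see why these three cases behave differently, it can help to look at the code string that gets built before it is evaluated. The sketch below is an emulation rather than PHP's real implementation: build_e_code is a hypothetical helper, and it assumes that the escaping matches addslashes() and that the template contains only \1.

```php
<?php
// Hypothetical helper: builds the code string that the e modifier would
// evaluate for a template containing a single \1 backreference.
// Assumption: backreferences are escaped the same way addslashes() does it.
function build_e_code(string $template, string $captured): string
{
    return str_replace('\1', addslashes($captured), $template);
}

// Double-quoted template, hostile input: the quotes are escaped, so the
// whole thing remains one string literal.
echo build_e_code('"\1"', '" . die() . "'), "\n";  // "\" . die() . \""

// Double-quoted template, input with a dollar sign: addslashes() leaves $
// alone, so the literal interpolates the variable when evaluated.
echo build_e_code('"\1"', '$password'), "\n";      // "$password"

// Single-quoted template: the added backslash before " is not an escape
// sequence inside single quotes, so it survives evaluation.
echo build_e_code("'\\1'", '"'), "\n";             // '\"'
```

Nothing is evaluated here; the helper only prints the string that would be handed to eval.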

A naive solution might run the value through stripslashes to fix that, but if your input actually has a slash in it, it will be unexpectedly removed:

$input = '\\';
var_dump(preg_replace('|^(.*)$|e', "stripslashes('\\1')", $input));
// output: string(0) ""
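
The lost slash can be reproduced without the e modifier. Assuming the evaluated code is stripslashes('\\'), as the escaping rule implies, PHP's own string parser has already collapsed the doubled backslash before stripslashes runs, so the input is unescaped twice:

```php
<?php
// The generated code is stripslashes('\\'). PHP parses the single-quoted
// literal first, so the function only ever receives one backslash.
$argument = '\\';
var_dump(strlen($argument));       // int(1)
var_dump(stripslashes($argument)); // string(0) ""
```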

So what is the best solution? Well, in my book, it is to use preg_replace_callback and avoid the e modifier altogether. This has the dual advantage of avoiding all the escaping issues and also not triggering an eval on every call if you happen to be in a loop.
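
A minimal sketch of the callback approach follows. The wrap_match function and the bracket wrapping are made up for illustration. Note that the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current versions the callback form is the only one of the two that runs.

```php
<?php
// Hypothetical callback: returns the captured text wrapped in brackets.
function wrap_match(array $matches): string
{
    // $matches[1] holds the raw capture; it is neither escaped nor evaluated.
    return '[' . $matches[1] . ']';
}

$password = 'secret';
foreach (['"', '\\', '$password', '" . die() . "'] as $input) {
    var_dump(preg_replace_callback('|^(.*)$|', 'wrap_match', $input));
}
// output:
// string(3) "["]"
// string(3) "[\]"
// string(11) "[$password]"
// string(15) "[" . die() . "]"
```

Quotes, backslashes and dollar signs all pass through untouched, because the captured text is handed to the function as data instead of being pasted into code.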

Second, most users of the preg_ functions are familiar with preg_quote for escaping strings to use them as literals in regular expression patterns. However, many people don’t realize that the replacement parameter of preg_replace also has special characters:

$input = '$5 dollars';
$replacement = '$10';
var_dump(preg_replace('|^(.*) dollars$|', $replacement . ' dollars', $input));
// output: string(8) " dollars"

Where did the $10 go? Well, it got turned into backreference 10, which was empty. A naive solution would be to use preg_quote:

$input = '$5 dollars';
$replacement = '$10+$5';
var_dump(preg_replace('|^(.*) dollars$|', preg_quote($replacement) . ' dollars', $input));
// output: string(15) "$10\+$5 dollars"

But now we’ve got that spare slash that tells the world that this code ain’t quite right. The reason for this is that the characters that are special in the replacement value of preg_replace are not the same characters that are special in the pattern. So here is a solution:

function preg_replacement_quote($str) {
    return preg_replace('/(\$|\\\\)(?=\d)/', '\\\\\1', $str);
}

$input = '$5 dollars';
$replacement = '$10+$5';
var_dump(preg_replace('|^(.*) dollars$|', preg_replacement_quote($replacement) . ' dollars', $input));
// output: string(14) "$10+$5 dollars"

Now we get the expected output. Proper data handling is a good thing.
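
preg_replacement_quote only escapes a dollar sign or backslash that is directly followed by a digit. Two inputs appear to slip past that rule: the ${1} form of a backreference, and a literal backslash sitting right before a dollar sign. A broader variant is sketched below, on the assumption that escaping every backslash and every dollar sign is acceptable. The name preg_replacement_quote_all is hypothetical.

```php
<?php
// Hypothetical variant: escape every backslash and dollar sign, not only
// those followed by a digit, so that forms such as ${1} also stay literal.
function preg_replacement_quote_all(string $str): string
{
    // strtr() with an array never rescans text it has already replaced.
    return strtr($str, ['\\' => '\\\\', '$' => '\\$']);
}

$input = '$5 dollars';
$replacement = '${1} costs $10+$5';
var_dump(preg_replace('|^(.*) dollars$|', preg_replacement_quote_all($replacement) . ' dollars', $input));
// output: string(25) "${1} costs $10+$5 dollars"
```

When the replacement text is fully dynamic, preg_replace_callback is another option, since a callback's return value is inserted as-is and is not scanned for backreferences.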

Filed Under

  • PHP

16 Responses to “Two preg_replace Escaping Gotchas”

  1. Christian says:
    11/14/2005 at 1:59 am

    I wrote a small paper on how the “e” modifier can be abused by attackers a couple of weeks ago ( http://hauser-wenz.de/playground/papers/RegExInjection.pdf ). I was prompted to do that because I was discussing in some security talks this year whether upcoming attacks like “XPath Injection” are ridiculous or a real threat. I rather thought of the former, but then I found the “e” modifier in a real-world application I audited earlier this year, *ouch*.
    Nice examples, btw!

  2. Roan says:
    11/17/2005 at 2:04 am

    Sorry for unrelated comment, but thanks to your site I found out that someone samelessly stole article from my site. Not you :) You have “recent bookmarks” sidebar with link “Why is PHP a Pain? installing php applications” and linked article http://www.designbytim.com/blog/2005/10/27/22/ is actually stolen from my blog http://blog.enargi.com/programming/php/why-is-it-so-hard/ . Unbeliveable. I never saw someone actually stealing articles and especially on the subject of professional PHP.

  3. AllThingsDev.com » preg_replace gotchas says:
    11/19/2005 at 6:44 pm

    [...] I’ve been reading Professional PHP for a few weeks now and I’m really enjoying it. It is one of the few blogs out there that actually writes about code and coding in general. For example, their latest post gives a quick overview of some gotchas with the php function preg_replace. Professional PHP gives some good pointers about how to remove those unwanted extra slashes and things of that nature. [...]

  4. SitePoint Blogs » The Joy of Regular Expressions [3] says:
    9/27/2006 at 11:42 am

    [...] Another read, specific to escaping regular expressions and the types of security holes you might fall into with preg_replace(), is Jeff’s explanation of two preg_replace() escaping gotcha’s, which describes the exact nature of the problem plus provides a solution to escaping replacement strings. [...]

  5. Benjamin A. Shelton | Blog » Blog Archive » Symfony 1.3/1.4 and Suhosin says:
    3/2/2010 at 10:03 pm

    [...] a good source on preg_replace, why you should always use single quotes, common mistakes, and why you should really just avoid [...]

  6. Readlf says:
    5/4/2010 at 7:18 am

    Can u please make function preg_replacement_quote for double quoted strings? (Mail is not fake!)

  8. Blake says:
    7/14/2011 at 11:15 am

    This is a very old blog entry, but I’d like to thank the author for the preg_replacement_quote function. Something like this should definitely be in the PHP library. Thanks again!

  14. Gleb says:
    1/13/2012 at 12:11 am

    Thank you!!!

  • About

    My name is Jeff Moore. I'm a PHP programmer living in San Francisco and working for a startup.
