Professional PHP

PHP Programming, Web Development, PHP Advocacy and PHP Best Practices.

Two preg_replace Escaping Gotchas

November 13th, 2005

preg_replace is a major workhorse function in PHP. Unfortunately, using it properly involves some less-than-obvious issues. Here are two:

First, the e modifier causes the replacement value of preg_replace (including backreferences) to be evaluated as PHP code. This is a powerful capability. If you’ve ever seen an SQL injection, this sounds dangerous. It would be, too, but PHP automatically escapes quotes and backslashes in any backreferences before building the string to evaluate. So this is safe:

$input = '" . die() . "';
var_dump(preg_replace('|^(.*)$|e', '"\1"', $input));
// output: string(13) "" . die() . ""

However, if you use double quotes inside your replacement string, as in this example, then variable parsing is still active. This can lead to syntax errors or to PHP variable values being inserted into the string:

$password = 'secret';
$input = '$password';
var_dump(preg_replace('|^(.*)$|e', '"\1"', $input));
// output: string(6) "secret"

So obviously, we want to be careful to use single quotes to avoid variable parsing. However, we aren’t done with single quotes. preg_replace doesn’t know which quote style you use, so it escapes both of them. That means that if your input actually does contain a quote, your quote will gain an unwanted backslash. How many times have you seen bad PHP code do that?

$input = '"';
var_dump(preg_replace('|^(.*)$|e', "'\\1'", $input));
// output: string(2) "\""

A naive solution might run the value through stripslashes to fix that, but if your input actually has a backslash in it, it will be unexpectedly removed:

$input = '\\';
var_dump(preg_replace('|^(.*)$|e', "stripslashes('\\1')", $input));
// output: string(0) ""

So what is the best solution? Well, in my book, it is to use preg_replace_callback and avoid the e modifier of preg_replace altogether. This has the dual advantage of avoiding all the escaping issues and also not triggering an eval on every call if you happen to be in a loop.
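
As a minimal sketch of that advice (the helper name wrap_in_brackets and the bracket-wrapping behavior are illustrative assumptions, not part of the examples above), the callback receives the captured text as plain data, so nothing is escaped and nothing is evaluated. It also runs on PHP 7 and later, where the e modifier was removed:

```php
<?php
// Hypothetical helper: wraps the whole input in square brackets.
// The callback gets the captured text as plain data; no eval, no added backslashes.
function wrap_in_brackets(string $input): string
{
    return preg_replace_callback(
        '|^(.*)$|',
        function (array $matches): string {
            return '[' . $matches[1] . ']';
        },
        $input
    );
}

$password = 'secret'; // never interpolated: the match is not parsed as PHP

// The four troublesome inputs from the examples above.
foreach (['" . die() . "', '$password', '"', '\\'] as $input) {
    echo wrap_in_brackets($input), "\n";
}
// prints:
// [" . die() . "]
// [$password]
// ["]
// [\]
```

Each input comes back unchanged inside the brackets: the quote gains no backslash, the backslash survives, and $password stays a literal string.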

Second, most users of the preg_ functions are familiar with preg_quote for escaping strings to use them as literals in regular expression patterns. However, many people don’t realize that the replacement parameter of preg_replace also has special characters:

$input = '$5 dollars';
$replacement = '$10';
var_dump(preg_replace('|^(.*) dollars$|', $replacement . ' dollars', $input));
// output: string(8) " dollars"

Where did the $10 go? Well, it got turned into backreference 10, which was empty because the pattern has only one capturing group. A naive solution would be to use preg_quote:

$input = '$5 dollars';
$replacement = '$10+$5';
var_dump(preg_replace('|^(.*) dollars$|', preg_quote($replacement) . ' dollars', $input));
// output: string(15) "$10\+$5 dollars"

But now we’ve got that stray backslash that tells the world that this code ain’t quite right. The reason for this is that the characters that are special in the replacement value of preg_replace are not the same characters that are special in the pattern. So here is a solution:

function preg_replacement_quote($str) {
    // Prefix a backslash to any $ or \ that is immediately followed by a digit.
    return preg_replace('/(\$|\\\\)(?=\d)/', '\\\\\1', $str);
}

$input = '$5 dollars';
$replacement = '$10+$5';
var_dump(preg_replace('|^(.*) dollars$|', preg_replacement_quote($replacement) . ' dollars', $input));
// output: string(14) "$10+$5 dollars"

Now we get the expected output. Proper data handling is a good thing.
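
One case the digit lookahead in preg_replacement_quote does not cover is the braced form ${1}, which preg_replace also reads as a backreference. A broader variant, sketched here on the assumption that escaping every backslash and dollar sign is acceptable (the name preg_replacement_quote_all is hypothetical), relies on preg_replace turning an escaped backslash or escaped dollar sign in the replacement back into the literal character:

```php
<?php
// Hypothetical variant: escape every backslash and dollar sign so that
// none of \n, $n or ${n} is read as a backreference in the replacement.
function preg_replacement_quote_all(string $str): string
{
    // strtr with an array works in one pass, so inserted backslashes are not re-escaped.
    return strtr($str, ['\\' => '\\\\', '$' => '\\$']);
}

$input = '$5 dollars';
foreach (['$10+$5', '${1}0', '\\1'] as $replacement) {
    echo preg_replace(
        '|^(.*) dollars$|',
        preg_replacement_quote_all($replacement) . ' dollars',
        $input
    ), "\n";
}
// prints:
// $10+$5 dollars
// ${1}0 dollars
// \1 dollars
```

The trade-off is a few more backslashes in the intermediate string than the lookahead version produces; the final output is the same for the $10+$5 example.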


16 Responses to “Two preg_replace Escaping Gotchas”

  1. Christian says:
    11/14/2005 at 1:59 am

    I wrote a small paper on how the “e” modifier can be abused by attackers a couple of weeks ago ( http://hauser-wenz.de/playground/papers/RegExInjection.pdf ). I was prompted to do that because I was discussing in some security talks this year whether upcoming attacks like “XPath Injection” are ridiculous or a real threat. I rather thought of the former, but then I found the “e” modifier in a real-world application I audited earlier this year, *ouch*.
    Nice examples, btw!

  2. Roan says:
    11/17/2005 at 2:04 am

    Sorry for the unrelated comment, but thanks to your site I found out that someone shamelessly stole an article from my site. Not you :) You have a “recent bookmarks” sidebar with the link “Why is PHP a Pain? installing php applications”, and the linked article http://www.designbytim.com/blog/2005/10/27/22/ is actually stolen from my blog http://blog.enargi.com/programming/php/why-is-it-so-hard/ . Unbelievable. I never saw someone actually stealing articles, and especially on the subject of professional PHP.

  3. AllThingsDev.com » preg_replace gotchas says:
    11/19/2005 at 6:44 pm

    [...] I’ve been reading Professional PHP for a few weeks now and I’m really enjoying it. It is one of the few blogs out there that actually writes about code and coding in general. For example, their latest post gives a quick overview of some gotchas with the php function preg_replace. Professional PHP gives some good pointers about how to remove those unwanted extra slashes and things of that nature. [...]

  4. SitePoint Blogs » The Joy of Regular Expressions [3] says:
    9/27/2006 at 11:42 am

    [...] Another read, specific to escaping regular expressions and the types of security holes you might fall into with preg_replace(), is Jeff’s explanation of two preg_replace() escaping gotcha’s, which describes the exact nature of the problem plus provides a solution to escaping replacement strings. [...]

  5. Benjamin A. Shelton | Blog » Blog Archive » Symfony 1.3/1.4 and Suhosin says:
    3/2/2010 at 10:03 pm

    [...] a good source on preg_replace, why you should always use single quotes, common mistakes, and why you should really just avoid [...]

  6. Readlf says:
    5/4/2010 at 7:18 am

    Can u please make function preg_replacement_quote for double quoted strings? (Mail is not fake!)

  8. Blake says:
    7/14/2011 at 11:15 am

    This is a very old blog entry, but I’d like to thank the author for the preg_replacement_quote function. Something like this should definitely be in the PHP library. Thanks again!

  14. Gleb says:
    1/13/2012 at 12:11 am

    Thank you!!!

  • About

    My name is Jeff Moore. I'm a PHP programmer living in San Francisco and working for a startup.
