There seems to be much interest lately in input filtering in PHP, especially for cross-site scripting prevention. I’ve always preferred input validation to input filtering, but I am giving filtering a fresh look. My problem with filtering is usability. The comments to this post are a good example. There are obviously some usability issues going on here.
I think the fundamental problem with input filtering and especially XSS filtering is that it violates the principle of least surprise. User input is silently modified without the user’s knowledge. If the violation is innocent, then the software surprises the user. This is bad. At least with validation, the user gets a heads up on the problem.
Let me try to name and enumerate some scenarios:
Direct Filter
This is what WordPress did in the example post. It simply accepted the user input and silently changed it. The filtered value is stored directly in the database. The original input is lost. There is no preview. I think this has to be the usability worst-case scenario.
Filter with Preview
This scenario adds a preview capability to the last. The filter is still applied. A validation failure or explicit preview button causes the form values to be re-displayed and a preview panel to be shown. However, the previous input value is silently modified and sent back to the user. The user may or may not realize that his original input has been changed during the round trip.
This also seems like a usability problem, but every once in a while it happens to me when entering legitimate input into professionally written programs.
Filter with Buffered Preview
This scenario adds an additional buffer to the last. The filter is applied, but the original input is sent back to the user in the form field. However, the preview panel shows the modified value.
I don’t really see this very often outside of fields with a dedicated markup language (for example BBCode).
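A minimal sketch of the buffered round trip might look like the following. The function names `filter_comment()` and `buffered_preview()` are hypothetical, and `strip_tags()` is only a stand-in for whatever filter the application really applies. The point is that the form field and the preview panel are fed from two different values.

```php
<?php
// Hypothetical filter; strip_tags() stands in for the application's real filter.
function filter_comment(string $raw): string
{
    return trim(strip_tags($raw));
}

// Build both values needed to redisplay the form after a preview request.
function buffered_preview(string $raw): array
{
    return [
        // Goes back into the form field: the user's own text, escaped for HTML only.
        'field_value'  => htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'),
        // Goes into the preview panel: filtered first, then escaped.
        'preview_html' => htmlspecialchars(filter_comment($raw), ENT_QUOTES, 'UTF-8'),
    ];
}

$result = buffered_preview('Hello <b>world</b>');
```

Here the field still contains the markup the user typed, while the preview shows what would actually be stored.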
Filter with Forced Preview
The input value is silently filtered. However, the user is forced to preview the output at least once. It’s up to the user to notice the results of the filter.
I think Slashdot does this.
Filter with Confirmation
A stricter variation of Forced Preview where, as the last stage, the user must confirm their input once without the ability to change it. It is up to the user to notice the results of the filter.
I think this is popular as the last stage of a wizard style interface.
Filter with Confirmation and Warning
The filter is applied and the user’s input is changed; however, the user is warned exactly which value was changed by the filter.
I don’t think I’ve ever seen this one.
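Even though I have not seen it in the wild, the warning step seems easy enough to sketch: run the filter, compare each result with the original, and collect the fields that differ. The helper name `filter_with_warnings()` is made up for illustration, and `strip_tags` is just an example filter passed in as a callable.

```php
<?php
// Apply $filter to every field and report which fields it changed.
function filter_with_warnings(array $input, callable $filter): array
{
    $filtered = [];
    $warnings = [];
    foreach ($input as $name => $value) {
        $clean = $filter($value);
        $filtered[$name] = $clean;
        if ($clean !== $value) {
            // Keep both versions so the confirmation page can show them side by side.
            $warnings[$name] = ['original' => $value, 'filtered' => $clean];
        }
    }
    return ['filtered' => $filtered, 'warnings' => $warnings];
}

$result = filter_with_warnings(
    ['name' => 'Jeff', 'comment' => 'Nice <script>alert(1)</script> post'],
    'strip_tags'
);
```

Only the `comment` field ends up in `warnings`, so the confirmation page can point at exactly that value.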
Validation
The program notifies the user that the input value is bad, but does not modify it. The user must change the value to proceed.
I tend to use this one. I escape all output, so I don’t worry too much about displaying XSS in the preview panel.
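Roughly, that combination looks like this. The rule in `validate_display_name()` is a hypothetical example, not a recommendation, and `e()` is just a short wrapper I am assuming around `htmlspecialchars()`. The input is inspected but never modified; it is only escaped at the moment it is written back into HTML.

```php
<?php
// Hypothetical rule: a display name is 1 to 50 bytes and contains no angle brackets.
// Returns a list of error messages; an empty list means the value is acceptable.
function validate_display_name(string $value): array
{
    $errors = [];
    if ($value === '') {
        $errors[] = 'Please enter a name.';
    } elseif (strlen($value) > 50) {
        $errors[] = 'Names are limited to 50 characters.';
    }
    if (strpbrk($value, '<>') !== false) {
        $errors[] = 'Names may not contain < or >.';
    }
    return $errors;
}

// Escape a value for output in an HTML context.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$input     = '<script>alert(1)</script>';
$errors    = validate_display_name($input); // the user is told what is wrong
$redisplay = e($input);                     // the original text, safe to echo back
```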
Obviously, you can mix and match scenarios for different input rules and fields. I’m sure there are other scenarios. Please suggest some.
I guess I’ve been programming for about 23 years now. The longer I do it, the more reluctant I am to be strict with user input. Ultra-sanitized, ultra-structured data may seem attractive to the programmer, but it’s a pain for the user and it’s only a matter of time before a legitimate exception comes along. A European phone number, the 51st state, a Canadian postal code, a new millennium, etc. The exception is the rule. Understandably, XSS must be prevented, but it’s easy to go too far.
Which of these scenarios do you think are best from the user’s perspective? From the programmer’s perspective?
Hmm, interesting issue. I guess I like the Filter with Buffered Preview best. You apply the filter, leave the user’s original input alone, and let him learn from his mistakes: by editing the original input he works out how to get it right in one go the next time. This way the user is aware of the filter and will even learn how it works. Software that quite painlessly educates the user on how to use it. Sounds great!
I’m a little confused by “Filter with Confirmation.” Why would you bother showing somebody a preview if they can’t change it? What would the point of the preview step be then?
Seth, the only thing I can think of is that they can at least cancel if the output is completely different from what they wanted.
Marco, consider that for infrequently used applications, especially web applications, any learning forced on the user is not painless. A user may never have seen your software before, and perhaps will never again. However, I do think “Filter with Buffered Preview” is a good option for infrequent use.
User input is not disturbed. No surprises there. One has to assume if the user hits the preview button that the user will inspect the resulting preview panel for correctness. The only chance for surprise is if the user chooses to directly submit without a preview.
The reason “Filter with Confirmation” is stricter than “Forced Preview” is that with forced preview you could have the following sequence of events:
1. Fill out initial form.
2. Click preview (no other option available)
3. Change form values
4. Click Submit.
There is still chance for surprise here because there was no preview required after form values were changed in step 3. “Filter with Confirmation” requires at least one stage at the end where there is no chance that a user change might result in a surprise filtering.
I think this option works best with infrequently used forms. (Like wizards, surprise, surprise.) Requiring confirmation in commonly used scenarios just trains users to ignore and become annoyed with the confirmation stage.
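One way to enforce the difference on the server, sketched under my own assumptions: remember a fingerprint of the values that were last previewed (in the session, say) and accept a submit only when the submitted values produce the same fingerprint. The names `fingerprint()` and `can_submit()` are hypothetical.

```php
<?php
// Stable fingerprint of a set of form values, independent of key order.
function fingerprint(array $values): string
{
    ksort($values);
    return hash('sha256', json_encode($values, JSON_THROW_ON_ERROR));
}

// A submit is allowed only if these exact values were previewed last.
function can_submit(array $submitted, ?string $previewedFingerprint): bool
{
    return $previewedFingerprint !== null
        && hash_equals($previewedFingerprint, fingerprint($submitted));
}

// Step 2 of the sequence above: the user previews.
$previewed = fingerprint(['comment' => 'first draft']);

// Steps 3 and 4: the user changes the form and submits without previewing again.
$allowed_after_edit = can_submit(['comment' => 'second draft'], $previewed);
$allowed_unchanged  = can_submit(['comment' => 'first draft'], $previewed);
```

With this check, the edited submission is sent back through the preview, while the unchanged one goes through.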
Filter with Buffered Preview, then a twist.
I’m not sure how relevant this is to the topic. I take user input in the style of ‘tell us if you see an error on this page’, and people like to swear and hit “send”. Too tempting for them, see?
So I have a profanity filter switched on to catch all the rude words and spare the admin users’ blushes.
So we ALLOW the users to swear, show them their naughty input, and they continue their sad lives thinking they have struck one for liberty. In fact I comment out all the bad chars before storing it. Without that confirmation they figure out other ways of getting their message across (e.g. separating letters with s p a c e s).
So here’s a case where you would want to do hidden filtering, and yet show the user their original comment.
As ever in the UK, this won’t work in Scunthorpe.
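To make the Scunthorpe point concrete, here is a small sketch with a deliberately mild, hypothetical word list. `mask_substrings()` blanks any occurrence, even inside an innocent word, while `mask_whole_words()` matches on word boundaries. Neither function is taken from the commenter's code.

```php
<?php
// Hypothetical word list, kept mild on purpose.
const BANNED_WORDS = ['darn', 'hell'];

// Naive approach: replaces the letters wherever they occur.
function mask_substrings(string $text): string
{
    foreach (BANNED_WORDS as $word) {
        $text = str_ireplace($word, str_repeat('*', strlen($word)), $text);
    }
    return $text;
}

// Word-boundary approach: only replaces the banned word when it stands alone.
function mask_whole_words(string $text): string
{
    foreach (BANNED_WORDS as $word) {
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        $text = preg_replace($pattern, str_repeat('*', strlen($word)), $text);
    }
    return $text;
}

$original      = 'The shell script is broken, darn it';
$shown_to_user = $original;                   // the user sees the comment unchanged
$stored        = mask_whole_words($original); // the admin sees the masked copy
$naive         = mask_substrings($original);  // damages the innocent word "shell"
```

Word boundaries avoid the false positive, but they do nothing against the s p a c e d out variant mentioned above.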
The best solution is none of the above; it is to treat variables as attached to a certain context.
When changing context (sending them to the DB, using them in the shell, or outputting them in an XHTML page), make a context transition (usually by escaping control characters).
Something that smells bad about all the above solutions is that if you build a site for storing XSS attacks, information about them or whatever, all the above alternatives would flag its legitimate content as threatening input.
Another issue is that although most of these methods would catch a single potentially dangerous variable, they would not catch an exploit split across two variables that appear near one another on the same page (you could open a tag in the first value and leave one of its attributes unclosed, then close it in the second value, completing the code needed for the exploit).
Remember that good XHTML uses only double quotes ("") as attribute delimiters, yet browsers still accept single quotes (''), so even in new, well-built XHTML-compliant applications it is trivial to do such an exploit…
BTW, I am totally for XHTML; I just think the way to security is more along the context evaluation route…
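As a rough illustration of the context idea, the same raw value can be carried around untouched and escaped differently at each transition. The sketch below uses only standard PHP functions (`htmlspecialchars()`, `rawurlencode()`, `escapeshellarg()`); the variable names and the sample value are mine.

```php
<?php
// The raw value is stored and passed around unmodified.
$raw = "O'Reilly <b>& sons</b>";

// Transition to an HTML/XHTML context. ENT_QUOTES escapes both single and
// double quotes, so the value cannot close an attribute with either style.
$for_html = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Transition to a URL component context.
$for_url = 'search.php?q=' . rawurlencode($raw);

// Transition to a shell argument context (exact quoting is platform dependent).
$for_shell = 'echo ' . escapeshellarg($raw);
```

For the database transition I would reach for bound parameters (for example PDO prepared statements) rather than escaping by hand; that is left out here only to keep the sketch free of a database connection.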
Filtering and validating are the same. Filtering is the formal term that has been adopted by the security community, but there are many synonyms in common lore: validating, cleaning, scrubbing, etc.
There is a security principle that says that we should never modify invalid data in order to make it valid. Therefore, filtering (by any name) refers to an inspection process. This is why you’ll never see me recommending the use of strip_tags().
All XSS and SQL injection vulnerabilities I’ve seen are due to a failure to escape output. I’m glad to see a heightened interest in filtering input, but I hope it doesn’t distract people from the equally important step of escaping output.
[...] That said, Jeff raises good points on the ins and outs of input filtering here and it’s worth bearing in mind his closing remark: I guess I’ve been programming for about 23 years now. The longer I do it, the more reluctant I am to be strict with user input. Ultra-sanitized, ultra-structured data may seem attractive to the programmer, but it’s a pain for the user and it’s only a matter of time before a legitimate exception comes along. A European phone number, the 51st state, a Canadian postal code, a new millennium, etc. The exception is the rule. Understandably, XSS must be prevented, but it’s easy to go too far. [...]
To make life easy for moderators, you can use a script to highlight anything that might be an issue. I am working on this for my own site.
I have to disagree with Chris Shiflett. Validation is YES or NO; there is no in between. Filtering can do a combination of things, such as:
a) allow everything
b) allow certain things (white list)
c) deny certain things (black list)
d) change certain things
Validating and filtering are comparable to a lens cap and a filter on a camera.
-Frank
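The distinction drawn in this comment can be sketched in a few lines. All four function names and the username rule are hypothetical examples: the validator only answers yes or no, while the three filters return a (possibly changed) value.

```php
<?php
// Validation: a yes/no answer, the value itself is never touched.
function is_valid_username(string $value): bool
{
    return preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $value) === 1;
}

// Filter (b): white list, keep only the allowed characters.
function keep_allowed(string $value): string
{
    return preg_replace('/[^A-Za-z0-9_]/', '', $value);
}

// Filter (c): black list, remove only the denied characters.
function drop_denied(string $value): string
{
    return str_replace(['<', '>'], '', $value);
}

// Filter (d): change certain things, here whitespace and letter case.
function normalize(string $value): string
{
    return strtolower(trim($value));
}

$input = ' Bob<script> ';
$valid       = is_valid_username($input);
$whitelisted = keep_allowed($input);
$blacklisted = drop_denied($input);
$normalized  = normalize($input);
```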
I always wondered how forms are spoofed and after reading various articles on your site it has made me more wise, I am now implementing data filtering on all of my forms.
Thanks again for explaining this in a way that is easy to understand, well done.