User-Provided Data: Scrubbing & Displaying Thereof
I'm in the middle of completely rebuilding an old application from the ground up, and I'd like some feedback on handling user-provided data securely and on what methods people are using these days. In past applications I've always used CFQUERYPARAM on my database queries, verified the length of all data provided by the user, and run each form field through a custom tag that, rather harshly, checked for characters commonly used in SQL injection strings and cross-site scripting attacks. This time around I'd like to maintain, if not increase, the security of my application when it comes to user-provided data, but without hindering or upsetting users by blocking so many "suspect" characters.
Of course, about half of the user data I get can be scrubbed easily enough: e-mail addresses, passwords, telephone numbers.
Where I've run into problems in the past is with the more free-form fields such as names, addresses, search criteria, and "comments" boxes, all of which could legitimately contain characters such as ', /, %, --, and the like. In the past I've rejected these and many other characters outright, but I wonder if I've been too harsh.
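To make the baseline concrete, here is a rough sketch of the two checks I'd want to keep no matter what: a length check plus a parameterized query. It uses Python's sqlite3 module as a stand-in for CFQUERYPARAM, and the table, column names, and MAX_NAME_LENGTH are made up for illustration. The point is that bound parameters are passed as data, so the "suspect" characters are stored as typed.

```python
import sqlite3

MAX_NAME_LENGTH = 50  # hypothetical column width, not from a real schema


def save_comment(conn, name, body):
    """Length-check the name, then insert using bound parameters."""
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("name is longer than the column allows")
    # The ? placeholders keep user data out of the SQL text itself.
    conn.execute(
        "INSERT INTO comments (name, body) VALUES (?, ?)",
        (name, body),
    )


def demo():
    """Store a row full of 'suspect' characters and read it back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (name TEXT, body TEXT)")
    save_comment(conn, "O'Brien", "50% off -- see /deals'; DROP TABLE comments; --")
    return conn.execute("SELECT name, body FROM comments").fetchall()
```

If that holds in general, the character blacklist would only be doing work for the XSS side of things, not for SQL injection.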
So, I'm curious to hear some opinions on the much-discussed topics of SQL injection and XSS and how they relate to user-provided data.
What characters are you checking for in user data? How do you go about handling them? Escape them? Remove them? Show the user an error?
Some things I've heard suggested in the past, and I'm curious whether people actually do them:
01) htmleditformat() every piece of user data as it is output as text within a page.
02) htmleditformat() every piece of user data as it is inserted into the database. An issue with this is that htmleditformat() translates characters into longer HTML entities (for example, < becomes &lt;), which often pushes user input beyond the MAXLENGTH of the given form/database field, and the customer won't understand why.
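Here is a small sketch of the length problem in 02), and of why 01) doesn't have it. It uses Python's html.escape as a stand-in for htmleditformat() (the exact set of escaped characters differs a little between the two), and MAX_LENGTH is a made-up field width.

```python
import html

MAX_LENGTH = 20  # hypothetical MAXLENGTH of the form/database field


def escape_for_output(value):
    """Escape HTML special characters, as one would at display time."""
    return html.escape(value)


def fits_column(value, max_length=MAX_LENGTH):
    """Return True if the value fits within the field width."""
    return len(value) <= max_length


raw = 'Tom & "Jerry" <3'

# Option 01: store the raw text and escape only when rendering the page.
# The stored value keeps its original length.
stored_raw = raw

# Option 02: escape before storing. Each special character grows into
# an entity, so the stored value can outgrow the field.
stored_escaped = escape_for_output(raw)

print(len(stored_raw), fits_column(stored_raw))
print(len(stored_escaped), fits_column(stored_escaped))
```

With option 01) the user's 16 characters stay 16 characters in the database and the entities only exist in the rendered page, which is what makes me lean that way.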
