Last updated by 5 years ago

Page: Property Encodings Proposal, Version:0

The problem

Every well-coded web application needs to re-encode data in the view to avoid clashes with forbidden characters or escape sequences in the content format being rendered. The JIRA issue related to this feature is GRAILS-392.

Typically for HTML content this means that form and link parameters must be "URL encoded": reserved and non-ASCII characters are escaped using the %nn notation, spaces (SPC) are replaced with "+", and so on. It also means that data within form fields needs to be HTML escaped to prevent HTML injection attacks, which represent a security/phishing risk and, at a less scandalous level, can simply break the rendering of a page. For example:

<input name="productDescription" value="${productDescription}">
...will go bad if ${productDescription} contains a double-quote.
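To make the failure concrete, here is a minimal Groovy sketch of the same input rendered with and without escaping. The escapeHtml helper is hypothetical (it is not a Grails API) and only covers the characters that matter most in HTML text and attribute values.

```groovy
// Hypothetical helper, for illustration only: escape the characters that are
// significant in HTML text and attribute values.
String escapeHtml(Object value) {
    String s = value.toString()
    s = s.replace('&', '&amp;')   // must run first so later entities are not double-escaped
    s = s.replace('<', '&lt;')
    s = s.replace('>', '&gt;')
    s = s.replace('"', '&quot;')
    s = s.replace("'", '&#39;')
    return s
}

def productDescription = 'A 19" monitor'

// Unescaped: the quote inside the value terminates the attribute early.
def broken = "<input name=\"productDescription\" value=\"${productDescription}\">"

// Escaped: the quote is emitted as an entity and the attribute stays intact.
def safe = "<input name=\"productDescription\" value=\"${escapeHtml(productDescription)}\">"

println broken
println safe
```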

There are many other re-encodings of model data that may be used commonly:

  • URL Encoding
  • HTML Escaping
  • Localization of numbers and dates (the controller should not need to do this)
  • MD5 Digesting
  • SHA-1 Digesting
  • XML Escaping (without assuming HTML entities are valid)
  • Base64
  • UUEncoding
  • JavaScript compatible escaping
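Several of these encodings are already available from the JDK and the Groovy GDK as helper calls. The following sketch shows a few of them applied to one sample value; it assumes Java 8 or later (for java.util.Base64) and is only meant to illustrate how much call-site noise such helpers introduce when used directly.

```groovy
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

def text = 'hello world'
byte[] utf8 = text.getBytes(StandardCharsets.UTF_8)

// URL encoding (form style): spaces become "+", unsafe bytes become %nn escapes.
def urlEncoded = URLEncoder.encode(text, 'UTF-8')

// Base64 of the UTF-8 bytes.
def base64 = Base64.getEncoder().encodeToString(utf8)

// MD5 and SHA-1 digests, rendered as lowercase hex via the GDK encodeHex() method.
def md5 = MessageDigest.getInstance('MD5').digest(utf8).encodeHex().toString()
def sha1 = MessageDigest.getInstance('SHA-1').digest(utf8).encodeHex().toString()

println "URL:    ${urlEncoded}"
println "Base64: ${base64}"
println "MD5:    ${md5}"
println "SHA-1:  ${sha1}"
```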

Considerations

All of these encodings could be achieved by calling static methods of helper classes, and indeed some of these already exist. However, this can soon make GSP pages and other view templates rather ugly by increasing code noise.

It should be easy to add application-specific encodings aside from those supplied out of the box by Grails.

It must be possible to obtain alternative encodings of any property of any model object, irrespective of property nesting depth.

Possible solutions

Please add proposed solutions below.

Option 1

Add dynamic methods to all objects - either throughout Grails or just while in the view evaluation:

def encode(String codecName)
def encode(String codecName, def params) // For any encoding-related params, e.g. locale, key lengths etc.

Usage in GSP for example:

<a href="/edit?id=${product.id.encode('URLEncode')}">Edit</a>

<input name="productDescription" value="${product.description.encode('HTMLEscape')}"/>

The implementation of these would look up the codecName against a list of registered codecs which are simply classes with a single method:

def encode(def obj, def propertyValue, def params)

Side effects: I think that any Java code (e.g. JSP) trying to access these methods would have trouble.
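As a rough illustration of Option 1, the sketch below registers encode(...) on String through Groovy's ExpandoMetaClass. Everything here is an assumption made for the example: the codec class names, the codecs registry map, the charset parameter, and the choice to register on String only (the proposal targets all objects). The receiver is passed as both obj and propertyValue because the proposal does not say how the owning object would be tracked.

```groovy
// Hypothetical codec classes following the proposed single-method shape.
class URLEncodeCodec {
    def encode(def obj, def propertyValue, def params) {
        URLEncoder.encode(propertyValue.toString(), params?.charset ?: 'UTF-8')
    }
}

class HTMLEscapeCodec {
    def encode(def obj, def propertyValue, def params) {
        propertyValue.toString()
            .replace('&', '&amp;').replace('<', '&lt;')
            .replace('>', '&gt;').replace('"', '&quot;')
    }
}

// Hypothetical registry: codec name -> codec instance.
def codecs = [URLEncode: new URLEncodeCodec(), HTMLEscape: new HTMLEscapeCodec()]

// Shared lookup used by both overloads below.
def encodeWith = { value, String codecName, Map params ->
    def codec = codecs[codecName]
    if (codec == null) {
        throw new IllegalArgumentException("No codec registered as '" + codecName + "'")
    }
    codec.encode(value, value, params)
}

// Assumption: registered on String only, to keep the sketch small.
String.metaClass.encode = { String codecName -> encodeWith(delegate, codecName, null) }
String.metaClass.encode << { String codecName, Map params -> encodeWith(delegate, codecName, params) }

println 'a b&c'.encode('URLEncode')
println '<b>"x"</b>'.encode('HTMLEscape')
println 'a b'.encode('URLEncode', [charset: 'ISO-8859-1'])
```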

Option 2

Add dynamic methods to all objects in the style of GORM dynamic finder methods - either throughout Grails or just while in the view evaluation:

def encodeAsXXXXX()
def encodeAsXXXXX(def params)

Usage in GSP for example:

<a href="/edit?id=${product.id.encodeAsURL()}">Edit</a>

<input name="productDescription" value="${product.description.encodeAsHTML()}"/>

The implementation of these would take the XXXXX suffix of the method name as the codec name and look it up against a list of registered codecs, which are simply classes with a single method:

def encode(def obj, def propertyValue, def params)

Side effects: I think that any Java code (e.g. JSP) trying to access these methods would have trouble.

Caveats: We'd probably still need an encodeAs(String codecName) form to allow the encoding to be chosen at runtime depending on other factors, e.g.

${product.description.encodeAs(contentType == 'text/xml' ? 'XMLEscape' : 'HTMLEscape')}
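One way the encodeAsXXXXX() lookup could work is through Groovy's methodMissing hook, alongside the explicit encodeAs(String) form mentioned in the caveat. This is a sketch under assumptions: the codecs map, the closure-based codec shape and the String-only registration are invented for the example and are not taken from Grails.

```groovy
// Hypothetical registry keyed by the suffix that follows "encodeAs".
def codecs = [
    URL : { value, params -> URLEncoder.encode(value.toString(), 'UTF-8') },
    HTML: { value, params ->
        value.toString().replace('&', '&amp;').replace('<', '&lt;')
             .replace('>', '&gt;').replace('"', '&quot;')
    }
]

def lookup = { String codecName ->
    def codec = codecs[codecName]
    if (codec == null) {
        throw new IllegalArgumentException("No codec registered as '" + codecName + "'")
    }
    codec
}

// Explicit form, for codec names that are only known at runtime.
String.metaClass.encodeAs = { String codecName -> lookup(codecName).call(delegate, null) }

// Dynamic form: any otherwise unknown encodeAsXXXXX(...) call is routed here.
String.metaClass.methodMissing = { String name, args ->
    String prefix = 'encodeAs'
    if (!name.startsWith(prefix) || name.length() == prefix.length()) {
        throw new MissingMethodException(name, delegate.getClass(), args as Object[])
    }
    lookup(name.substring(prefix.length())).call(delegate, args ? args[0] : null)
}

println 'a b'.encodeAsURL()
println 'say "hi"'.encodeAsHTML()

def codecName = 'HTML'   // could depend on the content type being rendered
println '<tag>'.encodeAs(codecName)
```

The suffix lookup keeps call sites short, while encodeAs(codecName) covers the parameterized case without string-building a method name.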

Option 3

Dynamically add toXXXXX() methods based on registered encodings. Each encoding would return a discrete type so that code would be able to tell whether or not a value was already encoded and, if so, how:

EncodedObject toXXXXX()
EncodedObject toXXXXX(def params)

EncodedObject would be defined as:

abstract class EncodedObject {

    def value

    String toString() { value.toString() }
}

Grails would create a new subclass of EncodedObject for every encoding registered, named by convention after the encoding class, e.g. URLEncodingEncoder would result in a URLEncodingEncodedObject, and HTMLEscapeEncoder would result in an HTMLEscapeEncodedObject. Grails would, upon receiving an encoding request, create an instance of the appropriate XXXEncodedObject type, call the encoder and set the value property to the encoded value.

Usage in GSP for example:

<a href="/edit?id=${product.id.toURLEncoding()}">Edit</a>

<input name="productDescription" value="${product.description.toHTMLEscape()}"/>

It's plain to see that the naming of codec implementations could produce some awkward toXXXXX names.

The implementation of these would look up the codecName against a list of registered codecs which are simply classes with a single method:

def encode(def obj, def propertyValue, def params)

By having a discrete type for the encoded values we could have smarter usage where for example data may be encoded in different ways based on application logic:

def passwordDigest = userDetails.passwordDigest

switch (passwordDigest.class) {
    case MD5DigestEncodedObject: println 'Encoding: MD5'; break
    case SHA1DigestEncodedObject: println 'Encoding: SHA-1'; break
    case Base64EncodedObject: println 'Encoding: Base64'; break
}
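The typed-wrapper idea can be tried out end to end with hand-written subclasses. In this sketch the encoders map, the two wrapper subclasses and the String-only methodMissing hook are assumptions for illustration; the proposal itself has Grails generating the subclasses from the registered encoder names.

```groovy
import java.security.MessageDigest

abstract class EncodedObject {
    def value
    String toString() { value.toString() }
}

// Written by hand here; the proposal would generate one subclass per encoder.
class MD5DigestEncodedObject extends EncodedObject {}
class Base64EncodedObject extends EncodedObject {}

// Hypothetical registry keyed by the suffix that follows "to".
def encoders = [
    MD5Digest: [type: MD5DigestEncodedObject,
                encode: { String s -> MessageDigest.getInstance('MD5').digest(s.getBytes('UTF-8')).encodeHex().toString() }],
    Base64   : [type: Base64EncodedObject,
                encode: { String s -> s.getBytes('UTF-8').encodeBase64().toString() }]
]

// Route unknown toXXXXX() calls on String to the matching encoder.
String.metaClass.methodMissing = { String name, args ->
    def encoder = name.startsWith('to') ? encoders[name.substring(2)] : null
    if (encoder == null) {
        throw new MissingMethodException(name, delegate.getClass(), args as Object[])
    }
    def wrapper = encoder.type.getDeclaredConstructor().newInstance()
    wrapper.value = encoder.encode.call(delegate)
    wrapper
}

// The wrapper type tells later code how the value was encoded.
def describe = { digest ->
    switch (digest) {
        case MD5DigestEncodedObject: return 'MD5'
        case Base64EncodedObject: return 'Base64'
        default: return 'unknown'
    }
}

def passwordDigest = 'hello'.toMD5Digest()
println "${describe(passwordDigest)}: ${passwordDigest}"
println "${describe('hello'.toBase64())}: ${'hello'.toBase64()}"
```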

Side effects:

  1. I think that any Java code (e.g. JSP) trying to access these methods would have trouble.
  2. We lose original type information - unless we define EncodedObject as an interface, and even then we could still have trouble with final classes such as String.

Caveats:

  1. People might think the whole object is encoded if they, for example, call myBook.toURLEncoded(), which will actually only URL encode myBook.toString() - so the EncodedObject terminology, or the use of toXXXXX(), could be confusing.
  2. I'm not sure how useful knowing the encoding type is.

Option 4

This is the same as Option 2 but instead of dynamically adding encodeAsXXXX methods to all objects, we add them just to the objects that will need access to encodings… i.e. GSP (and JSP context?), Controllers and Services (?), and use them as if they were "global" functions.

Usage in GSP for example:

<a href="/edit?id=${encodeAsURL(product.id)}">Edit</a>

<input name="productDescription" value="${encodeAsHTML(product.description)}"/>

This is a more traditional approach, apart from the dynamic method naming which is very "Grailsy". However, the primary motivation for this alternative approach is that it may be more performant - intercepting method calls to GSP, controllers or services only, not to all objects in the application.
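A sketch of how such "global" functions might be exposed: only the class that needs encodings implements methodMissing, so no other object in the application is touched. EncodingSupport, ProductView and the CODECS map are hypothetical names used for this example only.

```groovy
// Hypothetical base class for objects that need encodings (views, controllers...).
class EncodingSupport {
    // Hypothetical registry keyed by the suffix that follows "encodeAs".
    static final Map CODECS = [
        URL : { value -> URLEncoder.encode(value.toString(), 'UTF-8') },
        HTML: { value ->
            value.toString().replace('&', '&amp;').replace('<', '&lt;')
                 .replace('>', '&gt;').replace('"', '&quot;')
        }
    ]

    // Only calls made on this object are intercepted.
    def methodMissing(String name, args) {
        Object[] argArray = args as Object[]
        def codec = name.startsWith('encodeAs') ? CODECS[name.substring('encodeAs'.length())] : null
        if (codec == null || argArray.length != 1) {
            throw new MissingMethodException(name, getClass(), argArray)
        }
        codec.call(argArray[0])
    }
}

// Stand-in for a view: it calls encodeAsXXXXX(value) as if it were a global function.
class ProductView extends EncodingSupport {
    String editLink(product) {
        def id = encodeAsURL(product.id)
        return '<a href="/edit?id=' + id + '">Edit</a>'
    }

    String descriptionField(product) {
        def description = encodeAsHTML(product.description)
        return '<input name="productDescription" value="' + description + '"/>'
    }
}

def view = new ProductView()
def product = [id: 'a b', description: 'A 19" monitor']
println view.editLink(product)
println view.descriptionField(product)
```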