#4105 closed bug (notabug)

Opened February 10, 2009 09:47AM UTC

Closed February 10, 2009 08:54PM UTC

Last modified October 11, 2012 09:15PM UTC

Be more liberal in accepting IDs

Reported by: brettz9
Owned by:
Priority: minor
Milestone:
Component: ui.core
Version: 1.6rc6
Keywords:
Cc:
Blocked by:
Blocking:
Description

In the core code I think:

quickExpr = /^[^<]*(<(.|\s)+>)[^>]*$|^#([\w-]+)$/

should be changed to be more liberal in accepting valid XML IDs:

quickExpr = /^[^<]*(<(.|\s)+>)[^>]*$|^#([:A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD][.0-9\u00B7\u0300-\u036F\u203F-\u2040-]*)$/

Note that I have not included characters in the range #x10000-#xEFFFF since, even when expressed with surrogates, these seem unable to be used as a range in JavaScript (and there are way too many in that range to list out).
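
A minimal sketch of the limitation described above. The variable names are illustrative only, and the last two statements assume an ES2015 or later engine, where the `u` flag exists; that flag was not available when this ticket was filed.

```javascript
// U+10000 is stored as two UTF-16 code units (a surrogate pair).
const first = '\uD800\uDC00';
const unitCount = first.length;

// Without the "u" flag a character class is read one code unit at a time,
// so the intended range U+10000-U+EFFFF is parsed as the three items
// \uD800, \uDC00-\uDB7F and \uDFFF. The middle range is out of order.
let legacyError = null;
try {
  new RegExp('[\\uD800\\uDC00-\\uDB7F\\uDFFF]');
} catch (e) {
  legacyError = e;
}

// Assumption: ES2015+ engine. With "u", code points above U+FFFF can be
// used directly as range endpoints.
const supplementaryRange = new RegExp('^[\\u{10000}-\\u{EFFFF}]$', 'u');
const acceptsFirst = supplementaryRange.test(first);
```

Here `legacyError` ends up holding a `SyntaxError`, which is why the range cannot simply be written out with surrogates in a pattern without the `u` flag.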

Per http://www.w3.org/TR/1999/REC-html401-19991224/types.html#type-name, HTML (unlike XML) does not allow ':' or '_' at the beginning of an ID, nor does it allow any of the Unicode ranges above, but it is otherwise the same. The above expression for XML would therefore still work for HTML; it would just be excessively liberal, as the existing regexp already is.

I have not attempted to narrow down the first alternative in the expression, but that at least is not overly restrictive.

If you don't care for this verbosity, I think you could at least add the period to the latter alternative:

quickExpr = /^[^<]*(<(.|\s)+>)[^>]*$|^#([\w.-]+)$/

since '.' is allowed in both HTML and XML IDs and might be more widely used. See:

http://www.w3.org/TR/1999/REC-html401-19991224/types.html#type-name

http://www.w3.org/TR/REC-xml/#id

http://www.w3.org/TR/REC-xml/#NT-Name

http://www.w3.org/TR/REC-xml/#NT-NameChar

thanks!
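
To make the difference between the existing expression and the period-tolerant one concrete, here is a small sketch. The two patterns are the ones quoted in the description; the `quickId` helper and the sample selectors are hypothetical and only serve the illustration.

```javascript
// Existing expression and the variant that also accepts '.' in an ID.
const currentExpr = /^[^<]*(<(.|\s)+>)[^>]*$|^#([\w-]+)$/;
const dottedExpr = /^[^<]*(<(.|\s)+>)[^>]*$|^#([\w.-]+)$/;

// Hypothetical helper: returns the captured ID (group 3), or null when the
// selector is not recognised as a plain "#id" string.
function quickId(expr, selector) {
  const match = expr.exec(selector);
  return match && match[3] ? match[3] : null;
}

const plain = quickId(currentExpr, '#main-nav');        // 'main-nav'
const dottedOld = quickId(currentExpr, '#section.1');   // null
const dottedNew = quickId(dottedExpr, '#section.1');    // 'section.1'
```

With the existing expression a selector such as `#section.1` does not take the ID fast path at all; with the variant it is captured as the ID `section.1`.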

Attachments (0)
Change History (3)

Changed February 10, 2009 10:33AM UTC by brettz9 comment:1

Sorry, I realized two things:

1) I hadn't allowed name start characters in the later positions of the name (corrected below).

2) You could also allow for the higher range (made up of surrogates) by checking for the surrogates individually. This should be safe: the document wouldn't even be well-formed if the surrogates were not paired properly, and by excluding the private-use surrogates we shouldn't be allowing any unwanted characters either. So we need to add: \ud800-\udb7f\udc00-\udfff

var nameStartChar = ':A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD\\ud800-\\udb7f\\udc00-\\udfff';
var nameEndChar = '.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040-';
var xmlName = '['+nameStartChar+']['+nameStartChar+nameEndChar+']*';

quickExpr = new RegExp('^[^<]*(<(.|\\s)+>)[^>]*$|^#('+xmlName+')$');
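
A quick way to sanity-check the construction above is to build only the ID alternative and try a few names. This is a sketch under assumptions: it reads the fourth-from-last BMP range as \u200C-\u200D (the XML NameStartChar range), anchors the pattern with `^` and `$` like the original quickExpr, and uses made-up sample IDs.

```javascript
// Name start characters from the XML Name production, plus the individual
// surrogate code units proposed above.
const nameStartChar = ':A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF' +
  '\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF' +
  '\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD\\uD800-\\uDB7F\\uDC00-\\uDFFF';
// Characters allowed only after the first position (trailing '-' is literal).
const nameEndChar = '.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040-';
const xmlName = '[' + nameStartChar + '][' + nameStartChar + nameEndChar + ']*';

// Only the "#id" alternative, anchored at both ends.
const idExpr = new RegExp('^#(' + xmlName + ')$');

const results = {
  dotted: idExpr.test('#foo.bar'),          // '.' allowed after the first char
  colon: idExpr.test('#_x:y-1'),            // '_' and ':' allowed anywhere
  accented: idExpr.test('#\u00E9t\u00E9'),  // U+00E9 is a name start char
  astral: idExpr.test('#\uD800\uDC00'),     // surrogate pair, matched unit by unit
  leadingDigit: idExpr.test('#1abc'),       // digits may not start a name
  leadingDot: idExpr.test('#.foo'),         // '.' may not start a name
  withSpace: idExpr.test('#a b')            // whitespace is never allowed
};
```

The first four entries come out `true` and the last three `false`, which matches the intent of allowing start characters in later positions while still rejecting names that begin with a digit, '.', or '-'.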

Changed February 10, 2009 08:54PM UTC by scottgonzalez comment:2

resolution: → invalid
status: new → closed

This is part of jQuery, not jQuery UI. You can file the ticket at http://dev.jquery.com, but it will most likely just be closed as invalid there as well since dots are used for class selectors.
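
To illustrate the conflict: in a selector string, `#foo.bar` reads as "element with ID foo and class bar", so an ID that really contains a dot has to be escaped with a backslash. The helper below is a hypothetical sketch, not part of jQuery; it assumes only '.' and ':' need escaping and does not attempt full CSS identifier escaping.

```javascript
// Hypothetical helper: build an ID selector for an ID that may contain
// '.' or ':' by prefixing each of those characters with a backslash.
function escapeIdForSelector(id) {
  return '#' + id.replace(/([.:])/g, '\\$1');
}

const dotted = escapeIdForSelector('foo.bar');   // the 9-character string #foo\.bar
const colon = escapeIdForSelector('form:input'); // the string #form\:input
const plain = escapeIdForSelector('content');    // unchanged apart from the '#'
```

The resulting string can then be passed wherever a selector is expected, and the dot is treated as part of the ID rather than as the start of a class selector.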

Changed October 11, 2012 09:15PM UTC by scottgonzalez comment:3

milestone: TBD

Milestone TBD deleted