HTML Filtering For RedCloth
This isn’t a patch for RedCloth; it’s a method you can use to filter out HTML in general, though it works nicely with RedCloth output. I use it for the comments on this site. It’s the best solution I can think of for world-writable files.
 class String
     ## Dictionary describing allowable HTML
     ## tags and attributes.
     BASIC_TAGS = {
         'a' => ['href', 'title'],
         'img' => ['src', 'alt', 'title'],
         'br' => [],
         'i' => nil,
         'u' => nil,
         'b' => nil,
         'pre' => nil,
         'kbd' => nil,
         'code' => ['lang'],
         'cite' => nil,
         'strong' => nil,
         'em' => nil,
         'ins' => nil,
         'sup' => nil,
         'sub' => nil,
         'del' => nil,
         'table' => nil,
         'tr' => nil,
         'td' => nil,
         'th' => nil,
         'ol' => nil,
         'ul' => nil,
         'li' => nil,
         'p' => nil,
         'h1' => nil,
         'h2' => nil,
         'h3' => nil,
         'h4' => nil,
         'h5' => nil,
         'h6' => nil,
         'blockquote' => ['cite']
     }
     ## Method which cleans the String of HTML tags
     ## and attributes outside of the allowed list.
     def clean_html!( tags = BASIC_TAGS )
         gsub!( /<(\/*)(\w+)([^>]*)>/ ) do
             raw = $~
             tag = raw[2].downcase
             if tags.has_key? tag
                 pcs = [tag]
                 tags[tag].each do |prop|
                     ['"', "'", ''].each do |q|
                         q2 = ( q != '' ? q : '\s' )
                         if raw[3] =~ /#{prop}\s*=\s*#{q}([^#{q2}]+)#{q}/i
                             # Backslashes don't escape quotes in HTML, so use the entity.
                             pcs << "#{prop}=\"#{$1.gsub('"', '&quot;')}\""
                             break
                         end
                     end
                 end if tags[tag]
                 "<#{raw[1]}#{pcs.join " "}>" 
             else
                 " " 
             end
         end
     end
 end
Be sure to use it after you convert your Textile to HTML.
 comment = RedCloth.new( entry.comment ).to_html
 comment.clean_html!
I’d like to make RedCloth’s built-in filter allow this kind of customization. It may even be worthwhile to have it scan for allowed CSS within a style declaration. On a Wiki, it’s nice to allow people to come up with widths and floating directions and detailed colors, you know?
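Here is a rough sketch of what scanning a style declaration might look like. This is only an illustration: `ALLOWED_CSS` and `clean_css` are made-up names, not part of RedCloth, and the value patterns are assumptions about what a wiki might want to permit. Each declaration is kept only if its property is on the list and its value matches that property's pattern.

```ruby
## Hypothetical allow-list of CSS properties and the
## values each one may take. Not part of RedCloth.
ALLOWED_CSS = {
    'width'            => /\A\d+(?:px|em|%)\z/,
    'float'            => /\A(?:left|right|none)\z/,
    'color'            => /\A(?:#[0-9a-f]{3}|#[0-9a-f]{6}|[a-z]+)\z/i,
    'background-color' => /\A(?:#[0-9a-f]{3}|#[0-9a-f]{6}|[a-z]+)\z/i
}

## Hypothetical helper: returns the style string with
## only the allowed declarations left in.
def clean_css( style, allowed = ALLOWED_CSS )
    kept = []
    style.split( ';' ).each do |decl|
        prop, value = decl.split( ':', 2 )
        next unless prop && value
        prop = prop.strip.downcase
        value = value.strip
        pattern = allowed[prop]
        ## Drop anything whose property or value isn't recognized.
        kept << "#{prop}: #{value}" if pattern && value =~ pattern
    end
    kept.join( '; ' )
end

puts clean_css( 'float: left; width: 200px; font-size: 500pt' )
## prints "float: left; width: 200px"
```

Matching values against a strict pattern, rather than just checking property names, is what keeps things like `url(...)` or `expression(...)` from slipping through under an allowed property.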
What do you think?


 

flgr
I think it is vulnerable to this attack:
>flgr
< plaintext>
flgr
flgr
flgr
Hm, holds up quite well. :)
flgr
I wonder what happens when I use newlines.
flgr
” style=”font-size: 500pt;”>Attribute injection?
< plaintext>
flgr
xal
My wish for RedCloth would be that the output could be created using a visitor-style approach. This would make it a lot easier to create different output scripts like tolatex and todocbook.
why