RedHanded

In and Out Filters for Hacked mod_ruby

by why in inspect

I’ve been playing with an experimental MouseHole which uses an Apache output filter on content pulled through mod_proxy. On my Linux machine, mod_proxy is actually much slower than WEBrick::HTTPProxyServer, so there’s been no speed gain. What did come out of it, though, is a newly hacked set of input and output filters for mod_ruby.

May still have a pile of bugs, but download here: mod_ruby-filtered-12.27.2005.tar.gz.

To write your own filtered proxy:

 RubySafeLevel 0
 RubyTimeOut 10
 RubyAddPath /home/why/lib
 RubyRequire proxyTest

 <IfModule mod_proxy.c>
   ProxyRequests On
   <Proxy *>
     Order deny,allow
     Deny from all
     Allow from 127.0.0.1
     RubyOutputFilter ProxyTest.instance REWRITER
     SetOutputFilter REWRITER
   </Proxy>
   ProxyVia On
 </IfModule>

And the proxyTest.rb looks like this:

 require 'singleton'

 class ProxyTest
   include Singleton

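   def output_filter(filter)
     # Anything that isn't HTML goes through untouched.
     if filter.req.content_type !~ %r{text/html}
       filter.pass_on
     else
       # Read the response a chunk at a time until read returns nil.
       s = filter.read
       while s
         # Swap every "Ruby" (any case) for the page's content type.
         filter.write(s.gsub(/Ruby/i, filter.req.content_type))
         s = filter.read
       end
       # Only close once the end of the stream has been seen.
       filter.close if filter.eos?
     end
   end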
 end

So, yeah, the object must respond to input_filter or output_filter. (Turns to Shugo Maeda.) I think we should start duck typing mod_ruby. Rather than having to explicitly state the various handlers in the httpd.conf, we should use respond_to? in mod_ruby to scan for the capabilities of the class.
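To make the duck typing idea concrete, here is a rough sketch in plain Ruby of what such a scan might look like. None of this is in mod_ruby or in the hacked tarball: the HOOKS table, the capabilities helper, and the two sample classes are made up for illustration, and the real work would have to happen on the C side of mod_ruby.

```ruby
# Hypothetical table of method names a module could look for, mapped to
# the kind of hook each one would switch on. The list is illustrative only.
HOOKS = {
  input_filter:  "input filter",
  output_filter: "output filter",
  handler:       "content handler"
}.freeze

# Ask the object which hooks it can serve, instead of reading
# one directive per hook out of httpd.conf.
def capabilities(obj)
  HOOKS.keys.select { |name| obj.respond_to?(name) }
end

# Sample classes, also made up for this sketch.
class Rewriter
  def output_filter(filter); end
end

class Everything
  def input_filter(filter); end
  def output_filter(filter); end
  def handler(req); end
end

p capabilities(Rewriter.new)    # prints [:output_filter]
p capabilities(Everything.new)  # prints [:input_filter, :output_filter, :handler]
p capabilities(Object.new)      # prints []
```

With something like this, a single directive naming the object could be enough, and the object’s own methods would decide which phases it takes part in.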

said on 29 Dec 2005 at 19:18

Oh and if mod_proxy was up to it, this is such an easy way for all the languages to get custom filtering proxies because mod_python, mod_perl and many of the others all support this kind of filtering. However, mod_io and mod_haskell do not.

said on 30 Dec 2005 at 00:43

That’s seriously cool, _why. Must… go… play…

said on 30 Dec 2005 at 10:16

I’m not generally slow or stupid, but I’ll say that Why makes my head spin! I’ll be honest and say that I don’t quite understand exactly what your code does. Generally I can figure out your examples (despite sparse descriptions), but today is just not my day. Can you give me a quick run-down of what you’re actually doing here?

Thanks, and sorry.

M.T.

said on 30 Dec 2005 at 11:16

Okay, well, mod_ruby is an extension to the Apache web server, right? So you can run a Ruby interpreter inside Apache. You add directives to Apache, which can let URLs, authentication headers, etc. go through a Ruby object.

This hack adds two new directives: RubyOutputFilter and RubyInputFilter. These filters are used to completely modify the request and response within Apache. (Think filtering like: removing cuss words or eliminating ads from a page.) mod_ruby’s handlers can’t currently do this because they happen before the page ever comes back.

I’m actually not sure why you’d need filters in a traditional Apache setup (without mod_proxy). But it could be used like Monkeygrease to offer slightly modified versions of your own applications.

Anyway, what I’m advocating is use of mod_proxy with mod_ruby filtering. And in the above example, I’m using mod_proxy to set up a personal proxy at http://127.0.0.1:37004 on my laptop. And then I’m passing all the proxy pages through the ProxyTest object. See, the output_filter method takes an HTML page and replaces the word Ruby with the mime type of the page. It’s a stupid example that’s full of baloney, but it illustrates the basics. If I can get mod_proxy to speed up, then I’m sure you’ll be seeing a lot more of it.
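If you want to watch the output_filter logic run without building the hacked Apache module, something like the following should do. It is only a sketch: FakeRequest and FakeFilter are stand-ins invented here, and they model just the calls the example makes (req.content_type, read, write, pass_on, eos?, close). How the real filter object behaves inside Apache is assumed, not confirmed.

```ruby
require 'singleton'

# Made-up stand-in for the request object; only content_type is modelled.
FakeRequest = Struct.new(:content_type)

# Made-up stand-in for the filter object handed to output_filter.
class FakeFilter
  attr_reader :req, :output

  def initialize(content_type, chunks)
    @req = FakeRequest.new(content_type)
    @chunks = chunks.dup
    @output = String.new
    @closed = false
  end

  # Next chunk of the response, or nil once the input is used up.
  def read
    @chunks.shift
  end

  def write(str)
    @output << str
  end

  # Copy whatever is left straight to the output, unmodified.
  def pass_on
    @output << @chunks.join
    @chunks.clear
  end

  def eos?
    @chunks.empty?
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Same logic as the proxyTest.rb in the post.
class ProxyTest
  include Singleton

  def output_filter(filter)
    if filter.req.content_type !~ %r{text/html}
      filter.pass_on
    else
      s = filter.read
      while s
        filter.write(s.gsub(/Ruby/i, filter.req.content_type))
        s = filter.read
      end
      filter.close if filter.eos?
    end
  end
end

html = FakeFilter.new("text/html", ["<p>Ruby is ", "fun, ruby!</p>"])
ProxyTest.instance.output_filter(html)
puts html.output  # prints <p>text/html is fun, text/html!</p>

png = FakeFilter.new("image/png", ["Ruby bytes"])
ProxyTest.instance.output_filter(png)
puts png.output   # prints Ruby bytes
```

One thing the stub makes easy to see: the substitution works chunk by chunk, so a "Ruby" that happens to be split across two reads would slip through unreplaced.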

said on 30 Dec 2005 at 13:37

Wow, interesting. I’m a bit surprised that mod_proxy is slower than WEBrick’s, though…

said on 30 Dec 2005 at 17:20

Thanks for that excellent elaboration. I guess some of the terms I just wasn’t as comfortable with, such as filtered proxy.

Keep up the creative thinking! It inspires me a great deal in my coding.

M.T.
