Injecting a Hash Backwards and the Merge Block
Here’s a fun snippet cooked up for the Camping 1.3 release due later today, laid out a bit nicer so you can play with it on your own.
Goal is: to parse a query string in as few bytes as possible. And to allow the Hash-like syntax from PHP and Rails.
def qs_parse(qs)
  qs.split(/[&;]/n).
    inject({}) { |h,p| k, v = p.split('=',2)
      h.merge(
        k.split(/[\]\[]+/).reverse.
          inject(v) { |x,i| {i=>x} }
      ){|_,o,n|o.merge(n)}
    }
end
Believe me, there’s a real zen in the inner-inject/merge-block routine. Wield it like so:
>> qs_parse("name=Philarp+Tremain&hair=sandy+blonde")
=> {"name"=>"Philarp+Tremain", "hair"=>"sandy+blonde"}
>> qs_parse("post[id]=4&post[nick]=_why&post[message]=GROSS!&a=1")
=> {"a"=>"1", "post"=>{"message"=>"GROSS!", "nick"=>"_why", "id"=>"4"}}
Obviously, in the final version, stuff gets unescaped and all that. This exercise only focuses on parsing the structure. It would be nice for the merge-block to be tail recursive. It only goes one level presently.
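For anyone who wants the inner inject and the merge block pulled apart, here is a minimal sketch of the two steps on their own. The helper name build_nested is made up for illustration and is not part of Camping. It only shows how one key gets folded from the inside out and how a one-level merge block blends two results.

```ruby
# Illustration only: build_nested is a hypothetical helper, not Camping code.
def build_nested(key, value)
  # "post[id]" splits on runs of brackets into ["post", "id"].
  parts = key.split(/[\]\[]+/)
  # Reverse, then fold from the innermost key outward: each step wraps the
  # accumulator in a new one-entry hash.
  parts.reverse.inject(value) { |inner, name| { name => inner } }
end

a = build_nested("post[id]", "4")       # a is {"post"=>{"id"=>"4"}}
b = build_nested("post[nick]", "_why")  # b is {"post"=>{"nick"=>"_why"}}

# The block is called only for keys present in both hashes ("post" here);
# whatever it returns becomes the value stored under that key.
merged = a.merge(b) { |_key, old, new| old.merge(new) }
```

The inner old.merge(new) has no block of its own, which is why this version only blends one level deep.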
Update: Here’s a right-good one which recurses to any depth and builds an array from duplicate entries as Dan pointed out.
def qs_parse(qs)
  m = proc {|_,o,n|o.merge(n,&m)rescue(o.to_a<<n)}
  qs.split(/[&;]/n).
    inject({}) { |h,p| k, v = p.split('=',2)
      h.merge(
        k.split(/[\]\[]+/).reverse.
          inject(v) { |x,i| {i=>x} },&m)
    }
end
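A longhand sketch of the same idea, for readers who would rather not golf. The name qs_parse_expanded and every variable name are illustrative, and this is not the expanded version that lives in Camping. It also makes two assumptions that depart from the snippet above. The golfed proc leans on String#to_a, which was removed in Ruby 1.9, so here the rescue is swapped for explicit type checks. And when a value is not yet an array, it gets wrapped as a single element rather than converted with to_a.

```ruby
# Sketch under stated assumptions: not the Camping implementation.
def qs_parse_expanded(query_string)
  # Hash#merge calls this only when both hashes hold the same key.
  resolve = proc do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      # Two nested hashes: merge them, reusing this proc for deeper clashes.
      old_value.merge(new_value, &resolve)
    else
      # Anything else counts as a duplicate entry: collect values in an array.
      (old_value.is_a?(Array) ? old_value : [old_value]) + [new_value]
    end
  end

  query_string.split(/[&;]/).inject({}) do |result, pair|
    key, value = pair.split('=', 2)
    # Build the nested hash for this pair from the innermost key outward.
    nested = key.split(/[\]\[]+/).reverse.inject(value) do |inner, name|
      { name => inner }
    end
    result.merge(nested, &resolve)
  end
end

deep = qs_parse_expanded("a[b][c]=1&a[b][d]=2")
dups = qs_parse_expanded("foo=4&foo=6")
```

As with the original, nothing is unescaped and malformed pairs are not checked.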
Ezra Zygmuntowicz
oooh! Pretty!
yerejm
Hmm… > _.class => NilClass
fansipans
_ is just a variable name, best used as a place holder in tiny code. it is used above elegantly with the block-enabled Hash.merge to handle cases where you have two hashes with the same name, but with their own values. {|key,oldval,newval| oldval.merge(newval)} blends the two hashes from the query string
so foo[bar]=baz and foo[qux]=bop ends up being the full foo={bar=>baz, qux=>bop}
man yeah i’d love to see some arbitrarily deep hash support (_why, you should have posted this at 4:20 hehe)
mmmmmmmmmm
Harold
The ‘n’ option on the regex is what? Multibyte aware or something?
why
fansipans: I’m with you. This post was underorchestrated.
Harold: Yes, another byte gone. GLAD.
lee b.
you can just have
and the other split elements will be discarded.
also, unless you want to match
or something similarly perverse, the inner split’s regex can be
/[\]\[]/
without the plus sign (or /[][]/ if you don’t mind warnings).
lee
?????
Why do you write the code so tersely? Is it to keep the file size down? It makes it really hard for beginners to follow what is going on. I know I am asking a lot, but could you post an expanded version (full variable names, generous white space) for these short expositions? I think I could learn more that way.
Daniel Berger
Hmm, I thought the URI class had some helpers for this already, but I don’t see anything.
Perhaps URI could use some polish?
I guess we’ll need a URI::Query class first, eh?
mindtriggerz
????: There’s an expanded version in camping svn.
why
Golfing code is meant to challenge. It’s like Sudoku or Rubik, but the kick of it is that you’re actually left with something handy.
Daniel: There’s a nice thought. I’m sure it could wipe out some code in WEBrick, too.
Dan
It barfs on multiple values for the same name, which is legal HTML:
qs_parse("foo=4&foo=6")
why
Okay, updated to be recursive and handle the bug Dan found.
yerejm
fansipans: Turns out _ is something irb adds to confuse me…
Dan
Nice fix! On a separate note, I’m a Java developer (boo, hiss) who’s really excited about Ruby. But one of the things I don’t like is when it becomes too ‘Perl-like’ (read: impenetrable, brevity over clarity), and I’m wondering if you see this code snippet as just an exercise, or whether you would write code this way on a large project meant to be shared with other developers. It’s probably just my inexperience with Ruby talking, but if I had to debug this or add a feature to it, I would be well screwed. So, is this meant to be poetry? As that, it’s wonderful.
Avdi
I admit it took me a while to parse through that code mentally. The irony, though, is that despite its almost Perlish density, the functional programming crowd would probably consider it not merely elegant, but the “right” way to do it. And not without reason, either. For all its complexity, it’s not relying on any special rules or weird syntactic tricks like Perl might. Although it would probably be a little prettier in Haskell, and the ‘inject’s would be called ‘fold’.
This isn’t really an opinion one way or another; just an observation that the Functional and Scripting worlds are getting closer and closer together.
THBMan
Ok, that took me a few minutes and my copy of the pickaxe, but I got it. Very nice. The only thing I don’t get is why there’s a rescue in the m Proc. I thought if merge found two elements the same it would overwrite the old with the new, not raise.
cilibrar
Why’s page has been squarely aimed at Ruby experts for all of this year and most of last in my opinion. I had to read the column for over a year (and practice Ruby independently) before I was ready to seriously start understanding the examples like Dwemthy. The fact of the matter is, anybody who doesn’t like Ruby won’t learn it, and if you don’t learn it you won’t have to debug it because you won’t be doing it. simple!
fansipans
cilibrar: _why’s hijinx definitely have a very challenging spirit, i know it’s encouraged me to beef up and understand more and more… co-workers at a big meeting yesterday got a big kick (and nodded in agreement) with my use of the words “kung-fu” and “zen” in reference to proper design ;)
yerejm
THBMan:
foo=1&bar=2 : no duplicate keys, proc no fire, merges with no issue
foo[bar]=1&foo[baz]=2 : duplicate outer keys, merge asks proc to fire, bar and baz are inner unique keys (if they weren’t, recursive proc call), no rescue, merges like above
foo=1&foo=2 : duplicate key, proc fires, 1 and 2 are not hashes, rescue fires, foo key points to array in the “merged” hash and values accumulate
merge by default overwrites. The proc given to it doesn’t do that.
Or at least that’s my understanding of the _whyjinx.
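A small sketch of when the block given to merge actually fires, using a made-up tracer proc that records the keys it is called for. This is plain Hash#merge behaviour, shown in isolation from qs_parse.

```ruby
# Illustration only: tracer and calls are hypothetical names.
calls = []
tracer = proc do |key, old, new|
  calls << key   # remember which keys triggered the block
  [old, new]     # the block's return value is what gets stored
end

# No shared keys: the block is never called.
no_clash = { "foo" => "1" }.merge({ "bar" => "2" }, &tracer)

# Shared key: the block decides the stored value.
clash = { "foo" => "1" }.merge({ "foo" => "2" }, &tracer)

# Shared key, no block: the new value silently overwrites the old one.
overwrite = { "foo" => "1" }.merge({ "foo" => "2" })
```

After these three calls, calls holds just "foo" once, from the second merge.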
Juho Snellman
That seems pretty complicated if the goal really is to accomplish the task in as few bytes as possible.
The following function should fulfill the informal spec about as well as the updated version above, and has only 122 non-whitespace characters compared to 177 in the original.
(Sorry if I missed any obvious golf tricks; this is my first Ruby program).
me
Juho: What is the “informal spec” that you are referring to? There are query strings for which your function doesn’t appear to work properly (as well as other cases for which the updated version above still doesn’t seem to work properly) but I am not clear whether such strings are supposed to be permitted, so I’d like to see the spec.
Juho Snellman
Sorry, by “informal spec” I pretty much meant this blog post and the comments, not an actual specification.
The examples specify the intended basic behaviour. The comments suggest that strings like
foo[bar][baz]=1
should be handled properly. And the lack of any error detection (e.g. checking that the brackets are balanced) in the original means that such things matter less than making the code short ;-)
why
Juho, you wild man, well played. There’s no rules here, just unfenced sporting arena with starving, sickly grass. I’m with you on the error detection, let’s just parse it into something.
why
Oh, Juho. That qs_parse is going to need a dup in there, because the string passed in gets destroyed by the gsub!.
Also, mine does parse camp[]=Dooley&camp[]=Rheinhart into {"camp" => ["Dooley", "Rheinhart"]}, which does right with the Rails/PHP rules.
me
Juho: Thanks for the clarification. What I was wondering about wasn’t such errors, but whether strings like foo=1&foo[bar]=2&foo[baz]=3 should be handled. I would expect that to return something like {"foo"=>["1", {"bar"=>"2"}, {"baz"=>"3"}]} as the original does but yours raises an error. However, the original doesn’t correctly handle similar strings such as foo[bar]=2&foo[baz]=3&foo=1, i.e., the same string but with the parameters simply in a different order. I had modified the original so I think it handles such cases like the above, but was wondering how they should be treated.
me
me
Can someone please tell me where I can find “the Rails/PHP rules” for handling these types of query strings? It sounds like that will answer the questions I had above. Thanks.
Premshree
I had a situation where I wanted to read from a text file that had on each line something like “foo:bar”. Obviously, in my Ruby script I want to get a hash representation. Not very elegant, I’d guess, but anyway, this is what I did:
ntk
How about using scan: hash = {}; qs.scan(/(name|hair)=(.*?)(&|$)/) { hash.update($1 => $2)}; puts hash.inspect
why
Premshree: That’s very similar to this bit I saw on IRC a great while ago.
ntk
For yet another qs.scan solution check out bigbold.com/snippets/user/ntk !