RedHanded

How Do You Parse Tab Separated Values

by daigo in bits

How do you parse tab-separated values when some values may be omitted and the omitted ones should be recognized as nil? That was discussed on ruby-list.

For instance, each row has three values: a float, an integer and a string, so “1\t2\t3” should become 1.0, 2, “3”. Omitted values should be parsed as nil, not as zero or an empty string, so “1\t\t3” should become 1.0, nil, “3”.

The following is Rubikichi-san’s solution [1].


class TextData < Struct.new(:float, :int, :str)
  CONVERTER = [:to_f, :to_i, :to_s]
  def self.[](input)
    obj = new
    input.chomp.split(/\t/).each_with_index do |x, i|
      obj[i] = (x && x.__send__(CONVERTER[i]))
    end
    obj
  end
end

p TextData["1\t2\t3"]
p TextData["1.0\t2\t"] 
p TextData["1\t2\tfoo"] 

# <struct TextData float=1.0, int=2, str="3">
# <struct TextData float=1.0, int=2, str=nil>
# <struct TextData float=1.0, int=2, str="foo">

It inherits from an anonymous Struct class!
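One case worth checking against the requirement stated earlier is a value omitted in the middle of a row. `String#split` drops trailing empty fields but keeps interior ones as empty strings, so the class above converts an interior blank with `to_i` or `to_f` rather than leaving it nil. The sketch below reproduces that class and adds a hypothetical variant, `TextDataBlankAsNil` (my name, not something from the ruby-list thread), which skips empty fields. It is only one possible way to handle the case.

```ruby
# The class from the post, unchanged.
class TextData < Struct.new(:float, :int, :str)
  CONVERTER = [:to_f, :to_i, :to_s]
  def self.[](input)
    obj = new
    input.chomp.split(/\t/).each_with_index do |x, i|
      obj[i] = (x && x.__send__(CONVERTER[i]))
    end
    obj
  end
end

# Hypothetical variant (not from the original thread): empty fields stay nil.
class TextDataBlankAsNil < Struct.new(:float, :int, :str)
  CONVERTER = [:to_f, :to_i, :to_s]
  def self.[](input)
    obj = new
    input.chomp.split(/\t/).each_with_index do |x, i|
      # Skip the assignment for "" so the member keeps its default of nil.
      obj[i] = x.__send__(CONVERTER[i]) unless x.empty?
    end
    obj
  end
end

p "1\t\t3".split(/\t/)               # the interior empty field is kept as ""
p TextData["1\t\t3"].int             # "".to_i, so 0 rather than nil
p TextDataBlankAsNil["1\t\t3"].int   # left as nil
```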

1 He is a pioneer of Ruby. I learned Ruby techniques from his book and home page. “Rubikichi” is his alias.

The original question also required some extra functionality, namely converting the string value, which I skip here.

said on 28 Apr 2005 at 10:24

How about the more functional


class TextData < Struct.new(:float, :int, :str)
  CONVERTER = [:to_f, :to_i, :to_s]

  def self.[](line)
    new *CONVERTER.zip(line.split("\t")).map { |c, v|
      unless v.nil? or v.empty?
        v.send(c)
      end
    }
  end
end
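As a quick check of how this version behaves, here is a minimal sketch that runs it on three sample rows. The class is the same as in the comment above, except that I put explicit parentheses around the splat and shortened the `unless` to a modifier. The sample rows are my own choice.

```ruby
class TextData < Struct.new(:float, :int, :str)
  CONVERTER = [:to_f, :to_i, :to_s]

  def self.[](line)
    # Same logic as the comment above, with explicit parentheses around the splat.
    new(*CONVERTER.zip(line.split("\t")).map { |c, v|
      v.send(c) unless v.nil? || v.empty?
    })
  end
end

p TextData["1\t2\t3"].to_a    # all three fields present
p TextData["1\t\t3"].to_a     # middle field omitted, becomes nil
p TextData["1.0\t2\t"].to_a   # last field omitted, becomes nil
```

Unlike the version in the post, this one does not call `chomp`, so a trailing newline on the input would end up inside the last string field.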

said on 28 Apr 2005 at 10:25

Looks interesting…would love a line by line commentary. I’m new to Ruby but felt I had a good handle on most of what you can do until I saw this!

said on 28 Apr 2005 at 11:59

chris2: looks good, but how is it more functional?

said on 28 Apr 2005 at 12:01

ah, as in functional programming

said on 28 Apr 2005 at 12:04

There you go:


# Inherit TextData from a Struct with 3 fields.
class TextData < Struct.new(:float, :int, :str)
  # Constant to hold the conversion method name for each column
  CONVERTER = [:to_f, :to_i, :to_s]

  # Define the parsing method
  def self.[](line)
    # Zip CONVERTER with the line's elements:
    # [a,b,c].zip([d,e,f]) == [[a,d],[b,e],[c,f]]
    # and map over the resulting pairs
    values = CONVERTER.zip(line.split("\t")).map { |c, v|
      # if the field is neither missing nor empty
      unless v.nil? or v.empty?
        # apply the conversion method
        v.send(c)
      end
      # else the block returns nil
    }
    # Create a new TextData by splatting (array->arguments) the result
    new(*values)
  end
end

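The two building blocks named in the commentary, `zip` and the splat, can be tried on their own. This is a small sketch and `Point` is a hypothetical struct used only for illustration. It also shows why the `v.nil?` check is there: when the line has fewer fields than converters, `zip` pads the missing ones with nil.

```ruby
CONVERTER = [:to_f, :to_i, :to_s]

# zip pairs elements by position and pads a shorter argument with nil.
pairs = CONVERTER.zip(["1.0", "2"])
p pairs   # the last pair has nil as its value

# The splat turns an array into separate arguments.
Point = Struct.new(:x, :y)   # hypothetical struct, for illustration only
point = Point.new(*[1, 2])
p point.x
p point.y
```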
said on 28 Apr 2005 at 12:05

doug: Exactly.

said on 28 Apr 2005 at 18:35

That is soo righteous!!

said on 29 Apr 2005 at 04:32

Except for the lack of test/unit. ;)

(Hey, why all the space around test/unit ?)
