Ruby's CSV class choking on large files - Solved!


badp

I've got a nice little Ruby script designed to take a CSV file and convert it to a particular format.  It works flawlessly until I feed it a ~50MB CSV file.  It reads the file, but when it goes to parse it, Ruby crashes and Visual Studio catches the exception:

"An unhandled win32 exception occurred in ruby.exe [3040]."  Then it goes on with the generic debugging message.

The Ruby script is below.  I adapted it from an open source CSV-to-XML script.  Also, as the comments indicate, the point of all this is to get a CSV file into a format that splunk will process correctly (it processes the CSV values, but does not get the fields right and refuses to learn them properly).

The script:

#!/usr/bin/ruby
# CSV 2 splunk
# Converts a CSV file to a splunk-readable format

require 'csv'

print "CSV file to read: "
input_file = gets.chomp

print "File to write to: "
output_file = gets.chomp

puts "Opening CSV file..."
csvfile = File.open(input_file) {|f| f.read}
puts "CSV file opened."

puts "Parsing CSV file..."

csv = CSV::parse(csvfile)
fields = csv.shift

puts "Writing file..."

File.open(output_file, 'w') do |f|
  csv.each do |record|
    for i in 0..(fields.length - 1)
      f.print "#{fields[i]}=\"#{record[i]}\", "
    end
    f.print "\n"
  end
end # End file block - close file
puts "Contents of #{input_file} written to #{output_file}."

CSV::parse(csvfile) is where it seems to choke.  Any ideas?


I installed the FasterCSV class and used it instead.  For those who care, here's the new code that doesn't choke:

#!/usr/bin/ruby
# CSV 2 splunk
# Converts a CSV file to a splunk-readable format

require 'fastercsv'

print "CSV file to read: "
input_file = gets.chomp

print "File to write to: "
output_file = gets.chomp

puts "Opening CSV file..."
csvfile = File.open(input_file) {|f| f.read}
puts "CSV file opened."

puts "Parsing CSV file..."

csv = FasterCSV::parse(csvfile)
fields = csv.shift

puts "Writing file..."

File.open(output_file, 'w') do |f|
  csv.each do |record|
    for i in 0..(fields.length - 1)
      f.print "#{fields[i]}=\"#{record[i]}\", "
    end
    f.print "\n"
  end
end # End file block - close file
puts "Contents of #{input_file} written to #{output_file}."
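
For anyone landing here later: both versions above read the whole file into a string and then parse it into one big array, so memory use grows with the file size. A possible alternative is to stream the file one row at a time with CSV.foreach. This is only a sketch, not what the script above does. It assumes Ruby 1.9 or later, where the standard csv library is the FasterCSV code, so no extra gem is needed. The csv_to_splunk helper name is made up for illustration, and unlike the original it joins the pairs with ", " instead of leaving a trailing ", " on every line.

```ruby
require 'csv'

# Hypothetical helper (not part of the original script): streams rows from
# input_path and writes one line of field="value" pairs per record to
# output_path. Returns the number of records written.
def csv_to_splunk(input_path, output_path)
  fields = nil
  count = 0

  File.open(output_path, 'w') do |out|
    # CSV.foreach yields one parsed row at a time, so the whole file is
    # never held in memory at once.
    CSV.foreach(input_path) do |row|
      if fields.nil?
        fields = row # the first row holds the field names
        next
      end

      # Pair each field name with the value in the same column.
      pairs = fields.zip(row).map { |name, value| "#{name}=\"#{value}\"" }
      out.puts pairs.join(', ')
      count += 1
    end
  end

  count
end
```

To use it with the prompts from the script above, call csv_to_splunk(input_file, output_file) after reading the two file names with gets.chomp.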
