Perl Script: Process a CSV File Line by Line


I use a program called jmemorize to help with rote memorization. It’s pretty common to find a spreadsheet with a list of things you have to memorize, and jmemorize will let you import it in CSV format if you add specific column headers.

The problem comes when you want to format the questions and answers in a specific way, or add something to each line. Perl can process a file line by line and output it however you want. While you probably won’t use the following Perl script exactly like I did, you can still benefit from processing files line by line. Here is how you open a file in Perl, process it line by line (instead of slurping the entire file into memory at once) and modify each line as you go:


# read a CSV file of 2 columns
# spit out an import file for jmemorize (a free flashcard program).
# ask similar questions/answers for each Q/A pair
# modify the question and answer below to word it how you like
# simple script, but if you have questions:

use strict;
use warnings;

# the file name of your .csv file
my $file = 'data.csv';

# the column headers required by jmemorize
print qq~Frontside,Flipside,Category,Level\n~;

# open the file for reading
open(my $fh, '<', $file) or
 die("Could not open $file: $!\n");

# process line by line:
while(<$fh>) {

 # I assign $_ to $line by habit. It's easier to read, and $_ can be
 # changed by other code, which is harder to trace if you don't do this
 my($line) = $_;

 # remove the trailing newline so it doesn't end up inside the answer
 chomp($line);

 # split the line on the existing commas into $qu and $an (the two columns).
 # note that if you have commas inside a quoted field, this will break;
 # if every field is quoted, split on the full delimiter instead: split(/","/,$line)
 # (that still leaves the outer quotes on the first and last field)
 my ($qu,$an) = split(/,/,$line);

 # side A of the flashcard, or the question
 print qq~"Q: <i><b>$qu</b></i>",~;

 # Side B of the flashcard, or the answer
 # I print the question again, but you don't have to
 print qq~"Q: <i><b>$qu</b></i>? ~;
 print qq~A: <i><b>$an</b></i>",~;

 # what category should this go into?
 # jmemorize requires the category and a "zero" as part of the format, so:
 print qq~"Questions",0\n~;
} # end of the while loop

# output looks like:
#"Q: <i><b>The Q1</b></i>","Q: <i><b>The Q1</b></i>? A: <i><b>The A1</b></i>","Questions",0
#"Q: <i><b>The Q2</b></i>","Q: <i><b>The Q2</b></i>? A: <i><b>The A2</b></i>","Questions",0
#"Q: <i><b>The Q3</b></i>","Q: <i><b>The Q3</b></i>? A: <i><b>The A3</b></i>","Questions",0
#"Q: <i><b>The Q4</b></i>","Q: <i><b>The Q4</b></i>? A: <i><b>The A4</b></i>","Questions",0

# which is valid input format for jmemorize flashcards
# the point of the script is the ability to modify the Q & A
# so if you have a large spreadsheet, you don't have to type it over and over
# I used it to study circuit ID names and format them in bold/italics
# so I could quickly glance at the card and just focus on the Q&A
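
The script's comments warn that a plain split on commas breaks when a quoted field contains a comma. Here is a minimal sketch of one way around that, using parse_line from the core Text::ParseWords module. The helper name split_csv_line and the sample data are made up for illustration and are not part of the script above:

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper (not in the original script): split one CSV line
# into fields, treating commas inside double quotes as part of the data.
sub split_csv_line {
    my ($line) = @_;
    chomp($line);
    # parse_line(delimiter, keep, line): keep = 0 strips the enclosing quotes
    return parse_line(',', 0, $line);
}

# made-up sample line with a comma inside the first quoted field
my ($qu, $an) = split_csv_line(qq~"Smith, John","Circuit 42"\n~);
print "$qu | $an\n";   # prints: Smith, John | Circuit 42
```

This is only good enough for simple files. parse_line also treats single quotes and backslashes as special, so a field with a lone apostrophe (like don't) can make it return nothing. If your spreadsheet is messier than that, a full CSV parser such as Text::CSV from CPAN is probably the safer choice.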

There ya go – a common solution to processing a file line by line using Perl


Nitesh February 18, 2013 at 2:08 am

This code was of great use specially the way it was commented. Thanks a lot

