Splitting a CSV file using bash on Linux

This is a corrected version of a script I found on this blog post: http://www.geekology.co.za/blog/2009/02/bash-script-to-split-single-csv-file-into-multiple-files-with-headers/

What this will do is split a CSV file into chunks of 300 lines, prepending the original header from the CSV file to each chunk that has been cut.


# check if an input filename was passed as a command line argument:
if [ $# -ne 1 ]; then
  echo "Please specify the name of a file to split!"
  exit 1
fi


# create a directory to store the output:
mkdir -p output

# create a temporary file containing the header without
# the content:
head -n 1 "$1" > header.csv

# create a temporary file containing the content without
# the header:
tail -n +2 "$1" > content.csv
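A quick sanity check of the head/tail pair above, run against a throwaway file (demo.csv is an illustrative name, not part of the script):

```shell
# create a tiny CSV: one header line plus two data rows
printf 'id,name\n1,alice\n2,bob\n' > demo.csv

head -n 1 demo.csv    # prints just the header line: id,name
tail -n +2 demo.csv   # prints everything from line 2 onward (the data rows)
```

`tail -n +2` means "start output at line 2", which is what makes it the complement of `head -n 1`.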

# split the content file into multiple files of 300 lines each:
split -l 300 content.csv output/data_
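split appends two-letter suffixes (aa, ab, ac, ...) to the prefix you give it, so the chunks land as output/data_aa, output/data_ab, and so on. A small illustration with made-up file names:

```shell
# 10 lines split into chunks of 3 gives four files: 3 + 3 + 3 + 1 lines
mkdir -p demo_out
seq 10 > numbers.txt
split -l 3 numbers.txt demo_out/part_

ls demo_out                # part_aa  part_ab  part_ac  part_ad
wc -l < demo_out/part_aa   # 3
wc -l < demo_out/part_ad   # 1  (the leftover lines)
```

The last chunk simply holds whatever lines remain, which is why the real script's final output file will usually be shorter than 300 lines.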

# loop through the new split files, adding the header
# and a '.csv' extension:
for f in output/*; do cat header.csv "$f" > "$f.csv"; rm "$f"; done

# remove the temporary files:
rm header.csv
rm content.csv
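Putting it all together, here is an end-to-end sketch: the steps above written to a file (split_csv.sh is an illustrative name) and run against a generated 650-row CSV, which should produce three chunks of 300, 300, and 50 rows, each with the header on top:

```shell
# write the script above to a file (filenames here are illustrative)
cat > split_csv.sh <<'EOF'
if [ $# -ne 1 ]; then
  echo "Please specify the name of a file to split!"
  exit 1
fi
mkdir -p output
head -n 1 "$1" > header.csv
tail -n +2 "$1" > content.csv
split -l 300 content.csv output/data_
for f in output/*; do cat header.csv "$f" > "$f.csv"; rm "$f"; done
rm header.csv content.csv
EOF

# build a test CSV: a header plus 650 data rows
printf 'id,name\n' > big.csv
seq 1 650 | awk '{print $1",row"$1}' >> big.csv

bash split_csv.sh big.csv

ls output                      # data_aa.csv  data_ab.csv  data_ac.csv
head -n 1 output/data_ac.csv   # id,name  (header prepended to every chunk)
```

Each output file is one line longer than its chunk (301, 301, 51 lines) because of the prepended header.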

You can change 300 to any number you want, but it is the ideal number of records for importing into a SugarCRM setup on a Fasthosts site. On other hosting, SugarCRM's memory may max out at a different level, so just tweak that number a little.

For a non-SugarCRM use case, for example loading into Excel, you could use a value of 50,000 to beat the ~64,000-row barrier.

By the way, anyone who wants to analyse a 64k-line Excel file and make business-critical decisions based on it is an idiot.