PlacidPages
New Member
I want to re-arrange the data (Header and Detail) in multiple pipe delimited text files, in the order specified in a reference master file and create a single output file, using shell script.
The order of the columns will be different in different input files. Data could be empty for the columns. There could be columns missing or new columns in the input files, when compared to the Reference master file. The order in the Reference master file could change in future and columns could be added to or deleted from any position. I have added the current code below, which can re-arrange the columns according to Reference master file. However, need help to enhance it to add below functionality.
1. If columns are missing in input file, when compared to reference master file, then output file should have the column with empty data.
2. Accept more than one file as input
Sample Input file A:
Name|Amount|Vehicle|City|Visits|Indicator
Jack|123.22|Volkswagen|Altamonte|1|Y
Sam|23.1|Audi||3|N
|21.4|Ford|NewYork||Y
Sample Input file B:
Suffix|Name|Visits|Vehicle|State|Indicator|City|Gender
Mr|Jack|1|Volkswagen|NJ|Y|Hoboken|M
Mr|Rob|6|Buick||N|Stamford|
Reference Master File:
State|1
City|2
Vehicle|3
Gender|4
Name|5
Amount|6
Expected Output:
State|City|Vehicle|Gender|Name|Amount
|Altamonte|Volkswagen||Jack|123.22
||Audi||Sam|23.1
|NewYork|Ford|||21.4
NJ|Hoboken|Volkswagen|M|Jack|
|Stamford|Buick||Rob|
Current Output (accepting only one file (File A) as input):
City|Vehicle|Name|Amount|Visits
Altamonte|Volkswagen|Jack|123.22|1
|Audi|Sam|23.1|3
NewYork|Ford||21.4|
Code:
#Reads the Infile.out column names and assign number based on order for each column.
cat infile.out | head -1 | tr -s '|' '\n' | cat -n > infile_tmp.out
rm -f final.out
#Reads the master_file.out, gets the column name.
#Matches with the row in infile_tmp.out and gets column number
#Writes the column number to final.out in the position as per the master_file.out
while read line; do
touch final.out
vChkClmn=`echo $line | cut -d "|" -f1`
vMasterClm=`grep -i $vChkClmn infile_tmp.out | cut -d " " -f1`
echo "$vMasterClm" >> final.out
done < master_file.out
#Transpose the final.out data, adding a prefix "$" for each number.
rm -f final_master_list.out
echo `cat final.out` | sed 's/[^ ]* */\$&/g' | sed 's/ /\,/g' > final_master_list.out
vOrder=`cat final_master_list.out`
echo "Value of order is $vOrder"
awk -F\| '{print '$vOrder'}' OFS=\| infile.out
The order of the columns will be different in different input files. Data could be empty for the columns. There could be columns missing or new columns in the input files, when compared to the Reference master file. The order in the Reference master file could change in future and columns could be added to or deleted from any position. I have added the current code below, which can re-arrange the columns according to Reference master file. However, need help to enhance it to add below functionality.
1. If columns are missing in input file, when compared to reference master file, then output file should have the column with empty data.
2. Accept more than one file as input
Sample Input file A:
Name|Amount|Vehicle|City|Visits|Indicator
Jack|123.22|Volkswagen|Altamonte|1|Y
Sam|23.1|Audi||3|N
|21.4|Ford|NewYork||Y
Sample Input file B:
Suffix|Name|Visits|Vehicle|State|Indicator|City|Gender
Mr|Jack|1|Volkswagen|NJ|Y|Hoboken|M
Mr|Rob|6|Buick||N|Stamford|
Reference Master File:
State|1
City|2
Vehicle|3
Gender|4
Name|5
Amount|6
Expected Output:
State|City|Vehicle|Gender|Name|Amount
|Altamonte|Volkswagen||Jack|123.22
||Audi||Sam|23.1
|NewYork|Ford|||21.4
NJ|Hoboken|Volkswagen|M|Jack|
|Stamford|Buick||Rob|
Current Output (accepting only one file (File A) as input):
City|Vehicle|Name|Amount|Visits
Altamonte|Volkswagen|Jack|123.22|1
|Audi|Sam|23.1|3
NewYork|Ford||21.4|
Code:
#Reads the Infile.out column names and assign number based on order for each column.
cat infile.out | head -1 | tr -s '|' '\n' | cat -n > infile_tmp.out
rm -f final.out
#Reads the master_file.out, gets the column name.
#Matches with the row in infile_tmp.out and gets column number
#Writes the column number to final.out in the position as per the master_file.out
while read line; do
touch final.out
vChkClmn=`echo $line | cut -d "|" -f1`
vMasterClm=`grep -i $vChkClmn infile_tmp.out | cut -d " " -f1`
echo "$vMasterClm" >> final.out
done < master_file.out
#Transpose the final.out data, adding a prefix "$" for each number.
rm -f final_master_list.out
echo `cat final.out` | sed 's/[^ ]* */\$&/g' | sed 's/ /\,/g' > final_master_list.out
vOrder=`cat final_master_list.out`
echo "Value of order is $vOrder"
awk -F\| '{print '$vOrder'}' OFS=\| infile.out
Last edited: