shell - search for 2 fields from a file in another huge file, passing the 2nd file only once
file1 has 100,000 lines. Each line has 2 fields, such as:

test 12345678
test2 43213423

Another file has millions of lines. Here is an example of how the above file1 entries appear in file2:

'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'

I could grep for these 2 fields from file1 to find the lines that contain both. The gotcha is that I need to search for all 100,000 entries while passing through the 2nd file only once, because looping grep is slow: it would mean 100,000 x 10,000,000 line comparisons.

Is it possible?
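For reference, the per-entry loop being avoided looks roughly like this (a hypothetical sketch with tiny sample files; it rescans file2 once for every line of file1):

```shell
# Hypothetical sample data mirroring the question's format.
cat > file1 <<'EOF'
test 12345678
test2 43213423
EOF

cat > file2 <<'EOF'
'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'
'7' 'other' 'test' '99999999'
EOF

# Slow approach: one full pass over file2 for EVERY line of file1
# (100,000 passes over a 10,000,000-line file in the real case).
while read -r f1 f2; do
    grep -F "'$f1' '$f2'" file2
done < file1
```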
You can do it in awk:
awk -F"['[:blank:]]+" 'NR == FNR { a[$1,$2]; next } $4 SUBSEP $5 in a' file1 file2

First, set the field separator so that the quotes around the fields in the second file are consumed.
The first block applies to the first file only and sets keys in the array a. The comma in the array index translates to the control character SUBSEP in the key.
Lines from the second file are printed when the third and fourth fields (with SUBSEP in between) match one of the keys. Due to the ' at the start of each line, the first field $1 is the empty string, so the fields we want are $4 and $5.
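A quick way to sanity-check the quote-consuming approach, using small hypothetical sample files:

```shell
cat > file1 <<'EOF'
test 12345678
test2 43213423
EOF

cat > file2 <<'EOF'
'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'
'7' 'other' 'test' '99999999'
EOF

# Quotes and blanks are both separators, so $1 is the empty string
# before the leading quote and the fields of interest are $4 and $5.
awk -F"['[:blank:]]+" 'NR == FNR { a[$1,$2]; next } $4 SUBSEP $5 in a' file1 file2
```

Only the first two file2 lines are printed; the third contains a pair that never occurs in file1. file2 is read exactly once, and each lookup is a hash-table membership test rather than a scan.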
Since the fields are quoted in the second file, you can alternatively do:

awk -v q="'" 'NR == FNR { a[q $1 q, q $2 q]; next } $3 SUBSEP $4 in a' file1 file2

This inserts the quotes into the keys of array a, so the fields in the second file match without having to consume the quotes.
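The same hypothetical sample files check the quote-inserting variant; the output is identical, but the default whitespace field splitting is kept:

```shell
cat > file1 <<'EOF'
test 12345678
test2 43213423
EOF

cat > file2 <<'EOF'
'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'
'7' 'other' 'test' '99999999'
EOF

# The quote character q is concatenated into each key built from
# file1, so the still-quoted $3 and $4 of file2 match directly.
awk -v q="'" 'NR == FNR { a[q $1 q, q $2 q]; next } $3 SUBSEP $4 in a' file1 file2
```

With default splitting, the fields shift down by one ($3 and $4 instead of $4 and $5), because there is no longer an empty field before the leading quote.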