shell - search 2 fields in a file in another huge file, passing 2nd file only once
file1 has 100,000 lines. Each line has fields such as:
test 12345678 test2 43213423
Another file has millions of lines. Here is an example of how the above file1 entries appear in file2:
'99' 'databases' '**test**' '**12345678**' '1002' 'exchange' '**test2**' '**43213423**'
Is there a way to grep these 2 fields from file1 and find the lines in file2 that contain both? The gotcha is that I want to search for all 100,000 entries while passing through the 2nd file only once: looping grep over file1 is slow, a 100,000 x 10,000,000 loop.
Is that possible?
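For scale, the grep-per-line loop being ruled out looks roughly like this (a sketch using tiny stand-in files; the real file1/file2 contents are assumed from the question, with the bold markers dropped):

```shell
# Work in a scratch directory so the demo files don't clobber anything.
cd "$(mktemp -d)"
printf 'test 12345678 test2 43213423\n' > file1
printf "'99' 'databases' 'test' '12345678' '1002' 'exchange' 'test2' '43213423'\n" > file2

# Naive approach: one full grep pass over the huge file2 per file1 line,
# i.e. 100,000 passes over millions of lines.
while read -r f1 f2 _; do
  grep -F "'$f1' '$f2'" file2
done < file1
```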
You can do this in awk:
awk -F"['[:blank:]]+" 'NR == FNR { a[$1,$2]; next } $4 SUBSEP $5 in a' file1 file2
This first sets the field separator so that the quotes around the fields in the second file are consumed.
The first block applies to the first file only, setting the keys in the array a. The comma in the array index is translated to the control character SUBSEP within the key.
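As a quick illustration of that translation (standard awk behavior): a["x","y"] and a["x" SUBSEP "y"] name the same array element.

```shell
# The comma in a subscript is replaced by SUBSEP (default "\034"),
# so both spellings refer to one key.
awk 'BEGIN { a["x","y"] = 1; if (("x" SUBSEP "y") in a) print "same key" }'
# prints: same key
```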
Lines from the second file are printed when its third and fourth quoted values (with SUBSEP in between) match one of the keys. Due to the ' at the start of the line, the first field $1 is the empty string, so the fields we want are $4 and $5.
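Putting it together, a minimal runnable sketch (the sample data is invented to match the question's format, with a non-matching line added to show filtering):

```shell
# Work in a scratch directory so the demo files don't clobber anything.
cd "$(mktemp -d)"
printf 'test 12345678 test2 43213423\n' > file1
printf "'99' 'databases' 'test' '12345678' '1002' 'exchange' 'test2' '43213423'\n" > file2
printf "'7' 'other' 'nomatch' '00000000'\n" >> file2

# Single pass over each file: build keys from file1, then test $4/$5 of file2.
awk -F"['[:blank:]]+" 'NR == FNR { a[$1,$2]; next } $4 SUBSEP $5 in a' file1 file2
# Only the first file2 line is printed.
```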
If the fields are always quoted in the second file, you can do this instead:
awk -v q="'" 'NR == FNR { a[q $1 q, q $2 q]; next } $3 SUBSEP $4 in a' file1 file2
This inserts the quotes into the keys of the array a, so the fields in the second file match without having to consume the quotes.
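A matching sketch for the quoted-key variant (same invented sample data); with the default whitespace field separator, the quoted values of file2 are $3 and $4:

```shell
# Work in a scratch directory so the demo files don't clobber anything.
cd "$(mktemp -d)"
printf 'test 12345678 test2 43213423\n' > file1
printf "'99' 'databases' 'test' '12345678' '1002' 'exchange' 'test2' '43213423'\n" > file2

# Default FS keeps the quotes on file2's fields, so the quotes are
# built into the keys from file1 instead.
awk -v q="'" 'NR == FNR { a[q $1 q, q $2 q]; next } $3 SUBSEP $4 in a' file1 file2
# Prints the matching file2 line.
```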