shell - search for 2 fields from a file in another huge file, passing the 2nd file only once
file1 has 100,000 lines. Each line has 2 fields, such as:

test 12345678
test2 43213423

Another file has millions of lines. Here is an example of how the above file1 entries appear in file2:

'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'

I could grep for these 2 fields from file1 to find the lines that contain both. The gotcha is that I need to search for all 100,000 entries while passing through the 2nd file only once, because looping grep is slow: it would mean 100,000 x 10,000,000 line comparisons.

Is it possible?
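For reference, the per-entry loop being avoided looks roughly like this (a hypothetical sketch with tiny sample files; it rescans file2 once for every line of file1):

```shell
# Hypothetical sample data mirroring the question's format.
cat > file1 <<'EOF'
test 12345678
test2 43213423
EOF

cat > file2 <<'EOF'
'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'
'7' 'other' 'test' '99999999'
EOF

# Slow approach: one full pass over file2 for EVERY line of file1
# (100,000 passes over a 10,000,000-line file in the real case).
while read -r f1 f2; do
    grep -F "'$f1' '$f2'" file2
done < file1
```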
You can do it in awk:
awk -F"['[:blank:]]+" 'NR == FNR { a[$1,$2]; next } $4 SUBSEP $5 in a' file1 file2

First, set the field separator so that the quotes around the fields in the second file are consumed.
The first block applies to the first file only and sets keys in the array a. The comma in the array index translates to the control character SUBSEP in the key.
Lines from the second file are printed when the third and fourth fields (with SUBSEP in between) match one of the keys. Due to the ' at the start of each line, the first field $1 is the empty string, so the fields we want are $4 and $5.
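A quick way to sanity-check the quote-consuming approach, using small hypothetical sample files:

```shell
cat > file1 <<'EOF'
test 12345678
test2 43213423
EOF

cat > file2 <<'EOF'
'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'
'7' 'other' 'test' '99999999'
EOF

# Quotes and blanks are both separators, so $1 is the empty string
# before the leading quote and the fields of interest are $4 and $5.
awk -F"['[:blank:]]+" 'NR == FNR { a[$1,$2]; next } $4 SUBSEP $5 in a' file1 file2
```

Only the first two file2 lines are printed; the third contains a pair that never occurs in file1. file2 is read exactly once, and each lookup is a hash-table membership test rather than a scan.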
Since the fields are quoted in the second file, you can alternatively do:

awk -v q="'" 'NR == FNR { a[q $1 q, q $2 q]; next } $3 SUBSEP $4 in a' file1 file2

This inserts the quotes into the keys of array a, so the fields in the second file match without having to consume the quotes.
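The same hypothetical sample files check the quote-inserting variant; the output is identical, but the default whitespace field splitting is kept:

```shell
cat > file1 <<'EOF'
test 12345678
test2 43213423
EOF

cat > file2 <<'EOF'
'99' 'databases' 'test' '12345678'
'1002' 'exchange' 'test2' '43213423'
'7' 'other' 'test' '99999999'
EOF

# The quote character q is concatenated into each key built from
# file1, so the still-quoted $3 and $4 of file2 match directly.
awk -v q="'" 'NR == FNR { a[q $1 q, q $2 q]; next } $3 SUBSEP $4 in a' file1 file2
```

With default splitting, the fields shift down by one ($3 and $4 instead of $4 and $5), because there is no longer an empty field before the leading quote.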