Python: something faster than "not in" for large lists?


I'm doing a project with word lists. I want to combine two word lists and store only the unique words.

I'm reading words from a file, and it seems to take a long time to read the file and store the words in a list. I intend to copy the same block of code and run it on a second (or subsequent) word file. The slow part of the code looks like this:

    while inline != "":
        inline = inline.strip()
        if inline not in inlist:
            inlist.append(inline)
        inline = infile.readline()

Please correct me if I'm wrong, but I think the slow(est) part of the program is the `not in` comparison. Are there ways I can rewrite this to make it faster?

Judging by this line:

    if inline not in inlist:
        inlist.append(inline)

it looks like you are enforcing uniqueness in the `inlist` container. You should consider using a more efficient data structure, such as a `set`, whose membership test is O(1) on average rather than the O(n) scan that `not in` performs on a list. The `not in` check can then be discarded as redundant, because duplicates are prevented by the container anyway.
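A minimal sketch of the loop rewritten with a set. The file contents here are simulated with `io.StringIO`; in the real program this would be the open word-list file:

```python
import io

# Simulated word file; stands in for open("wordlist.txt")
infile = io.StringIO("apple\nbanana\napple\ncherry\n")

# A set gives O(1) average membership checks, so duplicates are
# filtered without the O(n) list scan that `not in` performs.
words = set()
for inline in infile:
    inline = inline.strip()
    if inline:            # skip blank lines
        words.add(inline)  # adding an existing word is a no-op

print(sorted(words))  # → ['apple', 'banana', 'cherry']
```

Iterating over the file object directly also avoids the manual `readline()` loop, which is both faster and harder to get wrong.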

If insertion ordering must be preserved, you can achieve a similar result using an `OrderedDict` with null values: the keys behave like an ordered set.
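A sketch of the order-preserving variant, again with a simulated file:

```python
import io
from collections import OrderedDict

# Simulated word file; stands in for the real open word-list file
infile = io.StringIO("zebra\napple\nzebra\nmango\n")

# OrderedDict keys act as an ordered set: duplicates are dropped
# but the first-seen order of the words is preserved.
seen = OrderedDict()
for inline in infile:
    inline = inline.strip()
    if inline:
        seen[inline] = None  # value is irrelevant; the keys carry the data

print(list(seen))  # → ['zebra', 'apple', 'mango']
```

On Python 3.7+ a plain `dict` also preserves insertion order, so `seen = {}` would work the same way there.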

