Python: something faster than "not in" for large lists?
I'm doing a project with word lists. I want to combine two word lists and store only the unique words.
I'm reading the words from a file, and it seems to take a long time to read the file and store the words in a list. I intend to copy the same block of code and run it again using the second (or subsequent) word files. The slow part of the code looks like this:
while inline!= "": inline = inline.strip() if inline not in inlist: inlist.append(inline) inline = infile.readline()
Please correct me if I'm wrong, but I think the slow(est) part of the program is the "not in" comparison. Are there ways I can rewrite it to make it faster?
Judging from this line:
if inline not in inlist: inlist.append(inline)
it looks like you are enforcing uniqueness in the inlist container. You should consider using a more efficient data structure, such as a set. The not in check can then be discarded as redundant, because duplicates are prevented by the container anyway.
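A minimal sketch of the set-based approach (the function name and file path are my own for illustration): membership tests and insertions on a set are O(1) on average, versus O(n) for a list, so the whole pass becomes roughly linear.

```python
def unique_words(path):
    """Read a word-per-line file and return the set of unique words."""
    words = set()
    with open(path) as infile:
        for line in infile:
            word = line.strip()
            if word:              # skip blank lines
                words.add(word)   # a set silently ignores duplicates
    return words
```

Merging two word files then falls out for free with set union, e.g. `unique_words("list1.txt") | unique_words("list2.txt")`.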
if insertion ordering must preserved, can achieve similar result using ordereddict
null values.
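A sketch of the ordered variant (again with a hypothetical function name): the keys of an OrderedDict are unique and remember first-insertion order, and the values are just None placeholders.

```python
from collections import OrderedDict

def unique_words_ordered(path):
    """Return unique words from a file, in first-seen order."""
    seen = OrderedDict()
    with open(path) as infile:
        for line in infile:
            word = line.strip()
            if word:
                seen[word] = None   # value is a throwaway placeholder
    return list(seen)               # keys only, in insertion order
```

Note that on Python 3.7+ a plain dict also preserves insertion order, so `dict` works there too; OrderedDict makes the intent explicit and is portable to older versions.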