How to count across DStreams in Apache Spark?


I have the following question.

Imagine there is a DStream of JSON strings coming in, and I apply a few different filters in parallel on the same DStream (so the filters are not applied one after the other). Here is an example, if it helps (checkKeySet1 and checkKeySet2 are placeholders for the actual key checks):

JavaDStream<String> filteredStream1 = dstream.filter(x -> checkKeySet1(x)); // check first set of keys
JavaDStream<String> filteredStream2 = dstream.filter(x -> checkKeySet2(x)); // check second set of keys

What I cannot do is chain the filters, i.e. dstream.filter(x -> { check first key set }).filter(x -> { check second key set }).

Now I want to be able to count the number of elements in filteredStream1 and filteredStream2 and combine the results into one message, as follows:

{"filteredstream1" : 50,  "filteredstream2": 25} 

Is there an easy way to do this, such as leveraging rdd.count() across the streams, or should I use mapToPair and reduceByKey? Rough sketches of both ideas follow.
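
Leveraging rdd.count() across the streams could, I think, look roughly like the sketch below. It assumes filteredStream1 and filteredStream2 are the JavaDStream<String>s from above; transformWith lines up the per-batch RDDs of both streams, so each side can be counted in one place and emitted as a single message:

import java.util.Collections;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.api.java.JavaDStream;

// Pair up the per-batch RDDs of both streams, count each side, and
// emit the combined counts as one JSON message per batch.
JavaDStream<String> combinedCounts = filteredStream1.transformWith(
        filteredStream2,
        (rdd1, rdd2, time) -> {
            long count1 = rdd1.count();
            long count2 = rdd2.count();
            String message = "{\"filteredstream1\": " + count1
                    + ", \"filteredstream2\": " + count2 + "}";
            // Wrap the single message in an RDD so it flows on as a DStream.
            JavaSparkContext jsc = JavaSparkContext.fromSparkContext(rdd1.context());
            return jsc.parallelize(Collections.singletonList(message));
        });
combinedCounts.print();

One thing I am unsure about is that rdd.count() here runs as an action from the driver on every batch interval; I don't know whether that is acceptable in this setup.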
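
The mapToPair/reduceByKey route I have in mind would look something like this (again only a sketch, with the same stream names): tag every element with the name of the stream it came from, union the two streams, and sum the tags per batch:

import java.util.Map;

import org.apache.spark.streaming.api.java.JavaPairDStream;

import scala.Tuple2;

// Tag every element with the name of the stream it came from.
JavaPairDStream<String, Long> tagged1 =
        filteredStream1.mapToPair(x -> new Tuple2<>("filteredstream1", 1L));
JavaPairDStream<String, Long> tagged2 =
        filteredStream2.mapToPair(x -> new Tuple2<>("filteredstream2", 1L));

// Union the streams and sum the tags per batch.
JavaPairDStream<String, Long> counts = tagged1.union(tagged2)
        .reduceByKey((a, b) -> a + b);

// Collect the (at most two) per-batch pairs and format the single message.
counts.foreachRDD(rdd -> {
    Map<String, Long> m = rdd.collectAsMap();
    System.out.println("{\"filteredstream1\": " + m.getOrDefault("filteredstream1", 0L)
            + ", \"filteredstream2\": " + m.getOrDefault("filteredstream2", 0L) + "}");
});

The getOrDefault calls cover batches where one of the filtered streams is empty, since reduceByKey would then produce no pair for that key.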

