How to count across DStreams in Apache Spark?
I have the following question.
Imagine there is a DStream of JSON strings coming in, and I apply a few different filters in parallel on the same DStream (so these filters are not applied one after the other). Here is some pseudo-code, if it helps:
dstream.filter(x -> { check set of keys }) -> filteredstream1
dstream.filter(x -> { check set of keys }) -> filteredstream2
but not the chained form dstream.filter(x -> { check set of keys }).filter(x -> { check set of keys }).
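For concreteness, here is a minimal runnable sketch of that parallel-filter setup. It assumes a JavaStreamingContext reading from a socket source, and the "keyA"/"keyB" checks are hypothetical stand-ins for the real key predicates:

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("CountAcrossDStreams");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

    // Assumed source for the sketch; any JSON-string DStream works the same way.
    JavaDStream<String> dstream = jssc.socketTextStream("localhost", 9999);

    // Both filters branch off the same parent DStream, so each batch is
    // evaluated by both predicates independently, not one after the other.
    JavaDStream<String> filteredstream1 = dstream.filter(json -> json.contains("\"keyA\"")); // hypothetical key check
    JavaDStream<String> filteredstream2 = dstream.filter(json -> json.contains("\"keyB\"")); // hypothetical key check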
Now I want to be able to count the number of elements in filteredstream1 and filteredstream2, and combine the results into one message, as follows:
{"filteredstream1": 50, "filteredstream2": 25}
Is there an easy way to do this, for example by leveraging rdd.count() across the streams, or should I use mapToPair and reduceByKey?
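One way that fits the mapToPair/reduceByKey idea from the question: tag every element of each filtered stream with the stream's name, union the two pair streams, and reduce by key, so each batch yields at most one count per stream. A hedged sketch, building on the filteredstream1/filteredstream2 variables above (getOrDefault covers batches where a stream matched nothing):

    import java.util.Map;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import scala.Tuple2;

    // Tag each element with its stream's name so the counts can be merged later.
    JavaPairDStream<String, Long> tagged1 =
            filteredstream1.mapToPair(x -> new Tuple2<>("filteredstream1", 1L));
    JavaPairDStream<String, Long> tagged2 =
            filteredstream2.mapToPair(x -> new Tuple2<>("filteredstream2", 1L));

    // Union the two pair streams and sum per stream name within each batch.
    JavaPairDStream<String, Long> counts = tagged1.union(tagged2).reduceByKey(Long::sum);

    counts.foreachRDD(rdd -> {
        Map<String, Long> perStream = rdd.collectAsMap();
        String message = String.format("{\"filteredstream1\": %d, \"filteredstream2\": %d}",
                perStream.getOrDefault("filteredstream1", 0L),
                perStream.getOrDefault("filteredstream2", 0L));
        System.out.println(message); // or publish the message wherever it needs to go
    });

    jssc.start();
    jssc.awaitTermination();

Alternatively, filteredstream1.count() gives a JavaDStream<Long> of per-batch counts, but merging two such streams into a single message requires transformWith or a join, so the pair-based approach above is usually the simpler route.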