design patterns - Workflow System with Azure Table Storage


I have a system that needs to run a simple workflow. For example:

  1. On Jan 1st at 08:15, trigger task A for object Z
  2. When triggered, run some code (implementation details are not important)
  3. Schedule task B for object Z to run at Jan 3rd 10:25 (and so on)

The workflow itself is simple, but I need to run 500,000+ instances, and that's the tricky part.

I know about Windows Workflow Foundation, and for that same reason I have chosen not to use it.

My initial design would use Azure Table Storage, and I would appreciate some feedback on the design.

The system would consist of 2 tables:

Table "jobs"
  PartitionKey: objectid
  RowKey: processon (UTC ticks in reverse, so the newest is on top)
  Attributes: state (pending, processed, error, skipped), etc...

Table "timetable"
  PartitionKey: yyyymmdd
  RowKey: yyyymmddhhmm_<guid>
  Attributes: job_partitionkey, job_rowkey

The idea is that the jobs table would hold the complete history of jobs per object, and the timetable would hold a list of jobs to run in the future.

Some assumptions:

  • A job will never span more than 1 object
  • There will only ever be 1 pending job per object
  • The "job" is lightweight, e.g. posting a message to a queue

The system must be able to perform these tasks:

  • Execute pending jobs

    1. Query for all records in "timetable" with "partition <= today" and "rowkey <= today"
    2. For each record (in parallel)
      1. Look up the job in the jobs table via PartitionKey and RowKey
      2. If it does "not exist" or state != pending, skip it
      3. Execute the "logic". If it fails => log it and maybe apply retry logic
      4. Submit "next run date in timetable"
      5. Submit "update state = processed" and "new job record (next run)" as a single transaction
    3. When finished => delete the processed timetable records

    Concern: only 2 of the 3 record modifications are in a transaction. Can this be overcome in any way?

  • Stop workflow: stop/pause the workflow for object Z

    1. Query the top 1 job in the jobs table by PartitionKey
    2. If it exists and state == pending, update it to "cancelled"
    3. (No need to bother cleaning the timetable; it will be cleaned up "when the time comes")
  • Start workflow

    1. Create a pending record in the jobs table
    2. Create a record in the timetable
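The three tasks above can be sketched with plain dictionaries standing in for the two tables. This is only an illustration under stated assumptions, not an Azure Table Storage implementation: the function names (`schedule`, `stop`, `execute_pending`), the "error" handling on failure, and comparing the RowKey prefix against the current minute are all hypothetical choices. It does show one possible answer to the transaction concern: the timetable row is written first, so if the jobs update were to fail, the orphaned row would point at a missing job and simply be skipped on a later pass.

```python
from datetime import datetime, timedelta, timezone
import uuid

# In-memory stand-ins for the two tables (assumption: real code would use table storage).
jobs = {}       # (object_id, process_on) -> {"state": ...}
timetable = {}  # (yyyymmdd, yyyymmddhhmm_guid) -> (object_id, process_on)


def _add_timetable(object_id, process_on):
    pk = process_on.strftime("%Y%m%d")
    rk = f"{process_on.strftime('%Y%m%d%H%M')}_{uuid.uuid4()}"
    timetable[(pk, rk)] = (object_id, process_on)


def schedule(object_id, process_on):
    """Start workflow: create a pending job, then its timetable row."""
    jobs[(object_id, process_on)] = {"state": "pending"}
    _add_timetable(object_id, process_on)


def stop(object_id):
    """Stop workflow: cancel the newest job if it is still pending."""
    keys = [k for k in jobs if k[0] == object_id]
    if keys:
        newest = max(keys, key=lambda k: k[1])
        if jobs[newest]["state"] == "pending":
            jobs[newest]["state"] = "cancelled"


def execute_pending(now, run_logic, next_run):
    """One executor pass; returns the number of timetable rows handled."""
    stamp = now.strftime("%Y%m%d%H%M")
    due = [k for k in timetable if k[1][:12] <= stamp]
    for key in due:
        object_id, process_on = timetable[key]
        job = jobs.get((object_id, process_on))
        if job is None or job["state"] != "pending":
            continue  # missing or cancelled job: skip, the row is deleted below
        try:
            run_logic(object_id)
        except Exception:
            job["state"] = "error"  # assumption: no retry in this sketch
            continue
        upcoming = next_run(process_on)
        _add_timetable(object_id, upcoming)  # non-transactional write goes first
        # Both writes below target the same partition (object_id) of the jobs
        # table, so in table storage they could be submitted as one batch.
        job["state"] = "processed"
        jobs[(object_id, upcoming)] = {"state": "pending"}
    for key in due:
        del timetable[key]  # step 3: delete the processed timetable rows
    return len(due)


# Example: start a workflow for object "z", then run one executor pass.
start = datetime(2024, 1, 1, 8, 15, tzinfo=timezone.utc)
schedule("z", start)
execute_pending(start + timedelta(minutes=5), run_logic=lambda object_id: None,
                next_run=lambda last: last + timedelta(days=2))
```

After the example pass, the first job is "processed", a new pending job exists two days later, and the timetable holds only the row for that next run.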

In terms of the "executing thing", I would be using an Azure Function or a scheduler-thing to execute the pending jobs every 5 minutes or so.

Any comments or suggestions would be highly appreciated.

Thanks!

How about using Service Bus instead? The BrokeredMessage class has a property called ScheduledEnqueueTimeUtc. You can schedule when you want your jobs to run via the ScheduledEnqueueTimeUtc property, and then fuggedabouddit. You can then have a triggered WebJob that monitors the Service Bus messaging queue, and it will be triggered very near to when the job message is enqueued. I'm a big fan of relying on existing services to minimize the coding needed.
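To make the scheduled-enqueue idea concrete, here is a toy in-memory model of it. This is not the Service Bus API; the `ScheduledQueue` class and its methods are hypothetical and only mimic the behaviour described above, where a message stays invisible to receivers until its scheduled time has passed.

```python
import heapq
from datetime import datetime, timedelta, timezone


class ScheduledQueue:
    """Toy queue: a message becomes visible only at its scheduled time."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal times never compare message bodies

    def send(self, body, scheduled_enqueue_time_utc):
        heapq.heappush(self._heap, (scheduled_enqueue_time_utc, self._seq, body))
        self._seq += 1

    def receive(self, now):
        """Return the bodies that are due at `now`, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


# Example: task A is due on Jan 1st 08:15, task B on Jan 3rd 10:25.
queue = ScheduledQueue()
jan1 = datetime(2024, 1, 1, 8, 15, tzinfo=timezone.utc)
jan3 = jan1 + timedelta(days=2, hours=2, minutes=10)
queue.send({"object": "z", "task": "b"}, jan3)
queue.send({"object": "z", "task": "a"}, jan1)
first_batch = queue.receive(jan1)  # only task A is visible at this point
```

With a real queue the handler for task A would itself send the message for task B, so there would be no timetable to poll and clean up.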

