Kumofire Jobs Architecture
Scope
This document defines the architecture and protocol direction for Kumofire Jobs.
The initial execution foundation remains:
- create
- dispatch
- consume
- retry
- status management
The architecture should also leave room for:
- recurring jobs
- manual runs
- future multi-step orchestration
Runtime Model
Kumofire Jobs assumes the following runtime components:
- application side
  - manages job definitions
  - creates manual runs where needed
  - stores `kumofire_job_run_id` in its own records
  - reads run status and history through the Kumofire API
- dispatcher side
  - scans due schedules and due runs
  - creates runs from schedules
  - dispatches due runs to the queue
- consumer side
  - consumes queue messages
  - executes handlers
  - updates run status
- storage adapter
  - provides the single source of truth for definitions, schedules, and runs
- queue adapter
  - delivers `kumofireJobRunId` to the consumer
These components may live in a single Worker or be split across multiple Workers. The initial implementation treats Cloudflare D1 and Cloudflare Queues as the first adapters.
Protocol
The protocol is intended to be language-agnostic. It is defined around schema, queue message format, and state transitions rather than TypeScript-specific helpers.
Entities
Job
A Job is the definition of work.
It answers:
- what this job is
- which handler should run
- what default input or options are associated with it
Recommended logical fields:
`id`, `name`, `handler`, `payload_template`, `default_options`, `created_at`, `updated_at`
This row should hold definition-oriented information and remain relatively stable.
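As a sketch, these logical fields could be expressed as a TypeScript type. The concrete types (string IDs, ISO timestamps, JSON objects) and the example values are illustrative assumptions, not part of the protocol:

```typescript
// Sketch of a Job definition row. Field names mirror the recommended
// logical fields; the types are illustrative assumptions.
type Job = {
  id: string;                                       // e.g. a ULID
  name: string;
  handler: string;                                  // key the consumer uses to resolve a handler
  payload_template: Record<string, unknown> | null; // default input merged into runs
  default_options: Record<string, unknown> | null;  // e.g. retry settings
  created_at: string;                               // ISO 8601
  updated_at: string;
};

const exampleJob: Job = {
  id: "01JEXAMPLE",
  name: "send-welcome-email",
  handler: "email.sendWelcome",
  payload_template: { locale: "en" },
  default_options: { maxAttempts: 3 },
  created_at: "2024-01-01T00:00:00Z",
  updated_at: "2024-01-01T00:00:00Z",
};
```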
Job Schedule
A Job Schedule is a rule that decides when a job should be started.
It answers:
- when this job should run
- whether the schedule is enabled
- what the next due time is
Recommended logical fields:
`id`, `job_id`, `schedule_type` (`once` | `interval` | `cron`), `schedule_expr`, `timezone`, `next_run_at`, `last_scheduled_at`, `enabled`, `created_at`, `updated_at`
The important point is that this is not the execution itself. It is the rule that creates or authorizes executions.
One job may have multiple schedules.
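A minimal TypeScript sketch of the schedule row, plus the due check a dispatcher would apply. The type shapes and the `isDue` helper are assumptions of this sketch, not the real API:

```typescript
// Sketch of a Job Schedule row.
type ScheduleType = "once" | "interval" | "cron";

type JobSchedule = {
  id: string;
  job_id: string;
  schedule_type: ScheduleType;
  schedule_expr: string;        // e.g. a cron expression when schedule_type = "cron"
  timezone: string;             // IANA zone name, e.g. "Asia/Tokyo"
  next_run_at: string | null;   // assumed null once a "once" schedule has fired
  last_scheduled_at: string | null;
  enabled: boolean;
  created_at: string;
  updated_at: string;
};

// A schedule is due when it is enabled and its next due time has passed.
function isDue(s: JobSchedule, now: Date): boolean {
  return (
    s.enabled &&
    s.next_run_at !== null &&
    new Date(s.next_run_at).getTime() <= now.getTime()
  );
}
```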
Job Run
A Job Run is one concrete execution instance.
It answers:
- which job ran
- which schedule, if any, caused it
- when it was supposed to run
- what actually happened
Recommended logical fields:
`id`, `job_id`, `job_schedule_id` (nullable for manual runs), `status`, `scheduled_for`, `started_at`, `finished_at`, `attempt`, `input_payload`, `output_payload`, `error_message`, `created_at`, `updated_at`
This is the execution unit of the system. Queue dispatch and consumption should operate on Job Runs, not Job definitions.
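A TypeScript sketch of the run row, together with the dispatchability check implied by the fields. Shapes are illustrative assumptions:

```typescript
// Sketch of a Job Run row: one concrete execution instance.
type RunStatus = "scheduled" | "queued" | "running" | "succeeded" | "failed" | "canceled";

type JobRun = {
  id: string;
  job_id: string;
  job_schedule_id: string | null; // null for manual runs
  status: RunStatus;
  scheduled_for: string;          // when the run was supposed to start
  started_at: string | null;
  finished_at: string | null;
  attempt: number;
  input_payload: Record<string, unknown> | null;
  output_payload: Record<string, unknown> | null;
  error_message: string | null;
  created_at: string;
  updated_at: string;
};

// A run is dispatchable when it is still `scheduled` and its due time has passed.
function isDispatchable(run: JobRun, now: Date): boolean {
  return (
    run.status === "scheduled" &&
    new Date(run.scheduled_for).getTime() <= now.getTime()
  );
}
```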
Job Lock
A Job Lock is a lease used to suppress duplicate execution of the same Job Run.
Required logical fields:
`job_run_id`, `lease_until`, `updated_at`
The lock is treated as a lease, not as a permanent lock. The design does not assume perfect atomicity between run state updates and lock release.
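An in-memory sketch of lease-based acquisition. A real adapter would use a conditional write instead (for example an `UPDATE ... WHERE lease_until <= now` in D1); the `Map`-based store and function names here are illustrative:

```typescript
// Lease-based lock sketch: a lock row per job run id.
type JobLock = { job_run_id: string; lease_until: number; updated_at: number };

const locks = new Map<string, JobLock>();

// Acquire the lock only when no live lease exists. An expired lease may be
// taken over, which is what makes a failed lock release recoverable.
function acquireLease(jobRunId: string, nowMs: number, leaseMs: number): boolean {
  const existing = locks.get(jobRunId);
  if (existing && existing.lease_until > nowMs) return false; // live lease held elsewhere
  locks.set(jobRunId, {
    job_run_id: jobRunId,
    lease_until: nowMs + leaseMs,
    updated_at: nowMs,
  });
  return true;
}
```

Because explicit release can fail, recovery relies on the lease simply expiring rather than on an unlock step.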
Queue Message
The queue message payload should be minimal.
```json
{ "kumofireJobRunId": "01J..." }
```

The queue message must not carry the full payload or job state. It should deliver only `kumofireJobRunId`, while the source of truth remains in the storage adapter. It must not expose `jobId`, payload, schedule metadata, or other internal identifiers.
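A consumer-side decoder for this message can be sketched as follows. The function name is illustrative; the point is that only `kumofireJobRunId` is trusted and anything else in the body is ignored:

```typescript
// Minimal queue-message decoding on the consumer side.
type QueueMessage = { kumofireJobRunId: string };

function parseQueueMessage(raw: string): QueueMessage {
  const data = JSON.parse(raw) as Record<string, unknown>;
  const id = data["kumofireJobRunId"];
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("invalid queue message: missing kumofireJobRunId");
  }
  // Deliberately drop any other fields; the storage adapter is the
  // source of truth for everything else.
  return { kumofireJobRunId: id };
}
```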
Run Status Model
The initial run statuses are:
`scheduled`, `queued`, `running`, `succeeded`, `failed`, `canceled`
Meaning:
- `scheduled`
  - run exists in storage
  - not yet enqueued
  - becomes dispatchable when `scheduled_for <= now`
- `queued`
  - already dispatched to the queue
  - waiting for consumer pickup
- `running`
  - acquired by a consumer and currently executing a handler
- `succeeded`
  - completed successfully
- `failed`
  - ended without retry, or exceeded the retry limit
- `canceled`
  - removed from execution by an external action
State Transitions
Base transitions for a Job Run:
- create run
  - a new run is created as `scheduled`
- dispatch
  - a run with `status = scheduled` and `scheduled_for <= now` is sent to the queue
  - dispatch moves the run to `queued`
- consume start
  - after acquiring a lock, the consumer moves the run from `queued` to `running`
- success
  - `running -> succeeded`
- retry
  - `running -> scheduled`
  - increments `attempt` and sets a new `scheduled_for`
- terminal failure
  - `running -> failed`
State transitions should be idempotent and must tolerate duplicate processing under at-least-once delivery.
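The transition rules above can be captured as a table plus a check. Treating a repeated transition (`from === to`) as a harmless no-op is one simple way to tolerate duplicate delivery; allowing `canceled` from every non-terminal state is an assumption of this sketch, since the document does not pin down which states can be canceled:

```typescript
type RunStatus = "scheduled" | "queued" | "running" | "succeeded" | "failed" | "canceled";

// Allowed forward transitions, per the list above.
const allowed: Record<RunStatus, RunStatus[]> = {
  scheduled: ["queued", "canceled"],
  queued: ["running", "canceled"],
  running: ["succeeded", "failed", "scheduled", "canceled"], // "scheduled" = retry
  succeeded: [],
  failed: [],
  canceled: [],
};

// Re-applying a transition that already happened (from === to) is a no-op,
// which keeps duplicate queue deliveries harmless.
function canTransition(from: RunStatus, to: RunStatus): boolean {
  return from === to || allowed[from].includes(to);
}
```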
Create Protocol
`jobs.create(...)` should create a Job Run.
For a manual or one-shot invocation it does the following:
- resolves the Job definition
- creates a `job_runs` row
- sets `job_schedule_id = null` for direct manual creation
- stores `runAt` into `scheduled_for` if provided
- may dispatch immediately when `runAt <= now`
- only registers the run when `runAt > now`, leaving later dispatch to the dispatcher
When deduplication is used, deduplication applies to run creation, not to the Job definition itself.
The application-facing identifier produced here is the Job Run ID. Applications should persist that value in their own records as `kumofire_job_run_id`. Applications should not persist or depend on Kumofire internal identifiers beyond that boundary.
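Under these rules, the manual-create decision can be sketched as follows. The function shape, the random id scheme, and the return value are illustrative, not the real `jobs.create(...)` signature:

```typescript
// Hypothetical sketch of manual run creation.
function createRun(jobId: string, now: Date, runAt?: Date) {
  const scheduledFor = runAt ?? now;          // runAt is stored into scheduled_for
  const run = {
    id: `run_${Math.random().toString(36).slice(2)}`,
    job_id: jobId,
    job_schedule_id: null as string | null,   // direct manual creation
    status: "scheduled" as const,
    scheduled_for: scheduledFor.toISOString(),
    attempt: 0,
  };
  // Dispatch immediately only when the run is already due; otherwise it is
  // merely registered and the dispatcher picks it up later.
  const dispatchNow = scheduledFor.getTime() <= now.getTime();
  return { run, dispatchNow };
}
```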
Schedule Dispatch Protocol
The dispatcher has two responsibilities.
1. Dispatch due runs
Select Job Runs that satisfy:
- `status = scheduled`
- `scheduled_for <= now`

Then send each selected `kumofireJobRunId` to the queue.
2. Materialize due schedules
Select Job Schedules that satisfy:
- `enabled = true`
- `next_run_at <= now`
For each due schedule:
- create a Job Run
- stamp `job_id`
- stamp `job_schedule_id`
- set `scheduled_for`
- advance `next_run_at` to the next due time
This allows one global scheduler trigger to handle:
- one-shot delayed runs
- retries
- recurring jobs
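Both responsibilities can be sketched as one in-memory tick. The row shapes, millisecond timestamps, fixed-interval advancement, and the `send` callback are stand-ins for the real adapters:

```typescript
// In-memory sketch of one dispatcher tick.
type Run = { id: string; status: string; scheduled_for: number };
type Schedule = {
  id: string;
  job_id: string;
  enabled: boolean;
  next_run_at: number;
  interval_ms: number; // this sketch only models fixed-interval schedules
};

function dispatcherTick(
  runs: Run[],
  schedules: Schedule[],
  nowMs: number,
  send: (kumofireJobRunId: string) => void,
): void {
  // 1. dispatch due runs: scheduled runs whose due time has passed
  for (const run of runs) {
    if (run.status === "scheduled" && run.scheduled_for <= nowMs) {
      send(run.id);            // the queue message carries only the run id
      run.status = "queued";
    }
  }
  // 2. materialize due schedules into new scheduled runs
  for (const s of schedules) {
    if (s.enabled && s.next_run_at <= nowMs) {
      runs.push({
        id: `run_${s.id}_${s.next_run_at}`,
        status: "scheduled",
        scheduled_for: s.next_run_at,
      });
      s.next_run_at += s.interval_ms; // advance to the next due time
    }
  }
}
```

In this ordering a run materialized during a tick is dispatched on the following tick, which keeps each phase simple and idempotent.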
Consume Protocol
For each queue message, `jobs.consume(...)` does the following:
- validates the schema version
- reads the Job Run row from `kumofireJobRunId`
- resolves the Job definition from `job_id`
- acquires a Job Lock using a lease
- moves the Job Run from `queued` to `running`
- restores the input payload
- executes the handler
- moves the Job Run to `succeeded` on success
- increments `attempt` and sets `scheduled` plus a new `scheduled_for` when retryable
- moves the Job Run to `failed` when not retryable
Lock release may fail, so the design relies on lease expiry for natural recovery. Any remaining chance of re-execution must ultimately be absorbed by handler idempotency.
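The retry branch of the consume flow can be sketched as a pure decision function. The `maxAttempts` cap and the exponential backoff curve are assumptions of this sketch; the protocol itself only requires incrementing `attempt` and setting a new `scheduled_for`:

```typescript
// Decide what happens to a run after its handler fails.
function afterFailure(
  run: { attempt: number; scheduled_for: number },
  nowMs: number,
  maxAttempts: number,
) {
  const attempt = run.attempt + 1;
  if (attempt < maxAttempts) {
    // Retryable: back to `scheduled` with a later due time, so the same
    // dispatcher loop that handles one-shot delays also handles retries.
    const backoffMs = 1000 * 2 ** attempt; // illustrative backoff curve
    return { ...run, attempt, status: "scheduled", scheduled_for: nowMs + backoffMs };
  }
  // Retry limit exceeded: terminal failure.
  return { ...run, attempt, status: "failed" };
}
```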
Ownership Boundaries
Kumofire Jobs owns:
- job definition protocol
- schedule protocol
- run state
- retry state
- lock state
- dispatch / consume protocol
The application owns:
- handler implementation
- business results produced by handlers
- API exposure and authorization
- its own persistence of `kumofire_job_run_id`
The application boundary is intentionally narrow:
- Kumofire exposes `kumofireJobRunId`
- the application stores it as `kumofire_job_run_id`
- the application fetches job status or run details through the Kumofire API using that value
The application should not directly read Kumofire tables with SQL, and should not couple itself to `job_id`, schedule IDs, retry metadata, or queue payload details. The application reads final business results from its own tables or storage after completion.
Adapters
Kumofire Jobs core does not depend directly on a specific database or message queue. Instead, it implements the protocol through a storage adapter and a queue adapter interface.
Storage Adapter
The storage adapter is responsible for at least:
- managing Job definitions
- managing Job schedules
- creating Job Runs
- fetching Job Runs and Job definitions
- persisting run status transitions
- listing dispatchable runs
- listing due schedules
- updating retry state
- advancing schedule state
- acquiring and expiring lease locks
- validating schema version
The initial implementation uses a D1 adapter.
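One possible shape for this contract, shown with an in-memory reference implementation. Method names and row types are illustrative assumptions derived from the responsibilities above, not the real adapter API, and the list is trimmed to the run-centric subset:

```typescript
// Hypothetical storage adapter contract (run-centric subset).
type RunRow = { id: string; job_id: string; status: string; scheduled_for: string };

interface StorageAdapter {
  getRun(runId: string): Promise<RunRow | null>;
  createRun(run: RunRow): Promise<void>;
  updateRunStatus(runId: string, status: string): Promise<void>;
  listDispatchableRuns(now: Date): Promise<RunRow[]>;
  acquireLease(runId: string, leaseUntilMs: number): Promise<boolean>;
}

// In-memory reference implementation, useful for tests; a D1 adapter would
// back each method with SQL against the corresponding tables.
class InMemoryStorage implements StorageAdapter {
  private runs = new Map<string, RunRow>();
  private leases = new Map<string, number>();

  async getRun(runId: string) { return this.runs.get(runId) ?? null; }
  async createRun(run: RunRow) { this.runs.set(run.id, run); }
  async updateRunStatus(runId: string, status: string) {
    const run = this.runs.get(runId);
    if (run) run.status = status;
  }
  async listDispatchableRuns(now: Date) {
    return [...this.runs.values()].filter(
      (r) => r.status === "scheduled" && new Date(r.scheduled_for).getTime() <= now.getTime(),
    );
  }
  async acquireLease(runId: string, leaseUntilMs: number) {
    const current = this.leases.get(runId) ?? 0;
    if (current > Date.now()) return false; // live lease held elsewhere
    this.leases.set(runId, leaseUntilMs);
    return true;
  }
}
```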
Queue Adapter
The queue adapter is responsible for at least:
- sending `kumofireJobRunId` messages
- receiving messages in a form the consumer can interpret
The initial implementation uses a Cloudflare Queues adapter.
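A possible shape for the send side of this contract, with an in-memory stub. With Cloudflare Queues the receive side is the runtime-invoked queue handler, so this sketch (whose names are assumptions) only models `send`:

```typescript
// Hypothetical queue adapter contract; the body carries only the run id.
interface QueueAdapter {
  send(message: { kumofireJobRunId: string }): Promise<void>;
}

// In-memory stub, useful for wiring the dispatcher in tests.
class InMemoryQueue implements QueueAdapter {
  sent: string[] = [];
  async send(message: { kumofireJobRunId: string }): Promise<void> {
    this.sent.push(message.kumofireJobRunId);
  }
}
```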
Design Intent
This separation keeps the core protocol and job lifecycle vendor-agnostic while still allowing Cloudflare to be the first concrete adapter implementation.
The three-way split:
`jobs`, `job_schedules`, `job_runs`
is the intended long-term model because it separates:
- what the job is
- when it should run
- what actually happened