
Cron Jobs

What is cronapp.py?

cronapp.py is a separate Flask application from flaskapp.py. It runs on a schedule (configured externally in AWS) and performs operations that need to happen periodically across all agency databases.

The critical difference from flaskapp.py:

  • flaskapp.py connects to one agency’s database (the instance’s dedicated DB)
  • cronapp.py connects to all agency databases in sequence

The multi-database iteration pattern

# Pseudocode of cronapp.py's main pattern:
databases = get_all_agency_databases()  # SHOW DATABASES filtered to agency DBs
for db_name in databases:
    try:
        engine = create_engine(f"mysql+pymysql://{user}:{password}@{host}/{db_name}")
        with app.app_context():
            db.session.bind = engine
            run_job_for_this_agency(db_name)
    except Exception as e:
        log_error(e)
        continue  # never let one agency's failure stop others
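The failure-isolation part of this pattern can be tried without a database. The following is a minimal sketch, not the code in cronapp.py: `filter_agency_databases`, `run_for_all_agencies`, and `demo_job` are hypothetical names, and the filter only assumes that MySQL's built-in schemas must be excluded from the `SHOW DATABASES` output (the real filter may be stricter).

```python
import logging
from typing import Callable, Dict, Iterable, List

log = logging.getLogger("cron")

# MySQL's built-in schemas; SHOW DATABASES returns these next to agency DBs.
SYSTEM_SCHEMAS = {"information_schema", "mysql", "performance_schema", "sys"}


def filter_agency_databases(all_names: Iterable[str]) -> List[str]:
    """Drop system schemas (assumed rule; the real filter may differ)."""
    return sorted(n for n in all_names if n.lower() not in SYSTEM_SCHEMAS)


def run_for_all_agencies(
    db_names: Iterable[str], job: Callable[[str], None]
) -> Dict[str, List[str]]:
    """Run job once per agency; one agency's failure never stops the rest."""
    summary: Dict[str, List[str]] = {"succeeded": [], "failed": []}
    for db_name in db_names:
        try:
            job(db_name)
        except Exception:
            # Log with traceback, remember the failure, move on.
            log.exception("cron job failed for %s", db_name)
            summary["failed"].append(db_name)
            continue
        summary["succeeded"].append(db_name)
    return summary


def demo_job(db_name: str) -> None:
    """Hypothetical job that fails for one agency only."""
    if db_name == "agency_b":
        raise RuntimeError("simulated outage")


names = filter_agency_databases(["mysql", "agency_b", "sys", "agency_a", "agency_c"])
result = run_for_all_agencies(names, demo_job)
```

Returning a summary rather than only logging makes it easy to see at the end of a run which agencies were skipped.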

Files

  • cronapp.py (42 KB): the main cron application
  • orchid/
    • utils/
      • cron_metrics.py: cron job performance tracking

Adding a new cron job

If a cron job is truly necessary, follow this pattern:

def my_new_cron_job(db_name, engine):
    """
    Brief description of what this job does.
    Runs once per agency per cron schedule.
    """
    try:
        # Use db_name and engine to query this agency's database.
        # `case` is a reserved word in MySQL, so the table name needs backticks.
        results = db.session.execute(
            text("SELECT id FROM `case` WHERE status = 'active'")
        ).fetchall()
        for row in results:
            # process each result
            pass
        cron_metrics.record_success('my_new_cron_job', db_name)
    except Exception as e:
        cron_metrics.record_failure('my_new_cron_job', db_name, e)
        log.error(f"my_new_cron_job failed for {db_name}: {e}")
        # Do NOT re-raise; let the loop continue to the next agency

# In the main cron loop in cronapp.py:
# (engine_for() is assumed to be a context manager that yields the agency's engine)
for db_name in databases:
    with engine_for(db_name) as engine:
        my_new_cron_job(db_name, engine)

Checklist for new cron jobs

  • Error handling with try/except that does not re-raise
  • continue on failure so the next agency still runs
  • cron_metrics.record_success/failure calls for monitoring
  • Tested against a single agency database before deploying
  • No external API calls inside the per-agency loop (or rate-limit-aware if unavoidable)
  • Documented here with what it does and how often it runs
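The first three checklist items are easy to forget when they are copied by hand into every job. One way to enforce them is a decorator, sketched below. This is an illustration only: `cron_job` and `InMemoryMetrics` are hypothetical names, and `InMemoryMetrics` merely stands in for the real cron_metrics module, copying the `record_success` / `record_failure` call shapes used in the pattern above.

```python
import functools
import logging

log = logging.getLogger("cron")


class InMemoryMetrics:
    """Hypothetical stand-in for orchid/utils/cron_metrics.py."""

    def __init__(self):
        self.events = []

    def record_success(self, job_name, db_name):
        self.events.append(("success", job_name, db_name))

    def record_failure(self, job_name, db_name, error):
        self.events.append(("failure", job_name, db_name))


def cron_job(metrics):
    """Wrap a per-agency job: record the outcome and never re-raise."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(db_name, *args, **kwargs):
            try:
                func(db_name, *args, **kwargs)
                metrics.record_success(func.__name__, db_name)
                return True
            except Exception as exc:
                metrics.record_failure(func.__name__, db_name, exc)
                log.error("%s failed for %s: %s", func.__name__, db_name, exc)
                return False  # swallow the error so the caller's loop continues

        return wrapper

    return decorator


metrics = InMemoryMetrics()


@cron_job(metrics)
def expire_stale_tokens(db_name):
    """Hypothetical job that fails for one agency only."""
    if db_name == "agency_b":
        raise ValueError("simulated failure")


outcomes = [expire_stale_tokens(name) for name in ("agency_a", "agency_b", "agency_c")]
```

With this in place the job body contains only the agency-specific work, and the boolean return value lets the main loop count failures if it needs to.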

Cron schedule configuration

The cron schedule is configured externally in AWS (not in cronapp.py itself). The schedule determines how often cronapp.py is invoked. Common schedules:

Job type              | Typical schedule
Notification sends    | Every 15 minutes
Email sync monitoring | Every 30 minutes
Health checks         | Hourly
Overnight reports     | Daily at 2 AM

To change a schedule, update the AWS CloudWatch Events / EventBridge rule that triggers the cron EB environment.
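For reference, the schedules in the table could be written as EventBridge schedule expressions roughly as follows. This is a sketch under assumptions: the dictionary keys and the rule name are hypothetical, it assumes one rule per job type, and EventBridge rules evaluate cron expressions in UTC, so "2 AM" here means 02:00 UTC unless the real rule compensates for a local time zone.

```python
# Hypothetical mapping from the job types above to EventBridge expressions.
SCHEDULES = {
    "notification_sends": "rate(15 minutes)",
    "email_sync_monitoring": "rate(30 minutes)",
    "health_checks": "rate(1 hour)",  # singular unit when the value is 1
    # Fields: minute hour day-of-month month day-of-week year
    "overnight_reports": "cron(0 2 * * ? *)",  # 02:00 UTC every day
}


def put_rule_kwargs(rule_name, job_type):
    """Build keyword arguments for the EventBridge PutRule call."""
    return {
        "Name": rule_name,
        "ScheduleExpression": SCHEDULES[job_type],
        "State": "ENABLED",
    }


# The rule name below is a placeholder, not the real rule.
kwargs = put_rule_kwargs("example-cron-overnight-reports", "overnight_reports")

# With AWS credentials configured, the rule could then be updated via boto3:
#   import boto3
#   boto3.client("events").put_rule(**kwargs)
```

Note that PutRule replaces an existing rule with exactly what is passed, so any argument omitted here (such as a description) would be cleared on update.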

Cron metrics

orchid/utils/cron_metrics.py tracks:

  • How long each cron job takes per agency
  • Success/failure counts per agency
  • Total run duration

Use cron_metrics.record_success() and cron_metrics.record_failure() in every cron job function. This data surfaces in CloudWatch and helps diagnose slow or failing cron runs.
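The internals of cron_metrics.py are not shown here, but the per-agency timing it describes can be sketched with a small context manager. `DurationTracker` and its methods are hypothetical and do not reflect the module's real API. The clock is injectable so the example below is deterministic; in real use the default `time.perf_counter` would apply.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class DurationTracker:
    """Hypothetical sketch of per-job, per-agency duration tracking."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.durations = defaultdict(dict)  # job_name -> {db_name: seconds}

    @contextmanager
    def timed(self, job_name, db_name):
        start = self.clock()
        try:
            yield
        finally:
            # Recorded even if the job raises, so failures are timed too.
            self.durations[job_name][db_name] = self.clock() - start

    def total(self, job_name):
        """Total run duration of one job across all agencies."""
        return sum(self.durations[job_name].values())

    def slowest(self, job_name):
        """Agency whose run of this job took longest."""
        per_agency = self.durations[job_name]
        return max(per_agency, key=per_agency.get)


# Fake clock readings: agency_a runs 0.0 -> 1.5, agency_b runs 1.5 -> 2.0.
ticks = iter([0.0, 1.5, 1.5, 2.0])
tracker = DurationTracker(clock=lambda: next(ticks))

for db_name in ("agency_a", "agency_b"):
    with tracker.timed("send_notifications", db_name):
        pass  # the real job body would run here
```

Looking at the slowest agency per job is usually the quickest way to find out why a whole cron run has become slow.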

Debugging cron failures

When a cron job fails silently:

  1. Check CloudWatch logs for the cron EB environment (it is a separate environment from the web app). Filter by job name.
  2. Check cron_metrics tables in the database for failure records.
  3. Run the job manually by invoking cronapp.py directly on a test machine pointing to a single agency database:
python cronapp.py --agency=local_main # if --agency flag is supported

Or temporarily add if db_name != 'my_test_db': continue to the loop to target one agency.
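If cronapp.py does not already accept such a flag, it could be added with argparse along these lines. This is a sketch, not the existing implementation: `parse_args` and `select_databases` are hypothetical helpers, and the database names are placeholders.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(
        description="Run cron jobs across agency databases"
    )
    parser.add_argument(
        "--agency",
        default=None,
        help="run against this single agency database only",
    )
    return parser.parse_args(argv)


def select_databases(all_databases, agency=None):
    """Return every database, or only the requested one."""
    if agency is None:
        return list(all_databases)
    if agency not in all_databases:
        # Fail loudly instead of silently running against nothing.
        raise SystemExit(f"unknown agency database: {agency}")
    return [agency]


args = parse_args(["--agency=local_main"])
targets = select_databases(["agency_a", "local_main"], args.agency)
```

Unlike a temporary `continue` in the loop, a flag cannot be committed by accident and leaves the production code path unchanged when it is omitted.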