Cron Jobs

What is cronapp.py?

cronapp.py is a separate Flask application from flaskapp.py. It runs on a schedule (configured externally in AWS) and performs operations that need to happen periodically across all agency databases.

The critical difference from flaskapp.py:

flaskapp.py connects to one agency’s database (the instance’s dedicated DB)
cronapp.py connects to all agency databases in sequence

The multi-database iteration pattern

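# Pseudocode of cronapp.py's main pattern:
from sqlalchemy import create_engine

databases = get_all_agency_databases()  # SHOW DATABASES filtered to agency DBs

for db_name in databases:
    engine = None
    try:
        # One engine per agency database, bound to the session for this iteration only
        engine = create_engine(f"mysql+pymysql://{user}:{password}@{host}/{db_name}")
        with app.app_context():
            db.session.bind = engine
            run_job_for_this_agency(db_name)
    except Exception as e:
        log_error(e)
        continue  # never let one agency's failure stop others
    finally:
        # Release this agency's connection pool before moving to the next database
        if engine is not None:
            engine.dispose()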
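
The comment on get_all_agency_databases() says the SHOW DATABASES output is filtered down to agency databases. A minimal sketch of such a filter follows, assuming an agency database is any schema that is not one of MySQL's built-in system schemas. The name filter_agency_databases and that rule are assumptions for illustration; the real filter in cronapp.py may rely on a naming prefix or a registry table instead.

```python
# MySQL's built-in system schemas; these never hold agency data.
SYSTEM_SCHEMAS = {"information_schema", "mysql", "performance_schema", "sys"}


def filter_agency_databases(all_names):
    """Return the non-system schema names, sorted for a stable run order.

    Hypothetical helper: `all_names` is the list of names returned by
    SHOW DATABASES.
    """
    return sorted(
        name for name in all_names
        if name.lower() not in SYSTEM_SCHEMAS
    )
```

Sorting the names keeps the run order stable between invocations, which makes it easier to compare logs from one cron run to the next.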

Files

cronapp.py — 42 KB — the main cron application
orchid/
- utils/
  - cron_metrics.py — cron job performance tracking

Adding a new cron job

If a cron job is truly necessary, follow this pattern:

from sqlalchemy import text


def my_new_cron_job(db_name, engine):
    """
    Brief description of what this job does.
    Runs once per agency per cron schedule.
    """
    try:
        # Query this agency's database through the engine that was passed in
        with engine.connect() as conn:
            results = conn.execute(
                # `case` is a reserved word in MySQL, so the table name is backtick-quoted
                text("SELECT id FROM `case` WHERE status = 'active'")
            ).fetchall()

        for row in results:
            # process each result
            pass

        cron_metrics.record_success('my_new_cron_job', db_name)

    except Exception as e:
        cron_metrics.record_failure('my_new_cron_job', db_name, e)
        log.error(f"my_new_cron_job failed for {db_name}: {e}")
        # Do NOT re-raise; let the loop continue to the next agency


# In the main cron loop in cronapp.py:
for db_name in databases:
    # engine_for() is assumed to be a context manager that yields the agency's engine
    with engine_for(db_name) as engine:
        my_new_cron_job(db_name, engine)

Checklist for new cron jobs

Error handling with try/except that does not re-raise
continue on failure so the next agency still runs
cron_metrics.record_success/failure calls for monitoring
Tested against a single agency database before deploying
No external API calls inside the per-agency loop (or rate-limit-aware if unavoidable)
Documented here with what it does and how often it runs
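
For the rate-limit item in the checklist, one simple approach is to enforce a minimum gap between external calls. The sketch below is an illustration only: the Throttle class is hypothetical, not part of cronapp.py, and the right interval depends on the limits of the API being called.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock   # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None     # time of the previous call, None before the first one

    def wait(self):
        """Block until at least min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A job would create one Throttle before the per-agency loop and call wait() immediately before each external request, so the limit applies across all agencies rather than per agency.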

Cron schedule configuration

The cron schedule is configured externally in AWS (not in cronapp.py itself). The schedule determines how often cronapp.py is invoked. Common schedules:

Job type	Typical schedule
Notification sends	Every 15 minutes
Email sync monitoring	Every 30 minutes
Health checks	Hourly
Overnight reports	Daily at 2 AM

To change a schedule, update the AWS CloudWatch Events / EventBridge rule that triggers the cron Elastic Beanstalk (EB) environment.
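
As a rough guide, the typical schedules in the table could be written as EventBridge schedule expressions like the ones below. The mapping and its key names are assumptions for illustration; the actual rule definitions live in AWS and are not shown in this document.

```python
# Hypothetical mapping from the job types above to EventBridge schedule expressions.
SCHEDULE_EXPRESSIONS = {
    "notification_sends": "rate(15 minutes)",
    "email_sync_monitoring": "rate(30 minutes)",
    "health_checks": "rate(1 hour)",
    # cron(minute hour day-of-month month day-of-week year), evaluated in UTC
    "overnight_reports": "cron(0 2 * * ? *)",
}
```

Because EventBridge rule cron expressions are evaluated in UTC, "2 AM" in a local time zone needs the hour field adjusted accordingly.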

Cron metrics

orchid/utils/cron_metrics.py tracks:

How long each cron job takes per agency
Success/failure counts per agency
Total run duration

Use cron_metrics.record_success() and cron_metrics.record_failure() in every cron job function. This data surfaces in CloudWatch and helps diagnose slow or failing cron runs.
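
To illustrate how per-agency timing and success/failure recording can be combined, here is a minimal sketch of a context manager that wraps one job run. It is not the implementation in cron_metrics.py: track_job is a hypothetical name, and the recorder argument is a stand-in whose methods take an extra duration_s value that the real record_success() and record_failure() may not accept.

```python
import time
from contextlib import contextmanager


@contextmanager
def track_job(job_name, db_name, recorder, clock=time.monotonic):
    """Time one job run for one agency and report the outcome to `recorder`.

    `recorder` is a stand-in for cron_metrics: any object with
    record_success(job, db, duration_s) and
    record_failure(job, db, exc, duration_s).
    """
    start = clock()
    try:
        yield
    except Exception as exc:
        recorder.record_failure(job_name, db_name, exc, clock() - start)
        # The exception is swallowed here so the caller's loop moves on
        # to the next agency.
    else:
        recorder.record_success(job_name, db_name, clock() - start)
```

Wrapping the body of a job in "with track_job(...)" keeps the timing and the do-not-re-raise rule in one place instead of repeating them in every job function.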

Debugging cron failures

When a cron job fails silently:

Check CloudWatch logs for the cron EB environment (it is a separate environment from the web app). Filter by job name.
Check cron_metrics tables in the database for failure records.
Run the job manually by invoking cronapp.py directly on a test machine pointing to a single agency database:

python cronapp.py --agency=local_main  # if --agency flag is supported

Or, if no such flag exists, temporarily add if db_name != 'my_test_db': continue at the top of the loop to target one agency, and remove it before deploying.
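
The --agency flag is described above as possibly unsupported. If it needs to be added, a minimal sketch with argparse could look like the following. The function names parse_args and select_databases are hypothetical and are not taken from cronapp.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --agency is optional."""
    parser = argparse.ArgumentParser(description="Run cron jobs for agency databases")
    parser.add_argument(
        "--agency",
        default=None,
        help="run only this agency database instead of all of them",
    )
    return parser.parse_args(argv)


def select_databases(all_databases, agency=None):
    """Return every database, or just `agency` if one was requested."""
    if agency is None:
        return list(all_databases)
    if agency not in all_databases:
        # Fail loudly rather than silently running nothing.
        raise SystemExit(f"unknown agency database: {agency}")
    return [agency]
```

Filtering the list before the loop starts avoids editing the loop body for debugging, so there is no temporary guard to forget to remove.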