Celery + Redis is the standard Python async task stack for a reason. Key patterns for production reliability: task idempotency (always design tasks to be safely retried), exponential backoff with jitter, task routing to dedicated queues, result backend for tracking, and Flower for real-time monitoring. This post covers all five with production examples.
Why Async Tasks Are Non-Negotiable at Scale
At CertifyMe, credential issuance involves: generating a PDF, anchoring a hash on blockchain, sending an email, and posting a webhook to the issuing organization's systems. Combined latency: 4-12 seconds depending on blockchain network congestion. Without async tasks, every credential issuance is a 12-second HTTP request — that's not a web app, that's a loading screen.
The moment any operation takes more than ~500ms, it belongs in a background task queue. The HTTP handler returns immediately with a task ID. The client polls or receives a webhook when the work is done. This pattern makes your API fast, makes retries automatic, and makes failures observable.
Basic Celery Setup with Redis
# celery_app.py
from celery import Celery

def make_celery(app_name: str) -> Celery:
    return Celery(
        app_name,
        broker='redis://localhost:6379/0',
        backend='redis://localhost:6379/1',  # separate DB for results
        include=['tasks.issuance', 'tasks.notifications', 'tasks.webhooks'],
    )

celery = make_celery('certifyme')

# Configuration
celery.conf.update(
    task_serializer='json',
    result_serializer='json',
    accept_content=['json'],
    timezone='UTC',
    enable_utc=True,
    task_track_started=True,       # STARTED state tracking
    task_acks_late=True,           # ACK only after task completes (safer)
    worker_prefetch_multiplier=1,  # one task at a time per worker (fair)
)
The broker queue and result backend have very different access patterns. Keeping them in separate Redis DBs (or separate Redis instances in production) prevents result backend reads from competing with queue operations, and makes it easy to flush results independently of the queue.
Pattern 1: Idempotent Tasks
The most critical production pattern. Any task that can be retried (all of them, eventually) must be idempotent — running it twice must produce the same result as running it once:
from celery import shared_task
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@shared_task(
    bind=True,
    max_retries=3,
    default_retry_delay=60,
    name='tasks.issuance.issue_credential',
)
def issue_credential(self, credential_id: str) -> dict:
    logger.info(f"Processing credential {credential_id}", extra={"task_id": self.request.id})

    # Idempotency check — if already issued, return early
    credential = Credential.get(credential_id)
    if credential.status == 'issued':
        logger.info(f"Credential {credential_id} already issued, skipping")
        return {"status": "already_issued", "credential_id": credential_id}

    try:
        pdf_url = generate_pdf(credential_id)
        blockchain_tx = anchor_on_blockchain(credential_id, pdf_url)
        send_issuance_email(credential_id, pdf_url)
        credential.mark_issued(pdf_url=pdf_url, blockchain_tx=blockchain_tx)
        return {"status": "issued", "credential_id": credential_id}
    except BlockchainTimeoutError as exc:
        # Retryable error — exponential backoff: 30s, 60s, 120s
        raise self.retry(exc=exc, countdown=2 ** self.request.retries * 30)
    except InvalidCredentialDataError as exc:
        # Non-retryable error — fail immediately, don't retry
        logger.error(f"Non-retryable failure for {credential_id}: {exc}")
        credential.mark_failed(reason=str(exc))
        raise
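The countdown above grows exponentially but deterministically, so a burst of simultaneous failures retries in lockstep and hammers the downstream service again. The intro's "exponential backoff with jitter" adds a random component to spread retries out. A minimal sketch — the function name, base, and cap are my own choices, not Celery API:

```python
import random

def backoff_with_jitter(retries: int, base: int = 30, cap: int = 600) -> int:
    """Exponential backoff (30s, 60s, 120s, ...) capped at `cap` seconds,
    with "full jitter": a uniform draw between 0 and the computed delay."""
    delay = min(cap, base * (2 ** retries))
    return random.randint(0, delay)

# Inside the task's retryable-error branch, in place of the fixed countdown:
#   raise self.retry(exc=exc, countdown=backoff_with_jitter(self.request.retries))
```

Full jitter trades predictable retry timing for maximum spread, which is usually the right trade when the failure cause is shared (e.g. blockchain network congestion hitting every in-flight task at once).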
Pattern 2: Task Routing and Priority Queues
Not all tasks should compete equally for workers. High-priority tasks (user-triggered actions) should not wait behind low-priority tasks (scheduled reports). Use dedicated queues with dedicated worker pools:
# Task routing configuration
celery.conf.task_routes = {
    'tasks.issuance.*': {'queue': 'high_priority'},
    'tasks.notifications.*': {'queue': 'default'},
    'tasks.reports.*': {'queue': 'low_priority'},
    'tasks.cleanup.*': {'queue': 'low_priority'},
}

# Start workers with dedicated queues
# High priority: more workers
#   celery -A celery_app worker -Q high_priority --concurrency=8 --hostname=hp@%h
# Low priority: fewer workers
#   celery -A celery_app worker -Q low_priority,default --concurrency=2 --hostname=lp@%h
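Entries like 'tasks.issuance.*' are glob patterns matched against the task name. To make the matching concrete, here is a small stand-alone resolver that mimics, in simplified form, how a task name maps to a queue under the config above — a sketch for illustration, not Celery's actual router:

```python
from fnmatch import fnmatch

# Mirrors the task_routes config above
TASK_ROUTES = {
    'tasks.issuance.*': {'queue': 'high_priority'},
    'tasks.notifications.*': {'queue': 'default'},
    'tasks.reports.*': {'queue': 'low_priority'},
    'tasks.cleanup.*': {'queue': 'low_priority'},
}

def queue_for(task_name: str, routes: dict = TASK_ROUTES, default: str = 'default') -> str:
    """Return the queue a task name routes to; first matching pattern wins."""
    for pattern, options in routes.items():
        if fnmatch(task_name, pattern):
            return options['queue']
    return default
```

A per-call override is also possible with `task.apply_async(queue='...')`, which takes precedence over `task_routes` for that invocation.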
Pattern 3: Task State Tracking via the API
For long-running tasks, expose the task state to the frontend so users get real-time progress feedback:
from flask import jsonify, request
from celery.result import AsyncResult

@app.route('/credentials/issue', methods=['POST'])
def trigger_issuance():
    credential_id = request.json['credential_id']
    task = issue_credential.delay(credential_id)
    return jsonify({"task_id": task.id, "status": "queued"}), 202

@app.route('/tasks/<task_id>/status')
def task_status(task_id):
    result = AsyncResult(task_id, app=celery)
    response = {
        "task_id": task_id,
        "state": result.state,  # PENDING / STARTED / SUCCESS / FAILURE / RETRY
    }
    if result.state == 'SUCCESS':
        response['result'] = result.result
    elif result.state == 'FAILURE':
        response['error'] = str(result.info)
    elif result.state == 'RETRY':
        # In the RETRY state, result.info holds the exception that caused the retry
        response['retry_reason'] = str(result.info)
    return jsonify(response)
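For a multi-step task like issue_credential, the built-in states are coarse: the frontend only sees STARTED until the whole task finishes. Bound tasks can report finer-grained progress with `self.update_state` and a custom state. The 'PROGRESS' state name and the helper below are conventions I'm assuming, not Celery built-ins:

```python
def progress_meta(step: int, total_steps: int, label: str) -> dict:
    """Metadata payload for self.update_state(state='PROGRESS', meta=...)."""
    return {
        "current": step,
        "total": total_steps,
        "percent": round(100 * step / total_steps),
        "label": label,
    }

# Inside a bound task, between steps:
#   self.update_state(state='PROGRESS', meta=progress_meta(2, 4, 'anchoring on blockchain'))
# The status endpoint above can then surface result.info whenever
# result.state == 'PROGRESS', giving the poller a live percentage.
```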
Pattern 4: Periodic Tasks with Celery Beat
from celery.schedules import crontab

celery.conf.beat_schedule = {
    'cleanup-expired-sessions': {
        'task': 'tasks.cleanup.purge_expired_sessions',
        'schedule': crontab(minute=0, hour=2),  # 2 AM daily
    },
    'generate-daily-report': {
        'task': 'tasks.reports.generate_daily_issuance_report',
        'schedule': crontab(minute=0, hour=6),  # 6 AM daily
    },
    'retry-stuck-credentials': {
        'task': 'tasks.issuance.retry_stuck',
        'schedule': crontab(minute='*/15'),  # every 15 minutes
    },
}
Celery Beat is a single-process scheduler: running more than one instance triggers duplicate executions of every scheduled task. In a multi-server deployment, run Beat on exactly one instance, or guard scheduled tasks with a distributed lock. This is one of the most common Celery production mistakes.
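One way to make scheduled tasks safe even if a second Beat instance slips in is to guard the task body with a short-lived Redis lock, so duplicate triggers become no-ops. A sketch using redis-py's atomic SET with NX and EX — the helper name, key, and TTL are illustrative, not a library API:

```python
def run_exclusively(redis_client, lock_key: str, ttl_seconds: int, fn):
    """Run fn() only if this caller wins the lock; duplicate triggers return None.
    SET NX EX is atomic, so exactly one caller acquires the lock per TTL window.
    The lock is left to expire via TTL rather than deleted, so a near-simultaneous
    duplicate trigger cannot re-acquire it right after fn() finishes."""
    if not redis_client.set(lock_key, "1", nx=True, ex=ttl_seconds):
        return None  # another instance already triggered this run
    return fn()

# Usage inside a beat-scheduled task, with redis_client = redis.Redis(...):
#   run_exclusively(redis_client, "lock:retry_stuck", 300, do_retry_stuck)
```

Size the TTL to comfortably cover the window in which duplicate triggers can arrive (for a */15 schedule, a few minutes is plenty), but keep it shorter than the schedule interval so the next legitimate run is not blocked.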
Key Takeaways
- Any operation over ~500ms belongs in a background task — keep HTTP handlers fast
- All tasks must be idempotent — always check "already done" before doing work
- Distinguish retryable errors (backoff and retry) from non-retryable (fail fast)
- Dedicated queues with dedicated worker pools prevent priority inversion
- Expose task state via API endpoints for real-time user feedback on async operations
- Only one Celery Beat instance per deployment — enforce with distributed locking