Patterns & Recipes
This guide covers two operational patterns that come up repeatedly when
running django-cloudflareimages-toolkit in production but that the
package itself deliberately does not bake in: resilience against
Cloudflare API outages, and image-access authorization (including
role-based permissions and dynamic watermarking).
Both are buildable on top of the package’s existing primitives —
cloudflare_service, CloudflareImage, CloudflareImageTransform,
and standard DRF permission classes. The recipes below show concrete,
copy-pasteable code.
Resilience: handling a Cloudflare Images API outage
cloudflare_service makes synchronous HTTPS calls against
api.cloudflare.com. When the Cloudflare control plane is degraded or
unreachable, those calls fail and raise
django_cloudflareimages_toolkit.exceptions.CloudflareImagesError —
the package surfaces the error rather than hiding it. The patterns below
show how to layer retries, a circuit breaker, and graceful degradation
on top of those primitives without forking the package.
Retry with exponential backoff
Transient Cloudflare blips (5xx, connection resets, DNS hiccups) usually clear within a few seconds. Wrap each service call in a backoff loop and log every retry so operators can spot a real outage versus normal noise.
import logging
import random
import time
from functools import wraps
from typing import Callable, ParamSpec, TypeVar

from django_cloudflareimages_toolkit.exceptions import CloudflareImagesError

logger = logging.getLogger(__name__)

P = ParamSpec("P")
T = TypeVar("T")


def retry_cloudflare(
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> Callable[[Callable[P, T]], Callable[P, T]]:
    """Retry a Cloudflare call with capped exponential backoff + jitter.

    Only retries ``CloudflareImagesError``; caller bugs (TypeError,
    ValueError, etc.) bubble immediately so they fail loudly in tests.
    """

    def decorator(fn: Callable[P, T]) -> Callable[P, T]:
        @wraps(fn)
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
            last_exc: CloudflareImagesError | None = None
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except CloudflareImagesError as exc:
                    last_exc = exc
                    if attempt == attempts:
                        break
                    delay = min(
                        max_delay,
                        base_delay * (2 ** (attempt - 1)),
                    )
                    # Jitter: sleep a random time between base_delay and
                    # the capped exponential value so concurrent retriers
                    # don't all hammer Cloudflare at the same moment.
                    delay = random.uniform(base_delay, delay)
                    logger.warning(
                        "Cloudflare call failed (attempt %d/%d), retrying in %.2fs: %s",
                        attempt,
                        attempts,
                        delay,
                        exc,
                    )
                    time.sleep(delay)
            assert last_exc is not None
            raise last_exc

        return wrapper

    return decorator

from django_cloudflareimages_toolkit.services import cloudflare_service


@retry_cloudflare(attempts=4, base_delay=0.5, max_delay=8.0)
def create_upload_url_with_retry(user, **kwargs):
    return cloudflare_service.create_direct_upload_url(user=user, **kwargs)
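To see what the decorator's parameters imply for request latency, the upper bound of each sleep can be tabulated without calling Cloudflare. The sketch below is illustrative only: `backoff_caps` is a hypothetical helper, not part of the package, and it simply mirrors the capped-exponential formula used in the decorator above (the actual sleep is a random value at or below each cap).

```python
def backoff_caps(
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> list[float]:
    """Upper bound, in seconds, of the sleep before each retry.

    There are ``attempts - 1`` sleeps because nothing follows the
    final attempt.
    """
    return [
        min(max_delay, base_delay * (2 ** (attempt - 1)))
        for attempt in range(1, attempts)
    ]


default_caps = backoff_caps()
print(default_caps)       # [0.5, 1.0, 2.0]
print(sum(default_caps))  # 3.5

longer_caps = backoff_caps(attempts=7)
print(longer_caps)        # [0.5, 1.0, 2.0, 4.0, 8.0, 8.0]
```

With the defaults, a request thread sleeps for at most 3.5 seconds in total (plus the time spent in the failed calls themselves) before the error is re-raised, which is a useful number to compare against your web server's request timeout.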
Circuit breaker — fail fast during a real outage
Retries help with blips, but during a multi-minute Cloudflare outage they just multiply the load on a struggling control plane and slow down every request thread. A circuit breaker trips after consecutive failures, fails calls immediately for a cooldown window, then probes to see if the API is back.
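The example uses Django’s cache as the shared state store. With a shared backend (Redis, Memcached, or the database cache) it works across processes and Gunicorn workers without extra infra; Django’s default local-memory backend is per-process, so each worker would keep its own breaker state.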
import time

from django.core.cache import cache

from django_cloudflareimages_toolkit.exceptions import CloudflareImagesError

_CB_KEY = "cf_images:circuit_state"
_CB_FAILURE_THRESHOLD = 5
_CB_OPEN_SECONDS = 30


class CircuitOpen(CloudflareImagesError):
    """Raised when the breaker is open. Subclasses CloudflareImagesError
    so existing exception handlers catch it transparently."""


def call_with_circuit_breaker(fn, /, *args, **kwargs):
    # Note: the read-modify-write on the cache entry is not atomic, so
    # the failure count is approximate under heavy concurrency.
    state = cache.get(_CB_KEY) or {"failures": 0, "open_until": 0}
    now = time.time()
    if state["open_until"] > now:
        raise CircuitOpen("Cloudflare Images breaker is open")
    try:
        result = fn(*args, **kwargs)
    except CloudflareImagesError:
        failures = state["failures"] + 1
        if failures >= _CB_FAILURE_THRESHOLD:
            # Trip the breaker: refuse calls for the cooldown window.
            cache.set(_CB_KEY, {"failures": 0, "open_until": now + _CB_OPEN_SECONDS}, timeout=_CB_OPEN_SECONDS + 5)
        else:
            cache.set(_CB_KEY, {"failures": failures, "open_until": 0}, timeout=300)
        raise
    else:
        if state["failures"]:
            # A success resets the consecutive-failure count.
            cache.set(_CB_KEY, {"failures": 0, "open_until": 0}, timeout=300)
        return result
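The breaker's state transitions are easier to reason about (and to unit-test) when the clock is injectable. The following is a minimal in-memory sketch of the same logic, not part of the package: `InMemoryBreaker` and `BreakerOpen` are hypothetical names, state lives in a single process rather than in Django's cache, and it counts any exception as a failure where the recipe above counts only `CloudflareImagesError`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class BreakerOpen(RuntimeError):
    """Raised while the breaker is refusing calls."""


class InMemoryBreaker:
    def __init__(
        self,
        threshold: int = 5,
        open_seconds: float = 30.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.threshold = threshold
        self.open_seconds = open_seconds
        self.clock = clock
        self.failures = 0
        self.open_until = 0.0

    def call(self, fn: Callable[[], T]) -> T:
        now = self.clock()
        if self.open_until > now:
            raise BreakerOpen("breaker is open")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                # Trip: refuse calls until the cooldown has elapsed.
                self.failures = 0
                self.open_until = now + self.open_seconds
            raise
        # A success resets the consecutive-failure count.
        self.failures = 0
        return result


# Demo with a fake clock so no real waiting is needed.
fake_now = [1000.0]
breaker = InMemoryBreaker(threshold=2, open_seconds=30.0, clock=lambda: fake_now[0])


def simulated_outage() -> str:
    raise ConnectionError("simulated outage")


for _ in range(2):
    try:
        breaker.call(simulated_outage)
    except ConnectionError:
        pass  # two consecutive failures trip the breaker

try:
    breaker.call(lambda: "ok")
    tripped = False
except BreakerOpen:
    tripped = True  # refused immediately; the callable never ran

fake_now[0] += 31.0  # cooldown elapsed, calls flow again
recovered = breaker.call(lambda: "ok")
print(tripped, recovered)  # True ok
```

The same fake-clock approach can be applied to the cache-backed version by patching `time.time` in tests.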
Graceful degradation in the request path
For read paths (e.g. rendering a page that shows a user’s avatar), treat the Cloudflare URL as a cache-warmed asset and fall back to a placeholder when both the cache and the API are unavailable. Don’t make end users wait on a degraded control plane.
from django.core.cache import cache

from django_cloudflareimages_toolkit.exceptions import CloudflareImagesError
from django_cloudflareimages_toolkit.models import CloudflareImage

PLACEHOLDER_URL = "/static/img/avatar-placeholder.png"


def avatar_url_for(user) -> str:
    cache_key = f"avatar:{user.pk}"
    cached = cache.get(cache_key)
    if cached:
        return cached
    try:
        image = CloudflareImage.objects.filter(user=user, status="uploaded").first()
        if image and image.is_uploaded:
            url = image.public_url or image.get_variant_url("avatar")
            cache.set(cache_key, url, timeout=300)
            return url
    except CloudflareImagesError:
        # Cloudflare is degraded; serve the placeholder rather than
        # blocking the page render. The next cache miss will retry.
        pass
    return PLACEHOLDER_URL
Failure handling for direct creator uploads
Direct creator uploads have three failure modes worth handling explicitly. Each maps to a concrete recovery path:
URL provisioning fails (create_direct_upload_url raises CloudflareImagesError): the Cloudflare API was unreachable or refused the request. The user clicked “upload” and got nothing.
Upload POST fails (the browser or server-side requests.post to image.upload_url errors out): Cloudflare’s edge accepted the URL but couldn’t accept the bytes. Likely transient.
Cloudflare rejects the file after upload (webhook delivers a failure event, check_image_status returns status="failed"): Cloudflare took the bytes but processing failed (corrupt JPEG, unsupported format, too large, etc.). Not retryable.
The pattern below combines retry, user feedback, and a local fallback into a single end-to-end recipe.
import logging
import time

import requests
from django.contrib import messages
from django.core.cache import cache

from django_cloudflareimages_toolkit.exceptions import CloudflareImagesError
from django_cloudflareimages_toolkit.services import cloudflare_service

logger = logging.getLogger(__name__)

# retry_cloudflare is the decorator defined under
# "Retry with exponential backoff".


@retry_cloudflare(attempts=4, base_delay=0.5, max_delay=8.0)
def _provision_upload_slot(user, metadata):
    return cloudflare_service.create_direct_upload_url(
        user=user, metadata=metadata, expiry_minutes=30
    )


def _post_bytes(upload_url: str, blob: bytes, name: str) -> None:
    """Server-side POST with a short retry on transient network errors."""
    last_exc: Exception | None = None
    for attempt in range(1, 4):
        try:
            r = requests.post(
                upload_url,
                files={"file": (name, blob, "application/octet-stream")},
                timeout=30,
            )
            r.raise_for_status()
            return
        except requests.RequestException as exc:
            last_exc = exc
            if attempt < 3:
                time.sleep(0.5 * (2 ** (attempt - 1)))
    assert last_exc is not None
    raise last_exc
def upload_with_recovery(request, blob: bytes, filename: str):
    """End-to-end upload that notifies the user on every failure mode
    and persists local state regardless of whether Cloudflare succeeds.

    _client_ip and _record_failed_attempt are app-specific helpers that
    you supply; they are not part of the package.
    """
    # Step 1: provision the upload slot.
    try:
        image = _provision_upload_slot(
            request.user,
            metadata={"source": "user_upload", "ip": _client_ip(request)},
        )
    except CloudflareImagesError:
        logger.exception("Cloudflare URL provisioning failed")
        messages.error(
            request,
            "We're having trouble reaching our image host. "
            "Your file was NOT uploaded. Please try again in a few minutes.",
        )
        _record_failed_attempt(request.user, filename, reason="provision")
        return None

    # Step 2: post the bytes.
    try:
        _post_bytes(image.upload_url, blob, image.cloudflare_id)
    except requests.RequestException:
        logger.exception("Cloudflare upload POST failed")
        messages.warning(
            request,
            "Upload couldn't be completed. We've saved a draft locally, so "
            "you can retry without re-selecting your file.",
        )
        _store_local_fallback(request.user, image, blob, filename)
        return image

    # Step 3: confirm Cloudflare accepted the file.
    try:
        cloudflare_service.check_image_status(image)
        image.refresh_from_db()
    except CloudflareImagesError:
        # Status check failed but the bytes were delivered; the
        # webhook will eventually move the row to UPLOADED or FAILED.
        # Don't block the user response on this.
        pass

    if image.status == "failed":
        messages.error(
            request,
            "Your image was rejected (unsupported format or corrupt file). "
            "Please pick a different file.",
        )
        return image

    messages.success(request, "Image uploaded successfully.")
    return image
def _store_local_fallback(user, image, blob: bytes, filename: str) -> None:
    """Persist the bytes so the user can retry without re-selecting.

    Stash in cache (small footprint, expires automatically) and link
    to the CloudflareImage row so the retry handler can pick up where
    this attempt left off.
    """
    key = f"upload_fallback:{user.pk}:{image.pk}"
    cache.set(key, {"blob": blob, "filename": filename}, timeout=3600)


def retry_failed_upload(request, image_id: int):
    key = f"upload_fallback:{request.user.pk}:{image_id}"
    data = cache.get(key)
    if not data:
        messages.error(request, "Your previous upload has expired. Please re-select your file.")
        return None
    image = cloudflare_service.create_direct_upload_url(user=request.user)
    _post_bytes(image.upload_url, data["blob"], data["filename"])
    cache.delete(key)
    return image
Key behaviors:
User notification is per failure mode. Provisioning failure says “we couldn’t reach our image host”; upload failure says “we saved a draft, retry”; rejection says “your file is the problem.” That’s three distinct user states with three distinct recovery paths.
Retries are layered: _provision_upload_slot retries the Cloudflare control plane, _post_bytes retries the edge upload, and check_image_status is not retried because the webhook will drive the same state machine asynchronously.
Local fallback uses Django’s cache rather than disk so it expires automatically and, given a shared cache backend, works across Gunicorn workers. For larger files, swap the cache for an S3-backed staging bucket.
The CloudflareImage row is always persisted even when the POST fails — that gives the retry handler a stable anchor and lets operators see how many uploads stalled at each step (a useful signal for a Cloudflare degradation dashboard).
Distributed processing pipeline
For write paths (uploads, status checks, deletions), push the work through a task queue (Celery, RQ, Dramatiq) so the request handler returns quickly and retries happen out-of-band against Cloudflare:
# tasks.py
from celery import shared_task

from django_cloudflareimages_toolkit.exceptions import CloudflareImagesError
from django_cloudflareimages_toolkit.services import cloudflare_service


@shared_task(
    bind=True,
    autoretry_for=(CloudflareImagesError,),
    retry_backoff=True,
    retry_backoff_max=300,
    retry_jitter=True,
    max_retries=10,
)
def check_image_status_async(self, image_id: int) -> None:
    from django_cloudflareimages_toolkit.models import CloudflareImage

    image = CloudflareImage.objects.get(pk=image_id)
    cloudflare_service.check_image_status(image)
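The view enqueues the task and returns immediately; the worker performs the polling with Celery’s built-in retry backoff. With the settings shown (max_retries=10, retry_backoff_max=300) the retries span roughly 13 minutes at most, after which the task fails. To ride out an hour-long Cloudflare outage without failing user-visible requests, raise max_retries so the retry window covers it.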
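How long an outage the task settings can absorb follows from the backoff arithmetic. The sketch below assumes Celery's documented behavior for `retry_backoff=True` (a factor of one second, doubling per retry, capped at `retry_backoff_max`); `celery_backoff_caps` is a hypothetical helper for illustration, and with `retry_jitter=True` the real delays are randomized at or below these caps, so the actual window is usually shorter.

```python
def celery_backoff_caps(
    max_retries: int = 10,
    factor: int = 1,
    backoff_max: int = 300,
) -> list[int]:
    """Upper bound, in seconds, of the countdown before each retry."""
    return [min(backoff_max, factor * (2 ** n)) for n in range(max_retries)]


caps = celery_backoff_caps()
print(caps)       # [1, 2, 4, 8, 16, 32, 64, 128, 256, 300]
print(sum(caps))  # 811 seconds, about 13.5 minutes

# Retries needed for the caps to cover a one-hour outage.
retries_for_hour = next(
    n for n in range(1, 100) if sum(celery_backoff_caps(max_retries=n)) >= 3600
)
print(retries_for_hour)  # 20
```

Because jitter pulls the real delays below the caps, treat the computed retry count as a lower bound and add headroom.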