Incident response

When something goes wrong, here is what happens.

We monitor the service through:

  • Uptime checks: External monitoring via Updown.io, checking the web interface, Git API, and CI runner API. Alerts fire on downtime.
  • Infrastructure metrics: Scaleway observability for compute, database, and storage. Alerts fire on resource exhaustion, error-rate spikes, and high latency.
  • Log analysis: Server logs are retained for 30 days and reviewed for anomalies.
  • User reports: Sent by email to security@codebahn.net or through the in-app support chat.

Incidents are classified by severity:

| Severity | Definition | Examples |
|----------|------------|----------|
| Critical | Data breach, unauthorized access to customer data, or complete service outage | Cross-tenant data leak, database compromise, full outage |
| High | Partial service degradation affecting multiple customers, or a vulnerability with high exploit potential | CI runners down, Git push failures, authentication bypass |
| Medium | Limited impact, single-customer issue, or a vulnerability requiring specific conditions | Slow API responses, backup verification failure, low-severity vulnerability |
| Low | Cosmetic, informational, or no direct user impact | Logging gap, documentation error, non-exploitable misconfiguration |

| Severity | Response time | Communication |
|----------|---------------|---------------|
| Critical | Immediate (within 1 hour) | Email to affected customers, status page update |
| High | Within 4 hours | Status page update, email if customer-facing |
| Medium | Within 1 business day | Status page if user-visible |
| Low | Within 5 business days | No external communication unless relevant |
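
The response targets in the table above can be read as a small lookup plus date arithmetic. The following is a minimal Python sketch, not our actual tooling: the names `RESPONSE_TARGETS`, `add_business_days`, and `response_deadline` are hypothetical, and it assumes that business days are Monday to Friday with public holidays ignored and that timestamps share a single time zone.

```python
from datetime import datetime, timedelta

# Response targets copied from the table; names are illustrative only.
# Each entry is (amount, unit), where unit is "hours" or "business_days".
RESPONSE_TARGETS = {
    "Critical": (1, "hours"),
    "High": (4, "hours"),
    "Medium": (1, "business_days"),
    "Low": (5, "business_days"),
}


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturday and Sunday.

    Assumption: public holidays are not taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Return the latest time by which a response is due for this severity."""
    amount, unit = RESPONSE_TARGETS[severity]
    if unit == "hours":
        return detected_at + timedelta(hours=amount)
    return add_business_days(detected_at, amount)


detected = datetime(2024, 3, 1, 15, 0)  # a Friday afternoon
print(response_deadline("Critical", detected))  # 2024-03-01 16:00:00
print(response_deadline("Medium", detected))    # 2024-03-04 15:00:00 (the following Monday)
```

Hour-based targets run on the clock regardless of weekends, while the Medium and Low targets skip them, which is why a Medium incident detected on a Friday is due the following Monday in this sketch.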

For incidents involving personal data, we notify affected data controllers within 48 hours per our Data Processing Agreement and GDPR Article 33.

We communicate about incidents through these channels:

  • Status page: status.codebahn.net for real-time service status and incident updates.
  • Email: Direct notification to affected customers for Critical incidents, and for High incidents that are customer-facing.
  • In-app: Support chat for individual follow-up.

We do not use social media for incident communication.

Every Critical and High incident gets a post-incident review within 5 business days. The review covers:

  1. Timeline: What happened and when.
  2. Root cause: Why it happened.
  3. Impact: What was affected and for how long.
  4. Response: What we did and how fast.
  5. Prevention: What changes prevent recurrence.

We publish a summary for incidents with broad user impact. The summary includes the timeline, root cause, and prevention measures. It does not include internal operational details that could aid future attacks.
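
The split between the internal review and the published summary can be pictured as a record with five sections, of which three are published. This is a minimal Python sketch under assumptions: the class `PostIncidentReview` and the method `public_summary` are hypothetical names, and the sample values are invented for illustration. It only selects sections; removing sensitive operational details from within a section remains a manual editorial step.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PostIncidentReview:
    # The five sections covered by every review.
    timeline: str
    root_cause: str
    impact: str
    response: str
    prevention: str

    def public_summary(self) -> Dict[str, str]:
        """Return only the sections that go into a published summary."""
        return {
            "timeline": self.timeline,
            "root_cause": self.root_cause,
            "prevention": self.prevention,
        }


# Invented example values, for illustration only.
review = PostIncidentReview(
    timeline="Example timeline",
    root_cause="Example root cause",
    impact="Example impact",
    response="Example response",
    prevention="Example prevention measures",
)
print(sorted(review.public_summary()))  # ['prevention', 'root_cause', 'timeline']
```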

If recovery from backup is needed:

  • Daily encrypted backups are available, stored on a separate provider in a separate EU region.
  • Backups are verified weekly with tested restore procedures.
  • Recovery target: restore service from backup within hours, not days.
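
The daily and weekly cadence above lends itself to a simple freshness check. The sketch below is illustrative Python rather than a description of our tooling: `backup_problems` and both thresholds are hypothetical, and it assumes "daily" means no older than 24 hours and "weekly" means no older than 7 days.

```python
from datetime import datetime, timedelta
from typing import List

# Assumed thresholds derived from "daily backups" and "verified weekly".
MAX_BACKUP_AGE = timedelta(days=1)
MAX_VERIFICATION_AGE = timedelta(days=7)


def backup_problems(last_backup: datetime, last_verified: datetime, now: datetime) -> List[str]:
    """Return a list of cadence violations; an empty list means both are on schedule."""
    problems = []
    if now - last_backup > MAX_BACKUP_AGE:
        problems.append("backup older than one day")
    if now - last_verified > MAX_VERIFICATION_AGE:
        problems.append("restore test older than seven days")
    return problems


now = datetime(2024, 3, 10, 12, 0)
print(backup_problems(
    last_backup=datetime(2024, 3, 10, 2, 0),    # 10 hours old: fine
    last_verified=datetime(2024, 3, 1, 12, 0),  # 9 days old: overdue
    now=now,
))  # ['restore test older than seven days']
```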

For the full backup details, see the security overview.