Phare status page

New status page certificates failing

  • Resolved
    The issue has been resolved; new custom domains can now be created.
  • Investigating
    Phare reached a provider-defined limit on the number of certificate hostnames used for status pages, which prevents new custom domains from being added to status pages. Existing status pages with custom domains are not affected.

Degraded access to dashboard, website and documentation

  • Resolved
    The Phare dashboard, website, and documentation were inaccessible for about 10 minutes due to increased latency from Phare's CDN provider. Uptime monitoring and the API remained unaffected.

Delay in sending emails

  • Resolved
    The Scaleway team fixed the issue and released an incident report: https://status.scaleway.com/incidents/cm1lmzpwdcr6
  • Identified
    Scaleway is currently delaying all transactional emails by up to 10 minutes. Support has been contacted to restore normal delivery times.

United Kingdom region rerouted

  • Resolved
    The monitoring agent has been routed back to the United Kingdom and is now working correctly.
  • Monitoring
    The United Kingdom monitoring region has been temporarily rerouted to Spain to provide functional uptime monitoring to all users. The issue is being actively monitored, and Bunny.net is working on a permanent fix.
  • Identified
    The issue is partially fixed: only domains proxied by Bunny.net are now affected in the United Kingdom region.
  • Identified
    A network issue is preventing requests emitted from the United Kingdom region from reaching certain hosts. Phare's hosting provider has been alerted and is actively working on a resolution.
  • Investigating
    The monitoring agent running in the United Kingdom is currently down.

Incorrect downtime notification

  • Resolved
    The rollback to a previous version of the monitoring agent is complete, and erroneous incidents have been deleted. More extensive testing will be done to establish the cause of the issue before future updates.
  • Identified
    The issue has been traced to the latest update of the monitoring agent, and a rollback has been scheduled.
  • Investigating
    A subset of monitors is sending erroneous downtime notifications.

Temporary incident data loss

  • Resolved
    The issue has been resolved, and new tests have been put in place to avoid regressions. Incident data has been restored from a database snapshot created at 20:00 UTC. This leaves a four-minute window in which data could have been permanently lost, but no loss was identified in that window.
  • Identified
    The latest deployment, around 20:04 UTC, introduced a cascading bug in incident deletion. Only programmatic incidents are affected; they are currently not shown on the user dashboard and status pages.

Scheduled maintenance

  • Resolved
    The migration has been completed.
  • Monitoring
    Scheduled maintenance is currently being performed to migrate the uptime monitoring engine to new infrastructure. Some monitoring checks might be skipped for a short period of time, and historical data will not be available in the dashboard for a few minutes.

Inaccurate performance measurement

  • Resolved
    The issue causing inaccurate performance statistics has been resolved for a few days now. We apologize for any inconvenience caused and appreciate your understanding during the resolution of this issue.
  • Investigating
    As a side effect of the monitoring infrastructure migration, reported performance statistics are inaccurate, mostly in the Europe and America regions. We are actively working on a fix.

Some monitoring checks are not executed

  • Resolved
    The infrastructure migration was successful, and uptime monitoring is no longer affected.
  • Monitoring
    The new infrastructure is working as expected. A progressive migration is now in place, with a target of 100% on November 18 at 00:00 UTC.
  • Monitoring
    Phare's edge monitoring infrastructure has been replicated on a new provider, Bunny.net. Ten percent of monitoring requests will now be routed to this new infrastructure to make sure the quality of the monitoring stays consistent between platforms.
  • Identified
    The issue has been traced to an update in Cloudflare routing rules between workers. The update was released as a rolling deployment that affected only a few monitor checks starting November 11 at 19:00 UTC and reached 100% of checks on November 12 around 17:00 UTC.
    
    All requests are routed to Germany for the moment to keep the system working at best capacity, until a workaround can be deployed to allow accurate multi-region checks.
  • Investigating
    A workaround has been implemented to re-route all requests through the Germany region until other regions are working properly again.
  • Investigating
    Uptime monitoring is affected in every region; we are currently investigating the issue.

Documentation is down

  • Resolved
    Phare's documentation is back online after some unexpected downtime at our hosting provider.