Cloudflare

Update November 02, 22:12 EDT: In an update to the incident report, the company says it partially restored power to its core data center in North America and is working to restore full functionality to impacted products.

"Power to Cloudflare’s core North America data center has been partially restored. Cloudflare has failed over some core services to a backup data center, which has partially remediated impact," Cloudflare said.

"Cloudflare is currently working to restore the remaining affected services and bring the core North America data center back online."

Update November 02, 20:12 EDT: A Cloudflare spokesperson told BleepingComputer that the root cause of this ongoing outage is a regional power issue caused by generator failures that took facilities offline.

"We operate in multiple redundant data centers in Oregon that power Cloudflare’s control plane (dashboard, logging, etc). There was a regional power issue that impacted multiple facilities in the region. The facilities failed to generate power overnight. Then, this morning, there were multiple generator failures that took the facilities entirely offline," the spokesperson said.

"We have failed over to our disaster recovery facility and most of our services are restored. This data center outage impacted Cloudflare’s dashboards and APIs, but it did not impact traffic flowing through our global network. We are working with our data center vendors to investigate the root cause of the regional power outage and generator failures. We expect to publish multiple blogs based on what we learn and can share those with you when they're live."

An ongoing Cloudflare outage has taken down many of its products, including the company's dashboard and related application programming interfaces (APIs) customers use to manage and read service configurations.

The complete list of services whose functionality is wholly or partially impacted includes the Cloudflare dashboard, the Cloudflare API, Logpush, WARP / Zero Trust device posture, Stream API, Workers API, and the Alert Notification System.

"This issue is impacting all services that rely on our API infrastructure including Alerts, Dashboard functionality, Zero Trust, WARP, Cloudflared, Waiting Room, Gateway, Stream, Magic WAN, API Shield, Pages, Workers," Cloudflare said.

"Customers using the Dashboard / Cloudflare APIs are impacted as requests might fail and/or errors may be displayed."

Customers currently have issues when attempting to log in to their accounts and are seeing 'Code: 10000' authentication errors and internal server errors when trying to access the Cloudflare dashboard.

Cloudflare says the service issues don't affect cached file delivery via the Cloudflare CDN or Cloudflare Edge security features.

Cloudflare dashboard outage (BleepingComputer)

Data center power outage behind dashboard and API issues

Two hours into the outage, the company revealed that the ongoing issues are due to power outages at multiple data centers.

"Cloudflare is assessing a loss of power impacting data centres while simultaneously failing over services. We will keep providing regular updates until the issue is resolved, thank you for your patience as we work on mitigating the problem," an incident report update said.

This is the second large outage that has hit Cloudflare since the start of the week, with the first one taking down multiple products, including Cloudflare Sites and Services (Access, CDN Cache Purge, Dashboard, Images, Pages, Turnstile, Waiting Room, WARP, Workers KV) on Monday, October 30.

As the company explained in a post-mortem published two days later, the Monday outage was caused by a misconfiguration in the tool used to deploy a new Workers KV build.

Workers KV is "used by both customers and Cloudflare teams alike to manage configuration data, routing lookups, static asset bundles, authentication tokens, and other data that needs low-latency access," Cloudflare's Matt Silverlock and Kris Evans said.

"During this incident, KV returned what it believed was a valid HTTP 401 (Unauthorized) status code instead of the requested key-value pair(s) due to a bug in a new deployment tool used by KV."
