On June 12, 2025, Sourcegraph hosted services (the “service”) experienced a site-wide outage caused by an incident at an upstream provider. During the outage, some users were unable to access the service or experienced transient errors. The service was restored after the cloud provider mitigated the underlying issue.
The affected services included:
Our service uses multiple GCP services to host the Sourcegraph application, e.g., Cloud SQL, Google Kubernetes Engine, Cloud Run, and Cloud Storage. Service Control, one of GCP’s internal services, is on the critical path for almost all public and internal API requests and is responsible for authentication, authorization, and quota enforcement. During the incident, Service Control was down due to an application issue, which affected all downstream GCP services.
Our service relies on several GCP APIs to maintain basic functionality. For example, we use GCP Identity and Access Management (IAM) to permit workloads to access GCP-hosted datastores, such as Cloud SQL and object storage. Because these API endpoints were all affected by the Service Control outage, our service became inaccessible shortly after the cloud provider's outage began.
We confirmed that all services had recovered at 2025-06-12 6:40 PM PDT.
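The dependency chain described above can be sketched as a small graph walk. This is a minimal illustration, not our actual architecture diagram: the component names come from this report, but the edges in DEPENDS_ON and the helper impacted_by are hypothetical and simplified.

```python
from collections import deque

# Hypothetical, simplified dependency map: each key depends on the listed
# components. The edges are illustrative assumptions, not an exact diagram.
DEPENDS_ON = {
    "Sourcegraph hosted service": [
        "Cloud SQL", "Google Kubernetes Engine", "Cloud Run",
        "Cloud Storage", "IAM",
    ],
    "Cloud SQL": ["Service Control"],
    "Google Kubernetes Engine": ["Service Control"],
    "Cloud Run": ["Service Control"],
    "Cloud Storage": ["Service Control"],
    "IAM": ["Service Control"],
}


def impacted_by(failed, depends_on):
    """Return every component that transitively depends on `failed`."""
    # Invert the map so we can walk from a dependency to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


if __name__ == "__main__":
    for name in sorted(impacted_by("Service Control", DEPENDS_ON)):
        print(name)
```

Under these assumed edges, a failure of Service Control reaches every listed GCP service and, through them, the hosted service itself, which matches the site-wide impact described above.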
In addition to the GCP incident, one of our services, Sourcegraph Workspaces, was affected by a Cloudflare outage. The service relies on Cloudflare Workers as a centralized router for all user requests. Between 2025-06-12 10:52 AM PDT and 2025-06-12 1:28 PM PDT, Cloudflare Workers experienced downtime during which almost all user requests failed.
Even after Cloudflare Workers was restored, our service remained inaccessible because of the GCP incident described above.
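For reference, the timestamps in this report can be turned into durations with a few lines of Python. This is only a convenience sketch: the variable names and the pdt helper are illustrative, and the start time of the GCP incident is not stated here, so only the Cloudflare Workers window and the gap until full recovery are computed.

```python
from datetime import datetime, timedelta, timezone

# PDT is UTC-7; a fixed offset is enough for a single-day timeline.
PDT = timezone(timedelta(hours=-7), "PDT")


def pdt(hour, minute):
    """Build a timezone-aware timestamp on 2025-06-12 in PDT."""
    return datetime(2025, 6, 12, hour, minute, tzinfo=PDT)


# Timestamps taken from this report.
workers_down = pdt(10, 52)   # Cloudflare Workers downtime begins
workers_up = pdt(13, 28)     # Cloudflare Workers restored
all_recovered = pdt(18, 40)  # all services confirmed recovered

# Length of the Cloudflare Workers downtime.
workers_outage = workers_up - workers_down

# Time between the Workers restoration and confirmed full recovery.
remaining_after_workers = all_recovered - workers_up

if __name__ == "__main__":
    print("Cloudflare Workers downtime:", workers_outage)
    print("Workers restored to full recovery:", remaining_after_workers)
```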
There are no follow-up actions for this incident. Our team has previously run tabletop exercises for this scenario, in which GCP recovery may take multiple days. Because this was a worldwide outage at our cloud provider, we remain susceptible to such scenarios and will bring our services back as soon as we can.