
About this Session

When production incidents hit, SREs need to triage alerts, correlate signals across services, and pinpoint root causes quickly. In complex microservice architectures, knowing how to move between observability signals efficiently is critical.

In this hands-on workshop, you'll respond to two production incidents on a microservices-based ecommerce platform. You'll investigate using Real User Monitoring (RUM), Session Replay, Application Performance Monitoring (APM), Error Tracking, Infrastructure Monitoring, Log Management, and Metrics. For each incident, you'll isolate the root cause, execute a remediation, and build proactive monitoring. You'll also use Bits AI SRE to see how AI-powered investigation can accelerate your workflow.

By the end of this workshop, you'll have practical experience using Datadog's core observability tools to investigate and resolve real-world production incidents and to build proactive monitoring for them.

DASH 2027 is coming—Be in the know

Sign up for exclusive previews and announcements. Join us in NYC, June 15-17, 2027.
