DASH NYC, June 9-10 | AI + Observability.

From Alerts to Autonomy: Scaling Incident Management at PUBG with Automation and AI

About this Session

Modern game platforms operate across regions, cloud providers, and highly dynamic workloads. At Krafton (maker of PUBG: Battlegrounds), fast-moving teams building player-facing and backend systems faced a key challenge: speed without autonomy creates friction, but autonomy without guardrails creates risk.

Junghun Kim, Lead of the DevOps Team at Krafton, will share how his team transformed incident management from a centralized SRE function into a developer-centric platform powered by Datadog. By combining unified observability, high-signal monitors, and integrated workflows across Datadog Incident Management, On-Call, and Slack, teams can now detect, declare, and respond to incidents with greater ownership.

He will explore how automation and AI reduce cognitive load during incidents: automatically creating context-rich Slack war rooms, enforcing safeguards such as scale-in prevention, capturing response and change context in the incident timeline, and assisting with postmortem reviews.

Through a representative incident-response walkthrough, attendees will gain a practical blueprint for building an incident response model that increases ownership, reduces alert fatigue, and enables faster detection, safer mitigation, and streamlined postmortems.

This session will be part of our livestream programming.