Building a Proactive Incident Management Practice at Swish
Speakers
Itaú’s financial authorizer processes a massive share of Brazil’s instant payment system (PIX), operating under strict sub-500ms SLAs where every millisecond impacts revenue and customer trust. In this talk, Edinei Piovesan, Principal Software Engineer, will share how Itaú increased availability from below 99% to more than 99.95% by combining observability with operational discipline.
At the center are “OpsMeetings”: automated weekly SLO reports delivered to every team, with enforced accountability through live analysis of burn rates, latency spikes, and error budgets. Edinei will explain how this improved incident response while enabling teams to challenge assumptions, uncover hidden latency, and bring previously “untouchable” systems out of the black box.
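For readers less familiar with the terms, the arithmetic behind error budgets and burn rates can be sketched as follows. This is an illustrative sketch only, not Itaú's reporting code: the function names, the 30-day window, and the sample request counts are assumptions made for the example.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over the window.

    The 30-day default window is an assumption for illustration.
    """
    return (1.0 - slo) * window_days * 24 * 60


def burn_rate(bad_events: int, total_events: int, slo: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget is being consumed exactly on pace;
    values above 1.0 mean it will run out before the window ends.
    """
    if total_events == 0:
        return 0.0
    return (bad_events / total_events) / (1.0 - slo)


if __name__ == "__main__":
    # Availability levels mentioned in the abstract.
    for slo in (0.99, 0.9995):
        print(f"SLO {slo:.2%}: {error_budget_minutes(slo):.1f} min of budget per 30 days")

    # Hypothetical sample: 10 failed requests out of 10,000 against a 99.95% SLO.
    print(f"burn rate: {burn_rate(10, 10_000, 0.9995):.1f}")
```

Under these assumptions, moving from 99% to 99.95% availability shrinks the 30-day budget from about 432 minutes (7.2 hours) to about 21.6 minutes, which is why a weekly review of burn rates matters at that level.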
He will also show how Itaú uses “Ask Bits” to standardize “war room prompts” that quickly answer availability and impact questions during incidents, and to validate whether deployments actually reduce latency and SLO violations.
Attendees will leave with practical approaches to making performance visible across their organization, retrieving critical information faster during incidents, and using profiling to uncover performance bottlenecks and cost optimization opportunities.