
International Journal on Science and Technology
E-ISSN: 2229-7677
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
Toward Intelligent Incident Response: A Framework for Self-Healing Production Systems
Author(s) | Pranav Gorak |
---|---|
Country | United States |
Abstract | Modern production systems operate in constantly changing environments shaped by CI/CD pipelines and widespread cloud adoption, which demand fast provisioning together with reliable, resilient operation. Although observability tooling and infrastructure as code have made system behavior much easier to inspect, organizations still rely on manual actions during incident response. This reliance risks delayed deployments, inconsistent behavior, and prolonged outages, particularly when code is deployed frequently or the infrastructure behaves unpredictably. This study therefore proposes a framework for organizing how self-healing production systems operate. The framework ingests real-time telemetry and ties it to the way applications are deployed and configured through GitOps workflows. With Kubernetes as the control layer and deployment across multiple clouds, the system was able to detect anomalies proactively, trigger corrective actions, and reduce mean time to repair (MTTR). By combining telemetry with declarative methods, the framework helps developers manage system recovery safely and quickly. Experimental results indicate that adding intelligent incident response to CI/CD improves system stability, reduces deployment-time risk, and increases the confidence of both development and operations teams. By automating the detection and resolution of common infrastructure and application issues, the platform moves production environments toward greater autonomy and stability. In short, the results suggest that self-healing mechanisms are needed to achieve reliability at scale in today's software systems. |
Field | Engineering |
Published In | Volume 15, Issue 4, October-December 2024 |
Published On | 2024-12-11 |
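
The abstract describes a loop in which telemetry is checked for anomalies, a corrective action is triggered, and the outcome is measured as MTTR. The following is a minimal sketch of that loop, not the author's implementation. It assumes a simple z-score test against a rolling baseline, which the abstract does not specify, and the names `is_anomalous`, `watch`, `mttr`, and the `remediate` callback are all hypothetical. In a GitOps setup the callback might, for example, revert a commit so that the cluster reconciles to the last known-good state.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence, Tuple


def is_anomalous(history: Sequence[float], sample: float, threshold: float = 3.0) -> bool:
    """Flag a sample whose z-score against recent history exceeds the threshold.

    The z-score rule is an illustrative assumption, not the paper's detector.
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return sample != mu  # flat baseline: any deviation counts
    return abs(sample - mu) / sigma > threshold


def watch(
    samples: Sequence[float],
    remediate: Callable[[int, float], None],
    window: int = 5,
    threshold: float = 3.0,
) -> List[int]:
    """Scan telemetry in order and call remediate(index, value) on each anomaly."""
    history: List[float] = []
    triggered: List[int] = []
    for i, value in enumerate(samples):
        if is_anomalous(history[-window:], value, threshold):
            remediate(i, value)  # hypothetical hook, e.g. roll back a deployment
            triggered.append(i)
        else:
            history.append(value)  # only healthy samples update the baseline
    return triggered


def mttr(incidents: Sequence[Tuple[float, float]]) -> float:
    """Mean time to repair: average of (resolved - detected) over all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [resolved - detected for detected, resolved in incidents]
    return sum(durations) / len(durations)
```

Keeping anomalous samples out of the baseline is a deliberate choice in this sketch: a single latency spike should not widen the band that later samples are judged against.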
CrossRef DOI is assigned to each research paper published in our journal.
IJSAT DOI prefix is 10.71097/IJSAT.
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
