Staff Engineer of Virtualization and Management Solutions, Networks Business, Samsung Electronics America
Vijaya Sarathi Mandadi
Successful operating models in any industry thrive on adoption, speed and flexibility. Avoiding inertia is critical during major transformations, and the most successful organizations are those that identify performance gaps and opportunities and pursue them without resistance.
Traditional telco operations revolve around a ticketing system where the Network Operations Center (NOC) is involved in identifying the deviations in the network and tries to fix them. In the case of additional troubleshooting, the issue is escalated to the system design and development team for resolution involving configuration changes, unknown platform issues or a code change.
Now, add an open, virtualized/cloud environment, where there are multiple vendors and a new set of tools and processes that introduce new challenges. As complexity grows, operators are introducing automation and dynamic decisions that will enable optimized real-time RAN performance.
Integration of digital signal processing with machine learning is certain. AI-based spectrum sensing, energy saving, mobility-aware interference mitigation, and channel estimation along with 6G evolution involving integrated sensing and communication (ISAC), non-terrestrial networks (NTN), and heterogeneous situational-aware networks will require large-scale acceleration using GPUs, CPUs, TPUs and FPGA-based solutions. All of this modernization of telco operational requirements needs zero touch provisioning, closed loop automation and quick testing through digital twins.
Given these complexities, operations teams will need exceptional domain knowledge, along with upskilling in how to identify the issues and hurdles that these transformative technologies introduce.
The Need for AI in Modern Telco Networks
The transition from LTE to 5G Standalone (SA) and 6G has fundamentally reshaped the operational data landscape for mobile operators, driving a significant increase in fault, performance, and configuration management data. In 5G SA and O-RAN–compliant networks, the number of network functions has increased due to the disaggregation of the RAN into the RU, DU, CU, SMO, and Non-RT and Near-RT RICs. All of these are becoming independent producers of telemetry and fault events.
As a result, daily fault management (FM) event volumes have risen due to AI-driven anomaly detectors and increased node granularity. Performance management (PM) data exhibits an even more dramatic growth trajectory – the combination of massive MIMO beam metrics, QoS-flow reporting, and slice-level KPIs expands per-cell PM datasets in advanced 5G deployments. 6G’s ISAC capabilities are projected to drive volumes to terabytes per day.
Configuration management (CM) data has similarly expanded, with the number of configurable parameters rising in 5G SA, and could grow even higher in 6G as AI-native RAN introduces dynamic beam configuration, slice lifecycle automation, and continuous RAN optimization loops. Collectively, FM/PM/CM data volumes are expected to increase from LTE to 5G to 6G. It’s apparent that the data explosion in modern telco networks is real.
While this richer telemetry remains essential for AI/ML training, real-time inference, and reinforcement learning, 6G's AI-native architecture is expected to shift how it is consumed. More specifically, the dependency on FM/PM/CM data for manual troubleshooting and issue identification is likely to decrease. As AI systems mature, well-designed embedded management platforms will increasingly automate analysis, correlation, and decision-making. This will significantly reduce the operational need to consult FM/PM/CM data directly, even as that data remains a foundational input for AI systems.
Shift towards Virtualization
Traditionally, the network operates with default configurations and relies on manual monitoring of system logs and performance counters, which limits system performance. By the same token, traditional ticketing systems are built for manual network operations.
The shift to virtualized RAN (vRAN) started a decade ago, pioneered by Samsung and leading operators. vRAN disaggregated network functions from dependency on custom hardware platforms and offered operators the flexibility to select best-in-class telco-grade servers from multiple vendors combined with container as a service (CaaS) providers to offer deployment and planning flexibility. As part of this shift to cloud-native modern architectures, operations must move to more proactive methodologies using AI/ML technology.
Today, most operators still use alarm-based fault detection rather than predictive and automated fault correlation. This means that issues are identified only after customers are already impacted. Human technicians pick through alarms, correlate across systems, dispatch field crews, and troubleshoot on site. This is a slow, error-prone and expensive process.
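To make the contrast with manual alarm triage concrete, here is a minimal sketch of automated alarm correlation in Python. It assumes a simple rule that is not taken from any specific product: alarms from the same site that arrive within a short time window of each other are treated as one incident. The `Alarm` fields, the `correlate_alarms` function and the 60-second window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alarm:
    timestamp: float  # seconds since an arbitrary reference point
    site_id: str      # hypothetical site identifier
    source: str       # reporting function, e.g. "RU", "DU", "CU"
    text: str         # free-form alarm description


def correlate_alarms(alarms: List[Alarm], window_s: float = 60.0) -> List[List[Alarm]]:
    """Group alarms into incidents.

    An alarm joins the latest incident for its site if it arrives within
    window_s seconds of that incident's most recent alarm; otherwise it
    starts a new incident. This is an assumed heuristic, not a standard.
    """
    incidents: List[List[Alarm]] = []
    latest_for_site: Dict[str, int] = {}  # site_id -> index into incidents

    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        idx = latest_for_site.get(alarm.site_id)
        if idx is not None and alarm.timestamp - incidents[idx][-1].timestamp <= window_s:
            incidents[idx].append(alarm)
        else:
            incidents.append([alarm])
            latest_for_site[alarm.site_id] = len(incidents) - 1
    return incidents


if __name__ == "__main__":
    sample = [
        Alarm(0.0, "site-A", "RU", "link down"),
        Alarm(10.0, "site-A", "DU", "fronthaul loss"),
        Alarm(5.0, "site-B", "CU", "high CPU"),
        Alarm(200.0, "site-A", "RU", "link down"),
    ]
    for incident in correlate_alarms(sample):
        print(incident[0].site_id, [a.source for a in incident])
```

A production system would likely correlate on topology and causal relationships rather than on site and time alone. Even so, a simple grouping like this shows how a flood of raw alarms can be reduced to a smaller set of incidents before a human ever looks at them.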
Field operations (technicians, site visits, maintenance, truck rolls, spare parts logistics) remain a major cost driver for telcos. The shift to automation, orchestration and closed-loop operations offers a key way to reduce OPEX by minimizing manual tasks and the human-intensive process.
How AI Transforms Telco Operations
The telecommunications industry is entering a phase where AI is no longer just an enhancement – it is going to be the backbone of how networks are monitored, optimized, and operated. Traditional, reactive operations will be replaced by autonomous, multi-agent AI systems that continuously sense, analyze, and act across every layer of the network.
Today, modern networks generate massive amounts of telemetry: CPU/GPU loads, memory pressure, network counters, RF measurements, spectral scans, and subscriber-level behavior. By integrating AI, networks can transform these collected datasets into actionable intelligence. Capabilities such as predictive maintenance, autonomous fault management and AI-driven optimization can then be activated on the network.
Picture a network where multiple specialized AI agents are working in real time (figure 1 below):
⦁ Resource-Health Agents: Monitoring CPU overloads, GPU utilization, memory saturation, and process anomalies before they trigger outages.
⦁ RAN Intelligence Agents: Performing spectrum sensing, visualizing interference, detecting under-utilized bands, and helping UEs dynamically connect to the best possible spectrum.
⦁ Optimization Agents: Reallocating resources based on load, policy, and traffic type, coordinated by an LLM- or LRM-based decision engine that uses reinforcement learning strategies such as proximal policy optimization (PPO) and group relative policy optimization (GRPO) to determine the best action.
⦁ Fault & Anomaly Agents: Correlating alarms, filtering noise, detecting anomalies early, and reducing false positives through long-term trend learning.
⦁ Dynamic Configuration Agents: Adjusting network parameters autonomously to improve performance, boost spectrum efficiency, enable cell-sleep for energy savings, and maintain load balance.
⦁ Network Slicing Agents: Ensuring each slice receives the correct isolation, bandwidth, and SLA fulfillment in multi-tenant 5G/6G environments.
These agents work in unison to create a self-optimizing, self-healing network, one that identifies issues before they escalate and continuously tunes itself for peak performance. In the long term, autonomous code correction through iterative improvement can enable truly zero-touch networks. However, in the near term, this capability will evolve within controlled CI/CD pipelines with appropriate guardrails, validation, and staged deployments. We expect to see full production-level autonomy as operator trust and system maturity continue to grow.
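As an illustration of how an agent might catch a deviation before it triggers an outage, the following Python sketch flags samples that differ sharply from a rolling baseline. It is a plain rolling z-score check, used here as a stand-in for the learned models an actual agent would use; the function name `detect_anomalies`, the window size and the threshold are assumptions made for this example.

```python
from collections import deque
from statistics import mean, pstdev
from typing import List, Sequence


def detect_anomalies(samples: Sequence[float], window: int = 5,
                     z_threshold: float = 3.0) -> List[int]:
    """Return indices of samples that deviate from the rolling baseline.

    Each sample is compared with the mean and population standard
    deviation of the previous `window` samples. Samples seen before the
    window is full, or while the baseline is perfectly flat, are not scored.
    """
    history: deque = deque(maxlen=window)
    anomalies: List[int] = []

    for i, value in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) / sigma > z_threshold:
                anomalies.append(i)
        # The sample joins the baseline even if flagged, so one spike
        # temporarily widens the tolerance for the samples that follow.
        history.append(value)
    return anomalies


if __name__ == "__main__":
    # Hypothetical CPU utilization readings (percent) for one network function.
    cpu_percent = [40, 42, 41, 39, 40, 41, 85, 42]
    print(detect_anomalies(cpu_percent))
```

The design choice worth noting is that the baseline is relative rather than fixed: a jump from 40% to 85% is flagged even though 85% might sit below a static alarm threshold. That is the kind of early signal the agents described above are meant to act on.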
What This Means for Telco Operations
This shift fundamentally changes the role of operations engineers. Troubleshooting will no longer be about chasing reactive alarms. Instead, engineers must understand:
⦁ The causal chains behind AI decisions
⦁ How multiple agents coordinate actions
⦁ How reinforcement models reach policy-driven outcomes
⦁ How AI influences RAN, transport, and core behaviors in real time
Telco operations will become more efficient, and deeper domain knowledge combined with a strong understanding of AI, ML, and autonomous systems will be essential.
In summary, AI can fundamentally rewrite the playbook for telco operations by introducing autonomous agents that carry out predictive maintenance, dynamic configuration, anomaly detection, and real-time RAN optimization. Operators will transition from reactive firefighting to proactive, intelligent network management. This evolution creates opportunities for higher performance and lower costs, but it also demands a new skillset and mindset across the industry.
Standardizing data formats and interfaces is therefore not optional; it is foundational for AI-driven RAN operations.
The Road Ahead: AI-Native 6G and Autonomous Networks
The future of RAN and telco operations is moving decisively toward AI-native architectures, where networks are deployed, optimized, and healed with minimal human intervention. Concepts like zero-touch provisioning, once considered aspirational, are quickly becoming reality. Thousands of sites, distributed applications, and RAN functions will be brought online through AI-driven orchestration and autonomous lifecycle management.
AI is reshaping telco operations at every layer including RAN, transport, core, OSS, field operations, and customer experience. What was once reactive troubleshooting is now transitioning to predictive intelligence, and soon toward fully autonomous self-optimizing networks. This shift is more than a technology upgrade. It is a fundamental redefinition of how networks operate, how engineers work, and how operators create value.
At the far edge, AI compute embedded at cell sites and other locations will become a defining characteristic of next-generation networks. These sites will evolve into AI factories – localized inference hubs capable of supporting real-time decision making for enterprise and consumer applications. Operators will gain new revenue opportunities through resource sharing and multi-tenant edge compute, on-premise/near-premise AI inference services, low-latency application hosting for vertical industries and dynamic RAN automation as a service.
Telcos that embrace AI-native architectures will evolve from connectivity providers into AI-driven digital service platforms running intelligent, distributed, real-time systems that power the next wave of industry, innovation, and human experience.
The road ahead is not just an evolution of existing telecom infrastructure; it is a transformation toward highly intelligent, distributed, and autonomous systems that blur the boundaries between communication, sensing, and computing.
The journey to AI-native networks for 5G & 6G has only begun, but one thing is clear: AI will not just support future networks; AI will be the network.