High CPU Utilization on SMSC RTRs Due to Third-Party Agent

Overview

A sudden and sustained increase in CPU utilization was observed on two SMSC RTR nodes (RTR01 and RTR04). The increase was identified by internal monitoring tools and confirmed by performance logs and system diagnostics. It was not linked to any configuration or software changes within the Lithium platform. System logs showed the abnormal behavior starting in the late evening of April 23 on RTR01 and in the early morning of April 24 on RTR04, which prompted further investigation.



Solution

The root cause of the high CPU usage was identified as an external process called xagt, which is associated with a third-party security agent developed by FireEye (now Trellix). This agent is not a component of the NewNet Lithium system.


Steps and Observations:

  1. Initial Diagnostics:
    • CPU increase was reported to be abrupt, peaking around 45%.
    • The issue was observed across the day with no specific time pattern.
    • No changes were made to the affected systems on or around April 23 or 24.
  2. Log Analysis:
    • Syslog entries showed relevant activity aligning with the CPU spike:
      • RTR01: Significant activity started at 8 PM on April 23.
      • RTR04: Similar activity began at 3 AM on April 24.
  3. Process Identification:
    • Output from the top command, captured while CPU usage was high, showed xagt as the top resource-consuming process.
    • This process is not part of the Lithium or SMSC software stack.
    • To check CPU usage and resident memory (RSS) usage, run the following commands:
      • top
      • ps aux --sort -rss
  4. Conclusion and Recommendation:
    • The xagt agent is a FireEye (Trellix) endpoint protection or monitoring service.
    • Its high CPU usage suggests possible misconfiguration, resource contention, or malfunction.
    • To resolve this issue, coordination with the system or OS administration team is necessary to:
      • Review the behavior of the xagt service.
      • Evaluate recent updates or configuration changes to this agent.
      • Consider disabling or reconfiguring the agent to mitigate its impact on system performance.
    • The same steps apply to any other third-party agent that causes high CPU usage.
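
The process checks in step 3 can be collected into a small script, which helps when several snapshots are needed to show the OS administration team how the agent behaves over time. The following is a minimal sketch, not part of the original investigation. It assumes a Linux host with procps ps and top (the --sort and -C options are procps-specific), and the function names top_cpu, top_rss, and agent_cpu are hypothetical helpers introduced only for illustration. Note that the %CPU value reported by ps is CPU time divided by process lifetime, so it lags behind the live figure shown by top.

```shell
#!/usr/bin/env bash
# Sketch: snapshot the heaviest CPU and memory consumers and report the
# CPU share of one named agent. Assumes Linux with procps ps/top.

# Process name from this article; override for any other third-party agent.
AGENT_NAME="${AGENT_NAME:-xagt}"

# top_cpu N: header plus the N processes with the highest %CPU.
top_cpu() {
  ps -eo pid,user,pcpu,pmem,rss,comm --sort=-pcpu | head -n "$(( ${1:-5} + 1 ))"
}

# top_rss N: header plus the N processes with the largest resident memory.
top_rss() {
  ps -eo pid,user,pcpu,pmem,rss,comm --sort=-rss | head -n "$(( ${1:-5} + 1 ))"
}

# agent_cpu NAME: summed %CPU of all processes whose command name is NAME.
# Prints 0.0 when no such process is running.
agent_cpu() {
  ps -C "$1" -o pcpu= | awk '{ sum += $1 } END { printf "%.1f\n", sum + 0 }'
}

echo "== $(date) =="
echo "-- Top processes by CPU --"
top_cpu 5
echo "-- Top processes by RSS --"
top_rss 5
echo "-- ${AGENT_NAME} total %CPU (ps lifetime average) --"
agent_cpu "$AGENT_NAME"

# One batch-mode iteration of top gives a view closer to the interactive one.
if command -v top >/dev/null 2>&1; then
  echo "-- top (batch mode, first lines) --"
  top -b -n 1 | head -n 15
fi
```

Running the script at intervals (for example from a loop or a scheduler) and saving the output with a timestamp gives a record that can be compared against the syslog times noted in step 2.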

  1. Randall Shawver
