NewNet Lithium MGR GUI Is Reachable but Stuck/Unresponsive (Restore messages.xml and Validate /tmp/data)

Overview

If the NewNet Lithium MGR web interface loads but then freezes (reported as “GUI is stuck and unresponsive”) while the “Manager process is not appearing,” a strong indicator is repeated MGR log entries stating that the UI message catalog cannot be loaded.

The most common confirmed indicator is a missing or unreadable /var/TextPass/MGR/xml/messages.xml. This can prevent the GUI from rendering correctly and can generate excessive error-log I/O. Related conditions (such as missing session artifacts under /tmp/data, poller job failures, and extreme OS uptime) can further degrade responsiveness.

Solution

Symptoms (How to Recognize This Issue)

You may observe one or more of the following:

  • The MGR GUI loads but becomes frozen: “GUI is stuck and unresponsive”
  • The manager function appears missing: “Manager process is not appearing”
  • RMD-side poller/service instability near the incident time, such as:
    • fta.service stop-sigterm timed out. Killing.
    • fta.service: main process exited, code=killed, status=9/KILL
    • Unit fta.service entered failed state.
    • Cannot create lock on /var/TextPass/MGR/pid/MGR.lock ...
  • In MGR application logs (typically under /var/TextPass/MGR/logs/), very frequent errors like:
    • I/O warning : failed to load external entity /var/TextPass/MGR/xml/messages.xml (often repeated thousands of times)
    • Issuing rollback() due to DESTROY without explicit disconnect() of DBD::mysql::db handle database=mgr_domain_main;localhost;3306;... at /usr/TextPass/lib/MGR/Db.pm line 351.
  • Legacy log file note: if /var/TextPass/MGR/tp_mgr_error_log.txt is not updating, use the newer/active logs under /var/TextPass/MGR/logs/ for current diagnostics.
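To gauge how severe the error churn is, you can count occurrences of the catalog-load failure across the active logs. A minimal sketch, assuming logs named mgr_error_log.* under /var/TextPass/MGR/logs/ as described above (the count_catalog_errors name is illustrative):

```shell
# Count catalog-load failures in each MGR error log file.
# The mgr_error_log.* naming follows this article; adjust for your deployment.
count_catalog_errors() {
    # $1: directory containing the MGR logs
    grep -cF 'failed to load external entity /var/TextPass/MGR/xml/messages.xml' \
        "$1"/mgr_error_log.* 2>/dev/null
}

# Example: count_catalog_errors /var/TextPass/MGR/logs
```

A count in the thousands over a short window matches the "often repeated thousands of times" pattern above.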

Root Cause (Confirmed Indicator)

A missing or unreadable MGR UI message catalog file:

  • /var/TextPass/MGR/xml/messages.xml

This file is loaded by many GUI/HTTP operations. If it is missing or inaccessible, the GUI may fail to render correctly and can appear stuck/unresponsive while generating heavy error-log churn.
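A quick existence/readability check can confirm this root cause before restoring anything. A minimal sketch (the check_catalog name is illustrative; stat -c assumes GNU coreutils):

```shell
# Report whether the MGR UI message catalog is present and readable.
# Note: run as (or via sudo -u) the web/app user where possible;
# root can read files that the GUI process cannot.
check_catalog() {
    f="${1:-/var/TextPass/MGR/xml/messages.xml}"
    if [ ! -e "$f" ]; then
        echo "MISSING: $f"
        return 1
    elif [ ! -r "$f" ]; then
        echo "UNREADABLE: $f"
        return 2
    fi
    echo "OK: $f ($(stat -c '%U:%G %a' "$f"))"
}
```

MISSING or UNREADABLE here, combined with the repeated "failed to load external entity" log line, confirms this root cause.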

Related / Contributing Conditions (Can Compound Instability)

  1. Missing session XML artifacts under /tmp/data/<session-hash>/...

    This can occur after /tmp is purged (for example, reboot/tmpwatch/systemd-tmpfiles) or if /tmp/data is not writable/creatable by the web/app user.

  2. Poller/transfer subsystem instability and error load

    • fta.service being killed by systemd due to stop timeout
    • Lock contention on .../MGR.lock
    • Repeated scheduler/transport errors such as:
      • Host key verification failed.
      • Could not setup ssh master connection to 127.0.0.1: 65280
      • File::Rsync: log-file - unknown option at /usr/TextPass/lib//MGR/FTA/Transport/RsyncSSH.pm line 54.
    • Polling/connectivity errors such as:
      • No response from remote host <bat_device_ip>
      • ssh: connect to host <remote_ip> port 22: Connection timed out
  3. Long OS uptime on MGR nodes

    Very high uptime can correlate with increased instability risk over time (including SNMP TimeTicks-related risk referenced in SNMP restart guidance). Periodic planned full OS reboots are recommended as preventative maintenance, especially for extreme uptime nodes.
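As a rough guard for the uptime concern above, the check can be scripted. A sketch that flags nodes approaching the 32-bit SNMP TimeTicks wrap (TimeTicks is a 32-bit centisecond counter, which wraps after roughly 497 days); the 400-day threshold is an arbitrary example, not a product limit:

```shell
# Print whole days of uptime; reads /proc/uptime (Linux).
uptime_days() {
    awk '{ print int($1 / 86400) }' "${1:-/proc/uptime}"
}

days=$(uptime_days)
if [ "$days" -ge 400 ]; then   # example threshold below the ~497-day TimeTicks wrap
    echo "WARN: uptime ${days}d - schedule a planned OS reboot"
fi
```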

Data to Collect (If It Is Happening Now or Recurs)

Run the following on the affected MGR node (use a 15-minute window before/after the incident time):

1) Process/service state

su - textpass -c tp_status
systemctl --no-pager -l status mgr fta ftransfer
journalctl --since "<YYYY-MM-DD HH:MM>" --until "<YYYY-MM-DD HH:MM>" -u fta -u ftransfer --no-pager

2) System logs

tail -n 200 /var/log/messages

3) MGR application logs (preferred over stale legacy tp_mgr_error_log.txt)

ls -larth /var/TextPass/MGR/logs/

Provide the latest mgr_error_log.* (and any relevant GUI/request logs present there).
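To hand everything over in one step, the files above can be bundled into a single archive. A minimal sketch (the collect_bundle helper and output name are illustrative; missing paths are skipped rather than failing the whole bundle):

```shell
# Bundle existing diagnostic files/dirs into one tarball; skip absent paths.
collect_bundle() {
    # $1: output tarball; remaining args: files/dirs to include
    out="$1"; shift
    paths=""
    for p in "$@"; do
        [ -e "$p" ] && paths="$paths $p"
    done
    [ -n "$paths" ] || { echo "nothing to collect" >&2; return 1; }
    # word-splitting of $paths is intentional (no spaces in these paths)
    tar czf "$out" $paths 2>/dev/null
}

# Example (paths from this article):
# collect_bundle /tmp/mgr-diag-$(date +%Y%m%d%H%M).tgz \
#     /var/log/messages /var/TextPass/MGR/logs
```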


Resolution Procedure

Step 1 — Verify the required UI catalog file exists

On the affected MGR node:

ls -la /var/TextPass/MGR/xml/messages.xml

If the file is missing or unreadable:

  • Restore messages.xml from a known-good backup of the same environment, or from the NewNet Lithium installation media/package that matches your deployment.
  • Ensure ownership/permissions allow the web/application user to read it (the exact user varies by environment; commonly the web server user such as apache).
  • Validate directory contents and permissions:
ls -la /var/TextPass/MGR/xml/
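After restoring the file, ownership and mode can be set and verified in one step. A minimal sketch, assuming the web server user is apache (verify the actual user for your environment, as noted above):

```shell
# Make the restored catalog readable by the web/app user.
fix_catalog_perms() {
    # $1: path to messages.xml; $2: owner as user:group
    chown "$2" "$1" 2>/dev/null || true   # requires root on a real node
    chmod 644 "$1"                        # world-readable; tighten per policy
    ls -la "$1"
}

# Example: fix_catalog_perms /var/TextPass/MGR/xml/messages.xml apache:apache
```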

Step 2 — Validate /tmp/data health (session artifacts)

Check that /tmp has space and that /tmp/data exists and is writable:

df -h /tmp
ls -la /tmp/data/

If /tmp/data does not exist, create it and apply appropriate ownership/permissions for your web/app user and security policy.
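The step above can be sketched as follows; the apache owner and 770 mode are assumptions to adapt to your web/app user and security policy:

```shell
# Recreate /tmp/data with ownership/permissions for the web/app user.
ensure_tmp_data() {
    # $1: directory; $2: owner as user:group
    mkdir -p "$1"
    chown "$2" "$1" 2>/dev/null || true   # requires root on a real node
    chmod 770 "$1"                        # example mode; adjust per policy
}

# Example: ensure_tmp_data /tmp/data apache:apache
```

If /tmp is periodically purged in your environment, consider also whether cleanup policy (tmpwatch/systemd-tmpfiles) should exclude this path.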

Step 3 — Restart MGR after restoring required files

If mgr.service exists:

systemctl status mgr
systemctl stop mgr
systemctl start mgr
systemctl status mgr

Then verify TextPass process status:

su - textpass -c tp_status

Step 4 — Reduce ongoing error load (recommended)

If you see repeated poller job failures (SSH/rsync/host key verification, missing export directories), address them to prevent future hangs:

  • Fix SSH trust/known_hosts issues for the relevant endpoints (for example, errors like Host key verification failed.).
  • Correct rsync configuration/compatibility causing errors such as:
    • File::Rsync: log-file - unknown option ...
    • ... RsyncSSH.pm line 54.
  • Create/repair any required export/report directories referenced in errors.
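For the host-key failures specifically, a common repair is to drop the stale known_hosts entry and re-learn the current key. A hedged sketch (the refresh_host_key name is illustrative; always verify the fetched key's fingerprint out-of-band before trusting it in production):

```shell
# Replace a stale SSH host key for one endpoint.
refresh_host_key() {
    # $1: hostname/IP; $2: known_hosts file (defaults to the caller's)
    kh="${2:-$HOME/.ssh/known_hosts}"
    ssh-keygen -R "$1" -f "$kh" >/dev/null 2>&1 || true   # remove stale entry
    # Re-learn the key; confirm its fingerprint before relying on it.
    ssh-keyscan -T 5 "$1" >> "$kh" 2>/dev/null || true
}

# Example: refresh_host_key 127.0.0.1
```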

Step 5 — Preventative maintenance: planned full OS reboot for extreme uptime nodes

If an MGR node has extremely high OS uptime, schedule a planned full OS reboot (not only restarting TextPass processes) according to your operational procedures. Capture tp_status before and after to confirm a clean baseline uptime reset.

Verification

After completing Steps 1–3:

  1. Confirm the GUI is responsive (log in and navigate pages that previously froze).
  2. Confirm the key log error stops repeating:
    • I/O warning : failed to load external entity /var/TextPass/MGR/xml/messages.xml
  3. Confirm services are stable:
systemctl --no-pager -l status mgr fta ftransfer
  4. Confirm TextPass processes are operating:
su - textpass -c tp_status

If the GUI is still unresponsive after restoring messages.xml and restarting MGR, collect the “Data to Collect” bundle above and include the exact incident timestamp for correlation.

Frequently Asked Questions

1. What exact log message most strongly indicates this specific GUI-freeze root cause?

Repeated entries like:

I/O warning : failed to load external entity /var/TextPass/MGR/xml/messages.xml

This indicates the GUI message catalog file (/var/TextPass/MGR/xml/messages.xml) is missing or not readable by the application.

2. Why is /var/TextPass/MGR/tp_mgr_error_log.txt not updating?

In some environments, that legacy file may be stale while current logging is written under /var/TextPass/MGR/logs/. For active troubleshooting, review and provide the latest mgr_error_log.* files from /var/TextPass/MGR/logs/.

3. Do I need to reboot the OS to fix the stuck GUI?

Typically, restoring messages.xml, validating /tmp/data, and restarting mgr resolves the GUI-freeze behavior. However, for nodes with extremely long OS uptime, a planned full OS reboot is recommended as preventative maintenance to reduce long-uptime instability risk.

Posted by Priyanka Bhotika