Overview
The tp_ssi process is stuck in the "starting" state, with error messages such as:
Cannot start transaction
Failed to setup a connection to the cluster manager
Failed to connect to database
Possible Causes
- MySQL/MariaDB not running on RTR nodes: The RTRs act as API clients to the NDB cluster via MySQL. If MySQL is stopped or missing, tp_ssi cannot fetch subscriber data.
- Misconfigured MySQL error log path: /etc/my.cnf may reference a log file in a non-existent directory (e.g., /var/log/mariadb/mariadb.log), causing MySQL startup to fail immediately.
- Connectivity issues to the SPF nodes: RTRs may fail to reach the SPF cluster management node (ndb_mgmd) because network connectivity is blocked or port 1186 is unreachable.
- SPF NDB cluster instability or offline state: When SPF cluster services (e.g., ndbmtd, MySQL, or the management service) are restarted or unstable, all RTRs can briefly lose contact and place tp_ssi into a "starting" or flapping state.
In all of these cases, the underlying problem is the same: the tp_ssi process cannot establish stable communication with the SPF(s), so it remains in the "starting" state.
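The misconfigured log path cause can be checked quickly before attempting a MySQL restart. The following is a minimal sketch, not a supported tool: `check_log_dir` is a hypothetical helper name, and it assumes the error log is set with a plain `log-error=` (or `log_error=`) line in the file you pass in. It does not follow `!include` or `!includedir` directives.

```shell
# Report whether the directory of the log-error file named in a
# my.cnf-style file exists.
# Returns 0 if it exists, 1 if it is missing, 2 if no setting is found.
check_log_dir() {
    local cnf="$1" path dir
    # Take the last log-error / log_error assignment; commented-out
    # lines do not match because the pattern is anchored at line start.
    path=$(sed -n -E 's/^[[:space:]]*log[-_]error[[:space:]]*=[[:space:]]*//p' "$cnf" | tail -n 1)
    if [ -z "$path" ]; then
        echo "no log-error setting found in $cnf"
        return 2
    fi
    dir=$(dirname "$path")
    if [ -d "$dir" ]; then
        echo "ok: $dir exists"
    else
        echo "missing: $dir"
        return 1
    fi
}

# Example (run on the affected RTR):
#   check_log_dir /etc/my.cnf
```

If the directory is reported as missing, create it with the ownership your MySQL/MariaDB package expects before starting the service again.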
Resolution Steps:
1. Check NDB Connection:
ndb_mgm -e show
Run this command from the SPF or the MGR. It returns a list of nodes. Ensure that all RTR nodes and all SPF nodes show as "Connected".
- If any RTR node does not show as "Connected", check whether MySQL is running on that RTR:
systemctl status mysql
- If it is running, check connectivity between that node and the SPF cluster on the IPs and ports used for that connection.
- If it is not running, try starting it:
systemctl start mysql
Then check the status again. If it is still not running, or it failed to start, investigate using standard MySQL/Linux troubleshooting.
- If any SPF node does not show as "Connected", check whether mysql and ndbmtd are running on that node:
systemctl status ndbmtd
systemctl status mysql
- If they are running, check connectivity between that node and the MGR.
- If they are not running, try restarting them in this order:
systemctl stop mysql
systemctl stop ndbmtd
systemctl start ndbmtd
systemctl start mysql
If they are still not running, proceed with standard NDB cluster/Linux troubleshooting.
- Once all nodes are running again, run
ndb_mgm -e show
from the MGR or the SPF nodes again to ensure that all nodes are active and communicating as they should be.
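When the cluster has many nodes, it can help to filter the output of ndb_mgm -e show down to the nodes that need attention. The sketch below assumes that disconnected nodes are marked with the text "not connected" in that output, as in stock MySQL NDB Cluster releases; adjust the pattern if your build prints something different. `list_disconnected` is a hypothetical helper name.

```shell
# Read `ndb_mgm -e show` output on stdin and print every line that is
# marked "not connected". Returns 1 if any such line is found, else 0.
list_disconnected() {
    local lines
    # grep exits non-zero when nothing matches; that is the healthy case.
    lines=$(grep -i 'not connected' || true)
    if [ -n "$lines" ]; then
        printf '%s\n' "$lines"
        return 1
    fi
    return 0
}

# Example (run on the SPF or the MGR):
#   ndb_mgm -e show | list_disconnected && echo "all nodes connected"
```

Each printed line contains the node id, which you can match against the section of the output it came from to tell whether it is an RTR (API) node or an SPF (data) node.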
2. Check SPF Connectivity:
There may be situations in which the RTR or MGR nodes cannot connect to the SPF.
- To verify which SPF the node is configured to connect to, check near the end of the common_config.txt file. The lines there define which SPF the node connects to, via which IP, and using which port.
- Test connectivity from each RTR node to each SPF node by using a simple ping test. This should succeed if the required connectivity is present.
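A successful ping only shows that the host answers ICMP; it does not show that a specific TCP port is open. As a complement, the sketch below tests a TCP port directly using bash's built-in /dev/tcp redirection and the coreutils `timeout` command. `port_open` is a hypothetical helper name, 192.0.2.10 is a placeholder address, and 1186 is used only because it is the management port mentioned above; substitute the IP and port from your own configuration.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds,
# non-zero otherwise (refused, filtered, or timed out).
port_open() {
    local host="$1" port="$2"
    # Inside the child bash, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (placeholder address, run from an RTR):
#   if port_open 192.0.2.10 1186; then echo "reachable"; else echo "blocked"; fi
```

If ping succeeds but this check fails, a firewall rule or a stopped service on the target node is a likely place to look next.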
3. Check SPF Status:
- To check the status of the SPFs themselves, run the following on each SPF node:
tp_status spf_core
If it is not running, you will need to investigate why the SPF is down.
Once the NDB and MySQL processes are running on all nodes, all nodes have the required connectivity, and the SPFs are running, tp_ssi may still not be running. In that case, restart tp_ssi on the affected nodes:
tp_stop --tp_ssi
tp_start --tp_ssi
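Several of the steps above start a service and then check its status again. A service may need a few seconds to come up, so a single immediate check can be misleading. The following is a small, generic polling sketch; `wait_for` is a hypothetical helper name, and the attempt count and delay in the example are arbitrary.

```shell
# Run a command repeatedly until it succeeds.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_for() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the final one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example: allow MySQL up to 10 checks, 3 seconds apart, to become active.
#   systemctl start mysql
#   wait_for 10 3 systemctl is-active --quiet mysql || echo "mysql did not start"
```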
Frequently Asked Questions
- How do I know if this error applies to my situation?
- You'll see the error message "Cannot start transaction, tname:'profile'" when the tp_ssi process is stuck in the starting state, indicating transaction failures.
- What should I do if MySQL fails to start?
- Check the output of journalctl -xe -u mysql.service for more details on why MySQL is not starting successfully. Ensure the log directory exists and has the correct ownership.
- How can I verify if the RTR is connected to the NDB cluster?
- Use the command ndb_mgm -e show from the SPF node to verify if the RTR shows "Connected" in the cluster configuration.
Priyanka Bhotika