Just a quick post with some additions to the situation covered in
Just a short recap:
- a customer experienced a long delay (slow connection) when running tnsping against an 11.2 database on AIX while the DNS server was unreachable:
tnsping '(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = dns-hostname)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = XXX)))'
TNS Ping Utility for IBM/AIX RISC System/6000: Version 188.8.131.52.0 – Production
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = dns-hostname)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = XXX)))
OK (150010 msec)
- the cause was the IPv6 support added in Oracle 11g: even when the connection hostname was resolved to an IPv4 address through /etc/hosts, the client still tried to resolve it to an IPv6 address as well, and that lookup went to the unreachable DNS server.
- the issue was fixed by replacing local, bind with local4, bind4 in /etc/netsvc.conf.
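For reference, the changed entry would look roughly like the fragment below. This is a sketch that assumes the resolution order is set on the hosts line of /etc/netsvc.conf and that it previously read hosts = local, bind; a real file may list other resolvers as well.

```
hosts = local4, bind4
```

With local4 and bind4, the resolver should only look up IPv4 addresses in /etc/hosts and in DNS, so no IPv6 query is left waiting on the unreachable DNS server.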
So why am I doing this recap?
Because the problem above was only one part of an issue encountered while testing an IBM PowerHA (HACMP) cluster with Oracle resources.
- a two-node IBM PowerHA (HACMP) cluster
- Oracle database software was installed on both nodes
- a single-instance (non-RAC) database was created
- IBM PowerHA (HACMP) was used as the fail-over solution
- the Oracle listener and instance were configured as cluster resources
- everything was working as expected, including fail-over of the services to the other node in case of a node failure
- during one of the tests the DNS server became inaccessible
- IBM PowerHA (HACMP) tried to check the Oracle listener but could not, because tnsping responded only after 150 seconds and the cluster resource timeout was lower than that
- IBM PowerHA (HACMP) stopped all cluster resources, including the Oracle database, the listener, the shared disk resource, etc., and failed over to the second node
- on the second node the situation was the same, so IBM PowerHA (HACMP) could not check the cluster resources after startup and decided to stop them all too
- there was no fail-over back to the first node
- the cluster was completely down because of a DNS issue: the DNS hostname could not be resolved to an IPv6 address locally
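The failure mode in the list above, a check that would eventually answer OK but takes longer than the monitor is willing to wait, can be sketched in a few lines of Python. This is only an illustration and not how PowerHA implements its monitors: run_check is a hypothetical helper, and a sleeping child process stands in for the slow tnsping.

```python
import subprocess
import sys
import time


def run_check(cmd, timeout_s):
    """Run a health-check command and classify the result.

    Returns (status, elapsed_seconds), where status is one of
    "ok", "failed" or "timed_out".
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The check might have succeeded eventually, but from the
        # monitor's point of view a slow check and a dead resource
        # look exactly the same.
        return "timed_out", time.monotonic() - start
    status = "ok" if result.returncode == 0 else "failed"
    return status, time.monotonic() - start


if __name__ == "__main__":
    # Stand-in for the slow tnsping: a child process that sleeps 2 seconds.
    slow_check = [sys.executable, "-c", "import time; time.sleep(2)"]
    # Deadline shorter than the check: reported as a timeout.
    print(run_check(slow_check, timeout_s=1))
    # Deadline longer than the check: reported as healthy.
    print(run_check(slow_check, timeout_s=5))
```

The same healthy resource is reported as timed out or as ok depending only on the deadline, which is what happened here on both nodes: the listener was fine, but the check took longer than the cluster resource timeout.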
That’s the complete short story of how a whole IBM PowerHA (HACMP) cluster became unavailable because of the IPv6 support included in Oracle 11g combined with a DNS issue.
PS: I remember at least one more issue similar to this one, where the cluster manager tried to check a cluster resource and could not because of some problem. It decided that the resource was dead and simply killed the production Oracle database.