Intermittent Slow Oracle Net from Windows 7/Vista/2008 machines

Hi.

I have faced an interesting issue that is worth a blog post.

After rebooting a Windows 2003 server running Oracle Database 9.2, some client machines (running Windows 7) started to work very slowly with the database. Other clients, running Windows XP, worked fine as usual. There were no network hardware issues on the Windows 7 machines: both kinds of machine were tested on the same network cable, and Windows XP worked fine while Windows 7 was very slow…

Oracle sessions were traced with event 10046 (level 12), and the trace revealed no issues in the database, only in the network…

The listener was restarted, but that didn’t help…

System administrators noticed that the Windows 2003 machine was complaining that it could not update its root certificates, so they decided to give it an Internet connection and update them. The updates required a server reboot and… voilà… after the server rebooted, everything was OK from every workstation.

Nobody was able to explain THE REASON, but EVERYBODY WAS HAPPY.

Two days later they needed to reboot that server once again, and after the reboot the Windows 7 workstations started working slowly again…

I was on site, so it was time to see what was happening.

We noticed that only some statements performed slowly; others were fast. Only SELECT queries that returned rows performed badly; other statements, like UPDATE, DELETE, ALTER …, worked just fine.

After enabling Oracle Net tracing we could see where most of the time was spent.

Here we are running a simple SELECT NULL FROM DUAL:
[20-OCT-2011 14:23:01:858] nspsend: 00 00 00 00 28 53 45 4C  |....(SEL|
[20-OCT-2011 14:23:01:858] nspsend: 45 43 54 20 4E 55 4C 4C  |ECT.NULL|
[20-OCT-2011 14:23:01:858] nspsend: 20 46 52 4F 4D 20 44 55  |.FROM.DU|
[20-OCT-2011 14:23:01:858] nspsend: 41 4C 20 00 00 00 00 00  |AL......|
...
[20-OCT-2011 14:23:01:859] nsdo: switching to application buffer
[20-OCT-2011 14:23:01:859] nsrdr: entry
[20-OCT-2011 14:23:01:859] nsrdr: recving a packet
[20-OCT-2011 14:23:01:859] nsprecv: entry
[20-OCT-2011 14:23:01:859] nsprecv: reading from transport...
[20-OCT-2011 14:23:01:859] nttrd: entry
[20-OCT-2011 14:23:07:059] nttrd: socket 224 had bytes read=620

Every time, we spent about 5 seconds waiting for data from the network, so it was clear that this was not an Oracle Net (SQL*Net) issue; most likely it was an issue in the underlying TCP/IP layer…
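In a short excerpt the pause is easy to spot by eye, but a real trace file is long. Below is a minimal sketch of how such gaps could be found automatically. It assumes every trace line starts with a timestamp in the same format as the excerpt above and that the last field is milliseconds; the names parse_line and find_gaps are made up for this illustration and are not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp prefix as seen in the excerpt, e.g. "[20-OCT-2011 14:23:01:858]".
# Assumption: the last field is milliseconds.
TS_RE = re.compile(
    r"^\[(\d{2})-([A-Z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2}):(\d{3})\]\s*(.*)$"
)
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def parse_line(line):
    """Return (timestamp, message) for a trace line, or None if it has no timestamp."""
    m = TS_RE.match(line.strip())
    if not m:
        return None
    day, mon, year, hh, mm, ss, ms, msg = m.groups()
    ts = datetime(int(year), MONTHS[mon], int(day),
                  int(hh), int(mm), int(ss), int(ms) * 1000)
    return ts, msg


def find_gaps(lines, threshold_seconds=1.0):
    """Yield (gap_seconds, previous_message, message) for slow steps in the trace."""
    prev = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip "..." and other lines without a timestamp
        if prev is not None:
            gap = (parsed[0] - prev[0]).total_seconds()
            if gap >= threshold_seconds:
                yield gap, prev[1], parsed[1]
        prev = parsed


# Three lines copied from the excerpt above.
sample = [
    "[20-OCT-2011 14:23:01:859] nsprecv: reading from transport...",
    "[20-OCT-2011 14:23:01:859] nttrd: entry",
    "[20-OCT-2011 14:23:07:059] nttrd: socket 224 had bytes read=620",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap:.3f}s between '{before}' and '{after}'")
```

To scan a whole trace, pass an open file object instead of the sample list; the 1-second threshold is an arbitrary starting point.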

We disabled the firewalls on the workstation and the server, but it didn’t help…

After some Googling we found the reason for this behavior: a feature of the Next Generation TCP/IP Stack called Receive Window Auto-Tuning.

“TCP AutoTuning enables TCP window scaling by default and automatically tunes the TCP receive window size for each individual connection based on the bandwidth delay product (BDP) and the rate at which the application reads data from the connection. Theoretically, with TCP auto-tuning, network connection throughput in Server 2008 should be improved for best performance and efficiency”
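To make the "bandwidth delay product" in that quote more concrete, here is a small worked calculation. The link speeds and round-trip times are hypothetical numbers picked for illustration, not measurements from our network; the only hard fact used is that the TCP window field is 16 bits wide, so without window scaling a receiver cannot advertise more than 65,535 bytes.

```python
# Largest receive window that fits in the 16-bit TCP header field
# when window scaling is not in use.
MAX_UNSCALED_WINDOW = 65535


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: the data 'in flight' needed to fill the link."""
    # bits/s * ms / 1000 = bits; / 8 = bytes. Integer math avoids float rounding.
    return bandwidth_bits_per_s * rtt_ms // 8000


# Hypothetical links: (label, bandwidth in bits per second, round-trip time in ms).
links = [
    ("100 Mbit/s LAN, 1 ms RTT", 100_000_000, 1),
    ("100 Mbit/s WAN, 50 ms RTT", 100_000_000, 50),
]

for label, bandwidth, rtt in links:
    bdp = bdp_bytes(bandwidth, rtt)
    verdict = "needs window scaling" if bdp > MAX_UNSCALED_WINDOW else "fits in an unscaled window"
    print(f"{label}: BDP = {bdp} bytes, {verdict}")
```

With these example numbers the LAN case needs only about 12 KB in flight, so a large auto-tuned window mostly matters on fast links with high latency.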

So it was designed to help, but in this particular case it behaved in a strange way…

Some Internet sources claim that the issue may lie not in Windows 7 but at the network layer: some old switch that doesn’t support the TCP/IP Window Scaling feature. That wasn’t our case, because nothing changed at the network level between the Windows 2003 reboots. It’s still unknown why rebooting the Windows 2003 server with Oracle Database installed triggered the issue in the first place, and why rebooting that server again temporarily resolved it.

We fixed this issue on the client side with the following Windows command:

netsh interface tcp set global autotuninglevel=disabled
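Before changing the setting on many workstations it may help to check the current value and to know how to go back. The sketch below only builds the netsh command lines; the helper names show_command and set_command are made up for this example. It assumes the standard autotuninglevel values of netsh and that "normal" is the Windows default, and it runs the read-only "show global" command only when executed on Windows. The "set" commands need an elevated (administrator) prompt.

```python
import platform
import subprocess

# Values accepted by "netsh interface tcp set global autotuninglevel=...".
LEVELS = ("disabled", "highlyrestricted", "restricted", "normal", "experimental")


def show_command():
    """Command that prints the current global TCP settings, including the auto-tuning level."""
    return ["netsh", "interface", "tcp", "show", "global"]


def set_command(level):
    """Command that sets the receive window auto-tuning level."""
    if level not in LEVELS:
        raise ValueError(f"unknown autotuninglevel: {level!r}")
    return ["netsh", "interface", "tcp", "set", "global", f"autotuninglevel={level}"]


# The fix used in this post, and the command to revert to the default.
print(" ".join(set_command("disabled")))
print(" ".join(set_command("normal")))

# Only the read-only query is executed here, and only on Windows.
if platform.system() == "Windows":
    subprocess.run(show_command(), check=False)
```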

Conclusion:

  • it’s not an Oracle Client performance issue
  • it’s not an Oracle Server performance issue (shown by tracing with event 10046 level 12)
  • it’s not an Oracle Net (SQL*Net) performance issue (shown by Oracle Net tracing)
  • it may be seen when some old network hardware is used, but not necessarily
  • it may be seen from Windows Vista, Windows 7 and Windows 2008 machines
  • it is an example of a feature that does not work the way it was designed

14 thoughts on “Intermittent Slow Oracle Net from Windows 7/Vista/2008 machines”

  1. AMAZING… This solution worked for me. My problem was extreme slowness when running one of our critical applications with the Oracle 11g client on Windows 7. Thanks!!!

  2. This solution worked for me too on windows 2008 server and windows 7
    Thanks very much! …you are the best in world, I hope you will increase the salary

  3. Thanks, this solved our problem too. We had very low performance to some Oracle IAS servers (webforms) and normal performance to servers on a different location. Only the (new) win 7 clients had the problem. (win XP was OK)
    This solution can also be found on Oracle support (“metalink”) [ID 1449731.1]. Well done !

  4. Switches usually work at layer 2, and as such they have no notion of TCP (layer 4), and even if we introduce a layer 3 switch (AKA router) one would have to actively do something to make it interfere with TCP, so “some old switch that doesn’t support TCP/IP Window Scaling feature” is pure nonsense from my perspective. TCP is a transport layer protocol between hosts.

    That being said, various Windows implementations of TCP/IP have presented us with many interesting challenges over the years, and this provides useful insight into one such.

    • Erik,
      Thanks for pointing out how switches work!
      I think this post was written because of a feature implementation issue,
      and it is not actually related to any kind of network hardware ‘issue’

  5. Please, any advice for a similar situation, but working fine on a Win7 workstation, but very slow on a WinXP one? “Server” is WinXP, Oracle 10g. Thank you very much!
