Troubleshooting ‘Log File Sync’ Waits

I have been contacted by one of our customers to provide reference information on troubleshooting Oracle Log File Sync waits.

I think this information is worth a short blog post.

Reasons:

  • Log File Sync waits occur when sessions wait for redo data to be written to disk
  • typically this is caused by slow writes (I/O subsystem saturation, …) – see the quick check after this list
  • spikes in Log File Parallel Write, as shown by James Morle
  • or the application is committing too frequently
  • improper operating system configuration (check MOS note 169706.1)
  • CPU saturation (very high CPU demand => LGWR waiting on the run queue; see Kevin Closson’s post)
  • high log parallelism, which saturates the filesystem/OS, as investigated by Nikolay Savvinov
  • bugs in Oracle (especially with the RAC option) and 3rd-party software (like ODM/DISM)
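
A quick way to see whether the waits come from slow redo writes or from something else (commit rate, CPU starvation) is to compare the average times of ‘log file sync’ and ‘log file parallel write’. A minimal sketch, assuming access to V$SYSTEM_EVENT (cumulative values since instance startup):

  -- If both events show high average waits, the redo I/O path is slow;
  -- if only 'log file sync' is high, look at commit frequency, CPU starvation or post/wait issues.
  SELECT event,
         total_waits,
         ROUND(time_waited_micro / 1000000) AS time_waited_s,
         ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 1) AS avg_wait_ms
  FROM   v$system_event
  WHERE  event IN ('log file sync', 'log file parallel write');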

Recommendations:

  • tune the LGWR process for good throughput, especially when ‘log file parallel write’ is high too:
    • do not put redo logs on RAID 5 without a good write cache
    • do not put redo logs on Solid State Disk (SSD)

    It looks like the last recommendation was based on old experience with SSD disks; it is obsolete now, and even Oracle recommends using SSDs for redo logs (1566935.1, Implementing Oracle E-Business Suite 12.1 Databases on Oracle Database Appliance):

“Move REDO log files to +REDO diskgroup on Solid State Disks (SSDs).”

  • if CPUs are saturated (check the run queue with vmstat):
    • check for non-Oracle system activity, like gzip or bzip2 running during business hours…
    • lower the instance’s CPU usage (for example, tune SQL to reduce LIOs)
    • increase LGWR priority (renice or _high_priority_processes; see the parameter sketch after this list)
  • decrease the number of COMMITs in applications with many short transactions
  • use asynchronous commits – COMMIT WRITE [BATCH] NOWAIT (10g+) – when possible; see the first sketch after this list
  • do some processing with NOLOGGING (or maybe even with _disable_logging=TRUE, but only when measuring performance impact on a test/benchmark system), and think about database recoverability
  • lower the system’s CPU usage or increase LGWR priority
  • if you see spikes in ‘log file sync’, try disabling Adaptive Log File Sync (_use_adaptive_log_file_sync=FALSE; see the parameter sketch after this list)
  • check whether 3rd-party software or utilities like RMAN are active on the same disks where the redo logs are placed, or whether trace/systemstate dump files are written there, etc.
  • if you are on a multi-CPU/core system, try restricting log parallelism (_log_parallelism_max=1)
  • trace LGWR as a last resort when troubleshooting OS/3rd-party issues 😉
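
To make the commit-frequency and NOLOGGING items concrete, here is a minimal sketch (the “first sketch” referenced above); the table, sequence and object names are hypothetical examples:

  -- Asynchronous, batched commit (10g+): the session does not wait for LGWR
  -- on every COMMIT, which removes most of the 'log file sync' time.
  -- Acceptable only where losing the last few commits on a crash is tolerable.
  INSERT INTO audit_log (id, msg) VALUES (audit_seq.NEXTVAL, 'row processed');
  COMMIT WRITE BATCH NOWAIT;

  -- NOLOGGING helps only direct-path operations, and the loaded data cannot be
  -- recovered from the redo stream – take a backup afterwards.
  ALTER TABLE stage_data NOLOGGING;
  INSERT /*+ APPEND */ INTO stage_data SELECT * FROM external_stage;
  COMMIT;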
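
And a sketch of the parameter-level changes mentioned above (the “parameter sketch”). These are underscore parameters, so the values below are assumptions on my side – test them first and/or agree them with Oracle Support:

  -- Disable adaptive switching between post/wait and polling (11.2+),
  -- useful when 'log file sync' shows periodic spikes.
  ALTER SYSTEM SET "_use_adaptive_log_file_sync" = FALSE;

  -- Restrict redo generation parallelism on multi-CPU/core systems (restart needed).
  ALTER SYSTEM SET "_log_parallelism_max" = 1 SCOPE = SPFILE;

  -- Run LGWR at elevated priority so it does not sit on the run queue (restart needed).
  ALTER SYSTEM SET "_high_priority_processes" = 'LMS*|LGWR' SCOPE = SPFILE;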

9 thoughts on “Troubleshooting ‘Log File Sync’ Waits”

  1. Interesting post, Oleksandr, thanks

    But why “do not put redo logs on Solid State Disk (SSD)”?
    Oracle itself uses SSD very actively in Exadata (for redo), and we have some positive experience with SSD for redo as well
    What can be wrong with SSD?

    • Igor,
      It’s because in Exadata SSD IS NOT even USED as THE ONLY STORAGE option for redo logs.
      It’s used as a COMPLEMENT to traditional HDD (to overcome another issue –
      not having enough disk spindles, and low IOPS as a result):
      DB nodes get their WRITE CONFIRMATION once the redo data has been successfully
      written to either HDD or SSD (whichever is first), so SSD and HDD COMPLEMENT EACH OTHER,
      because neither one is perfect…

      It’s also because of the SSD Write Penalty that comes from Garbage Collection.

      SSD is especially GOOD for SMALL READ (and even WRITE) IOs,
      which is not what we see in the REDO write profile.

      Additionally, you may be interested in reading:
      Gwen Shapira: “De-Confusing SSD (for Oracle Databases)”

      • I’ve read the recommendations you pointed to, and some others from various MOS notes too, and I’m sure this theory is a bit outdated

        Look at practical results from AWR for a 1-hour period –

        1) ASM-iSCSI-SAS configuration:

                                                                       Avg                
                                                  %Time Total Wait    wait    Waits   % DB/bg
          Event                             Waits -outs   Time (s)    (ms)     /txn   time
          -------------------------- ------------ ----- ---------- ------- -------- ------
        ...
          log file sync                    43,605     0      2,963      68      0.1   10.1
        ...
          log file parallel write         193,830     0      2,067      11      0.3   31.2

        2) ASM-iSCSI-SSD configuration:

          log file sync                    55,095     0        574      10      0.1    2.7
        ...
          log file parallel write         302,177     0        622       2      0.4   21.5

        And for another heavy-loaded OLTP system with direct-attached SSD and ASM:

        Event                            Waits %Time-outs Total Wait Time (s) Avg wait (ms) Waits/txn % DB time
        ...
        log file sync                4,741,351          0              12,006             3      0.31     12.88
        ...
        log file parallel write      3,367,407          0               1,047             0      0.22     10.74

        The same recommendations may be found in Steve Shaw’s Improve Database Performance: Redo and Transaction Logs on Solid State Disks (SSDs)

        Maybe the “do not put redo logs on Solid State Disk (SSD)” advice is a bit of a tricky tip from Oracle as a hardware company, with some marketing influence? 🙂

  2. Igor,
    I support your thoughts about “this is a bit outdated theory” in the sense that
    SSD manufacturers constantly improve their Garbage Collection techniques, making them asynchronous, etc.,
    and I suppose that enterprise-level SSDs should be good for redo too,
    but my personal opinion is that an SSD is just a tool and has to be used in the right place,
    where we can get the most out of it.
    BTW:
    “ASM-iSCSI-SSD” and “another heavy-loaded OLTP system with direct-attached SSD”
    may use RAM-based SSD 😉

      • Igor,
        it was just a kind of joke to show
        that I really don’t know your environments
        and that there are some other attributes that
        need to be taken into account to make a correct judgement,
        like:
        – direct-attached SSD is not the same as network-attached SSD
        – what “network” means in the particular case
        – what SSD means in the particular case
        – …
        so there are more questions to ask than conclusions to draw

  3. Nice summary of a complex and sometimes confusing topic, Oleksander. And thank you for linking to my blog.

  4. Igor, getting 2ms “log file parallel write” time on SSD, vs. 11ms on iSCSI mostly means you solved a problem unrelated to spinning disks by using SSD… writing to a disk without seeks should not take 11ms, which means either something went wrong with the storage config (buffers? saturation?) or you are sharing the disks and therefore performing seeks. On the other hand, 2ms is SLOW for SSD, and you may want to check why you are not getting the write speed you clearly paid for…
