Oracle Crash Recovery
In this post we will discuss how oracle does crash recovery or instance recovery. Crash recovery is when oracle database is restarted after crashing while running due to any reason(power gone,resource shortage etc.). Oracle database has to return to a consistent state after the restart. Before the crash many transactions may be inflight which were not committed yet and some of the transactions might have been committed but because the crash happens before the check pointing the data files on disk and the data blocks related to those transactions may not have been written to by DBWR process. So after the restart oracle db should have only the committed transactions before the crash and any uncommitted transactions should be discarded. The committed transaction details must have made it to the disk but there may be uncommitted transaction changes also present on disk. the uncommitted block being present on disk is not very obvious and we will try to demystify this phenomenon in this post. For understanding this we would have to delve into the oracle db architecture. Three background processes are at play here lgwr,dbwr and ckpt.
1. LGWR ? Whenever any transaction is committed writing to oracle data files synchronously can be very costly as oracle would have to find the actual blocks which have to be written to on the disk and these blocks may be scattered and random read/writes on disk is very expensive. So oracle writes all the changes done to it in a file which is only in append mode and is written sequentially. This file is called redo log and any changes done to the db can be replayed from it. LGWR is the process responsible for writing to the red logs from redo log buffers. Redo log buffers are circular in nature and they are reused. As soon as any transaction is committed LGWR persists the data in redo logs and signals the db after which user receives the message of commit successful. Sometimes during period of high db activity even before commit the redo log buffers can be written to log files in case buffer is filled more then 1/3 rd. This way uncommitted data also makes it way to the disk.
2.CKPT- Check pointing is the activity which modifies controlfile and datafile headers indicating the SCN upto which redo logfile data have been synchronized between db buffer cache and disk and signals DBWR process to flush the dirty blocks to disk.
3.DBWR ? Whenever any block is requested from oracle db it is fetched into the db buffer cache. All the dml activities happen in db buffer cache(in memory) and it is asynchronously synchronized with disk by DBWR process on being signaled by CKPT process. when any block in buffer cache is updated and has not been synchronized yet it is called dirty block or buffer. Before DBWR can write to disk all the corresponding redo entries must have been written to redo log files. sometimes when long running transactions are at play the number of buffers getting dirty increases. when clean buffers are required and not available the dirty buffers on cold end of buffer list i.e least recently used blocks are flushed to disk even if the commit instructions have not been received. When DBWR flush happens corresponding redo log buffers are also force flushed to redo log files by lgwr if it had not happened yet by then.
So from above explanation we see two possibilities of uncommitted transactions being made to disk
?When redo log buffers get filled at a high speed lgwr is forced to write uncommitted changes also to redo log files
?When dirty buffers increase a lot in buffer cache due to long running transactions accompanied with high db activity and clean buffers are required for any processing forcing dbwr to write uncommitted changes to the disk.
If we want to have quick crash recovery then frequent checkpoints should happen. As checkpoints mark the point in redo log file from where the roll forward process should start as prior to that the redo log changes and datafile are in sync. In the roll forward stage oracle replays all the redo files required which make the oracle database up to date till the time crash happened with all the committed and uncommitted transactions. After that roll backward process is started which removes all the uncommitted transactions form the db using the undo segments and makes the database consistent as of point in time before crash.