INSTANCE RECOVERY – Oracle database have inherit feature of recovering from instance failures automatically. SMON is the background process which plays a key role in making this possible. Though this is an automatic process that runs after the instance faces a failure, it is very important for every DBA to understand how is it made possible. In this blog, we are going to understand the same with the help of a flow diagram and a demonstration.
Instance recovery is the way of bringing back the transactions which are not yet written to the disk by DBWR at the moment of instance failure. Instance failures are possible in multiple ways: hardware issues, power glitches, network outages, abnormal instance shutdowns (SHUTDOWN ABORT). Database startup after any of these failures will automatically trigger the instance recovery. Let us now get a clear understanding of this process with the help of the flow diagram and with hands-on activity as well. This exercise would similar to the one we did in “Why should we configure limits.conf for Oracle database?” blog.
Recovery concepts:
Consider the image above, let us get into details of this flow.
- At 9:00 AM consider your database is working normally.
- At 9:10 AM think of database running 30 transactions in which 20 are committed and 10 are uncommitted, they are yet to be decided by user process to commit or rollback.
- While this transaction is still running, let us consider that instance is terminated at 9:11 AM.
- Now as we bring up the instance, a perfect recovery is the one that makes sure 20 transactions are not lost and 10 uncommitted transactions are lost. Oracle does it perfectly. Hoping so, startup up the instance normally.
- Once database mounts, then the initial phases of recovery begin.
- With the help of CKPT background process, instance recognizes any loss of transactions with the compatibility checks of SCN numbers between control files and data files.
- All the transactions which are lost without updated to the datafiles are re-ran with the help of REDO Which means that all 30 transactions are ran as on 9:10 AM now. This action is called as ROLL FORWARD.
- Database then checks the accessibility of UNDO segments and opens the database.
- On opening the database, now the 10 transactions which are not committed are rolled back making the database consistent with 20 committed transactions with the help of UNDO This action is called as ROLL BACK.
- Now that instance recovery is completed, Database is not recovered with 20 committed transactions. But yes, blocks would be updated into datafiles asynchronously by DBWR as usual.
We will now see the demonstration of the same to relate it with the reality.
- Database is working normally now.
SQL> select instance_name from v$instance;
INSTANCE_NAME
—————-
ORADB
Transactions are always running the database in the SYSTEM and SYSAUX tablesapces. We do not need to run any as of now.
- Let us terminate the instance now.
[oracle@OEL-12c trace]$ ps -ef | grep smon
oracle 4029 1 0 Oct21 ? 00:00:03 ora_smon_ORADB
oracle 26131 1172 0 16:32 pts/0 00:00:00 grep –color=auto smon
[oracle@OEL-12c trace]$ kill -9 4029
- Let us now startup the database and monitor the alert log file. Below is the important output of the alert log file.
Started redo scan
Sun Oct 22 16:34:11 2017
Completed redo scan
read 2450 KB redo, 553 data blocks need recovery
Sun Oct 22 16:34:11 2017
Started redo application at
Thread 1: logseq 11, block 15046
Sun Oct 22 16:34:11 2017
Recovery of Online Redo Log: Thread 1 Group 2 Seq 11 Reading mem 0
Mem# 0: /oradata/ORADB/onlinelog/o1_mf_2_drl8n920_.log
Mem# 1: /oradata/fra/ORADB/onlinelog/group2_member2.log
Sun Oct 22 16:34:11 2017
Completed redo application of 1.92MB
Sun Oct 22 16:34:11 2017
Completed crash recovery at
Thread 1: logseq 11, block 19947, scn 3934834
Alert log clearly says that REDO has been applied with the transactions which are not committed when instance terminated.
- As soon as the instance is opened, roll back operation is performed and below is the output of the same.
SQL> SELECT r.NAME “RB Segment Name”, dba_seg.size_mb,
DECODE(TRUNC(SYSDATE – LOGON_TIME), 0, NULL, TRUNC(SYSDATE – LOGON_TIME) || ‘ Days’ || ‘ + ‘) ||
TO_CHAR(TO_DATE(TRUNC(MOD(SYSDATE-LOGON_TIME,1) * 86400), ‘SSSSS’), ‘HH24:MI:SS’) LOGON,
v$session.SID, v$session.SERIAL#, p.SOSID,
v$session.USERNAME, v$session.STATUS
FROM v$lock l, v$process p, v$rollname r, v$session,
(SELECT segment_name, ROUND(bytes/(1024*1024),2) size_mb FROM dba_segments
WHERE segment_type = ‘TYPE2 UNDO’ ORDER BY bytes DESC) dba_seg
WHERE l.SID = p.pid(+) AND
v$session.SID = l.SID AND
TRUNC (l.id1(+)/65536)=r.usn AND
l.TYPE(+) = ‘TX’ AND
l.lmode(+) = 6
AND r.NAME = dba_seg.segment_name
ORDER BY size_mb DESC;
RB Segment Name SIZE_MB
—————————— ———-
LOGON SID SERIAL#
——————————————————– ———- ———-
SOSID USERNAME STATUS
———————— —————————— ——–
_SYSSMU3_3285411314$ 2.13
00:00:09 19 41123
26475 ACTIVE
I love to see your comments!
Feel free to give additional information in the comments, that I can validate and include in blog posts or update this as well.
Sriram bandarupalli
Very useful information sir.i would like to learn how should I contact you sir.
Pawan Kumar Yaddanapudi
Thank you !!! You can get the coordinates from ContactUs page.
HENRY ASHU
hello sir what is the cost one course like performance tuning
prathap reddy
That’s awesome explanation with flow diagram ,Thank you so much for uploading