This is in response to Aaron Nelson’s call for T SQL Tuesday – on the subject of Logging. There are many things that come to a DBA’s mind while talking of logging. For this I chose an incident that happened some years ago. I had joined a new job, and the dba who had taken my spot had been there for several years before she left. She had set up most of the maintenence jobs on existing servers, most of which were on SQL Server 2000 then.
On the second day of my job I was asked to restore a backup as of a certain date/time to a development server to recover some data for an application manager. I was directed to the place where backups were stored. I found the full backup, and hourly transaction logs following that. As I set about restoring, I got a certain log that gave me a missing LSN error. The log job was set to run hourly, and the timestamps showed all logs were there and there were no errors on the job. I looked further to see if anything else was going on at that time when the log was backed up – I found a reindexing job in which the recovery model was being set to simple and back to full again. In 2005 and above if this change is done – the log chain is broken and one gets an error that a full backup is needed before restarting it. But in 2000 there was no such thing. The log chain was still broken but the job was not throwing any errors to show that. The backups were just not usable.
The process had been in place for more than 4 years, and luckily there had been no requirements to use the transaction log backups during this time interval. They even had a testing mechanism for testing backups, but they did not test backups during this particular sequence of operations. After that episode I also changed the testing process to work through job cycle – through all jobs and maintenence windows, in case there was any issue. It was a great lesson for me on backup testing and how many issues can remain dormant for years until one gets them accidentally. Luckily it was not a crisis situation when it was discovered but it might very well have been. Test your backups, and through jobs and processes as necssary, not just during small windows.