Windows Services and Oracle Processes
I use an XE version of Oracle 11.2 on a Windows laptop as a sandbox database. It’s convenient for things like exploring new features, trying out development ideas in SQL and PL/SQL, and isolating schemas and data objects for research. Plus, it’s a free download and pain-free installation. But all of my admin experience with Oracle lies within a Unix/Linux context. Since the XE database pretty much runs itself for months and years, having to do some emergency DBA config work can come as a surprise and require research.Admins and developers accustomed to Linux expect to use startup and shutdown commands for database control, as well as the lsnrctl utility for managing TNS which allows users to connect to any database. All of these routine DBA basics unnervingly become point-and-click operations under Windows. Further, the well-known Oracle background processes which negotiate the many database activities and memory operations in Oracle, and which are independently inspectible, all but disappear under Windows, concealed within the umbrella ‘Oracle application’. XE installations on Windows create several buttons within the Windows Start Menu: one each for starting and stopping the database, another for opening up a SQL*Plus command line window, and perhaps one for SQLDeveloper. The session display (above) shows what happens when you click the Start Database icon. At this point the DB is fully usable, but if you check the Windows Task Manager you wont find anything related to Oracle under the Processes tab. Under the Services tab (right) you will see a few items related to Oracle, but nothing which correspond to the usual 20-plus Oracle background processes. The two relevant ones are OracleServiceXE and OracleXETNSListener corresponding to the database proper and the TNS listener respectively. In the screenshot (above) you can see that these two services are ‘running’ as a result of the successful Start Database operation. All of this so far is default Windows XE installation behavior. Now, suppose you are done with Oracle work for awhile and want to shut it down. The Stop Database button will accomplish this for you, but a quick check of the Task Manager reveals that the TNS listener is still running afterwards. Absent a lsnrctl utility, Task Manager lets you manipulate individual Windows services by highlighting one and right-clicking. But look (left) what happens when you try to stop it. Even though we’re running as the Windows Administrator, we do not have sufficient access privileges for the TNS service. Instead, Windows forces you to click on the ‘Services…’ button at the bottom right corner of the panel which brings you to another (extended) view of the Windows Services while apparently conferring the necessary privileges. Here, you can finally stop the desired service, OracleXETNSListener. Take note of something important: the column for ‘Startup Type’ setting. In the screenshot, the values for the Oracle DB and TNS services are set to “Automatic”, not “Manual”. Again, this is all default installation stuff. This means that whenever the OS bounces due to some blue screen problem or perhaps because of something ancillary like a Chrome browser crash, Oracle will startup silently behind the scenes. while you believe it is shut down. I recommend setting these to “Manual” for sandbox databases on non-dedicated laptops so that you definitely know when the database is active (and possibly generating trace) and when not. You simply need to remember to restart the TNS service manually when you want to use Oracle. It’s not just of theoretical interest whether Oracle services are running or stoppable when the database is shutdown on Windows. Or whether some Oracle services are manual or automatically enabled whenever Windows reboots. It is not unusual at all that Windows conducts frequent bounces of the system due to freezes caused by memory or file management issues. Windows is a much less stable OS in this regard than say, Linux. You can very easily have some expensive (disk-eating) Oracle service running in the background after an OS bounce, while believing you’ve managed everything properly, according to a default installation. This is what happened in the Case of the Missing 100GB. A quick internet search revealed that this scenario is not uncommon. Hence, the reason I decided to write this article.
Runaway Trace Files and CountermeasuresOracle XE can potentially generate a large number of trace files, occupying a lot of disk space, fast. It does so while creating a large number of new folders in Windows. For my sandbox, I tallied up the disk utilization for the raw database, software, and tablespace objects I had created and compared this with the amount of space occupied by trace files and other diagnostics at a given point in time.
DB + Software + Data: 4GB
Trace Files: 95GB
The quantity of files, including folders, which had been generated were more than 2.6 million! All while thinking the database was quietly off.
The entire tracing architecture changed with Oracle 11g. It’s been incorporated into a monitoring and issue-reporting utility called ADR (Automatic Diagnostic Repository) which generates many kinds of trace information for both system and user actions, including TNS diagnostics. The adrci tool is a command-line interface for inspecting these numerous trace incidents and also for packaging them up for sending to Oracle Support. Locations and database parameters related to tracing have also been revised. The long familiar BACKGROUND_DUMP_DEST, USER_DUMP_DEST and CORE_DUMP_DEST have been collapsed into the single 11g parameter DIAGNOSTIC_DEST. The default location for this parameter is the /diag directory, which is placed under $ORACLE_HOME. It is still possible (not clear for how much longer though) to disable ADR-style extended tracing and revert back to the old style via the DIAG_ADR_ENABLED parameter. Note: this is not a standard database parameter, but rather a SQL*Net setting. By default in 11g, ADR is running, so to unset it and revert to older tracing, you must explicitly place the following entry in your $ORACLE_HOME/network/admin/sqlnet.ora config file:
DIAG_ADR_ENABLED = OFF
To see which kinds of tracing are currently being routed where, you can consult the view V$DIAG_INFO. Here’s an Oracle documentation overview about the new diagnostics subsystem.
Automating Trace Cleanup
If you’re thinking cleaning up this mess is simple and quick, it’s either because you speak Windows-Batch, or like me, you’ve been lulled into complacency by a couple of decades of exposure to Unix shell tools. Writing and testing some Windows batch commands to eliminate unwanted aged files and folders proved time-consuming. I know that people exist who are fluent in this stuff, but geez, (BEGINNING OF RELIGIOUS RANT) what an ugly mishmash of flags, hyphens, quotes, and idiosyncratic arbitrariness Windows batch syntax is (END OF RELIGIOUS RANT)! Once you do arrive at something which works, you are in for a lengthy wait while it executes. (This is likely made worse because only sparse and fragmented free disk space may remain.)
After it finishes, it will be necessary to run a CHKDSK and defrag the disk. Finally, you’ll want to create a scheduled batch executable which repeats this cleanup process at regular intervals, say every 2 or 4 weeks, so you can happily forget about it. So – here for posterity are the steps for problem resolution and the Windows batch command for cleanup, which uses the forfiles command:
1) Ensure the Windows Oracle XE database is shutdown
2) Use Task Manager to halt any leftover Oracle services
3) Force all Startup settings to ‘Manual’ for Oracle services
(in case Windows bounces)
4) Locate /diag folder and check extent of space utilization
5) Execute BATCH code to delete unwanted .TRM, .TRC and folders
(use desired aging variable; example is 15 days)
6) CHKDSK and defrag entire disk
7) Optional: Turn off ADR using SQLNET.ORA
8) Optional: Schedule regular BATCH job cleanup in Windows (Step #5)