ClearDS_RTLogRTStatus - Script & DataStage job to Clear DataStage RTLOG and RTSTATUS

Thursday, May 21, 2015


1. Introduction

DataStage job logs and status files can grow very large because of system issues or poor design, and this degrades job performance. Clearing these logs manually from DataStage Director or Administrator becomes cumbersome when many jobs in the same folder have huge logs.


  • Example of a system issue: a DataStage job deployed as an ISD (Information Services Director) service starts looping and keeps creating logs for each instance.
  • Example of poor design: logging a few thousand result rows to the job log using the Peek stage.
We will go over a process to execute CLEAR.FILE RT_LOG and CLEAR.FILE RT_STATUS automatically.
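
For a single job, the two commands target the files RT_LOGnnn and RT_STATUSnnn, where nnn is the job's internal ID. A minimal sketch that only prints the commands for one job; the helper name clear_commands_for and the ID 123 are made up for illustration, and how the commands reach the engine shell depends on your installation:

```shell
#!/bin/sh
# Print the two engine commands that clear the log and status files of one job.
# clear_commands_for is a hypothetical helper; 123 is a made-up Job ID.
clear_commands_for() {
    job_id="$1"
    printf 'CLEAR.FILE RT_LOG%s\n' "$job_id"
    printf 'CLEAR.FILE RT_STATUS%s\n' "$job_id"
}

clear_commands_for 123
```

Nothing is executed against DataStage here; the function only builds the command text.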


Please find the source code for the .sh and .dsx files attached at the end of this post.
This post is for IBM DataStage developers with DataStage and basic shell scripting knowledge.
The script and job have been tested with DataStage version 8.7. They may need changes if IBM changes the log file format.

2. ClearDS_RTLogRTStatus.dsx - DataStage Job to Identify Job ID

Every DataStage job in a project has a unique Job ID, and this Job ID is required to clear the job's RT_LOG and RT_STATUS. You can view the Job ID manually in the job log (Figure 3.a).


To pull the Job ID automatically, I created a DataStage server job (ClearDS_RTLogRTStatus) that uses the predefined routine UtilityHashLookup("DS_JOBS", input.JobName, 5). The job accepts a sequential file of job names as input (Figure 3.b) and creates an output file (Figure 3.c) containing each job name and its Job ID. The input file listing the jobs whose logs need to be cleared is a prerequisite for this job.

Figure 3.a – Job ID in Job log


Figure 3.b – Input

Figure 3.c – Output
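
Before clearing anything, it can help to sanity-check the output file, since a job name that does not exist in the project would presumably not yield a numeric Job ID. The sketch below assumes the output file holds one "JobName,JobID" pair per line; the comma delimiter, the sample job names, and the helper name check_job_ids are assumptions, not something the post specifies:

```shell
#!/bin/sh
# Assumed layout (not confirmed by the post): one "JobName,JobID" pair per line.
# Report every job whose Job ID is missing or not purely numeric.
check_job_ids() {
    while IFS=, read -r job_name job_id; do
        # Skip blank lines
        [ -z "$job_name" ] && continue
        case "$job_id" in
            ''|*[!0-9]*) echo "no valid Job ID for: $job_name" ;;
        esac
    done < "$1"
}

# Demo against a throwaway sample file with made-up job names
tmp_dir="$(mktemp -d)"
printf 'ISD_Job_A,101\nISD_Job_B,\n' > "$tmp_dir/sample_out.txt"
check_job_ids "$tmp_dir/sample_out.txt"
rm -rf "$tmp_dir"
```

If your output file uses a different delimiter, change the IFS value accordingly.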

3. ClearDS_RTLogRTStatus.sh - Script to clear the RT_LOG and RT_STATUS
I created a script (ClearDS_RTLogRTStatus.sh) that calls “CLEAR.FILE” for RT_LOG and RT_STATUS. It loops over the output of the DataStage job and clears the RT_LOG and RT_STATUS of every listed job.
The requirements of this script are:
  • There should be an input folder at the same directory level as the script, and it should hold the output of the DataStage job ClearDS_RTLogRTStatus (Figure 4.a).

  • The script should be called with two parameters, DATASTAGE_PROJECT_NAME and INPUT_FILE_NAME.
Example:
ClearDS_RTLogRTStatus.sh ISD_DEV ISD_Jobs_Listing_OUT.txt

DATASTAGE_PROJECT_NAME - Project in which your jobs reside
INPUT_FILE_NAME - Output of DataStage job ClearDS_RTLogRTStatus

  • The script execution logs are captured in the logs folder (the logs and temp folders are generated automatically during the first run).
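
The requirements above can be sketched roughly as follows. This is not the attached script, only an illustration of the described behaviour: it assumes "JobName,JobID" lines in the input file, and DRY_RUN, run_engine_command, clear_jobs, and the run.log file name are hypothetical. By default it only prints the commands it would issue, because the way engine commands are executed depends on your installation:

```shell
#!/bin/sh
# Sketch only: mirrors the described behaviour, not the attached script.
# Assumption: each input line looks like "JobName,JobID".

DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch; default is to print, not execute

run_engine_command() {
    # $1 = project name, $2 = engine command
    if [ "$DRY_RUN" = "1" ]; then
        echo "[$1] $2"
    else
        # Placeholder: plug in however your installation runs engine commands
        echo "not executed (no engine call in this sketch): $2" >&2
        return 1
    fi
}

clear_jobs() {
    project="$1"
    input_file="$2"
    while IFS=, read -r job_name job_id; do
        [ -z "$job_name" ] && continue
        case "$job_id" in
            ''|*[!0-9]*)
                echo "skipping $job_name: no numeric Job ID" >&2
                continue
                ;;
        esac
        run_engine_command "$project" "CLEAR.FILE RT_LOG$job_id"
        run_engine_command "$project" "CLEAR.FILE RT_STATUS$job_id"
    done < "$input_file"
}

main() {
    if [ "$#" -ne 2 ]; then
        echo "Usage: $0 DATASTAGE_PROJECT_NAME INPUT_FILE_NAME" >&2
        return 1
    fi
    # Resolve the script's own directory; input/, logs/ and temp/ live next to it
    base_dir="$(cd "$(dirname "$0")" && pwd)"
    mkdir -p "$base_dir/logs" "$base_dir/temp"
    clear_jobs "$1" "$base_dir/input/$2" 2>&1 | tee -a "$base_dir/logs/run.log"
}

# Only run when arguments are supplied, so the functions can also be reused
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

Saved next to the input folder, the sketch takes the same two arguments as the original script (project name, then input file name) and appends what it printed to logs/run.log.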

Figure 4.a – Script Location and input directory

4. Downloadable Modules

1 comment

  1. I ran into a bug when finding the Job ID from the DataStage job log, and with the help of this blog post I was able to fix it.

