Model 204 Avoiding File Reorganizations
By James Damon
When File Reorganization Is Required
If you have converted all fields from KEY and NUMERIC RANGE attributes to the ORDERED attribute, and current file designs have been optimized, there are few reasons to reorganize files. Although this article focuses on how to avoid file reorganizations, the following table lists file parameters and field attributes that necessitate a file reorganization if they are to be changed:
File Parameters              Field Attributes
ATRPG, FVFPG, MVFPG          OCCURS
ASTRPPG                      LENGTH
BRECPPG                      CODED/NON-CODED
BSIZE (hashed files)         STRING/BINARY
CSIZE                        VISIBLE/INVISIBLE
FILEMODL                     AT-MOST-ONE/REPEATABLE
HASHKEY, SORTKEY
PDSIZE, PDSTRPPG
RECSCTY
In certain cases, file reorganization is also required to improve file performance and efficiency, such as when you need to:
A logical record delete is when a record is deleted from the existence bitmap of records in the file. No one can access the record, but physically the record is still on the disk. Furthermore, the Table B and Table D information about the now-inaccessible record remains in the file. When you reorganize the file, the inaccessible record is physically deleted and Table B and Table D no longer store information about it.
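The difference between a logical delete and the physical cleanup done by a reorganization can be illustrated with a toy model. The sketch below is not Model 204 code; the class TinyFile and its methods are hypothetical names, and a Python list stands in for the existence bitmap.

```python
class TinyFile:
    """Toy model: physical record slots plus an existence bitmap."""

    def __init__(self, records):
        self.slots = list(records)               # physical storage
        self.exists = [True] * len(self.slots)   # existence bitmap

    def logical_delete(self, recno):
        # Only the bitmap bit is cleared; the slot still holds the data.
        self.exists[recno] = False

    def visible_records(self):
        # Records that users can still access.
        return [r for r, live in zip(self.slots, self.exists) if live]

    def slots_in_use(self):
        # Physical space consumed, including inaccessible records.
        return len(self.slots)

    def reorganize(self):
        # Rebuild storage from live records only, reclaiming the space.
        self.slots = self.visible_records()
        self.exists = [True] * len(self.slots)


f = TinyFile(["rec0", "rec1", "rec2"])
f.logical_delete(1)
visible_before = f.visible_records()   # rec1 is no longer accessible
slots_before = f.slots_in_use()        # but its slot is still occupied
f.reorganize()
slots_after = f.slots_in_use()         # space reclaimed
```

In this model the deleted record disappears from view immediately, but the space it occupies is released only when reorganize() runs.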
When File Reorganization Is Not Required
Maybe none of the previous considerations apply, but you need to expand Table B or Table D. You might think you need to reorganize a file to accommodate:
In actuality, you do not need to reorganize your file.
You can dynamically expand Table B and Table D from the Online environment with minimal file downtime by using one or a combination of the following commands:
Increasing Table B and Table D
Assuming there are sufficient pages in FREESIZE, the fastest and easiest way to add pages to the table in question is to use the INCREASE TABLEx command. The following command logically moves five 6184-byte blocks on disk from FREESIZE to the target table, Table B.
INCREASE TABLEB 5
The INCREASE command requires exclusive access to the file. Any APSY subsystem that has defined the file as mandatory must therefore be stopped to avoid the following error:
M204.0029: FILE IN USE BY SUBSYSTEM TEST1, COMMAND REJECTED
In addition, no non-APSY update requests against the file can be active; otherwise, the following error occurs:
M204.0602: FILE IS IN USE
M204.1076: DO YOU REALLY WANT TO TRY AGAIN?
Examining an INCREASE TABLEB command
The following table represents the Model 204 file named PARTS, DSN=CCA.PARTS.M204. Each row represents a track on a 3390 DASD. The track capacity of a 3390 DASD is eight Model 204 file pages, each 6184 bytes. Issuing a VIEW TABLES command would show:
ASIZE=3 BSIZE=15 CSIZE=1 DSIZE=10 FREESIZE=11
//PARTS DD DSN=CCA.PARTS.M204,DISP=SHR
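[Table: one row per 3390 track of the PARTS file, eight pages per row; each page is marked with one of the codes FCT, A, B, C, D, or F.]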
Issue an INCREASE TABLEB 11 command, then another VIEW TABLES command for the following display:
ASIZE=3 BSIZE=26 CSIZE=1 DSIZE=10 FREESIZE=0
Examining an INCREASE DATASETS command
At this point there are no free pages available for another INCREASE TABLEx command. However, an additional dataset of five tracks could be added to the file as follows:
OPEN PARTS
Update password
INCREASE DATASETS WITH P2
(Note: the previous command requires that the DDNAME=P2 be allocated to the Model 204 job either dynamically or with JCL or FILEDEF.)
//P2 DD DSN=CCA.PARTS.SECOND.M204,DISP=SHR
Now the file consists of two datasets. All the pages of the second dataset belong to FREESIZE. From this point forward the file cannot be opened unless both datasets have been allocated to whatever Model 204 job (ONLINE, BATCH204, IFAM1, or IFAM4) requires access.
The next two commands would reallocate the free pages in this second dataset, as the following table and the output of a VIEW TABLES command illustrate:
INCREASE TABLED 10
INCREASE TABLEB 20
And a VIEW TABLES command would show the following table sizes:
ASIZE=3 BSIZE=46 CSIZE=1 DSIZE=20 FREESIZE=10
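The page arithmetic in the PARTS walkthrough can be checked with a small model. This is a sketch under assumptions, not Model 204 behavior: the class FilePages and its method names are hypothetical, and the model tracks only page counts, using the figure of eight pages per 3390 track given in the article.

```python
PAGES_PER_TRACK = 8  # 3390 track capacity in 6184-byte pages, per the article


class FilePages:
    """Page-count bookkeeping for one file (hypothetical model)."""

    def __init__(self, asize, bsize, csize, dsize, freesize):
        self.sizes = {"A": asize, "B": bsize, "C": csize, "D": dsize}
        self.freesize = freesize

    def increase_table(self, table, pages):
        # The article only shows INCREASE for Table B and Table D.
        if table not in ("B", "D"):
            raise ValueError("this sketch models only TABLEB and TABLED")
        if pages > self.freesize:
            raise ValueError("not enough pages in FREESIZE")
        self.sizes[table] += pages
        self.freesize -= pages

    def increase_datasets(self, tracks):
        # Every page of a newly added dataset starts out in FREESIZE.
        self.freesize += tracks * PAGES_PER_TRACK

    def view_tables(self):
        s = self.sizes
        return (f"ASIZE={s['A']} BSIZE={s['B']} CSIZE={s['C']} "
                f"DSIZE={s['D']} FREESIZE={self.freesize}")


parts = FilePages(asize=3, bsize=15, csize=1, dsize=10, freesize=11)
parts.increase_table("B", 11)
after_first = parts.view_tables()    # all free pages moved to Table B

parts.increase_datasets(tracks=5)    # second dataset of five tracks
parts.increase_table("D", 10)
parts.increase_table("B", 20)
after_second = parts.view_tables()
```

The two strings reproduce the VIEW TABLES displays shown above: the five-track dataset contributes 40 pages, of which 30 are assigned to Tables D and B and 10 remain in FREESIZE.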
In Summary
The number of datasets you can add to a Model 204 file to provide additional pages to FREESIZE is virtually unlimited. Furthermore, when the new datasets are allocated on different volumes, this scattering of the file across multiple volumes may reduce contention and may result in improved I/O service times.
More importantly, you can perform much file maintenance without resorting to file reorganization. Expanding Table B to accommodate additional records, or expanding Table D to accommodate increases in the Ordered Index, or increasing the number of stored procedures are frequent types of file maintenance. The ability to perform this type of maintenance dynamically, quickly and with minimal file downtime is of paramount importance.
System 1032 Automating Dataset Rebuilds: Part 2
By Tym Stegner
In Automating Dataset Rebuilds, Part 1, we discussed how to define datasets for optimum performance and we proposed several dataset-rebuilding considerations. In Part 2, we describe how two System 1032 sites are developing automated dataset rebuilding procedures.
Managing Your Dataset Rebuilds
The System 1032 Customer Support tracking system is relatively small, so that we have never needed to automate the process of rebuilding our datasets. The closest we have is a batch job that nightly examines several production datasets. If changes are detected, the batch job makes new, read-only subsets of these datasets that we plan to use as part of a Web-based System 1032 Customer Support information system.
However, several System 1032 customers have multitudes of datasets, which are updated with great frequency. Some of these sites use automated rebuilding mechanisms. Recently two sites have shared with me their plans and some source code to automatically rebuild datasets. In turn, I will share their ideas with you.
Automating the Rebuild Process
Southwest Texas State University has 544 unique System 1032 datasets with 11,479 unique attributes, as of this writing. Ardie Schneider, Database Administrator for Southwest Texas State University, has created an automated rebuild process for manually selected datasets.
Manually selecting a dataset
Manually selecting a dataset sounds somewhat like browsing a bookshelf for an interesting title. Actually, Ardie routinely provides current statistical information to the support teams for the various functional groups at the university. The support teams decide which datasets to rebuild and when to rebuild them.
Ardie generates daily reports indicating dataset growth, both by disk blocks and by the number of records. He also provides weekly reports indicating percentages and the ranking of datasets according to the number of deleted records. The support teams then determine the appropriate time for maintenance based on:
Need for structural changes, which are based on user-submitted service requests.
Uptime requirements: is the dataset used by a process that requires 24-by-7 access, in which case a maintenance window must be scheduled, or can it be maintained at night or over a weekend?
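A weekly ranking of datasets by percentage of deleted records, like the report described above, could be computed along the following lines. This is only a sketch: the function name, the input layout, and the sample dataset names are hypothetical, and it assumes the total record count includes the deleted records.

```python
def rank_by_deleted(stats):
    """Rank datasets by percentage of deleted records, highest first.

    stats is a list of (name, total_records, deleted_records) tuples.
    """
    ranked = []
    for name, total, deleted in stats:
        # Guard against empty datasets to avoid dividing by zero.
        pct = 100.0 * deleted / total if total else 0.0
        ranked.append((name, round(pct, 1)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical sample figures, for illustration only.
sample = [("STUDENTS", 1000, 50), ("COURSES", 200, 60), ("ROOMS", 0, 0)]
report = rank_by_deleted(sample)
```

A support team could read such a list from the top down when deciding which datasets to rebuild first.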
Launching the rebuild process
The support teams submit jobs to rebuild the datasets they deem necessary, using the automated processes Ardie designed and implemented. The rebuild process is a set of three System 1032 files:
Introducing the DCL command file
The DSX.COM DCL command file locates the dataset to rebuild and handles a couple of basic questions about how to manage the rebuild for this dataset. The DCL command file accepts the following seven parameters:
Parameter   Specifies
P1          The logical name for the dataset location.
P2          The logical name for the backup location.
P3          The DMD or DMS filename only.
P4          The allocate factor, which adds space for new records to the set of existing records. It is determined as follows, where x is the number of existing records, y is the allocate factor, and z is the total record space:
            x + (x * y) = z
            x + y = z
P5          The SORT attribute list. The default is $ID order.
P6          Whether to set DCL or System 1032 command verification.
P7          Whether to ignore dataset structure change errors.
Note: The seventh parameter (P7) bears additional explanation. These procedures can accommodate a rebuild where the DMD file has changed from the current dataset definition. During the dataset-to-dataset dump portion of the command, any changed primary attribute names might result in warnings about new or missing attributes for DUMP processing. This parameter is set to notice or ignore such changes.
Describing a rebuild process
Rather than posting the complete DCL and DMC source, the following sections describe the functionality of the process. If you want to see or make use of the source, you can download the complete procedures via anonymous FTP from the System 1032 FTP site at:
FOX.CCA-INT.COM
Look in the CCAPRINT subdirectory for the file REBUILDER.ZIP.
Note: The original files supplied by the site were modified to shorten some variable names and enhance the readability of the code.
Introducing the DSX.D1C DMC file
DSX.D1C uses PL1032 commands to call the GET_SYMBOL tool procedure to retrieve DSX.COM parameter values of dataset-location (P1) and dataset-name (P3) into text variables.
The commands open the dataset using the dataset-location and dataset-name variables via "@=variable", display the users of the dataset, and then exit System 1032.
The output from the DSX.D1C DMC is put in the log file for the DSX.COM execution.
Introducing the DSX.D2C DMC file
DSX.D2C declares command variables for the ALLOCATE and SORT operations and calls the GET_SYMBOL tool procedure to retrieve the parameter values from DSX.COM.
If the sort parameter (P5) is non-null, a SORT command variable is created to define a SORT command using those attributes. Then, the backed-up copy of the dataset is opened as an alias.
The ALLOCATE command variable is set to define an ALLOCATE command for the new dataset based on the following:
If the allocate factor value is over 50, the value represents the number of new records. The total number of allocated records is the existing records, plus the allocate factor value.
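The allocation rule can be sketched as a small function. The article states the rule only for allocate factor values over 50; treating values of 50 or less as a multiplier, using the formula x + (x * y) = z from the parameter description, is an assumption made here, as are the function name and the truncation to a whole number of records.

```python
NEW_RECORD_THRESHOLD = 50  # from the article: values over 50 are record counts


def total_allocation(existing, factor):
    """Return z, the total record space to allocate.

    existing is x, the number of existing records.
    factor is y, the allocate factor (parameter P4).
    """
    if factor > NEW_RECORD_THRESHOLD:
        # Stated in the article: the factor is a number of new records.
        return existing + factor            # x + y = z
    # Assumption: smaller values scale the existing record count.
    return existing + int(existing * factor)  # x + (x * y) = z


as_count = total_allocation(1000, 500)   # factor treated as 500 new records
as_factor = total_allocation(1000, 2)    # factor treated as a multiplier
```

The actual DSX.D2C procedure may round or interpret small factors differently; the downloadable source is the authority on that point.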
A new dataset is created by specifying the definition file, name, and location using "@=variable". The DMC makes use of the @=variable method and command variables to avoid the overhead of using an EXECUTE command.
A PL1032 procedure is defined that completes the following tasks:
Call the procedure, then exit System 1032.
Similar to DSX.D1C, the output of DSX.D2C DMC is put in the DSX.COM log file.
Introducing the DSX.COM file
The commands in the DSX.COM file begin by checking the command verification parameter (P6) and issuing a SET VERIFY command if it is indicated. Then the value of P1 is tested and parsed for the dataset location; the procedure aborts if the value is invalid. The value of P2 is tested and parsed next for the backup destination, and again the procedure aborts if the value is invalid.
The backup device is extracted from the backup destination logical into a symbol. Then the existence of the dataset file (P3) is checked, and the procedure aborts if the file is not found. All copies of the dataset are deleted from the backup location.
DSX.COM then determines the size of the dataset file and compares it to the free space on the backup device. The procedure is aborted if there is insufficient free space. Otherwise, the dataset is backed up to the backup device using the following steps:
Execute DSX.D2C to perform the dataset rebuild. If there is an error from the System 1032 session, or non-blank translation of status value symbol, take the following steps:
Otherwise, delete the backup copy. Abort DSX.D2C processing if the backup copy is not deleted.
Now you are finished.
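The free-space test that DSX.COM performs before the backup can be illustrated outside DCL. The following Python sketch is an analogue, not a translation of the site's command file: the function names and path arguments are hypothetical, and it uses the standard library calls os.path.getsize and shutil.disk_usage in place of the DCL lexical functions a command file would use.

```python
import os
import shutil


def fits(needed_bytes, free_bytes):
    """True if a file of needed_bytes fits in free_bytes of space."""
    return needed_bytes <= free_bytes


def can_back_up(dataset_path, backup_dir):
    """Compare the dataset file size with free space at the backup location."""
    needed = os.path.getsize(dataset_path)      # size of the dataset file
    free = shutil.disk_usage(backup_dir).free   # free bytes on backup device
    return fits(needed, free)
```

A caller would abort the rebuild when can_back_up returns False, mirroring the abort on insufficient free space described above.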
Automating the dataset selection for rebuilding
At STATS Inc. in Illinois, Mike Hammer is designing a system to automate the evaluation by which datasets are chosen for rebuilding. Mike has set up a dataset that contains the information needed to schedule rebuilds. The tracking dataset contains:
Scheduling items such as log file enable, log file name and directory, submission queue, day of week to rebuild, rebuild job submission time.
While the departments at STATS, Inc. are being polled for their rebuild requirements, Mike has set up a program to read a dataset to determine whether a dataset qualifies for reloading based on the last rebuild date and the reload frequency.
Once a dataset is selected, its RID is passed to a batch job that queries the rebuild dataset for the necessary parameters. Mike can detect unsuccessful builds based on the difference between the build start and end date-times. Mike will also handle exceptions. For example, in the case of several particularly large datasets, a customized job is called that is built specifically for those exceptional datasets.
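A time-based qualification test of the kind described, using the last rebuild date and the reload frequency, might look like the following. This is a minimal sketch: the function and parameter names are hypothetical, and it assumes the frequency is stored as a number of days.

```python
from datetime import date, timedelta


def is_due(last_rebuild, frequency_days, today):
    """True if at least frequency_days have passed since last_rebuild."""
    return today - last_rebuild >= timedelta(days=frequency_days)


# Hypothetical dates: a weekly rebuild last run on 1 January 2008.
due_on_day_7 = is_due(date(2008, 1, 1), 7, date(2008, 1, 8))
due_on_day_6 = is_due(date(2008, 1, 1), 7, date(2008, 1, 7))
```

Passing today as an argument, rather than reading the system clock inside the function, keeps the test repeatable when it is run from a scheduled batch job.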
Rebuilding Cues
While Southwest Texas State University rebuilds on an as-required or anticipatory schedule, STATS Inc. plans to use a straight time-based rebuilding cue. You could also use a different approach to determine the frequency-of-rebuild. For example, you might choose to rebuild based on the number of deleted, added, or updated records. Or, you might scan heavily used datasets for damage indications, and automate the recovery of these datasets.
System 1032 datasets function best when they have regular maintenance. Whether this maintenance is entirely manual, assisted by DMC or COM files, or is completely automated based on your site's criteria, your processing environment is enhanced.
We would like to thank Ardie Schneider of Southwest Texas State University and Mike Hammer of STATS, Inc. for sharing their expertise and ideas with us. Without them this article would not have been written. - The Editor
Copyright © 2008 Computer Corporation of America. All rights reserved. Published in the United States of America.