System 1032
USE OF AND ACCESS TO PRODUCTS AND FEATURES ARE IN ACCORDANCE WITH THE TERMS
AND CONDITIONS OF THE USER'S SOFTWARE LICENSE. THE PRESENTATION OF
MATERIAL HEREIN DOES NOT, IN ANY MANNER, MODIFY SUCH TERMS AND CONDITIONS.
Routine Catalog Maintenance
By Tym Stegner
In this article I describe, in detail, optimizations that can be applied
to catalogs and that necessitate recreating them. The first few topics
apply to all catalogs; catalog-specific sections follow.
Trigger Procedures
Trigger procedures,
invoked by OPEN or other command-based processing, are an optional feature
of datasets and databases. While you develop your trigger procedures you
may issue many INSERT, REMOVE and/or REPLACE commands, until the trigger
operates the way you planned. When you are satisfied with your development
testing, CCA recommends that you recreate the catalog to clear the extraneous
referential overhead.
Security Overhead
Using the ADMIT and
PREVENT commands you can create and maintain access controls for catalogs
themselves and the elements they contain. Accessing and evaluating security
against a dataset, attributes, or records involves considerable overhead,
usually during OPEN processing. For example, consider a dataset whose
security rules were added for new users but never removed for users who
no longer exist on the system. Gradually, the access time for the dataset
degrades. You can make OPEN processing much more efficient by eliminating
outdated security data.
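The clean-up described above amounts to comparing the security rules against the users who still exist. The following Python sketch only illustrates that comparison. The rule layout and user names are hypothetical, and the actual removal would still be done with System 1032's own security commands.

```python
def prune_stale_rules(rules, active_users):
    """Keep only the security rules whose user still exists."""
    active = set(active_users)
    return [rule for rule in rules if rule["user"] in active]


# Hypothetical rule records; real rules live inside the catalog.
rules = [
    {"user": "ALICE", "access": "READ"},
    {"user": "BOB", "access": "WRITE"},   # BOB has left the system
    {"user": "CAROL", "access": "READ"},
]
active_users = ["ALICE", "CAROL"]

kept = prune_stale_rules(rules, active_users)
stale = [rule for rule in rules if rule not in kept]
print("stale rules to remove:", [rule["user"] for rule in stale])
```

Every stale rule found this way is one less entry to evaluate during OPEN processing.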
Remember: A dataset's
original creator can access any catalog, regardless of applied security,
unless the default owner of the catalog is overridden. The only exception
to this is if the value of the system variable $SITE_DBA has been set
by the license activation process. That user name will always have ownership-level
access to any catalog.
Especially for Datasets
The following list
identifies several maintenance projects that require little explanation:
- Clear damage history.
- Reorganize data records to reduce SORT processing overhead by reviewing
  major reports used against the dataset and pre-sorting the data by those
  attributes before loading.
- Release unused pre-allocated disk space (by issuing an ALLOCATE command).
Reintegrate DMK and DMR Files
After you shift the external DMK and DMR files back from separate disk
spindles, reintegrate them into their DMS file. Today's disks are bigger
and faster than they once were. An original dataset design that placed the
keys and records on different disks for asynchronous access might now have
access to faster consolidated disks. CCA recommends that you reintegrate
the dataset, as most of the former benefits no longer apply.
Only two situations
remain that benefit from separate DMK and DMR files: rekeying or LOAD
SUPERCEDE processing. In either case, the DMK or DMR files are flushed
without having to dump and reload the DMS file. Note: this separation
is useful mainly for a dataset that periodically replaces all its records.
Dataset Keys
The most efficient
keys are those freshly built, before any updates have occurred. Every
update to a dataset's keys requires the rewrite of at least one key data
block, often several, and may include one or more key structure blocks.
By default, a key block reserves about 15% of its original allocation
for expansion space. Over time, updates and additions to records can cause
one or more key blocks to be split into separate blocks, because the original
key block is completely filled. As updates continue, more and more I/Os
are required to retrieve the necessary data. In addition, key data for
deleted records still exists in the key block. Key table compaction alone
makes dataset maintenance worthwhile.
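The effect of block splitting can be illustrated with a toy model in Python. This is a sketch, not System 1032's actual key format: the block capacity of 20 keys, the split-in-half rule, and the key values are all assumptions, and only the 15% reserve comes from the text above.

```python
import bisect
import random

BLOCK_CAPACITY = 20    # keys per block (hypothetical figure)
RESERVE_PERCENT = 15   # expansion space left in each freshly built block


def build_blocks(sorted_keys):
    """Pack sorted keys into blocks, leaving the reserve free in each."""
    fill = BLOCK_CAPACITY * (100 - RESERVE_PERCENT) // 100
    return [sorted_keys[i:i + fill] for i in range(0, len(sorted_keys), fill)]


def insert_key(blocks, key):
    """Insert a key in order; split the block in two if it overflows."""
    index = next(
        (i for i, block in enumerate(blocks) if block[-1] >= key),
        len(blocks) - 1,
    )
    block = blocks[index]
    bisect.insort(block, key)
    if len(block) > BLOCK_CAPACITY:
        middle = len(block) // 2
        blocks[index:index + 1] = [block[:middle], block[middle:]]


initial_keys = list(range(0, 340, 2))          # 170 even keys
blocks = build_blocks(initial_keys)
fresh_count = len(blocks)

rng = random.Random(0)                         # fixed seed for repeatability
new_keys = rng.sample(range(1, 340, 2), 60)    # 60 distinct odd keys
for key in new_keys:
    insert_key(blocks, key)

print("blocks when freshly built:", fresh_count)
print("blocks after updates:     ", len(blocks))
```

In this model the block count grows once the reserve is used up, and split blocks are left partly empty. Rebuilding the keys packs them tightly again.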
Records
System 1032 does not
ordinarily store assembled records. (The exception is grouped attributes.)
Attribute data is stored in chains of data blocks, which requires multiple
I/Os to retrieve the components of a record. Furthermore, Text Varying
or Binary Varying data are stored in completely separate sets of data
block chains, using a different blocking factor from the other attribute
data types.
Similar to key tables,
records are most efficiently stored by the LOAD command, as the data chains
are more sequential than they will be after record additions. Deleted
records in a dataset do not physically go away. The result set of your
FIND command must be processed against the master deleted record selection
set.
Maintaining deleted
records requires additional overhead for keying and record security, not
to mention the overhead associated with every FIND command. Dataset maintenance
frees up the space used by deleted records by completely removing deleted
records or preserving them in another place, if desired.
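The cost of carrying deleted records can be pictured as an extra set operation on every search. The Python sketch below is a simplified model under stated assumptions: the record layout, the `find` and `compact` helpers, and the use of in-memory sets are hypothetical stand-ins, not System 1032 internals.

```python
def find(records, predicate, deleted_ids):
    """Return matching record ids, minus any ids marked as deleted."""
    matches = {rid for rid, record in records.items() if predicate(record)}
    return matches - deleted_ids


def compact(records, deleted_ids):
    """Model maintenance: drop deleted records so nothing needs filtering."""
    return {rid: rec for rid, rec in records.items() if rid not in deleted_ids}


records = {
    1: {"dept": "A"},
    2: {"dept": "B"},
    3: {"dept": "A"},   # marked deleted, but still physically present
    4: {"dept": "A"},
}
deleted_ids = {3}


def in_dept_a(record):
    return record["dept"] == "A"


before = find(records, in_dept_a, deleted_ids)
after = find(compact(records, deleted_ids), in_dept_a, set())
print(sorted(before), sorted(after))
```

Both searches return the same records, but after compaction there is no deleted-record set to subtract and no dead record to store, key, or secure.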
Allocation: The ALLOCATE command is used to reserve space within a dataset
for records and can be used for key table space. See our first-ever
CCAPRINT article (May 1996) for details.
Loading: Use the system variable $CLUSTER_LIMIT while loading data records.
Binary data loads faster than text data, although binary data files are
harder to read outside of System 1032. A dataset-to-dataset dump is a
preferred method, as System 1032 transfers data in its internal formats,
requiring less data conversion during data transfer.
Pre-sorting the data by its primary key before loading can also help in
key table generation. Specify the PRESORTED {attr-name} option in your
LOAD command to help reduce sort overhead required to create key tables.
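Before declaring a load file presorted, it is worth confirming that it really is in key order. The Python sketch below sorts a small text file by its key column and checks the result. The column names, the CSV layout, and the zero-padded key values are assumptions for illustration, and the sort order used here must match the ordering System 1032 expects for the attribute.

```python
import csv
import io


def is_sorted_by(rows, key_name):
    """True if the rows are already in ascending order of key_name."""
    keys = [row[key_name] for row in rows]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def presort(rows, key_name):
    """Return the rows ordered by the chosen key attribute."""
    return sorted(rows, key=lambda row: row[key_name])


# Hypothetical input file with a zero-padded key column.
raw = "emp_id,name\n0003,Cruz\n0001,Abel\n0002,Boyd\n"
rows = list(csv.DictReader(io.StringIO(raw)))
sorted_rows = presort(rows, "emp_id")

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["emp_id", "name"], lineterminator="\n")
writer.writeheader()
writer.writerows(sorted_rows)
print(out.getvalue())
```

A check such as `is_sorted_by` guards against declaring a file presorted when it is not.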
Reading: Reduce null space in key tables by using attribute ranges to
shrink data storage requirements.
Updating: Enlarge null space in key tables by using individual temporary
datasets for per-person data entry, then doing bulk updates to merge
temporary datasets into the master dataset.
Maintaining Your System 1032 Libraries
The more frequently a library is subject to INSERT, REMOVE, and REPLACE
command processing, the sooner the library may need to be recreated. The
elements in a library (variables, forms, procedures), like those in a
dataset, can be thought of as being referenced as offsets from the top of
the library.
As elements are added, removed, or replaced in a library, special re-direction
structures are created within the library to preserve the proper offsets
of those elements within the library.
Over time, numerous
updates can increase the redirection structures until they become fragmented
or scrambled, causing errors when the elements are referenced or a procedure
call is made. Also, there is additional overhead incurred using these
elements, as the redirections must be traversed each time they are used.
Short of re-creating the library, there is no way to streamline these
data structures.
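The traversal overhead can be pictured with a toy model in Python. This sketch assumes each REPLACE leaves a redirect at the element's old position. The `ToyLibrary` class, its slot layout, and the element name are hypothetical and do not reflect the real library file format.

```python
class ToyLibrary:
    """Toy model: each replace leaves a redirect behind at the old slot."""

    def __init__(self):
        self.slots = []   # each entry is ("value", payload) or ("redirect", slot)
        self.names = {}   # element name -> slot it was first stored in

    def insert(self, name, payload):
        self.slots.append(("value", payload))
        self.names[name] = len(self.slots) - 1

    def _resolve(self, name):
        """Follow redirects from the original slot; count the hops."""
        slot, hops = self.names[name], 0
        while self.slots[slot][0] == "redirect":
            slot = self.slots[slot][1]
            hops += 1
        return slot, hops

    def replace(self, name, payload):
        slot, _ = self._resolve(name)
        self.slots.append(("value", payload))
        self.slots[slot] = ("redirect", len(self.slots) - 1)

    def lookup(self, name):
        slot, hops = self._resolve(name)
        return self.slots[slot][1], hops

    def recreate(self):
        """Model re-creating the library: current elements only, no redirects."""
        fresh = ToyLibrary()
        for name in self.names:
            fresh.insert(name, self.lookup(name)[0])
        return fresh


library = ToyLibrary()
library.insert("REPORT_PROC", "version 1")
for version in (2, 3, 4):
    library.replace("REPORT_PROC", f"version {version}")

payload, hops = library.lookup("REPORT_PROC")
fresh_payload, fresh_hops = library.recreate().lookup("REPORT_PROC")
print(hops, fresh_hops)
```

In this model every replacement adds one hop to each later reference, and only re-creating the library brings the hop count back to zero.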
Maintaining Your System 1032 Databases
Of the three types of System 1032 catalogs, databases require the least
maintenance. Except for a definitional database, where datasets live within
the DMB file itself (such as the ODBC dictionary or a database storing
programming elements), a database file usually does not get many updates,
so the considerations applicable to datasets or libraries do not apply.
That said, database maintenance might be undertaken to:
- Utilize logical names, for example, so that a procedure can reference a
  database no matter where the database is located.
- Handle an overflow of changes to stored joins that eventually lead to
  fragmentation similar to the by-product of using INSERT, REMOVE, and
  REPLACE command processing in a library.
This article described the internal structures of System 1032 dataset,
database, and library catalogs, with an eye toward increasing the database
designer's knowledge of these structures to create more efficient
catalogs. This knowledge is also a background for next month's article,
which deals with elements of routine maintenance of catalogs.
Summary
This article examined elements of catalog maintenance from the perspective
of "Why do it?" In previously published articles we examined "how to do it."
Understanding why a maintenance operation is beneficial may contribute
to finding the time or resources necessary to perform the maintenance,
thus contributing to the efficiency of your applications.