Computer Corporation of America
CCAPRINT: A Newsletter for Model 204® and System 1032® Users
February 10, 2003

System 1032
System 1032 MAIL Tools

By Tym Stegner

E-mail Messages in System 1032

Recently a System 1032 site posed an intriguing question. The site has a VMS and System 1032-based mailing service. They recently contracted to ship catalogs, and they receive catalog requests via e-mail messages into their VMS MAIL system. Their question was: how might they automatically poll a mailbox to handle these e-mail requests?

The e-mail message consists of an address to which the catalog is to be sent. The address must be parsed and stored in a System 1032 dataset. After discussing the situation with the customer and suggesting solutions, I would like to share how I might do this myself. Although a systems programmer could interact with the VMS MAIL symbiont, the system-wide mail handler, and react directly to incoming MAIL events, I am considering only DCL and System 1032 solutions.

Handling E-mail Messages

Handling e-mail messages is a two-part process:

Polling can be simply defined as waiting while periodically checking for an event. In the VMS world, at the DCL/System 1032 level, there are two ways to accomplish the waiting game:

  1. Successively submit a batch job
  2. Employ WAIT functionality

Incorporating WAIT functionality

Both DCL and System 1032 have access to WAIT functionality.

Although it is easier to program the use of the WAIT functionality in a loop, I prefer the batch submission approach. The benefits, as I see them, are:

Positioning Batch Submission

One decision to be made in the batch approach is the position of the SUBMIT command. Placing it at the end of processing avoids parallel processing, because no other job is pending until the current one completes. Resubmitting at the top of the job, however, preserves the periodic schedule: a poll is queued for each appointed time regardless of how long the processing takes.
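
The trade-off can be seen in a small scheduling model. The sketch below is written in Python purely as neutral notation, not DCL; the function name `poll_start_times`, the 10-minute interval, and the processing durations are all hypothetical. It assumes that each job requests its successor a fixed interval after the moment its SUBMIT command runs.

```python
def poll_start_times(durations, interval, resubmit_at_top):
    """Model when each polling job starts, in minutes from the first run.

    durations: minutes of processing per run (hypothetical values).
    interval: delay requested when the job resubmits itself.
    resubmit_at_top: True if the job queues its successor before processing.
    """
    starts = []
    start = 0
    for duration in durations:
        starts.append(start)
        if resubmit_at_top:
            # Successor is queued immediately, so the schedule never drifts.
            start += interval
        else:
            # Successor is queued only after processing ends, so each run
            # pushes the next poll later by its own duration.
            start += duration + interval
    return starts


durations = [2, 5, 1, 3]
print(poll_start_times(durations, 10, resubmit_at_top=True))   # [0, 10, 20, 30]
print(poll_start_times(durations, 10, resubmit_at_top=False))  # [0, 12, 27, 38]
```

With the SUBMIT at the top, the polls stay on a fixed grid, at the cost that a run longer than the interval would overlap its successor. With the SUBMIT at the end, runs never overlap, but the polling times drift.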

To sum up so far, I made the following decisions for my e-mail program. I will use a DCL program to initiate my mail checking process, with a self-submit command at the top.

Checking for New E-mail Messages

My next decision is how to actually check for new e-mail messages. While it is possible to check for new mail from DCL, this historical approach is a bit involved, because it takes the following steps:

  1. Start the MAIL program
  2. Perform a READ/NEW
  3. Extract any found messages to a file
  4. Exit the MAIL program
  5. Parse the file to determine if there are any messages
  6. If messages are found, parse out one or more new messages.

I could modify these steps somewhat by extracting the new mail count from SYS$SYSTEM:VMSMAIL_PROFILE.DATA, but access to this file requires elevated privileges, plus knowledge of the file layout, and how to extract the new mail count.

This would be much easier if there were a callable interface to MAIL. Fortunately, such an API has been available since VMS V5.4. The MAIL API is implemented for System 1032 users in two ways, as:

  1. A set of command variables, the MAIL_CMD library
  2. Tools procedures

Using the command variables, you can check for new mail as you would in DCL.

Figure A. Using command variables to check for mail

$ S1032
1032> open lib mail_cmd in s1032_demo readonly
1032> mail init
Implicit open done for S1032_STR library in 
SDSK:[S1032.V9811.TOOLS]S1032_STR.DML
Implicit open done for S1032_MAIL library in 
SDSK:[S1032.V9811.TOOLS]S1032_MAIL.DML
1032>
1032> mail read
1 messages in folder "NEWMAIL"
                         MAIL_MESSAGE_WHOLE

-----------------------------------------------------------------------
From: APE::USER1
To: USER2
CC:
Subj: catalog request

Mr. Firstname Lastname
Number Address Street
TownOrVillage, State ZipCode

1032>
1032> print $status
$STATUS
--------
00000001
1032>
1032> mail read
%DMAI-W-NOTEXIST, folder does not exist
%DMAI-W-NOTEXIST, folder does not exist
1032>
1032> print $status
$STATUS
--------
0C0B84C8
1032>
1032> mail end

Note: We can check whether any mail was found by testing $STATUS.
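
The two $STATUS values in Figure A can be told apart mechanically. The sketch below is Python used as neutral notation, with hypothetical helper names, and it assumes that System 1032's $STATUS follows the standard VMS condition-value layout, in which the low three bits carry the severity and an odd value means success.

```python
# Severity codes held in the low three bits of a VMS condition value.
SEVERITY = {0: "warning", 1: "success", 2: "error", 3: "informational", 4: "severe"}


def severity(status_hex):
    """Return the severity name encoded in a hexadecimal status string."""
    return SEVERITY.get(int(status_hex, 16) & 0x7, "reserved")


def mail_found(status_hex):
    """Treat a successful status (low bit set) as 'a message was read'."""
    return int(status_hex, 16) & 0x1 == 1


for status in ("00000001", "0C0B84C8"):
    print(status, severity(status), mail_found(status))
```

Under that assumption, the second value decodes as a warning, which agrees with the -W- in the %DMAI-W-NOTEXIST message.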

Messages that are read are placed into global variables in the MAIL_CMD library to allow programmatic access to the messages. This approach is somewhat easier than the DCL method, but still does not give me the control over the message components that I really want. I need to use the direct API procedure call.

System 1032 MAIL API

The System 1032 MAIL tools procedures are located in the S1032_MAIL library, in the S1032_TOOLS directory area.

Figure B. Using System 1032 MAIL tools procedures

!
! open my account's mail file and check for any new messages
!
Open library S1032_MAIL in s1032_tools readonly
variable msgCount,msgNum integer initially 0
Call MAIL_INIT
Call MAIL_SET_FOLDER( msgCount, "NEWMAIL" )
if msgCount gt 0 then
  for msgNum from 1 to msgCount do
    Call MAIL_SET_MESSAGE( msgNum )
    ...message processing...
  end_for
end_if

Extracting the E-mail Message Text

At this point, I have options for how I will extract the text of the message. I can:

The message in Figure A is a catalog request, and I want to break it down into its components. I therefore use the exploded approach, shown in Figure C.

Figure C. Processing a catalog request

variable msgFrom, msgTo, msgCC, msgSub, msgExt, msgText text varying
variable msgDate date_time

Call MAIL_MESSAGE_INFO(msgFrom, msgDate, msgTo, msgCC, msgSub, msgExt)

Call MAIL_MESSAGE_TEXT(msgText)


1032> print msgText fmt sv50(a)
                     MSGTEXT
--------------------------------------------------
Mr. Firstname Lastname
Number Address Street
TownOrVillage, State ZipCode

1032> print msgFrom msgDate msgTo msgCC msgSub msgExt
      MSGFROM               MSGDATE              MSGTO
-------------------- ------------------  --------------------
        MSGCC                MSGSUB               MSGEXT
  -------------------- --------------------   --------------------
APE::USER1             02/10/2003 10:01AM   USER1
  ?                     catalog request         ?
1032> 

Analyzing Figure C

In Figure C, no CC was set and the message was under the thousand-byte limit, so no external file was necessary. The contents of the e-mail message buffer must still be processed, but the complexity is reduced because the buffer now contains only the text of the message sent.
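
As an illustration of the parsing that remains, here is a minimal sketch in Python (neutral notation, not System 1032 DML). It assumes every request uses exactly the three-line layout of the sample message; the function name and the dictionary keys are hypothetical, and real requests would need more tolerant handling.

```python
def parse_address(message_text):
    """Split a three-line catalog request into named components."""
    lines = [line.strip() for line in message_text.splitlines() if line.strip()]
    if len(lines) != 3:
        raise ValueError("expected name, street, and town lines")
    name, street, locality = lines
    # "Town, State Zip": the comma ends the town, the last space starts the zip.
    town, _, remainder = locality.partition(",")
    state, _, zip_code = remainder.strip().rpartition(" ")
    return {"name": name, "street": street, "town": town.strip(),
            "state": state.strip(), "zip": zip_code}


sample = "Mr. Firstname Lastname\nNumber Address Street\nTownOrVillage, State ZipCode\n"
print(parse_address(sample))
```

Each component could then be stored in its own attribute of the dataset rather than as one block of text.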

Once processing is completed, I terminate the mail interface, thus closing the mail file.

1032> Call MAIL_END

The previous System 1032 commands are but a brief introduction to the commands available within the MAIL_CMD and S1032_MAIL libraries. Almost any common mail action can be performed with these libraries.

In Summary

Entire mail processors have been created using these procedures, automating the message retrieval and redirections, as well as creating custom address lookups and interfaces.

If your copy of the System 1032 Tools Guide does not have the chapter on Mail Tools Procedures, copies of the S1032 MAIL API will be available soon in the Anonymous FTP download area at:

FOX.CCA-INT.COM

Model 204

PRESENT or NOT PRESENT Using the Ordered Index
Part II: Ordered Numeric fields

By James Damon

Ordered Character versus Ordered Numeric fields

In the January 2003 issue of CCAprint I discussed using the Ordered Index, specifically Ordered Character fields, to simulate the IS [NOT] PRESENT FIND condition and the concomitant performance benefits of using this approach. Simulating IS [NOT] PRESENT processing for Ordered Numeric fields requires a slightly different technique to retain high performance and achieve the desired results.

When you work with Ordered Numeric fields, in addition to determining the presence or absence of the field in a record, you may also need to know which occurrences contain non-numeric data. Such occurrences may point to a more serious error than records containing no occurrence at all. You must also keep in mind:

So our effort is two-fold:

  1. Keep up the speed and efficiency of the search
  2. Locate records with non-numeric data in fields defined as Ordered Numeric.

Establishing the Cost of Searching Table B

In Figure 1, we will execute a Table B direct record search, which we know in advance is the slowest way to search, but we want to establish some statistics for comparison. This request is run against the nearly one-million record file DSNLIST3. The field SEQNO is defined as OCCURS 1 LENGTH 6 ORDERED NUMERIC.

Figure 1. Using IS LIKE * syntax against an Ordered Numeric field

03.041 FEB 03 17.37.44             PAGE 127
BEGIN
PRES1: IN DSNLIST3 FIND RECORDS SEQNO IS LIKE *
       END FIND
C1: COUNT RECORDS IN PRES1
PRINT COUNT IN C1 WITH ': TOTAL RECORDS IN WHICH SEQNO OCCURS '
END
T REQUEST 
999831: TOTAL RECORDS IN WHICH SEQNO OCCURS
CPU=45.058 CNCT=2893 DKRD=54162 DKWR=99 SQRD=50 SQWR=180
NTBL=2 QTBL=18 STBL=41 TTBL=3 VTBL=5 PDL=708 CNCT=80
CPU=14580 DKRD=40531 DKWR=46 OUT=3 FINDS=1 PCPU=180 RQTM=80149
DIRRCD=999847 DKPR=2040231            

Figure 1 confirms that the IS LIKE * syntax does a Table B search against an Ordered Numeric field. The RQTM statistic illustrates that the elapsed time was roughly 80 seconds (RQTM=80149). After subtracting CPU=14580, we can see that 65,569 milliseconds of elapsed time were required to perform the 40,531 physical I/Os (the DKRD statistic) to Table B required for a direct search of 999,847 records (the DIRRCD statistic).
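
The arithmetic can be checked, and taken one step further, with a few lines of Python (used here only as a calculator; the variable names are illustrative). Attributing all of the non-CPU time to the physical reads is a simplification, so the per-read figure is only a rough estimate.

```python
rqtm_ms = 80149   # RQTM: elapsed request time in milliseconds
cpu_ms = 14580    # CPU: processor time in milliseconds
dkrd = 40531      # DKRD: physical reads

non_cpu_ms = rqtm_ms - cpu_ms
ms_per_read = non_cpu_ms / dkrd

print(non_cpu_ms)             # 65569
print(round(ms_per_read, 2))  # 1.62
```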

Figure 1 found 999,831 records in which the SEQNO field occurs. However, we don't yet know which of those records harbor non-numeric data.

Increasing the Speed by Using the Ordered Index

Figure 2 shows a vastly superior way to achieve similar, though not exactly equivalent results, using the Ordered Index and Ordered Numeric fields. This FIND statement avoids a direct record search of Table B; instead it uses the Ordered Index to resolve the FIND processing.

However, it does not exactly simulate IS PRESENT or IS LIKE * processing. The resultant found set not only excludes records that do not contain the field, it also excludes records that contain the field with a non-numeric value.

Figure 2. Using IS GT OR IS LE syntax against an ORDERED NUMERIC field

03.041 FEB 03 17.47.55           PAGE 133
BEGIN
PRES2: IN DSNLIST3 FIND RECORDS SEQNO IS GT 0 OR SEQNO IS LE 0
       END FIND
C2:    COUNT RECORDS IN PRES2
PRINT COUNT IN C2 WITH ': TOTAL RECS WITH NUMERIC VALUES OF SEQNO'
END
T REQUEST
999814: TOTAL RECS WITH NUMERIC VALUES OF SEQNO 
CPU=24.378 CNCT=444 DKRD=10893 DKWR=44 SQRD=28 SQWR=94
NTBL=2 QTBL=17 STBL=46 TTBL=4 VTBL=12 PDL=856 CNCT=14 
CPU=6057 DKRD=2715 OUT=3 FINDS=1 PCPU=426 RQTM=14179
BXNEXT=2019625 BXFIND=4 BXRFND=19996 DKPR=1025240

Enjoying the Speed

In Table 1, the RQTM statistic shows that the elapsed time recorded in Figure 2 was just 14.179 seconds, compared to 80.149 seconds in Figure 1.

Table 1. Comparing an IS LIKE * search to an IS GT OR IS LE search

Search strategy              CPU (ms)    DKRD      RQTM (ms)    DIRRCD
-------------------------    --------    ------    ---------    -------
IS LIKE * (Figure 1)         14,580      40,531    80,149       999,847
IS GT OR IS LE (Figure 2)    6,057       2,715     14,179       0
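
To put the comparison in relative terms, the ratios between the two strategies can be computed directly from the statistics in Figures 1 and 2. The snippet is Python used only as a calculator, and the dictionary names are illustrative.

```python
like_search = {"cpu_ms": 14580, "dkrd": 40531, "rqtm_ms": 80149}   # Figure 1
index_search = {"cpu_ms": 6057, "dkrd": 2715, "rqtm_ms": 14179}    # Figure 2

# How many times larger each Figure 1 statistic is than its Figure 2 counterpart.
ratios = {key: round(like_search[key] / index_search[key], 1) for key in like_search}
print(ratios)  # {'cpu_ms': 2.4, 'dkrd': 14.9, 'rqtm_ms': 5.7}
```

On these figures, the Ordered Index search needed roughly one-fifteenth of the physical reads and finished in under one-fifth of the elapsed time.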

Hunting for Records with Non-Numeric Data

However, notice that the count of records in Figure 2 is 17 less than the count in Figure 1. Those 17 records contain a non-numeric occurrence of the field SEQNO.

In Figure 3, the record set on list NOTNUM.NOTPRES is generally what is desired when reviewing Ordered Numeric fields for accuracy, because this record set identifies records for which SEQNO is either not present or present with a non-numeric value.

Either condition could represent a data inconsistency, which may require further investigation.

The request in Figure 3 illustrates how to isolate and separate records with no occurrence of SEQNO from those with non-numeric occurrences of SEQNO.

Figure 3. Isolating records where SEQNO is not present or is not numeric

BEGIN
ALL0: IN DSNLIST3 FIND ALL RECORDS           
      END FIND
C0: COUNT RECORDS IN ALL0
PRINT COUNT IN C0 WITH ': TOTAL RECORDS IN FILE'
PLACE RECORDS IN ALL0 ON LIST NOTNUM.NOTPRES           
*
ALL1: IN DSNLIST3 FIND RECORDS SEQNO IS GT 0 OR SEQNO IS LE 0
      END FIND
C1:   COUNT RECORDS IN ALL1
PRINT COUNT IN C1 WITH ': TOTAL RECS WITH NUMERIC VALUES OF SEQNO'
REMOVE RECORDS IN ALL1 FROM LIST NOTNUM.NOTPRES
*
C2: COUNT RECORDS ON LIST NOTNUM.NOTPRES           
PRINT COUNT IN C2 WITH ':      RECS WHERE SEQNO NOT NUMERIC OR NOT PRES'
NOTPRES: FIND RECORDS ON LIST NOTNUM.NOTPRES SEQNO IS NOT PRESENT
         END FIND
C3:      COUNT RECORDS IN NOTPRES
PRINT COUNT IN C3 WITH ':      RECS WHERE SEQNO NOT PRESENT'
*
PLACE RECORDS ON LIST NOTNUM.NOTPRES ON LIST NOTNUM
REMOVE RECORDS IN NOTPRES FROM LIST NOTNUM
C4: COUNT RECORDS ON LIST NOTNUM           
PRINT COUNT IN C4 WITH ':      RECS WHERE SEQNO NON NUMERIC'
END
T REQUEST 
999847: TOTAL RECORDS IN FILE
999814: TOTAL RECS WITH NUMERIC VALUES OF SEQNO
33:     RECS WHERE SEQNO NOT NUMERIC OR NOT PRES
16:     RECS WHERE SEQNO NOT PRESENT           
17:     RECS WHERE SEQNO NON NUMERIC            
CPU=56.953 CNCT=8003 DKRD=59624 DKWR=117 SQRD=79 SQWR=254
NTBL=10 QTBL=75 STBL=183 TTBL=4 VTBL=14 PDL=856 CNCT=11 CPU=5901 
DKRD=2717 OUT=7 FINDS=3 PCPU=520 RQTM=11323 DIRRCD=33 
BXNEXT=2019625 BXFIND=4 BXRFND=19996 DKPR=1025388

IS [NOT] PRESENT Processing Reappears

Figure 3 also illustrates that you can use the IS NOT PRESENT condition without a significant performance penalty to identify records containing no occurrence of the field. The lack of significant performance penalty is due to the small set of records to be directly searched. Alternatively, since the record set is small, you could examine each record directly in a FOR loop to differentiate between records containing no occurrence of SEQNO and records containing non-numeric occurrences.
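
The list manipulation in Figure 3 is ordinary set arithmetic. The sketch below models it in Python (neutral notation, not User Language) on a hypothetical six-record file; `is_numeric` is a rough stand-in for the way the Ordered Index separates numeric from non-numeric values, not a description of Model 204 internals.

```python
def is_numeric(value):
    """Rough stand-in for 'this value is indexed as a number'."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical miniature file: record number -> SEQNO occurrence (None = absent).
records = {1: "100", 2: "3FEB03", 3: None, 4: "999999", 5: None, 6: "3FEB03"}

all_records = set(records)                                  # FIND ALL RECORDS
numeric = {n for n, v in records.items() if is_numeric(v)}  # IS GT 0 OR IS LE 0
notnum_notpres = all_records - numeric                      # REMOVE ... FROM LIST

# Only this small remainder is examined record by record.
not_present = {n for n in notnum_notpres if records[n] is None}
not_numeric = notnum_notpres - not_present

print(sorted(not_present))  # [3, 5]
print(sorted(not_numeric))  # [2, 6]
```

The record-by-record test runs only over the remainder set, which is why the direct search stays cheap.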

The Cost of Nearly All Unique Values

Thus far we have determined the most efficient way of creating a found set using an Ordered Numeric field. We have isolated the records that lack the field and those that contain it but harbor invalid data. Next we will consider the effect of the frequency of unique values on processing.

The examples in Figures 1, 2, and 3 use the field SEQNO, which occurs in nearly every record and for which all values, with a few exceptions, are unique. These nearly one million unique values must be stored in leaf pages in the Ordered Index, as illustrated in Figure 4.

Figure 4. ANALYZE command output illustrating the leaf pages required for so many unique values

ANALYZE SEQNO
ROOT NODE VERSION NUMBER = 1449263           
*** M204.0005: ANALYZE FIELDNAME = SEQNO
                     AVG.       OFFSET    COMP.     KEY       PAGE        AVG.
           PAGES     ENTRY      AREA      SIZE      AREA      USAGE%      UNUSED
ROOT       1         7          34                  60        1           6050
I-NODE     6         465        951                 4082      81          1110
LEAF       2689      371        763       3         3774      73          1602
MRIB:       IMMEDIATE           LIST            BITMAP           TOTAL
  ENTRIES                       2                                2
  RECORDS                       20                               20
    PAGES                       2                                2
SRIB:       999811

In Figure 4, the SRIB (single record information block) statistic shows that there are, in fact, 999,811 unique values of SEQNO. There are also two MRIB (multiple record information block) entries, meaning that two values of SEQNO, 999999 and 3FEB03, are not unique; together they occur in 20 records. Seventeen records contain SEQNO=3FEB03 and three records contain SEQNO=999999.
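
These ANALYZE statistics reconcile with the record counts reported in Figures 1 through 3. The check below is Python used only as a calculator; it assumes that every value counted under SRIB is numeric, which is consistent with the totals.

```python
srib_unique_values = 999811  # SRIB: values that occur in exactly one record
recs_999999 = 3              # numeric value shared by three records
recs_3feb03 = 17             # non-numeric value shared by seventeen records
total_records = 999847       # Figure 3: total records in file

numeric_records = srib_unique_values + recs_999999       # matches Figure 2
records_with_seqno = numeric_records + recs_3feb03       # matches Figure 1
records_without_seqno = total_records - records_with_seqno

print(numeric_records, records_with_seqno, records_without_seqno)
# 999814 999831 16
```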

This large collection of unique values for SEQNO results in worst-case performance in Ordered Index searches, as shown in these examples. When an Ordered field has only a few unique values, the performance of applications running these types of searches is considerably improved, resulting in fewer DKRDs, less CPU consumption, and less time holding the INDEX resource in SHARE mode.

In Summary

With Ordered Numeric fields, as with Ordered Character fields, the Ordered Index provides powerful and efficient facilities for determining the presence or absence of fields in a record. With Ordered Numeric fields the additional requirement of identifying records containing non-numeric values is also efficiently supported via the Ordered Index. Furthermore, the IS [NOT] PRESENT FIND condition remains a valuable and easy to use FIND condition and, when restricted to small sets of records, avoids the significant performance penalty typically associated with a direct-record search of an entire file.

Copyright © 2008 Computer Corporation of America.
All rights reserved. Published in the United States of America.

