System 1032
System 1032 MAIL Tools
By Tym Stegner
E-mail Messages in System 1032
Recently a System 1032 site posed an intriguing question. The site has a VMS and System 1032-based mailing service. They recently contracted to ship catalogs, and they receive catalog requests via e-mail messages into their VMS MAIL system. Their question was: how might they automatically poll a mailbox to handle these e-mail requests?
The e-mail message consists of an address to which the catalog is to be sent. The address must be parsed and stored into a System 1032 dataset. After discussing the situation with the customer and suggesting solutions, I would like to share how I might do this myself. Although a systems programmer could interact with the VMS MAIL symbiont, the system-wide mail handler, and react directly to incoming MAIL events, I am considering only DCL and System 1032 solutions.
Handling E-mail Messages
Handling e-mail messages is a two-part process:
Polling can be simply defined as waiting while periodically checking for an event. In the VMS world, at the DCL/System 1032 level, there are two ways to accomplish the waiting game:
Incorporating WAIT functionality
Both DCL and System 1032 have access to WAIT functionality.
Although it is easier to program the use of the WAIT functionality in a loop, I prefer the batch submission approach. The benefits, as I see them, are:
Positioning Batch Submission
One decision to be made in the batch approach is the position of the SUBMIT command. Placing it at the close of processing tends to reduce parallel processing, so no other job is pending until the process is completed. However, to preserve the periodic polling process, resubmitting at the top of the job ensures there will be a poll at the appointed times.
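The trade-off between the two SUBMIT positions can be illustrated with a small simulation. This is only a sketch, written in Python rather than DCL for brevity; the function name, the interval, and the processing durations are hypothetical and do not come from the site's job.

```python
def poll_start_times(interval, durations, resubmit_at_top):
    """Return the start time (in minutes) of each run of a self-resubmitting job.

    interval        -- delay requested on every resubmission (assumed value)
    durations       -- how long each run spends processing (assumed values)
    resubmit_at_top -- True if the job resubmits itself before processing
    """
    starts = []
    now = 0
    for duration in durations:
        starts.append(now)
        if resubmit_at_top:
            # The next run was queued before processing began, so it is due
            # one interval after this run started, however long this run takes.
            now = now + interval
        else:
            # The next run is queued only after processing, so the delay is
            # counted from the finish time and the schedule drifts.
            now = now + duration + interval
    return starts


print(poll_start_times(10, [2, 5, 1], resubmit_at_top=True))
print(poll_start_times(10, [2, 5, 1], resubmit_at_top=False))
```

With the SUBMIT at the top, the polls stay on a fixed ten-minute grid. With the SUBMIT at the bottom, each poll is pushed back by the processing time of the runs before it, but two runs can never overlap.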
To sum up so far, I made the following decisions for my e-mail program. I will use a DCL program to initiate my mail checking process, with a self-submit command at the top.
Checking for New E-mail Messages
My next decision is how to actually check for new e-mail messages. While it is possible to check for new mail from DCL, this historical approach is a bit involved, because it takes the following steps:
I could modify these steps somewhat by extracting the new mail count from SYS$SYSTEM:VMSMAIL_PROFILE.DATA, but access to this file requires elevated privileges, plus knowledge of the file layout, and how to extract the new mail count.
This would be much easier if there were a callable interface to MAIL. Fortunately, such an API has been available since VMS V5.4. The MAIL API is implemented for System 1032 users in two ways, as:
Using the command variables, you can check for new mail as you would in DCL.
Figure A. Using command variable to check for mail
$ S1032
1032> open lib mail_cmd in s1032_demo readonly
1032> mail init
Implicit open done for S1032_STR library in SDSK:[S1032.V9811.TOOLS]S1032_STR.DML
Implicit open done for S1032_MAIL library in SDSK:[S1032.V9811.TOOLS]S1032_MAIL.DML
1032>
1032> mail read
1 messages in folder "NEWMAIL"
MAIL_MESSAGE_WHOLE
-----------------------------------------------------------------------
From: APE::USER1
To: USER2
CC:
Subj: catalog request

Mr. Firstname Lastname
Number Address Street
TownOrVillage, State ZipCode
1032>
1032> print $status
$STATUS
--------
00000001
1032>
1032> mail read
%DMAI-W-NOTEXIST, folder does not exist
%DMAI-W-NOTEXIST, folder does not exist
1032>
1032> print $status
$STATUS
--------
0C0B84C8
1032>
1032> mail end
Note: We can check if any mail is found by a test on $STATUS.
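The two $STATUS values in Figure A can be read using the standard VMS condition-value layout, in which the low three bits carry the severity and a set low bit means success. The sketch below is Python for illustration only; the helper name and the returned dictionary are hypothetical, and it decodes only the severity field.

```python
# Severity codes held in the low three bits of a VMS condition value.
SEVERITY = {
    0: "warning",
    1: "success",
    2: "error",
    3: "informational",
    4: "severe error",
}


def decode_status(hex_status):
    """Decode the severity of a condition value printed as a hex string."""
    value = int(hex_status, 16)
    severity = value & 0x7          # bits 0-2 hold the severity
    return {
        "success": bool(value & 1),  # low bit set means the call succeeded
        "severity": SEVERITY.get(severity, "reserved"),
    }


print(decode_status("00000001"))   # first MAIL READ in Figure A
print(decode_status("0C0B84C8"))   # second MAIL READ in Figure A
```

The second value decodes as a warning, which agrees with the -W- in the %DMAI-W-NOTEXIST message, so a polling procedure only needs to test the low bit to know whether a message was read.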
Messages that are read are placed into global variables in the MAIL_CMD library to allow programmatic access to the messages. This approach is somewhat easier than the DCL method, but still does not give me the control over the message components that I really want. I need to use the direct API procedure call.
System 1032 MAIL API
The System 1032 MAIL tools procedures are located in the S1032_MAIL library, in the S1032_TOOLS directory area.
Figure B. Using System 1032 MAIL tools procedures
!
! open my account's mail file and check for any new messages
!
Open library S1032_MAIL in s1032_tools readonly
variable msgCount,msgNum integer initially 0
Call MAIL_INIT
Call MAIL_SET_FOLDER( msgCount, "NEWMAIL" )
if msgCount gt 0 then
  for msgNum from 1 to msgCount do
    Call MAIL_SET_MESSAGE( msgNum )
    ...message processing...
  end_for
end_if
Extracting the E-mail Message Text
At this point, I have options for how I will extract the text of the message. I can:
In Figure A, I'm processing a catalog request, and I want to break the message down into its components. I therefore use the exploded approach shown in Figure C.
Figure C. Processing a catalog request
variable msgFrom, msgTo, msgCC, msgSub, msgExt, msgText text varying
variable msgDate date_time
Call MAIL_MESSAGE_INFO(msgFrom, msgDate, msgTo, msgCC, msgSub, msgExt)
Call MAIL_MESSAGE_TEXT(msgText)
1032> print msgText fmt sv50(a)
MSGTEXT
--------------------------------------------------
Mr. Firstname Lastname
Number Address Street
TownOrVillage, State ZipCode
1032> print msgFrom msgDate msgTo msgCC msgSub msgExt
MSGFROM              MSGDATE            MSGTO
-------------------- ------------------ --------------------
MSGCC                MSGSUB               MSGEXT
-------------------- -------------------- --------------------
APE::USER1           02/10/2003 10:01AM USER1
?                    catalog request      ?
1032>
Analyzing Figure C
In Figure C, no CC was set and the message was under the thousand-byte limit, so no external file was necessary. The contents of the e-mail message buffer must still be processed, but the complexity is reduced as the buffer now contains only the text of the message sent.
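As a rough idea of what processing the message buffer might involve, the sketch below splits the three-line address from Figure C into fields ready for storing in a dataset. It is Python for illustration only, not System 1032 code; the function name, the field names, and the assumption that every request has exactly a name line, a street line, and a "Town, State ZipCode" line are all hypothetical.

```python
import re


def parse_catalog_request(text):
    """Split a three-line postal address into named fields.

    Assumes the layout shown in Figure C: name, street, then
    "Town, State ZipCode". Anything else raises ValueError so that a
    malformed request can be set aside for manual handling.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 3:
        raise ValueError("expected name, street and town lines")
    name, street, locality = lines

    # Town is everything before the comma; the zip code is the last word.
    match = re.fullmatch(r"(.+),\s*(\S.*?)\s+(\S+)", locality)
    if match is None:
        raise ValueError("cannot split town, state and zip code")
    town, state, zip_code = match.groups()

    return {
        "name": name,
        "street": street,
        "town": town,
        "state": state,
        "zip": zip_code,
    }


message = """Mr. Firstname Lastname
Number Address Street
TownOrVillage, State ZipCode"""
print(parse_catalog_request(message))
```

Rejecting anything that does not fit the expected layout keeps bad addresses out of the dataset; a production version would need to tolerate more variation than this.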
Once processing is completed, I terminate the mail interface, thus closing the mail file.
S1032> Call MAIL_END
The System 1032 commands shown above are only a brief introduction to what is available within the MAIL_CMD and S1032_MAIL libraries. Almost any common mail action can be performed with these libraries.
In Summary
Entire mail processors have been created using these procedures, automating the message retrieval and redirections, as well as creating custom address lookups and interfaces.
If your copy of the System 1032 Tools Guide does not have the chapter on Mail Tools Procedures, copies of the S1032 MAIL API will be available soon in the Anonymous FTP download area at:
FOX.CCA-INT.COM
Model 204
PRESENT or NOT PRESENT Using the Ordered Index Part II: Ordered Numeric fields
By James Damon
Ordered Character versus Ordered Numeric fields
In the January 2003 issue of CCAprint I discussed using the Ordered Index, specifically Ordered Character fields, to simulate the IS [NOT] PRESENT FIND condition and the concomitant performance benefits of using this approach. Simulating IS [NOT] PRESENT processing for Ordered Numeric fields requires a slightly different technique to retain high performance and achieve the desired results.
When you work with Ordered Numeric fields, in addition to determining the presence or absence of the field in a record, you may also need to know which occurrences contain non-numeric data. Such occurrences may be an even more serious error than records containing no occurrence at all. You must also keep in mind:
So our effort is two-fold:
Establishing the Cost of Searching Table B
In Figure 1, we will execute a Table B direct record search, which we know in advance is the slowest way to search, but we want to establish some statistics for comparison. This request is run against the nearly one-million record file DSNLIST3. The field SEQNO is defined as OCCURS 1 LENGTH 6 ORDERED NUMERIC.
Figure 1. Using IS LIKE * syntax against an Ordered Numeric field
03.041 FEB 03 17.37.44 PAGE 127
BEGIN
PRES1: IN DSNLIST3 FIND RECORDS
         SEQNO IS LIKE *
       END FIND
C1:    COUNT RECORDS IN PRES1
       PRINT COUNT IN C1 WITH ': TOTAL RECORDS IN WHICH SEQNO OCCURS '
END
T REQUEST
999831: TOTAL RECORDS IN WHICH SEQNO OCCURS
CPU=45.058 CNCT=2893 DKRD=54162 DKWR=99 SQRD=50 SQWR=180 NTBL=2 QTBL=18 STBL=41 TTBL=3 VTBL=5 PDL=708 CNCT=80 CPU=14580 DKRD=40531 DKWR=46 OUT=3 FINDS=1 PCPU=180 RQTM=80149 DIRRCD=999847 DKPR=2040231
Figure 1 confirms that the IS LIKE * syntax does a Table B search against an Ordered Numeric field. The RQTM statistic illustrates that the elapsed time was roughly 80 seconds (RQTM=80149). After subtracting CPU=14580, we can see that 65,569 milliseconds of elapsed time were required to perform the 40,531 physical I/Os (the DKRD statistic) to Table B required for a direct search of 999,847 records (the DIRRCD statistic).
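The subtraction above can be taken one step further to estimate the average wait per physical read. This is a small Python sketch using only the statistics printed in Figure 1; the function name is hypothetical, and attributing all non-CPU time to disk reads is a simplifying assumption.

```python
def io_wait_profile(rqtm_ms, cpu_ms, dkrd):
    """Return (non-CPU elapsed ms, average non-CPU ms per physical read)."""
    non_cpu_ms = rqtm_ms - cpu_ms
    return non_cpu_ms, non_cpu_ms / dkrd


# RQTM, CPU and DKRD as reported for the request in Figure 1.
non_cpu_ms, ms_per_read = io_wait_profile(80149, 14580, 40531)
print(non_cpu_ms)             # 65569
print(round(ms_per_read, 2))  # 1.62
```

On these figures the request spent roughly 1.6 milliseconds of non-CPU time per physical read, which is why reducing DKRD is the main lever for shortening the request.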
Figure 1 found 999,831 records in which the SEQNO field occurs. However, we don't yet know which of those records harbor non-numeric data.
Increasing the Speed by Using the Ordered Index
Figure 2 shows a vastly superior way to achieve similar, though not exactly equivalent, results using the Ordered Index and Ordered Numeric fields. This FIND statement avoids a direct record search of Table B; instead it uses the Ordered Index to resolve the FIND processing.
However, it does not exactly simulate IS PRESENT or IS LIKE * processing. The resultant found set not only excludes records that do not contain the field, it also excludes records that contain the field with a non-numeric value.
Figure 2. Using IS GT OR IS LE syntax against an ORDERED NUMERIC field
03.041 FEB 03 17.47.55 PAGE 133
BEGIN
PRES2: IN DSNLIST3 FIND RECORDS
         SEQNO IS GT 0 OR SEQNO IS LE 0
       END FIND
C2:    COUNT RECORDS IN PRES2
       PRINT COUNT IN C2 WITH ': TOTAL RECS WITH NUMERIC VALUES OF SEQNO'
END
T REQUEST
999814: TOTAL RECS WITH NUMERIC VALUES OF SEQNO
CPU=24.378 CNCT=444 DKRD=10893 DKWR=44 SQRD=28 SQWR=94 NTBL=2 QTBL=17 STBL=46 TTBL=4 VTBL=12 PDL=856 CNCT=14 CPU=6057 DKRD=2715 OUT=3 FINDS=1 PCPU=426 RQTM=14179 BXNEXT=2019625 BXFIND=4 BXRFND=19996 DKPR=1025240
Enjoying the Speed
In Table 1, the RQTM statistic shows that the elapsed time recorded for Figure 2 was just 14.179 seconds, compared with 80.149 seconds for Figure 1.
Table 1. Comparing an IS LIKE * search to an IS GT OR IS LE search

Search strategy             CPU (ms)   DKRD     RQTM (ms)   DIRRCD
IS LIKE * (Figure 1)        14,580     40,531   80,149      999,847
IS GT OR IS LE (Figure 2)   6,057      2,715    14,179      0
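To put the two rows of Table 1 side by side, the improvement can be expressed as a ratio for each statistic. The following Python sketch uses only the values from the table; the dictionary and function names are hypothetical.

```python
# Statistics from Table 1 (CPU and RQTM in milliseconds).
FIGURE_1 = {"CPU": 14580, "DKRD": 40531, "RQTM": 80149}
FIGURE_2 = {"CPU": 6057, "DKRD": 2715, "RQTM": 14179}


def improvement_factors(baseline, improved):
    """Return baseline / improved for every statistic in the baseline."""
    return {name: baseline[name] / improved[name] for name in baseline}


for name, factor in improvement_factors(FIGURE_1, FIGURE_2).items():
    print(f"{name}: {factor:.1f}x")
```

On these numbers the Ordered Index search used about 2.4 times less CPU, roughly 15 times fewer physical reads, and finished in a little under one-sixth of the elapsed time.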
Hunting for Records with Non-Numeric Data
However, notice that the count of records in Figure 2 is 17 fewer than the count in Figure 1. Those 17 records contain a non-numeric occurrence of the field SEQNO.
In Figure 3, the record set on list NOTNUM.NOTPRES is generally what is desired when reviewing Ordered Numeric fields for accuracy, because this record set identifies records for which SEQNO either does not occur at all or occurs with a non-numeric value.
Either condition could represent a data inconsistency, which may require further investigation.
The request in Figure 3 illustrates how to isolate and separate records with no occurrence of SEQNO from those with non-numeric occurrences of SEQNO.
Figure 3. Isolating records where SEQNO is not present or is not numeric
BEGIN
ALL0:    IN DSNLIST3 FIND ALL RECORDS
         END FIND
C0:      COUNT RECORDS IN ALL0
         PRINT COUNT IN C0 WITH ': TOTAL RECORDS IN FILE'
         PLACE RECORDS IN ALL0 ON LIST NOTNUM.NOTPRES
*
ALL1:    IN DSNLIST3 FIND RECORDS
           SEQNO IS GT 0 OR SEQNO IS LE 0
         END FIND
C1:      COUNT RECORDS IN ALL1
         PRINT COUNT IN C1 WITH ': TOTAL RECS WITH NUMERIC VALUES OF SEQNO'
         REMOVE RECORDS IN ALL1 FROM LIST NOTNUM.NOTPRES
*
C2:      COUNT RECORDS ON LIST NOTNUM.NOTPRES
         PRINT COUNT IN C2 WITH ': RECS WHERE SEQNO NOT NUMERIC OR NOT PRES'
NOTPRES: FIND RECORDS ON LIST NOTNUM.NOTPRES
           SEQNO IS NOT PRESENT
         END FIND
C3:      COUNT RECORDS IN NOTPRES
         PRINT COUNT IN C3 WITH ': RECS WHERE SEQNO NOT PRESENT'
*
         PLACE RECORDS ON LIST NOTNUM.NOTPRES ON LIST NOTNUM
         REMOVE RECORDS IN NOTPRES FROM LIST NOTNUM
C4:      COUNT RECORDS ON LIST NOTNUM
         PRINT COUNT IN C4 WITH ': RECS WHERE SEQNO NON NUMERIC'
END
T REQUEST

999847: TOTAL RECORDS IN FILE
999814: TOTAL RECS WITH NUMERIC VALUES OF SEQNO
33: RECS WHERE SEQNO NOT NUMERIC OR NOT PRES
16: RECS WHERE SEQNO NOT PRESENT
17: RECS WHERE SEQNO NON NUMERIC
CPU=56.953 CNCT=8003 DKRD=59624 DKWR=117 SQRD=79 SQWR=254 NTBL=10 QTBL=75 STBL=183 TTBL=4 VTBL=14 PDL=856 CNCT=11 CPU=5901 DKRD=2717 OUT=7 FINDS=3 PCPU=520 RQTM=11323 DIRRCD=33 BXNEXT=2019625 BXFIND=4 BXRFND=19996 DKPR=1025388
IS [NOT] PRESENT Processing Reappears
Figure 3 also illustrates that you can use the IS NOT PRESENT condition without a significant performance penalty to identify records containing no occurrence of the field. The lack of significant performance penalty is due to the small set of records to be directly searched. Alternatively, since the record set is small, you could examine each record directly in a FOR loop to differentiate between records containing no occurrence of SEQNO and records containing non-numeric occurrences.
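The list manipulation in Figure 3 is essentially set subtraction, which the following Python sketch mirrors on a handful of made-up records. It is an illustration of the logic only, not Model 204 code; the sample data, the function names, and the simple numeric test (which may not match Model 204's own rules for numeric values) are assumptions.

```python
import re


def is_numeric(value):
    """Hypothetical stand-in for 'would be indexed as a numeric value'."""
    return re.fullmatch(r"[+-]?\d+(\.\d+)?", value) is not None


def classify(records):
    """records maps record number -> SEQNO value, or None when absent."""
    all_records = set(records)                       # ALL0
    numeric = {n for n, v in records.items()         # ALL1
               if v is not None and is_numeric(v)}
    notnum_or_notpres = all_records - numeric        # list NOTNUM.NOTPRES
    # Only this small remainder is examined record by record.
    not_present = {n for n in notnum_or_notpres if records[n] is None}
    non_numeric = notnum_or_notpres - not_present    # list NOTNUM
    return not_present, non_numeric


sample = {1: "000123", 2: None, 3: "3FEB03", 4: "999999", 5: None}
not_present, non_numeric = classify(sample)
print(sorted(not_present))   # [2, 5]
print(sorted(non_numeric))   # [3]
```

The expensive step, testing each record directly, is applied only to the records left over after the indexed search has removed the numeric ones, which is the same reason IS NOT PRESENT is cheap in Figure 3.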
The Cost of Nearly All Unique Values
Thus far we have determined the most efficient way of creating a found set using an Ordered Numeric field. We have isolated the records that either lack the field or contain it with invalid data. Next we will consider the effect of the frequency of unique values on processing.
The examples in Figures 1, 2, and 3 use the field SEQNO, which occurs in nearly every record and for which all values, with a few exceptions, are unique. These nearly one million unique values must be stored in leaf pages in the Ordered Index, as illustrated in Figure 4.
Figure 4. ANALYZE command output illustrating the leaf pages required for so many unique values
ANALYZE SEQNO
ROOT NODE VERSION NUMBER = 1449263
*** M204.0005: ANALYZE FIELDNAME = SEQNO

                 AVG.    OFFSET  COMP.   KEY     PAGE     AVG.
         PAGES   ENTRY   AREA    SIZE    AREA    USAGE%   UNUSED
ROOT     1       7       34              60      1        6050
I-NODE   6       465     951             4082    81       1110
LEAF     2689    371     763     3       3774    73       1602

MRIB:    IMMEDIATE LIST BITMAP TOTAL
ENTRIES  2 2
RECORDS  20 20
PAGES    2 2
SRIB: 999811
In Figure 4 the SRIB (single record information block) statistic shows that there are, in fact, 999,811 unique values of SEQNO. Also, there are two MRIB (multiple record information blocks) entries, meaning that there are two values of SEQNO, 999999 and 3FEB03, which are not unique and which occur in 20 records. Seventeen records contain SEQNO=3FEB03 and three records contain SEQNO=999999.
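The distinction between SRIB and MRIB entries amounts to counting how many records share each value. The Python sketch below shows that counting on a tiny made-up list; the function name and sample values are hypothetical, and it models only the counts, not how the Ordered Index stores the entries.

```python
from collections import Counter


def index_entry_profile(values):
    """Count single-record and multiple-record values in a list of field values."""
    counts = Counter(values)
    srib = sum(1 for c in counts.values() if c == 1)
    mrib_entries = sum(1 for c in counts.values() if c > 1)
    mrib_records = sum(c for c in counts.values() if c > 1)
    return {"SRIB": srib, "MRIB entries": mrib_entries, "MRIB records": mrib_records}


# Three unique values plus the two repeated values described for Figure 4.
values = ["000001", "000002", "000003"] + ["999999"] * 3 + ["3FEB03"] * 17
print(index_entry_profile(values))
```

Applied to the real file, the same arithmetic ties the figures together: 999,811 single-record values plus the 20 records behind the two MRIB entries gives 999,831, the number of records in which SEQNO occurs according to Figure 1.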
This large collection of unique values for SEQNO results in worst-case performance in Ordered Index searches, as shown in these examples. When an Ordered field has only a few unique values, the performance of applications running these types of searches is considerably improved, resulting in fewer DKRDs, less CPU consumption, and less time holding the INDEX resource in SHARE mode.
With Ordered Numeric fields, as with Ordered Character fields, the Ordered Index provides powerful and efficient facilities for determining the presence or absence of fields in a record. With Ordered Numeric fields the additional requirement of identifying records containing non-numeric values is also efficiently supported via the Ordered Index. Furthermore, the IS [NOT] PRESENT FIND condition remains a valuable and easy to use FIND condition and, when restricted to small sets of records, avoids the significant performance penalty typically associated with a direct-record search of an entire file.
Copyright © 2008 Computer Corporation of America. All rights reserved. Published in the United States of America.