Computer Corporation of America
CCAPRINT: A Newsletter for Model 204® and System 1032® Users
September 25, 2008
     
Model 204: I/O Bound or CPU Bound - Which Are You?
System 1032: Extracting Metadata from Datasets, Part II

Model 204
USE OF AND ACCESS TO PRODUCTS AND FEATURES ARE IN ACCORDANCE WITH THE TERMS AND CONDITIONS OF THE USER’S SOFTWARE LICENSE. THE PRESENTATION OF MATERIAL HEREIN DOES NOT, IN ANY MANNER, MODIFY SUCH TERMS AND CONDITIONS.

I/O Bound or CPU Bound - Which Are You?
By James Damon



We hear these terms often, I/O bound or CPU bound, but what do they really mean? What are the ramifications if my application is one or the other? What can I do about it? What should I do about it? How does a system manager identify one type of user versus the other? What, if anything, should a system manager do? In this article I’ll try to answer these and other questions.

Identifying an I/O Bound Application
An I/O bound application is one whose speed is limited (bound) by the speed of I/O devices: disks, printers, terminals, and other input/output devices. Figure 1 illustrates an application that is I/O bound. This application requires a direct search of every record in the file because of the IS PRESENT clause. It uses very little CPU. Its elapsed time is determined more by the speed of the input device (a disk where the Model 204 file is allocated) than by the speed or availability of the CPU. The I/O required, DKRD=4104, accounts for about 88% of the elapsed time: (RQTM-CPU) / RQTM = (5166-635) / 5166.

Figure 1. An I/O bound application

08.248  SEP 04  13.00.35         PAGE 1

BEGIN
FD RECTYPE IS PRESENT
END
T REQUEST

CPU=0.646  CNCT=51  DKRD=4104  DKWR=1  SQRD=13  SQWR=71
NTBL=1  QTBL=12  TTBL=3  VTBL=10  PDL=796  CNCT=5  CPU=635  
DKRD=4101  OUT=1  FINDS=1  PCPU=122  RQTM=5166  DIRRCD=101000

Applications in a multi-user, full screen, APSY environment are almost always I/O bound.
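The (RQTM-CPU) / RQTM calculation can be reproduced from the since-last statistics in Figure 1. The following is a minimal Python sketch, not part of Model 204; the function name is made up for illustration, and it assumes that all elapsed time not spent on CPU can be attributed to waiting, which in this request is dominated by disk reads.

```python
def io_wait_fraction(rqtm_ms: int, cpu_ms: int) -> float:
    """Fraction of elapsed time not spent on CPU: (RQTM - CPU) / RQTM."""
    if rqtm_ms <= 0:
        raise ValueError("RQTM must be positive")
    return (rqtm_ms - cpu_ms) / rqtm_ms


# Since-last statistics from Figure 1, both in milliseconds.
fraction = io_wait_fraction(rqtm_ms=5166, cpu_ms=635)
print(f"{fraction:.1%}")  # 87.7%
```

A value close to 1 suggests an I/O bound request; a value close to 0 means that most of the elapsed time was spent on CPU.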

Identifying a CPU Bound Application
Contrast the I/O bound application in Figure 1 with the CPU bound application in Figure 2, which performs no I/O whatsoever. Although it does nothing useful, it consumes a significant amount of CPU.

Figure 2. A CPU bound application


08.248  SEP 04  15.09.10         PAGE 5

BEGIN
REPEAT 10000000 TIMES
END REPEAT
END
T REQUEST

CPU=3.962  CNCT=44  SQRD=9  SQWR=32
QTBL=4  VTBL=2  PDL=252  CNCT=9  CPU=3955  OUT=1  SLIC=357  
PCPU=416  RQTM=9502  DKPRF=1

The only resource consumed by the Figure 2 request was CPU. Its speed was bound by CPU availability. It consumed 3955 ms of CPU and required an elapsed time, RQTM, of 9502 ms to complete. That resulted in a PCPU (percentage of CPU available, reported in tenths of a percent) of 416 (3955 / 9502 = 0.416). In other words, the CPU was 41.6% available to this request while it ran. This tells us that for the other 58.4% of the time the operating system was dispatching other, higher priority tasks or running itself. Model 204 could also have been running higher priority users during this time. From Model 204’s perspective, this application was CPU bound and was treated accordingly by being time sliced 357 times (SLIC=357).
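The PCPU arithmetic can be checked against both figures with a short Python sketch. The function name is hypothetical. Truncating to a whole number of tenths of a percent is an assumption: it happens to reproduce the PCPU values printed in Figures 1 and 2, but the article does not say how Model 204 rounds.

```python
def pcpu(cpu_ms: int, rqtm_ms: int) -> int:
    """CPU time as tenths of a percent of elapsed time (CPU / RQTM * 1000).

    Truncation is an assumption that matches the two figures in the article.
    """
    if rqtm_ms <= 0:
        raise ValueError("RQTM must be positive")
    return (1000 * cpu_ms) // rqtm_ms


print(pcpu(3955, 9502))  # Figure 2: 416
print(pcpu(635, 5166))   # Figure 1: 122
```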

Distinguishing I/O Bound vs CPU Bound Users
How did Model 204 know the CPU bound request was, in fact, CPU bound? The IOSLICE parameter, set by the system manager in CCAIN, sets the threshold for declaring a user as CPU bound. The IOSLICE default is 30 milliseconds. Once the CPU bound request shown in Figure 2 had consumed 30 milliseconds of CPU and still had not performed a disk I/O, performed a terminal I/O, or gone into a wait for some other reason, it was declared by the Model 204 scheduler to be CPU bound. Up until that point it was considered I/O bound because it had used less than IOSLICE milliseconds of CPU. The CPUSLICE parameter, also set in CCAIN, is used when a user becomes CPU bound; CPUSLICE becomes the new time slice value for that user, replacing the IOSLICE value.
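The classification rule can be pictured as a small state machine. The Python sketch below is a simplified model, not Model 204 code: the class and method names are invented, CPU time is charged in whole milliseconds, and any wait is assumed to return the user to the I/O bound state with a fresh IOSLICE.

```python
from dataclasses import dataclass

IOSLICE_MS = 30   # default IOSLICE from the article
CPUSLICE_MS = 10  # default CPUSLICE from the article


@dataclass
class UserState:
    cpu_bound: bool = False
    slice_ms: int = IOSLICE_MS
    cpu_used_ms: int = 0  # CPU consumed since the last wait or slice expiry

    def consume_cpu(self, ms: int) -> None:
        """Charge CPU time; using a full slice without waiting marks the user CPU bound."""
        self.cpu_used_ms += ms
        if self.cpu_used_ms >= self.slice_ms:
            self.cpu_bound = True
            self.slice_ms = CPUSLICE_MS  # CPUSLICE replaces IOSLICE
            self.cpu_used_ms = 0

    def wait(self) -> None:
        """Disk I/O, terminal I/O, or any other wait: back to I/O bound."""
        self.cpu_bound = False
        self.slice_ms = IOSLICE_MS
        self.cpu_used_ms = 0


user = UserState()
user.consume_cpu(29)
print(user.cpu_bound, user.slice_ms)  # False 30
user.consume_cpu(1)
print(user.cpu_bound, user.slice_ms)  # True 10
user.wait()
print(user.cpu_bound, user.slice_ms)  # False 30
```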

Balancing IOSLICE and CPUSLICE
Figure 3 shows the time slice parameter settings and purpose. The Model 204 scheduler uses these settings as limits to manage time slicing and to determine whether a user is I/O or CPU bound.

Figure 3. Viewing the time slice parameter values


VIEW IOSLICE, CPUSLICE

IOSLICE   30          CPU SLICE – IO

CPUSLICE  10          CPU SLICE – CPU

The defaults for the IOSLICE and CPUSLICE parameters (30 milliseconds and 10 milliseconds, respectively) can be set to different values in CCAIN, or the system manager can reset them dynamically. They are system-wide parameters and therefore apply to all users. The defaults, however, may not be suitable in all environments. Thirty milliseconds on a 100 MIPS machine lets an application run three million machine instructions before being declared CPU bound. That represents a significant amount of work. You may think that a smaller number is more appropriate for declaring a user to be CPU bound. On the other hand, reducing the value of IOSLICE increases Model 204 scheduler overhead, a situation generally to be avoided. The default values for IOSLICE and CPUSLICE should be optimal in most environments.
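The three million figure follows from multiplying the machine's instruction rate by the slice length. A small Python sketch of that arithmetic follows; the function name is illustrative, and MIPS is treated as a flat average rate, which is a simplification.

```python
def instructions_per_slice(mips: float, slice_ms: float) -> int:
    """Machine instructions executed in one time slice at a given MIPS rating."""
    instructions_per_second = mips * 1_000_000
    return round(instructions_per_second * slice_ms / 1000)


print(instructions_per_slice(100, 30))  # IOSLICE default: 3000000
print(instructions_per_slice(100, 10))  # CPUSLICE default: 1000000
```

Substituting your own machine's rating gives a rough sense of how much work a request can do before it is declared CPU bound.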

Using the MONITOR Command
The MONITOR command can quickly identify CPU bound users, as shown in Figure 4. When a user becomes CPU bound, the user’s priority (the CUR column in Figure 4) is decremented by two points and the user’s time slice (the SLICE column) is changed to the CPUSLICE value. The CPUSLICE value is the number of milliseconds of CPU the user is given during each time slice. Until the user again becomes I/O bound, the SLICE column remains equal to the CPUSLICE value and the priority continues to decrement until it reaches the bottom of the user’s priority range.

Figure 4. Using the MONITOR command to identify CPU bound users


VIEW IOSLICE, CPUSLICE
IOSLICE   30          CPU SLICE – IO
CPUSLICE  10          CPU SLICE – CPU

MONITOR
08.248  SEP 04  19.15.09        PAGE 12
USER SVR    BUF  FLS  PCPU SMPLS   RUNG    REDY    BLKI    WTSV    BLKO   SWPG   
   3   3   9552    4 0.114   60   0.000   0.000   3.016   0.000   0.000   0.000

USER SVR USERID     P CUR   SLICE AGE FUNC  CNCT        CPU  SEQIO QUE  WT FLGS  
  0    1 SUPERKLUGE H 128   0.030              0      0.033    122 BLKI  4  60  
  4    5 USER4      S  32   0.010     EVAL   219      3.930     75 REDY  
  5    4 USER5      H 104   0.030            158      0.007    230 RUNG  
  6    5 USER6      H  96   0.030     EVAL  1575      9.007    144 BLKI  1  20  
  8    2 USER8      H  80   0.010     EVAL   336      7.626    100 REDY
                    

When a CPU bound request goes into a wait for any reason, such as a wait for disk I/O, terminal input or output, or a record or resource locking conflict, it immediately returns to an I/O bound state and the user’s SLICE value is reset to the IOSLICE value. The user’s priority is also reset to 16 points above the minimum priority for that priority class.
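The priority and time slice adjustments described in this section can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the scheduler's actual logic: the names are invented, one two-point decrement is assumed per expired time slice, and the class minimum of 80 is taken from USER8 in Figure 4.

```python
from dataclasses import dataclass

IOSLICE_MS = 30          # default IOSLICE from the article
CPUSLICE_MS = 10         # default CPUSLICE from the article
CPU_BOUND_DECREMENT = 2  # priority points lost while CPU bound
IO_RESET_OFFSET = 16     # points above the class minimum after a wait


@dataclass
class UserDispatch:
    class_min: int              # bottom of the user's priority range
    priority: int               # current priority (CUR column)
    slice_ms: int = IOSLICE_MS  # current time slice (SLICE column)
    cpu_bound: bool = False

    def slice_expired(self) -> None:
        """The user consumed a whole time slice without waiting."""
        self.cpu_bound = True
        self.slice_ms = CPUSLICE_MS
        # Assumption: one decrement per expired slice, floored at the class minimum.
        self.priority = max(self.class_min, self.priority - CPU_BOUND_DECREMENT)

    def entered_wait(self) -> None:
        """A wait ends the CPU bound state and resets slice and priority."""
        if self.cpu_bound:
            self.cpu_bound = False
            self.slice_ms = IOSLICE_MS
            self.priority = self.class_min + IO_RESET_OFFSET


user = UserDispatch(class_min=80, priority=104)
for _ in range(13):
    user.slice_expired()
print(user.priority, user.slice_ms)  # 80 10
user.entered_wait()
print(user.priority, user.slice_ms)  # 96 30
```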

Dynamic Dispatching by the Model 204 Scheduler
The Model 204 scheduler (SCHD) in the nucleus is responsible for monitoring an application’s use of CPU and adjusting its priority and time slice values accordingly. After all priorities and time slice values are adjusted, the scheduler runs the highest priority, ready to run (REDY) user. That user then becomes the running (RUNG) user. In Figure 4 the MONITOR command output shows USER4 and USER8 as CPU bound. Their current dispatching priority, the CUR column, shows that they have reached the bottom of the range for their priority class and the SLICE column has been changed to the CPUSLICE value. CPU bound users will always be in the REDY or RUNG queue until they voluntarily give up the CPU (request a database page, issue an output statement like READ SCREEN or PRINT, or enter a wait state for some other reason). They’re always either ready to run and waiting for CPU or they are running.

The other users in Figure 4 are I/O bound, because the value in their SLICE column is equal to IOSLICE. USER6 is running the I/O bound request shown earlier. His WT column shows a 1 (waiting for disk I/O). This is expected for a direct search of Table B, which must read all active pages in Table B.

Waits for disk I/O do not cause a user’s priority to increment except when the wait ends a user’s CPU bound state. However, a user who is doing both disk I/O and terminal I/O will have her priority incremented. Each time a terminal output occurs, priority will be incremented by one point. Each time a terminal input occurs, priority will be incremented by two points. In a full screen, APSY environment, this means that a user’s priority will increment by one when a READ SCREEN statement is issued (terminal output) and then by two when the user hits enter or a PF key from the screen display (terminal input).

Using the MONITOR STAT command
The SLIC statistic, available in the MONITOR STAT output and also in EVAL stats, will give you some idea of how often users were declared to be CPU bound. High values of the SLIC statistic indicate that users were often declared to be CPU bound and those users could be causing periodic spikes in response time.

  • Increasing the value of IOSLICE will result in a decrease in the SLIC statistic and reduce scheduler overhead. It will, however, allow CPU bound requests to monopolize the CPU for longer periods of time.

  • Decreasing the value of IOSLICE identifies CPU bound requests sooner. Beware of setting IOSLICE too low, however, because that can lead to an unwanted increase in Model 204 scheduler overhead.
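To get a feel for this tradeoff, the following Python sketch counts time slice expirations for a made-up workload under different IOSLICE values. Everything here is hypothetical: the burst lengths are invented, the function name is illustrative, and the counting rule (one expiration when IOSLICE is used up, then one per full CPUSLICE) is a simplification that may not match how Model 204 accumulates the SLIC statistic.

```python
def estimate_slices(cpu_bursts_ms, ioslice_ms, cpuslice_ms):
    """Count slice expirations for CPU bursts that are separated by waits."""
    total = 0
    for burst in cpu_bursts_ms:
        if burst < ioslice_ms:
            continue  # the user waited before IOSLICE ran out: still I/O bound
        total += 1  # IOSLICE expired, user declared CPU bound
        total += (burst - ioslice_ms) // cpuslice_ms  # further CPUSLICE expirations
    return total


# Hypothetical CPU bursts (ms of CPU between waits) for one user.
bursts = [5, 12, 40, 25, 90, 8]

for ioslice in (15, 30, 50):
    print(ioslice, estimate_slices(bursts, ioslice, cpuslice_ms=10))
# 15 13
# 30 9
# 50 5
```

In this toy model a larger IOSLICE yields fewer expirations, at the cost of letting each long burst run longer before the user is first time sliced.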

In Summary
With the availability of large buffer pools and the ability to keep large amounts of your database in memory, users may now become CPU bound more often. That’s because the data they need may already be in the buffer pool requiring fewer waits for disk I/O, thus increasing the possibility of being declared CPU bound. If you are concerned about CPU bound users monopolizing your Model 204 Online and degrading performance for the large majority of I/O bound, APSY users, then adjusting the values of IOSLICE and CPUSLICE downward may provide a solution.

 

System 1032

Extracting Metadata from Datasets, Part II
By Tym Stegner

This is the second of a two-part article exploring means of obtaining attribute and dataset definition information. Part II deals with metadata extraction not available to the SHOW command API and suggests a method for storing user-defined dataset or attribute context data for retrieval.

What SHOW Does Not Show
Some useful information is not available via the SHOW command API, and thus cannot be obtained using the SHOW_datatype tools procedures described in Part I of this article. The most common example of this is the output of the SHOW DATASET FILE command.


1032> Set Ds FILMS
1032> Show Ds File
Dataset FILMS
  File: SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4
  Typed String: S1032_DEMO:
  Expanded String: DDSK:[DEMO_DVL.SYS]FILMS.DMS

SHOW DS FILE displays information stored when a dataset (or other) catalog is created.

  • "File:" describes the current file specification of the opened catalog
  • "Typed String:" reveals the value used for the OUTPUT clause of the CREATE command
  • "Expanded String:" is the full file specification of the value used for the OUTPUT clause, including translations of any logical names used in “Typed String”

For example, you might want to determine the current directory of a dataset in use or to extract the logical name that was used when the dataset was created. However, the SHOW DS FILE output is not available to either the DM_SHOW or the SHOW_TEXT procedure, so we will use another method to obtain it.

Memory Buffer Operations
Earlier I mentioned a method of obtaining SHOW output by sending the data to an external file (see Part I), then parsing the contents of the file for required data. While this is a possible methodology, the overhead associated with creation of lots of little output files can be burdensome.

In System 1032, instead of using an external file you can use a memory buffer. This memory buffer is available via the INIT_OUTPUT and GET_OUTPUT tools procedures.

In Figure 1, instead of using the INITIALIZE command to define a channel for directing output to a file, the INIT_OUTPUT tools procedure assigns the output channel to a memory buffer. Up to eight channels can be assigned at one time to flat files, memory buffers, or a combination of the two.

Figure 1


1032> Open Ds FILMS In S1032_DEMO Readonly
Current dataset is now FILMS
1032> Open Library S1032_HLI In S1032_TOOLS Readonly
1032>
1032> Call Init_Output(8)
Initializing Channel 8
1032>

Our output channel assigned, we use the SHOW ON [CHANNEL] # syntax to redirect the output of our SHOW DS FILE command. (The CHANNEL keyword is optional.)


1032> Show On 8 Ds File
1032>


Another SHOW command reveals the input and output parameters of the GET_OUTPUT tools procedure.


1032> Show Get_Output Param
Procedure GET_OUTPUT
  OUTBUF Output Text
  OUTPUT_LINES Output Integer
  CHANNEL Integer
  NUM_LINES Integer Optional
1032>


In the following example, I defined variables for the two required output parameters. However, I called GET_OUTPUT without specifying the “Num_Lines” parameter, which causes the entire buffer to be read:


1032> Var Bx Text Varying
1032> Var Nx Integer
1032> Call Get_Output(Bx,Nx,8)
1032>


Issuing a PRINT command reveals the contents of the output parameters. We see that four lines of output were retrieved. An SV format preserves the formatting of the lines.


1032> Print Nx Bx Fmt Sv60(A)
    NX                                    BX
-----------  ----------------------------------------------------
          4  Dataset FILMS
               File: SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4
               Typed String: S1032_DEMO:
               Expanded String: DDSK:[DEMO_DVL.SYS]FILMS.DMS
1032>


GET_OUTPUT Details
We are interested in only the second line of output for this example, so we will use the procedure again in a different manner to obtain only the second line. However, the buffer has already been read once by the above command, so the buffer pointer must be reset to the start of the buffer.


1032>
1032> Call Get_Output(Bx,Nx,8,-1)
1032>


Once the buffer pointer has been reset using a “-1” as the “Num_Lines” parameter, GET_OUTPUT is called twice, each time returning a single line, as denoted using “1” for the “Num_Lines” parameter. You could also call GET_OUTPUT with a value of “2” for the “Num_Lines” parameter, returning two lines from the buffer, but the former method reduces the necessary parsing of the returned value.


1032>
1032> Call Get_Output(Bx,Nx,8,1)
1032> Call Get_Output(Bx,Nx,8,1)
1032> Print Nx Bx Fmt Sv60(A)
    NX                                    BX
-----------  ----------------------------------------------------
          1  File: SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4
1032>
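The buffer pointer behavior shown in these examples can be mimicked with a small Python model. This is not System 1032 code. The class is invented for illustration, and two details are assumptions: that the reset call with -1 returns no lines, and that a read past the end of the buffer simply returns whatever remains.

```python
class OutputBuffer:
    """Toy model of a memory-buffer output channel read by GET_OUTPUT."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._pos = 0  # buffer pointer: index of the next unread line

    def get_output(self, num_lines=None):
        """Return (text, line_count), like the OUTBUF and OUTPUT_LINES parameters."""
        if num_lines == -1:
            self._pos = 0  # reset the pointer to the start of the buffer
            return "", 0   # assumption: a reset returns no lines
        if num_lines is not None and num_lines < 0:
            raise ValueError("num_lines must be -1, omitted, or non-negative")
        if num_lines is None:
            end = len(self._lines)  # omitted: read everything that remains
        else:
            end = min(self._pos + num_lines, len(self._lines))
        chunk = self._lines[self._pos:end]
        self._pos = end
        return "\n".join(chunk), len(chunk)


buf = OutputBuffer([
    "Dataset FILMS",
    "  File: SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4",
    "  Typed String: S1032_DEMO:",
    "  Expanded String: DDSK:[DEMO_DVL.SYS]FILMS.DMS",
])
text, count = buf.get_output()    # reads all four lines
buf.get_output(-1)                # reset the buffer pointer
buf.get_output(1)                 # first line, discarded
text, count = buf.get_output(1)   # second line
print(count, text.strip())
# 1 File: SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4
```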


It is a small matter to strip off the leading characters to obtain the file specification of the current dataset.
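As an illustration of that string handling, here is a Python sketch that splits the returned line into its parts. It is not System 1032 syntax, the function name is hypothetical, and it assumes the line always has the form "File: device:[directory]name.type;version" seen in the output above.

```python
def parse_file_line(line: str) -> dict:
    """Split 'File: device:[directory]name.type;version' into its parts."""
    label, _, spec = line.strip().partition(": ")
    if label != "File" or not spec:
        raise ValueError(f"not a File: line: {line!r}")
    device, _, rest = spec.partition(":[")
    directory, _, filename = rest.partition("]")
    return {
        "spec": spec,            # full file specification
        "device": device,
        "directory": directory,  # current directory of the dataset
        "filename": filename,
    }


parts = parse_file_line("  File: SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4")
print(parts["spec"])       # SYS$SYSDEVICE:[S1032.V9811.DEMO]FILMS.DMS;4
print(parts["directory"])  # S1032.V9811.DEMO
```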

Capturing User Defined Metadata
Occasionally you need metadata about a dataset or an attribute that is not part of the System 1032 definition environment, such as an application-specific code or setting. While it is possible to define variables in the dataset environment that contain the metadata you want, this approach is limited because you cannot modify the initial value of a variable after the dataset has been created, as you might need to do after the dataset is populated. Also, there is the possibility that the variables might be left out during dataset refreshes.

It is a common practice at some sites for System 1032-based financial applications to store a fiscal year of data in similarly named datasets. This is done because the financial datasets all have the same structure and use the same applications. Reviewing a particular year’s data is only a matter of opening the appropriate dataset, and the applications do not have to be coded to accommodate different dataset names.

Thus, rather than having a series of datasets in the form:

DS-Name         VMS File Specification
GENLEDGE2007    dsk:[findata]GENERAL_LEDGER_2007.DMS
GENLEDGE2006    dsk:[findata]GENERAL_LEDGER_2006.DMS
GENLEDGE2005    dsk:[findata]GENERAL_LEDGER_2005.DMS

This method allows the following:

DS-Name         VMS File Specification
GENLEDGE        dsk:[findata]GENERAL_LEDGER_2007.DMS
GENLEDGE        dsk:[findata]GENERAL_LEDGER_2006.DMS
GENLEDGE        dsk:[findata]GENERAL_LEDGER_2005.DMS

Putting Comments to Work
Sometimes an application is written for the specific year being processed, but other applications may be written to work with all fiscal years. Such a program must determine the fiscal year of the datasets that are already open. One method might be to use the technique described above to extract the dataset’s file name and then parse out the fiscal year.

However, people have been known to rename datasets, and the fiscal year may not track so nicely with the calendar year. Thus, it would be helpful to be able to encode the fiscal year into the dataset somehow. Our suggestion is to make use of the comment field for the dataset as the repository for needed metadata.

The data can be stored at the time of dataset creation, or afterward by using the MODIFY DATASET COMMENT command. Once the data is encoded, it can be retrieved directly using a SHOW_TEXT procedure call to return the encoded data to the application.
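How the metadata is encoded in the comment is left to the site. As one possibility, the Python sketch below parses KEY=value pairs out of a comment string. The KEY=value convention, the sample comment, and the function name are all invented for illustration; the article does not prescribe a format, and the comment text is assumed to have already been retrieved into a string.

```python
import re


def parse_comment_metadata(comment: str) -> dict:
    """Extract KEY=value pairs from a comment (hypothetical encoding convention)."""
    return {key.upper(): value for key, value in re.findall(r"(\w+)=(\S+)", comment)}


# Hypothetical comment text, as it might come back from the dataset.
comment = "General ledger FISCAL_YEAR=2007 REGION=EAST"
meta = parse_comment_metadata(comment)
print(meta["FISCAL_YEAR"])  # 2007
```

Keeping free text and the encoded pairs in the same comment lets the field remain readable to people while still being machine-parseable.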

In Summary
This two-part article has explored some options for retrieving dataset or attribute contextual data using tools procedures that enable programmatic retrieval of information available via the SHOW command. Access to the DM_SHOW API function is available using the tools procedures SHOW_datatype, or via use of a System 1032 internal memory buffer, accessible via the INIT_OUTPUT and GET_OUTPUT tools procedures.

These techniques are useful for getting contextual or environmental data from a dataset, for use in application programming.


Copyright © 2008 Computer Corporation of America.
All rights reserved. Published in the United States of America.