The Challenges of Implementing DFSMS

By Ernie Ishman

In April 1991, Geisinger System Services made a commitment to implement DFSMS. The months that followed presented many new challenges to the internal support staff, ranging from dealing with DFSMS-related problems to understanding some of the new functions DFSMS introduced to the system. As project leader, I kept a log of both expected and unexpected issues during the project. This article details some of our experiences. It assumes a basic understanding of DFSMS and is intended as a comparison view of one shop's successes and pitfalls.

Upgrade Maintenance

The initial task of the project was to upgrade maintenance for MVS/ESA, which had been installed in October 1990 via CBIPO. That task turned out to be ongoing. An initial CBPDO brought us up to the 9101 level, with two succeeding PDOs through 9104 and 9106. In each case, problems experienced firsthand were corrected. Since the software behind DFSMS is still maturing, this is an especially important point to keep in mind. For the most part, the stability of DFSMS throughout the project was acceptable, with most problems being more of a nuisance. An example came when an eight-character storage group name was tried for the first time. Much to our surprise, it did not work: VSAM defines going to storage group SGBASE90 were not functioning. APAR OY40764 corrected the problem, but a desire to stay current on maintenance was born.

Before discussing some of the technical details, let me mention some of the objectives that drove our desire for DFSMS. These included, but were not limited to: strategic positioning; defining and enforcing DASD standards; simplifying data allocation; automating control of storage resources; promoting device independence; exploiting features such as data set level caching and PDS/E; and promoting efficient utilization of DASD. For the project to be considered a success, these objectives had to be satisfied.

The Team

The project team included a DBA, an operations analyst, an applications analyst, a security analyst and two systems programmers. The bulk of the actual work was performed by the systems programmers, with the other members of the team serving more in an advisory capacity. To keep the project moving along, the Custom Migration Support (CMS) offering from IBM was brought in to assist in all phases of the project. This primarily provided us with an unbiased look at the environment, with detailed insight on how prepared we were for the conversion. In addition, the CMS team offered proven recommendations that helped streamline DFSMS configuration design, problem diagnosis and end-user education.

After initial maintenance was applied to the system, the next step was to bring up a minimal configuration. This proved to be a transparent but milestone event. With a minimal configuration in place, the stage was set to phase in additional changes as each prior one proved successful. One concern we dealt with was the need to understand how some of the third-party products were going to interact with DFSMS. Since DMS was our DASD management software, used in place of DFHSM, it was very important to take each step with caution. To insulate end users from initial problems, the internal support staff's data was the first to get managed. This provided us with an opportunity to understand what DFSMS meant to the environment, as well as to shake out problems. At this point, successes and pitfalls seemed to come on a daily basis.
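As a rough sketch of that first cut (the high-level qualifiers here are invented for illustration), the storage class ACS routine amounted to little more than a filter list that caught staff data and left everything else unmanaged:

    PROC STORCLAS
      /* Hypothetical high-level qualifiers for internal support data */
      FILTLIST STAFFDSN INCLUDE(TECH.**, SYSPGMR.**)
      SELECT
        WHEN (&DSN = &STAFFDSN)      /* staff data: SMS-managed       */
          SET &STORCLAS = 'SCBASE'
        OTHERWISE                    /* null class: stays unmanaged   */
          SET &STORCLAS = ''
      END
    END

Anything assigned the null storage class simply bypassed SMS, which kept the rest of the shop untouched while the configuration was proven out.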
A problem that appeared early was related to the small size of the first storage group. Initially, only two volumes were put under DFSMS control, supporting selected permanent data sets and limited temporary work space requests. A couple of newly managed non-VSAM allocations existed with UNIT=(SYSWK,3) coded, which related to three work volumes in the unmanaged pool. Although the necessary space was available in the managed pool to service the allocation, it failed because the request for three volumes was honored. That seemed a bit harsh to me and the others, and a fixing APAR, OY32669, was issued. That changed the process to issue a warning message indicating the number of volumes requested may not be satisfied if actually required.

Also somewhat of a surprise was the effect DFSMS had on space allocation requests in a mixed environment. We have both 3380 and 3390 devices and decided to use the 3390 track size as the base configuration default. Most SPACE keywords in our JCL had been coded to do either track or cylinder allocations, and most JCL also used the UNIT=esoteric allocation technique. Since DFSMS couldn't make an assumption about what device types made up an esoteric pool, it used the base track size to determine space requests. In other words, DFSMS calculated a request as though it would end up on a 3390. Thus, a request of 10 cylinders would end up as 12 on a 3380. The way around this was to change space requests to more generic values such as kilobytes. Since that's not a trivial task, we've accepted the oddity as long as there is mixed geometry.

Some Relief From Overallocation

Since overallocation was common on the 3380s, significant relief came when new space release attributes called "immediate" and "immediate-conditional" were introduced in the management class. These were available with PTF UY65041 and provided us with an opportunity to automate the release of space when JCL did not have the RLSE keyword coded. Since there is a potential need for space beyond the initial request, such as would be used in a DISP=MOD request, we decided to implement immediate-conditional. This meant space would only be released when a secondary allocation could be made available. This approach has turned out to be considerably more efficient than waiting until nightly space management cycles run to release space after the fact.

The only problem we saw was associated with an ISPF exclusive enqueue. For data sets defined with immediate space release, ISPF was locking out other users after a data set had been edited. This was especially noticeable with PDS data sets when a single member was changed: no other members could be accessed by other users as long as the first person remained at the ISPF primary edit screen. Since it's common to make a PDS member change and only back out to the edit screen, this proved to be quite a headache. Prior to the fix for this via APAR OY49132, a special management class to avoid space release was assigned to problem data sets.

A productivity gain offered by DFSMS is the elimination of EDT gens. However, this should not be confused with an attempt to eliminate esoteric units completely. JCL containing an esoteric request will receive a JCL error if the esoteric is removed from the system. To avoid this, no esoterics were removed from our MVSCP source. Instead, as all volumes for a particular esoteric were converted, the esoteric would be set to point to a low-use paging volume.
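To illustrate with invented names, the effect on JCL of draining an esoteric looked something like this:

    //* Before conversion: SYSWK resolves to a pool of unmanaged
    //* work volumes.
    //WORK     DD DSN=&&WORK,UNIT=(SYSWK,3),
    //            SPACE=(CYL,(50,10)),DISP=(NEW,DELETE)
    //* After conversion the DD is unchanged; SYSWK remains defined
    //* in the MVSCP but now points only at a low-use paging volume,
    //* so the JCL still interprets cleanly instead of failing with
    //* a JCL error.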
This volume was established early on as the catchall pack for problems arising from various scenarios. Now and then a data set would show up on this volume, indicating an allocation got through the system for a data set that would otherwise have caused a JCL error. One of our primary goals was to avoid unnecessary abends, and this went a long way toward accomplishing that. Eventually, all old esoterics pointed back to the paging pack, just in case.

Since DFSMS requires that managed data sets be cataloged, I was surprised to see sporadic occurrences of uncataloged data sets. These generally turned out to be temporary in nature; such things as a job abend or system crash allowed the situation to occur. Because of this, we periodically run a job that reports uncataloged data sets, which are then cleaned up. Unlike unmanaged data sets, option 3.4 of ISPF cannot be used to purge these uncataloged data sets. We found the ISMF DELETE line command will delete them, as it apparently issues DELETE NVR, which would be the other way of doing it. This also brought out another item of interest: DFSMS does not enter the management class ACS routine for temporary data, so the system will not assign a management class to DSTYPE=TEMP data. Thus, a space management job based on management class was not picking them up. This should not be an issue in a DFHSM environment, because its daily space management, by default, cleans them up.

VSAM Alternate Indexes

While discussing data sets that do not go through an ACS routine, it's important to note VSAM alternate indexes. Although it's clear in the manual, we found out the hard way that jobs trying to assign a management or storage class to an AIX will fail. For good reason, these data sets are automatically managed in the same manner as their related cluster. This keeps the entire sphere under control within the same storage group.

A welcome feature of DFSMS for the internal support staff was being able to eliminate involvement with large production data set placement requests. We basically opened up the entire DASD pool to all allocations with the expectation that DFSMS and SRM could do a better job than we could. One system-related problem appeared in this area. DFSMS assigns volumes for new allocations when a job is submitted, and in some cases this included very large requests for space. If the step actually needing the space did not execute for a period of time, there was a chance space was no longer available on the selected volume. When DFSMS switched the allocation to a different volume, an abend occurred. This was fixed via APAR OY47967.

We chose to exert control over who could allocate large test data sets, and two approaches were taken. One came in the form of a check in the storage class ACS routine for data sets requesting over 300MB primary. If the user was not privileged, the request was failed with a note to contact internal support for assistance. This too was a welcome feature, as there had been no enforcement of this unwritten standard in the past. The second approach dealt with allocations that could potentially grow above 300MB via secondary extents. An ACS storage class exit was written to detect these and issue a warning back to the joblog and syslog; the exit was necessary because it is not possible to write to the log via an ACS routine. We wanted to review these periodically to detect potential space abuse. The exit was based on some code found in member IGDACSSC of IPO1.SAMPLIB.
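A minimal sketch of the first approach follows. The privileged user IDs are invented, and the WRITE statement is shown on the assumption it was available at our DFSMS level; &SIZE carries the primary request in kilobytes, so 300MB works out to 307,200KB:

    /* Fragment of the storage class routine: fail primary requests */
    /* over 300MB (307200KB) unless the user is privileged.         */
    FILTLIST PRIVUSER INCLUDE('STGADM1','STGADM2')  /* invented IDs */
    IF &SIZE > 307200 && &USER NE &PRIVUSER THEN
      DO
        WRITE 'DATA SETS OVER 300MB - CONTACT INTERNAL SUPPORT'
        EXIT CODE(16)      /* a nonzero exit code fails the request */
      END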
The IGDACSSC code as shipped was implemented during early ACS code testing for debugging purposes. I recommend taking a look at it.

We expected installation of DFSMS would provide opportunities to use the new PDS/E data structure. It turned out that some ugly messages were issued when an attempt was made to define a PDS/E. That was corrected via installation of PTF UY50879, yet another indicator of the importance of current maintenance. I also wanted to start using the new DCOLLECT option in IDCAMS, but found it was not available until UY90555 was applied.

Periodically, programmers coded data set allocations using the EXPDT keyword. Since we did not use the value in any space management processing, I was glad to see a way to override it via DFSMS. By establishing a retention limit of zero in the management class, EXPDT values were ignored, and a message indicating the value was ignored went back to the joblog. This has eliminated the need to code PURGE on DELETE requests, which sometimes occurred at very inconvenient times.

Generic Entries in APF

A new feature for systems programmers was the ability to make generic entries in the APF list for managed data sets by leaving off the normal volser value. That informed the system to use normal catalog searching to find and authorize the data set at IPL time. This was an especially useful enhancement in relation to our disaster recovery plan: we never knew where an authorized data set would end up during a recovery scenario. Now, if it is managed, there's no need to know. The feature is available via APARs OY26695, OY27602, OY28919 and OY41408.

Since our environment had mixed DASD geometry in the storage group pools, there were a few isolated problems when files were converted to DFSMS. CICS complained, for good reason, when some journals got moved from unmanaged 3380s to managed 3390s. This was due to the files having been preformatted using certain characteristics of the geometry of the device they were on. The larger track size left allocated space that was not usable. Unfortunately, the problem didn't show itself until CICS actually tried to write to this area, and the regions had to be brought down to correct the situation. The reason for this type of problem is quite apparent, but it's an example of how little things can slip through if you're not careful. The problem is typical of others you may have where DSORG PS or DA files are preformatted by specialized utilities. There was one instance where the release of a software product did not support 3390s; it was corrected by an upgrade to the product. Checking with the vendor is a must before moving data associated with a product that utilizes specialized utilities.

After trying different approaches with management and storage class constructs, the list was stabilized as the project neared completion. Since management class was now controlling backup and archival of managed data, daily scrutiny was a must. Since DMS was our management tool, my experiences would not apply to everyone; suffice it to say that backup and archive management was not always what we expected. It took time to understand which management class parameters related to various DMS terminology. As our knowledge base expanded, it was pleasant to see that items such as data set level caching via STORCLAS were everything advertised. To illustrate our implementation of these two constructs, a chart summarizing the basic differences internal to each is shown in Figure 1. Attributes that are the same for respective classes are listed separately.
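As a hedged example of the kind of job that feeds this daily scrutiny, the DCOLLECT option mentioned earlier can sweep a storage group for space and data set records (the output data set name is invented):

    //DCOLJOB  EXEC PGM=IDCAMS
    //SYSPRINT DD  SYSOUT=*
    //DCOUT    DD  DSN=TECH.DCOLLECT.DATA,DISP=(NEW,CATLG),
    //             UNIT=SYSDA,SPACE=(CYL,(5,5)),
    //             DCB=(RECFM=VB,LRECL=644,BLKSIZE=0)
    //SYSIN    DD  *
      DCOLLECT OUTFILE(DCOUT) STORAGEGROUP(SGBASE90)
    /*

Note BLKSIZE=0 on the output DD, which lets system-determined blocksize pick a value; the records can then be post-processed by whatever reporting tool the shop prefers.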
I will not illustrate our data classes because little time was spent on them; we saw few significant benefits without a stricter naming convention. The storage groups were simply divided by 3380 and 3390 geometry, with one thrown in for the VIO pool. Although there is considerable use of DB2 and CICS, we've been able to avoid implementing guaranteed space. Initially, there was apprehension that poor data set placement could cause response time problems; that never materialized. It seems that as long as there are sufficient volumes, DFSMS does an adequate job on performance.

The only data not being managed at this point are the DB2 BSDS, DB2 logs, selected CICS journals, paging data sets, the master catalog volume, the IPL volume and a volume of third-party IPL-critical data sets. These data sets are kept unmanaged primarily for recovery purposes. Integration into DFSMS is expected as time and disaster recovery techniques permit.

Although the efficiency of VIO has been around for some time, the ability to manage the resource was limited. DFSMS takes a big step forward in getting a handle on VIO usage. I expected and received significant results when VIO was implemented for selected temporary data sets. Jobs using small, high-reuse data sets saw as much as a 42 percent decrease in execution time if the job was not set up to use VIO previously. Good examples were jobs doing a DB2 precompile, CICS translate, COBOL compile, linkedit and DB2 bind all in one TSO batch step. A couple of problems surfaced with products unable to handle data sets moved from DASD to VIO. Most were handled by a vendor PTF or by excluding the product from VIO via ACS. One curious problem came about because of the useful out-of-space protection STOP-X37 was providing. Since STOP-X37 did not get involved in VIO x37 abend processing, any job previously relying on its help for a data set on DASD started to abend if DFSMS moved the data set to VIO. Another item applicable to STOP-X37, which should apply to other comparable products, relates to the recatalog feature. This feature allows files to be recataloged to the new volume when a duplicate data set gets created. For good reason, this is not applicable under DFSMS; if there are jobs relying on this to go to normal completion, a JCL error will occur.

Most Difficult Problem

By far, the most difficult problem we had to debug was not even recognized as a DFSMS problem. To facilitate testing, we run a separate MVS system that communicates with production MVS via a virtual CTC adapter under VM. One day, after making various changes unrelated to DFSMS, the test system was taken down for an IPL. Much to our surprise, the VTAM link via the CTC was not able to reconnect after the system came back up. A thorough review of the changes revealed nothing that would cause such a problem, and since DFSMS was not involved in the immediate changes, it was not a suspect. To make a long story short, a minor change had been made to the storage class ACS routine about a week earlier. The change unknowingly allowed DFSMS to attempt management of all devices, including the CTC. It just so happened that when VTAM tried to activate the CTC, it was receiving a sense code indicating the device was already in use, due to the inadvertent involvement of DFSMS. The moral of the story is to make sure the storage class ACS routine has a filter list that is very selective about the devices DFSMS gets its hands on.
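In that spirit, a hypothetical guard at the top of the storage class routine (unit names invented) keeps DFSMS away from anything that is not known DASD:

    /* Only consider requests against known DASD unit names;       */
    /* everything else, CTCs included, gets no storage class.      */
    FILTLIST DASDUNIT INCLUDE('3380','3390','SYSDA','SYSWK')
    IF &UNIT NE &DASDUNIT THEN
      DO
        SET &STORCLAS = ''
        EXIT
      END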
An unexpected benefit of DFSMS came from a change in the way IDCAMS BLDINDEX processing works. For BLDINDEX to function in a DFSMS environment, PTF UY65152 was applied. This eliminated the need for IDCUT work files, which had unique requirements that did not fit the DFSMS environment. In place of its own internal sort logic, IDCAMS now called the standard sort product on the system. After applying this PTF, a job performing a BLDINDEX of 1.5 million records decreased in elapsed time from 64 minutes to 10 minutes, and EXCPs dropped from 189,000 to 9,000. Comparable decreases were noticed throughout the system for BLDINDEX jobs. Obviously, this caught a lot of attention.

Two unacceptable situations associated with VSAM file processing required JCL changes. An infrequent technique used to define VSAM files involved use of the FILE(xx) keyword. The xx DD in the JCL referred to by the DEFINE would contain only UNIT=xxxx, DISP=OLD and VOL=SER=yyyyyy. Under DFSMS, this caused problems since the referenced volume went away and there was no managed DSN= to allow use of a DUMMY storage group. Since there was no real need for the FILE(xx), these were removed when found. The other problem dealt with the combination of a DELETE, DEFINE and REPRO all within the same IDCAMS step. The REPRO used an OUTFILE(xx) to refer back to the JCL for the output data set that had just been deleted and redefined. Under DFSMS, the DELETE of a data set followed by a DEFINE usually moved it to another volume. In turn, when the REPRO tried to process OUTFILE(xx), which still pointed at the old location, an abend occurred. All occurrences of this technique had to be changed to use OUTDATASET in place of OUTFILE. That forced a search of the catalog for the current location of the data set when REPRO was invoked, instead of relying on where it was when the JCL was interpreted.

Something I had hoped to make greater use of was the ability to define VSAM files via JCL. Unfortunately, limitations allow its use only for very basic or temporary VSAM allocations, which is probably all it was intended for. Clusters requiring attributes such as REUSE, SPEED or ERASE must still be defined via IDCAMS DEFINE. If you've ever looked at the run-time difference between a large file loaded with SPEED as opposed to RECOVERY, you'll agree this is not trivial. Alternate index definitions are also not possible. Other options such as FREESPACE, CISIZE, IMBED and SHAREOPTIONS, not directly available in JCL, can be accessed via DATACLAS, but that would have required a commitment to DATACLAS. We decided it was best to continue recommending IDCAMS DEFINE for VSAM file allocations to avoid any confusion. We did inform everyone that a DISP=(OLD,DELETE) for a VSAM file in JCL would now delete the file; that was ignored in earlier DFP releases.

The enthusiasm to begin using system-determined blocksize (SDB), a new feature of DFP, uncovered a problem with model DSCBs. Some programmers were changing allocations to remove the BLKSIZE keyword so SDB would be invoked. Unfortunately for the GDGs, that was causing DFP to revert back to the blocksize of the model DSCB, which was not always equal to the original BLKSIZE in the JCL. There were only a few model DSCBs on the system, and they were there only to satisfy the GDG requirement, not to provide actual data set attributes. Since models are no longer required under DFSMS, the solution was to remove the model at the same time as the BLKSIZE. It took a few abends before everyone got the message.
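A hedged before-and-after sketch of the GDG change (names invented):

    //* Before: BLKSIZE removed for SDB, but the model DSCB's old
    //* blocksize is picked up instead of a system-determined one.
    //RPT      DD DSN=PROD.RPT.GDG(+1),DISP=(NEW,CATLG),UNIT=SYSDA,
    //            SPACE=(CYL,(5,1)),
    //            DCB=(PROD.GDG.MODEL,RECFM=FB,LRECL=133)
    //* After: the model reference is removed along with BLKSIZE, so
    //* SDB supplies the blocksize. No model DSCB is needed for an
    //* SMS-managed GDG.
    //RPT      DD DSN=PROD.RPT.GDG(+1),DISP=(NEW,CATLG),UNIT=SYSDA,
    //            SPACE=(CYL,(5,1)),RECFM=FB,LRECL=133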
VOL=REF

An interesting scenario involving VOL=REF was encountered. Anytime a file was defined using the VOL=REF technique, normal DFSMS processing was bypassed, with the file taking on the class constructs of the referred-to data set. This was causing some data sets to get managed even before they were included for selection in the ACS routines. The opposite was true when an otherwise managed data set referred to an unmanaged data set. Fortunately, the use of VOL=REF was very limited, and it was not an issue after everything was converted.

As with any project of this magnitude, there were considerably more challenges than I could possibly detail. Many were shop-specific, related to such items as homegrown software and overriding management concerns. You can be sure unique experiences are awaiting anyone involved in such an undertaking. I have tried to highlight the items of value to a general audience. In conclusion, I can say that DFSMS has had a very positive effect on our environment and has been worth every bit of the time we've taken to explore, understand and implement it.

Figure 1: Basic Differences of Implementing Two Constructs

MGMTCLAS

  Name      Primary days  Expire after x days  Partial release
  MCBASE    30            no limit             cond-immed
  MCBASEP   100           no limit             cond-immed
  MCPURGET  n/a           1                    cond-immed
  MCPURGEP  n/a           2                    cond-immed
  MCNOMIG   never         no limit             cond-immed
  MCNORLSE  100           no limit             no

  Attributes common to all management classes:
  Expire non-usage    - no limit (archived data sets are kept until?)
  Retention limit     - 0 (ignore EXPDT)
  Level 1 days        - same as Primary (Level 1 not implemented)
  CMD/auto migrate    - both
  # primary GDGs      - 2
  Rolled-off GDG      - expire
  Backup frequency    - 0
  # backups (exists)  - 7
  # backups (deleted) - 1
  Retain days (only)  - 50
  Retain days (extra) - 50
  Auto backup         - yes

STORCLAS

  Name      Dir resp  Dir bias  Seq resp  Seq bias  Guar. space  Comments
  SCBASE    900       -         900       -         no           default
  SCFASTR   10        R         10        R         no           force fast read
  SCFASTW   10        W         10        W         no           force fast write
  SCGUARAN  900       -         900       -         yes          not used
  SCNOSMS   999       -         999       -         no           bypass SMS via ACS
  SCNOVIO   900       -         900       -         no           bypass VIO
  SC80      900       -         900       -         no           force 3380 alloc
  SC90      900       -         900       -         no           force 3390 alloc

  Attributes common to all storage classes:
  Availability - standard
  Sync write   - no