A 15 Year Perspective on Automatic Programming

ROBERT BALZER

(Invited Paper)

Abstract—Automatic programming consists not only of an automatic compiler, but also some means of acquiring the high-level specification to be compiled, some means of determining that it is the intended specification, and some (interactive) means of translating this high-level specification into a lower-level one which can be automatically compiled. We have been working on this extended automatic programming problem for nearly 15 years, and this paper presents our perspective and approach to this problem and justifies it in terms of our successes and failures. Much of our recent work centers on an operational testbed incorporating usable aspects of this technology. This testbed is being used as a prototyping vehicle for our own research and will soon be released to the research community as a framework for development and evolution of Common Lisp systems.

Index Terms—Automatic programming, evolution, explanation, knowledge base, maintenance, prototyping, specification, transformation.

INTRODUCTION

THE notion of automatic programming has fascinated our field since at least 1954 when the term was used to describe early Fortran compilers. The allure of this goal is based on the recognition that programming was, and still is, the bottleneck in the use of computers. Over this period we have witnessed phenomenal improvements in hardware capability with corresponding decreases in cost. These price/performance gains reached the point in the early 1980's that individual dedicated processors became cost effective, giving birth to the personal computer industry. This hardware revolution apparently will continue for at least the rest of the decade.

Software productivity has progressed much more slowly. Despite the development of high-level, and then higher-level, languages, structured methods for developing software, and tools for configuration management and tracking requirements, bugs, and changes, the net gain is nowhere close to that realized in hardware.

But the basic nature of programming remains unchanged. It is largely an informal, person-centered activity which results in a highly detailed formal object. This manual conversion from informal requirements to programs is error prone and labor intensive. Software is produced via some form of the "waterfall" model in which a linear progression of phases is expected to yield correct results. Rework and maintenance are afterthoughts.

Unfortunately, this existing software paradigm contains two fundamental flaws which exacerbate the maintenance problem.

First, there is no technology for managing the knowledge-intensive activities which constitute the software development processes. These processes, which convert a specification into an efficient implementation, are informal, human intensive, and largely undocumented. It is just this information, and the rationale behind each step, that is crucial, but unavailable, for maintenance. (This failure also causes problems for the other lifecycle phases, but is particularly acute for maintenance because of the time lag and personnel changeover normally involved, which precludes reliance on the informal mechanisms, such as "walking down the hall," typically used among the other phases.)

Second, maintenance is performed on source code (i.e., the implementation). All of the programmer's skill and knowledge has already been applied in optimizing this form (the source code). These optimizations spread information; that is, they take advantage of what is known elsewhere and substitute complex but efficient realizations for (simple) abstractions.

Both of these effects exacerbate the maintenance problem by making the system harder to understand, by increasing the dependences among the parts (especially since these dependences are implicit), and by delocalizing information.

With these two fundamental flaws, plus the fact that we assign our most junior people to this onerous task, it is no wonder that maintenance is such a major problem with the existing software paradigm.

Against this background, it is clear why automatic programming commands so much interest and appeal. It would directly obviate both these fundamental flaws through full automation of the compilation process.

But programming is more than just compilation. It also involves some means of acquiring the specification to be compiled and some means of determining that it is the intended specification. Furthermore, if one believes, as we do, that optimization cannot be fully automated, it also involves some interactive means of translating this high-level specification into a lower-level one which can be automatically compiled.

This is the extended automatic programming problem, and the main issue is how to solve it in a way that still obviates the two fundamental flaws in the current paradigm.

ISI's Software Sciences Division has been pursuing this

Manuscript received June 3, 1985; revised August 2, 1985. This work was supported in part by the Defense Advanced Research Projects Agency under Contract MD903 81 G 0335, in part by the National Science Foundation under Contract MCS-8304190, and in part by RADC under Contract F30602 81 K 0056.
The author is with the Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292.

0098-5589/85/1100-1257$01.00 © 1985 IEEE
1258 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-11, NO. 11, NOVEMBER 1985
goal for nearly 15 years. This paper describes the perspective and approach of this group to the automatic programming problem and justifies it in terms of the successes and failures upon which it was built.

THE EXTENDED AUTOMATIC PROGRAMMING PROBLEM

Automatic programming has traditionally been viewed as a compilation problem in which a formal specification is compiled into an implementation. At any point in time, the term has usually been reserved for optimizations which are beyond the then-current state of the compiler art. Today, automatic register allocation would certainly not be classified as automatic programming, but automatic data structure or algorithm selection would be.

Thus, automatic programming systems can be characterized by the types of optimizations they can handle and the range of situations in which those optimizations can be employed. This gives rise to two complementary approaches to automatic programming. In the first (bottom-up) approach, the field advances by the addition of an optimization that can be automatically compiled and the creation of a specification language which allows the corresponding implementation issue to be suppressed from specifications. In the second (top-down) approach, which gives up full automation, a desired specification language is adopted, and the gap between it and the level that can be automatically compiled is bridged interactively. It is clear that this second approach builds upon the first and extends its applicability by including optimizations and/or situations which cannot be handled automatically.

Hence, there are really two components of automatic programming: a fully automatic compiler and an interactive front-end which bridges the gap between a high-level specification and the capabilities of the automatic compiler. In addition, there is the issue of how the initial specification was derived. It has grown increasingly clear that writing such formal specifications is difficult and error-prone. Even though these specifications are much simpler than their implementations, they are nonetheless sufficiently complex that several iterations are required to get them to match the user's intent. Supporting this acquisition of the specification constitutes the third and final component of the extended automatic programming problem.

This paper is organized into sections corresponding to each of these three components of the extended automatic programming problem, captured as a software development paradigm, as shown in Fig. 1. Throughout this paper we will elaborate this diagram to obtain our goal software lifecycle paradigm. In fact, our group has focused its efforts much more on the specification acquisition and interactive translation components than it has on the traditional automatic compiler task.

SPECIFICATION ACQUISITION

Informal to Formal

Our entry into this field was preceded by a survey [3] which categorized then-current work. We identified specification acquisition as a major problem and decided to focus our initial efforts on the earliest aspects of this task—namely, converting informal specifications into formal ones. Our SAFE project [8] in the early 1970's took a (parsed) natural language specification as input and produced a formal specification as output. This system was able to correctly handle several small (up to a dozen sentences) but highly informal specifications (see Fig. 2(a) and (b)).

The basis for its capability was its use of semantic support to resolve ambiguity. Its input was assumed to be an informal description of a well-formed process. Hence, it searched the space of possible disambiguations for a well-formed interpretation. This search space was incrementally generated because each informal construct was disambiguated as it was needed in the dynamic execution context of the hypothesized specification. This dynamic execution context was generated via symbolic evaluation.

This use of the dynamic execution context as the basis of semantic disambiguation was strong enough that, at least for the simple specifications on which the system was tested, most ambiguities were locally resolved, and rather little (chronological) backtracking was needed.

Many key aspects of our approach to automatic programming (as explained later in the paper) found their first expression in this system: executable specifications and the use of symbolic evaluation to analyze each execution, an explicit domain model (constructed by the system), and the database view as a specification abstraction of software systems (as embodied in AP3 [18], the knowledge representation system for SAFE).

Our modest success with SAFE led us to consider the problems of scaling the system up to handle realistic-sized specifications. We recognized that we were solving two hard problems simultaneously: the conversion from informal to formal, and the translation from high-level formal to low-level formal. The first was the task we wanted to work on, but the latter was required because a suitable high-level formal specification language did not exist.

Formal Specification

We decided to embark on a "short" detour (from which we have not yet emerged) to develop an appropriate high-level specification language. The result of this effort was the Gist language [1], [20].

Gist differs from other formal specification languages in its attempt to minimize the translation from the way we think about processes to the way we write about them by formalizing the constructs used in natural language. We designed Gist from an explicit set of requirements [4] which embodied this goal and the goal of using such specifications in an automation-based software lifecycle (evolved from Fig. 1). It is indeed surprising and unfortunate that so few languages have been designed this way. This requirements-based design resulted in a language with a novel combination of features.

Global Database: We conceive of the world in terms of objects, their relationships to one another, and the set of operations that can be performed on them. At the specification level, all these objects must be uniformly accessible.

Operational: Gist specifications are operational (i.e.,
[Figure text illegible in this scan.]
Fig. 2. (a) Actual input for message processing example.
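SAFE's disambiguation strategy, as described above, amounts to a search over candidate formal readings of each informal construct, filtered by a well-formedness check. The following sketch is purely illustrative, not SAFE's code: every name and reading below is invented, and the filter here is a trivial stand-in for SAFE's symbolic evaluation of the hypothesized specification.

```python
# Illustrative sketch of SAFE-style semantic disambiguation (hypothetical
# names; not SAFE's actual algorithm). Each informal phrase has several
# candidate formal readings; an interpretation survives only if it passes
# a well-formedness check standing in for symbolic evaluation.

from itertools import product

# An ambiguous phrase maps to its candidate formal readings (invented).
CANDIDATES = {
    "the key": ["first_key_in_message", "most_recent_key"],
    "assigned": ["assign_action_office", "assign_info_office"],
}

def well_formed(reading):
    # Stand-in for symbolic evaluation: reject readings that violate an
    # (invented) domain constraint.
    first, assign = reading
    return not (first == "first_key_in_message" and
                assign == "assign_info_office")

def disambiguate(candidates):
    # Search the space of joint disambiguations for well-formed
    # interpretations of the whole specification.
    phrases = list(candidates)
    readings = [candidates[p] for p in phrases]
    return [dict(zip(phrases, r)) for r in product(*readings)
            if well_formed(r)]

interpretations = disambiguate(CANDIDATES)
# Only the readings that survive the well-formedness check remain.
```

In SAFE itself the filter was far stronger, and the space was generated incrementally in the dynamic execution context rather than exhaustively up front, which is why so little backtracking was needed.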
us detect some of these mismatches by making the generator (i.e., the specification) more understandable, many mismatches can only be detected by examining the behaviors themselves.

There are three classes of tools relevant to this task. The first is a theorem prover. It could be used to prove that all behaviors have some desired set of properties. We have not investigated this approach because it is hard to completely characterize the intended behavior via such properties, and because this approach only checks the expected. A major problem is that unexpected behaviors occur. The remaining two classes directly examine behaviors, and so are able to detect these unexpected behaviors.

The second class of tools is an interpreter for the specification language. Since we have required that our specification language be operational, all specifications are directly executable (although potentially infinitely inefficient). The major advantage of this type of tool is that it provides a means for augmenting and testing the understanding of the specification through examples. Its major problem is the narrow, case by case, feedback that it provides.

This has caused us to focus our efforts on the remaining class of tools: symbolic evaluation. Symbolic evaluation differs from normal evaluation by allowing the test case to be partially specified [11]. Those aspects not specified are treated symbolically with all cases (relevant to the specification) automatically explored. This provides a means of exploring entire classes of test cases simultaneously. These cases are only subdivided to the extent that the specification makes such distinctions. These subcases are automatically generated and explored. The size of the test case explored by symbolic evaluation is determined by the degree to which it is only partially specified (in the limit, when the case is completely specified, symbolic evaluation reduces to normal interpretation).

We have built a prototype evaluator for Gist. It produces a complex case-based exploration of the symbolic test case. It retains only the "interesting" (as defined by a set of heuristics) consequences of the behavior generated for the test case. To make this behavior understandable, we have also built a natural language behavior explainer (see [27] for details of this work).

Our intent is to provide both static (via the specification paraphraser) and dynamic (via the symbolic evaluator) means for validating a specification. We imagine an intense iterative cycle during the creation of a specification in which the validation cycles progress from the static feedback provided by the specification paraphraser to the dynamic feedback provided by the symbolic evaluator. During this period the specification is being used as a fully functional (although highly inefficient) prototype of the target system. These augmentations to the software lifecycle are shown in Fig. 4.

Maintenance

We also realized that the revisions made to the specification during this validation cycle are just like those that arise during maintenance. This realization suggested a radical change in how maintenance is accomplished: modify the specification and reimplement. This change in the software lifecycle (as shown in Fig. 5) resolves the fundamental flaw in the current software lifecycle. By performing maintenance directly on the specification, where information is localized and loosely coupled, the task is greatly simplified because the optimization process, which
[Additional figure text for the message processing example is illegible in this scan.]
[Fig. 6 diagram labels: informal specification, Acquisition, high-level specification (prototype), Translation, low-level specification, decisions & rationale, Specification Validation, Maintenance.]
Fig. 6. Extended automatic programming paradigm.
In fact, it is also the basis for our approach to maintenance. As we described previously, we believe that maintenance should be performed by modifying the specification and rederiving the implementation. This strategy is ideal when the implementation is fully automatic because the modified specification then only needs to be "recompiled."

Replay

However, in an interactive translation approach, such as our own, while much of the effort has been automated, there is still considerable human involvement. This makes it unfeasible to rederive the entire implementation each time the specification is modified. However, most of the implementation decisions remain unchanged from one cycle to the next, so rather than rederiving the implementation from the specification, we modify the Paddle program which generated the previous implementation to alter only appropriate implementation decisions and then "replay" (i.e., rerun) it to generate the new implementation.

Reimplementing a specification by modifying the formal development (the Paddle program) and replaying it represents the final elaboration to our extended automatic programming paradigm (see Fig. 6). The formal development becomes an output of interactive translation in addition to the low-level specification which is fed to the automatic compiler. This formal development is also an additional input to the next cycle of interactive translation, in which it is modified and replayed to generate the next version of the low-level specification.

This replay mechanism, and the formal development on which it is based, is the means by which this paradigm still eliminates the two fundamental flaws in the current software paradigm without requiring the unachievable goal of full automation. The flaw of informal and undocumented translations is eliminated by the formal development which formally documents and records the optimization decisions made by the human developer, thus making them accessible to other developers for later maintenance or enhancement of the system. The flaw of trying to modify optimized source code is eliminated by replay, which provides enough automation of the optimization process that a new implementation can feasibly be rederived directly from the specification.

An earlier version of this extended automatic programming paradigm, which did not differentiate between the interactive translation and automatic compilation phases, was adopted for the Knowledge-Based Software Assistant [21] and included as a long-range objective of the Stars program [6].
and the fact that the structure of these systems is regular ceptable. Annotations allow the user to select among sev-
and declarative, has allowed us to match the specification eral predefined implementations, or to define new, spe-
language with the automatic compiler and (attempt to) cial-purpose ones. The AP5 compiler is concerned with
eliminate the need for interactive translation. The basis translating specifications which use the abstract knowl-
for this effort is Swartout's thesis [26], [29], in which he edge base interface to access and manipulate logical data
observed that good explanations of the behavior of a sys- into efficient programs [12]. The term "virtual database"
tem required not only an ability to paraphrase the code of has been coined to describe such data and knowledge bases
the system, but also an ability to describe the relationship which logically hold and manage data, which are actually
of the implementation to the (possibly implicit) specifica- represented heterogeneously and directly accessed and
tion from which it was derived and the rationale behind manipulated by compiled database applications. We ex-
the design decisions employed in the translation. pect large improvements in efficiency in such applications
Rather than attempting to extract this much information over the homogeneous and actual (as opposed to virtual)
from a human designer, he built a special-purpose auto- interface currently employed in AP3 programs.
matic programmer which dropped mental breadcrumbs
along the way (recorded its design decisions in a devel- FORMALIZED SYSTEM DEVELOPMENT
opment history). We are extending this automatic com- Nearly three years ago, we confronted the realization
piler to build rules from domain principles'applied to spe- that although we had broadly explored the extended au-
cific domain facts and to integrate rules together into tomatic programming problem and conceived a new par-
methods. Each such capability that the automatic compiler adigm for automated software development, all of our in-
can handle can be removed from the input specification. sight and experience was based on pedagogical examples.
Since so little effort has yet occurred in specifying rather None of our tools or technology had progressed to a usable
than programming expert systems, considerable progress state.
can be made before we reach the limits of our ability to We believe very strongly in the evolution notion,
extend the automatic compiler. embedded in our software paradigm, that large, complex
systems cannot be predesigned, but rather, must grow and
Annotations evolve on the basis of feedback from use of earlier simpler
Our annotation effort is exploring a quite different ap- versions. We know that our insights and perceptions are
proach to automatic compilation. Rather than extending necessarily limited and preliminary, and that only by ac-
the compiler's ability to make optimization decisions au- tually experiencing such an environment would they sig-
tomatically, we are attempting to extend its ability to carry nificantly deepen. We also felt that enough pieces of tech-
out the decisions of a human designer. These decisions are nology had been developed that we could begin to benefit
represented as declarative annotations of the specifica- from their use.
tion. The human designer is interested only in the even- We therefore committed ourselves to building an oper-
tual achievement of these decisions, not in interactively ational testbed capable of supporting our own software de-
examining the resulting state and using it as the basis for partment. This testbed will be used to experiment with
further decisions. tools and technologies for supporting our software para-
Thus, this annotation approach is really the logical limit digm and for studying alterations to it. The testbed be-
of our attempts to get more declarative specifications of came operational in January 1984 and, since then, is the
the formal development. Our reason for considering it here only system we have used for all our computing needs. 'By
is that it basically trivializes the interactive translation the end of this year it will be available to the research
phase to make syntactic annotations while enhancing the community as a framework for development and evolution
role of the automatic compiler, which follows the guidance of Common Lisp systems.
provided by those annotations. To build this testbed, called formalized system devel-
We believe that the annotation approach has great prom- opment (FSD) [2]), we had to overcome two major obsta-
ise because it enables the human designer to make deci- cles to using our new software paradigm. The first was
sions in a nonoperational way, which facilitates enhanced establishing a connection between our high and low levels
replay, decreases the need to interactively examine inter- of specification. Because we attempted to minimize the
mediate states of the implementation, and allows the au- difference between the way we think about processes and
tomatic compiler to consider these decisions in any order, the way we write about them, we created a large gap be-
or even jointly. tween these two levels of specification. We still have a long
Our first target of opportunity for annotation is repre- way to go in creating an appropriate interactive translation
sentation selection. Towards this end, we are constructing framework, stocking it with a suitable and comprehensive
a new knowledge representation system, AP5 [13], to re- set of translations, and providing an adequate level of au-
place AP3. Besides cleaning up the semantics of AP3, its tomation to obtain a usable interactive translation facility,
main purpose is to allow heterogeneous representations of even for quite sophisticated programmers.
knowledge. AP5 defines relations, its basic unit of repre- This lack of connection has bedeviled' our efforts right
sentation, as an abstract data type. Any implementation from the start (with SAFE [8]) and has been the major
which satisfies the formal interface thus defined is ac- impediment to our use of this new paradigm. It has iso-
1266 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-I1, NO. 11, NOVEMBER 1985
lated our high-level specification language from our tech- ifies, through a knowledge-base pattern, some relation-
nology and relegated its use to pedagogical examples. ships that are always supposed to be satisfied. However,
Overcoming this problem is simple, but philosophically when this pattern is violated, the consistency rule at-
painful. By suitably lowering the level of our "high-level" tempts to "patch" the state, via a set of repair rules that
specification (hereafter called the prototype specifica- have access to both the old and the new state. If it is suc-
tion), we can bring it close enough to the lower level of cessful in reestablishing the relationship in the new state,
specification to bridge via interactive translation. The cost then the computation continues (with the extra modifica-
of such a lowering of the input prototyping specification tions made by the repair rule merged with those of the
is that more optimization issues must be handled manually original transition as a single atomic step, so that the in-
outside the system, thus complicating maintenance. termediate state, which violated the consistency rule, is
We decided to initially limit interactive translation to not visible). This allows the system to automatically main-
providing annotations for the automatic compiler (as dis- tain consistency relationships expressed via these rules in
cussed in the annotations section). We believe we cur- an electronic-like manner. Like Prolog [22] rather than
rently understand how to annotate data representations, spreadsheets, these rules can involve logical as well as nu-
inference rules, and demon invocations, and how to com- meric relationships, and can be used multidirectionally.
pile those annotations. This then defines the initial differ-
ence between the prototyping and low-level specifications. Prototyping the Prototype
The rest of the high-level specification is therefore de- The current status of the FSD testbed is that it is fully
termined by the capabilities of our automatic compiler. operational and in daily use within our group at ISI. It
This brings us to our second major obstacle to using our supports the semantics of the prototyping specification
new software paradigm: automatic compilation. The issue language described above. However, the syntax of this
is both what can be compiled and what constitutes an ad- language is not yet in place. Instead, we currently write
equate level of optimization. in a macro and subroutine extension to AP3.
From the very start of our efforts, we were writing our We are in the process of converting FSD to AP5. This
tools in terms of a knowledge representation system, FSD will put the second phase of the automatic compiler in
[18], which directly incorporated many aspects of our place. We will then bring up the first phase, which trans-
specification semantics. We found that AP3's direct im- lates the Gist-like syntax of the prototyping specification
plementation of these semantic constructs, while far from language into AP5. Both of these phases should be oper-
what could be produced by hand, still provided adequate ational within a few months.
response. Major portions of the FSD testbed are con- We originally intended FSD to be just a testbed for soft-
structed this way, with the same acceptable performance result. We therefore decided that, in general, this is an appropriate target level for our automatic compiler. In places where further optimization is required, we will use annotations to direct the automatic compiler. This compiler will actually be a two-phase compiler. The first phase will translate the annotated low-level specification language into calls on the AP5 knowledge-base system. The AP5 compiler will then use the annotations to perform any further indicated optimizations. A major reason for our shift from AP3 to AP5 is the latter's capability to compile annotations into heterogeneous representations. We believe that this additional, directed level of optimization will enable us to write the entire FSD system in the prototyping specification language.

We have not yet addressed the simplifications required to convert Gist into a compilable prototyping language. These simplifications are defined by the capabilities of the first phase of the automatic compiler. Because so much of the semantics of Gist is directly captured by AP5, most of the compilation is straightforward. The remaining difficult aspects are all related to specific language constructs.

The most problematical is Gist's constraints. These constraints limit nondeterministic behavior so that the constraints are never violated. This requires indefinite look-ahead, and no effective general implementation is known. Instead, this construct has been replaced by two lower-level ones: a simple constraint which merely signals an error when its pattern is violated, and a consistency rule. Like the simple constraint, the consistency rule spec-

ware development. However, when we considered the semantic base we were installing for it, we found nothing in it that was particular to software development. Rather, this semantic base defined the way we wanted to interact with computer systems in general. We therefore decided to extend FSD from a software development environment to a general computing environment in which we could also read and compose mail, prepare papers, and maintain personal databases, because such an extension merely involved writing these additional services. Furthermore, the creation and evolution of these extra services would give us an additional insight into the strengths and weaknesses of our software development capabilities.

The FSD environment presents the user with a persistent universe of typed objects (instead of a file-based world), which can be accessed via description (i.e., associatively) and modified by generic operations. All the metadata defining the types, attributes, inheritance, viewing, and printing mechanisms are also represented as manipulable objects in this persistent universe. Consistency rules invisibly link objects to one another so that the consistency conditions are automatically maintained as the objects are modified, either directly by the user or by some service. Automation rules (demons) provide a way of transferring clerical activities to the system so that it gradually becomes a specialized personal assistant.

More and more of our tools and technology are being integrated into this method. This effort is providing us with a wealth of prototyping experience for use on later versions. Our long-term goal is to understand the rules
BALZER: PERSPECTIVE ON AUTOMATIC PROGRAMMING 1267
which govern (or should govern) software development and evolution, and to embed them within our system and paradigm.

SUMMARY

We have incrementally elaborated the simple notion of an automatic compiler into a paradigm for automated software development. These elaborations recognize that
1) the specifications have to be acquired and validated;
2) this validation requires an operational semantics;
3) an interactive translation facility is needed to obtain the lower-level specification that can be automatically compiled;
4) the decisions employed in that interactive translation must be documented and recorded in a formal structure;
5) this formal development is the basis for a replay facility which enables implementations to be rederived from the revised specification on each maintenance cycle.

We showed how this paradigm, even without full automation, eliminated the two fundamental flaws in the current paradigm.

We also described the wide variety of research we have undertaken in support of this paradigm and characterized our successes and failures. Our chief successes are
1) Gist, its validation (via paraphrase, symbolic evaluation, and behavior explanation);
2) Paddle (formal development structure);
3) explication of the paradigm itself;
4) development of FSD (operational testbed).

Our chief failures were
1) the unreadability of Gist (partially resolved by the Gist paraphraser);
2) the inability to translate Gist (even interactively) into an automatically compilable form;
3) the fact that all our tools were laboratory prototypes, which only worked on pedagogical examples and did not interface with one another (partially resolved by FSD).

The chief remaining open research issues are
1) incremental specification;
2) replay;
3) automatic compilation leverage provided by a narrower focus (explainable expert systems) and by annotations;
4) explicating the rules that govern software development and evolution, and embedding them within an operational testbed (FSD).

ACKNOWLEDGMENT

The work reported here is the result, over nearly 15 years, of many colleagues who have participated with the author in this joint endeavor to understand, formalize, and automate the software process. He is grateful to all of these colleagues for making this effort so exciting, rewarding, and productive. He is particularly indebted to N. Goldman for his contributions to SAFE and the formalization of Gist, D. Wile for his development of Popart and Paddle, B. Swartout for bringing explanation into our group and exploring the use of our paradigm for developing expert systems, D. Cohen for his work on symbolic evaluation and the development of AP5, and M. Feather for developing Gist transformations and formalizing its semantics. Finally, he would like to thank all the members of his group for their help in making FSD an operational testbed in which they can explore our software paradigm and the utility of their tools and technology.

REFERENCES

[1] R. Balzer, D. Cohen, M. Feather, N. Goldman, W. Swartout, and D. Wile, "Operational specification as the basis for specification validation," in Theory and Practice of Software Technology, Ferrari, Bolognani, and Goguen, Eds. Amsterdam, The Netherlands: North-Holland, 1983.
[2] R. Balzer, D. Dyer, M. Morgenstern, and R. Neches, "Specification-based computing environments," in Proc. Nat. Conf. Artif. Intell. AAAI-83, Inform. Sci. Inst., Univ. Southern Calif., Marina del Rey, 1983.
[3] R. Balzer, "A global view of automatic programming," in Proc. Third Int. Joint Conf. Artif. Intell., Aug. 1973, pp. 494-499.
[4] R. Balzer and N. Goldman, "Principles of good software specification and their implications for specification languages," in Proc. Specifications Reliable Software Conf., Boston, MA, Apr. 1979, pp. 58-67; see also, -, in Proc. Nat. Comput. Conf., 1981.
[5] R. Balzer, "Transformational implementation: An example," IEEE Trans. Software Eng., vol. SE-7, pp. 3-14, Jan. 1981; see also, -, Inform. Sci. Inst., Univ. Southern Calif., Marina del Rey, Res. Rep. RR-79-79, May 1981.
[6] R. Balzer, C. Green, and T. Cheatham, "Software technology in the 1990's using a new paradigm," Computer, pp. 39-45, Nov. 1983.
[7] R. Balzer, "Automated enhancement of knowledge representations," in Proc. Ninth Int. Joint Conf. Artif. Intell., Los Angeles, CA, Aug. 18-23, 1985.
[8] R. Balzer, N. Goldman, and D. Wile, "Informality in program specifications," IEEE Trans. Software Eng., vol. SE-4, pp. 94-103, Feb. 1978.
[9] F. L. Bauer, "Programming as an evolutionary process," in Proc. Second Int. Conf. Software Eng., Oct. 1976, pp. 223-234.
[10] T. E. Bell and D. C. Bixler, "A flow-oriented requirements statement language," in Proc. Symp. Comput. Software Eng., MRI Symp. Series. Brooklyn, NY: Polytechnic Press, 1976; see also, -, TRW Software Series, TRW-SS-76-02.
[11] D. Cohen, "Symbolic execution of the Gist specification language," in Proc. Eighth Int. Joint Conf. Artif. Intell., IJCAI-83, 1983, pp. 17-20.
[12] D. Cohen and N. Goldman, "Efficient compilation of virtual database specifications," Jan. 1985.
[13] D. Cohen, AP5 Manual, Inform. Sci. Inst., Univ. Southern Calif., Marina del Rey, draft, 1985.
[14] M. Feather, "Language support for the specification and development of composite systems," draft, Jan. 1985.
[15] -, "Formal specification of a source-data maintenance system," Working Paper, Nov. 6, 1980.
[16] -, "Formal specification of a real system," Working Paper, Nov. 21, 1980.
[17] -, "Package router description and four specifications," Working Paper, Dec. 1980.
[18] N. Goldman, AP3 Reference Manual, Inform. Sci. Inst., Univ. Southern Calif., Marina del Rey, 1983.
[19] N. M. Goldman, "Three dimensions of design development," Inform. Sci. Inst., Univ. Southern Calif., Marina del Rey, Tech. Rep. RS-83-2, July 1983.
[20] N. Goldman and D. Wile, "Gist language description," draft, 1980.
[21] C. Green, D. Luckham, R. Balzer, T. Cheatham, and C. Rich, "Report on a knowledge-based software assistant," Rome Air Develop. Cent., Tech. Rep. RADC-TR-83-195, Aug. 1983.
[22] R. Kowalski, Logic for Problem Solving. Amsterdam, The Netherlands: Elsevier North-Holland, 1979.
[23] P. London and M. S. Feather, "Implementing specification freedoms," Inform. Sci. Inst., Univ. Southern Calif., Marina del Rey, Tech. Rep. RR-81-100; see also, -, Sci. Comput. Programming, 1982.
[24] J. Mostow, "Why are design derivations hard to replay?" in Machine Learning: A Guide to Current Research, T. Mitchell, J. Carbonell, and R. Michalski, Eds. Hingham, MA: Kluwer, 1986, to be published.
1268 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-11, NO. 11, NOVEMBER 1985
Abstract-This paper reports on efforts to extend the transformational implementation (TI) model of software development [1]. In particular, we describe a system that uses AI techniques to automate major portions of a transformational implementation. The work has focused on the formalization of the goals, strategies, selection rationale, and finally the transformations used by expert human developers. A system has been constructed that includes representations for each of these problem-solving components, as well as machinery for handling human-system interaction and problem-solving control. We will present the system and illustrate automation issues through two annotated examples.

Index Terms-Knowledge-based software development, program transformation systems.

I. INTRODUCTION

In a previous issue of this TRANSACTIONS, Balzer presented the transformational implementation (TI) model of software development [1]. Since that article appeared, several research efforts have been undertaken to make TI a more useful tool. This paper reports on one such effort.

A general model of software transformation can be summarized as follows:1 1) we start with a formal specification P (how we arrive at such a specification is a separate research topic), 2) an agent S applies a transformation T to P to produce a new P, 3) step 2 is repeated until a version of P is produced that meets implementation conditions (e.g., it is compilable, it is efficient). Fig. 1 presents a diagram of the model.

In the TI model, the specification language P is Gist [12], the agent S is an expert human developer, and T is taken from a catalog of correctness-preserving transformations2 (Section II discusses other bindings of P, S, and T). Hence, the human is responsible for deciding what should be transformed next, and what transformation to use; the system checks the transformation's preconditions and applies it to produce a new state. As Balzer noted in [1], the TI model provides at least two advantages.

1) Focus is shifted away from consistency problems and towards design tradeoffs.
2) The process of developing a program is formalized as a set of transformations. Thus, the process itself can be viewed as an object of study.

Since Balzer's article appeared, we have attempted to use the TI model on several realistic problems. This work has confirmed one of the article's conjectures:

". . . it is evident that the developer has not been freed to consider design tradeoffs. Instead of a concern for maintaining consistency, the equally consuming task of directing the low level development has been imposed. While the correctness of the program is no longer an issue, keeping track of where one is in a development, and how to accomplish each step in all

Manuscript received January 2, 1985; revised June 19, 1985. This work was supported by the National Science Foundation under Grants MCS-7918792 and DCR-8312578.
The author is with the Department of Computer Science, University of Oregon, Eugene, OR 97403.
1A survey and more detailed discussion of transformation systems can be found in [15].
2Correctness rests both on Gist's formal semantics and on preconditions attached to transformations. Proofs of correctness are carried out by inspection; there has been no attempt to date to apply a more formal proof method.
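The general transformation model in the Introduction (steps 1-3, with the agent deciding what to transform while the system checks preconditions and applies the transformation) can be sketched as a small control loop. This is only an illustrative sketch: the `Transformation` and `develop` names, and the toy string-rewriting example, are hypothetical and are not part of TI, Gist, or any system described in the paper.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Transformation:
    """A correctness-preserving rewrite with an applicability check."""
    name: str
    precondition: Callable[[str], bool]  # checked by the system before application
    apply: Callable[[str], str]          # produces the new version of P

def develop(p, choose, done, max_steps=100):
    """Step 2 of the model, repeated: the agent (`choose`) decides which
    transformation to use next; the system checks its precondition and
    applies it, until P meets the implementation conditions (`done`)."""
    for _ in range(max_steps):
        if done(p):
            return p
        t = choose(p)                    # agent S selects T (a human, in TI)
        if not t.precondition(p):
            raise ValueError(f"precondition of {t.name} does not hold")
        p = t.apply(p)                   # apply T to produce a new P
    raise RuntimeError("implementation conditions not reached")

# Toy demonstration: rewrite a "specification" string one construct at a
# time until no high-level construct remains.
lower = Transformation(
    name="lower-one-construct",
    precondition=lambda p: "HIGH" in p,
    apply=lambda p: p.replace("HIGH", "LOW", 1),
)
result = develop("HIGH HIGH",
                 choose=lambda p: lower,
                 done=lambda p: "HIGH" not in p)
```

In the TI model proper, `choose` is an expert human developer and each `Transformation` comes from a catalog of correctness-preserving transformations; the division of labor in the loop mirrors the paper's point that the system, not the agent, enforces preconditions.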