Computer Language Advances †

         Daniel E. Cooke 1,2, Michael Gelfond 1, and Joseph E. Urban 3

         1 Computer Science Department, Texas Tech University
         2 NASA Ames Research Center
         3 Computer Science and Engineering Department, Arizona State University

         † Research sponsored, in part, by NASA NAG 5-9505.

Introduction.
The beginning of modern computer science can be marked by Alan Turing’s
paper, which, among other advances, showed that there are unsolvable
problems. [1] Within the infinite set of solvable problems there is a set of
problems that are technically feasible to solve and a set that are not (e.g.,
problems that would take a supercomputer of today 10,000 years to solve).
Thus, from the beginning, computer science has been viewed as the
science of problem solving using computers.
    One category of computer science research is devoted to outwardly
encroaching upon the set of infeasible problems. Advances in computer
hardware, for example, lead to more powerful computers that execute faster
and have larger stores of memory. Advances in hardware are meant to result
in improvements in the raw computing power of the devices that can be
brought to bear in order to solve more complex problems. Research
efforts focusing on complexity and algorithms are likewise concerned
with outwardly encroaching upon the set of intractable problems.
    In general, this first category of computer science research is devoted to
making complicated problems more technically feasible to solve. Problems
that are technically feasible are those for which existing algorithms can
obtain an answer in some reasonable time using currently available computer
technology. The second major category of computer science research has to
do with making it humanly feasible to solve more complicated problems.
Human feasibility has to do with the level of difficulty one faces in finding a
solution to a problem.
     Some of the efforts in the second category of research focus on the
development of approaches that delegate some of the complexity of certain
activities to easy-to-use tools, such as database management systems,
operating systems, networking tools. Other approaches to the second
category focus on problem representation. These areas of research include
artificial intelligence and software engineering. Central to all efforts to
improve upon the human’s ability to solve more complicated problems is
computer language research.
     This chapter focuses on language design from a scientific point of view.
To do so, a scientific backdrop will be developed. Then the notions of
language design from a theoretical and experimental point of view are
outlined. From there, the history of language design efforts is reviewed,
first from the evolutionary and then from the revolutionary vantages. The
chapter ends with a brief introduction of recent results obtained by one of the
authors.


Scientific Endeavors.
In most sciences there are two organized research fronts: theoretical and
experimental. Experiment provides a wealth of scientific observations that
eventually must be distilled into a compact representation. The distilled,
compact representation is a theory – a formulation of the body of knowledge
gained by experiment that concisely represents the larger body of
knowledge. Once formulated, the theory is employed in the design of future
empirical studies to see if the theory predicts the outcomes of the
experiments. Future experiments may confirm the theory – leading to
greater confidence in it – or they may falsify the theory – indicating that the
theory may be incorrect. Results falling between confirmation and
falsification may indicate that the theory needs some modification towards
improvement.
     Based upon the interaction between the theoretical and experimental
communities of the physical sciences, Kuhn, in his now classic text The
Structure of Scientific Revolutions, identified two types of theoretical
science. [2] The first category of theoretical research (called ordinary
science) attempts to modify, extend, and/or refine theories so that they do a
better job of explaining empirical results that are not explained by the
unrefined theories. The second category (called extraordinary science)
takes place when a radically new view (or theory) is proposed and it is seen
that the theory explains empirical evidence, subsuming the old theory and all
of its subsequent refinements. Extraordinary science results in paradigm
shifts and in scientific revolutions.



Theoretical Language Research.
A programming language serves as the central part of a theory about how
one best approaches problem solving with the aid of computers. The
language is central to the theory because it is the basis for finding and
communicating problem solutions to machines or other people. A language
is meant to facilitate one’s ability to state and structure problem solutions.
Possibly the most important aspect of the language is the fact that it alone
manifests the abstraction – or the approach to problem solving – provided by
the theory. Wirth noted that “One of the most crucial steps in the design of
a language is the choice of abstraction upon which programs are to base.” [3]
Abstraction is key to an understanding of language and language design.


Abstraction.
An abstraction is commonly viewed as the act of taking away – the
formation of an idea apart from the concrete existence or specification
thereof. A computer abstraction can be viewed as the result of dropping out
nonessential, complicating details. An abstraction defines that which is
salient to the problem solution and, in doing so, identifies the details of
previous abstractions that become extraneous and can be ignored.
     The original model of computing required the physical rewiring of the
CPU to alter the CPU's actions. A revolution in abstraction came with the
advent of the stored program concept, also known as the Von Neumann
architecture. The Von Neumann approach dropped out the original
distinction between the storage of data and the storage of a program.
Assembler languages quickly followed. These languages dropped out many
nonessential details including the need to keep track of physical addresses.
Macro definitions for oft-repeated functions such as input-output improved
the assembler abstraction by removing many nonessential details from the
programmer’s concern. In the case of input-output on the IBM 360/370, the
details hidden are those involved in executing channel programs. The GET
and PUT macros essentially hide a surfeit of nonessential details: without
these macros the programmer has to write programs that run on the I/O
channel processors together with the mainframe program code required to
initiate the I/O transfers.
     FORTRAN was a major step in abstraction improvement because it
provided a view of problem solving where one ignored many machine level
details such as register usage and memory management. FORTRAN's
abstraction moves the programmer away from machine operations - pointing
the programmer towards algebraic operations. Languages like Algol and
Pascal improved the FORTRAN abstraction in a significant way through the
addition of prominent control structures and by placing data structure design
on a level equal to algorithm design. In other words, with the newer
abstractions the design of the data structure comes to be viewed as a critical
element in problem solving. To illustrate the significance of the data
structure, imagine trying to convert an arbitrary postfix expression into an
equivalent prefix expression without the use of an advanced data structure.
The Pascal abstraction eventually led to the view that the programmer's
product is a data product rather than a program that produces that data
product.
     The object-oriented approach to programming is less of an abstraction
change in that this approach adds, rather than deletes, technical detail. From
an historical viewpoint one might view the relationship between the
procedural and object-oriented approaches as analogous to the relationship
between the assembler and the macro-assembler languages. In other words,
the following analogy may indeed be an appropriate one to consider: macro-
assembly : assembly :: object-oriented : procedural.
     The object-oriented approach was motivated by the fact that procedural
languages lacked a "middle ground" between local and global variables. The
lack of this middle ground led to technical difficulties in the implementation
and reusability of data structures. The difficulty was addressed by Parnas's
definitions of information hiding (a form of hiding variables beyond the
hiding afforded by local variables). [4] These problems were solved with the
addition of features to encapsulate data structures. Once one encapsulates
data structures, it is a logical step to encapsulate program structures. Reuse
of the so-called objects is facilitated through an inheritance mechanism.
Object-oriented programming is a natural, evolutionary step that adds
features to the procedural language abstraction in order to solve technical
problems that arise in the use and reuse of data and program structures.
    Other changes to the procedural abstraction have been made in order to
accommodate concurrent programming. Languages like Ada, Modula, and
Linda have added some notion of multitasking in a manner that interacts
with existent program structures for procedures and functions.
    Java is considered by many to be the best designed object-oriented
language. Java possesses an environment that provides one with the ability
to develop graphical user interfaces, applets for web-based programming,
concurrent processes, etc. As such, Java embodies a superset of the
significant extensions and modifications that characterize the evolution of
the procedural languages.
    Apart from the evolution of procedural programming, completely new
ways to view problem solving have arisen from work in functional [5], logic
[6], and collection-oriented languages [7,8]. These improvements have
sought to change the view of programming altogether. They can be viewed
more as revolutionary changes as opposed to the more characteristic
evolutionary changes described above.

     A programming language provides the fundamental level of abstraction
for problem solving with a computer. It demarcates the point where the
human leaves off and the architecture takes over, in the process of translating
the problem solution into a set of actions to be performed by the computer.
Language compilers or interpreters perform all further translation for us.
    The fundamental level of abstraction is very important. It defines the
basis for the way we organize our activities to solve problems, i.e., it is the
basis for the software process model.


What constitutes a computer language theory?
As a theory for problem solving, a computer language should satisfy certain
objective requirements. The requirements are important to know, because
they form an important measure of the significance of a language as a
theory. Before a community is willing to design and perform experiments
related to a theory, the theory should satisfy requirements, which indicate
that the experiments are warranted and worthwhile. In this section
requirements that a computer language should satisfy are presented.
One requirement of a computer language is that it be unambiguous. To
satisfy this requirement there must be a mathematical definition of the syntax
and semantics of the language. The syntax of a language provides a precise
definition of the rules employed in order to construct grammatically correct
(i.e., valid) sentences in the language. Backus-Naur Form is commonly used
for simple syntax definitions. More complex syntactic features are
frequently captured by attribute grammars, another form used in syntax
definition. Attribute grammars permit the description of semantic
constraints such as data typing constraints.
     The semantics of a computer language should provide a precise
definition of the meaning of a sentence stated in the language. Semantics
can be stated using the denotational, axiomatic, or operational approaches, or
through the use of other formal languages.
     The Backus-Naur Form allows for the construction of a finite set of rules
that define how any element of an infinite set is constructed. For example,
the following finite set of recursive rules defines an infinite set of elements:
        T ::= zero | succ(T)

   These rules define a set of syntactic objects, which belong to the lan-
guage:

        T = {zero, succ(zero), succ(succ(zero)),…}

    Now consider the set of natural numbers:
        N = {0,1,2,3, …}

    Formal semantics are given as a function definition. The function maps
from the syntactic set T to the semantic set N, i.e., m: T → N. Equations
define the function m. For example, consider the following equations, which
perform the simple mapping from elements of T to elements of N:

        m(zero)                      =        0                   (1)
        m(succ(T))                   =        +(m(T),1)           (2)


    Using equations (1) and (2), a precise (i.e., unambiguous) mapping can
be developed from a syntactic to a semantic object:
        m(succ(succ(zero)))
             =  +(m(succ(zero)),1)          by (2)
             =  +(+(m(zero),1),1)           by (2)
             =  +(+(0,1),1)                 by (1)
             =  2

    Notice that the lefthand sides (LHS) of rules (1) and (2) refer to syntactic
structures, while the righthand sides (RHS) of the rules indicate the semantics
of those structures. Once the rules have been developed they can be
analyzed to determine if ambiguity exists in the language definition. For
example, if there are two rules with the same LHS, but differing RHS’s, then
there are two meanings for one syntactic structure. Whenever multiple
interpretations exist for the same syntactic structure, ambiguity exists in the
language.
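
    To make the mapping concrete, rules such as (1) and (2) can be
transcribed almost directly into a functional language. The following
Haskell sketch is our illustration (the names T and m mirror the formalism
but are otherwise our own):

        -- Syntactic set T: Zero, Succ Zero, Succ (Succ Zero), ...
        data T = Zero | Succ T

        -- Semantic function m : T -> N, following rules (1) and (2)
        m :: T -> Integer
        m Zero     = 0             -- (1)
        m (Succ t) = m t + 1       -- (2)

        main :: IO ()
        main = print (m (Succ (Succ Zero)))   -- prints 2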

    With the following modification to the Backus-Naur Form rule:
        T ::= zero | succ(T)
        T1::= T + T

together with additional semantic equations to define mappings from T1:
        m(zero + zero)              =            0                     (3)
        m(succ(T) + zero)           =            m(succ(T))            (4)
        m(zero + succ(T))           =            m(succ(T))            (5)
        m(succ(T1) + succ(T2))      =            +(m(T1 + T2),2)       (6)

one can provide for more sophisticated, yet equally precise mappings that
specify the addition of natural numbers. Consider the expression
succ(succ(zero)) + succ(succ(zero)):
        m(succ(succ(zero)) + succ(succ(zero)))
             =  +(m(succ(zero) + succ(zero)),2)       by (6)
             =  +(+(m(zero + zero),2),2)              by (6)
             =  +(+(0,2),2)                           by (3)
             =  4

Consider the expression succ(zero) + succ(succ(zero)):
        m(succ(zero) + succ(succ(zero)))
             =  +(m(zero + succ(zero)),2)             by (6)
             =  +(m(succ(zero)),2)                    by (5)
             =  +(+(m(zero),1),2)                     by (2)
             =  +(+(0,1),2)                           by (1)
             =  3
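
    These equations, too, transcribe directly. A Haskell sketch extending
the definition above with rules (3) through (6) (again our illustration, not
part of the original formalism):

        -- Syntactic addition terms, T1 ::= T + T
        data T1 = Plus T T

        -- Semantic function for addition, following rules (3)-(6)
        mPlus :: T1 -> Integer
        mPlus (Plus Zero Zero)           = 0                        -- (3)
        mPlus (Plus (Succ t) Zero)       = m (Succ t)               -- (4)
        mPlus (Plus Zero (Succ t))       = m (Succ t)               -- (5)
        mPlus (Plus (Succ t1) (Succ t2)) = mPlus (Plus t1 t2) + 2   -- (6)

        -- mPlus (Plus (Succ Zero) (Succ (Succ Zero))) evaluates to 3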

     The formal (i.e., unambiguous) syntax and semantic definitions of a
computer language are objective requirements that a new language should
satisfy.
     Another desirable feature of a computer language has to do with size. A
computer language should possess a small number of features that are easy
to combine in any number of ways in order to produce complex problem
solutions. Computer languages that possess a small number of combinable
features are called orthogonal. Orthogonal languages allow a software
developer to design solutions to complex problems through the interaction of
a small set of language features. The syntax of a computer language
provides important insight into the extent to which a software developer can
claim that the language is orthogonal. Consider the grammar definitions
below:
        FOR             ::= for id := E to E do ?;
        WHILE           ::= while C do ?;
        IF              ::= if C then ? else ?;
        ASSIGN          ::= id := E;
        READ            ::= readln(id);
        WRITE           ::= writeln(E);
        STATE           ::= FOR | WHILE | IF | ASSIGN | READ |
                                     WRITE | STATE; STATE

    If one replaces all the ?'s above with the nonterminal STATE, the
language has a good degree of orthogonality. The goodness is due to the fact
that one is able to nest any control structure inside any other control
structure.
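
    For example, with STATE in place of each ?, the grammar derives
arbitrarily nested fragments such as the following schematic statement,
shown in the toy language itself (C, E, and id stand for a condition, an
expression, and an identifier; punctuation is schematic):

        while C do
            if C then id := E;
            else for id := E to E do writeln(E);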
    The writability of a language has to do with how difficult it is to encode
a problem solution. In general, an orthogonal language is arguably more
writable than a nonorthogonal language. For the same reason, the
readability of the orthogonal language - the ability to decipher the problem
solution - is an improvement over nonorthogonal languages.
    The conciseness of the problem solutions that can be developed in a
language is often viewed to be an important feature of a good language.
While it is certainly true that conciseness is an important criterion,
conciseness must be considered within the context of a language’s
readability and writability. For example, one can imagine a language that
has a unique one-character symbol for each problem solution known to
humankind – obviously requiring an infinite supply of one-character
symbols. Suppose, in this language, that the symbol # represents a heapsort
program. The solution to the sort and any other problem in this language is
definitely as concise as can be envisioned. The problem with the language
has to do with its readability and writability. To be an effective problem
solver, the programmer who writes or reads the program must recall that the
# represents the heapsort program – and he/she must remember this fact as
distinguished from the potentially infinite number of other pairings between
a singular symbol and a problem solution.
     The table-lookup quality to this hypothetical language would seem to go
against the traditional efforts of computer scientists. Beginning with Alan
Turing, computer scientists have focused on the identification of a small set
of primitive elements that comprise problem solving. The design and choice
of the primitives takes into consideration the degree to which the primitives
chosen can be combined to “build up” complicated problem solutions. Thus,
the conciseness of the language itself (rather than the size of the problem
solutions written in the language), together with the degree to which the
language constructs can be combined (i.e., the orthogonality of the language)
would seem to be higher priority features of a language. If these features are
met, then the conciseness of solutions becomes a reasonable criterion for a
language.
     Another desirable feature of a programming language is that it be
capable of representing a Universal Turing Machine. Languages satisfying
this requirement are known as Turing Computable. A Turing Computable
language is one that can solve any problem that can be stated as an
algorithm, i.e., any effectively computable function. Simply put, an algorithm
is a set of easy-to-follow instructions that, given an input, will produce a
result. [9] Implied by the previous definition is the requirement that the
algorithm will "halt." Given an arbitrary input, an algorithm must produce
an answer (or result) in a finite amount of time. A language that can state
any possible algorithm is Turing Computable. A Turing Computable
language is capable of solving any algorithmically solvable problem (i.e.,
one for which there is an algorithmic solution). Therefore, a language
should be Turing Computable.
     Coherence is another desirable feature of a language. [10] Coherence
requires that language features be drawn from the same abstraction level.
Recall the quote seen earlier:
         One of the most crucial steps in the design of a language is
         the choice of abstraction upon which programs are to base.
Two sentences later, in the same text, one finds:
        [The language designer] should restrict his selection [of lan-
        guage features] from the same level which are in some sense
        compatible with each other. [3]

     Coherence of a language is an important feature insofar as one does not
want the programmer to have to understand a given program on two
differing levels of abstraction. Suppose there were a version of a procedural
language like Pascal that allowed the programmer to dip into the
assembler level of representation. If a change is made to a program in this
language, the programmer must understand the change at the Pascal and the
assembler level. In other words, the programmer must understand the
interaction of the machine code generated by the compiler and the assembler
code that is embedded in the source program.
     Regularity is yet another desirable feature in a language. Regularity
requires that language features operate in a consistent and uniform manner,
independent of context – regularity requires similar actions to have similar
consequences. One way to view regularity is from the standpoint of a
human interface. A computer system that has a "mouse" ought to have the
mouse button(s) perform in a uniform manner at all times. Suppose a mouse
has two buttons. If button X opens any available pop-up window at the
operating system level, it should do the same for any application program
one might enter from the operating system level. If button X performs
function A in one application program and function B in another, the overall
interface design is not consistent, not regular, and not particularly good.
Here we begin to migrate away from objective requirements into the realm
of nonobjective requirements.
     A qualitative measure of a language is that it be intuitive. An intuitive
language is one that allows the problem solver to follow instinct and
intuition in solving the problem. Clearly, what one person views as intuitive
is likely to be different from what the next person views to be intuitive.
Since intuitiveness is a qualitative measure, a problem solver can only
determine whether one language is more intuitive than another language
through experimentation. Experimental computer science is costly and
rarely done when it comes to language comparisons. However, if languages
can be shown to meet some reasonable number of the qualities introduced in
this section, it would be ideal if they could be compared experimentally for
the more qualitative characteristics.
Experimental Science.
There are at least two approaches to empirical language studies. The first
approach involves the careful understanding and evaluation of a language
and results in a value judgement on the part of the researcher. This first ap-
proach is not unlike the approach Wirth took in the Oberon project [see 11].
    The formalisms developed to describe languages are analogous to
differential equations, which are employed in describing and modeling
physical phenomena. Just as the study of the models of physical phenomena
has led to breakthroughs in Physics, the mathematical models describing
language approaches should be carefully studied in order to advance the
understanding of problem solving.
    The second approach to empirical language studies involves the
development of controlled experiments. The types of measurements that are
required and the variables for which controls must be established need to be
well understood. To some extent, computer scientists are like physicists
attempting to study thermodynamics without a thermometer. Nonetheless,
controlled experiments need to be undertaken.
    If software advances are to keep pace with hardware advances,
languages and other software tools must improve the productivity of
software engineers (i.e., make problems more humanly feasible to solve).
For a language to better facilitate productivity, it must provide a more
intuitive approach to problem solving. The degree of intuitiveness of a
language is a nonobjective consideration that may require controlled
experimentation. To be effective, two major forms of experimentation are
required. These forms are:
•   to determine which languages are best suited for a given type of problem solver and
•   to determine which languages are best suited for a given type or domain of problems.

    Ideally computer language research should be based upon a framework
similar to that found in the physical sciences. In the physical sciences there
is a healthy interaction between theory and experiment, where each drives
the other. To a lesser extent the same is true in Computer Science.
However, the organization of the theory and experimental communities is
not as well developed in Computer Science.
    Suppose one viewed a new language as a theory about how to best
approach a certain type of problem solving. The language would be
expected to meet certain criteria, like those listed in the previous section.
Once the theory is accepted, the language would be subjected to
experimental research where competing language approaches would be
compared empirically. The experimental activity might require sociological
and psychological testing to determine which language seems to provide the
most intuitive approach.



Evolutionary versus revolutionary
Recall the earlier discussion concerning Kuhn’s ordinary and extraordinary
science. There have been examples of extraordinary science in language
research – beginning with the Von Neumann architecture, then FORTRAN,
LISP, Algol/Pascal (to a lesser degree), SQL, and Prolog. The revolutionary
approaches having had the greatest impact are the Von Neumann
architecture and FORTRAN. Many will argue that the object-oriented
approach was and is revolutionary. However, one can argue the other side as
well – that the object-oriented approach is a refinement (and improvement)
to the procedural approach and manifests an evolutionary, rather than
revolutionary, impact.


Evolution of Computer Languages
The evolution of computer languages has been based upon the revolution
brought about by the introduction of the Von Neumann computer
architecture. This architecture was the basis for the computer language
constructs. The stored program and the program counter permit the
modification of the flow of control.       These hardware capabilities have
culminated in the sequence, selection, and iterative control constructs. The
general-purpose registers – specialized memory locations between which
arithmetic and logic circuitry exist – allow for the arithmetic and
comparative capabilities. The arithmetic construct follows from these
hardware features, as do the constructs – that when paired with the ability to
alter the program counter – allow for the ability to perform conditional
execution required for selection and iteration. Input/output ports on
computers support the input-output constructs found in computer languages.
     Because these architectural features are common to all computers and
because languages allow us to tell computers what to do, the constructs are
general to all procedural programming languages. The constructs
form categories in which every procedural programming language has
statements. Therefore, the constructs form a solid basis from which to learn
a new programming language. The evolution of languages is based upon the
executable constructs – the constructs just mentioned that permit us to tell
the computer what to do – and the nonexecutable commands to the language
translator. The nonexecutable constructs provide for program and data
structure definition.
     FORTRAN, a true revolutionary approach to programming, provided the
definitive version of the assignment statement and the arithmetic expression.
No language since FORTRAN has altered the basic form of the arithmetic
expression or the assignment statement. In other words, if one knows the
operational semantics and basic syntactic forms of FORTRAN’s assignment
statement, then the problem solver will not need to relearn this type of
statement or expression in any new language. The only differences will be
syntactic nuances.
     Likewise, languages like Algol and Pascal defined – for the most part –
the remaining executable constructs. The input-output and the sequential
control structures were established in a definitive way by the forms they took
in Algol and Pascal. Pascal’s stream I/O – as opposed to a strictly formatted
I/O – has not been subsequently improved upon in any revolutionary way.
Similarly, the block structured sequences, selective (e.g., if-then-else and
case or switch statements), and the iterative (e.g., the while and the for
statements) constructs have not been modified in a major way by subsequent
languages. The only major changes in control constructs are the addition of
parallel control structures found in many languages, including Concurrent
Pascal, Modula, Ada, and Java; and the way in which encapsulated
program structures are declared, instantiated, and invoked.
     Encapsulation of program structures – though – is arguably less a change
in control structures and more of a change in the commands a programmer
gives to a language interpreter or compiler in order to establish data and
program structures. The view taken here is that in terms of the procedural
paradigm (including the paradigm’s Object-Oriented extensions), the
declaration of program and data structures – the set of nonexecutable
constructs – is the one area of a language that continues to undergo
substantive evolutionary change.
Revolution in Computer Languages
Already mentioned are the revolutionary advances that accompanied
FORTRAN’s introduction. The dropping away of many machine details
(e.g., memory and register usage) truly changed the ground upon which the
programmer stands when solving problems. Algol and Pascal improved
upon this ground in a semi-revolutionary way by having the programmer
regard the data structure as being on par with the algorithm in problem
solving.
     LISP was a significant advance as well. [12] Problem solving was
reduced to the interaction between two primitive features – recursion and the
list. The simple elegance that ensued with LISP provided the programmer
with a view of programming that is characterized by a simple and natural
approach. The data structures that one composes in LISP are robust in terms
of their growth and the fact that they can be heterogeneous. The revolution
brought about by LISP allowed the programmer a view of programming
based upon lambda calculus.        Other significant advances in functional
programming include the languages FP [5] and Haskell [13].
     Prolog – another revolutionary language – allowed the programmer to
view programming based upon First Order Predicate Calculus. [6] The
language features allow for simple, concise, and elegant solutions to many
problems. For example, one can implement a stack data structure with the
simple clause:
   push_pop(Top, Stack, [Top|Stack]).

Consider the following trace using this definition:

   ?- push_pop(9, [4,5,6,7], Stack).
   Stack = [9, 4, 5, 6, 7]
   Yes

   ?- push_pop(Top, Stack, [9,4,5,6,7]).
   Top = 9
   Stack = [4, 5, 6, 7]
   Yes

   ?-
Concurrency and Parallelisms
Hoare’s Communicating Sequential Processes provides a significant formal
approach to concurrent and parallel programming. [14] Hoare’s approach
provides formalisms for loop unfolding and concurrent processing of other
iterative structures as well as the ability to synchronize concurrent tasks by
extending Dijkstra’s Guarded Commands and his Semaphores. [15]
Communicating Sequential Processes led to languages like OCCAM and
provided true insight into concurrent processing – influencing many
subsequent languages.
     Conventional approaches to concurrent programming (including Linda
[16]) add features to the existing procedural and Object-Oriented approaches
to programming. Adding features complicates these paradigms. Newer
approaches to concurrent programming attempt to provide more intuitive
approaches to parallelisms resulting in programs that are easier to write and
easier to understand. Many of these approaches are based upon collection-
oriented languages that provide for aggregation operators like those
pioneered in [17, 18].
     These newer approaches to parallel programming include [19, 8, 20, 21,
22, 23]. These approaches provide a way to indicate that functions being
invoked are to execute concurrently. For example, suppose one wishes to
compute the square of a list of numbers. In NESL one can specify this
computation as follows [8, p. 90]:
   {a*a : a in X }

where X is defined to be a list of numbers such as [3, -4, -9, 5]. NESL's
computational model can be viewed architecturally as a Vector Random
Access Machine (VRAM). [8] The VRAM is a virtual model of
computation consisting of an unlimited number of processors π that have
simultaneous access to a shared memory. The assignment of a computation
χ(x) to a particular processor πi can be denoted πi(χ(x)). In the NESL
program statement above, the {}'s specify concurrent execution. The
program statement above specifies the parallel computation of the square of
each element of X:
        {a*a : a in [3, -4, -9, 5] }  =
                  [πi (3*3), πi+1(-4*-4), πi+2(-9*-9), πi+3(5*5)] =
                  [9, 16, 81, 25]
In the VRAM model, each processor obtains its data from and produces its
result in the shared memory. Each processor πi is viewed as a conventional
processor. NESL provides an excellent language for the specification of
data parallelism, promoting a divide-and-conquer approach to problem
solving. SequenceL, presented below, provides a more implicit approach
to parallelisms. The problem above can be specified in SequenceL
as follows:
        *([[3, -4, -9, 5], [3, -4, -9, 5]])

resulting in the following concurrent computations:
        [ *([3,3]) || *([-4,-4]) || *([-9,-9]) || *([5,5]) ] =
        [9,16,81,25]
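
     The common pattern in both examples is a data-parallel map. Purely as
a point of comparison (this is neither NESL nor SequenceL), the same
computation can be sketched in Haskell with the parallel package, which
evaluates the list elements concurrently:

        import Control.Parallel.Strategies (parMap, rdeepseq)

        -- Square every element; parMap sparks one parallel evaluation
        -- per element, mirroring {a*a : a in X}.
        squares :: [Int] -> [Int]
        squares = parMap rdeepseq (\a -> a * a)

        main :: IO ()
        main = print (squares [3, -4, -9, 5])   -- [9,16,81,25]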

     Like NESL, the SequenceL language is based upon a shared memory
architecture. In general, the history of language development has deeply
affected the definition of SequenceL. Recall that one of the first major
advances in Computer Language Design was the elimination of the
distinction between program and memory. This advance eventually led to
the paradigms where data is stored in named locations and algorithms can
define the locations through the use of assignment and input-output
statements. Advances also led to the development of the control flow
constructs that are used to define and process data structures. These
constructs are the sequence, selection, iteration, and parallel constructs.
     The SequenceL paradigm provides no distinction between data and
functions/operators. Furthermore, all data is considered to be nonscalar, i.e.,
sequences. SequenceL has an underlying control flow used in the evaluation
of SequenceL operators. In the underlying Consume-Simplify-Produce-
cycle (i.e., CSP-cycle), operators enabled for execution are Consumed (with
their arguments) from the global memory, Simplified according to
declarative constructs that operate on data, and the resulting simplified terms
are then Produced in the global memory. This CSP-cycle, together with the
declarative SequenceL constructs (applied during the simplification step),
implies control flow structures that the programmer does not have to design
or specify.
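
     The cycle itself is easy to picture. The following toy Haskell sketch is
our illustration of the control idea only (the real simplifier applies the
SequenceL constructs described below): each cycle consumes every term in
the tableau, simplifies each enabled term one step, and produces the results
back into the tableau until nothing changes:

        data Term = Lit Int | Plus Term Term deriving Eq

        -- One simplification step: an operator whose arguments are ready
        -- (i.e., enabled) is evaluated; otherwise simplification recurses.
        simplify :: Term -> Term
        simplify (Plus (Lit a) (Lit b)) = Lit (a + b)
        simplify (Plus a b)             = Plus (simplify a) (simplify b)
        simplify t                      = t

        -- One Consume-Simplify-Produce cycle over the whole tableau;
        -- each term could be handled by an independent processor.
        csp :: [Term] -> [Term]
        csp = map simplify

        run :: [Term] -> [Term]
        run ts = let ts' = csp ts in if ts' == ts then ts' else run ts'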
     The elimination of the distinction between data and operators is achieved
through the use of a global memory, called a tableau T, where SequenceL
terms are placed:

        T = [ f1^a1, f2^a2, ..., fn^an ]    where n ≥ 0

When ai = 0, fi is a sequence, and when ai > 0, fi is a function or an operator.
The arity ai of a function or operator fi indicates the number of arguments
(i.e., sequences) fi requires for execution. There is no notion of assignment
in SequenceL. Furthermore, there is no notion of input distinguished from
the application of an operator to its operands. In this context, the word
operator refers both to built-in and user-defined operators.
     The underlying CSP control flow in SequenceL eliminates the need for
the programmer to specify parallel and most of the specification of iterative
and sequence-based operations. The control flow approach identifies
operators that are enabled for execution. The enabling of operators is based
upon the arity of the operator (which is derived from the operator’s
signature) matching the number of sequences following the operator. The
global memory and the producer/consumer abstraction in SequenceL
implement the basic ability to enable the Hoare/Dijkstra concept of
synchronized communication among processes. However, the SequenceL
approach provides this ability without an additional language construct,
relying only on the data-flow-based CSP execution strategy underlying
SequenceL.

    However, CSP alone will not achieve the desired effect. In order to
derive finer-grained parallelisms and many iterative processes in an
automated way, CSP must interact with advanced constructs for processing
data. The desired interaction occurs in the Simplification step of the CSP-
cycle. These advanced SequenceL constructs are the regular, generative,
and irregular constructs.      Among other capabilities, these constructs
implement Hoare’s loop unfolding constructs.
    The regular construct applies an operator to corresponding elements of
the normalized operand sequences. For example, +([4,4]) distributes the
plus sign among the corresponding elements of the two singleton sequences
[4] and [4], resulting in [8]. Now consider the more complicated
application:

   T = [*([20,30,40],[2])]
In this case normalization results in the elements of the second operand
sequence being repeated in the order they occur until it is the same length
(and/or dimension in terms of nesting) as the larger:

     T = [*([20,30,40],[2,2,2])]

    Now the multiplication operator can be distributed among the
corresponding elements of the operand sequences.             The
normalization/distribution is a single simplification step:

     T’ = [*([20,2]), *([30,2]), *([40,2])]

     As a result, in the next CSP cycle there are three parallel operations that
can take place (i.e. loop unfolding is implied by the regular construct). Thus,
the interaction of the CSP execution cycle and the regular construct results in
the identification of microparallelisms. The final result is:

     T’’ = [40, 60, 80]
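
    A rough functional reading of the regular construct's normalize-then-
distribute behavior, sketched in Haskell under the simplifying assumption of
flat, nonempty operand sequences (our illustration, not the SequenceL
implementation):

        -- Repeat the shorter operand until the lengths match
        -- (normalization), then distribute the operator over
        -- corresponding elements; each application can run in parallel.
        regular :: (a -> a -> a) -> [a] -> [a] -> [a]
        regular op xs ys = zipWith op (pad xs) (pad ys)
          where
            n     = max (length xs) (length ys)
            pad s = take n (cycle s)     -- [2] becomes [2,2,2]

        main :: IO ()
        main = print (regular (*) [20,30,40] [2])   -- [40,60,80]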

    The generative construct allows for the generation of sequences, e.g.,
[1,…,5] = [1,2,3,4,5]. The irregular construct allows for conditional
execution:

     T = [Double-odds(Consume(s_1(n)), Produce(next)) where next =
            [ *([s_1(i), 2]) when <>(mod([s_1(i),2]), 0) else [] ]
            Taking i from [1,…,n], [2,3,4,5]]

See Figure 1 for a trace of the function.

      [ *([[2,3,4,5](i), 2]) when <>(mod([[2,3,4,5](i),2]), 0) else [] ]
           Taking i from [1,…,4]

      [ *([[2,3,4,5](i), 2]) when <>(mod([[2,3,4,5](i),2]), 0) else [] ]
           Taking i from [1,2,3,4]

      [ *([2,2]) when <>(mod([2,2]),0) else []  ||
        *([3,2]) when <>(mod([3,2]),0) else []  ||
        *([4,2]) when <>(mod([4,2]),0) else []  ||
        *([5,2]) when <>(mod([5,2]),0) else []  ]

      [ *([2,2]) when <>(0,0) else []  ||  *([3,2]) when <>(1,0) else []  ||
        *([4,2]) when <>(0,0) else []  ||  *([5,2]) when <>(1,0) else []  ]

      [ *([2,2]) when False else []  ||  *([3,2]) when True else []  ||
        *([4,2]) when False else []  ||  *([5,2]) when True else []  ]

      [ []  ||  *([3,2])  ||  []  ||  *([5,2]) ]

      [ []  ||  6  ||  []  ||  10 ]

      [6,10]

                                Figure 1. Double-Odds.
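
    Read functionally, Double-odds doubles the odd elements and maps the
even ones to empty sequences, which vanish from the result. A compact
Haskell rendering of that behavior (our sketch, not SequenceL output):

        -- Each element is examined independently (hence the four
        -- parallel terms in Figure 1); odds are doubled, evens
        -- contribute the empty sequence.
        doubleOdds :: [Int] -> [Int]
        doubleOdds xs = concat [ if odd x then [2 * x] else [] | x <- xs ]

        main :: IO ()
        main = print (doubleOdds [2,3,4,5])   -- [6,10]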

    With the irregular construct one can, when necessary, produce recursive
applications of a function:

             T=[list(Consume(s_1), Produce(next)) where next =
                      [ [list, -([s_1, 1])] when >(s_1,0) else [s_1]] [4]

Recursion involves a function placing itself back into T. See figure 2 for a
trace of the function.




              [list(Consume(s_1), Produce(next)) where next =
                   [ [list, -([s_1, 1])] when >(s_1,0) else [s_1] ]] [2]

                   [ [list, -([2, 1])] when >(2,0) else [2] ]

              [list(Consume(s_1), Produce(next)) where next =
                   [ [list, -([s_1, 1])] when >(s_1,0) else [s_1] ]] [1]

                   [ [list, -([1, 1])] when >(1,0) else [1] ]

              [list(Consume(s_1), Produce(next)) where next =
                   [ [list, -([s_1, 1])] when >(s_1,0) else [s_1] ]] [0]

                   [ [list, -([0, 1])] when >(0,0) else [0] ]

              [0]

                                 Figure 2. List function.
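
    The same self-reinsertion reads as ordinary recursion. A Haskell
analogue of the list function's behavior in Figure 2 (our reading of the
trace, not the SequenceL implementation):

        -- Reinserts itself with a decremented argument until the guard
        -- >(s_1,0) fails, as in Figure 2: 2 -> 1 -> 0, leaving [0].
        listFn :: Int -> Int
        listFn n = if n > 0 then listFn (n - 1) else n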


    SequenceL represents an attempt to develop a language abstraction in
which control structures – sequential, iterative, and parallel – are implied.
    In summary, the SequenceL abstraction presents a paradigm with the
following aspects:

•   a global memory that does not distinguish between operators and data, and whose state
    fully reflects the state of computation;
•   all operands are sequences, where atoms are represented by singleton sequences;
•   an underlying Consume-Simplify-Produce execution strategy; and
•   high-level constructs to process nonscalars, applied in the simplification step.

For more detail about SequenceL please see [7, 24]. Consider now a matrix
multiply function as written in SequenceL:

         Function matmul(Consume(s_1(n,*), s_2(*,m)), Produce(next)) where next =
                    {compose([+([*(s_1(i,*), s_2(*,j))])])}
                    Taking [i,j] From cartesian_product([gen([1,...,n]), gen([1,...,m])])
         ([[2,4,6],[3,5,7],[1,1,1]], [[2,4,6],[3,5,7],[1,1,1]])


The term, [[2,4,6],[3,5,7],[1,1,1]] is the sequence representation of a matrix.
We will now trace the steps the SequenceL interpreter takes in order to
evaluate the function above. Bear in mind the only information provided by
the user is the function above, together with its arguments. The user does
not specify any control structures – not even the parallel paths that can be
followed in solving the problem. These parallel paths are implied in the
solution and derived by the SequenceL semantic. The semantic is an
interactions between the CSP-cycle and the SequenceL constructs applied in
the Simplify step.


Trace of the Evaluation of the SequenceL Matrix Multiply.
For simplicity, let T contain only the matrix multiply function and its input
sequences as seen above. The first step is to instantiate the variables s_1 and
s_2. At the same time, variables n and m obtain the cardinal values in the
designated sequence dimensions:
        {compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*), [[2,4,6],[3,5,7],[1,1,1]](*,j))])])}
                Taking[i,j]From cartesian_product([gen([1,...,3]),gen([1,...,3])])

Now SequenceL’s generative construct produces the values needed in the
Taking clause. The simple form of the generative command is seen in this
example. It simply fills in the integer values between the upper and lower
bounds, i.e., from 1 to 3:

        {compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*), [[2,4,6],[3,5,7],[1,1,1]](*,j))])])}
                          Taking [i,j]From cartesian_product([[1,2,3],[1,2,3]])

Next the SequenceL Cartesian Product generates the values for subscripts i
and j.
        {compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*), [[2,4,6],[3,5,7],[1,1,1]](*,j))])])}
                          Taking [i,j]From [ [[1], [1]], [[1], [2]], [[1], [3]],
                                               [[2], [1]], [[2], [2]], [[2], [3]],
                                               [[3], [1]], [[3], [2]], [[3], [3]]        ]

Now that the simplification of the function is complete, the function’s result
is produced in T:


        T=[
        [+([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))])           ||
        +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))])            ||
        +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ]          ||
        [+([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))])           ||
        +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))])            ||
        +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ]          ||
        [+([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))])           ||
        +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))])            ||
        +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ]
             ]

The next Consume-Simplify-Produce (CSP) step replaces the tableau above
with the one below. This simplification step results in the parallel selection
of the vectors to be multiplied. Concurrent evaluation is denoted by the ||
symbol.
T=      [
                        [         +([*([[2,4,6],[2,3,1]])])                 ||
                                  +([*([[2,4,6],[4,5,1]])])                 ||
                                  +([*([[2,4,6],[6,7,1]])])       ]         ||
                        [         +([*([[3,5,7],[2,3,1]])])                 ||
                                  +([*([[3,5,7],[4,5,1]])])                 ||
                                  +([*([[3,5,7],[6,7,1]])])       ]         ||
                        [         +([*([[1,1,1],[2,3,1]])])                 ||
                                  +([*([[1,1,1],[4,5,1]])])                 ||
                                  +([*([[1,1,1],[6,7,1]])])      ]
                ]



In the next CSP step, the products are formed concurrently using
SequenceL’s regular construct. The regular process distributes a built-in
operator, e.g., the * operation, among corresponding elements of the
operands, resulting in 27 parallel multiplies:

T=      [
                [       +([*([[2], [2]]) || *([[4], [3]]) || *([[6], [1]])])     ||
                        +([*([[2], [4]]) || *([[4], [5]]) || *([[6], [1]])])     ||
                        +([*([[2], [6]]) || *([[4], [7]]) || *([[6], [1]])]) ]   ||
                [       +([*([[3], [2]]) || *([[5], [3]]) || *([[7], [1]])])     ||
                        +([*([[3], [4]]) || *([[5], [5]]) || *([[7], [1]])])     ||
                        +([*([[3], [6]]) || *([[5], [7]]) || *([[7], [1]])]) ]   ||
                [       +([*([[1], [2]]) || *([[1], [3]]) || *([[1], [1]])])     ||
                        +([*([[1], [4]]) || *([[1], [5]]) || *([[1], [1]])])     ||
                        +([*([[1], [6]]) || *([[1], [7]]) || *([[1], [1]])]) ]
                ]

Comparing the tableau above and the resulting tableau below, one can see
that SequenceL handles nested parallelisms automatically. The final step of
simplification involves the application of the regular construct to form the
sums of the products.

        T=      [       [         +([4,12,6])          ||
                                  +([8,20,6])          ||
                                  +([12,28,6]) ]       ||
                        [         +([6,15,7])          ||
                                  +([12,25,7])         ||
                                  +([18,35,7]) ]       ||
                        [         +([2,3,1])           ||
                                  +([4,5,1])           ||
                                  +([6,7,1]) ]         ]

This regular process adds together corresponding elements of the operand
sequences, e.g., +([4,12,6]) = [4+12+6] = [22]. The final result is:

        [[22, 34, 46], [28, 44, 60], [6, 10, 14]]

Since no further simplification is possible, evaluation ends. There are three
SequenceL constructs: the regular, irregular, and generative. The matrix
multiply employed the regular and the generative. For a more detailed
explanation of these constructs see [7, 24].
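
    For reference, the computation the trace performs is ordinary matrix
multiplication, sketched below in Haskell (our comparison code, not the
SequenceL implementation). Every dot product is independent of the
others, which is precisely the parallelism the interpreter discovers:

        import Data.List (transpose)

        -- One dot product per (i,j) pair of the cartesian product of
        -- row and column indices; each entry can be computed in parallel.
        matmul :: [[Int]] -> [[Int]] -> [[Int]]
        matmul a b =
          [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

        main :: IO ()
        main = print (matmul m m)   -- [[22,34,46],[28,44,60],[6,10,14]]
          where m = [[2,4,6],[3,5,7],[1,1,1]]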
     One important aspect of this language is that through the introduction of
new language constructs, one can imply most control structures – even
concurrent or parallel structures. Therefore, rather than producing an
algorithm that implies a data product, one can come closer to specifying a
data product that implies the algorithm that produces or processes it. The
difficult part of traditional forms of programming seems to be centered
around the fact that programmers have to somehow envision the elusive data
product implied by their programs.



Some additional language experiments involving SequenceL.
SequenceL is a Turing complete language. An interpreter exists that finds
the parallel structures inherent in SequenceL problem solutions. The Matrix
Multiply example represents a fairly straightforward problem solution
insofar as the parallel paths behave independently of one another. In other
words, once the parallelisms are known, the paths can be spawned and
joined together with each path contributing its part of the final solution
without knowledge of what the other paths have computed. Examples of
problems where there are computed intermediate results that need
broadcasting have also been explored using the SequenceL interpreter.
    For example, the forward processing in the Gaussian elimination
solution of systems of linear equations (implemented as three SequenceL
functions) has been executed with good results in terms of finding inherent
parallelisms. With three or more equations, intermediate results must be
known to all paths in order to produce the final result.
    Finally, in terms of scheduling, both the matrix multiply and the
Gaussian codes are examples of problems for which static, a priori schedules
can be generated. The paths of execution can be determined from the
dimensions of the matrix, in the matrix multiply, and from the number of
equations, in the Gaussian code. The Quicksort problem was run as an
example of a problem where dynamic scheduling is necessary. [25]



Summary
In this chapter an overview of computer language design was presented. A
scientific backdrop was provided for the presentation from which it is easier
to understand the exchange between theoretical and experimental language
research. The backdrop also makes the distinction between ordinary and
extraordinary language research easier to make. The impact of the scientific
approach is then presented in the summary on the development of
SequenceL.




REFERENCES
[1] Alan Turing, “On Computable Numbers, with an Application to the
    Entscheidungsproblem,” Proc. London Math. Soc., Series 2, Vol. 42, 1936,
    pp. 230-265.
[2] Thomas Kuhn, The Structure of Scientific Revolutions, University of Chicago
    Press, Chicago, 1962.
[3] N. Wirth, "On the Design of Programming Languages," Proc. IFIP Congress 74,
    North-Holland Publishing Co., 1974, pp. 386-393.
[4] David Parnas, "On the Criteria To Be Used in Decomposing Systems into
    Modules," Communications of the ACM, vol. 15, no. 12, December 1972,
    pp. 1053-1058.
[5] J. Backus, “Can Programming Be Liberated from the von Neumann Style? A
    Functional Style and its Algebra of Programs,” Communications of the ACM,
    vol. 21, no. 8, August 1978, 613-641.
[6] R. Kowalski, “Algorithm = Logic + Control,” Communications of the ACM,
    vol. 22, no. 7, July 1979, 424-436.
[7] D. Cooke, “An Introduction to SEQUENCEL: A Language to Experiment with
    Nonscalar Constructs,” Software Practice and Experience, Vol. 26, No. 11,
    November 1996, 1205-1246.
[8] G. Blelloch, "Programming Parallel Algorithms," Communications of the ACM,
    Vol. 39, No. 3, March 1996, pp. 85-97.
[9] Daniel I.A. Cohen, Introduction to Computing Theory, John Wiley and Sons,
    Inc., New York, 1986.
[10] P. Zave, "An Insider's Evaluation of PAISLey," IEEE Transactions on
    Software Engineering, Vol. 17, No.3, March, 1991 (pp.212-225).
[11] N. Wirth and J. Gutknecht, Project OBERON The Design of an Operating
    System and Compiler, Addison-Wesley Publishing Company, Reading, MA,
    1992.
[12] McCarthy, J., "Recursive Functions of Symbolic Expressions and Their
    Computation by Machine," Communications of the ACM, vol. 3, no. 4 (April,
    1960), pp. 184-195.
[13] Paul Hudak, The Haskell School of Expression: Learning Functional
    Programming through Multimedia, Cambridge University Press, New York,
    2000.
[14] C.A.R. Hoare, “Communicating Sequential Processes,” Communications of the
    ACM, Vol. 21 No. 8, August, 1978 (pp.666-677).
[15] E.W. Dijkstra, "Guarded Commands, Nondeterminacy and Formal Derivation
    of Programs," Communications of the ACM, vol. 18, no. 8, August 1975,
    pp. 453-457.
[16] D.Gelernter, "Generative Communications in Linda," ACM Transactions on
    Programming Languages and Systems, Vol. 7, No. 1, 1985, pp. 80-112.
[17] K. Iverson, A Programming Language, Wiley, New York, 1962.
[18] K. Iverson, J Introduction and Dictionary, Iverson Software Inc. Toronto,
    1994.
[19] J.P. Banâtre and D. Le Métayer, "Programming by Multiset Transformation,"
    Communications of the ACM, vol. 36, no. 1, January 1993, pp. 98-111.
[20] V. Breazu-Tannen, P. Buneman, and S. Naqvi , "Structural Recursion as a
    Query Language," In Proceedings of 3rd International Conference on Database
    Programming Languages, Nafplion, Greece, Morgan Kaufmann, pp. 9-19.
[21] C. Hankin, D. Le Métayer, and D. Sands, A Calculus of Gamma Programs,
    Publication Interne no. 674, IRISA, France, July 1992.
[22] J. Sipelstein and G. Blelloch, Collection-Oriented Languages, Report, School
    of Computer Science, Carnegie Mellon University, CMU-CS-90-127.
[23] D. Suciu, Parallel Programming Languages for Collections, Ph.D.
    Dissertation in Computer and Information Science, University of Pennsylvania,
    1995.
[24] D. Cooke, “SequenceL Provides a Different way to View Programming,”
    Computer Languages 24 (1998) 1-32.
[25] Daniel E. Cooke and Per Andersen, “Automatic Parallel Control Structures in
    SequenceL,” Software Practice and Experience, Volume 30, Issue 14,
    (November 2000), 1541-1570.

Adaptive information extraction
unyil96
 
Systematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd editionSystematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd edition
Yasir Raza Khan
 
Systematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd editionSystematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd edition
Yasir Raza Khan
 
Software Engineering
Software EngineeringSoftware Engineering
Software Engineering
Sanjay Saluth
 
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentProposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Jorge Barreto
 

Similar a 03 (20)

Compoutational Physics
Compoutational PhysicsCompoutational Physics
Compoutational Physics
 
Concurrency Issues in Object-Oriented Modeling
Concurrency Issues in Object-Oriented ModelingConcurrency Issues in Object-Oriented Modeling
Concurrency Issues in Object-Oriented Modeling
 
Fundamentals of data structures ellis horowitz & sartaj sahni
Fundamentals of data structures   ellis horowitz & sartaj sahniFundamentals of data structures   ellis horowitz & sartaj sahni
Fundamentals of data structures ellis horowitz & sartaj sahni
 
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEYUSING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
 
Chap10.ppt
Chap10.pptChap10.ppt
Chap10.ppt
 
Chap10.ppt Chemistry applications in computer science
Chap10.ppt Chemistry applications in computer scienceChap10.ppt Chemistry applications in computer science
Chap10.ppt Chemistry applications in computer science
 
Unit ii oo design 9
Unit ii oo design 9Unit ii oo design 9
Unit ii oo design 9
 
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
 
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
 
Data structures and algorithms alfred v. aho, john e. hopcroft and jeffrey ...
Data structures and algorithms   alfred v. aho, john e. hopcroft and jeffrey ...Data structures and algorithms   alfred v. aho, john e. hopcroft and jeffrey ...
Data structures and algorithms alfred v. aho, john e. hopcroft and jeffrey ...
 
Adaptive information extraction
Adaptive information extractionAdaptive information extraction
Adaptive information extraction
 
A Case Study Of A Reusable Component Collection
A Case Study Of A Reusable Component CollectionA Case Study Of A Reusable Component Collection
A Case Study Of A Reusable Component Collection
 
From requirements to ready to run
From requirements to ready to runFrom requirements to ready to run
From requirements to ready to run
 
Systematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd editionSystematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd edition
 
Systematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd editionSystematic software development using vdm by jones 2nd edition
Systematic software development using vdm by jones 2nd edition
 
Automated identification of sensitive information
Automated identification of sensitive informationAutomated identification of sensitive information
Automated identification of sensitive information
 
OOP ppt.pdf
OOP ppt.pdfOOP ppt.pdf
OOP ppt.pdf
 
Software Engineering
Software EngineeringSoftware Engineering
Software Engineering
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodology
 
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentProposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
 

03

  • 1. 1. Computer Language Advances † Daniel E. Cooke1,2, Michael Gelfond1, and Joseph E. Urban3 1 Computer Science Department, Texas Tech University 2 NASA Ames Research Center 3 Computer Science and Engineering Department, Arizona State Univeristy Introduction. The beginning of modern computer science can be marked by Alan Turing’s paper, which among other advances, showed that there are unsolvable problems. [1] Within the infinite set of solvable problems there is a set of technically feasible and problems that are not technically feasible to solve (e.g., problems that it would take a supercomputer of today 10,000 years to solve). Thus, from the beginning, computer science has been viewed as the science of problem solving using computers. One category of computer science research is devoted to outwardly encroaching upon the set of infeasible problems. Advances in computer hardware, for example, lead to more powerful computers that execute faster and have larger stores of memory. Advances in hardware are meant to result in improvements in the raw computing power of the devices that can be brought to bear in order to solve more complex problems. Additionally, research efforts focusing on complexity and algorithms are also concerned with outwardly encroaching upon the set of intractable problems. In general, this first category of computer science research is devoted to making complicated problems more technically feasible to solve. Problems that are technically feasible are those for which existing algorithms can obtain an answer in some reasonable time using currently available computer † Research sponsored, in part, by NASA NAG 5-9505.
  • 2. technology. The second major category of computer science research has to do with making it humanly feasible to solve more complicated problems. Human feasibility has to do with the level of difficulty one faces in finding a solution to a problem. Some of the efforts in the second category of research focus on the development of approaches that delegate some of the complexity of certain activities to easy-to-use tools, such as database management systems, operating systems, networking tools. Other approaches to the second category focus on problem representation. These areas of research include artificial intelligence and software engineering. Central to all efforts to improve upon the human’s ability to solve more complicated problems is computer language research. This chapter focuses on language design from a scientific point of view. To do so, a scientific backdrop will be developed. Then the notions of language design from a theoretical and experimental point of view are outlined. From there, the history of language design efforts are reviewed, first from the evolutionary and then from the revolutionary vantages. The chapter ends with a brief introduction of recent results obtained by one of the authors. Scientific Endeavors. In most sciences there are two organized research fronts: theoretical and experimental. Experiment provides a wealth of scientific observations that eventually must be distilled into a compact representation. The distilled, compact representation is a theory – a formulation of the body of knowledge gained by experiment that concisely represents the larger body of knowledge. Once formulated, the theory is employed in the design of future empirical studies to see if the theory predicts the outcomes of the experiments. Future experiments may confirm the theory – leading to greater confidence in it – or they may falsify the theory – indicating that the theory may be incorrect. Results existing between confirmatory and falsification may indicate that the theory needs some modification towards improvement. Based upon the interaction between the theoretical and experimental communities of the physical sciences, Kuhn, in his now classic text The Structure of Scientific Revolutions, identified two types of theoretical science. [2] The first category of theoretical research (called ordinary science) attempts to modify, extend, and/or refine theories so that they do a
  • 3. better job of explaining empirical results that are not explained by the unrefined theories. The second category (called extraordinary science) takes place when a radically new view (or theory) is proposed and it is seen that the theory explains empirical evidence, subsuming the old theory and all of its subsequent refinements. Extraordinary science results in paradigm shifts and in scientific revolutions. Theoretical Language Research. A programming language serves as the central part of a theory about how one best approaches problem solving with the aid of computers. The language is central to the theory because it is the basis for finding and communicating problem solutions to machines or other people. A language is meant to facilitate one’s ability to state and structure problem solutions. Possibly the most important aspect of the language is the fact that it alone manifests the abstraction – or the approach to problem solving – provided by the theory. Wirth noted that “One of the most crucial steps in the design of a language is the choice of abstraction upon which programs are to base.” [3] Abstraction is key to an understanding of language and language design. Abstraction. An abstraction is commonly viewed as the act of taking away – the formation of an idea apart from the concrete existence or specification thereof. A computer abstraction can be viewed as the result of dropping out nonessential, complicating details. An abstraction defines that which is salient to the problem solution, and in doing so, identifies the extraneous details that can be ignored in previous abstractions. The original model of computing required the physical rewiring of the CPU to alter the CPU's actions. A revolution in abstraction came with the advent of the stored program concept, also known as the Von Neumann architecture. The Von Neumann approach dropped out the original distinction between the storage of data and the storage of a program. Assembler languages quickly followed. These languages dropped out many nonessential details including the need to keep track of physical addresses. Macro definitions for oft-repeated functions such as input-output improved the assembler abstraction by removing many nonessential details from the
  • 4. programmer’s concern. In the case of input-output on the IBM 360/370, the details hidden are those involved in executing channel programs. The GET and PUT macros essentially hide a surfeit of nonessential details: without these macros the programmer has to write programs that run on the I/O channel processors together with the mainframe program code required to initiate the I/O transfers. FORTRAN was a major step in abstraction improvement because it provided a view of problem solving where one ignored many machine level details such as register usage and memory management. FORTRAN's abstraction moves the programmer away from machine operations - pointing the programmer towards algebraic operations. Languages like Algol and Pascal improved the FORTRAN abstraction in a significant way through the addition of prominent control structures and by placing data structure design on a level equal to algorithm design. In other words, with the newer abstractions the design of the data structure comes to be viewed as a critical element in problem solving. To illustrate the significance of the data structure, imagine trying to convert an arbitrary postfix expression into an equivalent prefix expression without the use of an advanced data structure. The Pascal abstraction eventually led to the view that the programmer's product is a data product rather than a program that produces that data product. The object-oriented approach to programming is less of an abstraction change in that this approach adds, rather than deletes technical detail. From an historical viewpoint one might view the relationship between the procedural and object-oriented approaches as analogous to the relationship between the assembler and the macro-assembler languages. In other words, the following analogy may indeed be an appropriate one to consider: macro- assembly : assembly :: object-oriented : procedural. The object-oriented approach was motivated by the fact that procedural languages lacked a "middle ground" between local and global variables. The lack of this middle ground led to technical difficulties in the implementation and reusability of data structures. The difficulty was addressed by Parnas's definitions of information hiding (a form of hiding variables beyond the hiding afforded by local variables). [4] These problems were solved with the addition of features to encapsulate data structures. Once one encapsulates data structures, it is a logical step to encapsulate program structures. Reuse of the so-called objects is facilitated through an inheritance mechanism. Object-oriented programming is a natural, evolutionary step that adds
  • 5. features to the procedural language abstraction in order to solve technical problems that arise in the use and reuse of data and program structures. Other changes to the procedural abstraction have been made in order to accommodate concurrent programming. Languages like Ada, Modula, and Linda have added some notion of multitasking in a manner that interacts with existent program structures for procedures and functions. Java is considered by many to be the best designed object-oriented language. Java possesses an environment that provides one with the ability to develop graphical user interfaces, applets for web-based programming, concurrent processes, etc. As such, Java embodies a superset of the significant extensions and modifications that characterize the evolution of the procedural languages. Apart from the evolution of procedural programming, completely new ways to view problem solving have arisen from work in functional [5], logic [6], and collection-oriented languages [7,8]. These improvements have sought to change the view of programming altogether. They can be viewed more as revolutionary changes as opposed to the more characteristic evolutionary changes described above. A programming language provides the fundamental level of abstraction for problem solving with a computer. It demarcates the point where the human leaves off and the architecture takes over, in the process of translating the problem solution into a set of actions to be performed by the computer. Language compilers or interpreters perform all further translation for us. The fundamental level of abstraction is very important. It defines the basis for the way we organize our activities to solve problems, i.e., it is the basis for the software process model. What constitutes a computer language theory? As a theory for problem solving, a computer language should satisfy certain objective requirements. The requirements are important to know, because they form an important measure of the significance of a language as a theory. Before a community is willing to design and perform experiments related to a theory, the theory should satisfy requirements, which indicate that the experiments are warranted and worthwhile. In this section requirements that a computer language should satisfy are presented.
  • 6. One requirement of a computer language is that it be unambiguous. To satisfy this requirement there must be a mathematical definition of the syntax and semantics of the language. The syntax of a language provides a precise definition of the rules employed in order to construct grammatically correct (i.e., valid) sentences in the language. Backus-Naur Form is commonly used for simple syntax definitions. More complex syntactic features are frequently captured by attribute grammar, another form used in syntax definition. Attribute grammars permit the description of semantic constraints such as data typing constraints. The semantics of a computer language should provide a precise definition of the meaning of a sentence stated in the language. Semantics can be stated using the denotational, axiomatic, or operational approaches, or through the use of other formal languages. The Backus-Naur Form allows for the construction of a finite set of rules that define how any element of the infinite set is constructed. For example, the following finite set of recursive rules define an infinite set of elements: T ::= zero | succ(T) These rules define a set of syntactic objects, which belong to the lan- guage: T = {zero, succ(zero), succ(succ(zero)),…} Now consider the set of natural numbers: N = {0,1,2,3, …} Formal semantics are given in a function definition. The function maps from the syntactic set T, to the semantic set, N, i.e., m:T → N. Equations define the function m. For example, consider the equations, which perform the simple mapping from elements of T to elements of N: m(zero) = 0 (1) m(succ(T)) = +(m(T),1) (2) Using equations (1) and (2), a precise (i.e., unambiguous) mapping can be developed from a syntactic to a semantic object:
  • 7. (2) m(succ(succ(zero))) = (2) +(m(succ(zero)),1) = (1) +(+(m(zero),1),1) = +(+(0,1),1) = 2 Notice that the lefthand side (LHS) of rules (1) and (2) refer to syntactic structures, while the righthand side (RHS) of the rules indicate the semantic of the syntactic structure. Once the rules have been developed they can be analyzed to determine if ambiguity exists in the language definition. For example, if there are two rules with the same LHS, but differing RHS’s, then there are two meanings for one syntactic structure. Whenever multiple interpretations exist for the same syntactic structure, ambiguity exists in the language. With the following modification to the Backus-Naur Form rule: T ::= zero | succ(T) T1::= T + T together with additional semantic equations to define mappings from T1: m(zero + zero) = 0 (3) m(succ(T) + zero) = m(succ(T)) (4) m(zero + succ(T)) = m(succ(T)) (5) m(succ(T1) + succ(T2)) = +(m(T1 + T2),2) (6) one can provide for more sophisticated, yet equally precise mappings that specify the addition of natural numbers. Consider the expression succ(succ(zero)) + succ(succ(zero)): (6) m(succ(succ(zero)) + succ(succ(zero))) = (6) +(m(succ(zero) + succ(zero)),2) = (3) +(+(m(zero + zero),2),2) = +(+(0,2),2) = 4 Consider the expression succ(zero) + succ(succ(zero)): (6) m(succ(zero) + succ(succ(zero))) = (5) +(m(zero + succ(zero)),2) = (2) +(m( succ(zero)),2) = (1) +(+(m(zero),1),2) =
  • 8. +(+(0,1),2) = 3 The formal (i.e., unambiguous) syntax and semantic definitions of a computer language are objective requirements that a new language should satisfy. Another desirable feature of a computer language has to do with size. A computer language should possess a small number of features that are easy to combine in any number of ways in order to produce complex problem solutions. Computer languages that possess a small number of combinable features are called orthogonal. Orthogonal languages allow a software developer to design solutions of complex problem through the interaction of a small set of language features. The syntax of a computer language provides important insight into the extent to which a software developer can claim that the language is orthogonal. Consider the grammar definitions below: FOR ::= for id := E to E do ?; WHILE ::= while C do ?; IF ::= if C then ? else ?; ASSIGN ::= id := E; READ ::= readln(id); WRITE ::= writeln(E); STATE ::= FOR | WHILE | IF | ASSIGN | READ | WRITE | STATE; STATE If one replaces all the ?'s above with the nonterminal STATE, the language has a good degree of orthogonality. The goodness is due to the fact that one is able to nest any control structure inside any other control structure. The writability of a language has to do with how difficult it is to encode a problem solution. In general, an orthogonal language is arguably more writable than a nonorthogonal language. For the same reason, the readability of the orthogonal language - the ability to decipher the problem solution - is an improvement over nonorthogonal languages. The conciseness of the problem solutions that can be developed in a language is often viewed to be an important feature of a good language. While it is certainly true that conciseness is an important criterion, conciseness must be considered within the context of a language’s readability and writability. For example, one can imagine a language that has a unique one-character symbol for each problem solution known to humankind – obviously requiring an infinite supply of one-character symbols. Suppose, in this language, that the symbol # represents a heapsort
  • 9. program. The solution to the sort and any other problem in this language is definitely as concise as can be envisioned. The problem with the language has to do with its readability and writability. To be an effective problem solver, the programmer who writes or reads the program must recall that the # represents the heapsort program – and he/she must remember this fact as distinguished from the potentially infinite number of other pairings between a singular symbol and a problem solution. The table-lookup quality to this hypothetical language would seem to go against the traditional efforts of computer scientists. Beginning with Alan Turing, computer scientists have focused on the identification of a small set of primitive elements that comprise problem solving. The design and choice of the primitives takes into consideration the degree to which the primitives chosen can be combined to “build up” complicated problem solutions. Thus, the conciseness of the language itself (rather than the size of the problem solutions written in the language), together with the degree to which the language constructs can be combined (i.e., the orthogonality of the language) would seem to be higher priority features of a language. If these features are met, then the conciseness of solutions becomes a reasonable criterion for a language. Another desirable feature of a programming language is that it be capable of representing a Universal Turing Machine. Languages satisfying this requirement, are known as Turing Computable. A Turing Computable Language is one that can solve any problem that can be stated as an algorithm, i.e., any effectively compatible function. Simply put, an algorithm is a set of easy-to-follow instructions, that given an input, will produce a result. [9] Implied by the previous definition is the requirement that the algorithm will "halt." Given an arbitrary input, an algorithm must produce an answer (or result) in a finite amount of time. A language that can state any possible algorithm, is Turing Computable. A Turing Computable Language is capable of solving any algorithmically solvable problem (i.e., one for which there is an algorithmic solution). Therefore, a language should be Turing Computable. Coherence is another desirable feature of a language. [10] Coherence requires that language features be drawn from the same abstraction level. Recall the quote seen earlier: One of the most crucial steps in the design of a language is the choice of abstraction upon which programs are to base.
  • 10. Two sentences later, in the same text, one finds: [The language designer] should restrict his selection [of lan- guage features] from the same level which are in some sense compatible with each other. [3] Coherence of a language is an important feature insofar as one does not want the programmer to have to understand a given program on two differing levels of abstraction. Suppose there was a version of a procedural language like Pascal, which allowed the programmer to dip into the assembler level of representation. If a change is made to a program in this language, the programmer must understand the change at the Pascal and the assembler level. In other words, the programmer must understand the interaction of the machine code generated by the compiler and the assembler code that is embedded in the source program. Regularity is yet another desirable feature in a language. Regularity requires that language features operate in a consistent and uniform manner, independent of context – regularity requires similar actions to have similar consequences. One way to view regularity is from the standpoint of a human interface. A computer system that has a "mouse" ought to have the mouse button(s) perform in a uniform manner at all times. Suppose a mouse has two buttons. If button X opens any available pop-up window at the operating system level, it should do the same for any application program one might enter from the operating system level. If button X performs function A in one application program and function B in another, the overall interface design is not consistent, not regular, and not particularly good. Here we begin to migrate away from objective requirements into the realm of nonobjective requirements. A qualitative measure of a language is that it be intuitive. An intuitive language is one that allows the problem solver to follow instinct and intuition in solving the problem. Clearly, what one person views as intuitive is likely to be different from what the next person views to be intuitive. Since intuitiveness is a qualitative measure, a problem solver can only determine whether one language is more intuitive than another language through experimentation. Experimental computer science is costly and rarely done when it comes to language comparisons. However, if languages can be shown to meet some reasonable number of the qualities introduced in this section, it would be ideal if they could be compared experimentally for the more qualitative characteristics.
  • 11. Experimental Science. There are at least two approaches to empirical language studies. The first approach involves the careful understanding and evaluation of a language and results in a value judgement on the part of the researcher. This first ap- proach is not unlike the approach Wirth took in the Oberon project [see 11]. The formalisms developed to describe languages are analogous to differential equations, which are employed in describing and modeling physical phenomena. Just as the study of the models of physical phenomena have led to breakthroughs in Physics, the mathematical models describing language approaches should be carefully studied in order to advance the understanding of problem solving. The second approach to empirical language studies involves the development of controlled experiments. The types of measurements that are required and the variables for which controls must be established need to be well understood. To some extent, computer scientists are like physicists attempting to study thermodynamics without a thermometer. Nonetheless, controlled experiments need to be undertaken. If software advances are to keep pace with hardware advances, languages and other software tools must improve the productivity of software engineers (i.e., make problems more humanly feasible to solve). For a language to better facilitate productivity, it must provide a more intuitive approach to problem solving. The degree of intuitiveness of a language is a nonobjective consideration that may require controlled experimentation. To be effective, two major forms of experimentation are required. These forms are: •= to determine which languages are best suited for a given type of problem solver and •= to determine which languages are best suited for a given type or domain of problems. Ideally computer language research should be based upon a framework similar to that found in the physical sciences. In the physical sciences there is a healthy interaction between theory and experiment, where each drives the other. To a lesser extent the same is true in Computer Science. However, the organization of the theory and experimental communities is not as well developed in Computer Science. Suppose one viewed a new language as a theory about how to best approach a certain type of problem solving. The language would be expected to meet certain criteria, like those listed in the previous section.
  • 12. Once the theory is accepted, the language would be subjected to experimental research where competing language approaches would be compared empirically. The experimental activity might require sociological and psychological testing to determine which language seems to provide the most intuitive approach. Evolutionary versus revolutionary Recall the earlier discussion concerning Kuhn’s ordinary and extraordinary science. There have been examples of extraordinary science in language research – beginning with the Von Neumann architecture, then FORTRAN, LISP, Algol/Pascal (to a lesser degree), SQL, and Prolog. The revolutionary approaches having had the greatest impact are the Von Neumann architecture and FORTRAN. Many will argue that the object-oriented approach was and is revolutionary. However, one can argue the other side as well – that the object-oriented approach is a refinement (and improvement) to the procedural approach and manifests an evolutionary, rather than revolutionary, impact. Evolution of Computer Languages The evolution of computer languages has been based upon the revolution brought about by the introduction of the Von Neumann computer architecture. This architecture was the basis for the computer language constructs. The stored program and the program counter permit the modification of the flow of control. These hardware capabilities have culminated in the sequence, selection, and iterative control constructs. The general-purpose registers – specialized memory locations between which arithmetic and logic circuitry exist – allow for the arithmetic and comparative capabilities. The arithmetic construct follows from these hardware features, as do the constructs – that when paired with the ability to alter the program counter – allow for the ability to perform conditional execution required for selection and iteration. Input/output ports on computers support the input-output constructs found in computer languages. Because these architectural features are common to all computers and because languages allow us to tell computers what to do, the constructs are general constructs to all procedural, programming languages. The constructs
  • 13. form categories in which every procedural programming language has statements. Therefore, the constructs form a solid basis from which to learn a new programming language. The evolution of languages is based upon the executable constructs – the constructs just mentioned that permit us to tell the computer what to do – and the nonexecutable commands to the language translator. The nonexecutable constructs provide for program and data structure definition. FORTRAN, a true revolutionary approach to programming, provided the definitive version of the assignment statement and the arithmetic expression. No language since FORTRAN has altered the basic form of the arithmetic expression or the assignment statement. In other words, if one knows the operational semantics and basic syntactic forms of FORTRAN’s assignment statement, then the problem solver will not need to relearn this type of statement or expression in any new language. The only differences will be syntactic nuances. Likewise, languages like Algol and Pascal defined – for the most part – the remaining executable constructs. The input-output and the sequential control structures were established in a definitive way by the forms they took in Algol and Pascal. Pascal’s stream I/O – as opposed to a strictly formatted I/O – has not been subsequently improved upon in any revolutionary way. Similarly, the block structured sequences, selective (e.g., if-then-else and case or switch statements), and the iterative (e.g., the while and the for statements) constructs have not been modified in a major way by subsequent languages. The only major changes in control constructs is in the addition of parallel control structures found in many languages, including Concurrent Pascal, Modula, Ada, and Java; and in the way in which encapsulated program structures are declared, instantiated, and invoked. Encapsulation of program structures – though – is arguably less a change in control structures and more of a change in the commands a programmer gives to a language interpreter or compiler in order to establish data and program structures. The view taken here, is that in terms of the procedural paradigm (including the paradigm’s Object-Oriented extensions), the declaration of program and data structures – the set of nonexecutable constructs – is the one area of a language that continues to undergo substantive evolutionary change.
  • 14. Revolution in Computer Languages Already mentioned are the revolutionary advances that accompanied FORTRAN’s introduction. The dropping away of many machine details (e.g., memory and register usage) truly changed the ground upon which the programmer stands when solving problems. Algol and Pascal improved upon this ground in a semi-revolutionary way by having the programmer regard the data structure as being on par with the algorithm in problem solving. LISP was a significant advance as well. [12] Problem solving was reduced to the interaction between two primitive features – recursion and the list. The simple elegance that ensued with LISP provided the programmer with a view of programming that is characterized by a simple and natural approach. The data structures that one composes in LISP are robust in terms of their growth and the fact that they can be heterogeneous. The revolution brought about by LISP allowed the programmer a view of programming based upon lambda calculus. Other significant advances in functional programming include the languages FP [5] and Haskell [13]. Prolog – another revolutionary language – allowed the programmer to view programming based upon First Order Predicate Calculus. [6] The language features allow for simple, concise, and elegant solutions to many problems. For example, one can implement a stack data structure with the simple clause: push_pop(Top,Stack,[Top|Stack]) Consider the following trace using this definition: Yes ?- push_pop(9,[4,5,6,7],Stack). Stack = [9, 4, 5, 6, 7] Yes ?- push_pop(Top,Stack,[9,4,5,6,7]). Top = 9 Stack = [4, 5, 6, 7] Yes ?-
  • 15. Concurrency and Parallelisms Hoare’s Communicating Sequential Processes provides a significant formal approach to concurrent and parallel programming. [14] Hoare’s approach provides formalisms for loop unfolding and concurrent processing of other iterative structures as well as the ability to synchronize concurrent tasks by extending Djikstra’s Guarded Commands and his Semaphores. [15] Communicating Sequential Processes led to languages like OCCAM and provided true insight into concurrent processing – influencing many subsequent languages. Conventional approaches to concurrent programming (including Linda [16]) add features to the existing procedural and Object-Oriented approaches to programming. Adding features complicate these paradigms. Newer approaches to concurrent programming attempt to provide more intuitive approaches to parallelisms resulting in programs that are easier to write and easier to understand. Many of these approaches are based upon collection- oriented languages that provide for aggregation operators like those pioneered in [17, 18]. These newer approaches to parallel programming include [19, 8, 20, 21, 22, 23]. These approaches provide a way to indicate that functions being invoked are to execute concurrently. For example, suppose one wishes to compute the square of a list of numbers. In NESL one can specify this computation as follows [8 -p.90]: {a*a : a in X } where X is defined to be a list of numbers such as [3, -4, -9, 5]. NESL's computational model can be viewed architecturally as a Vector Random Access Machine (VRAM). [8] The VRAM is a virtual model for computation consisting of an unlimited number of processors π that have simultaneous access to a shared memory. The assignment of a computation χ(x) to a particular processor πi can be denoted as πi=(χ(x)). In the NESL program statement above, the {}'s specify concurrent execution. The program statement above specifies the parallel computation of the square of each element of X: {a*a : a in [3, -4, -9, 5] } = [πi (3*3), πi+1(-4*-4), πi+2(-9*-9), πi+3(5*5)] = [9, 16, 81, 25]
  • 16. In the VRAM model, each processor obtains its data from and produces its result in the shared memory. Each processor πi is viewed as a conventional processor. NESL does provide an excellent language for the specification of data parallelism wherein a divide and conquer approach to problem solving is promoted. NESL provides for the specification of a data parallel approach to programming. SequenceL, presented below provides a more implicit approach to parallelisms. The problem above – in SequenceL can be specified as follows: *([[3, -4, -9, 5], [3, -4, -9, 5]]) resulting in the following concurrent computations: [ *([3,3]) || *([-4,-4]) || *([-9,-9]) || *([5,5]) ] = [9,16,81,25] Like NESL, the SequenceL language is based upon a shared memory architecture. In general, the history of language development has deeply affected the definition of SequenceL. Recall that one of the first major advances in Computer Language Design was the elimination of the distinction between program and memory. This advance eventually led to the paradigms where data is stored in named locations and algorithms can define the locations through the use of assignment and input-output statements. Advances also led to the development of the control flow constructs that are used to define and process data structures. These constructs are the sequence, selection, iteration, and parallel constructs. The SequenceL paradigm provides no distinction between data and functions/operators. Furthermore, all data is considered to be nonscalar, i.e., sequences. SequenceL has an underlying control flow used in the evaluation of SequenceL operators. In the underlying Consume-Simplify-Produce- cycle (i.e., CSP-cycle), operators enabled for execution are Consumed (with their arguments) from the global memory, Simplified according to declarative constructs that operate on data, and the resulting simplified terms are then Produced in the global memory. This CSP-cycle, together with the declarative SeuqenceL constructs (applied during the simplification step) imply control flow structures that the programmer does not have to design or specify. The elimination of the distinction between data and operators is achieved through the use of a global memory, called a tableau T, where SequenceL terms are placed:
  • 17. [ f a1 1, f a2 2,…, f an n ] where n≥0 when ai= 0 fi is a sequence and when aI>0 fi is a function or an operator. The arity ai of a function or operator fi indicates the number of arguments (i.e., sequences) fi requires for execution. There is no notion of assignment in SequenceL. Furthermore, there is no notion of input distinguished from the application of an operator to its operands. In this context, the word operator refers both to built-in and user-defined operators. The underlying CSP control flow in SequenceL eliminates the need for the programmer to specify parallel and most of the specification of iterative and sequence-based operations. The control flow approach identifies operators that are enabled for execution. The enabling of operators is based upon the arity of the operator (which is derived from the operator’s signature) matching the number of sequences following the operator. The global memory and the producer/consumer abstraction in SequenceL implements the basic ability to enable the Hoare/Djikstra concept of synchronized communication among processes. However, the SequenceL approach provides this ability without an additional language construct beyond its approach of enabling function execution through the data flow execution-based CSP executution strategy underlying SequenceL However, CSP alone will not achieve the desired effect. In order to derive finer-grained parallelisms and many iterative processes in an automated way, CSP must interact with advanced constructs for processing data. The desired interaction occurs in the Simplification step of the CSP- cycle. These advanced SequenceL constructs are the regular, generative, and irregular constructs. Among other capabilities, these constructs implement Hoare’s loop unfolding constructs. The regular construct applies an operator to corresponding elements of the normalized operand sequences. For example, +([4,4]) distributes the plus sign among the corresponding elements of the two singleton sequences [4] and [4], resulting in [8]. Now consider the more complicated application: T = [*([20,30,40],[2])]
  • 18. In this case normalization results in the elements of the second operand sequence being repeated in the order they occur until it is the same length (and/or dimension in terms of nesting) as the larger: T = [*([20,30,40],[2,2,2])] Now the multiplication operator can be distributed among the corresponding elements of the operand sequences. The normalization/distribution is a single simplification step: T’ = [*([20,2]), *([30,2]), *([40,2])] As a result, in the next CSP cycle there are three parallel operations that can take place (i.e. loop unfolding is implied by the regular construct). Thus, the interaction of the CSP execution cycle and the regular construct results in the identification of microparallelisms. The final result is: T’’ = [40, 60, 80] The generative construct allows for the generation of sequences, e.g., [1,…,5] = [1,2,3,4,5]. The irregular construct allows for conditional execution: T=[Double-odds(Consume(s_1(n)), Produce(next)) where next = [ *([s_1(i) 2]) when <>( mod([s_1(i),2]),0) else []] Taking i from [1,…,n],[2,3,4,5]] See Figure 1 for a trace of the function. [ *([[2,3,4,5](i), 2]) when <>( mod([[2,3,4,5](i),2]),0) else []]Taking i from [1,…,4] [ *([[2,3,4,5](i), 2]) when <>( mod([[2,3,4,5](i),2]),0) else []]Taking i from [1,2,3,4] [ *([2 ,2]) when [ *([3, 2]) when [ *([4,2]) when [ *([5, 2]) when <>( mod([2,2]),0) <>( mod([3,2]),0) <>( mod([4,2]),0) <>( mod([5,2]),0) else []] else []] else []] else []] [ *([2 ,2]) when [ *([3, 2]) when [ *([4,2]) when [ *([5, 2]) when <>( 0,0) <>( 1,0) <>( 0,0) <>( 1,0) else []] else []] else []] else []]
  • 19. [ *([2 ,2]) when [ *([3, 2]) when [ *([4,2]) when [ *([5, 2]) when False True False True else []] else []] else []] else []] [ [] *([3, 2]) [] *([5, 2]) ] [ [] 6 [] 10 ] [6,10] Figure 1. Double-Odds. With the irregular construct one can, when necessary, produce recursive applications of a function: T=[list(Consume(s_1), Produce(next)) where next = [ [list, -([s_1, 1])] when >(s_1,0) else [s_1]] [4] Recursion involves a function placing itself back into T. See figure 2 for a trace of the function. [list(Consume(s_1), Produce(next)) where next = [ [list, -([s_1, 1])] when >(s_1,0) else [s_1]] [2] [ [list, -([2, 1])] when >(2,0) else [2]] [list(Consume(s_1), Produce(next)) where next = [ [list, -([s_1, 1])] when >(s_1,0) else [s_1]] [1] [ [list, -([1, 1])] when >(1,0) else [1]] [list(Consume(s_1), Produce(next)) where next = [ [list, -([s_1, 1])] when >(s_1,0) else [s_1]] [0] [ [list, -([0, 1])] when >(0,0) else [0]]
  • 20. [0] Figure 2. List function. SequenceL, represents an attempt to develop a language abstraction in which control structures – sequential, iterative, and parallel – are implied. In summary, the SequenceL abstraction presents a paradigm with the following aspects: •= a Global Memory that does not distinguish between operators and data, and whose state fully reflects the state of computation; •= All operands are sequences where atoms are represented by singleton sequences; •= Underlying Consume-Simplify-Produce execution strategy; and •= High level constructs to process nonscalars, applied in the simplification step. For more detail about SequenceL please see [7, 24] Consider now a matrix multiply function as written in SequenceL: Function matmul(Consume(s_1(n,*),s_2(*,m),Produce(next)) where next = {com- pose([+([*(succ_1(i,*),succ_2(*,j))])])} Taking [i,j]From cartesian_product([gen([1,...,n]),gen([1,...,m])]) ([[2,4,6],[3,5,7],[1,1,1]], [[2,4,6],[3,5,7],[1,1,1]]) The term, [[2,4,6],[3,5,7],[1,1,1]] is the sequence representation of a matrix. We will now trace the steps the SequenceL interpreter takes in order to evaluate the function above. Bear in mind the only information provided by the user is the function above, together with its arguments. The user does not specify any control structures – not even the parallel paths that can be followed in solving the problem. These parallel paths are implied in the solution and derived by the SequenceL semantic. The semantic is an interactions between the CSP-cycle and the SequenceL constructs applied in the Simplify step. Trace of the Evaluation of the SequenceL Matrix Multiply. For simplicity, let T contain only the matrix multiply function and its input sequences as seen above. The first step is to instantiate the variables s_1 and
  • 21. s_2. At the same time, variables n and m obtain the cardinal values in the designated sequence dimensions: {compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*), [[2,4,6],[3,5,7],[1,1,1]](*,j))])])} Taking[i,j]From cartesian_product([gen([1,...,3]),gen([1,...,3])]) Now SequenceL’s generative construct produces the values needed in the Taking clause. The simple form of the generative command is seen in this example. It simply fills in the integer values between the upper and lower bounds, i.e., from 1 to 3: {compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*), [[2,4,6],[3,5,7],[1,1,1]](*,j))])])} Taking [i,j]From cartesian_product([[1,2,3],[1,2,3]]) Next the SequenceL Cartesian Product generates the values for subscripts i and j. {compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*), [[2,4,6],[3,5,7],[1,1,1]](*,j))])])} Taking [i,j]From [ [[1], [1]], [[1], [2]], [[1], [3]], [[2], [1]], [[2], [2]], [[2], [3]], [[3], [1]], [[3], [2]], [[3], [3]] ] Now that the simplification of the function is complete, the function’s result is produced in T: T=[ [+([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))]) || +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))]) || +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ] || [+([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))]) || +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))]) || +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ] || [+([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))]) || +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))]) || +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ] ] The next Consume-Simplify-Produce (CSP) step replaces the tableau above with the one below. This simplification step results in the parallel selection of the vectors to be multiplied. Concurrent evaluation is denoted by the || symbol.
  • 22. T= [ [ +([*([[2,4,6],[2,3,1]])]) || +([*([[2,4,6],[4,5,1]])]) || +([*([[2,4,6],[6,7,1]])]) ] || [ +([*([[3,5,7],[2,3,1]])]) || +([*([[3,5,7],[4,5,1]])]) || +([*([[3,5,7],[6,7,1]])]) ] || [ +([*([[1,1,1],[2,3,1]])]) || +([*([[1,1,1],[4,5,1]])]) || +([*([[1,1,1],[6,7,1]])]) ] ] In the next CSP step, the products are formed concurrently using SequenceL’s regular construct. The regular process distributes a built-in operator, e.g., the * operation, among corresponding elements of the operands, resulting in 27 parallel multiplies: T= [ [ +([*([[2], [2]]) || *([[4], [3]]) || *([[6], [1]])]) || +([*([[2], [4]]) || *([[4], [5]]) || *([[6], [1]])]) || +([*([[2], [6]]) || *([[4], [7]]) || *([[6], [1]])]) ] || [ +([*([[3], [2]]) || *([[5], [3]]) || *([[7], [1]])]) || +([*([[3], [4]]) || *([[5], [5]]) || *([[7], [1]])]) || +([*([[3], [6]]) || *([[5], [7]]) || *([[7], [1]])]) ] || [ +([*([[1], [2]]) || *([[1], [3]]) || *([[1], [1]])]) || +([*([[1], [4]]) || *([[1], [5]]) || *([[1], [1]])]) || +([*([[1], [6]]) || *([[1], [7]]) || *([[1], [1]])]) ] ] Comparing the tableau above and the resulting tableau below, one can see that SequenceL handles nested parallelisms automatically. The final step of simplification involves the application of the regular construct to form the sums of the products. T= [ [ +([4,12,6]) || +([8,20,6]) || +([12,28,6]) ] || [ +([6,15,7]) || +([12,25,7]) || +([18,35,7]) ] || [ +([2,3,1]) ||
  • 23. +([4,5,1]) || +([6,7,1]) ] ] This regular process adds together corresponding elements of the operand sequences. E.G., +([4,12,6]) = [4+12+6] = [22]. The final result is: [[22, 34, 46], [28, 44, 60], [6, 10, 14]] Since no further simplification is possible, evaluation ends. There are three SequenceL constructs: the regular, irregular, and generative. The matrix multiply employed the regular and the generative. For more detailed explanation of these constructs see [7, 24] One important aspect of this language is that through the introduction of new language constructs, one can imply most control structures – even concurrent or parallel structures. Therefore, rather than producing an algorithm that implies a data product, one can come closer to specifying a data product that implies the algorithm that produces or processes it. The difficult part of traditional forms of programming seems to be centered around the fact that programmers have to somehow envision the elusive data product implied by their programs. Some additional language experiments involving SequenceL. SequenceL is a Turing complete language. An interpreter exists that finds the parallel structures inherent in SequenceL problem solutions. The Matrix Multiply example represents a fairly straightforward problem solution insofar as the parallel paths behave independently of one another. In other words, once the parallelisms are known, the paths can be spawned and joined together with each path contributing its part of the final solution without knowledge of what the other paths have computed. Examples of problems where there are computed, intermediate results that need broadcasting have also been explored using the SequenceL interpreter. For example, the Forward Processing in the Gaussian Elimination Solution (the three SequenceL functions below) of systems of linear equations has been executed with good results in terms of finding inherent parallelisms. With three or more equations, intermediate results must be known to all paths in order to produce the final result.
  • 24. Finally, in terms of scheduling, both the matrix multiply and the Gaussian Codes are examples of problems for which static a-priori schedules can be generated. The paths of execution can be determined based upon the dimensions of the matrix, in the matrix multiply, and the number of equations, in the Gaussian code. The Quicksort problem was run as an example of a problem where dynamic scheduling is necessary. [25] Summary In this chapter an overview of computer language design was presented. A scientific backdrop was provided for the presentation from which it is easier to understand the exchange between theoretical and experimental language research. The backdrop also makes the distinction between ordinary and extraordinary language research easier to make. The impact of the scientific approach is then presented in the summary on the development of SequenceL. REFERENCES [1] Alan Turing, “On Computable Numbers, with an Application to the Entscheidungsproblem,” Proc. London Math Soc, 2: 42, 230-265. [2] Thomas Kuhn, The Structure of Scientific Revolutions, Chicago University Press, 1962. [3] N. Wirth, "On the Design of Programming Languages", Proc. IFIP Congress 74, North-Holland Publishing Co., (pp. 386-393). [4] David Parnas, "On Criteria To Be Used in Decomposing Systems Into Modules," CACM, vol. 14 no. 1, April, 1972, pp. 221-227. [5] J. Backus, “Can Programming Be Liberated from the von Neumann Style? A Functional Style and its Algebra of Programs,” Communications of the ACM, vol. 21, no. 8, August 1978, 613-641. [6] R. Kowalski, “Algorithm = Logic + Control,” Communications of the ACM, vol. 22, no. 7, July 1979, 424-436. [7] D. Cooke, “An Introduction to SEQUENCEL: A Language to Experiment with Nonscalar Constructs,” Software Practice and Experience, Vol. 26, No. 11, November 1996, 1205-1246. [8] G. Blelloch, "Programming Parallel Algorithms," Communications of the ACM, Vol. 39, No. 3, March 1996, pp. 85-97.
  • 25. [9] Daniel I.A. Cohen, Introduction to Computing Theory, John Wiley and Sons, Inc., New York, 1986. [10] P. Zave, "An Insider's Evaluation of PAISLey," IEEE Transactions on Software Engineering, Vol. 17, No.3, March, 1991 (pp.212-225). [11] N. Wirth and J. Gutknecht, Project OBERON The Design of an Operating System and Compiler, Addison-Wesley Publishing Company, Reading, MA, 1992. [12] McCarthy, J., "Recursive Functions of Symbolic Expressions and Their Computation by Machine," Communications of the ACM, vol. 3, no. 4 (April, 1960), pp. 184-195. [13] Paul Hudak, The Haskell School of Expression: Learning Functional Programming through Multimedia, Cambridge University Press, New York, 2000. [14] C.A.R. Hoare, “Communicating Sequential Processes,” Communications of the ACM, Vol. 21 No. 8, August, 1978 (pp.666-677). [15] E.W. Dijkstra, "Guarded Commands," Communications of the ACM, 18:8, 453- 457. [16] D.Gelernter, "Generative Communications in Linda," ACM Transactions on Programming Languages and Systems, Vol. 7, No. 1, 1985, pp. 80-112. [17] K. Iverson, . A Programming Language, Wiley, New York (1962). [18] K. Iverson, J Introduction and Dictionary, Iverson Software Inc. Toronto, 1994. [19] J.P. Banatre and D. Le Matayer, "Programming by Multiset Transformation," Communications of the ACM, Vol. 36 No. 1, January 1993, pp.98-111. [20] V. Breazu-Tannen, P. Buneman, and S. Naqvi , "Structural Recursion as a Query Language," In Proceedings of 3rd International Conference on Database Programming Languages, Nafplion, Greece, Morgan Kaufmann, pp. 9-19. [21] C. Hankin, D. Le Metayer, and D. Sands, A Calculus of Gamma Programs, Publication Interne no 674, Juillet, 1992, IRISA, France. [22] J. Sipelstein and G. Blelloch, Collection-Oriented Languages, Report, School of Computer Science, Carnegie Mellon University, CMU-CS-90-127. [23] D. Suciu, Parallel Programming Languages for Collections, Ph.D. Dissertation in Computer and Information Science, University of Pennsylvania, 1995. [24] D. Cooke, “SequenceL Provides a Different way to View Programming,” Computer Languages 24 (1998) 1-32. [25] Daniel E. Cooke and Per Andersen, “Automatic Parallel Control Structures in SequenceL,” Software Practice and Experience, Volume 30, Issue 14, (November 2000), 1541-1570.