2. UNIT II GRID SERVICES
Introduction to Open Grid Services Architecture
(OGSA) – Motivation – Functionality Requirements –
Practical & Detailed view of OGSA/OGSI – Data
intensive grid service models – OGSA services.
4. The Hourglass Model
Focus on architecture issues
Propose set of core services as basic
infrastructure
Used to construct high-level, domain-specific
solutions (diverse)
Design principles
Keep participation cost low
Enable local control
Support for adaptation
“IP hourglass” model (figure): Applications at the top, diverse global services below them, core services at the narrow waist, local OS at the base
5. Layered Grid Architecture
(By Analogy to Internet Architecture)
Application
Fabric
“Controlling things locally”: Access to, & control
of, resources
Connectivity
“Talking to things”: communication (Internet
protocols) & security
Resource
“Sharing single resources”: negotiating access,
controlling use
Collective
“Coordinating multiple resources”: ubiquitous
infrastructure services, app-specific distributed
services
Internet Protocol Architecture (shown for comparison): Application, Transport, Internet, Link
6. We define Grid architecture in terms of a layered collection of protocols.
• The Fabric layer includes the protocols and interfaces that provide access to the resources that are being shared, including computers, storage systems, datasets, programs, and networks. This layer is a logical view rather than a physical view. For example, the view of a cluster with a local resource manager is defined by the local resource manager, not by the cluster hardware. Likewise, the fabric provided by a storage system is defined by the file system that is available on that system, not by the raw disks or tapes.
• The Connectivity layer defines core protocols required for Grid-specific network transactions. This layer includes the IP protocol stack (system-level application protocols [e.g., DNS, RSVP, routing], transport and internet layers), as well as core Grid security protocols for authentication and authorization.
• The Resource layer defines protocols to initiate and control sharing of (local) resources. Services defined at this level include the gatekeeper and GRIS, along with some user-oriented application protocols from the Internet protocol suite, such as file transfer. (GRIS, the Grid Resource Information Service, is the repository of local resource information derived from information providers.)
• The Collective layer defines protocols that provide system-oriented capabilities that are expected to be wide scale in deployment and generic in function. This includes GIIS, bandwidth brokers, and resource brokers. (GIIS, the Grid Index Information Service, is a centralized MDS server that provides information about all of your resources. MDS here is the Globus Monitoring and Discovery Service.)
• The Application layer defines protocols and services that are parochial in nature, targeted towards a specific application domain or class of applications.
7. Example:
Data Grid Architecture
App: Discipline-Specific Data Grid Application
Collective (App): Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …
Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs
Resource: Access to data, access to computers, access to network performance data, …
Connect: Communication, service discovery (DNS), authentication, authorization, delegation
Fabric: Storage systems, clusters, networks, network caches, …
9. Simulation tool
GridSim is a Java-based toolkit for modeling and simulating
distributed resource management and scheduling in conventional
Grid environments.
GridSim is based on SimJava, a general purpose discrete-event
simulation package implemented in Java.
All components in GridSim communicate with each other through
message passing operations defined by SimJava.
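As a rough illustration of the discrete-event, message-passing style described above, the sketch below keeps a time-ordered event queue and delivers messages between named entities. It is written in Python and is not GridSim or SimJava code; the names Simulation, register, send, and run are hypothetical, as are the job fields.

```python
import heapq
from itertools import count


class Simulation:
    """Tiny discrete-event kernel; events are (time, seq, destination, payload)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = count()      # tie-breaker so equal-time events stay in FIFO order
        self._handlers = {}      # entity name -> callback(sim, payload)
        self.log = []

    def register(self, name, handler):
        self._handlers[name] = handler

    def send(self, dest, payload, delay=0.0):
        """Schedule a message for `dest`, arriving `delay` time units from now."""
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), dest, payload))

    def run(self):
        """Deliver events in time order until the queue is empty."""
        while self._queue:
            self.now, _, dest, payload = heapq.heappop(self._queue)
            self.log.append((self.now, dest, payload))
            self._handlers[dest](self, payload)


def resource(sim, job):
    # The resource "executes" the job, then notifies the user when it finishes.
    sim.send("user", f"{job['id']} done", delay=job["length"] / job["mips"])


def user(sim, message):
    pass  # a fuller user entity would collect results here


sim = Simulation()
sim.register("resource", resource)
sim.register("user", user)
sim.send("resource", {"id": "job-1", "length": 1000.0, "mips": 100.0}, delay=1.0)
sim.run()
```

The entities never call each other directly; they only exchange timestamped messages through the kernel, which is the property the slide attributes to SimJava.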
10. Salient features of the GridSim
It allows modeling of heterogeneous types of resources.
Resources can be modeled operating under space- or time-shared
mode.
Resource capability can be defined in the form of a MIPS (Million
Instructions Per Second) benchmark rating.
Resources can be located in any time zone.
Weekends and holidays can be mapped depending on resource’s local
time to model non-Grid (local) workload.
Resources can be booked for advance reservation.
Applications with different parallel application models can be simulated.
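To make the space-shared versus time-shared distinction concrete, here is a minimal Python sketch for a single processing element rated in MIPS. It assumes all jobs arrive at time zero and that job lengths are given in millions of instructions; both function names are hypothetical and this is not GridSim's own allocation code.

```python
def space_shared_finish_times(lengths_mi, mips):
    """First come, first served: each job runs alone, in submission order."""
    finish, clock = [], 0.0
    for length in lengths_mi:
        clock += length / mips
        finish.append(clock)
    return finish


def time_shared_finish_times(lengths_mi, mips):
    """Processor sharing: all unfinished jobs split the MIPS rating equally."""
    remaining = {i: float(length) for i, length in enumerate(lengths_mi)}
    finish = [0.0] * len(lengths_mi)
    clock = 0.0
    while remaining:
        share = mips / len(remaining)            # MIPS available to each active job
        i = min(remaining, key=remaining.get)    # job that will finish next
        work = remaining[i]                      # work every active job completes meanwhile
        clock += work / share
        for j in remaining:
            remaining[j] -= work
        finish[i] = clock
        del remaining[i]
    return finish


space = space_shared_finish_times([100, 300], mips=100)   # [1.0, 4.0]
shared = time_shared_finish_times([100, 300], mips=100)   # [2.0, 4.0]
```

In this example the short job finishes later under time sharing (2.0 instead of 1.0), while the last job finishes at the same time in both modes because the total work is unchanged.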
11. Salient features of the GridSim
Application tasks can be heterogeneous and they can be CPU or I/O
intensive.
There is no limit on the number of application jobs that can be submitted to
a resource.
Multiple user entities can submit tasks for execution simultaneously in the
same resource, which may be time-shared or space-shared. This feature
helps in building schedulers that can use different market-driven economic
models for selecting services competitively.
Network speed between resources can be specified.
It supports simulation of both static and dynamic schedulers.
Statistics of all or selected operations can be recorded and they can be
analyzed using GridSim statistics analysis methods.
12. A Modular Architecture for GridSim Platform and Components
(Layers of the architecture figure, from top to bottom)
Application, User, Grid Scenario's input and Results: Application Configuration, Resource Configuration, User Requirements, Grid Scenario, Output
Grid Resource Brokers or Schedulers
GridSim Toolkit: Application Modeling, Resource Entities, Information Services, Job Management, Resource Allocation, Statistics
Resource Modeling and Simulation: Single CPU, SMPs, Clusters, Load, Network, Reservation
Basic Discrete Event Simulation Infrastructure: SimJava, Distributed SimJava
Virtual Machine
Distributed Resources: PCs, Workstations, Clusters, SMPs
13. What is the OGSA Standard?
OGSA stands for Open Grid Services Architecture.
OGSA defines how the different components in a grid interact.
Open Grid Services Architecture (OGSA) is a set of standards
defining the way in which information is shared among diverse
components of large, heterogeneous grid systems.
In this context, a grid system is a scalable wide area network (WAN) that
supports resource sharing and distribution.
14. Major goals of OGSA
Identify the use cases that can drive the OGSA platform
components.
Identify and define the core OGSA platform components.
Define hosting and platform specific bindings.
Define resource models and resource profiles with interoperable
solutions.
15. Functional requirements of OGSA.
Interoperability and Support for Dynamic and Heterogeneous
Environments
Resource Sharing Across Organizations
Optimization
Quality of Service (QoS) Assurance
Job Execution
Data Services
Security
Administrative Cost Reduction
Scalability
Availability
Ease of Use and Extensibility
16. Architecture of OGSA
Comprised of 4 main layers
1. Physical and Logical Resources Layer
2. Web Service Layer
3. OGSA Architected Grid Services Layer
4. Grid Applications Layer
19. OGSA Architecture - Web Services Layer
A web service is software available online that can interact with other
software using XML.
This layer contains the Open Grid Services Infrastructure (OGSI) sub-layer, which
specifies grid services and provides a consistent way to interact with
them.
OGSI also extends web service capabilities.
It consists of 5 interfaces:
1. Factory: provides a way to create new grid services
2. Life Cycle: manages grid service life cycles
3. State Management: manages grid service states
4. Service Groups: collections of indexed grid services
5. Notification: manages notification between services & resources
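The five interfaces listed above can be pictured with a small Python sketch. This is only an illustration of the roles; the class and method names (GridServiceInstance, Factory, create_service, and so on) are hypothetical and are not the operation names defined by the OGSI specification.

```python
import itertools


class GridServiceInstance:
    """A stateful service instance with a lifetime, state, and subscribers."""

    def __init__(self, handle, lifetime):
        self.handle = handle
        self.remaining_lifetime = lifetime   # Life Cycle: instance expires unless renewed
        self.service_data = {}               # State Management: named state values
        self._subscribers = []               # Notification: callbacks to inform

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set_service_data(self, name, value):
        """Update state and notify every subscriber of the change."""
        self.service_data[name] = value
        for notify in self._subscribers:
            notify(self.handle, name, value)

    def tick(self, elapsed):
        """Advance time; return True while the instance is still alive."""
        self.remaining_lifetime -= elapsed
        return self.remaining_lifetime > 0


class Factory:
    """Factory: creates instances. Its `group` dict plays the Service Group role."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.group = {}                      # handle -> instance

    def create_service(self, lifetime):
        handle = f"service-{next(self._ids)}"
        self.group[handle] = GridServiceInstance(handle, lifetime)
        return handle

    def expire(self, elapsed):
        """Drop instances whose lifetime has run out."""
        for handle in list(self.group):
            if not self.group[handle].tick(elapsed):
                del self.group[handle]


events = []
factory = Factory()
h = factory.create_service(lifetime=10)
factory.group[h].subscribe(lambda handle, name, value: events.append((handle, name, value)))
factory.group[h].set_service_data("status", "running")
factory.expire(elapsed=15)                   # lifetime exceeded, instance is removed
```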
21. OGSA Architecture – OGSA Architected Services Layer
Classified into 3 service categories
1. Grid Core Services
2. Grid Program Execution Services
3. Grid Data Services
22. OGSA Architected Services – Grid Core Services
Composed of 4 main types of services:
1. Service Management: assist in installation, maintenance, &
troubleshooting tasks in grid system
2. Service Communication: include functions that allow grid
services to communicate
3. Policy Services: Provide framework for creation,
administration & management of policies for system operation
4. Security Services: provide authentication & authorization
mechanisms to ensure systems interoperate securely
23. OGSA Architected Services – Grid Program
Execution Services
Supports unique grid systems in high performance
computing, collaboration, parallelism
Support virtualization of resource processing
24. OGSA Architected Services – Grid Data Services
Support data virtualization
Provide mechanism for access to distributed
resources such as databases, files
26. OGSA Architecture – Grid Applications Layer
This layer comprises the applications that use the
grid architected services.
28. Interoperability and Support for Dynamic and Heterogeneous
Environments
The need to support heterogeneous systems leads to requirements that include the
following:
• Resource virtualization. Essential to reduce the complexity of managing
heterogeneous systems and to handle diverse resources in a unified way.
• Common management capabilities. Simplifying administration of a heterogeneous
system requires mechanisms for uniform and consistent management of resources.
A minimum set of common manageability capabilities is required.
• Resource discovery and query. Mechanisms are required for discovering resources
with desired attributes and for retrieving their properties. Discovery and query
should handle a highly dynamic and heterogeneous system.
• Standard protocols and schemas. Important for interoperability. In addition, standard
protocols are also particularly important as their use can simplify the transition to
using Grids.
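Resource discovery and query, as described above, can be sketched as a filter over a registry of resource descriptions. The registry layout, the attribute names, and the discover function below are assumptions made for illustration; OGSA does not prescribe this interface.

```python
def discover(registry, **required):
    """Return names of resources that satisfy every requirement.

    Numeric requirements are treated as minimums; any other value must match exactly.
    """
    matches = []
    for resource in registry:
        for attribute, wanted in required.items():
            have = resource.get(attribute)
            if have is None:
                break                        # attribute not advertised
            if isinstance(wanted, (int, float)):
                if have < wanted:
                    break                    # not enough capacity
            elif have != wanted:
                break                        # wrong value
        else:
            matches.append(resource["name"]) # no requirement failed
    return matches


registry = [
    {"name": "cluster-a", "os": "linux", "cpus": 64, "free_disk_gb": 500},
    {"name": "smp-b", "os": "linux", "cpus": 8, "free_disk_gb": 2000},
    {"name": "pc-c", "os": "windows", "cpus": 4, "free_disk_gb": 100},
]
found = discover(registry, os="linux", cpus=16)
```

Because every resource is described with the same attribute names, one query works across heterogeneous machines, which is the point of the standard schemas requirement.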
29. Resource Sharing Across Organizations
One major purpose of OGSA is to support resource sharing and utilization across
administrative domains, whether different work units within an enterprise or even
different institutions.
Resource sharing requirements include:
• Global name space. To ease data and resource access. OGSA entities should be
able to access other OGSA entities transparently, subject to security constraints,
without regard to location or replication.
• Metadata services. Important for finding, invoking, and tracking entities. It should
be possible to allow for access to and propagation, aggregation, and management
of entity metadata across administrative domains.
• Site autonomy. Mechanisms are required for accessing resources across sites
while respecting local control and policy
• Resource usage data. Mechanisms and standard schemas for collecting and
exchanging resource usage (i.e., consumption) data across organizations—for the
purpose of accounting, billing, etc.
30. Optimization
Optimization refers to techniques used to allocate resources effectively to meet
consumer and supplier requirements. Optimization applies to both suppliers
(supply-side) and consumers (consume-side) of resources and services
Quality of Service (QoS) Assurance
Services such as job execution and data services must provide the agreed-upon
QoS. Key QoS dimensions include, but are not limited to, availability, security, and
performance.
QoS assurance requirements include:
• Service level agreement. QoS should be represented by agreements which are established
through negotiation between service requester and provider prior to service execution.
Standard mechanisms should be provided to create and manage agreements.
• Service level attainment. If the agreement requires attainment of Service Level, the resources
used by the service should be adjusted so that the required QoS is maintained. Therefore,
mechanisms for monitoring service quality, estimating resource utilization, and planning for
and adjusting resource usage are required.
• Migration. It should be possible to migrate executing services or applications to adjust
workloads for performance or availability
31. Job Execution
Functions such as scheduling, provisioning, job control and exception handling of
jobs must be supported, even when the job is distributed over a great number of
heterogeneous resources.
Job execution requirements include:
• Support for various job types. Execution of various types of jobs must be supported
including simple jobs and complex jobs such as workflow and composite services.
• Job management. It is essential to be able to manage jobs during their entire
lifetimes, including various types of groupings of jobs (e.g., workflows, job arrays). Mechanisms are
also required for controlling the execution of individual job steps as well as
orchestration or choreography services.
• Scheduling. The ability to schedule and execute jobs based on such information as
specified priority and current allocation of resources is required. It is also required to
realize mechanisms for scheduling across administrative domains, using multiple
schedulers.
• Resource provisioning. To automate the complicated process of resource
allocation, deployment, and configuration. It must be possible to deploy the required
applications and data to resources and configure them automatically.
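The scheduling requirement (placing jobs according to specified priority and the current allocation of resources) can be sketched as follows. The job tuples, the site names, and the "most free CPUs" placement rule are assumptions chosen for illustration, not a policy defined by OGSA.

```python
import heapq


def schedule(jobs, free_cpus):
    """Place jobs in priority order on the resource with the most free CPUs.

    jobs: list of (priority, job_id, cpus_needed); a lower number means higher priority.
    free_cpus: dict mapping resource name -> CPUs currently available.
    Returns (placement, unplaced).
    """
    free = dict(free_cpus)          # work on a copy of the current allocation
    queue = list(jobs)
    heapq.heapify(queue)
    placement, unplaced = {}, []
    while queue:
        _, job_id, need = heapq.heappop(queue)
        target = max(free, key=free.get)
        if free[target] >= need:
            free[target] -= need
            placement[job_id] = target
        else:
            unplaced.append(job_id)  # no single resource can host this job now
    return placement, unplaced


jobs = [(2, "render", 4), (1, "urgent", 8), (3, "batch", 16)]
placement, unplaced = schedule(jobs, {"site-a": 8, "site-b": 12})
```

Here "urgent" is placed first on site-b, "render" then goes to site-a, and "batch" stays unplaced because neither site has 16 free CPUs left.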
32. Data Services
Data services require support for the sharing and integration of distributed
data, for example enabling access to information stored in
databases that are managed and administered independently,
with appropriate security assurances.
Data services requirements include:
• Policy specification & management. Examples include
specification of who can access data, where data will be required,
what transformations are permitted on the data, whether use is
exclusive, what performance or availability is required, how much
resources can be used, what consistency is mandated between
replicas, and similar constraints.
• Data storage. Disk and tape systems, amongst others, store
33. • Data access. Easy and efficient access to various
types of data (such as databases, files, streams, and
integrated/federated data) through a uniform set of interfaces is
required, independent of physical location or platform, by
abstracting the underlying data resources.
• Data transfer. High-bandwidth transfer of data is required,
independent of the physical attributes of the data sources and
sinks, which can exploit relevant features of those sources and
sinks if required.
• Data location management. These services manage where data
is physically located. OGSA should support multiple methods for
making data available to a client at a given location, according to
34. • Data update. Although some data resources are read only, many if
not most provide some users with update privileges. OGSA must
provide update facilities which ensure that the specified consistency
can be maintained when cached or replicated data is modified.
• Data persistency. Data should be preserved according to specified
policy and its association with its metadata should be maintained in
accordance with that policy. It should be possible to use one of
many possible persistency models.
• Data federation. OGSA should support data integration for
heterogeneous and distributed data. Heterogeneous data includes
data organized according to different schemas and data stored
using different technologies (e.g., relational vs. flat file).
35. Security
Safe administration requires controlling access to services
through robust security protocols and according to provided
security policy.
Security requirements include:
• Authentication and authorization.
• Multiple security infrastructures. Distributed operation implies a
need to integrate and interoperate with multiple security
infrastructures. OGSA needs to integrate and interoperate with
existing security architectures and models.
• Perimeter security solutions. Resources may have to be
accessed across organizational boundaries, without
compromising local security mechanisms, such as firewall policy
36. • Isolation. Various kinds of isolation must be ensured, such as
isolation of users, performance isolation, and isolation between
content offerings within the same Grid system.
• Delegation. Mechanisms that allow for delegation of access rights
from service requestors to service providers are required. The risk
of misuse of delegated rights must be minimized
• Security policy exchange. Service requestors and providers
should be able to exchange dynamically security policy information
to establish a negotiated security context between them.
• Intrusion detection, protection, and secure logging. Strong
monitoring is required for intrusion detection and identification
of misuses, malicious or otherwise, including virus or worm attacks.
37. Administrative Cost Reduction
The complexity of administering large-scale distributed,
heterogeneous systems increases administration costs and the risk
of human errors
Policy-based management is required to automate Grid system control,
so that its operations conform to the goals of the organization that
operates and utilizes the Grid system.
Application contents management mechanisms can facilitate the
deployment, configuration, and maintenance of complex systems, by
allowing all application-related information to be specified and managed
as a single logical unit.
Problem determination mechanisms are needed, so that administrators
can recognize and cope quickly with emerging problems.
38. Scalability: A large-scale Grid system can create added value, such as drastically
reducing job turnaround (or elapsed) time by allowing a huge number of
resources to be utilized, thereby enabling new services.
Availability:
Mean time to repair (MTTR); heterogeneity of the Grid
Disaster recovery mechanisms are needed so that the operation of a Grid system can
be recovered quickly and efficiently in case of natural or human-caused disaster,
avoiding long-term service disruption. Remote backup and simplified or automated
recovery procedures are required.
Fault management mechanisms are required so that running jobs are not lost
because of resource faults. Mechanisms are required for monitoring, fault detection,
and diagnosis of causes or impacts on running jobs. In addition, automation of fault
handling, using techniques such as checkpoint recovery, is desirable.
Ease of Use and Extensibility: Mechanism and policy must be realized via extensible
and replaceable components, to permit OGSA to evolve over time and allow users to
construct their own mechanisms and policies to meet specific needs.
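Checkpoint recovery, mentioned above as a fault-handling technique, can be sketched in a few lines of Python. The job here simply sums a list, the checkpoint is an in-memory dict, and the fault is injected through a hypothetical fail_at parameter; a real system would persist the checkpoint to stable storage.

```python
class ResourceFault(Exception):
    """Raised to simulate the loss of the resource running the job."""


def run_job(work_items, checkpoint, fail_at=None):
    """Sum work_items, resuming from checkpoint = {'next': index, 'total': partial sum}."""
    for i in range(checkpoint["next"], len(work_items)):
        if i == fail_at:
            raise ResourceFault(f"resource failed before item {i}")
        checkpoint["total"] += work_items[i]
        checkpoint["next"] = i + 1       # record progress after every completed step
    return checkpoint["total"]


items = [5, 10, 15, 20]
state = {"next": 0, "total": 0}
try:
    run_job(items, state, fail_at=2)     # fault after two items are processed
except ResourceFault:
    pass                                 # fault detected; state holds the last checkpoint
result = run_job(items, state)           # resume without repeating finished work
```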
39. Conclusion
Grid computing allows networked resources to be combined
and used.
Grid computing offers great benefit to an organization.
OGSA is a comprehensive set of standards that governs grid
computing.
40. Open Grid Services Infrastructure (OGSI)
OGSI gives a formal and technical specification of what a grid
service is, covering the concepts described in OGSA.
It is an excruciatingly (that is, exceedingly) detailed
specification of how grid services work.
The Globus Toolkit 3 (GT3) includes a complete implementation of OGSI.
Some other implementations are OGSI::Lite (Perl) and the
UNICORE OGSA demonstrator from the EU GRIP project.
The OGSI specification defines grid services and builds upon web
services.
42. The Open Grid Services Infrastructure (OGSI) was published by
the Global Grid Forum (GGF) as a proposed
recommendation in June 2003. It was intended to provide an
infrastructure layer for the Open Grid Services Architecture
(OGSA).
46. Open Grid Services Infrastructure (OGSI)
OGSI creates an extension model for WSDL called GWSDL (Grid
WSDL). The reasons are:
Interface inheritance
Service Data (for expressing state information)
Components:
Lifecycle
State management
Service Groups
Factory
Notification
Handle Map
47. OSGi Service Gateway Architecture
OSGi (Open Services Gateway initiative; not to be confused with OGSI) is a Java framework
for developing and deploying modular software programs and libraries.
Each bundle is a tightly coupled, dynamically loadable collection of classes, jars, and
configuration files that explicitly declares its external dependencies (if any).
48. The framework is conceptually divided into the following areas:
Bundles: Bundles are normal jar components with extra manifest headers.
Services: The services layer connects bundles in a dynamic way by offering
a publish-find-bind model for Plain Old Java Interfaces (POJI) or Plain Old
Java Objects (POJO).
Services Registry: The application programming interface for management
services (ServiceRegistration, ServiceTracker and ServiceReference).
Life-Cycle: The application programming interface for life cycle
management (install, start, stop, update, and uninstall) for bundles.
Modules: The layer that defines encapsulation and declaration of
dependencies (how a bundle can import and export code).
Security: The layer that handles the security aspects by limiting bundle
functionality to pre-defined capabilities.
49. Execution Environment: Defines which methods and classes are
available on a specific platform. There is no fixed list of execution
environments, since it is subject to change as the Java Community
Process creates new versions and editions of Java.
However, the following set is currently supported by most OSGi
implementations:
CDC-1.0/Foundation-1.0
CDC-1.1/Foundation-1.1
OSGi/Minimum-1.0
OSGi/Minimum-1.1
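The publish-find-bind model of the services layer can be pictured with a toy registry. The sketch is in Python rather than Java and does not use the OSGi API; ServiceRegistry, publish, find, and unpublish are hypothetical names chosen for illustration.

```python
class ServiceRegistry:
    """Maps an interface name to the providers currently published under it."""

    def __init__(self):
        self._providers = {}

    def publish(self, interface, provider):
        self._providers.setdefault(interface, []).append(provider)

    def unpublish(self, interface, provider):
        self._providers[interface].remove(provider)

    def find(self, interface):
        return list(self._providers.get(interface, []))


class EnglishGreeter:
    def greet(self, name):
        return f"Hello, {name}"


registry = ServiceRegistry()
greeter = EnglishGreeter()
registry.publish("Greeter", greeter)        # publish: provider registers itself
providers = registry.find("Greeter")        # find: consumer looks up by interface
message = providers[0].greet("grid")        # bind: consumer calls the provider
registry.unpublish("Greeter", greeter)      # providers may disappear at run time
```

The consumer depends only on the interface name, so providers can be swapped or withdrawn dynamically, which is the behavior the services layer description emphasizes.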
50. Data intensive grid service models
Applications in the grid are normally grouped into two categories:
computation-intensive and data-intensive.
Data-intensive applications deal with massive amounts of data.
The grid system must be specially designed to discover, transfer,
and manipulate these massive data sets.
Transferring massive data sets is a time-consuming task.
Data replication, a data access method also known as caching, is often
applied to enhance data efficiency in a grid environment.
By replicating the same data blocks and scattering them across
multiple regions of a grid, users can access the same data with
locality of reference.
51. Data intensive grid service models
Replication strategies determine when and where to create a replica of the data.
Replication strategies can be classified as static or dynamic.
Static strategies
The locations and number of replicas are determined in advance and are not modified.
Replication operations require little overhead.
Static strategies cannot adapt to changes in demand, bandwidth, or storage availability.
Optimization is required to determine the location and number of data replicas.
Dynamic strategies
Dynamic strategies can adjust the locations and number of data replicas according to changes in
conditions.
Frequent data-moving operations can result in much more overhead than in static strategies.
Optimization may be determined based on whether the data replica is being created,
deleted, or moved.
The most common replication strategies include preserving locality, minimizing update costs, and
maximizing profits.
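One simple dynamic strategy that preserves locality is to create a replica at a site once that site has requested the data often enough. The Python sketch below assumes a fixed access-count threshold; the class name, the site names, and the threshold rule are illustrative assumptions, not a strategy prescribed by the source.

```python
from collections import Counter


class DynamicReplicator:
    """Creates a replica at a site after `threshold` remote accesses from it."""

    def __init__(self, home_site, threshold):
        self.replicas = {home_site}
        self.threshold = threshold
        self.remote_hits = Counter()

    def access(self, site):
        """Record an access from `site`; return 'local' or 'remote'."""
        if site in self.replicas:
            return "local"
        self.remote_hits[site] += 1
        if self.remote_hits[site] >= self.threshold:
            self.replicas.add(site)          # place a replica near the demand
            del self.remote_hits[site]
        return "remote"


replicator = DynamicReplicator(home_site="cern", threshold=3)
results = [replicator.access("tokyo") for _ in range(4)]
```

The first three accesses from the new site are served remotely; the third one triggers replication, so the fourth is local. A static strategy would fix the replica set in advance and skip the counting entirely.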
52. Grid data Access models
In general, there are four access models for organizing a
data grid:
1. Monadic model
2. Hierarchical model
3. Federation model
4. Hybrid model
53. Monadic model
This is a centralized data repository model. All data is saved in a central data repository.
When users want to access some data, they have to submit requests directly to the central repository.
No data is replicated for preserving data locality.
For a larger grid, this model is not efficient in terms of performance and reliability.
Data replication is permitted in this model only when fault tolerance is demanded.
54. Hierarchical model
It is suitable for building a large data grid that has only one large data access directory.
Data may be transferred from the source to a second-level center. Then some data in the regional center is transferred to a third-level center.
After being forwarded several times, specific data objects are accessed directly by users. A higher-level data center has a wider coverage area.
PKI security services are easier to implement in this hierarchical data access model.
55. Federation model
It is suited for designing a data grid with multiple sources of data supply.
It is also known as a mesh model.
Although the data is shared, the data items are still owned and controlled by their original owners.
Only authenticated users are authorized to request data from any data source.
This mesh model costs the most when the number of grid institutions becomes very large.
56. Hybrid model
This model combines the best features of the hierarchical and mesh models.
Traditional data transfer technology such as FTP applies to networks with lower bandwidth.
High-bandwidth networks are exploited by high-speed data transfer tools such as GridFTP, developed with the Globus library.
The cost of the hybrid model can be traded off between the two extreme models of hierarchical and mesh-connected grids.
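A back-of-the-envelope way to see why the mesh model becomes the most expensive is to count direct connections between sites. The sketch assumes cost is proportional to the number of links, with a full mesh connecting every pair of sites and a hierarchy connecting each site only to its parent; both function names are hypothetical.

```python
def mesh_links(sites):
    """Direct links in a full mesh: every pair of sites is connected."""
    return sites * (sites - 1) // 2


def hierarchy_links(sites):
    """Links in a tree: every site except the root connects to one parent."""
    return max(sites - 1, 0)


comparison = [(n, hierarchy_links(n), mesh_links(n)) for n in (5, 10, 100)]
# Each tuple is (number of sites, tree links, mesh links).
```

Under this assumption the hierarchy grows linearly with the number of institutions while the mesh grows quadratically, and a hybrid design falls somewhere between the two.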
57. Parallel versus Striped Data Transfers
Parallel data transfer opens multiple data streams for passing
subdivided segments of a file simultaneously. Although the
speed of each stream is the same as in sequential streaming, the
total time to move the data in all streams can be significantly
reduced compared to FTP transfer.
In striped data transfer, a data object is partitioned into a number
of sections, and each section is placed at an individual site in a
data grid. When a user requests this piece of data, a data stream
is created for each site, and all the sections of the data object
are transferred simultaneously.
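The idea of parallel transfer can be sketched by splitting a byte string into contiguous segments and moving each segment on its own worker thread. An in-memory copy stands in for a network stream here; this is not GridFTP code, and split_ranges and parallel_copy are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Return (start, end) byte ranges that cover [0, size) in `streams` pieces."""
    base, extra = divmod(size, streams)
    ranges, start = [], 0
    for i in range(streams):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def parallel_copy(source, streams):
    """Copy `source` (bytes) by moving each segment on a separate thread."""
    dest = bytearray(len(source))

    def move(segment):
        start, end = segment
        dest[start:end] = source[start:end]   # stands in for one data stream

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(move, split_ranges(len(source), streams)))
    return bytes(dest)


ranges = split_ranges(10, 3)                  # [(0, 4), (4, 7), (7, 10)]
copied = parallel_copy(b"0123456789", streams=3)
```

A striped transfer would look similar, except that each segment would be read from a different site holding that section of the data object rather than from a single source.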
58. Grid Services and OGSA
Facilitate use and management of resources across distributed,
heterogeneous environments
Deliver seamless QoS
Define open, published interfaces in order to provide interoperability
of diverse resources
Exploit industry-standard integration technologies
Develop standards that achieve interoperability
Integrate, virtualize, and manage services and resources in a
distributed, heterogeneous environment
Deliver functionality as loosely coupled, interacting services aligned
with industry- accepted web service standards
59. OGSA services fall into seven broad areas, defined in terms of capabilities frequently required in a grid scenario.
Figure shows the OGSA architecture.
These services are summarized as follows:
60. OGSA services - seven broad areas
1. Infrastructure Services Refer to a set of common functionalities, such as
naming, typically required by higher level services.
2. Execution Management Services Concerned with issues such as starting
and managing tasks, including placement, provisioning, and life-cycle
management. Tasks may range from simple jobs to complex workflows or
composite services.
3. Data Management Services Provide functionality to move data to where it is
needed, maintain replicated copies, run queries and updates, and transform
data into new formats. These services must handle issues such as data
consistency, persistency, and integrity. An OGSA data service is a web service
that implements one or more of the base data interfaces to enable access to,
and management of, data resources in a distributed environment. The three
base interfaces, Data Access, Data Factory, and Data Management,
define basic operations for representing, accessing, creating, and managing
data.
61. 4. Resource Management Services Provide management capabilities
for grid resources: management of the resources themselves,
management of the resources as grid components, and management
of the OGSA infrastructure. For example, resources can be
monitored, reserved, deployed, and configured as needed to meet
application QoS requirements. It also requires an information model
(semantics) and data model (representation) of the grid resources
and services.
5. Security Services Facilitate the enforcement of security-related
policies within a (virtual) organization, and support safe resource
sharing. Authentication, authorization, and integrity assurance are
essential functionalities provided by these services.
62. 6. Information Services Provide efficient production of, and access to,
information about the grid and its constituent resources. The term
“information” refers to dynamic data or events used for status
monitoring; relatively static data used for discovery; and any data that is
logged. Troubleshooting is just one of the possible uses for information
provided by these services.
7. Self-Management Services Support service-level attainment for a set
of services (or resources), with as much automation as possible, to
reduce the costs and complexity of managing the system. These
services are essential in addressing the increasing complexity of owning
and operating an IT infrastructure.