











EGI-InSPIRE


Resource Infrastructure Provider Service Level Agreement



Document identifier:

VO-RP-SLA-v0.4.3.draft.doc

Date:

11/04/2011

Activity:

SA1

Lead Partner:

EGI.eu

Document Status:

DRAFT

Dissemination Level:

PUBLIC

Document Link:






Abstract

This document defines an Agreement between a Resource Infrastructure Provider and a Virtual Organization. The SLA documents Service Level Targets and specifies the responsibilities of both parties.



Copyright notice

Copyright © Members of the EGI-InSPIRE Collaboration, 2010. See www.egi.eu for details of the EGI-InSPIRE project and the collaboration. EGI-InSPIRE (“European Grid Initiative: Integrated Sustainable Pan-European Infrastructure for Researchers in Europe”) is a project co-funded by the European Commission as an Integrated Infrastructure Initiative within the 7th Framework Programme. EGI-InSPIRE began in May 2010 and will run for 4 years. This work is licensed under the Creative Commons Attribution-Noncommercial 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. The work must be attributed by attaching the following reference to the copied elements: “Copyright © Members of the EGI-InSPIRE Collaboration, 2010. See www.egi.eu for details of the EGI-InSPIRE project and the collaboration”. Using this document in a way and/or for purposes not foreseen in the license requires the prior written permission of the copyright holders. The information contained in this document represents the views of the copyright holders as of the date such views are published.

Delivery Slip

        Name          Partner/Activity    Date
From    D.Zilaskos    AUTH/SA1




Document Log

Issue   Date         Comment                                               Author/Partner
1       11-04-2011   First draft TOC based on 3rd OLA task force minutes   D.Zilaskos/AUTH
2       16-08-2011   Document draft for internal review                    P.Solagna/EGI.eu, M.Krakowian/NGI_PL
3










Application area

This document is a formal deliverable for the European Commission, applicable to all members of the EGI-InSPIRE project, beneficiaries and Joint Research Unit members, as well as its collaborating projects.

Document amendment procedure

Amendments, comments and suggestions should be sent to the authors. The procedures documented in the EGI-InSPIRE “Document Management Procedure” will be followed:
https://wiki.egi.eu/wiki/Procedures

Terminology

A complete project glossary is provided at the following page: http://www.egi.eu/about/glossary/.

PROJECT SUMMARY



To support science and innovation, a lasting operational model for e-Science is needed − both for coordinating the infrastructure and for delivering integrated services that cross national borders.


The EGI-InSPIRE project will support the transition from a project-based system to a sustainable pan-European e-Infrastructure, by supporting ‘grids’ of high-performance computing (HPC) and high-throughput computing (HTC) resources. EGI-InSPIRE will also be ideally placed to integrate new Distributed Computing Infrastructures (DCIs) such as clouds, supercomputing networks and desktop grids, to benefit user communities within the European Research Area.


EGI-InSPIRE will collect user requirements and provide support for the current and potential new user communities, for example within the ESFRI projects. Additional support will also be given to the current heavy users of the infrastructure, such as high energy physics, computational chemistry and life sciences, as they move their critical services and tools from a centralised support model to one driven by their own individual communities.


The objectives of the project are:



● The continued operation and expansion of today’s production infrastructure by transitioning to a governance model and operational infrastructure that can be increasingly sustained outside of specific project funding.

● The continued support of researchers within Europe and their international collaborators that are using the current production infrastructure.

● The support for current heavy users of the infrastructure in earth science, astronomy and astrophysics, fusion, computational chemistry and materials science technology, life sciences and high energy physics as they move to sustainable support models for their own communities.

● Interfaces that expand access to new user communities including new potential heavy users of the infrastructure from the ESFRI projects.

● Mechanisms to integrate existing infrastructure providers in Europe and around the world into the production infrastructure, so as to provide transparent access to all authorised users.

● Processes and procedures to allow the integration of new DCI technologies (e.g. clouds, volunteer desktop grids) and heterogeneous resources (e.g. HTC and HPC) into a seamless production infrastructure as they mature and demonstrate value to the EGI community.



The EGI community is a federation of independent national and community resource providers, whose resources support specific research communities and international collaborators both within Europe and worldwide. EGI.eu, coordinator of EGI-InSPIRE, brings together partner institutions established within the community to provide a set of essential human and technical services that enable secure integrated access to distributed resources on behalf of the community.


The production infrastructure supports Virtual Research Communities (VRCs) − structured international user communities − that are grouped into specific research domains. VRCs are formally represented within EGI at both a technical and strategic level.


TABLE OF CONTENTS

1 Introduction
1.1 Document Amendment Procedure
1.2 Terminology
1.2.1 Resource Centre (Site)
1.2.2 Resource Centre Operations Manager
1.2.3 Resource Infrastructure
1.2.4 Resource Infrastructure Provider
1.2.5 Resource Infrastructure Operations Manager
1.2.6 Operations Centre
1.2.7 National Grid Initiative
1.2.8 Virtual Organization
1.2.9 Certified Resource Centre
1.2.10 Unified Middleware Distribution
1.2.11 UMD-compliant Middleware
1.2.12 Capability
1.3 Parties to the agreement
1.4 Duration of the Agreement
1.5 Scope of the Agreement
1.6 Responsibilities
1.6.1 Resource Infrastructure Provider
1.6.2 Resource centres
1.6.3 Virtual Organisation
2 Description of the services covered
2.1 Infrastructure Services
2.1.1 Operations Dashboard
2.1.2 Helpdesk
2.1.3 Resource Provisioning
2.1.4 Service Availability Monitoring
2.2 Technical Services
2.2.1 Information Discovery System
2.2.2 Job Scheduling Service
2.2.3 File transfer management
2.2.4 Attribute authority management
2.2.5 Credential management
2.2.6 Central File Catalogue
2.3 Support Services
2.3.1 First and second level support
2.3.2 Grid oversight
2.3.3 High Priority Tickets
2.4 Human services
2.4.1 Infrastructure management
3 Table of metrics
4 References


1 Introduction
This Service Level Agreement (SLA) records the agreement between a Resource Infrastructure Provider and the Virtual Organisations that use the Resource Infrastructure Provider's infrastructure.
1.1 Document Amendment Procedure
The SLA may be amended at any time by mutual agreement of both parties. This will usually take the form of a signed and dated SLA addendum.
1.2 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

More information about the entities defined in the sections below is available in [ARCH].
1.2.1 Resource Centre (Site)
The Resource Centre – also known as Site – is the smallest resource administration domain in EGI. It can be either localised or geographically distributed. It provides local resources and the functional capabilities [UMD] necessary to make those resources securely accessible to end-users. Access is granted by exposing common interfaces to users.
1.2.2 Resource Centre Operations Manager
The Resource Centre Operations Manager leads the Resource Centre operations, and is the official technical contact person in the connected organisation. He/she is locally supported by a team of Resource Centre administrators.
1.2.3 Resource Infrastructure
A Resource Infrastructure is a federation of Resource Centres.
1.2.4 Resource Infrastructure Provider
The Resource Infrastructure Provider is the legal organisation responsible for any matter that concerns the respective Resource Infrastructure. It provides, manages and operates (directly or indirectly) all the operational services required to an agreed level of quality as required by the Resource Centres and the user community. It holds the responsibility of integrating these operational services into EGI in order to enable uniform resource access and sharing for the benefit of their end-users. The Resource Infrastructure Provider liaises locally with the Resource Centre Operations Managers, and represents the Resource Centres at an international level. Examples of a Resource Infrastructure Provider are the European Intergovernmental Research Institutes (EIRO) and the National Grid Initiatives (NGIs) – see section 1.2.7.
1.2.5 Resource Infrastructure Operations Manager
The Resource Infrastructure Operations Manager is the contact point for all operational matters and represents the Resource Infrastructure Provider within the OMB. He/she is appointed by the Resource Infrastructure Provider.
1.2.6 Operations Centre
The Operations Centre offers operations services on behalf of the Resource Infrastructure Provider.

The operations services are delivered locally in collaboration with the relevant Resource Centres, and globally in collaboration with EGI.eu.
1.2.7 National Grid Initiative
A National Grid Initiative (NGI) is an entity that fulfils the following criteria [STA]. It must:

(a) have a mandate to represent its national Grid community in all matters falling within the scope of EGI.eu;

(b) be the only organisation having the mandate described in (a) for its country and thus provide a single contact point at the national level;

(c) be able to commit to EGI.eu financially, i.e. to pay the agreed EGI.eu financial contribution;

(d) be able to nominate a representative duly authorised to deliberate, negotiate and decide on all matters falling within the mandate of the EGI Council;

(e) have a sustainable structure or be represented by a legal structure which has a sustainable structure in order to commit to EGI.eu in the long term.
1.2.8 Virtual Organization
A Virtual Organisation (VO) is a grouping of users and (optionally) resources, often not bound to a single institution, which, by reason of their common membership and in sharing a common goal, are given authority to use a set of resources.
1.2.9 Certified Resource Centre
An EGI Resource Centre is certified if it conforms to EGI production requirements. Conformance is tested through a certification procedure by the respective Operations Centre.
1.2.10 Unified Middleware Distribution
The Unified Middleware Distribution (UMD) is the integrated set of software components that EGI makes available from technology providers within the EGI Community [D2.7]. These components are distributed to provide an integrated offering for deployment on EGI.
1.2.11 UMD-compliant Middleware
UMD-compliant Middleware is software that provides one or more UMD capabilities and successfully interoperates with UMD by complying with the UMD-supported interfaces specified in the UMD Roadmap [UMD]. It is mandatory that UMD-compliant software supports the Monitoring and Accounting Capabilities.
1.2.12 Capability
A Capability is an activity needed by either the end-user (functional capability) or the operations community (non-functional capability) that is defined and delivered by one or more Interfaces, which may be supported by one or more technology providers [D5.1]. Capabilities can be functional or non-functional (security, including user management, authentication and authorization; and operations, including messaging, accounting and monitoring).


1.3 Parties to the agreement
The parties to this agreement are the Resource Infrastructure Provider (represented by the Resource Infrastructure Operations Manager) and the Virtual Organisation (represented by the Virtual Organisation Manager).

The Resource Centres that are part of the Resource Infrastructure and officially support the Virtual Organisation are:

SiteA

SiteB

...
1.4 Duration of the Agreement
This agreement remains in force until one or both parties terminate it by explicit notification.
1.5 Scope of the Agreement
The Resource Infrastructure Provider SLA covers the commitments made by a Resource Infrastructure Provider to a Virtual Organisation that uses the Resource Infrastructure.

This SLA is applicable to a Resource Infrastructure Provider that meets one of the following conditions:

the Resource Infrastructure Provider is a Participant or Associated Participant in The European Grid Initiative Foundation [STA];

the Resource Infrastructure Provider collaborates with EGI.eu in a framework defined by a Resource Infrastructure Provider MoU [MoU].

The SLA is applicable to VOs that are registered in the Operations Portal.
1.6 Responsibilities
This section defines the responsibilities of each party. The overall task for all concerned is to operate, support, and manage a production quality Grid infrastructure for Virtual Organisations across the European Research Area.
1.6.1 Resource Infrastructure Provider
The main responsibilities of the Resource Infrastructure Provider are:

● to provide all the operational services agreed by the parties and mentioned in Chapter x;

● to provide all the core middleware services agreed by the parties and mentioned in Chapter y;

● to adhere to the policies agreed between the Resource Infrastructure Provider and the Virtual Organisation.
1.6.2 Resource Centres
The main responsibilities of the Resource Centres listed in section X are:

● to adhere to the policies agreed between the Resource Infrastructure Provider and the Virtual Organisation.


1.6.3 Virtual Organisation
The VO is responsible for endorsing the VO-specific policies agreed between the parties [put reference] and the relevant EGI procedures and, when a service is decommissioned, for retrieving the VO resources (such as files) in due time. For VO support, the VO must provide the required information in tickets and reply to tickets in a timely way.

The VO responsibilities are the following:

● react to tickets assigned to VO Support in less than 8 hours.

● send broadcasts in case the VO configuration changes.


2 Description of the services covered
The Resource Infrastructure Provider MUST provide the following set of core middleware services and local services.

[A list of possible services follows as a template for the list agreed between NGI and VO]
2.1 Infrastructure Services
All the infrastructure services listed in this section are local services provided by the Resource Infrastructure Provider for VO-specific purposes.
2.1.1 Operations Dashboard
The Operations Dashboard is used in day-to-day operations to monitor the functionality of the Resource Centres that officially support the Virtual Organisation.

Service targets

● Minimum Availability/Reliability: 70%/75%

Service hours

● 24 hours/7 days
2.1.2 Helpdesk
The VO needs to be contacted in case of problems and incidents. The Resource Infrastructure Provider must provide a dedicated helpdesk system to manage communications with both the Resource Centres and the VO.

Service targets

● Minimum Availability/Reliability: 70%/75%

Service hours

● 24 hours/7 days


2.1.3 Resource Provisioning
The Resource Infrastructure is a collection of Resource Centres that offer resources and UMD-compliant capabilities. Through the Operations Centre the Resource Infrastructure Provider coordinates the operations of the Resource Centres which officially support the Virtual Organisation (see section 1.3).

Service targets

● Minimum Availability/Reliability of the Resource Infrastructure: 70%/75%

Service hours

● 24 hours/7 days
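
As an illustration of how the Availability/Reliability targets above might be checked for one reporting period, the following is a minimal sketch. This agreement does not itself define the two quantities; the sketch assumes that availability is uptime divided by the period minus time in an unknown state, and that reliability additionally excludes scheduled downtime. The function names and the sample figures are hypothetical.

```python
def availability(up_hours, total_hours, unknown_hours=0.0):
    """Assumed definition: uptime over the part of the period with known status."""
    known = total_hours - unknown_hours
    return up_hours / known if known > 0 else 0.0


def reliability(up_hours, total_hours, scheduled_down_hours=0.0, unknown_hours=0.0):
    """Assumed definition: like availability, but scheduled downtime is not counted against the service."""
    counted = total_hours - scheduled_down_hours - unknown_hours
    return up_hours / counted if counted > 0 else 0.0


def meets_targets(avail, rel, min_avail=0.70, min_rel=0.75):
    """Compare measured values with the 70%/75% minimum targets."""
    return avail >= min_avail and rel >= min_rel


# Hypothetical 30-day month (720 hours): 540 hours up, 60 hours of scheduled downtime.
a = availability(540, 720)
r = reliability(540, 720, scheduled_down_hours=60)
print(f"availability={a:.1%} reliability={r:.1%} ok={meets_targets(a, r)}")
```

Keeping the two ratios as separate functions makes it easy to replace the assumed definitions with whatever calculation the parties actually agree on.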
2.1.4 Service Availability Monitoring
The Service Availability Monitoring (SAM) is the local monitoring framework used to check the functionality of the capabilities provided by the Resource Centres. It consists of:

● a test submission engine;

● a portal for the visualization of the test results and Availability/Reliability statistics.

Service targets

● The Resource Infrastructure Provider must provide at least one SAM system to run specific VO tests.

● Minimum Availability/Reliability of all the SAM components: 95%/98%

Service hours

● 24 hours/7 days


2.2 Technical Services
The following core services MUST be provided by the Resource Infrastructure Provider to the Virtual Organisation users.
2.2.1 Information Discovery System
The top-level Information Discovery System is needed for service discovery and for the collection of static and/or dynamic information about the infrastructure; it is also required by the SAM service.

Service targets

● The Resource Infrastructure Provider must provide at least one instance.

● Minimum availability/reliability: 70%/75%

Service hours

● 24 hours/7 days


2.2.2 Job Scheduling Service
Compute Job Scheduling capability refers to the service that can be delivered to a user in response to their request for a job to be run. This includes managing the selection of the most appropriate resource that meets the user’s requirements, the transfer of any files required as input or produced as output between their source or destination storage location and the selected computational resource, and the management of any data transfer or execution failures within the infrastructure.

Service targets

● The Resource Infrastructure Provider must provide at least one instance.

● Minimum availability/reliability: 70%/75%


2.2.3 File transfer management
The bandwidth linking resource sites is a resource that needs to be managed in the same way compute resources at a site are accessed through a job scheduler. By being able to schedule wide area data transfers, requests can be prioritised and managed. This would include the capability to monitor and restart transfers as required.

Service targets

● The Resource Infrastructure Provider must provide at least one instance.

● Minimum availability/reliability: 70%/75%


2.2.4 Attribute authority management
Resources within the production infrastructure are made available to controlled collaborations of users represented in the infrastructure through Virtual Organisations. Access to a VO is governed by a VO manager who is responsible for managing the addition and removal of users and the assignment of users to groups and roles within the VO.

Service targets

● The Resource Infrastructure Provider must provide at least one instance.

● Minimum availability/reliability: 70%/75%
2.2.5 Credential management
Credential Management provides the capability for a client to obtain, delegate and renew authentication credentials using a remote service.

Service targets

● The Resource Infrastructure Provider must provide at least one instance.

● Minimum availability/reliability: 70%/75%


2.2.6 Central File Catalogue
The File Catalogue service maps logical file names to physical file locations.

Service targets

● The Resource Infrastructure Provider must provide at least one instance.

● Minimum availability/reliability: 70%/75%
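
The mapping performed by a file catalogue can be pictured with a small in-memory sketch. It is not the interface of any real catalogue service: the class, its methods, the file names and the storage URLs are all hypothetical, and the sketch assumes that one logical name may point to several physical replicas.

```python
from collections import defaultdict


class FileCatalogue:
    """Toy catalogue: logical file name -> list of physical locations (assumed one-to-many)."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, logical_name, physical_location):
        # Record a replica once, keeping registration order.
        if physical_location not in self._replicas[logical_name]:
            self._replicas[logical_name].append(physical_location)

    def lookup(self, logical_name):
        # Return a copy so callers cannot modify the catalogue by accident.
        return list(self._replicas.get(logical_name, []))


catalogue = FileCatalogue()
catalogue.register("/grid/myvo/run001.dat", "srm://se.sitea.example.org/myvo/run001.dat")
catalogue.register("/grid/myvo/run001.dat", "srm://se.siteb.example.org/myvo/run001.dat")
print(catalogue.lookup("/grid/myvo/run001.dat"))
```

A user or job refers only to the logical name; the catalogue resolves it to whichever physical copies are registered.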
2.3 Support Services
2.3.1 First and second level support
Operational incidents and problems are reported by end-users and the Resource Centre administrators to the Operations Centre of the Resource Infrastructure Provider. The Resource Infrastructure Provider offers support by helping in the resolution of such incidents and problems, also escalating these to higher-level teams in case of need for specialized support.

Support is provided either centrally by the Operations Centre Support Unit in the EGI Helpdesk, or locally through the local helpdesk system if available (see Section …).

Support activities include support for network performance and connectivity issues as well as security support.

Service targets

● Maximum response time to a problem or incident: four hours after the time the ticket was assigned to the Operations Centre Support Unit.

Service hours

● The service MUST be available during the regular Operating Hours of the host organisation of the support provider.

● Response times to trouble-tickets are expressed in Operating Hours.
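
Because response times are expressed in Operating Hours rather than wall-clock time, a ticket assigned late on one working day and answered early on the next can still be within the four-hour target. The following is a minimal sketch of that calculation. It assumes Operating Hours of 09:00 to 17:00, Monday to Friday, with no public holidays; the real Operating Hours are those of the host organisation, and the function names are hypothetical.

```python
from datetime import datetime, time, timedelta

# Assumed Operating Hours; replace with those of the host organisation.
OPEN, CLOSE = time(9, 0), time(17, 0)


def operating_hours_between(start, end):
    """Hours between two naive datetimes, counting only Mon-Fri within OPEN..CLOSE."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            window_start = max(start, datetime.combine(day, OPEN))
            window_end = min(end, datetime.combine(day, CLOSE))
            if window_end > window_start:
                total += window_end - window_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600


def response_within_target(assigned, responded, target_hours=4.0):
    """True if the first response came within the target, measured in Operating Hours."""
    return operating_hours_between(assigned, responded) <= target_hours


# Assigned on a Friday at 16:00, answered the following Monday at 11:00:
assigned = datetime(2011, 8, 19, 16, 0)
responded = datetime(2011, 8, 22, 11, 0)
print(operating_hours_between(assigned, responded))  # 3.0
print(response_within_target(assigned, responded))   # True
```

In this example 67 wall-clock hours pass, but only three of them fall inside the assumed Operating Hours.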
2.3.2 Grid oversight
The Resource Infrastructure Provider oversees the smooth operation of the infrastructure, proactively checks the status of the Resource Centres, and monitors alarms raised by VO-specific tests. This service is delivered by the Regional Operator on Duty team (ROD) [ROD].

Service targets

● Tickets are opened for VO specific alarms in a timely manner.

Service hours

● The service MUST be available during the regular Operating Hours of the Operations Centre.
2.3.3 High Priority Tickets
VO managers, or VO members with an appropriate role, can open high priority tickets. These tickets are opened in case of VO-related critical problems raised in a site considered important by the parties.

Service targets

● High priority tickets max response time: one hour.

Service hours

● The service MUST be available 24/7.
2.4 Human services
2.4.1 Infrastructure management
The Resource Infrastructure Provider should coordinate the registration of new Resource Centres supporting the VO, or the decommissioning of a Site that supports the VO, according to the procedures requested by the VO and agreed by the parties.


Service targets

● Maximum response time to a request for Resource Centre registration/certification/decommissioning: 4 hours

Service hours

● The support services MUST be available during the regular Operating Hours of the host organisation of the support provider. The Virtual Organisation and the Resource Infrastructure Provider can agree on a different scheme for the operating hours of specific services.

● Response times to trouble-tickets are expressed in Operating Hours.


3 Table of metrics
The following metrics are a summary of the values reported in the previous sections, to be used as a reference.

[A list of possible metrics follows as a template for the list agreed between Resource Provider and VO]




Metric                                                                        Value               Section
Nagios monitoring systems provided                                            one
Dashboard system provided                                                     one
WMS provided                                                                  one
LB provided                                                                   one
VOMS provided                                                                 one
LFC provided                                                                  one
Minimum Resource Infrastructure Provider reliability                          75%
Period of availability/reliability/outage calculations                        per month
Maximum time to acknowledge GGUS tickets (Resource Infrastructure Provider)   four hours
Maximum time to acknowledge GGUS alarm tickets, if applicable                 four hours
Maximum time to resolve GGUS incidents                                        five working days
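
Once the parties have filled in the template, the agreed values can be kept in a machine-readable form and compared with a monthly report. The following is a minimal sketch using the template values listed above. The key names, the report layout and the function name are hypothetical and are not part of any EGI tool.

```python
# Target values taken from the template metrics; key names are hypothetical.
TARGETS = {
    "min_instances": {"nagios": 1, "dashboard": 1, "wms": 1, "lb": 1, "voms": 1, "lfc": 1},
    "min_reliability": 0.75,           # calculated per month
    "max_ack_hours": 4,                # GGUS tickets
    "max_alarm_ack_hours": 4,          # GGUS alarm tickets, if applicable
    "max_resolution_working_days": 5,  # GGUS incidents
}

# Report field -> target key, for values that must not exceed the target.
UPPER_BOUNDS = {
    "ack_hours": "max_ack_hours",
    "alarm_ack_hours": "max_alarm_ack_hours",
    "resolution_working_days": "max_resolution_working_days",
}


def find_violations(report, targets=TARGETS):
    """Return a list of readable messages, one per target the monthly report misses."""
    problems = []
    for service, minimum in targets["min_instances"].items():
        provided = report.get("instances", {}).get(service, 0)
        if provided < minimum:
            problems.append(f"{service}: {provided} instance(s), need {minimum}")
    if report["reliability"] < targets["min_reliability"]:
        problems.append(
            f"reliability: {report['reliability']:.0%} is below {targets['min_reliability']:.0%}"
        )
    for field, target_key in UPPER_BOUNDS.items():
        worst = report.get(field)  # worst value observed in the month, if reported
        if worst is not None and worst > targets[target_key]:
            problems.append(f"{field}: {worst} exceeds {targets[target_key]}")
    return problems


# Hypothetical monthly report: no LFC instance and reliability below target.
report = {
    "instances": {"nagios": 1, "dashboard": 1, "wms": 1, "lb": 1, "voms": 1, "lfc": 0},
    "reliability": 0.72,
    "ack_hours": 3,
    "resolution_working_days": 4,
}
for line in find_violations(report):
    print(line)
```

Fields that are missing from the report, such as alarm tickets where they do not apply, are simply skipped.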
























4 References
[ARCH] EGI Operations Architecture, EGI-InSPIRE Deliverable D4.1, 2011 (https://documents.egi.eu/document/218)
[D2.7] EGI Sustainability Plan, EGI-InSPIRE Deliverable D2.7, Mar 2011 (https://documents.egi.eu/secure/ShowDocument?docid=313)
[D5.1] UMD Roadmap, EGI-InSPIRE Deliverable D5.1, Oct 2010 (https://documents.egi.eu/document/100)
[GOC] GOCDB Input System User Documentation (https://wiki.egi.eu/wiki/GOCDB/Input_System_User_Documentation)
[GSP] Grid Security Policy (https://documents.egi.eu/document/86)
[MAN] EGI Operations Manuals (https://wiki.egi.eu/wiki/Operations_Manuals)
[MAN02] Service Intervention Management, Manual MAN02 (https://wiki.egi.eu/wiki/MAN02_Service_intervention_management)
[PERF] Availability and reliability statistics (https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics)
[POL] EGI Policies and Procedures (https://wiki.egi.eu/wiki/PDT:Policies_and_Procedures)
[QOS] Sonvane, D.; Kalmady, R.; Chand, P. et al.; Computation of Service Availability Metrics in Gridview (https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf)
[REP] NGI Annual Reports (https://wiki.egi.eu/wiki/EGI-inSPIRE_SA1#NGI_Assessment)
[RN] Resource Centre OLA: Release Notes (https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes)
[SLA] The EGEE-III Service Level Agreement between ROCs and Sites, EGEE-III Project, 2008 (https://edms.cern.ch/document/860386)
[SOP] Grid Site Operations Policy (https://documents.egi.eu/public/ShowDocument?docid=75)
[STA] EGI.eu Statutes, May 2010 (https://documents.egi.eu/document/18)
[TOR] Operations Management Board Terms of Reference (https://documents.egi.eu/document/117)
[UMD] UMD Roadmap, EGI-InSPIRE Deliverable D5.2, 2011 (https://documents.egi.eu/document/272)






EGI-InSPIRE INFSO-RI-261323

© Members of EGI-InSPIRE collaboration

PUBLIC


