OPERATION AT THE ESRF: HOW WE WORK

L. Hardy, JM Filhol
ESRF, BP 220 avenue des Martyrs, 38043 Grenoble Cedex.

Abstract
The European Synchrotron Radiation Facility (ESRF) is a third-generation X-ray source. The accelerator complex is composed of a linear accelerator (e-, 200 MeV), a synchrotron (300 metres, 6 GeV) and a Storage Ring (844 metres). We describe the role of the Operation Group within the Machine Division and its relations with the other Divisions. We explain how the shift rosters are organised and how information is passed from one crew to another. The safety aspects which come under the responsibility of the operation crew are also discussed.

THE OPERATION GROUP IN THE ESRF STRUCTURE
The Operation Group (composed of two engineers, eight full-time operators and one multi-skilled technician) lies within the Machine Division, which is one of the five ESRF departments. There are about 20 engineers and 50 technicians in this Division.

SCHEDULING WORK
The Operation Group is in charge of scheduling all activities for one year of operation at a time. This work is done one year ahead and consists of dividing the year between the User Service Mode (USM), the Machine Dedicated Time (MDT) and the Machine shutdowns. Typical data for 1999 are: 5568 hours of USM (five runs of 7 to 8 weeks each), 1290 hours of MDT and 1902 hours of Machine shutdown (two long shutdowns, in summer and in winter, and three short intermediate shutdowns of 10 days). This basic schedule can be considered as the starting point from which all activities of the Operation Group stem. Once this schedule has been accepted by the Directors, it constitutes the basis on which the Experiments Division can build a preliminary schedule of experiments. At this stage, a meeting between the operation manager and the experimenters (about 30 beamline scientists) takes place. The goal is to determine in which mode (i.e.
which electron filling pattern) the Machine will be run for each shift. There are two such meetings per year. The operation schedule also forms the starting point for the Operators to organise their rotations.

WHO WORKS IN THE CONTROL ROOM?
An operating crew is composed of two people: one full-time operator and one other person who spends about 10 % of his or her working time on shift. During weekends, holidays or nights, a third volunteer may be enlisted to reinforce the crew, which may prove helpful in case of a problem.

When an important and/or quick decision is needed (to stop the machine for several hours, for instance), or in the event of an emergency evacuation, it is essential that one person be entrusted with the power of decision. For this reason, one member of the shift crew is given the status of ‘Shift Leader’. Since our eight operators have been operating the accelerators for several years, all of them now have this capacity. However, they only take on the role of shift leader for about 50 % of their time. When the second person of the crew is a technician, the Operator has the status of ‘Operator Shift Leader’. When the second person of the crew is an engineer, this engineer acts as the Shift Leader.

This enables as many members of staff as possible (engineers and technicians alike) to participate in the life of the Machine and brings a variety of fields of competence into the Control Room, whilst always having a professional operator on hand. In this way the volunteers participate actively in the life of the Machine (and hence feel more involved in the piece of equipment they are responsible for). Another advantage is that the Operators can hold technical discussions with experts from various fields. We consider this an efficient means of internal on-the-spot training!

OPERATOR’S ROTATION: HOW IS IT ORGANISED?
The constraints are the following:
1 ) An Operator must be present in the Control Room 24 hours a day, 365 days per year.
2 ) French law must be respected (maximum of 48 hours of work per week).
3 ) The operators must have their quota of holidays like everyone else.
In addition, to allow them to attend training courses, they work 75 % of their working time on shift and 25 % on ‘normal days’. This also provides a safety margin in case of illness. Bearing in mind all these constraints, we came to the conclusion that eight Operators are necessary to ensure around-the-clock operation of the Machine.

HOW IS THE SHIFT BOOKING ORGANISED FOR VOLUNTEERS?
It is important to mention that the second or the third person on shift is a volunteer (engineer or technician). Volunteers generally spend between 5 and 10 % of their working time on shift in the Control Room. They are registered on a list (about 60 people for the time being). Twice a year, the list of vacant USM shifts is published and the volunteers are requested to book a given number of shifts (currently around one shift per month per person).

TRANSMISSION OF THE INFORMATION BETWEEN THE OPERATORS
The transmission of information from one crew to the next is crucial to ensure continuous operation of the Machine. The simplest way to transmit the information is verbally at the hand-over between shifts. The Operator begins his shift one hour before the second person of the crew to ensure an overlap for the transfer of information. All relevant information concerning events of a given shift is recorded in the logbook (complaints from Users, failures, timing of events, injection efficiency, etc.). This information is summarised on a flyleaf which contains the main data. Temporary, short-term information is written on a white board reserved for this purpose. Longer-term information, advice or new procedures are put onto a dedicated web page (on-site only) by the Operator. This web page is subdivided by piece of equipment, and an index page lists all the newest information.
At the beginning of his shift, the Operator checks this index page to see whether anything new has been added since the last time he was on shift.

Once every six weeks, there is an ‘operator’s meeting’ to exchange other types of information. This meeting provides the opportunity for the Operation Managers to give explanations about new equipment or even to give a short training course. It is also during these meetings that the operators explain the problems they have encountered and the solutions which are envisaged.

ARCHIVING AND EXPLOITATION OF THE SHIFT INFORMATION
The flyleaf (one per shift) in the logbook contains major data such as beam availability over the period of the shift, the description of the failures (if any) with their duration, the conclusion of the shift, etc. This information is systematically transferred to an Excel database. For each individual failure, the database dedicated to failures records the name of the equipment and sub-equipment concerned, the date and time, the duration, the description of the failure and the action taken. After each run this database is made available on the computer network. This data is analysed on a regular basis in order to detect any trends on given equipment.

ASSISTANCE TO THE SHIFT CREW OUTSIDE NORMAL WORKING HOURS
During normal working hours, the relevant people on site are called immediately to solve any problems which may arise. Outside normal working hours, a procedure has been established. Three levels of problem have been identified:
1 ) A minor problem or a bug is noticed by the operator but does not need immediate action. For such cases an internal e-mail system has been developed with aliases grouping e-mail addresses according to the type of equipment. This system ensures that the relevant people are informed of the problems seen by the operators during a given shift.
2 ) A problem is noticed on a ‘non-strategic’ piece of equipment.
This does not fully prevent the Machine from running but it does not ensure smooth operation of the Machine either. The equipment concerned is listed in the Control Room. In these cases, the people in charge of such equipment are said to be ‘callable’. This means that they will be called at home but, depending on their availability at the time, they may decide not to come immediately to solve the problem. Two examples of the type of equipment covered by this level are the diagnostics tools and the insertion devices.
3 ) Either a piece of equipment has failed and it is impossible to store the beam or even to refill the Machine, or the beam is stored but a failure on other equipment would prevent the Machine from being refilled after a beam loss. For these strategic pieces of equipment, different people are on ‘stand-by’ and are equipped with a beeper so that they can be reached 24 hours a day. These people must react immediately and go to the Control Room if the problem cannot be solved over the telephone. More than ten pieces of equipment are covered by people on stand-by.

Besides these groups, a group named ‘More Experienced Shift Leaders’, composed of staff members with a good knowledge of accelerator physics, is also on stand-by. The role of this latter group is not only to work out problems which are not covered by the other groups, but also to help the crews when needed, for instance to develop or confirm a diagnosis of a failure.

COMMUNICATION TOWARDS THE OTHER GROUPS OR DIVISIONS
Each week, the Operation Managers organise a meeting whose purpose is to review the operation of the previous week. All of the engineers and technicians directly involved in the life of the Machine are welcome to attend (about 30 people actually attend the meeting). Any failures which occurred the week before are discussed in detail.
The main goal is to ensure that the person responsible for the equipment concerned is aware of the failure and that action has been taken to prevent it from happening again, as far as possible. The details of the work performed during the Machine Dedicated Day are also reviewed. This meeting lasts about an hour and a half and is also an opportunity for staff members to exchange information and ideas.

Every three weeks, another meeting is organised, this time involving in particular the scientists from the Experiments Division. The purpose of this meeting is not only to summarise the previous three weeks of Operation but also to provide a place for the scientists to air requests, comments and any complaints they may have (to ask for a new filling mode or to discuss beam position stability, for instance).

At the end of each run, the Operation Group issues an ‘Operation Report’. This report summarises all the events of the run, the statistics, etc., and is accessible on the internet.

ORGANISATION OF THE MACHINE SHUTDOWNS
The Operation Group is responsible for the organisation of the Machine shutdowns. This starts with the gathering of information on the activities of all groups needing to intervene on the Machine. This information is then compiled in scheduling software. At this stage, it is important to detect any possible conflicts between activities. In most cases a compromise is found, but if no good solution can be reached, the Operation Group makes the final decision. All activities performed during a shutdown are requested on a special form and must be approved by one of the Operation Managers and by the Safety Group. Once the shutdown has started, the Operation Group oversees the activities and helps people to solve problems on the spot. When the shutdown is over, each group is requested to provide a list of the tasks completed.

DAY TO DAY CO-ORDINATION WITH THE USERS
About 40 beamlines can work simultaneously 24 hours a day.
This represents a large number of people, since several of them can work on one beamline. For this reason, there is an Experimental Hall Operator (EHO) whose job is to co-ordinate actions and gather information on behalf of the Experimental side. This EHO is, in principle, the only link between the Control Room and the Users. In practice, the crew on shift in the Control Room will of course help or give any information that the Users may require.

SAFETY MATTERS IN THE CONTROL ROOM
At the ESRF, the Control Room (CTRM) is the only place which is manned 24 hours a day, all through the year. For this reason, the operating crew is in charge of initiating all safety actions. In practice, safety is ensured at the ESRF in the following way: all around the ESRF there are hundreds of red telephones directly connected to the CTRM. In the event of an emergency (fire, accident), witnesses will automatically warn the CTRM. All of the fire detectors are also connected to a central station in the CTRM. A simplified procedure exists for each type of incident, and the shift crew applies the appropriate one. In some cases, the operator himself will manage the whole intervention but, depending on the gravity of events, he may also request the help of a specialised intervention team. This “crash” team (composed of professional firemen and first-aid workers) is permanently on stand-by and is located on another site close to the ESRF site.

PROCEDURES IN THE CONTROL ROOM
At present, the Machine Division offers about eight different electron filling patterns (modes), four of which are regularly delivered to the Users. Some of them require a cleaning procedure to eliminate the parasitic electron bunches between the main bunches. These modes are assessed during the Machine Dedicated Time and each piece of equipment has a dedicated settings file associated with a given mode. When a mode is completely assessed, a refill procedure is written.
During the night which precedes delivery to the Users, the operator fully simulates beam delivery (by simulation we mean that the full procedure is tested except for the opening of the front-end shutters). A fast decay is then produced with a scraper to ensure that we are not faced with an intensity-dependent phenomenon (such as beam instabilities). Should a problem occur, an experienced person on stand-by will be called. These procedures help to refill the Machine quickly and efficiently (in 5 to 15 minutes depending on the mode).

The other procedures which are to be strictly applied are the emergency procedures described above. These two types of procedure must be readily available and hence are called “written procedures”. This is to differentiate them from any other instructions, which we call “user’s guides”. The latter are generally written by the operators themselves and are available on the intranet. Apart from these procedures, it is the philosophy of the Operation Group to leave it up to the crew to take initiatives when faced with problems.

POLICY OF TRANSPARENCY OF THE OPERATION GROUP
All information concerning the life of the Machine is available on the internet (www.esrf.fr). Information on the schedules, the Operation Reports, the statistics, etc. is available for consultation by the outside world.

CONCLUSION
The ESRF accelerators were commissioned in 1992 and operated in 1993 for beamline commissioning; the first external Users started their experiments in 1994. From 1994 to May 1998, around 23 000 hours of beam were delivered to the Users. The beam availability has increased year after year (90 % in 1993 and more than 95 % in 1998 at the time of writing this paper). We therefore believe that good organisational strategies have been adopted so far and that a cruising speed has been reached.
The main challenges now will be avoiding falling into a routine, having a good policy of maintenance and sustaining the high level of motivation of all those involved in the life of the Machine.
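
As a cross-check of the 1999 scheduling figures quoted in the section on scheduling work, the short Python sketch below adds up the three quoted blocks of hours and expresses each as a share of the year. It is only an illustration and not part of any ESRF tool; the names SCHEDULE_1999_HOURS and schedule_shares are hypothetical and were chosen for this example. The only inputs are the three figures given in the text (5568, 1290 and 1902 hours).

```python
# Hours quoted in the text for the 1999 schedule.
SCHEDULE_1999_HOURS = {
    "USM": 5568,       # User Service Mode
    "MDT": 1290,       # Machine Dedicated Time
    "shutdown": 1902,  # Machine shutdowns
}

HOURS_IN_1999 = 365 * 24  # 1999 is not a leap year


def schedule_shares(hours_by_mode):
    """Return the fraction of the scheduled total spent in each mode."""
    total = sum(hours_by_mode.values())
    return {mode: hours / total for mode, hours in hours_by_mode.items()}


total_hours = sum(SCHEDULE_1999_HOURS.values())
shares = schedule_shares(SCHEDULE_1999_HOURS)

print(f"total scheduled hours: {total_hours} (hours in the year: {HOURS_IN_1999})")
for mode, share in shares.items():
    print(f"{mode:>8}: {share:.1%}")
```

The three figures add up to exactly the 8760 hours of a non-leap year, so the schedule accounts for every hour of 1999, with a little under two thirds of the year delivered as USM.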