APPLICATION OF EYE-TRACKING TECHNOLOGY AS A DIAGNOSTIC TOOL FOR ASSESSING FLIGHT OPERATORS. PART 1: ANALYSIS OF FLIGHT OPERATORS’ ATTENTION DISTRIBUTION AND SWITCHING USING EYE-TRACKING

The paper presents the results of an experiment devoted to studying attention distribution and switching using eye-tracking technology. The experiment was conducted in two stages. At the first stage (year 2016), 48 students majoring in air traffic control were examined. At the second stage (year 2017), 43 students studying to become civil aviation pilots were involved. Correlation analysis methods and Pearson’s chi-squared test were used.


INTRODUCTION
The efficiency of an air traffic control system depends on its design, the reliability of the system and its equipment, the operators' skills, and how disciplined and organized the staff is. Efficiency is characterized by indicators that signal how well these components function, including accuracy, reliability, the amount of information displayed about airspace, the volume of tasks performed, and others [1].
The human factor is the key factor that has a negative effect on air traffic safety. Every year, a number of incidents stemming from air traffic control (ATC) issues occur around the world. For instance, on February 2, 2018, two Boeing 737 aircraft operated by Pobeda, a Russian airline, and Pegasus, a Turkish airline, nearly collided in midair at Ataturk Airport in Istanbul; the aircraft reportedly passed only 250 meters apart [2]. On January 30, 2018, a Boeing 757-200 operated by FedEx Express, flying from Athens to Ben Gurion Airport in Tel Aviv, and a Beech 200 King Air carrying UN workers to Egypt avoided a collision while at a horizontal distance of 0.4 nautical miles (740 m) and a vertical distance of a little over 300 feet (90 m) from each other [3]. On August 13, 2018, at Edinburgh Airport, a series of events resulted in an Airbus A320-214 with 180 passengers taking off from Runway 06 at 09:48:13 and a Boeing 737-800 operated by Norwegian Air International (159 passengers on board) landing on the same runway at 09:48:15. At the closest point of approach, the two aircraft were separated by approximately 875 m, with the Airbus A320-214 at 60 ft aal when the Boeing 737-800 touched down [4]. Another case of two aircraft coming into hazardous proximity occurred on June 2, 2017, in the vicinity of Lae Nadzab Airport. Lack of attention both in the air and on the ground, as well as mistakes made by the air traffic controller when giving vertical and horizontal separation instructions, nearly ended in a head-on collision [5]. A total of 15 Chinese air traffic controllers were punished by the authorities for their involvement in a serious runway incursion incident at Shanghai-Hongqiao Airport, China, on October 10, 2016 [6]. In Russia alone, 22 incidents caused by violations of separation minimums occurred in 2018, 13 of which resulted from mistakes made by ATC staff (compared with 11 out of 13 incidents in 2017) [7].
Documents [7][8][9][10][11][12][13][14][15][16] and a number of others underline the importance of reducing the negative effect of the human factor on air traffic safety and flight safety in general. As stated in the European Plan for Aviation Safety (EPAS), "The human factors and human performance issue has been integrated within the other risk areas so that human factors aspects are considered in an integrated manner when risks are being mitigated. In order to be able to clearly identify those actions dealing with human factor aspects, an "HF" marker has been added to the "activity / sector" column of each of the actions" [16].
There are different ways to gradually reduce the negative effect of the human factor on air traffic safety [17]. They include working on improving processes aimed at identifying potential conflict situations and assessing their danger [18][19][20][21][22][23], improving the efficiency of interaction within both the crew and the ATC team [11,12,24,25], improving interaction between the air traffic controller and the aircraft crew [26], and a number of others. However, one of the main methods is working on improving the psychological screening of pilots and air traffic controllers [27,28]. The importance of conducting psychological screening procedures in a proper fashion for improving flight safety is obvious, as this is the first obstacle on the way to a job in aviation for people who are not suitable for this work for various reasons.
This is why the issue addressed in this paper, namely finding ways to improve the psychological screening of pilots and air traffic controllers, in particular the assessment of such important qualities as attention distribution and switching, is significant and topical.

PREVIOUS STUDIES AND REFERENCES REGARDING THE ISSUE OF ATTENTION CHARACTERISTICS
Attention is best described as the sustained focus of cognitive resources on information while filtering or ignoring extraneous information. Attention is a very basic function that is often a precursor to all other neurological / cognitive functions. "Sustained attention is a fundamental aspect of human cognition and has been widely studied in applied and clinical contexts. Despite a growing understanding of how attention varies throughout task performance, moment-to-moment fluctuations are often difficult to assess. … Despite the importance of characterizing individual differences in sustained attention, few clear psychological markers distinguishing high and low performers have been uncovered" [29]. Attention may be differentiated into "overt" versus "covert" orienting [30]. Overt orienting is the act of selectively attending to an item or location over others by moving the eyes to point in that direction [31]. Overt orienting can be directly observed in the form of eye movements. Although overt eye movements are quite common, a distinction can be made between two types of eye movements: reflexive and controlled. Reflexive movements are commanded by the superior colliculus of the midbrain. These movements are fast and are activated by the sudden appearance of stimuli. In contrast, controlled eye movements are commanded by areas in the frontal lobe. These movements are slow and voluntary. Covert orienting is the act of mentally shifting one's focus without moving one's eyes [31][32][33]. Covert orienting has the potential to affect the output of perceptual processes by governing attention to particular items or locations but does not influence the information that is processed by the senses. Researchers often use "filtering" tasks to study the role of covert attention in selecting information. These tasks often require participants to observe a number of stimuli but attend to only one. Such a task was used in the experiment described in this paper.
We examined attention characteristics at different levels of perceptual load. Perceptual load concerns the subject's ability to perceive or ignore stimuli, both task-related and non-task-related. Studies show that if there are many stimuli present (especially if they are task-related), it is much easier to ignore the non-task-related stimuli, but if there are few stimuli, the mind will perceive the irrelevant stimuli as well as the relevant ones [34]. Objective research methods, such as eye-tracking technology, are often used to study the operator's attention, as eye movements are the marker that makes it possible to understand what object the person's attention is directed at, which is especially true of overt orienting. Many studies using eye-tracking technology have been conducted. In particular, such works as [35][36][37][38][39][40][41][42] should be noted as being related to our research to some extent.

MATERIALS AND METHODS
As part of studying the possibility of using diagnostic methods based on objective rather than subjective principles, an experiment was conducted to study attention distribution and switching using a stationary eye-tracking device. The Tobii REX eye tracker [43] was used, which was installed on the IIYAMA ProLite T2250MTS 21.5" monitor. PC specifications: Intel Core i5-3450 3.10 GHz processor, ASUS P8Z77-V DELUXE motherboard, DIMM DDR3 2048MB PC10600 1333MHz Kingston RAM, Windows 7 Professional.
Setup and calibration were carried out according to the manual "Tobii Eye Tracker. User's Guide" [43] for each participant individually before the beginning of the test. The distance between the participant's eyes and the monitor, at the lower edge of which the eye tracker was attached, was approximately 65 cm [44], taking into account adjustment recommendations, as shown in Fig. 1. When the setup and calibration procedure is completed, its quality is illustrated by lines of different colors and lengths. The length of each line represents the shift from the center of the calibration point to the selected viewpoint [45]. Calibration results are displayed in Fig. 2.
After completing this procedure, calibration for each participant was checked using eight test points. Recalibration was performed if the participant looked at one of the control points and the colored points that reflect the direction of the participant's gaze were not located close enough relative to its center. Only after obtaining satisfactory results did the participant proceed to the task itself.

Fig. 2. Calibration results
To process the data, we used a program developed by A.P. Plyasovskikh, Doctor of Technical Sciences, at the All-Russian Scientific Research Institute of Radio Equipment and designed specifically to analyze various aspects of eye movement during a given exercise.
To analyze the results, we used the R programming language, which is widely used as statistical software for data analysis and has almost become the standard for statistical programs [46] (available under the GNU GPL license [47]). Correlation analysis methods [48] and Pearson's chi-squared test [49] were also used.
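As an illustration of the correlation analysis mentioned above, the Pearson correlation coefficient can be sketched in a few lines. This is a minimal Python sketch, not the R code used in the study; the function name `pearson_r` and the sample values are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative numbers, NOT data from the experiment
toy_x = [0.5, 1.2, 2.0, 3.1, 4.4]
toy_y = [0.4, 1.0, 1.9, 2.7, 4.5]
r = pearson_r(toy_x, toy_y)
print(f"r = {r:.3f}")
```

In R, the same quantity is returned by the built-in `cor` function with its default method.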
The experiment was conducted in two stages. The first stage, which was conducted in November 2016, involved 48 third-year students at Saint Petersburg University in Civil Aviation majoring in air traffic control. Their ages ranged from 20 to 23 years. There were 27 males and 21 females. The results of the first stage are discussed in detail in Arinicheva et al. [28] and other works. The second stage, which was conducted in November 2017, involved 43 fourth-year students studying to become civil aviation pilots. Their ages ranged from 20 to 25 years. All of them were males.
The experiment was conducted in accordance with the fundamental principles of bioethics [50] and on a voluntary basis.

EXPERIMENT ONE: RESULTS AND DISCUSSION
A total of 48 students majoring in air traffic control took part in the first experiment. First, they were given a task to fix their eyes on a green square (0, 176, 80; hereinafter, the colors are indicated in accordance with the RGB color model [51]). The visuals had the form of a sequence of slides (see Fig. 3). On the white background, there was a barely noticeable grid with a 28x28 mm cell, and the zone of the slide which was actively used was 280x224 mm, leaving some free space around the perimeter. The size of the square was the same as the size of the cell, i.e. 28x28 mm. On each of the 16 slides, the green square was located in one of four symmetrically located zones (top left and right and bottom left and right). The size of each zone (56x56 mm) was set in excess of the size of the square that was to be monitored in order to level out possible errors in the calibration of the eye tracker. The zones were numbered starting from the top left clockwise. The slides were changed at regular intervals (2.5 s). According to the given algorithm, the green square appeared on each next slide in a zone which was adjacent to its previous position (such an algorithm had been designed to prevent the eyes from moving through the center). In each zone, the green square appeared an equal number of times (four times). The task was formulated as follows: "Constantly, without getting distracted, keep your eyes on the green square".
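The constraints on the slide sequence described above (16 slides, four appearances per zone, each new position adjacent to the previous one so that the gaze never crosses the center) can be illustrated with a short sketch. The actual algorithm used to build the slides is not given in the paper, so the approach below is only one assumed way to satisfy these constraints, and the function names are hypothetical. It relies on the clockwise numbering: zones 1 and 3 are diagonal to each other, as are zones 2 and 4, so any sequence that alternates between the pair {1, 3} and the pair {2, 4} only ever moves to an adjacent zone.

```python
import random
from collections import Counter

ODD_ZONES = (1, 3)   # top left, bottom right (diagonal to each other)
EVEN_ZONES = (2, 4)  # top right, bottom left (diagonal to each other)


def is_diagonal(a, b):
    """True if zones a and b lie on a diagonal (clockwise numbering 1..4)."""
    return (a - b) % 4 == 2


def make_zone_sequence(n_slides=16, rng=None):
    """Random zone sequence with equal counts and adjacent-only moves."""
    if n_slides % 4 != 0:
        raise ValueError("n_slides must be a multiple of 4")
    rng = rng or random.Random()
    per_zone = n_slides // 4
    odd = [z for z in ODD_ZONES for _ in range(per_zone)]
    even = [z for z in EVEN_ZONES for _ in range(per_zone)]
    rng.shuffle(odd)
    rng.shuffle(even)
    # Randomly choose which pair of zones supplies the first slide
    first, second = (odd, even) if rng.random() < 0.5 else (even, odd)
    sequence = []
    for a, b in zip(first, second):
        sequence.extend((a, b))  # alternate between the two pairs
    return sequence


sequence = make_zone_sequence(rng=random.Random(2016))
print(sequence)
print(Counter(sequence))
```

Because consecutive slides always come from different pairs, the square never stays in place and never jumps diagonally, while each zone is used exactly four times.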
The second task was identical to the first one, but, simultaneously with the green square, squares of other colors appeared on the screen: red (255, 0, 0), yellow (255, 255, 0), and blue (0, 112, 192). This was done to create distractions and make it more difficult for the participants to distribute and switch attention owing to an increase in the number of objects being perceived. Thus, with overt orienting attention, the perceptual load increased (see Fig. 4). The algorithm for moving the green square was the same. It was assumed that the characteristics of attention distribution and switching in the second task would be worse.
In the criterion (Eq. (2)), TAi is the time (%) spent looking at zone i (i = 1, 2, 3, 4) at the first stage, and TBi is the time (%) spent looking at zone i at the second stage. This criterion was chosen based on the features of the program that we used, which was developed by A.P. Plyasovskikh. For this reason, it was inconvenient and not entirely appropriate for us to use the frequently encountered methods of eye-tracking analysis [35,36], such as the mean distance between the target and the gaze point. According to the setup of the experiment, attention is considered to be switched ideally if the share of time spent looking at each zone of the screen is 25%. Because the deviation of the time spent looking at each zone from this ideal value can be either positive or negative, and in order to "concentrate" the results (the deviations as a whole were not large; they varied from 0 to 3 in absolute value, not counting results that were obviously incorrect), the deviation was squared. This method seems suitable for our problem, as the operator, especially the pilot, needs to fix their eyes on particular instruments in a particular sequence, and what matters is whether or not they are looking at the right instrument at the right time. If they are not looking at it, then the mean distance between the target and the gaze point is, in general, no longer significant. We proposed this criterion based on an analysis of the tasks solved by the operator, and these experiments were also aimed at its validation.
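The criterion described above can be illustrated with a short calculation. The sketch below is a reading of the surrounding text rather than the authors' program: it assumes that the sum for each stage is the sum of squared deviations of the four zone shares from 25%, that ΣΣ = ΣA + ΣB, and that RZ is taken as ΣB minus ΣA (so that a negative RZ means the more complex second task was performed better). The function names and the input values are hypothetical.

```python
IDEAL_SHARE = 25.0  # ideal share of time (%) per zone with four zones


def sum_sq_dev(shares):
    """Sum of squared deviations of four zone shares (%) from 25%."""
    if len(shares) != 4:
        raise ValueError("expected shares for exactly four zones")
    return sum((t - IDEAL_SHARE) ** 2 for t in shares)


def stage_criteria(stage_a, stage_b):
    """Return Sigma_A, Sigma_B, their sum and their difference.

    Assumption: RZ = Sigma_B - Sigma_A (second stage minus first stage).
    """
    sigma_a = sum_sq_dev(stage_a)
    sigma_b = sum_sq_dev(stage_b)
    return {
        "sigma_a": sigma_a,
        "sigma_b": sigma_b,
        "sigma_sigma": sigma_a + sigma_b,
        "rz": sigma_b - sigma_a,
    }


# Made-up zone shares (%), NOT data from the experiment
result = stage_criteria([26.0, 25.0, 24.0, 25.0], [25.0, 25.0, 25.5, 24.5])
print(result)
```

With these made-up shares, the first stage gives a sum of 2.0 and the second a sum of 0.5, so RZ is negative under the sign convention assumed here.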
The heat map (for example, Fig. 5a) shows that the students were quite good at fixing their gaze within the given zones. However, the results of this experiment turned out to be contradictory, as a significant number of participants (22 out of 48) performed the more complex task better, even though, as is clearly seen from the gaze trajectory (Fig. 5b), in many cases the participants reacted to the distractions (the objects other than the green square). In addition, some of the results (obtained from eight students) were obviously incorrect. This can be explained either by the student turning their head (and getting distracted) or by poor calibration of the device for that particular student. After the obviously incorrect results were excluded, the following picture was obtained for the remaining 40 students (see Table 1; some of the correlations presented there, for example, between ΣA and ΣΣ = ΣA + ΣB, are given solely to make the picture complete). The correlations between the total sum of squared deviations from the ideal period of looking at each zone over both stages (ΣΣ) and the sums of squared deviations at each stage (ΣA and ΣB) are almost the same. However, ΣΣ is almost independent of the deviations from the ideal periods of looking at the top zones (tA1, tA2, tB1, tB2) and depends highly significantly on the corresponding deviations in the bottom zones (tA3, tA4, tB3, tB4). At the same time, there is almost no correlation between the absolute value of the difference between the sums of squared deviations at the two stages (|RZ|) and the sum of squared deviations at the first stage (ΣA), whereas the correlation with the sum of squared deviations at the second stage (ΣB) is strong and highly significant. Moreover, the bottom left zone (tB4) made the biggest contribution to it. It is not entirely clear what these results mean. It has been suggested that such correlations stem from the monitor being positioned too low relative to the eyes of the students who took part in the experiment.
It should be noted (see Table 2) that there were individuals with both positive (26 people) and negative (22 people) differences between the sums of squared deviations from the ideal period of looking at each zone at the two stages (RZ). As mentioned before, it was assumed that the characteristics of attention distribution and switching would be worse at the second stage than at the first. However, a significant number of subjects (22 students) did the more complex task better. At the same time, Pearson's chi-squared test [49] revealed no significant differences in the distribution of ΣΣ between the participants with positive and negative RZ (χ² = 1.6783 < χ²cr.0.95 = 7.815 for ν = 3, where ν is the number of degrees of freedom; see Table 2). Likewise, no significant differences were found in the distribution of positive and negative RZ among men and women (χ² = 2.025 < χ²cr.0.95 = 3.841 for ν = 1; see Table 3) or in the distribution of ΣΣ among men and women (χ² = 0.9312 < χ²cr.0.95 = 7.815 for ν = 3; see Table 4).
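The mechanics of Pearson's chi-squared test for a contingency table can be sketched as follows, using the counts from Table 3 (females: 12 and 9; males: 10 and 17) and the critical values quoted above. This is a plain, uncorrected Pearson statistic written in Python for illustration; the authors' exact computation is not described, so the statistic returned here need not coincide exactly with the reported value, although it leads to the same conclusion. The function name `pearson_chi2` is hypothetical.

```python
def pearson_chi2(table):
    """Uncorrected Pearson chi-squared statistic and degrees of freedom
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Critical values at the 0.95 level, as quoted in the text
CRITICAL_095 = {1: 3.841, 3: 7.815}

# Table 3: rows are females and males; columns are RZ < 0 and RZ > 0
table3 = [[12, 9], [10, 17]]
stat, dof = pearson_chi2(table3)
significant = stat > CRITICAL_095[dof]
print(f"chi2 = {stat:.4f}, dof = {dof}, significant = {significant}")
```

For a 2x2 table there is one degree of freedom, so the statistic is compared with 3.841.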
As the results of the first experiment were questionable, it was decided to conduct another experiment.

Table 2. Distribution of the sum of the squares of the deviations from the ideal period of looking at each of the zones at each stage (ΣΣ) of the first experiment based on the presence of positive or negative differences between the sums of the squares of the deviations from the ideal period of looking at each of the zones at each stage (RZ) (people)

Table 3. Distribution of positive and negative differences between the sums of squares of deviations from the ideal period of looking at each zone at each of the stages (RZ) of the first experiment based on sex (people)

            RZ < 0    RZ > 0
females     12        9
males       10        17

EXPERIMENT TWO: RESULTS AND DISCUSSION
A total of 43 students studying to become civil aviation pilots took part in the second experiment. The monitor was placed 10 cm higher than in the first experiment.
The task given to the participants was similar to the one used at the first stage in the first experiment. The parameters, such as the sizes of the zones as well as the sequence and speed of demonstrating the slides, remained the same. First, the participants needed to fix their eyes on the green square moving around the screen. Then, as in the first experiment, squares of other colors along with the green one appeared on the screen. After that, distractions became more difficult to overcome: in addition to squares of various colors, figures (circles and triangles) of sizes fitting the grid cells appeared on the screen. In addition, as can be seen in Fig. 6, some of these figures were shown in the same color (green) as the square that the participants needed to keep their eyes on. Other parameters, such as the algorithm for the appearance of the green square, the interval between the slides, and the RGB parameters, were left unchanged.
The results of the second experiment appear to be more informative. The results of the third stage were significantly worse than those of the first one, but, as before, the bottom part of the screen accounted for the majority of poor results. This was clearly visible on the heat maps both for participants whose test results were good, i.e. above the average, and especially for those who did poorly, i.e. whose test results were well below the average (see Fig. 7a). At the third stage, it is clearly visible in a number of cases how the participant's gaze moved in search of the green square, which had moved to the next zone according to the given algorithm (see Fig. 7b).
We tried to find out whether there were features in the facial structure, or rather in the eye area, of the participants who did well or badly in the experiment. To do so, a number of pictures were taken from different angles. Features that can potentially influence experiment results are deep-set eyes or a particular eye shape, strongly pronounced or low brow ridges, and particularly thick eyelashes. However, as can be seen, there are no particular differences between the two groups of participants except for slightly bigger bags under the eyes in the second group (see Fig. 8).
Later, we calculated the sums of squares of deviations from the ideal period of looking at each of the zones at each stage (ΣA, ΣB, and ΣC), the sums for the pairs of stages (ΣAB, ΣBC, and ΣAC), and the total sum (ΣΣ), as well as the differences between the three stage sums (RAB, RBC, and RAC). In Eq. (5), TAi is the time (%) spent looking at zone i (i = 1, 2, 3, 4) at the first stage, and TBi is the time (%) spent looking at zone i at the second stage.
The results are shown in Table 5. As can be clearly seen from Table 5, the results in the bottom zones are much worse than those in the top ones in 100% of cases. In 80% of cases, the results in the top left zone are better than those in the top right zone, and the results in the bottom right zone are better than those in the bottom left zone.
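For reference, the quantities calculated at the beginning of this section can be written out explicitly. The block below is a hedged reconstruction from the verbal definitions only, not the authors' original equations: the symbol T_{Ci} for the third-stage times and the sign convention of the differences (later stage minus earlier stage) are assumptions.

```latex
\[
\begin{aligned}
\Sigma_S &= \sum_{i=1}^{4} \left(T_{Si} - 25\right)^2, \qquad S \in \{A, B, C\},\\
\Sigma_{AB} &= \Sigma_A + \Sigma_B, \quad
\Sigma_{BC} = \Sigma_B + \Sigma_C, \quad
\Sigma_{AC} = \Sigma_A + \Sigma_C,\\
\Sigma\Sigma &= \Sigma_A + \Sigma_B + \Sigma_C,\\
R_{AB} &= \Sigma_B - \Sigma_A, \quad
R_{BC} = \Sigma_C - \Sigma_B, \quad
R_{AC} = \Sigma_C - \Sigma_A .
\end{aligned}
\]
```

Under this convention, a positive difference means that the later, more heavily loaded stage was performed worse than the earlier one.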
In the previous experiment, the more difficult task was done better (on average) than the easier one (see Table 2). In the second experiment, the results of comparing the second stage and the first stage are less controversial but also not quite expected. Only the results of the third stage of the experiment are consistent with the ones predicted.
As only male students took part in the second experiment and there were three rather than two stages, Pearson's chi-squared test was not applied to the results as was done in the first experiment (see Tables 2, 3 and 4).
Although future pilots did better, on average, in the second experiment compared with students majoring in air traffic control in the first experiment (see Table 5), the results of both experiments share almost the same major trends and peculiarities.

CONCLUSIONS
The hypothesis put forward in the experiment, that such characteristics of attention as distribution and switching deteriorate under distraction, was only partially confirmed. It was fully confirmed at the third stage of the second experiment. In the first experiment, the more difficult task was done better (on average) than the easier one, although the result is rather the opposite in terms of the number of participants. In the second experiment, the result of comparing the second and first stages is less controversial but also not quite expected. Only the third stage of the second experiment gave results that were consistent with the hypothesis. It is difficult to say what caused such results. There may have been errors in the setup of the experiment itself, but it is also quite possible that the Yerkes-Dodson law manifested itself [52], meaning that some of the participants were not attentive enough during the first stages of both experiments and became more concentrated at the second stages. Moreover, "Studies show that if there are many stimuli present (especially if they are task-related), it is much easier to ignore the non-task related stimuli, but if there are few stimuli the mind will perceive the irrelevant stimuli as well as the relevant" [34], so it is possible that this factor influenced our results.
It is still difficult to claim unambiguously that such indicators as RZ and ΣΣ, which are proposed in the paper and characterize how well the tasks are performed, are valid, but from the point of view of common sense and based on the results obtained, they seem quite appropriate.
The most interesting result is that both experiments showed the same picture, in which the bottom part of the monitor accounts for the worst results. In 100% of cases, the results in the bottom zones are much worse than those in the top ones. In 80% of cases, the results in the top left zone are better than those in the top right zone, and the results in the bottom right zone are better than those in the bottom left zone. Unfortunately, the Tobii REX eye tracker used in the experiments has such specifications that it cannot be used with operator training simulators. The trend revealed in the experiment may be of importance for air traffic safety.