The authors have declared that no competing interests exist.
Conceived and designed the experiments: HZ CM LTM. Performed the experiments: HZ LAEH. Analyzed the data: HZ. Wrote the paper: HZ CM LAEH LTM.
We examined an eye-hand coordination task where optimal visual search and hand movement strategies were inter-related. Observers were asked to find and touch a target among five distractors on a touch screen. Their reward for touching the target was reduced by an amount proportional to how long they took to locate and reach to it. Coordinating the eye and the hand appropriately would markedly reduce the search-reach time. Using statistical decision theory we derived the sequence of interrelated eye and hand movements that would maximize expected gain and we predicted how hand movements should change as the eye gathered further information about target location. We recorded human observers' eye movements and hand movements and compared them with the optimal strategy that would have maximized expected gain. We found that most observers failed to adopt the optimal search-reach strategy. We analyze and describe the strategies they did adopt.
A variety of human daily activities, such as cooking, drawing, and driving, involve coordination of eye and hand. Typically your hand moves towards whatever you have just looked at. But is this coupling compulsory? To test whether human observers can adopt appropriate eye-hand coordination strategies to maximize rewards, we created an unusual task where good performance required that hand and eye move independently of each other in order to rapidly find and touch a target. Observers were rewarded for minimizing the overall time to find and touch a target among distractors and we made the visual search and hand movements very slow so that a simple “hand-follows-eye” strategy would reduce observers' winnings considerably. Most observers failed to choose the optimal visual search strategy but did intelligently coordinate hand movements to the visual search strategy they did pick. The “very slow search and reach” task we developed provides a novel approach to investigate coordination between perceptual and motor systems experimentally and computationally.
In visually guided manual tasks that involve a sequence of targets, the movements of eye and hand are usually tightly linked
This strategy of coordination makes sense, intuitively. Shortly after the start of sandwich making, you know where all the relevant items are and, if you did not use your gaze to aid your reach, it is unclear what you might do with your eyes instead. If, however, there were a rewarding alternative (e.g. watching your favorite television show), we might expect very different eye and hand movements in carrying out the same task.
When we talk about rewards and the probabilities of rewards, we are in the framework of statistical decision theory
In the past decade several groups of researchers have compared human choice of hand or eye movements to the performance of ideal decision makers who plan movements to maximize expected gain or a similar criterion
We investigated an eye-hand coordination task where we created an unusual reward structure intended to encourage a decoupling of the eye and the hand. Human observers were asked to find and touch a target among five distractors (
The red circle is the starting position for the eye and the hand. Each gray circle with blue shapes inside is an object. Two clusters of objects are located to the left and right of the midline, on a virtual arc centered at the starting position. One cluster contains four objects, the other two. On half the trials the two-cluster is on the left as shown, on the other half, on the right. Each object is equally likely to be the target. One and only one of them is the target to be touched. See the Stimuli section for a definition of the target and distractors.
In the task, we intentionally slowed down both visual search and hand movements to amplify their temporal costs. This sort of constraint occurs in everyday movements when, for example, we carry a very full cup of tea from one place to another. Visual objects (target or distractor) were made so complex that observers had to fixate an object for 1 to 2 seconds to discriminate target from distractor. Observers were required to move their finger along the surface of the touch screen under a speed limit. It took about 9 seconds to cover the distance from the starting position to any of the objects. As mentioned above, the observer received a monetary reward for touching the target, a reward that decreased linearly with time since the beginning of the trial.
We designed the task so that a sequential strategy that consisted of first locating the target and only then initiating the finger movement away from the starting position to the target would result in negligible reward. Participants could do considerably better by starting their hand movement before they had located the target through visual search. The strategy maximizing expected gain required that they plan hand movements on incomplete information about the target and update their movement plan as further information about the target location became available through visual search. If, for example, visual search of all the objects on the left half of the display screen has failed to find the target, then the target must be on the right side and the trajectory of the hand can be adjusted to take advantage of this additional knowledge.
The optimal strategies for an individual depend on the individual's speed in searching for the target, her movement speed, and the spatial layout of the target and distractors. As
We developed a model of optimal eye-hand coordination described under the
We recorded observers' eye movements and hand movements. Before the search-reach task, observers were trained in visual search with key press responses and in moving on the touch screen, and during these training sessions we obtained their search slope and reach speed separately. We compared the performance of human observers to the performance predicted by our model of optimal eye-hand coordination (maximizing expected gain) described below.
We considered three questions. First, do people use the visual search strategy that maximizes expected gain? In particular, do observers search in the order that reduces the spatial uncertainty of the target most quickly? We will conclude that they do not.
Second, the uncertainty of the target changes as the visual search proceeds. At the beginning of a trial, any of the six objects could be the target. When certain objects have been identified as distractors, though the target is still unknown, the target can only be one of the remaining objects. Can we find evidence that observers adapt their hand movements to the partial information acquired through visual search before the target location is identified? In particular, do they move their hands at all before the target location is known?
Finally, if people failed to maximize expected gain in visual search and/or hand movement, could this failure be attributed to a hard constraint of the motor system? Is it possible, for example, that the hand has no choice but to follow the eye, whether appropriate or not?
The experiment was approved by the University Committee on Activities Involving Human Subjects (UCAIHS) of New York University, and informed consent was given by each observer prior to the experiment.
Stimuli were presented in a dimly lit room on a 32-in. (69.8×39.2 cm) Elo touch screen, which was vertically mounted in a Unistrut frame and was run at a frame rate of 60 Hz with 1366×768 resolution in pixels. An Eyelink II eye tracker was used to record the gaze positions of the observer. The display and recording were controlled by a Dell Pentium D Optiplex 745 computer using the Psychophysics Toolbox
An example of the stimuli in the search-reach task is shown in
The starting position for the finger was a red circle of 0.8 cm radius, which was on the midline of the display and close to the bottom. Two clusters of objects were located to left and right of the midline, one cluster containing four objects, the other two. All the objects were along a virtual arc centered at the starting position. The center-to-center distance from the starting position to any object was 27 cm. Within a cluster the objects were equally spaced and the space between any two adjacent objects was 3 cm. That is, with the starting position as the origin, centers of two adjacent objects spanned an arc of 15 deg. The centers of the two clusters were 45 deg left and 45 deg right of the midline. On each trial the objects as a whole might have a clockwise or counter-clockwise jitter of no more than 3.8 deg.
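The layout described above can be reproduced numerically. The following is a minimal sketch, not the authors' stimulus code: the function names, the sign convention (angles measured from the midline, negative to the left), and the direction of the jitter are assumptions made for illustration. Only the 27 cm radius, the 15 deg spacing, and the 45 deg cluster centers come from the text.

```python
import math

RADIUS_CM = 27.0           # start-to-object distance (from the text)
SPACING_DEG = 15.0         # angular spacing within a cluster (from the text)
CLUSTER_CENTER_DEG = 45.0  # cluster centers relative to the midline (from the text)


def cluster_angles(n_objects, side):
    """Angles in deg from the midline (negative = left) for one cluster.

    Objects are listed from the innermost (closest to the midline) outward.
    """
    sign = -1.0 if side == "left" else 1.0
    half_span = SPACING_DEG * (n_objects - 1) / 2.0
    offsets = [CLUSTER_CENTER_DEG - half_span + i * SPACING_DEG
               for i in range(n_objects)]
    return [sign * a for a in offsets]


def object_positions(small_side="left", jitter_deg=0.0):
    """(x, y) in cm of the six objects, with the starting position as origin.

    jitter_deg rotates the whole arrangement (assumed positive = clockwise);
    the text limits it to 3.8 deg in either direction.
    """
    large_side = "right" if small_side == "left" else "left"
    angles = cluster_angles(2, small_side) + cluster_angles(4, large_side)
    positions = []
    for a in angles:
        theta = math.radians(a + jitter_deg)
        # x grows to the right, y grows upward from the starting position
        positions.append((RADIUS_CM * math.sin(theta),
                          RADIUS_CM * math.cos(theta)))
    return positions
```

Calling `object_positions("left")` returns the two small-cluster objects first, followed by the four large-cluster objects.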
The experiment consisted of two 1.5-hour sessions on two different days. All observers went through the following three experimental phases corresponding to three different tasks: training of visual search, training of reach, and testing of search-reach. In each trial of a particular task, observers received bonus points for successfully performing the task. During the trial, the number of points that could be won started at 100 and decayed linearly with the time used (at a rate of 8, 7, and 5 points/s, respectively, for the three tasks) until the task was successfully completed or the count reached 0. The observer received the point count remaining at the end of each trial. Every 1000 points were redeemed as US$1 at the end of the experiment.
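The reward schedule can be written out explicitly. This is a small sketch under the values stated above; the task labels used as dictionary keys and the function names are our own, chosen for illustration.

```python
START_POINTS = 100.0
# Decay rates in points per second for the three tasks (from the text);
# the key names are illustrative labels, not the authors' terms.
DECAY_RATES = {"search": 8.0, "reach": 7.0, "search_reach": 5.0}
POINTS_PER_DOLLAR = 1000.0


def points_remaining(task, elapsed_s):
    """Points still available after elapsed_s seconds, floored at zero."""
    return max(0.0, START_POINTS - DECAY_RATES[task] * elapsed_s)


def trial_reward(task, elapsed_s, success):
    """Points awarded for a trial: the remaining count if successful, else 0."""
    return points_remaining(task, elapsed_s) if success else 0.0


def bonus_dollars(total_points):
    """Convert accumulated points to US dollars."""
    return total_points / POINTS_PER_DOLLAR
```

For example, a successful search-reach trial completed in 10 seconds leaves 100 - 5 × 10 = 50 points.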
The task in this phase was ordinary visual search – to search for a target among distractors. A trial began with the display of a red circular starting position on the touch screen. When observers put their finger on the starting position and fixated it for 0.5 second, six objects appeared, in clusters of two and four. In half of the trials, the small cluster was on the left and the large cluster on the right; in the other half, the reverse. Observers knew that the target was equally likely to be any of the objects. They kept their finger on the starting position during the visual search. When they found the target, they responded by lifting their finger while fixating the target. Feedback followed. If they had correctly indicated the target, they were informed how many reward points they had won in the trial. Otherwise they were informed that they had erred and would receive no reward.
Each observer completed three variants of the task. In the
The layout of the objects could be small cluster on the left and large cluster on the right or the reverse. The observer performed 12 trials in the practice and 6 (target location)×2 (layout)×8 = 96 trials in the formal experiment for each of these three tasks. These tasks provided observers with experience at the visual search task. At the same time, we could estimate the search slope (searches/second) as well as the preference in search order for each observer in the free search task. During visual search training no hand movements were involved except to initiate the trial by touching the red dot and terminate it by releasing the red dot.
Observers fixated and pressed the starting position to start the trial, just as they did in the visual search tasks above. At the locations of the to-be-searched objects, one white circle and five blue circles appeared instead. The task was to move one's finger along the surface of the screen into the white target and then lift the finger. Observers were required to move at a speed of no more than 4 cm/s. The number of reward points for successfully reaching the target linearly decreased with the movement time. The feedback was similar to that of the search tasks.
There were 12 practice trials and 6 (target location)×2 (layout)×8 = 96 experimental trials. In practice trials, a white circle (radius 1 cm) followed the movement of the subject's finger on the screen with the subject in effect “dragging” the circle from point to point on the screen. The speed of the circle was limited to 4 cm/s and the positions of the finger and the circle were recorded every 16.7 milliseconds. If the finger moved too rapidly and opened a gap between the center of the circle and the finger of more than 1 cm, the trial would be terminated with a warning message displayed. Thus, over the course of training, the subject learned to move smoothly on the screen without “losing” the circle. We applied the same speed limit algorithm during the main part of the experiment but the circle was not visible. A trial was cancelled and repeated later if the requirements of the movement task were violated.
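One way to picture the speed-limit check is as a circle that chases the finger at a capped speed. The sketch below is an assumption about how such a check could work, not the authors' implementation: it advances the circle toward each recorded finger sample by at most 4 cm/s × 1/60 s and flags the first sample at which the finger is more than 1 cm ahead of the circle's center.

```python
import math

SPEED_LIMIT_CM_S = 4.0     # from the text
SAMPLE_DT_S = 1.0 / 60.0   # one sample every 16.7 ms (from the text)
MAX_GAP_CM = 1.0           # allowed finger-to-circle gap (from the text)


def step_circle(circle, finger):
    """Move the circle toward the finger by at most one speed-limited step."""
    dx, dy = finger[0] - circle[0], finger[1] - circle[1]
    dist = math.hypot(dx, dy)
    max_step = SPEED_LIMIT_CM_S * SAMPLE_DT_S
    if dist <= max_step:
        return finger  # the circle catches up completely
    scale = max_step / dist
    return (circle[0] + dx * scale, circle[1] + dy * scale)


def first_violation(finger_samples):
    """Index of the first sample whose gap exceeds MAX_GAP_CM, or None.

    finger_samples is a list of (x, y) positions in cm, one per sample.
    """
    circle = finger_samples[0]
    for i, finger in enumerate(finger_samples[1:], start=1):
        circle = step_circle(circle, finger)
        gap = math.hypot(finger[0] - circle[0], finger[1] - circle[1])
        if gap > MAX_GAP_CM:
            return i
    return None
```

Under this scheme a brief burst above 4 cm/s is tolerated as long as the accumulated gap stays within 1 cm, which matches the description of the subject "losing" the circle only when moving too rapidly.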
Parallel to the training of visual search, the goal of the training of reach was two-fold. On one hand, it helped observers to move comfortably on the screen under the speed constraint. On the other hand, it enabled us to measure the actual speed for each observer and detect possible motor biases.
After the separate tasks of visual search and reach, in this phase, observers were tested with the search-reach task as described in the
Each observer performed 12 practice trials and 6 (target location)×2 (layout)×8 = 96 experimental trials for the search-reach task. We did not repeat unsuccessful trials in the test of search-reach. If observers failed on a trial, they lost the bonus for the trial.
Observers completed the training of visual search in the first session and the training of reach and test of search-reach in the second session. For all the phases, the gaze positions and the screen coordinates of the finger were recorded every 0.017 second.
Eight observers (four female) participated. None were aware of the purpose of the experiment or the hypotheses under test. All were right-handed and used their right index finger to move on the touch screen. They received US$12 per hour plus a performance-related bonus calculated as described above.
What concerned us was the strategy of eye-hand coordination people would use in the search-reach task. Based on statistical decision theory
We number the six objects as 1∼6 starting from the outmost object of the small cluster (
Both the visual search strategy and the hand movement strategy of interest apply up to the time point when the target is found, since after that the only admissible strategy is to move one's finger straight towards the target at full speed. The search-reach time is the sum of the time to find the specific target plus this additional reach time. The former is determined by the search strategy alone. The latter is determined by the location of the finger when the target is found, which in turn is determined by the search and hand movement strategies.
To make the optimality problem tractable, we made some reasonable assumptions about the process of search and reach. First, we assumed that the observer fixates and examines one object at a time and does not switch to the next object until the current object is correctly classified as target or non-target. Second, we assumed that it takes a constant time to saccade to an object and then identify it as the target or a distractor. The actual time will differ with length of saccade, of course, but the differences are negligible at the time scale of the experiment. Third, we assumed that the observer changes the aim of her hand movement only when a new object is identified.
With these assumptions, the visual search strategy is reduced to specifying the order for the eye to visit the objects, such as 123456 or 342516, while the hand movement strategy could be specified by changes in direction of the finger each time a new object is identified. Let
Substituting
Considering that the last two objects are actually interchangeable in their order, we only need to specify the first four objects to be searched. We simulated all of the possible permutations (
For simplicity, we assume that the observer does not switch to the other cluster before finishing one cluster and goes from one end to the other end within each cluster. In
The differences between 1234 and the other three orders that start from the small cluster (1265, 2134, and 2165) are negligible, while these four are obviously better than the other two orders that start from the large cluster (6543 and 3456). To have an idea of the magnitude of the difference, consider a typical human observer (O7) with
We illustrate the simulation of hand strategies for the typical observer (on
The computation outlined so far may be too complex for humans to execute. So we considered a heuristic: always move toward the centroid of the objects that have not been identified yet. The observer might update her aim of movement after the identification of each new object (bottom panel), or only after one cluster has been identified, or never unless the target is found. We found that the aim-for-centroid strategy is a good approximation to the optimal strategy. If the movement aim is updated after each object, the expected search-reach time is almost the same as that of the optimal strategy. Even the aim-for-centroid with cluster updating is close to optimal, corresponding to 97%–99% of the maximum expected gain. Even if the aim is never updated, i.e. the hand keeps moving towards its initial direction until the target is found, the expected gain is 91%–96% of the maximum.
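The aim-for-centroid heuristic is simple enough to simulate directly. The sketch below is an illustration under the three modeling assumptions stated earlier, not the authors' simulation code. The object angles assume the small cluster is on the left and that objects 1 to 6 are numbered consecutively along the arc. The identification time `t_id` and hand speed `speed` passed in by the caller are hypothetical inputs, and the search order is given as 0-based indices (so order 1234 followed by 5 and 6 is `[0, 1, 2, 3, 4, 5]`).

```python
import math


def arc_positions(radius=27.0):
    """(x, y) in cm for objects 1..6; angles are counter-clockwise from rightward.

    Assumed layout: small cluster (objects 1, 2) on the left,
    large cluster (objects 3..6) on the right.
    """
    angles = [142.5, 127.5, 67.5, 52.5, 37.5, 22.5]
    return [(radius * math.cos(math.radians(a)),
             radius * math.sin(math.radians(a))) for a in angles]


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def move_toward(hand, aim, max_dist):
    """Move the hand toward aim by at most max_dist, stopping at aim."""
    dx, dy = aim[0] - hand[0], aim[1] - hand[1]
    d = math.hypot(dx, dy)
    if d <= max_dist:
        return aim
    return (hand[0] + dx / d * max_dist, hand[1] + dy / d * max_dist)


def search_reach_time(order, target, positions, t_id, speed, update_each=True):
    """Total time to find and touch `target` when searching in `order`."""
    hand = (0.0, 0.0)
    candidates = list(range(len(positions)))
    aim = centroid([positions[i] for i in candidates])
    elapsed = 0.0
    for obj in order:
        if len(candidates) == 1:
            break  # only one object is left, so it must be the target
        # The hand keeps moving while the eye examines this object.
        hand = move_toward(hand, aim, speed * t_id)
        elapsed += t_id
        if obj == target:
            break
        candidates.remove(obj)
        if update_each:
            aim = centroid([positions[i] for i in candidates])
    # Once the target is known, reach straight to it at full speed.
    return elapsed + math.dist(hand, positions[target]) / speed


def expected_time(order, positions, t_id, speed, update_each=True):
    """Mean search-reach time over the six equally likely target locations."""
    n = len(positions)
    return sum(search_reach_time(order, t, positions, t_id, speed, update_each)
               for t in range(n)) / n
```

With `update_each=False` the hand never changes its initial aim before the target is found, which corresponds to the "never updated" variant. Comparing `expected_time([0, 1, 2, 3, 4, 5], ...)` with `expected_time([5, 4, 3, 2, 1, 0], ...)` for a given observer's parameters shows how much the search order alone changes the expected search-reach time under this heuristic.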
Unless otherwise stated, the significance level used for all tests was .05 with a Bonferroni correction for 8 observers (
We used Kumar's
In the search-reach task, observers failed to touch the target in a considerable percentage of trials. We noticed that the failures of quite a few trials were due to a violation of the speed limit just after the eye fixated the target. We defined these trials as “almost-successful” trials. Most of the speed violations occurred immediately after the target was found and before that there was no significant difference between the almost-successful trials and the real successful trials in movement speed. Across observers the percentage of almost-successful trials ranged from 18% to 30% (median: 21%). The percentage of successful plus almost-successful trials was above 92% for all observers except one observer (O5, 75%). We completed the almost-successful trials by assuming that thereafter the observer would move towards the target at her mean movement speed. The rewards of almost-successful trials were re-calculated based on the re-calculated search-reach time. In later analyses, these completed almost-successful trials were treated as successful trials. Since the violation of speed limit occurred after the target had been identified, there was no reason to assume that observers had used a different search or reach strategy in the almost-successful trials from that which they had used in the real successful trials, although they received no rewards for the former.
Through the free search task in the training of visual search, we could estimate how rapidly each observer could identify an object as target or non-target. For each observer, we fitted the search time of a trial as a linear function of the number of objects fixated in the trial. Only successful trials were included. The variance explained ranged from 77% to 96% across observers.
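The per-observer fit described here is an ordinary least-squares line. A minimal sketch follows, using plain Python rather than any particular statistics package; the function name and the example numbers in the usage note are made up for illustration and are not data from the experiment.

```python
def fit_line(n_fixated, search_time_s):
    """Least-squares fit of search time against number of objects fixated.

    Returns (slope, intercept, r_squared). The slope is in seconds per
    object; r_squared is the proportion of variance explained.
    """
    n = len(n_fixated)
    mean_x = sum(n_fixated) / n
    mean_y = sum(search_time_s) / n
    sxx = sum((x - mean_x) ** 2 for x in n_fixated)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(n_fixated, search_time_s))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(n_fixated, search_time_s))
    ss_tot = sum((y - mean_y) ** 2 for y in search_time_s)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

For instance, `fit_line([1, 2, 3, 4], [2.0, 3.5, 5.0, 6.5])` describes a hypothetical observer who needs 1.5 seconds per object on top of a 0.5 second offset.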
We measured the observer's hand movement speed in the reach task of the training phase, in which the observer moved straight from the starting position to the designated target position.
Hand trajectories in the search-reach task are plotted for two typical observers, O1 (left sequence) and O8 (middle and right sequences). Each sequence is for one specific search order (labeled at top). Each panel is for one specific target position (labeled in the panel). Each trajectory is for one trial. Colors along the trajectories code stages of visual search. Red denotes that no objects have been examined. Cyan denotes that 1 object has been examined, and so on. Green denotes that the target has been found. Note where the hand trajectories change direction and how the trajectories vary with search order.
Based on the search slope and movement speed of an observer, we could predict her maximal expected gain in the search-reach task with the optimal model described earlier. Efficiency was defined as the average gain of successful trials divided by the maximal expected gain. To avoid overestimating maximal expected gain and thus underestimating efficiency, we added the observer's extra search time, if positive, when computing her minimum expected search-reach time.
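The efficiency measure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: it takes the search-reach decay rate of 5 points per second, treats the maximal expected gain as the gain at the minimum expected search-reach time (valid while the linear reward does not hit its floor of zero), and adds the observer's extra search time only when it is positive. All function and parameter names are our own.

```python
def expected_gain(expected_time_s, start_points=100.0, rate=5.0):
    """Gain at a given expected search-reach time under the linear reward."""
    return max(0.0, start_points - rate * expected_time_s)


def efficiency(successful_gains, min_expected_time_s, extra_search_time_s=0.0):
    """Mean gain of successful trials divided by the maximal expected gain.

    extra_search_time_s is added to the model's minimum expected time only
    if positive, so the benchmark is not made harder than the observer's
    measured search speed allows.
    """
    adjusted_time = min_expected_time_s + max(0.0, extra_search_time_s)
    mean_gain = sum(successful_gains) / len(successful_gains)
    return mean_gain / expected_gain(adjusted_time)
```

An efficiency of 1 would mean the observer's average gain matched what the optimal strategy is expected to earn given her own search and movement speeds.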
We plotted the efficiency of the search-reach task for each observer in
We could reject the hypothesis of optimality for all observers. Most observers were far from optimal. The median observer achieved only 78% of the expected gain predicted by the optimal strategy of eye-hand coordination. We will look at their actual visual search and hand movement strategies next.
As a second index of task performance, we computed the mean movement time after the target was found (post-found time) for each observer and compared it to the expected post-found time for optimal strategies (
For the search-reach task, we examined the search order of the trials in which at least one object had been searched. Only 14 trials were missing (O3: 1; O5: 12; O7: 1). For the median observer, the search order in 85% of the trials was one of the six that we had simulated earlier: 1234, 1265, 2134, 2165, 6543, 3456 (see
As we illustrated in
To save time, observers should search the small cluster first rather than the large cluster first. The percentages of usage of the two visual search strategies were contrasted with each other. Star denotes a significant difference. Only two observers correctly used the small-cluster first strategy more often than the large-cluster first strategy. The rest of the observers showed no significant preference between the two.
The two observers who correctly searched the small cluster first in the search-reach task did not do so by accident. In the free search task of the training phase, where the order of search did not influence the cost of time, O1 searched in both orders equally often, and O2 exhibited the reverse preference (large-cluster first). In contrast, observers who happened to prefer the small cluster first in the training phase (O4 and O5) unfortunately gave up this preference in the search-reach task (see
When the position of the target is known, there is no doubt the observer should move towards the target at her full speed, as in the reach training task. The interesting question though is how the subject moves her hand before the target is found. We addressed human observers' hand movement strategies from the following four aspects.
Yes, but six of the eight observers moved markedly slower than they had in the training of reach phase (
In the search-reach task, for a typical observer (
Is it possible that some observers might have benefited from moving slowly or even not moving before the target was found? We considered one possibility. Suppose that, due to visual and motor variability, the actual movement of the hand may deviate from the planned direction, particularly if the eye is engaged elsewhere. If the hand is moving in a wrong direction, moving at full speed would amplify the effect of any error. Suppose the angular error of hand movement followed a Gaussian distribution with a standard deviation of 5 deg, a pessimistic estimate given previous estimates of human pointing performance without visual feedback
In the simulation, all the observers' maximum expected gain increased monotonically with speed (
Each time an object was identified as a distractor, the object was excluded from the possible targets. As we discussed earlier under the model of optimal eye-hand coordination, given a specific order of search, the optimal hand movement strategy could be well approximated by the strategy of aiming at the centroid of the possible targets. Therefore, if human observers accommodate their hand movement to their search, we expected them to update their movement direction towards the centroid of the as yet unsearched objects. That is, in a trial, if the observer searched the small cluster first, she should shift towards the large cluster when the objects in the small cluster have been identified as distractors, and vice versa.
For trials in which no fewer than three objects were fixated before the target was found, we evaluated the change of movement direction by computing the difference between the position of the finger at the end of fixating the third object and that of the first object. We used the angle
To summarize, five of the eight observers appropriately adjusted their hand movement based on current information from the visual search strategy they were using.
Although most of the observers correctly updated their movement direction with the progressing visual search, only a few of them moved in the direction that agreed with the optimal model of hand movement.
For trials with at least one object fixated before the target, we defined the direction of initial movement as the direction from the starting position to the position of the finger at the end of fixating the first object. We characterized it by its angle in a polar coordinate system centered at the starting position, measured counter-clockwise from the rightward direction. The mean initial movement direction across trials is shown in
In
We tested the optimality of human strategies of eye-hand coordination in a task that involved finding and then touching a target among distractors as rapidly as possible. To minimize the overall time of visual search and hand movement and thereby to maximize expected gain, the observer needed to search the possible locations of the target in a specific order and alter her hand movement repeatedly in response to new visual information.
In such a task, the optimal strategies of visual search and hand movement were inter-related and jointly determined by the time required to identify an object and move the hand to touch it. For objects divided into two uneven clusters, the optimal visual search strategy was to search the small cluster first and then the large cluster. The optimal hand movement strategy was to move towards the centroid of the objects that had not been searched yet.
We examined human observers' hand movement and visual search strategies separately. We found that observers did (correctly) move before the target was found and most observers updated their movement direction correctly contingent on the progress of their search.
This outcome is consistent with previous studies that show motor compensation for increased visual and motor uncertainty
Sensitivity to probabilistic structures does not guarantee the optimality of movement under uncertainty. In our task, where the optimal strategy of hand movement could be clearly defined, we found that human observers departed significantly from optimal: before the target was found, they did not move at their full speed and, at the beginning of their movements, only two out of eight correctly moved in the optimal direction.
As to the visual search strategy, most of the observers failed to prefer the optimal visual search strategy. They started their search from the small cluster and the large cluster equally often. This failure is probably due to the indirect link between eye movements and the ultimate rewards. In the search-reach task, the better search orders do not help to shorten the search time itself but instead serve to shorten the movement time of the hand to the target. Although we investigators could model these indirect benefits or costs of eye movements
The sub-optimality of visual search or hand movement strategy might have been a result of inability to plan eye movements and hand movements independently. For instance, the endpoints of the eye and the hand are correlated in rapid reaching
Performing visual search and hand movement at the same time might lead to reduced performance in one or even both tasks. The observer might need a longer time to examine an object or the observer might have a larger variance in hand movement speed and thus have to slow down in order not to violate the time limit. However, as shown in
To conclude, people intelligently coordinate their hand with their eye in an uncertain environment. However, most of them did not use the eye and/or hand strategies that would have maximized the expected gain of the overall activity. Our study opens up the question: To what extent can the cost of one effector (e.g. hand) be taken into account in the movement planning of another effector (e.g. eye) of the same organism?
There are evidently costs of control in eye movements alone and in hand movements alone but these costs are consistent with near-optimal performance in the many tasks reported in the literature and reviewed in the introduction. In our task subjects must plan movements of two effectors (eye and hand) and it is very plausible that the sub-optimality we observe is due to a cost associated with planning two inter-related tasks (a “cost of coordination”). Testing this conjecture is a worthwhile direction for future research.