
Visual-Haptic Interfaces: Modification of Motor and Cognitive Performance

Author: Morris Steffin, MD, Chief Science Officer, Virtual Reality Neurotech Lab
Updated: Oct 13, 2008


Introduction
The development of virtual reality (VR) technology has spawned new concepts of patient interaction and
behavioral modification. The extension of techniques developed for virtual surgery training and pilot training
provides the basis for retraining patients with neurological deficits resulting from multiple sclerosis, spinal cord
injury, and stroke. Moreover, the application of VR can be of substantial benefit in compensating for sensory
deficits, particularly in vision and hearing.

VR approaches can be directed toward assisting the performance of motor and sensory tasks; VR also can be
used to develop novel modalities of physical therapy to improve unassisted performance. New modalities of
diagnosis and treatment of sensorimotor processing deficits and cognitive dysfunction are emerging from the
confluence of clinical neurology, basic science advances, and computer science. In this article, the design
considerations of these assistive, diagnostic, and therapeutic systems are reviewed.

Visual-Haptic Interface
Central to the ability to modify motor performance in patients with neurologic disorders is the means to apply
corrective or cueing forces to the body parts involved in the activity. In patients with cerebellar tremor, for
example, as occurs in multiple sclerosis, a movement such as reaching toward and grasping an object
becomes extremely difficult, as demonstrated in Image 1 (panels A-F are stages of the movement in time). The
entire epoch, which lasts approximately 3 seconds, is shown fully graphed in Image 2.

As the patient attempts to reach for the target object (ie, the glass), his hand oscillates rather than following a
smooth and accurate trajectory. Interestingly, the terminal regions (thumb and fingers) are relatively stable,
allowing for reasonably accurate grasping, but the wrist oscillations result in overturning rather than grasping
the target object.

The successful trajectory for the patient's hand can be mapped out in advance once the target is selected. As
long as the patient's wrist and hand remain within limits established by the position of the target, he or she will
be able to reach it with stability. The spatial domain of these limits may be termed the force corridor. A device
can be envisioned that applies force to counter the patient's wrist movement should the wrist deviate outside
the corridor.

Thus, the 2 salient functions of the visual-haptic interface are as follows:

Establishing the force corridor on the basis of the position of the patient's body part and the target

Providing the counterforce (ie, haptic interaction) to constrain the body part to the force corridor

Establishing the spatial domain of the force corridor

The spatial domain (ie, the region of body part positioning needed to achieve the movement) is computed from
the initial position of the patient's body part (in this case, the wrist) and the position of the target. Position data
are available from the videospace of the patient and the target.
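To make this computation concrete, the following is a minimal Python sketch, assuming the wrist and target positions have already been extracted from the videospace as coordinates in a common frame. The straight cylindrical corridor, the function name, and the radius parameter are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def inside_corridor(wrist, start, target, radius):
    """Test whether the wrist lies within a straight cylindrical corridor
    of the given radius running from the starting position to the target.
    All positions are 3-vectors in the video-derived coordinate frame."""
    wrist, start, target = map(np.asarray, (wrist, start, target))
    axis = target - start
    length = np.linalg.norm(axis)
    axis = axis / length
    rel = wrist - start
    # Longitudinal progress along the corridor, clamped to its ends
    s = np.clip(rel @ axis, 0.0, length)
    # Perpendicular deviation from the corridor's center line
    deviation = np.linalg.norm(rel - s * axis)
    return deviation <= radius
```

Under this sketch, a corrective force would be applied only on frames in which the test fails.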

A rough corridor is delineated in Image 1. The 3 spatial regions of interest (ROIs), which are overlaid in blue,
are the lateral boundaries of the corridor. Encroachment by the wrist and fingers into the ROIs represents
deviation from the desired trajectory of the wrist and hand. Degrees of encroachment for each of the 3 ROIs
are plotted in graphs below each panel. The corresponding fast Fourier transforms of the encroachment
functions are plotted to the left of the panel, and the lowest fast Fourier transform graph is the coherence of
the upper 3 (for quantitative methods, see Steffin, 199735 and Steffin, 199934). These encroachment levels can
be used to control a haptic device that provides counterforce for correction of aberrant wrist movements. For
simplicity, only 3 ROIs are shown as limit points on the force corridor; in practice, at least 20 ROIs would be
necessary for accuracy.
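As a sketch of how such encroachment levels might be computed, assume a binary segmentation mask of the hand is available for each video frame; the rectangular ROI representation and the occupied-fraction measure below are illustrative choices, not the published method.

```python
import numpy as np

def roi_encroachment(hand_mask, rois):
    """Return the fraction of each ROI occupied by the segmented hand.
    hand_mask: 2-D boolean array, True where the hand is detected.
    rois: list of (row0, row1, col0, col1) rectangles along the corridor."""
    levels = []
    for r0, r1, c0, c1 in rois:
        region = hand_mask[r0:r1, c0:c1]
        levels.append(region.mean())  # 0.0 = no encroachment, 1.0 = fully occupied
    return np.array(levels)
```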

Haptic generator

The counterforce presented to a body part (in this case, the wrist) at any instant can be represented by a vector
whose characteristics must be determined by the constraints of the spatial domain and the conditions for
movement stability. The computational system provides a value for each ROI in the force corridor region
proportional to the level of encroachment by a body part (eg, wrist, fingers) into the corridor limit zone
delineated by that ROI.

The generated values for each ROI can be incorporated into a transfer matrix to determine the counterforce
vector components. The encroachment matrix values must be processed to generate the specific force
components. To continue the example of the reaching arm, application of force by a transducer at a single point
on the upper extremity, such as the wrist, is assumed for simplicity.

Consider a haptic device with 3 degrees of freedom of output; that is, the force takes the form of a vector, F =
F[x(D,t), y(D,t), z(D,t)], in which x, y, and z are functions of the spatial domain matrix, D, and time, t. By
formulating the force transfer characteristic in this way, the haptic generator can produce a stabilizing, rather
than destabilizing, corrective output to the patient. Bioengineering concepts and principles involved in the
construction of such a force vector from spatial data have been described. Implementation of the computational
subroutines is proceeding in the author's laboratory.35,34
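As one concrete instance of this formulation, the sketch below maps the per-ROI encroachment values to a 3-degree-of-freedom force vector through a linear transfer matrix. The matrix K and its gains are hypothetical placeholders; an actual transfer characteristic would be tuned per patient and need not be linear.

```python
import numpy as np

# Hypothetical 3x3 transfer matrix: each row gives the x, y, or z force
# gain applied to the encroachment level at each of the 3 ROIs.
K = np.array([[2.0, 0.0, -2.0],
              [0.0, 1.5,  0.0],
              [0.5, 0.5,  0.5]])

def counterforce(encroachment):
    """F = -K @ e: the force opposes encroachment into the corridor
    limits, yielding a stabilizing rather than destabilizing output."""
    return -K @ np.asarray(encroachment)
```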

The application of appropriate counterforce can appreciably decrease tremor and inaccuracy of movement in a
patient with cerebellar deficit, as indicated in Images 3-4, the latter showing the complete epoch. In this case, a
stabilizing force was applied as a preliminary test of the idea. Note the markedly decreased perturbation in
trajectory demonstrated by the much flatter curves in the encroachment graphs of Images 3-4 than in those of
Images 1-2.

Application of such a counterforce can be achieved by tethering a haptic device with 3 degrees of freedom directly to the
wrist.35,34 This general approach also appears to be effective in improving movement accuracy in certain cases
of spasticity.

This visual-to-haptic transfer approach has several advantages. Because the functional spatial domain is
constructed from the patient's videospace, acquisition of the spatial domain data is primarily a matter of
software engineering; this avoids the hardware complexity of integrating electromagnetic or multiple infrared
detectors into the patient's environment. Likewise, the transduction to force output, at least for the paradigmatic
case outlined here, involves relatively simple interaction between the computer and the force generator. The
goal of such an approach is construction of a practical instrument that would be available in a typical patient
environment. By extension, finer movements (eg, of the fingers) ultimately may be incorporated into the
approach using this and other stimulation modalities.

Facial expression control input - An auxiliary spatial domain

For severely motor-impaired patients (eg, quadriplegics), the extremity videospace monitor approach will fail
because the patient is incapable of the volitional extremity movement necessary to create a haptic input signal.
As an alternative, video processing of the patient's facial expression can be used to perform this task. This
method is potentially simpler and more reliable to implement than other current approaches, such as EEG
driving input, especially because no electrodes need be applied to the patient's head, and voice recognition
may require excessive processing time. The only requirements for facial control are a video camera mounted to
view the patient's face and a self-contained video digital signal processor (single-board, freestanding) operating
with algorithms under development in this laboratory.

Such techniques have been applied to detection of behavioral states, particularly drowsiness38 and loss of
consciousness (in addition to seizure detection35,34). For example, such a paradigm can detect sudden loss of
consciousness, as in pilots undergoing high acceleration.39 By using these techniques, scalar processing of
converted video facial input can be used to develop robotic assistance regimens. Work is proceeding in the
author's laboratory to develop algorithms for realization of this goal.

The basic approach to facial monitoring is demonstrated in Image 5. The eye region is analyzed in real time,
including the supraorbital region and the palpebral fissure. The graphs represent scalar values corresponding
to the positions of the structures in the corresponding videospace. Spatial and time resolution are good, as is
evident in Image 5.
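A minimal sketch of such a video-to-scalar reduction follows, assuming grayscale frames and a rectangular ROI. Using the mean pixel intensity within the region as the scalar is one simple illustrative choice; the measure actually used in the author's laboratory is not specified here.

```python
import numpy as np

def video_to_scalar(frames, roi):
    """Reduce each frame to one scalar per ROI: the mean pixel intensity
    inside the region. Eyelid or brow movement changes the intensity
    distribution within the ROI and therefore deflects the scalar trace.
    frames: iterable of 2-D grayscale arrays; roi: (r0, r1, c0, c1)."""
    r0, r1, c0, c1 = roi
    return np.array([f[r0:r1, c0:c1].mean() for f in frames])
```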

The same approach is demonstrated in Image 6 for the mouth region. Oral and chin movements are displayed
in separate channels. With mouth opening and closing, spatial and time resolution of the movements are
similar to those for the eye region. In this case, the mouth movements occurred on command and are therefore
more rapid (square wave) than would occur with physiologic yawning; differentiation between volitional and
subcortical processes such as yawning is clear with this method, as is shown in Image 7.38 With the physiologic
yawn, the graphs show much more gradual configurational changes of the mouth, almost sinusoidal rather than
rectangular. Preservation of high-frequency response is thus necessary for rapid system discrimination of and
response to volitional facial driving responses.

Increased spatial resolution can be achieved by multiple channel sampling of overlapping regions. This is
demonstrated in Image 8. Here, periods of active oral movement contrast with a period of cessation of mouth
movements. Reliability of the data is increased by interchannel correlation, as can be seen by inspection of
these traces during the cessation phase. Again, the waveforms demonstrate the feasibility of scalar analysis.
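One way to quantify that interchannel agreement is the mean pairwise correlation across the overlapping channels, sketched below; the Pearson measure is an illustrative assumption.

```python
import numpy as np

def interchannel_reliability(channels):
    """Mean pairwise Pearson correlation across scalar channels sampled
    from overlapping facial regions. Values near 1 indicate that the
    channels agree and the detected movement (or its cessation) is
    reliable. channels: array of shape (n_channels, n_samples)."""
    c = np.corrcoef(channels)                  # full correlation matrix
    n = c.shape[0]
    return c[~np.eye(n, dtype=bool)].mean()    # average off-diagonal terms
```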

To resolve behavioral changes in the patient, the video-to-scalar approach presented here is computationally
much more efficient than, for example, convolutional video transform analysis.

An example of conscious but quiescent facies, as opposed to volitional activity involving both mouth and eye
movements, is demonstrated in Image 9. Eye and mouth movements (2 channels each) are monitored
simultaneously. Eye movements are characterized by lower-amplitude, higher-frequency components than
mouth movements. As seen here, and in Images 6-7, mouth movements also show more baseline drift and
other low-frequency noise, making interpretation more difficult, although the uncertainty caused by such drift is
considerably reduced by the multichannel sampling of Image 8. However, further improvement in reliability is
achieved by high-pass digital filtering, as demonstrated in Image 10. In this case, the baseline during
movement cessation is nearly flat, leading to less ambiguity and greater reliability in behavioral assessment.
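A first-order IIR high-pass filter of the kind that could produce this flattening is sketched below; the single-pole form and the coefficient value are illustrative assumptions.

```python
import numpy as np

def highpass(x, alpha=0.95):
    """One-pole high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).
    Removes slow baseline drift from a scalar facial-motion channel
    while preserving the fast transients of volitional movements."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for n in range(1, len(x)):
        y[n] = alpha * (y[n - 1] + x[n] - x[n - 1])
    return y
```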

By adding an asymmetrical exponential decay to the output of the high-pass filter, a time delay can be
introduced to assess consistency of the signal change as it may reflect a behaviorally significant event. This
method is illustrated in Image 11. When activity ceases, the signal level decays exponentially until it reaches a
level that can trigger a response from the system. As soon as activity resumes, the trigger is reset. In this case,
correlation among 4 mouth channels determines response triggering.
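The sketch below implements an envelope of this type on a single rectified channel (eg, the correlated mouth channels averaged together): instantaneous rise with activity, exponential decay during cessation, and a trigger once the decayed level crosses a threshold. The decay constant and threshold are placeholder values.

```python
import numpy as np

def decay_trigger(activity, decay=0.9, threshold=0.2):
    """Asymmetric envelope follower: the tracked level rises instantly
    with activity but decays exponentially when activity ceases. The
    trigger fires (True) once the level falls below threshold and is
    reset as soon as activity resumes."""
    level, triggers = 0.0, []
    for a in np.abs(np.asarray(activity, dtype=float)):
        level = a if a > level else level * decay   # instant attack, slow decay
        triggers.append(level < threshold)          # True = alarm condition
    return np.array(triggers)
```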

Another correlation method involves a similar approach, but with monitoring of 2 mouth and 2 eye channels, as
in Image 12. In the middle of the sweep, both mouth and eye activity cease long enough to produce a
combined trigger effect, while at the end of the sweep only the mouth activity ceases long enough for the
triggering effect.
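Combining the per-region trigger trains could then be as simple as a logical AND, assuming the eye and mouth triggers from the previous sketch share a common time base; this mirrors the combined condition of Image 12 but is an illustrative construction.

```python
import numpy as np

def combined_trigger(eye_triggers, mouth_triggers):
    """Combined alarm requires simultaneous cessation in both regions;
    each input is a boolean trigger train from decay_trigger()."""
    return np.logical_and(eye_triggers, mouth_triggers)
```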

These combinations of approaches allow for a wide variety of machine responses to behaviorally significant
facial activity. Because the algorithms are efficient and can run on a stand-alone system, preferably a video
digital signal processor board, major computer resources are still left free for artificial intelligence routines to
effect interpretation of and response to the patient activity indicated by these scalar signals.

An example of the operation of this approach in real time for interpretation of facial actions (calling out mouth
movements and eye blinks) can be found in the movie clip in Virtual Reality Biofeedback in Chronic Pain and
Psychiatry.

Development is continuing to enhance interpretation of these video-derived scalar responses to integrate
patient facial activity in machine response paradigms. The potential exists for faster, more efficient response
with this technique compared with voice recognition or EEG control of robotics. A combination of all of these
signal modalities (eg, video, electrical, verbal) will likely ultimately be used to generate assistive responses for
severely disabled patients. Initial indications suggest that machine-level video facial interpretation will play a
prominent role in the design of assistive robotics for patients with severe motor impairments. Such a result
would indeed represent a cooperative robot, attentive to nonverbal and verbal cues.

Neurology Underlying the Visual-Haptic Approach

Movement disorders resulting in disabling inaccuracies and aberrations involve deficits in one or more of the
following systems (for a more detailed review, as applicable to haptic feedback, see Steffin, 199735 and Steffin,
199934):

Primary (corticospinal) efferent system

The primary, or direct, system includes predominantly excitatory output from large pyramidal cells projecting
directly to the spinal motor neurons. However, corticocortical inhibition plays a significant role in modulating
motor behavior at this level, and the projections of excitatory pyramidal cells are plastic and are modulated by
function. This is somewhat contrary to what had been suggested by previous conceptions of homuncular
anatomy. Plastic effects also, of course, involve connections from supplementary motor and other cortical
regions. Impairment in these regions also produces paresis.

Motoneuron modulatory projections

Projections, via the corticospinal tract and supplementary cortical areas (probably projecting onto spinal
interneurons), and cortical inhibition of reticulospinal and rubrospinal systems, also influence spinal motor
neuron set. Gamma efferent projections influence muscle spindle activity and therefore potentiate reflexes and
spasticity.

Sequencing deficits

Basal ganglia play an important role in sequencing motor behavior and modulating muscular tone. External
stimuli can produce improvement in sequencing and performance and probably account for kinesia paradoxica
(ie, temporary return of mobility in a patient with parkinsonism under the influence of an appropriate external
periodic keying stimulus) and gait amelioration.

Rationale for visual-haptic intervention

Evidence for neuroplasticity of the motor system suggests that visual-haptic assistance will be beneficial in 2
respects. First, such interactive systems can provide assistance in performing tasks otherwise precluded by
neurological deficits. These can range from force application to an impaired extremity to electrical stimulation of
intact musculature or can involve outright robotic assistance. At present, the first of these alternatives is
probably most practical from a resource standpoint. Second, the visual-haptic approach provides for the
development of novel modes of physical therapy.

The extent to which repetition of motor tasks with external cueing can enhance performance beyond immediate
assistance is unclear, but the evidence regarding neuroplastic enhancement of activity suggests that such
approaches may be effective. With the development of practical visual-haptic systems, as has been outlined
conceptually,34,35 significant advances in neurorehabilitation of motor deficits are likely to evolve from this
intervention. A corollary to this approach is the potential application of videospace-force interfacing techniques
to the realm of functional electrical stimulation.

Such interfacing in effect entails a fusion of robotic principles with a bionic interaction between patient and
machine. The visual-haptic systems described here are likely to provide a useful test-bed for the continuing
dynamic development of both external (force application) and internal (functional electrical stimulation)
methods of improving motor control in patients with neurological deficits.

VR in Cognitive Assessment, Modification, and Retraining
Theoretically, neuroplasticity can extend into sensorimotor performance and into cognitive realms. Application
of virtual reality (VR) techniques can be useful in providing standardization for neuropsychological testing and
in developing more encompassing environments for retraining.

Moreover, the immersive environments that can be generated with VR allow development of
neuropsychological test tasks that emulate necessary behavioral and cognitive performance requirements in
the real world with greater fidelity than currently provided by available instruments. Such approaches should
allow a high degree of interexamination standardization.

As a result of these unique capabilities, VR is finding a therapeutic role in several cognitive disorders. At
present, the long-term effect of visuomotor interventions on cognitive systems remains, to a great extent,
unexplored territory. Some attempts have been made to influence task-related performance, for example, in
patients with traumatic brain injury; results, however, remain uncertain.

The exact extent to which the motor component, as distinct from the sensory component, of the VR milieu can
alter behavior is in the early stages of investigation. Some interactions will be determined by closing the
VR-patient loop. Independent, objective measurements of patient attention are needed to assess the cognitive
effects of VR intervention and to provide feedback for modification of stimulation characteristics. Increasing the
richness and versatility of stimulation modes and measurement responses will involve interaction of haptic and
sensory modalities, hopefully with enhanced patient motivation.

Evaluations of cognitive performance based on overt performance and measurements such as event-related
potentials (ERPs) are likely to form the basis for training feedback systems. Assessment of attention and
motivation, aided by such measures, will determine at least some of the parameters of the haptic interaction of
VR training systems with patients. Following is a survey of some of these cognitive measures, including ERPs
and functional MRI (fMRI), and some likely directions their evolution will take in the context of VR interventions
for the treatment of cognitive disorders.

Autism

VR offers a major advantage in this setting because cognitive material can be presented with attainably high
levels of immersion. Although fostering initial acceptance of the head-mounted display and the VR environment
may be difficult, in most cases this can be achieved fairly rapidly.

Because environmental features within the VR setting are vivid and entirely controllable by the therapist, and
because nonverbal feedback from the patient can be made a central feature of the desired response, VR
appears to be capable of eliciting demonstrable improvement in reaction patterns to external stimuli in patients
with autism. ERPs show some promise for both autism and learning disabilities as an objective measure of
cognitive processing in response to VR stimulus patterns.

For related information, see Medscape's Autism Resource Center.

Attention-deficit disorders and learning disabilities

The attention-deficit disorders can be difficult to diagnose, and diagnostic modalities may not correspond well
to clinical situations. VR appears to have the capability to link well-controlled multimodality stimuli to more
objective physiological measurements of attention and discrimination. Electrophysiological and imaging
abnormalities have increased the understanding of physiological mechanisms in these disorders.
Characteristics of ERPs have, in some studies, shown good correlation with behavioral responses to
appropriate medication.

Basic differences in brain physiology may exist with medication; these differences are demonstrable with ERP
monitoring and, with refinement, should carry over to the detection of such physiological perturbations in more
complex, immersive environments. The study of ERPs allows dissection of the attention process, for example,
into novel but nonmeaningful stimuli versus novel and meaningful stimuli.

ERPs have been shown to distinguish electrophysiologically between attention-deficit/hyperactivity disorder
and combinations of attention-deficit/hyperactivity disorder with learning disabilities. The level of significance of
stimuli, particularly if such significance is established by prior events, can be assessed using ERPs. ERPs have
been shown to be a valid measure of the ability to discriminate phonemes. Visual-auditory cross-over tasks can
produce alterations in ERPs indicative of cross-modality processing.

Mapping of cortical asymmetries involved in tonal versus phonetic processing can be achieved by ERP
analysis. These approaches can be correlated with fMRI. Perception of phonemes as native or nonnative to the
subject's language markedly influences ERPs, as does phonologic-semantic inconsistency. Early ERP
components (N100) have been shown to display less lateralization in dyslexic children than in nondyslexic
children. Subtle ERP differences also arise in autistic patients.

For related information, see Medscape's ADHD Resource Center.

Traumatic brain injury

VR simulation of daily activities can be used in the development of teaching environments for cognitive
disabilities. Here, too, ERPs appear to be a valid indicator of cognitive deficit. Haptic interventions can be
useful in the alleviation of motor dysfunction in some cases. Much work remains to increase the clinical
reliability and utility of such approaches in ameliorating cognitive dysfunction. However, VR almost certainly will
play a major role in the development of future therapeutic interventions, as indicated by correlating fMRI
activation patterns to stimuli presented in a VR environment.

Particularly with cross-correlation among electrophysiological, haptic, fMRI, and novel psychometric measures,
the capacity to diagnose and intervene rationally in cognitive disorders is expected to be enhanced greatly.
New "virtual world" approaches to therapy and daily living assistance for neurological and cognitive disorders
will begin, more routinely, to reach patients on an affordable and manageable basis.

Conclusion
VR as a motor, sensory, cognitive, and measurement link to patients with neurological and cognitive deficits
has opened a new vista in potential levels of patient interaction. The groundwork is now in place to integrate
the immersive characteristics of VR, including haptic and special sensory modalities, in the construction of
novel stimulating environments. Electrophysiological and new psychometric instruments, some based on
haptics, are likely to be derived from such approaches as more standardized and accurate evaluation tools are
applied for the diagnosis and treatment of neurological and cognitive deficits. Creation of tailored environments
for these patients should allow substantial enhancement of functionality and experience in many of these
conditions.

Additional Information
For more information on visual-haptic interfaces and their application to virtual reality, see Virtual Reality:
Overview of its Application to Neurology.

Multimedia


Media file 1: Patient with cerebellar tremor showing free trajectory of wrist and hand movement. Force corridor is represented by 3 regions of interest (ROIs) as corridor limits. Graphs indicate degree of encroachment on ROIs as an attempt is made to reach the target.
Media file 2: The final frame of Image 1 is magnified. Note failure to reach the target successfully (ie, the glass is overturned).


Media file 3: Same maneuver as in Image 1 with suitable counterforce.


Media file 4: Final frame of Image 3, as in Image 2. Target (ie, glass) is grasped successfully.


Media file 5: Video-to-scalar method applied to eye movement (profile view). A. Single eye opening and closing on command. Upper trace shows eyebrow region movement; lower trace shows movements in the region of the palpebral fissure. B. As in A, except closure precedes opening. C. Series of 2 opening-closing cycles on command (square wave). In each case, raw video is shown at right, processed video region at left. Eye position can be observed in the raw video corresponding to the scalar signals as marked.
Media file 6: Mouth analysis using the technique of Image 5. Mouth opening (A) and closing (B) on command (compare with physiologic yawn in Image 7). As in Image 5, mouth position at the corresponding scalar points can be observed in the raw video. C. Series of 2 open-close cycles.


Media file 7: Physiologic yawn, mouth region of interest (ROI). Four scalar channels derived from subregions (SR) 1-4 as labeled. Note the much more gradual onset and decay, nearly sinusoidal rather than rectangular, with greater low- to mid-frequency noise due to changes in muscle tension and, therefore, mouth configuration.

Media file 8: Multichannel correlation of mouth region configuration during movement, cessation of movement, and resumption of movement, as labeled. Note the flat baseline in all channels once complete cessation of movement occurs and the abrupt return of movement in all channels with resumption of movement.

Media file 9: Relaxed (quiescent) facies. Note the lower-amplitude, higher-frequency signals in the eye channels, along with greater baseline drift in the mouth channels.

Media file 10: Effect of high-pass digital filtering. Mouth and eye activity during talking, with a period of cessation of talking. Note the flat, nearly noise-free baseline during cessation of movement, generally decreased baseline drift, and greater resolution of movement components, as compared to Images 8 and 9.


Media file 11: Addition of asymmetrical exponential decay after high-pass filter, 4 mouth channels. With cessation of movement, signal decay is exponential. If cessation is long enough, the signal declines to trigger level (labeled "Alarm trigger," red marker). The signal increases instantaneously (no delay) when movement resumes ("Reset alarm trigger," green marker).

Media file 12: Filter technique as in Image 11, applied to eye and mouth images (2 channels each). With complete cessation of facial movements, both eye and mouth signals decrement, resulting in a combined trigger ("Combined Eye and Mouth Trigger," red marker). When movements in both regions resume, both triggers are reset. Later in the sweep, mouth movements cease while eye movements continue; only the mouth trigger is set ("Mouth Alarm Trigger," red marker), then reset when mouth movements resume ("Reset Mouth Trigger," green marker).
