University of Birmingham
Defining and Evaluating Context for Wearable
Computers
Bristow, Huw; Baber, Christopher; Cross, James; Knight, James; Woolley, Sandra
DOI:
10.1016/j.ijhcs.2003.11.009
Document Version
Early version, also known as pre-print
Citation for published version (Harvard):
Bristow, H, Baber, C, Cross, J, Knight, J & Woolley, S 2004, 'Defining and Evaluating Context for Wearable
Computers', International Journal of Human-Computer Studies, vol. 60, pp. 798-819.
https://doi.org/10.1016/j.ijhcs.2003.11.009
Int. J. Human-Computer Studies 60 (2004) 798–819
Defining and evaluating context for
wearable computing
Huw W. Bristow, Chris Baber*, James Cross, James F. Knight,
Sandra I. Woolley
Kodak/Royal Academy Educational Technicon, School of Electronic, Electrical and Computer Engineering,
The University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
Received 17 November 2003; accepted 21 November 2003
Abstract
Defining ‘context’ has proved to be a non-trivial problem for research in context-awareness.
In this paper we address two questions: what features of activity are required to define context?
and does the use of context-awareness measurably improve user performance? The first
question was addressed by a study of everyday activities, using a Photo Diary method to arrive at a set of Context Identifiers, i.e. the features of activity that are needed in order to describe context. Two user trials were carried out to address the second question. We conclude that the use of context improves user task proficiency.
© 2004 Elsevier Ltd. All rights reserved.
1. Introduction
Recent developments in wearable and mobile computing are based on the
assumption that ‘context’ can be related to concepts such as the user’s location and
activity. Ostensibly this could allow associative structures (or ‘models’) that link
features of context to items of information; as the features of context change, so too
will the relevance of information. Thus, a significant aspect of wearable and mobile
computing is the ability of the technology to respond to changes in ‘context’.
Unfortunately, there remains a distinct lack of agreement as to what constitutes
‘context’. Abowd and Mynatt (2002) suggest that one can think of context in terms
of: who is using the system; what the system is being used for; where the system is being used; when the system is being used; and why the system is being used.

[*Corresponding author. Tel.: +44-121-414-3965; fax: +44-121-414-4291. E-mail addresses: [email protected] (H.W. Bristow), [email protected] (C. Baber).]

This
provides an initial avenue into the problem of defining context. We suggested a
classification scheme (Baber et al., 1999), as illustrated in Table 1, which pairs a
Reference Marker, i.e. the element that is being defined, with a simple demarcation
of time. In this manner, context can be defined as a combination of Reference Markers, each carrying information that is stored (past), being captured at the moment (current), or predicted (future).
Table 1 draws upon the example of a visit to an art gallery. The visitor carries a
mobile telephone and receives ‘text alerts’ (perhaps from the gallery itself to describe
events that are taking place, say a lecture in one of the galleries). The visitor’s
primary goal is to look at the paintings and to learn something about them. The
‘current time’ column indicates how this is achieved. However, the system is assumed
to also make reference to data from ‘past’ events, settings, activities, etc. and draw
upon these data to construct the information provided to the visitor, and to also
make predictions of possible actions. A question of prime importance, therefore, is
what data ought to be used to construct the ‘context’ for such a system?
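The Reference Marker × time classification of Table 1 can be sketched as a simple data structure. The class and method names below are our own illustration and are not part of any system described in this paper:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a context model keyed by Reference Marker
# (event, environment, task, person, object) and time frame
# (past, current, future), as in Table 1.

MARKERS = ("event", "environment", "task", "person", "object")
FRAMES = ("past", "current", "future")

@dataclass
class ContextModel:
    # entries[marker][frame] holds a list of recorded items
    entries: dict = field(default_factory=lambda: {
        m: {f: [] for f in FRAMES} for m in MARKERS
    })

    def record(self, marker: str, frame: str, item: str) -> None:
        if marker not in MARKERS or frame not in FRAMES:
            raise ValueError(f"unknown marker/frame: {marker}/{frame}")
        self.entries[marker][frame].append(item)

    def current_context(self) -> dict:
        # A snapshot of the 'current time' column, from which
        # relevant information could be selected for the wearer.
        return {m: list(self.entries[m]["current"]) for m in MARKERS}

# Example: the gallery-visit scenario of Table 1 (entries illustrative).
ctx = ContextModel()
ctx.record("environment", "past", "route taken to current location")
ctx.record("environment", "current", "Gallery 3")
ctx.record("object", "current", "painting: The Hay Wain")
print(ctx.current_context()["environment"])  # ['Gallery 3']
```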
An early example of a context-aware application, Forget-Me-Not (Lamming and
Flynn, 1994), sought to offer a ‘memory prosthesis’ by collecting information about
specific events, such as where or when a meeting occurred or who was present at the
meeting, in order to provide a reminder to the user in support of subsequent recall of
the event. The Remembrance Agent (Rhodes, 1997) provided users with links to
textual material, e.g. emails and reports, which was relevant to a specific event. Other systems, such as Shopping Assistant (Asthana et al., 1994), ComMotion
(Marmasse and Schmandt, 2000), and CyberMinder (Dey and Abowd, 2000) also
provide the user with an application that can support specific activities, often
supporting the recall of some information or providing a reminder to perform a task,
based on the ‘context’ of the person. In other applications, context is used to select
and present relevant information to the user. For example, in the ‘Touring Machine’
(Feiner et al., 1997), environmental reference markers, derived from the Global Positioning System (GPS), were used to link a person's location with both pertinent information,
Table 1
Features of context

Reference marker | Past | Current time | Future
Event | Log of previous alerts | Incoming text alert | Schedule of alerts
Environment | Route taken to current location | Current location | Possible destinations from the current location
Task | Been Done list | Current activity | To Do list
Person | Acquired knowledge of art | Current interests; current physiological state | Possible things to discover
Object | Previously viewed paintings | Current painting | Possible similar paintings
e.g. a description of the building that they were near, and augmentation of their view
of the world, i.e. through superimposing arrows on the head-up display to point to
buildings.
Applications have seen context being defined by environmental characteristics, e.g.
Schmidt et al. (1999) report a device (mounted on the wearer’s tie) that can detect
changes in ambient sound levels or wearer movement. Similarly there has been
interest in the notion of sensors mounted in badges that could be used to detect a
person’s location, and can modify information relating to their environment
accordingly (Want et al., 1992). Alternatively, context could be defined by changes in
the physical and physiological state of the wearer (Farringdon et al., 1999; Lind et al.,
1997). Finally, context can be related to the task that the person is performing. Thus,
wearable computers can assist the wearer in shopping activities (Asthana et al., 1994;
Randell and Muller, 2000).
The definition of ‘context’ represents a challenge to researchers in this field (Baber,
2001). Sensibly, researchers have tended to focus on defining the application and
seeking to measure or sense aspects of the context that are appropriate to the
application. Thus, in their study of delivery van drivers, Ashbrook and Starner
(2002) concentrated on GPS data and developed sophisticated means of collating these
data into patterns of activity. On the one hand, the approach to context-awareness as
the measurement of features is pragmatic, but on the other hand, it is not clear
whether these features are the most appropriate; we are relying on the intuitions of
the researchers to ensure that context is defined correctly. While such a ‘craft-based’
approach is proving fruitful in many applications, it tends to lead to researchers
rediscovering (or at least redefining) the whole concept of context. This, in turn, is introducing a degree of circularity into both the literature and the research community, i.e. 'context' is in danger of being defined simply in terms of whatever
features a ‘context-aware’ device is able to measure. Furthermore, while it is taken
on trust that context-awareness is a good thing, there has been surprisingly little by
way of research into the potential benefits of context-awareness on user
performance.
The first aim of this paper is to take an activity-centred approach to the definition
of context. In doing so, we hope to explore the meaning of context for a host of
everyday activities. From this we can begin to develop a comprehensive set of
context identifiers. The second aim of the paper is to examine the potential benefits
of context-aware devices, particularly in comparison with their counterparts that do
not exhibit such awareness.
2. Study one
The goal of study one is to investigate the context of everyday activity. In order to
achieve this goal, it was decided that participants would record everyday activity. In
this study a variation of a diary study (Rieman, 1993) was employed. It was felt that
asking participants to complete a written diary could prove difficult in that they
might not be sure what to record (hence leading to problems of 'limited recording', in which participants record only aspects of the activity that they find easy to put into words, producing only limited accounts) and that recording might be unduly intrusive. Furthermore, a diary study might also have led to 'recording bias', i.e. to participants recording information that they felt was relevant to the study rather than examples of the mundane activities that we sought. Consequently, it was
proposed that participants would record instances of their everyday activities using
photographs. Eldridge et al. (1992) previously reported a study into the use of video-recording as a means of creating a record of everyday activity. Tolmie et al. (2002) and Brown et al. (2000) have reported the use of photographic diary methods to study
everyday activities. Brown et al. (2000) limited the activities studied to ‘information
capture’ events in the work place and did not apply their work to the definition of
context. Tolmie et al. (2002) looked at a set of ‘domestic routines’ (specifically
interactions between two mothers leaving a house to pick up their children from
school) and discuss the context of these interactions in order to show how context
plays an implicit role in understanding the interactions, although they do not apply
their findings to the definition of context.
In this study, participants were asked to photograph everyday activities. However,
in order to overcome potential recording bias, it was decided that taking
photographs would be cued by the experimental protocol. In this way, participants
would not be able to select activities that they felt were interesting. For convenience
and simplicity, the experimental protocol was based on time; by asking people to
make a record using time of day, rather than activity, it was hoped that we would be
able to collect a richer, more representative set of examples of everyday activity. For
example, given the instruction to record ‘everyday activity’, people may have
concentrated on the more interesting aspects of their day that could have unduly
skewed the recordings. Obviously, participants were granted some leeway in what to record, e.g. they were allowed to exclude 'personal' activities from the study provided they recorded some activity within the window of around 15 min. It is worth
noting that, in this study, ‘time of day’ functioned solely as a cue to performing the
act of taking a photograph (and not as an independent variable in the experiment).
2.1. Phase one: photo collection
The photo diary method involved participants taking photographs of tasks or
activities that they were performing at specified times during the day.
Participants: Seven people took part in the study (3 male, 4 female; mean age 37.2 (±19) years). The sample size was limited in order to make the subsequent analysis less onerous. The selection of participants was intended to reflect as broad a population as possible; their occupations were: student, local government officer, teacher, social worker, gas engineer and two researchers.
Procedure: The week was split into 84 sessions of an hour and a quarter each (each day started at 8 am and finished at 11 pm). Twelve sessions were randomly allocated to
each of the seven participants. The participants were asked to take two photographs
of one activity they did during each time period. When the participants took these
photographs they were asked to note the time, exposure number and a one-line
description of the activity. The objective was to get evidence of a wide range of
typical everyday activities. Through this method we sought to minimize bias in the
selection of activities people photographed during the time slots, and to ensure that
the activities were evenly distributed over the week. In addition the participants were
not told how to photograph their activity, only that the photos should be self-explanatory in describing the activity. Each of the time slots was used only once; repetition of a time slot by different photographers was not necessary. As mentioned above, we were not so much interested in activities at certain times; the time slots were used as a way of prompting photographers to capture a wide range of activities, and not just those which they felt were interesting.
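The allocation described above amounts to randomly partitioning the 84 sessions among the seven participants, 12 slots each, with every slot used exactly once. A minimal sketch (slot indices and seed are illustrative):

```python
import random

# Sketch of the sampling protocol: the week (8 am-11 pm, seven days)
# is split into 84 sessions of 1.25 h, randomly partitioned so each
# of the 7 participants receives 12 unique time slots.

def allocate_slots(n_participants=7, slots_per_participant=12, seed=0):
    slots = list(range(n_participants * slots_per_participant))  # 84 slots
    rng = random.Random(seed)
    rng.shuffle(slots)
    return {
        p: sorted(slots[p * slots_per_participant:(p + 1) * slots_per_participant])
        for p in range(n_participants)
    }

allocation = allocate_slots()
# Every slot is used exactly once across participants.
assert sum(len(v) for v in allocation.values()) == 84
assert len({s for v in allocation.values() for s in v}) == 84
```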
2.2. Phase two: photo sorting
Once the activities had been recorded the sorting of the photos followed a
variation on the Card Sorting Method (Maiden and Hare, 1998; Sinclair, 1995). This
involved laying out all of the photographs and then searching for features that would
allow photographs to be grouped. This process was conducted by a panel of judges,
who performed the card sort task twice. The reason for performing the task twice
was to explore variations in sorting, which would emphasize different features in the
photographs.
Participants: Seven judges completed the photo-sorting task. All the participants
were engineering students (4 male, 3 female, mean age 25 (±3) years) and had not
completed the photo collection task.
Procedure: The photographs were mounted on pieces of card in their pairs. Each
judge was given identical verbal instructions to sort the photo cards into similar
groups. No definition of ‘similar groups’ was given; this was to be defined by the
judge. On the back of each photograph were written the time it was taken, the one-line description of the activity and the user number. The judges were instructed to
only look at these details if they could not ascertain the activity from the photo itself.
The judges were then asked to give a heading to depict each of their sets and describe
the main features that defined membership to the group. These features were called
the Key Features.
One week after completing the first sorting exercise, the judges were asked to re-sort the cards, ensuring that their classification was different from that in their first sort. This was done to increase the detail and variety of the Key Features elicited by the technique. It was obvious from the first sort that judges sorted in a similar manner, with body posture and movement, and location, being the most popular Key Features. It is, perhaps, interesting to note how 'location' serves as a 'common-sense' descriptor of context and has been the basis of much context-awareness research. The prominence of body posture and movement is, we felt, interesting in that the emphasis of the study was on human performance; this implies (we feel) that 'location' was also related to human activity, as opposed to being a parameter that related to a specific device that could be carried by a person. Asking more people to sort the cards from the random starting position (as in sort 1) would probably have produced more of the same Key Features. Asking the same people, who were familiar with the photos and the features, to sort again in a different way produced additional features. Since the desired outcome of the exercise was to produce as many, and as diverse, features as possible, the second sorting session went a long way towards achieving this. Again the heading, features and card numbers were recorded for each group.
2.3. Phase three: context identifier sort
Each Context Identifier from the photograph sort was then written onto a post-it
note. The post-it notes were then subjected to a further sort, which had two purposes: first, to identify sets of similar Context Identifiers produced by the judges; second, to give an importance weighting to each Context Identifier found, allowing a designer to select the most appropriate Context Identifiers for their context-aware device. The Context Identifiers do not tell a designer anything about what types of sensors to use, only the types of entities to sense.
Participants: In the final sorting phase, the authors performed the task.
Procedure: All of the elicited Context Identifiers from both photo sorts were
recorded on small yellow post-it notes. These post-its were then stuck to a large piece
of paper and organized into groups of connected features (i.e. ones that contained
similar key words or themes). This sorting produced a concise set of theme areas or
Context Identifiers, these included Location, Body Position, Objects and others. In
Location you would find terms such as home, work, office, indoors, outdoors.
Location was then split into subcategories such as known-locations, indoors/
outdoors, and novel-locations. The final groups of Post-its can be seen below in
Fig. 1. The miscellaneous category includes terms that were too general to fit into any other category; generally, these were terms that described an activity as a whole, e.g. shopping, which is the activity itself rather than a component of the activity.
Fig. 1.
2.4. Conclusions
Study one covered a large number of everyday activities, both work and leisure
orientated. By randomly allocating time slots, many mundane activities were
included that might have been missed. The Context Identifiers are shown below in
Table 2 in order of importance (highest first) with examples of states. The ranking
was defined by the percentage of Key Features that were included in a specific
Context Identifier, i.e. given that participants identified some 400 Key Features, the
top Context Identifier, ‘Body Posture and Movement’, had around 96 Key Features
associated with it. It is hypothesized that the highest weighted Context Identifiers are
most important as they occur most often in everyday activities. Body Posture and
Movement could be important because it is featured in almost all of the activities
analysed (which, it must be noted, featured people performing everyday activities)
whereas Weather was only a Context Identifier in a few of the activities (Fig. 2).
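The ranking behind Table 2 is a simple proportion: each Key Feature elicited in the card sorts is assigned to one Context Identifier, and identifiers are ranked by their share of the total. The counts below are illustrative, not the study's raw data:

```python
from collections import Counter

# Illustrative tallies of Key Features per Context Identifier
# (invented values; only the computation mirrors the study).
key_feature_counts = Counter({
    "Posture and movement": 97, "Location": 59, "Object": 58,
    "People": 32, "Time": 30, "Weather": 7,
})

def rank_identifiers(counts):
    """Rank Context Identifiers by percentage of all Key Features."""
    total = sum(counts.values())
    return [(name, round(100 * n / total, 2))
            for name, n in counts.most_common()]

for name, pct in rank_identifiers(key_feature_counts):
    print(f"{name}: {pct}%")
```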
It is proposed that Table 2 could support a designer in developing context-aware devices: the list can be used to determine the primary Context Identifier for an application, working down the list and adding more Context Identifiers as appropriate.

Table 2
Ranked list of context identifiers

Context identifier | % of total | Possible states of the context identifier
Posture and movement | 24.25 | Is the person moving, walking, sitting, standing?
Location | 14.75 | Is the location known?
Object | 14.50 | What object is being interacted with?
People | 8.00 | How many/which people are involved?
Time | 7.50 | What time or day is it?
Mood | 6.25 | Is the user relaxed/stressed?
Indoors/outdoors | 5.75 | Is the person indoors?
Type of interaction | 5.25 | How did the interaction between people or objects take place?
Aim of task | 4.00 | Is the task being done for enjoyment/work?
Physiological indicators | 3.25 | Is the heart rate high?
Cognitive load | 3.00 | How mentally difficult is the task?
Weather | 1.75 | Good/bad weather?
Frequency | 0.75 | How frequently is the task done?
Time critical? | 0.75 | Is the task length (time) constrained?
Planning? | 0.50 | Is the task planned/spontaneous?

Fig. 2.

The application of percentages to the Context Identifiers
should not be read in absolute terms, but it provides an interesting view of the
relative importance of these Context Identifiers for the everyday activities captured
and analysed in study one. For example, the top three Context Identifiers account
for over 50% of the responses, whilst the bottom six account for only 10%. This is
not to say that one should not use the Context Identifiers at the bottom of the list;
one of the characteristics of ‘context’ is, of course, that the relevance of Context
Identifiers will be dependent on what activities are being examined (so, for instance,
a medical application or a study of mental workload might place higher ranking on
physiological indicators or cognitive load than is shown in Table 2). However, if
researchers wish to consider Context Identifiers in specific application domains, then
the methodology presented in study one provides a relatively quick and simple
means of defining this information. For the purposes of this paper, it is proposed
that a wearable computer that uses context-awareness to support everyday activity
ought to be designed to handle location, and posture and movement of its wearer.
3. Study two: using location to define context
3.1. Developing a wearable context-aware system
Having established the importance of specific Context Identifiers, it was important
to determine whether context-awareness could actually be shown to be beneficial for
wearable computing. After all, if it could not be demonstrated that context-awareness leads to measurable benefits for the user, why should such systems be
developed? Therefore, two studies were carried out to investigate the effect of using
specific Context Identifiers on user performance. In the following section, the
platform, i.e. the wearable computer and the software, used in the first study will be
described.
The Photo Diary study suggested that the top two Context Identifiers were body
position and movement, and location. As discussed in the Introduction, Location
has long been a favourite Context Identifier in the Context-awareness literature. If it
could be demonstrated that Location could enhance user performance when using a
context-aware device, then it is possible to propose that the other Context Identifiers
might also be beneficial. Therefore, sensing Location would seem to be a sensible
place to start. Thus, a system was developed that senses a user’s location and reacts
to it. This system is the w3.
3.1.1. The w3
The w3 is a wearable computer developed at the University of Birmingham (http://www.wear-it.net). The w3 uses a PC104 embedded PC board, and has SVGA out, two communications ports, on-board LAN and USB sound. The main unit is a 166 MHz Pentium-class
chipset. A MicroOptical head-mounted display is used (with its own power source
and data converter), with the addition of a SVGA to NTSC converter allowing the
screen to be made larger for reading text from webpages. A Garmin GPS is used for
tracking the user's location. The GPS is accurate to a few metres, although it can be affected by reflections from buildings.
The processor runs Windows 98. This offers the capability to run commercially
available software and to share files between different computers easily. However,
there is the assumption that Windows is ‘power hungry’. We have found that it is
possible to modify BIOS settings in such a way as to significantly reduce power
requirements and to extend battery life; we typically expect some 6–8 h of battery life
(and have managed to run for 10 h on full load in laboratory settings). While these
times are, perhaps, too short for commercial application (one would not want to
keep running out of power towards the end of the working day), they do suggest that
it is possible to control power management in Windows to an acceptable degree.
Fig. 3 shows the w3. It is fairly small, measuring 170 × 40 × 100 mm. Even with the addition of the head-mounted display and battery, the system is still comfortable, light and easy to wear (Knight et al., 2002). The entire system is housed in a shoulder
strap, as shown in Fig. 4.
Fig. 3.
Fig. 4.
3.1.2. WECA-PC software
WECA-PC Portal is the software that analyses the data from the w3
Communications Port and sends a URL from an on-board database to Internet
Explorer. The incoming data either arrives from a GPS or from infra-red
transmitters located in buildings (although the latter option is not used in this
trial). The software is written in Visual Basic. The software strips longitude and
latitude coordinates from the GPS data stream. In addition to extracting
coordinates, the software also performs error checking and invokes different
routines depending on the absence or weakness of signals from the GPS. The
coordinates are used to query a database of previously identified locations in order
to call up a URL. In this application, web-pages are pre-stored on the w3 and we do
not use the GSM link.
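The pipeline just described can be sketched as follows; the original software was written in Visual Basic, so this Python version, along with its sentence format, location coordinates and URLs, is an illustrative assumption rather than the actual implementation:

```python
import math

# Sketch of the WECA-PC pipeline: strip latitude/longitude from a
# GPS NMEA sentence, then look up a pre-stored page for the nearest
# known location. All data below are illustrative.

def parse_gga(sentence: str):
    """Extract (lat, lon) in decimal degrees from a $GPGGA sentence."""
    fields = sentence.split(",")
    if not fields[0].endswith("GGA") or fields[6] == "0":
        return None  # no fix: caller invokes its weak/absent-signal routine
    def to_deg(value, hemi, width):
        degrees = float(value[:width]) + float(value[width:]) / 60.0
        return -degrees if hemi in ("S", "W") else degrees
    return to_deg(fields[2], fields[3], 2), to_deg(fields[4], fields[5], 3)

# Database of previously identified locations (illustrative entries).
LOCATIONS = [
    (52.4508, -1.9305, "file:///pages/library.html"),
    (52.4500, -1.9280, "file:///pages/great-hall.html"),
]

def url_for(lat, lon, max_metres=50.0):
    """Return the URL of the nearest known location within range."""
    best, best_d = None, max_metres
    for loc_lat, loc_lon, url in LOCATIONS:
        # Equirectangular approximation, adequate at campus scale.
        dx = math.radians(lon - loc_lon) * math.cos(math.radians(lat)) * 6371000
        dy = math.radians(lat - loc_lat) * 6371000
        d = math.hypot(dx, dy)
        if d < best_d:
            best, best_d = url, d
    return best
```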
3.2. Objectives
In the trials presented in this paper, the aim has been to compare three conditions
that reflect realistic sources of information for the task ‘find out about buildings on
the University of Birmingham campus’. One source of information is the Internet;
people can access web pages to find out about departments and buildings (this is
Condition 2—web). A second source is the buildings themselves; people can walk
around campus and look for information (this is Condition 3—world). A wearable
computer ought to permit access to the Internet for people who are walking around
campus and should provide the best of both worlds (this is Condition 1—wearable).
Conditions 1–3 are described in more detail in Section 3.2.1.
The aim of the user trial was to determine whether a context-aware wearable
computer would be useful for information retrieval tasks. It was assumed that such a
device would allow users to extract information from both the World Wide Web and
from the world around them. Consequently, the trial was constructed so that people
would search for information in one of three conditions (web only, world only,
wearable computer). In terms of Context Identifiers from study one, this study is
primarily concerned with Location, i.e. through the use of GPS data.
The comparison, between world, web and wearable, was intended to test whether
wearing a context-aware device led to superior performance to simply using
information that was available in the world, and whether mobile access to web-pages
led to superior performance to simply accessing the web on a static terminal. It was
assumed that the two reference conditions, i.e. walking around and using the
internet, would produce superior performance on two measures, i.e. we hypothesized
that wearing the computer might impede mobility, so that the walking around
condition should lead to a significantly faster mean time between questions (note that the mean time between questions is, for the mobile conditions, the average time spent walking between buildings and, for the Internet condition, the average time spent between questions); and that the internet would lead to quick information retrieval, and so
should support significantly faster time to find information (particularly when one
considers that the internet condition would employ the traditional mouse and keyboard and a 17-inch monitor familiar to all participants). There are two dependent
variables in this trial, Time and Accuracy; therefore, better performance is both faster and more accurate.
3.2.1. Method
Participants: 27 undergraduate students participated in the study (23 male and 4
female). Age range: 18–22. The participants were divided into three groups of 9 for
this study.
Procedure: A set of questions about buildings around the university campus was
devised, comprising two questions for each building, eight questions in total
(Table 3).
The participants were allocated, on arrival at the base-room, to one of three conditions (9 participants in each condition). The questions all had specific answers and could be answered in all conditions. The condition denoted the type of
information sources available to answer all of the questions as follows:
1. Condition 1: Wearable Computer—the user wore the w3 with the WECA PC
software installed and therefore had access to both environmental and virtual
data. As the users approached the relevant building a single web page was
displayed, in each case this page contained the answer to at least one of the
questions asked about that building.
2. Condition 2: Internet—involved the user only having access to virtual data. The
users were asked to sit at an Internet ready terminal in the School of Electronic
and Electrical Engineering and only use the University of Birmingham web sites
to find the answers.
3. Condition 3: World only—involved only environmental data. The users were
asked to walk around campus with a paper map, visiting the relevant building to
answer the questions.
Each group was given a brief explanation of the task and an introduction to the
equipment used. Training was limited, as we wanted to see how people would cope
with the novelty of the wearable computer. The questions were given to participants,
Table 3
Example questions

Location: Library
Q1: In what year did Queen Elizabeth lay a stone in the library?
  Answer: 1957. How did you get it? Look at the stone, or look at the picture on the web.
Q2: How many shields are above the main doors?
  Answer: 3. How did you get it? http://www.is.bham.ac.uk/mainlib/about.htm, or go and look at the main doors.
(Arrival time at building and departure time at building were also recorded.)
and any queries regarding the task or questions were addressed. Each group was then told
that they had 20 min to complete the task (although they were not stopped if they
went over this time).
The time taken to answer each set of questions (i.e. the time taken looking for
information) and the total time completing the task were recorded. This allowed a
comparison to be made of the time taken to complete each individual question, the
total time taken and the time taken between questions (in Conditions 1 and 3 the
time taken walking between buildings). The users' answers to the questions were
recorded and then marked to give them a score. Users were also asked to record
whether they answered the questions making use of virtual or environmental
information and any problems they encountered. At the end of the experiment, the
answers were checked and the participants debriefed. A full set of results was made
available to participants the week after the experiment.
3.2.2. Results
The results are divided into two parts: the first considers the effect of the different
conditions on performance time, and the second considers the effect of the different
conditions on participants’ ability to answer the questions.
Performance time data: Table 4 shows the overall time spent answering the
questions, between questions and overall task time for each condition.
A non-parametric Kruskal–Wallis test (chosen because samples of this size could
not be assumed to be normally distributed) was carried out to compare the mean
time spent answering questions, and the results show a significant effect of
condition [χ² = 11.779, df = 2, p < 0.005]. Looking at the mean times, it is
apparent that the wearable computer condition (Condition 1) was significantly
faster than the other conditions.
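The Kruskal–Wallis statistic used here can be computed directly from ranked timings. The sketch below is a minimal pure-Python illustration on made-up per-participant times (the raw per-participant data are not reproduced in the paper); the resulting H is compared against the χ² critical value for df = 2.

```python
# Pure-Python Kruskal-Wallis H statistic, applied to illustrative
# (made-up) per-participant answer times in minutes.
def kruskal_h(*groups):
    pooled = sorted(v for g in groups for v in g)
    rank = {}
    i = 0
    while i < len(pooled):                 # assign average ranks (handles ties)
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        for k in range(i, j):
            rank[pooled[k]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    return 12 / (n * (n + 1)) * sum(
        sum(rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

wearable = [6.9, 7.4, 7.8, 7.2, 8.1, 7.5, 7.0, 7.9, 7.6]
internet = [22.0, 24.1, 23.5, 22.8, 23.9, 23.0, 22.5, 24.0, 23.4]
world = [11.9, 12.8, 12.4, 12.1, 13.0, 12.5, 12.0, 12.7, 12.3]

h = kruskal_h(wearable, internet, world)
print(f"H = {h:.2f}")  # H above 5.99 (chi-squared, df = 2) implies p < 0.05
```

With fully separated groups like these, H is large and the condition effect is clearly significant, mirroring the pattern reported for the real data.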
In addition to considering how quickly participants could answer the questions,
the time to complete the study and the time between attempting each question were
measured. For the 'outdoor' conditions (Condition 1: Wearable and Condition 3:
World), this was the time taken to complete the task at one building and walk to
the next; for Condition 2: Internet, it was the time to move to the relevant
web page. Kruskal–Wallis tests were applied to the time between questions and the
total time taken to complete the task in each condition. For the 'Time between
Questions' a significant difference between conditions was revealed [χ² = 6.11, df = 2,
Table 4
Mean performance times for each condition

Mean time (min) | 1. Wearable | 2. Internet | 3. World
To answer all questions | 7.51 | 23.13 | 12.34
Between all questions | 11.17 | 3.33 | 10.51
Task completion | 19.09 | 26.47 | 23.17
p < 0.05]. The time between questions is much greater for Condition 1: Wearable
and Condition 3: World (walking around) than for Condition 2: Internet. The
difference between the total times taken to complete the task was not shown to be
significant, although the participants in Condition 1 were seen to complete the
task in the shortest amount of time.
Ability to answer questions: Table 5 compares the mean performance on question
answering in terms of percentage of questions answered correctly.
A Kruskal–Wallis test indicated a significant difference in performance between
conditions [χ² = 14.853, df = 2, p < 0.001]. Inspection of Table 5 suggests that
while Condition 2: Internet and Condition 3: World performed at similar levels,
the wearable condition exhibited superior performance.
Fig. 5 shows how participants in the different conditions relied on different
sources of information to answer the questions. The sources of information could be
the Environment (e) in which the person was walking or the Virtual display (v) of
information on the World Wide Web, or could be answered using either
Environment or Virtual (ve). The numbers in Fig. 5 refer to the three experimental
conditions: 1 = wearable, 2 = internet, 3 = walking around. Thus, Condition 2
only made use of virtual information (hence, 2v in Fig. 5), and Condition 3 only
made use of the environment (hence, 3e in Fig. 5), but Condition 1 could conceivably
use any source, i.e. 1e, 1v, 1ve. It is apparent that questions 1 and 5 could be most
easily answered using environmental information, whereas the other questions could
be answered using either environmental or virtual information. What is interesting is
Table 5
Mean performance on question answering

Condition | Mean score on questions
1. Wearable | 100%
2. Internet | 71.2%
3. World | 80.5%
Fig. 5. [Figure: sources of information (environmental, virtual, or either) used to answer each question, by condition.]
that the participants in Condition 1: wearable seemed to use the information that
most easily provided the answer, with a possible trend to use environmental
information if readily available and then resort to virtual information (unless virtual
information would more easily provide the answer). For example, when the
participants in the Internet condition answered a question incorrectly, the
participants in the Wearable condition typically elected to gain information from
the environment and not the Internet. Participants in the Wearable condition
successfully rejected the source that had failed to provide the answer quickly and
accurately and hence maintained greater accuracy and speed of response. It can be
seen that the number of users in the Wearable condition answering correctly matches
the number from either the Internet or World conditions, i.e. the same information
source has been used to answer the questions correctly.
3.2.3. Conclusions
The results indicate that using the wearable computer (Condition 1) not only
reduced the mean question–answering time but also enabled the participants to
answer all of the questions correctly. Those using this system performed more
effectively and efficiently. In terms of answering questions, the mean number
answered correctly was greater for the Wearable and World conditions than for the
Internet condition (Condition 2). Participants in the World condition scored higher
than expected because they made use of 'other' information sources in the real world;
for example, one of the questions related to degree programmes, and participants in
this condition all went into the relevant Undergraduate office to ask the secretary.
Such incidents were, however, isolated.
It is proposed that the participants’ ability to answer each question was dependent
on the source of information available to the participant. In other words there is a
relationship between the source of the information (whether it is environmental or
virtual) and the accuracy and speed of response for each of the specific questions.
Certain questions were more easily answered based on the source of the information
available. This is indicated by the results, as it is clear that the participants in the
Internet and World conditions answered different questions correctly. It is further
supported by the link to the choice of information source selected by the Wearable
condition; the users in the Wearable condition consistently chose the most
appropriate source.
Benefits of experimental design: It is important to note that if the experiment had
only looked at total time to complete the task, no useful results would have been
found. This is often the case in comparative evaluation of products and technologies;
differences lie in the process rather than the outcome of task performance. Recording
times for each individual component has been useful in ascertaining the full
implications of using the w³. The mean time taken between questions (physically, the
time taken walking between buildings) is obviously greater for the wearable and
walking-around conditions, where walking was necessary, than for Condition 2,
where the users had only to sit at a computer. What is interesting (and against our
expectations) is that these times are comparable for the Wearable and World
condition. This was taken to imply that the wearable did not significantly impede
performance.
Specific problems: Some users found a problem with the GPS, particularly when
near a specific building. This can be accounted for by two factors. First, if a user
stands near the edge of the active area (the area in which WECA PC Portal reacts
and loads the relevant web page) around a building, GPS drift can virtually move
the user in and out of the active area, loading and unloading the web page and
causing confusion. Second, where the area around a building contains a number of
high buildings, walking through small gaps between them may have degraded the
GPS signal and confused the system. Users also complained about the head-mounted
display, feeling that its resolution and size made it almost impossible to read
text from a page without zooming and scrolling around the screen for the relevant
piece of text; this zooming and scrolling was not easy to operate, and many users
found it frustrating. Despite the problem of reading the display, participants'
performance did not appear to be unduly impaired.
4. Study three: using location and body position to define context
Study two indicated that a location-based context-aware wearable system can
improve a user's performance in a given information retrieval task. Study one
suggested a set of Context Identifiers that could be used to determine a user's
context. Whilst the Context Identifier of Location featured in the top three from
study one, and was used in study two, posture and movement was identified as the
most important feature of context in study one. Therefore, it is suggested that this
is the next entity to sense. The aim of study three was to determine whether a
level-two context-aware wearable computer would be useful for information retrieval
tasks. It is hypothesized that adding a second level of context will yield better
performance than an improved version of the single-level system used in study two.
For study three, two versions of a wearable computer called the w³+ were
produced. System 1 took the concepts of the best-performing system from study two
and improved it in terms of underlying technology, for example a faster processor.
The underlying concept of sensing location as the only Context Identifier was not
changed; thus, context-awareness using Location represented the control condition
of study three. System 2 was created using the same technology but with the addition
of the ability to sense the user's body position. These two systems could then be
compared. In terms of the Context Identifiers defined in study one, study three
employed Location and Body Position.
4.1. The w³+
The w³+ is an upgraded version of the w³ used in study two. The main board is still
a PC104, but it now has a 700 MHz processor and 256 Mb of RAM. In addition, the USB
sound card and headset that were not operational are now fully working, due to a new, more
Table 6
Comparison of w³ versions

Feature | w³ | w³+
Main board | PC104 | PC104
Processing power | 166 MHz | 700 MHz
RAM | 32 Mb | 256 Mb
Context sensors | GPS | System 1: GPS only; System 2: GPS & accelerometer
Carrying system | Across-shoulder system | 'Camel Back' system
Head-mounted display | Micro-Optical, single eye, 200 × 300 resolution, translucent | Cy-Visor, dual eye, 800 × 600 resolution, opaque
Sound | USB sound card (not operational) | USB headphones and microphone
Power supply | 5 A | 8 A
Operating system | Windows® 98 | Windows® 2000
Control device | Thumb mouse | Thumb mouse
Internet connectivity | GSM (not operational) | GSM Nokia 6210 mobile phone, 9.2 kbps data rate
powerful power supply and a different chip set, and the GSM Internet connection is
working. The Micro-Optical head-mounted display has been replaced by a Personal
Display Systems Cy-Visor, which has a display giving a true 600 × 800 resolution,
making text clearer and more readable. However, this display was not translucent
and therefore did make walking more difficult. As in study two, a Garmin GPS was
used; for the second level of context a two-axis leg accelerometer was added. Fig. 5
shows participants wearing this kit. Table 6 shows a comparison of the w³ and
the w³+.
4.2. SSW—stand, sit, walk software
The WECA PC software used in study two was replaced with the SSW software. SSW
captures and analyses data from the GPS and the accelerometer. Data from the
accelerometer are used to tell whether the user is sitting, standing or walking. The
accelerometer has an X- and a Y-axis; when it is attached to the leg, the X-axis runs
parallel to the leg and the Y-axis at right angles to it. SSW takes an average
reading over 2 s (approximately 300 samples) and uses this to decide whether the
user is moving. Once SSW has decided the user's Body Position, it queries a database
with the GPS coordinates and Body Position and retrieves a URL associated with that
context. In study two the URL was passed to Internet Explorer; SSW instead has a
custom built-in Internet browser to which the URL is sent. The browser can be seen
in Fig. 6. The user is told the Location and Body Position via the headset; they are
also shown at the top of the browser.
When the user is walking there is a much higher acceleration in both directions,
particularly on the X-axis. Thus SSW tests whether the average X value is greater
than 1.6g and, if so, declares that the user is walking.
Fig. 6. [Figure: the custom SSW browser, showing Location and Body Position at the top of the display.]
When the user is standing, gravity acts along the X-axis, making the X acceleration
1g or greater; in addition, since the user is stationary, the Y acceleration is
small. Therefore, SSW tests whether the average X value is greater than or equal
to 1g and the average Y value is small (less than 0.5g), and declares this the
standing position.
When the user is sitting, gravity acts along the Y-axis, making the Y acceleration
1g or greater; in addition, since the user is stationary, the X acceleration is near
zero. Therefore, SSW tests whether the average Y value is greater than or equal to
1g and the average X value is small (less than 0.5g), and declares this the
sitting position.
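Taken together, the three rules above form a simple threshold classifier. The sketch below is an illustrative reconstruction, assuming the 1.6g, 1g and 0.5g thresholds quoted in the text and a window of averaged samples; the function name and the 'unknown' fallback are our additions, not part of SSW.

```python
# Posture classification following the SSW rules described above.
# x_samples / y_samples hold ~2 s of accelerometer readings in units of g;
# X runs parallel to the leg and Y at right angles to it.
def classify_posture(x_samples, y_samples):
    mean_x = sum(x_samples) / len(x_samples)
    mean_y = sum(y_samples) / len(y_samples)
    if mean_x > 1.6:                       # large X acceleration -> walking
        return "walking"
    if mean_x >= 1.0 and mean_y < 0.5:     # gravity along X, quiet Y -> standing
        return "standing"
    if mean_y >= 1.0 and mean_x < 0.5:     # gravity along Y, quiet X -> sitting
        return "sitting"
    return "unknown"                       # no rule fired, e.g. mid-transition
```

For example, classify_posture([1.0] * 300, [0.1] * 300) returns "standing".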
4.3. Objectives
In terms of the experiment (method and conditions are explained in the next section),
in System 1, used in Condition 1 (Location Only), the accelerometer is switched off
and the software behaves like the wearable in study two: a web page associated
with the current location is shown. In System 2, used in Condition 2 (Location and
Body Position), the software behaves differently. If the user is walking, a
blank screen is shown (the user does not therefore need to look at the screen, which
should make it easier to walk); when the user approaches a building of interest, SSW
tells the user via audio that they are walking near the building. They are also told
to stand still or sit down if they want more detailed information. If the user sits
down, the web page is shown in full (the same web page as when the user approaches
the building in Condition 1). However, if the user stands still, they are shown a
cut-down version of the web page. At present the editing of the web page is done
manually in advance, but it is proposed that this could be automated. Thus, the
information displayed to the user is adapted to their current context. For example,
when the user is walking, the information presented needs to be simple and
non-distracting. When a user stands still they can cope with more detailed
information; they can devote more attention to the task rather than to looking where
they are going. If the user sits down they can devote still further attention to the
information being presented and cope with a higher level of information. In terms of
study three, it is hypothesized that Condition 2 (Location and Body Position) will
perform better, since less searching is needed: the information is automatically
searched/edited by the system and presented in a way that is more helpful to the
user in their given context.
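The adaptation rules above amount to a small mapping from sensed context to presentation. The following sketch illustrates the idea; the page table, file names and function name are hypothetical stand-ins for SSW's database query and pre-edited pages.

```python
# Context-adapted presentation following the rules described above:
# walking -> blank display plus an audio prompt, standing -> cut-down page,
# sitting -> full page. PAGES stands in for the database SSW queries with
# the GPS coordinates; the file names are invented for illustration.
PAGES = {
    "library": {"full": "library_full.html", "summary": "library_summary.html"},
}

def present(location, posture):
    page = PAGES.get(location)
    if page is None:                     # not near a building of interest
        return {"display": None, "audio": None}
    if posture == "walking":             # keep the screen blank while walking
        return {"display": None,
                "audio": f"You are walking near the {location}; "
                         "stand still or sit down for more information."}
    if posture == "standing":            # cut-down version of the page
        return {"display": page["summary"], "audio": None}
    return {"display": page["full"], "audio": None}   # sitting: full page
```

present("library", "walking") returns a blank display with an audio prompt, while present("library", "sitting") returns the full page.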
4.4. Method
Participants: Ten undergraduate students participated in the study (nine male and
one female), aged 18–22. Participants were divided into two groups of five for this
study.
Procedure: A set of questions about buildings around the university campus
was devised, comprising two questions for each building, six questions in
total. The participants were allocated, on arrival at the base-room, to one
of two conditions (five participants in each condition). The condition denoted
the type of information source to be used to answer all of the questions, as
follows:
1. Condition 1: (Location Only) Wearable Computer with level-one context—the
user wore the w³+ with the SSW software (described above) installed, with the GPS
attached but without the accelerometer. Therefore Location only was sensed.
2. Condition 2: (Location and Body Position) Wearable Computer with level-two
context—the user wore the w³+ with the SSW software installed and both the
accelerometer and GPS attached. Therefore, Location and Body Position were
sensed.
Each group was given a brief explanation of the task and an introduction to the
equipment used. Training was limited, as we wanted to see how people would cope
with the novelty of the wearable computer. The questions were given to participants,
and any queries regarding the task or questions were addressed. Each group was then told
that they had 30 min to complete the task (although they were not stopped if they
went over this time).
The time taken to answer each set of questions was recorded (i.e. the time taken
looking for information). This allowed comparison to be made of the time taken to
complete each individual question. Users were asked to record whether they
answered the questions making use of virtual or environmental information and any
problems they encountered. At the end of the experiment, the answers were checked
and the participants debriefed. A full set of results was made available to participants
the week after the experiment.
4.5. Results
The results are divided into two parts: the first considers the effect of the different
conditions on performance time, and the second considers the effect of the different
conditions on participants’ ability to answer the questions.
Performance time data: The mean time taken to answer the questions fell greatly,
from 20:12 in Condition 1 (Location Only) to 8:00 in Condition 2 (Location and
Body Position).
If the mean times to answer the questions in Conditions 1 and 2 are compared, a
difference of about 12 min can be seen. It was felt that most of this difference was
due to the amount of information being downloaded, not to the performance of the
user. Therefore, an additional five users were asked to complete an altered
Condition 1 on a computer in the laboratory using a high-speed line, thus reducing
the download time; they used the same thumb mouse and headset to simulate
Condition 1. The mean time to answer the questions in the altered Condition 1 was
then subtracted from the mean time taken in Condition 1, giving an estimate of the
mean time spent downloading information (12 min 13 s). Dividing this by the number
of questions gives a mean download time per question of 2 min 2 s.
The mean time taken to answer the questions and the mean total time taken to
complete the task were then adjusted to take account of the download time. For both
Conditions 1 and 2, the mean download time per question was multiplied by the
number of questions that required information to be downloaded, and this amount was
subtracted from each user's time taken to answer the questions and total time taken
to complete the task. (In Condition 1 all questions required downloading, but in
Condition 2 fewer did, since some of the information was stored on the hard drive;
in most cases only two or three questions required downloading, depending on how
the users answered.) The means were then recalculated; they are shown in Table 7
and give a fairer comparison.
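The download adjustment is simple arithmetic and can be checked from the reported means (20:12 for Condition 1, 07:59 for the laboratory version, six questions):

```python
# Reproducing the download-time adjustment from the reported means (mm:ss).
def secs(mmss):
    m, s = mmss.split(":")
    return int(m) * 60 + int(s)

raw_cond1 = secs("20:12")      # mean answer time, Condition 1 (slow GSM link)
lab_cond1 = secs("07:59")      # same task over a high-speed laboratory line
n_questions = 6

download_total = raw_cond1 - lab_cond1         # 733 s = 12 min 13 s
download_per_q = download_total / n_questions  # ~122 s = 2 min 2 s

# Adjusting a condition in which all six questions needed a download
# recovers the laboratory figure:
adjusted = raw_cond1 - download_per_q * n_questions
print(download_total, round(download_per_q), round(adjusted))  # 733 122 479
```

For Condition 2 the same per-question figure would be multiplied by only the two or three questions that actually required a download.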
After adjusting for the downloading time a reduction in the mean time taken to
answer the questions and the total time to complete the task can be seen in Table 7.
Table 7
Mean time taken to answer questions and to complete the task, adjusted for the time taken to download pages from the Internet

Download adjusted (min:s) | Condition 1: Location only | Condition 2: Location and body
Mean time answering questions | 07:59 | 03:56
Total time | 15:47 | 10:20
Standard deviation, time answering questions | 01:47 | 02:33
Standard deviation, total time | 02:21 | 01:31
Fig. 7. [Figure: percentage of questions answered correctly in each condition.]
Kruskal–Wallis tests were carried out on the download-adjusted times; both
comparisons are significant: mean time answering questions [χ² = 3.962, df = 1,
p < 0.05] and total time to complete the task [χ² = 6.902, df = 1, p < 0.05]. Thus,
Condition 2 performed better than Condition 1.
Ability to answer questions: From Fig. 7, it can be seen that Condition 2 performed
7.5% more accurately than Condition 1. This result is not statistically significant.
4.6. Conclusions
From the results of study three, it can be concluded that adding a second Context
Identifier increased the performance of a user completing a given task, particularly
in terms of time (although accuracy did not improve much between conditions). The
mean time taken to answer the questions and the mean total time to complete the
task were significantly reduced by the addition of a second context level. The
result should not be taken simply to indicate that more Context Identifiers lead to
better performance. Rather, each Context Identifier has a positive effect on a
specific aspect of performance: Location provides a direct route to information
that is relevant to a particular place, and Body Position provides a means of
managing the presentation of that information, depending on what the user is doing.
Specific problems: Because the HMD was not translucent (even though it could be
pushed up and down), some people found it difficult to walk at a normal pace. A
second problem was the slow download speed of the mobile phone; this has been
accounted for by the download adjustment and therefore should not influence the
results. In the future the campus may be covered by a high-speed wireless local area
network (IEEE 802.11b), which would remove this problem.
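The scale of the download problem follows from the link speed alone. The sketch below compares the 9.2 kbps GSM link with an 11 Mbps IEEE 802.11b link for an assumed page size of 150 kB (an illustrative figure, not one measured in the study):

```python
# Rough page-transfer times: 9.2 kbps GSM vs 11 Mbps IEEE 802.11b,
# ignoring protocol overhead. The 150 kB page size is an assumed,
# illustrative value, not one measured in the study.
page_bits = 150 * 1024 * 8

gsm_secs = page_bits / 9_200         # ~134 s: roughly two minutes per page
wifi_secs = page_bits / 11_000_000   # ~0.11 s: effectively instantaneous

print(round(gsm_secs), round(wifi_secs, 2))  # 134 0.11
```

At this assumed page size the GSM transfer takes roughly two minutes, consistent in scale with the estimated download time per question of 2 min 2 s reported above.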
5. Discussion
In conclusion, the Photo Diary Method has allowed us to identify the Context
Identifiers that are important in defining a user's context. Designers could use the
list of Context Identifiers presented in Table 2, and the links between them, to
select what to sense in a given situation. Work must now be carried out to find
technological means of sensing each of the Identifiers.
The user trials involving the w³ and w³+ have answered the question: 'does
context-awareness support user performance?' The results indicate that
context-awareness can improve user performance on information retrieval tasks. In
fact, a considerable improvement in user task proficiency can be seen when using
the wearable system.
It is proposed that there are two main conclusions that can be drawn from the user
trial results. The first is that providing people with appropriate information
'just-in-time' can be beneficial to the performance of specific tasks. This is not
too surprising and basically offers empirical support to some of the underlying
assumptions that the 'context-aware' world has been using for some time. The second
is that, given the
option to use more than one information source, people are good at selecting the
most appropriate source. In other words, the results indicate that context-aware
technology is not simply about providing the information, but about providing a
source of information to which users can flexibly respond.
Acknowledgements
This work was partly supported by European Union grant IST-2000-25076 ‘Lab
of Tomorrow’ and EPSRC Grant GR/R33830.