3.4 Study Design
This within-subjects study had two independent variables: Scenario and eHMI. Participants used the VR simulator to interact with an SAE Level 5 AV using each eHMI across five Scenarios: (1) Controlled Intersection, (2) Roundabout, (3) Uncontrolled Intersection, (4) Lane Merging and (5) Bottleneck (road users moving toward each other in a narrow lane). These scenarios were chosen because they often prompt communication between human drivers and cyclists [1]. Each scenario had different characteristics, e.g. traffic lights or AV position, allowing us to investigate eHMI versatility. Scenarios were grouped into four tracks, one for each eHMI condition. Each track was a straight 1km two-lane road. Riders cycled in the left lane and had the right of way at intersections and roundabouts. Tracks contained each of the five scenarios where the AV yielded, plus two additional ones where the AV did not yield (see Figure 4). These two were excluded from analysis; they ensured cyclists paid attention and did not assume the car would always yield. Scenario order within each track was randomised. Participants navigated the seven scenarios, placed 100m apart, until the track's end. All AVs in one track had the same eHMI. The eHMI sequence was counterbalanced using a Latin square.
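The Latin-square counterbalancing of the four eHMI conditions can be sketched as follows; a minimal illustration using the standard balanced construction for an even number of conditions (the function and condition labels are ours, not from the study materials):

```python
def balanced_latin_square(conditions):
    """Return a balanced Latin square: each condition appears once per
    position, and each ordered pair of adjacent conditions occurs once.
    Works for an even number of conditions (here, four eHMIs)."""
    n = len(conditions)
    # Classic first row: 0, 1, n-1, 2, n-2, ...
    first, left, right = [0], 1, n - 1
    while len(first) < n:
        first.append(left)
        left += 1
        if len(first) < n:
            first.append(right)
            right -= 1
    # Each subsequent row shifts every index by one (mod n).
    return [[conditions[(x + shift) % n] for x in first] for shift in range(n)]

# Row i gives the eHMI order for participant i mod 4 (labels illustrative).
for row in balanced_latin_square(["SafeZone", "EmojiCar", "LightRing", "NoEHMI"]):
    print(row)
```

Each row is one participant's eHMI sequence; across the four rows, every condition occupies every serial position exactly once, controlling for order effects.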
Scenarios were modelled after video footage of cycling in the city of Glasgow [1], and UK traffic features were used. Lane Merging had obstacles requiring cyclists to enter from the right lane and exit from the left while merging lanes with a moving AV behind them. Bottleneck had parked cars on both sides; participants cycled in a narrow lane between them with the AV approaching, and one road user had to steer away. At intersections and roundabouts, the AV accelerated to 30mph (the standard UK urban speed limit) when it was 50 meters from the cyclist and stopped 50cm behind the give-way line if yielding. It accelerated to 25mph in Lane Merging and decelerated to 10mph when yielding. The AV drove at 15mph in Bottleneck, steered to the left (between two parked cars) and stopped when yielding. In all scenarios, the vehicle maintained its speed when not yielding. Controlled Intersection had a red light for 30 seconds in the non-yielding condition. AVs used directional indicators in Roundabout and Bottleneck. We collected the following data:
• Post-scenario questionnaire. To measure the versatility aspect of RQ1, the NASA-TLX was used to measure an interaction's workload, and five-point Likert scale (strongly disagree-strongly agree) questions asked: The AV was aware of my presence and I was confident in the AV's next manoeuvre. These were derived from work showing that AV awareness and intent are key for AV-cyclist interaction [3, 22].
• Cycling behaviour. We addressed RQ2 by measuring speed (meters per second) and shoulder checks (Unity camera (head) Y-axis rotation > 45°; determined through eight pilot tests). These were logged every second while navigating each scenario. We also collected gaze data: the number of fixations on each area of interest (AOI); AOIs covered vehicle features (e.g. the windscreen) and traffic control features (e.g. traffic lights; see Figure 6).
• Post-track questionnaire. We measured the acceptability aspect of RQ1 using the Car Technology Acceptance Model (CTAM) [26] and usability with the User Experience Questionnaire - Short Version (UEQ-S) [30]. Both have previously been used to evaluate cycling interfaces and pedestrian eHMIs [10, 35].
• Qualitative data. Post-study semi-structured interviews were used to contextualise the findings. Participants discussed and ranked each eHMI, highlighted points for improvement, discussed the different scenarios and identified ones that they felt did or did not need eHMIs.
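The shoulder-check criterion above (head rotation exceeding 45°) can be sketched as a simple threshold on the logged yaw; a hypothetical helper in which the function and field names are ours, and only the 45° threshold and per-second logging come from the study:

```python
def is_shoulder_check(head_yaw_deg, threshold_deg=45.0):
    """Flag a per-second sample as a shoulder check when the head (camera)
    Y-axis rotation exceeds the threshold in either direction."""
    # Normalise to [-180, 180) so e.g. a Unity yaw of 300 deg reads as -60 deg.
    yaw = (head_yaw_deg + 180.0) % 360.0 - 180.0
    return abs(yaw) > threshold_deg

def shoulder_check_rate(yaw_samples):
    """Mean of the binary per-second flags, as analysed per scenario."""
    flags = [is_shoulder_check(y) for y in yaw_samples]
    return sum(flags) / len(flags)

print(shoulder_check_rate([0.0, 0.0, 50.0, 0.0]))  # one check in four samples
```

The normalisation step handles Unity's 0-360° yaw convention, so a glance over either shoulder is counted symmetrically.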
3.6 Results
We answer the RQs by reporting cyclist perceptions and behaviours toward each interaction. We also examine how visual attention changes with eHMI through eye-tracking and show how acceptable and usable each eHMI was. Non-significant post hoc results are included in the supplementary material for clarity.

3.6.1 Post-Scenario Questionnaire.
The data were not normally distributed (via the Shapiro-Wilk test), so we conducted an Aligned-Rank Transform (ART) two-way ANOVA [37] exploring the effects of Scenario and eHMI on our outcomes. Post hoc tests between Scenario and eHMI pairs were conducted using the ART-C method [14].
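The normality check that motivates the switch to ART can be sketched with SciPy; a minimal example on synthetic skewed data (the real test was run on the questionnaire and behaviour measures):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Strongly skewed synthetic "responses": Shapiro-Wilk should reject
# normality, the situation in which a rank-based method such as the
# Aligned-Rank Transform is preferred over a parametric ANOVA.
skewed = rng.exponential(scale=1.0, size=200)
w_stat, p_value = stats.shapiro(skewed)
if p_value < 0.05:
    print("Not normally distributed -> use ART ANOVA, not parametric ANOVA")
```

The ART itself (aligning responses for each effect, ranking, then running a factorial ANOVA on the ranks) is implemented in the ARTool package referenced as [37] and is not reproduced here.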
NASA-TLX Overall Workload. Mean values are presented in Figure 5. We found a significant main effect of Scenario, with a small effect size (F(4, 349.68) = 4.92, P < .001; η2 = 0.05), and a significant main effect of eHMI, with a large effect size (F(3, 349.78) = 23.49, P < .001; η2 = 0.17). There was no interaction (F(12, 349.77) = 0.96, P = .486).
Controlled Intersection (M = 3.9, SD = 3.55), which had the most traffic control, was the least demanding Scenario to navigate; it was significantly less demanding than Lane Merging (M = 5.66, SD = 4.69; P = .001) and Bottleneck (M = 5.62, SD = 4.34; P = .003), both of which involved both road users moving and no traffic control. Safe Zone (M = 3.1, SD = 3.2) was the least demanding eHMI; it caused a significantly lower workload than Emoji-Car (M = 5.08, SD = 4.1; P < .0001), LightRing (M = 5.73, SD = 4.25; P < .0001) and No eHMI (M = 5.75, SD = 4.14; P < .0001). Emoji-Car also caused a significantly lower workload than No eHMI (P = .03), where cyclists had to infer yielding intent from driving behaviour. NASA-TLX subscale results followed a similar trend to the overall ones: Controlled Intersection required the lowest workload, and Safe Zone outperformed the other eHMIs on all subscales. No interaction was found between Scenario and eHMI for any subscale. The detailed findings can be found in the supplementary material.
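Assuming the reported η2 values are partial eta squared, they can be recovered from the F statistics and their degrees of freedom via the identity η2p = (F·df1)/(F·df1 + df2); a quick check against the NASA-TLX values above:

```python
def partial_eta_squared(f, df1, df2):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f * df1) / (f * df1 + df2)

# Scenario main effect: F(4, 349.68) = 4.92
print(round(partial_eta_squared(4.92, 4, 349.68), 2))   # 0.05, as reported
# eHMI main effect: F(3, 349.78) = 23.49
print(round(partial_eta_squared(23.49, 3, 349.78), 2))  # 0.17, as reported
```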
Confidence in AV Awareness.
A significant main effect of Scenario was found, with a small effect size (F(4, 349.95) = 2.64, P < .05; η2 = 0.03) and a significant main effect of eHMI, with a large effect size (F(3, 350.19) = 32.48, P < .001; η2 = 0.22). There was no interaction (F(12, 350.11) = 0.93, P = .518). No significant differences were found between Scenarios. Participants were least confident in the AV’s awareness when they did not receive explicit signals in the No eHMI condition (M = 2.91, SD = 1.51). Confidence was significantly lower around No eHMI than all other conditions: Safe Zone (M = 4.26, SD = 0.97; P < .0001), Emoji-Car (M = 4.02, SD = 1.24; P < .0001) and LightRing (M = 3.73, SD = 1.1; P = .0001). Similarly, LightRing caused significantly lower confidence than Safe Zone (P < .0001) and Emoji-Car (P = .008), despite it explicitly communicating awareness through colour changes.
Confidence in AV Intent.
A significant main effect of Scenario was found, with a small effect size (F(4, 350.03) = 3.79, P < .005; η2 = 0.04), and a significant main effect of eHMI with a large effect size (F(3, 350.38) = 36.17, P < .001; η2 = 0.24). There was no interaction (F(12, 350.23) = 0.83, P = .620). Participants were most confident in the AV's intent in the Bottleneck scenario (M = 3.62, SD = 1.43) when the AV was opposite them, moving at a slower speed. We found significant differences comparing Bottleneck with Lane Merging (M = 3.05, SD = 1.54; P = .002), where the AV was behind the cyclist (not in their field of view). Participants were most confident when Safe Zone (M = 4.31, SD = 0.91) was used. Safe Zone produced significantly higher confidence scores than Emoji-Car (M = 3.65, SD = 1.4; P = .0008), and LightRing (M = 3.16, SD = 1.4; P < .0001). Emoji-Car produced significantly higher scores than LightRing (P = .02). In contrast, participants were least confident in AV intent when there was No eHMI (M = 2.65, SD = 1.47), compared to Safe Zone (P < .0001), Emoji-Car (P < .0001) and LightRing (P = .007), where they had a display supporting them.
3.6.2 Cycling Behaviour.
Data did not have a normal distribution, so we conducted a two-way ANOVA of Aligned-Rank Transformed (ART) data exploring effects of Scenario and eHMI on cycling behaviour, with post hoc comparisons using ART-C.
Cycling Speed.
We found a significant main effect of Scenario with a large effect size (F(4, 361) = 19.33, P < .001; η2 = 0.18), and a significant main effect of eHMI with a medium effect size (F(3, 361) = 11.92, P < .001; η2 = 0.09). There was no interaction (F(12, 361) = 0.64, P = .807). Participants were fastest at Controlled Intersection (M = 5.36m/s, SD = 1.37), with no need for any right-of-way negotiation. They were significantly faster at Controlled Intersection than Roundabout (M = 4.8, SD = 1.29; P = .03), Uncontrolled Intersection (M = 4.8, SD = 1.49; P = .01) and Bottleneck (M = 3.92, SD = 1.24; P < .0001). They were slowest at Bottleneck, where the lane was narrower, so we found significant differences between Bottleneck and Roundabout (P < .0001), Uncontrolled Intersection (P < .0001) and Lane Merging (M = 5.06, SD = 1.6; P < .0001) (participants had to make a fast manoeuvre due to an AV moving behind them). Safe Zone (M = 5.26, SD = 1.23), which had simple signals covering a large surface, helped participants ride at higher speeds, compared to Emoji-Car (M = 4.87, SD = 1.5; P = .05), LightRing (M = 4.33, SD = 1.49; P < .0001) and No eHMI (M = 4.69, SD = 1.5; P = .003). In contrast, participants were slowest around LightRing, where they inferred yielding intent from eHMI animations, and were significantly slower than around Emoji-Car (P = .0053).
Shoulder Checks.
Data were binary (1 = a shoulder check was conducted, 0 = not); we analysed the mean number of shoulder checks for each eHMI in each scenario. We found a significant main effect of Scenario with a medium effect size (F(4, 361) = 9.53, P < .001; η2 = 0.10) and a significant main effect of eHMI with a small effect size (F(3, 361) = 3.94, P < .01; η2 = 0.03). There was no interaction (F(12, 361) = 1.7, P = .064). Shoulder checks were most likely at an Uncontrolled Intersection (M = 0.04, SD = 0.08), and we found significant differences with Controlled Intersection (M = 0.02, SD = 0.05; P = .02) and Roundabout (M = 0.02, SD = 0.057; P = .002). However, with the AV in front of the rider, they were least likely at Bottlenecks (M = 0.01, SD = 0.03). Shoulder checks were significantly less likely at Bottleneck than Controlled Intersection (P = .03), Uncontrolled Intersection (P < .0001) and Lane Merging (M = 0.03, SD = 0.06; P = .0045). Emoji-Car (M = 0.03, SD = 0.07) displayed icons on the AV's roof, producing the highest likelihood of shoulder checks. Checks were significantly more likely around Emoji-Car than Safe Zone (M = 0.01, SD = 0.04; P = .006), which projected colours on the road over a larger area, causing the fewest shoulder checks.
3.6.3 Gaze Behaviour.
Figure 6 shows the effect of eHMI on participant gaze behaviours. We conducted a Chi-square test of independence to investigate the relationship between eHMI and fixation counts. Post hoc tests were performed using Chi-square tests of independence with a Bonferroni correction. We found a significant association between the variables (χ2(36, 10970) = 2187.8, P < .001).
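The omnibus association test can be sketched with SciPy's `chi2_contingency`; a toy example with a fabricated 2×2 fixation-count table (the real analysis used the full eHMI × AOI contingency table of fixation counts):

```python
from scipy.stats import chi2_contingency

# Toy contingency table: rows = two eHMI conditions, columns = two AOIs.
# Counts are illustrative, not the study data.
table = [[10, 20],
         [20, 10]]
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
# Post hoc: pairwise chi-square tests on condition sub-tables, with alpha
# divided by the number of comparisons (Bonferroni correction).
```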
Post hoc comparisons showed that participants relied more on traffic control with No eHMI, as they fixated on traffic signs/lights and road markings more often than with Safe Zone (P < .0001), Emoji-Car (P < .0001) and LightRing (P < .0001). Results also showed that Safe Zone required less visual attention (fewer fixations on the eHMI display) than Emoji-Car (P < .0001) and LightRing (P < .0001).
3.6.4 Post-Track Questionnaire.
Figure 7 shows the mean scale ratings. A Friedman's test was conducted to investigate the impact of eHMI on the results. Pairwise post hoc comparisons were conducted using the Nemenyi test.
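The omnibus step can be sketched with SciPy; a minimal example with fabricated per-participant ratings for three conditions (the Nemenyi post hoc test is available in third-party packages such as scikit-posthocs and is omitted here):

```python
from scipy.stats import friedmanchisquare

# Fabricated ratings: each list is one condition; index i within a list is
# participant i (the repeated-measures layout Friedman's test requires).
cond_a = [1, 2, 3]
cond_b = [2, 3, 4]
cond_c = [3, 4, 5]
stat, p = friedmanchisquare(cond_a, cond_b, cond_c)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")  # every participant ranks a < b < c
```

Because every participant orders the conditions identically, the rank sums are maximally separated and the test statistic reaches its ceiling for n = 3, k = 3.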
CTAM Overall Score. We found significant differences among the conditions (χ2 = 19.675, df = 3, P < .001; η2 = 0.2194). Safe Zone (MD = 3.86, IQR = 0.64) was the most acceptable. It was significantly more acceptable than LightRing (MD = 3.21, IQR = 0.93; P = .002) and No eHMI (MD = 2.9, IQR = 1.02; P = .0005), which was the least acceptable. CTAM subscale results are presented in the supplementary material. Safe Zone was the most acceptable in all subscales, except for Perceived Safety, where there was no significant difference with the other conditions (χ2 = 2.882, df = 3, P = .41017; η2 = -0.0016).
UEQ-S. Significant differences were found among the conditions (χ2 = 29.793, df = 3, P < .001; η2 = 0.3525). No eHMI (MD = −0.75, IQR = 1.28) was the least usable; it was significantly less usable than Safe Zone (MD = 1.38, IQR = 1.5; P < .0001) and Emoji-Car (MD = 0.69, IQR = 1.59; P = .002). Safe Zone was the most usable; it was significantly more usable than LightRing (MD = −0.06, IQR = 1.53; P = .003). We also found significant differences among eHMI Pragmatic Qualities (χ2 = 27.454, df = 3, P < .001; η2 = 0.3218). Safe Zone (MD = 1.38, IQR = 1.5) had the highest Pragmatic Qualities; these were significantly higher than LightRing's (MD = −0.63, IQR = 1.88; P < .0001) and No eHMI's (MD = −0.38, IQR = 1.56; P = .0001). LightRing had the lowest Pragmatic Qualities; these were significantly lower than Emoji-Car's (MD = 1, IQR = 2.75; P = .005). There were also significant differences among eHMI Hedonic Qualities (χ2 = 25.569, df = 3, P < .001; η2 = 0.297). No eHMI (MD = −0.63, IQR = 2.38) had significantly lower Hedonic Qualities than Safe Zone (MD = 0.88, IQR = 2.19; P = .0001), Emoji-Car (MD = 0.75, IQR = 1.13; P = .001) and LightRing (MD = 0.63, IQR = 1.69; P = .006).

3.6.5 Qualitative Results.
We report themes based on the post-study interviews. We conducted an inductive, data-driven thematic analysis [6] of the interview transcripts (auto-transcribed by otter.ai and corrected by an author). Transcripts were imported into NVivo. One author extracted 42 unique codes from the data. Two authors sorted these into three themes based on code similarity. This process was iterative; disagreements were discussed, and codes were remapped until resolved. Themes with two or more overlapping codes were reassessed and combined when necessary. Participant eHMI rankings are visualised in Figure 8; participants ranked Safe Zone as the best and No eHMI as the worst.
Theme 1: eHMI colours. Participants spoke about their experiences with colour-changing eHMIs communicating intent. They were comfortable with Safe Zone using red and green: "I would go with conventional colours [...] They are easy to understand. I felt safer" - P14. They felt that the colours were distinguishable and unambiguous; "Red and green. Super, super intuitive. I understood very quickly what was going on" - P2, and preferred colour changes over animation: "LightRing would be my favourite if it used Safe Zone’s colours" - P16.
Theme 2: eHMI animations. LightRing’s animations communicating AV intent were hard to distinguish on the move: "I didn’t have time to concentrate on [the animation] while cycling" - P20. Some participants preferred animation to complement other distinguishable signals: "I think it’s tough interpreting what the car will do through animation alone." - P18.
Theme 3: eHMI state distinguishability. Icons could help eHMIs be more detailed and explicit in their signals. However, participants could not easily differentiate between the emojis in Emoji-Car from a distance. For example, P21 said "I spent more time trying to identify the emoji." and P13 said, "Interpreting emojis from far caused a lot of ambiguity". This could be because they are too detailed and share some features: "There is too much detail in the emojis, so I had to concentrate more. They are very similar. They both use yellow and have a similar shape." - P20.
3.7 Discussion and Design Changes
All designs were versatile; there were no interaction effects between Scenario and eHMI conditions in any result. This validates Al-Taie et al.'s [3] method for designing versatile eHMIs. However, we found areas where improvement was necessary; just proposing new designs based on cyclist expectations is insufficient, and first-hand interaction feedback needs to be part of an iterative design process. We used our findings to refine each design (see Figure 9).
Controlled Intersection required a lower workload. Cyclists relied on traffic lights, even when eHMIs were present: "I didn't see the eHMI, I saw a green light and went." - P15. This was similar to findings with human drivers; cyclists fixated more on traffic lights than on nearby cars [1]. AV position also impacted our results; participants experienced a higher workload, conducted more shoulder checks and were less confident in AV intent when the AV was behind them in Lane Merging, but were more comfortable when it was in front of them at Bottleneck, even though there was no traffic control.
Red/green signals were positively evaluated throughout. The colours were easy to recognise and distinguish; distinguishability is a key AV-cyclist eHMI feature due to the many fast-paced scenarios riders navigate. Most eHMIs use one colour and animation to communicate yielding intent [8]. However, we found that this hinders distinguishability and may not communicate enough information quickly. Our findings align with Hou et al.'s [18], where red/green signals performed well for lane merging scenarios. Red/green signals are also useful for pedestrians [8, 21], providing common ground for eHMIs accommodating multiple road user types.
However, due to the use of red/green in traffic lights, there is a risk of the signals being misinterpreted as instructions from the AV rather than as its yielding intent. Nevertheless, participants were most confident in AV intent when the colours were used in all scenarios, and the signals performed well in negotiation-based scenarios, such as Bottleneck. Some examples in traffic show that the same colour can convey different meanings (e.g., amber for pedestrian crossings, traffic lights, directional indicators, hazard lights, and on-car blind spot warnings). Human drivers also use hand signals similar to instructive ones from traffic control officers (e.g., waving for 'go' [15]) to communicate their intentions to cyclists rather than to instruct them [1]; this effect may extend to red and green. The signal's perceived meaning may depend on its source, and our results captured this: "There is no rule telling me to stop. Even if it is red, the car will react to me if I go. A rule tells me to stop at traffic lights, and the lights communicate this rule." - P13. New traffic colours, such as cyan, could be a more effective approach to avoid red/green being misinterpreted [9], but these were not positively evaluated in our investigation. A longitudinal study with cyclists learning to interpret such signals might show different outcomes. However, our results showed they need suitable contrasts to be distinguishable and effective. A compelling area for future work is investigating suitable contrasting colours and comparing their performance with red/green eHMIs in different scenarios.
All refined designs incorporated red/green signals to enhance eHMI signal recognisability and distinguishability. We recognised the challenge for colourblind riders to differentiate between red and green, so we incorporated animations, patterns, or symbols into our designs to enhance accessibility. We drew inspiration from traffic lights using light positions (red-top and green-bottom) and animations (flashing amber) to convey meaning.
Safe Zone was the most positively evaluated. The eHMI covered a large surface and used red/green signals to communicate intent. Al-Taie et al. [1] discussed the advantages of using the road as a design space. Safe Zone led to fewer shoulder checks and reduced the workload. Eye-tracking data showed it was easily visible with quick glances; cyclists spent less time fixating on this eHMI than on the others. Participants did not pay much attention to the bonnet display used in Safe Zone. They were sometimes unaware of its presence: "There was something on the bonnet? I did not know" - P4. Therefore, we relocated the bonnet display to the roof and replaced the traffic signs with colours synchronised with the projected signals. This spread the signals throughout the AV area and emphasised the idea of having displays in cyclists' peripheral vision, making it easier for them to process the colours and information. To accommodate colour-blind cyclists, we incorporated patterns on the roof display, using vertical lines for green and crossed lines for red.
Cyclists were slower around Emoji-Car and performed more shoulder checks than with Safe Zone. This could be because the display was on the roof, a location that does not currently carry interaction signals on real cars; participants were not used to this. They also paid greater attention to interpreting the icons than the colours in Safe Zone; eye-tracking data supported this. Qualitative feedback indicated that participants had difficulty distinguishing between the emojis, requiring a higher workload. They were also confused by the lightning emoji and suggested an icon more aligned with standard traffic symbols: "I can't map lightning to anything meaningful" - P1. Therefore, eHMI signals must be easily distinguishable and understandable from a distance. Some participants incorrectly interpreted the top cyan light as a signal that the AV was yielding, leading to potentially unsafe actions; P3 mentioned, "I saw the light on top and thought I could pass." The blinking arrow echoing directional indicators proved redundant and ambiguous, as participants were unsure whether it instructed them to turn or displayed the AV's turn direction.
We simplified Emoji-Car by keeping it focused on communicating the AV's intent and awareness. The revised version used red triangles to communicate non-yielding (found in traffic signs, suggesting caution) and green bicycle symbols for yielding. We removed the cyan light and blinking arrow to avoid confusing riders, with the eHMI only communicating necessary information. To accommodate colour-blind cyclists, we relied on icons to differentiate the signals. We deviated from Hou et al.'s [18] findings, where AV-cyclist interfaces placed on specific car areas did not perform well for lane merging, as we wanted to investigate roof-placed interfaces visible from around the vehicle, as recommended by previous research [1, 4]. This approach aimed to balance visibility and conformity to existing interface placements, such as taxi signs.
LightRing did not perform well; cyclists did not respond positively to a new colour (cyan) in traffic. Animations imposed a higher workload and were harder to distinguish than colours or icons. LightRing had higher complexity; it incorporated features such as synchronising amber lights on the car's side with the directional indicators, navy blue lights to indicate awareness, and animations communicating intent. This proved a hurdle, as cyclists preferred a more straightforward interface closer to Safe Zone. LightRing's lights were changed to pulse slowly in green when the AV detects and yields to the cyclist and to flash quickly in red when not yielding. Animations now complement colour changes rather than being the primary source of information. This also helps colourblind cyclists distinguish between yielding conditions, as the animations (speed-based rather than directional) are easier to differentiate [9]. Flashing animations are used in traffic, e.g. some pedestrian crossing signs flash before changing state. LightRing still communicates AV state using cyan; as signal changes are more apparent when animations and colours are combined, it does not display multiple signals simultaneously, as Emoji-Car did. Here, cyan is used to communicate a new message not currently communicated by human drivers. Overall, red/green was a useful colour scheme for eHMIs to communicate easily distinguishable messages about the AV's yielding intent across various scenarios. More complex messages, such as echoing a directional indicator, only added to the workload of using an eHMI. We adjusted all three designs based on cyclist feedback and behaviours observed in the simulator, to evaluate a second iteration in a real-world setting.