1 Introduction
Dance and rhythm games have emerged as one of the most popular genres in consumer Virtual Reality (VR). A prime example is Beat Saber [14], the best-selling VR title of all time [5], in which the player wields dual lightsabers to slice targets in rhythm. Dance has also organically emerged on social VR platforms such as VRChat [48] in various user-driven communities, where people enjoy social dancing and performance [34].
Despite the success of VR dancing, a fundamental problem remains: Dancing can be difficult, and it is challenging to instruct dancing, without an expert human teacher. In this paper, we propose and evaluate a new solution to this problem, focusing on a setting
The design challenge is how to present body movement trajectories in a way the user can easily understand and follow. Miller [30] describes the problem as:
“we don’t currently have a mechanism for streaming kinesthetic data into the human proprioceptive system in the same way that we stream audio and visual content” (p. 103).
Expecting a user to simply copy a model is not effective, as the user’s reaction time is limited and the human body has considerable inertia, an issue that Miller calls kinesthetic lag [30, p. 121]. Rather, skilled motor control generally requires some way of anticipating the required movements ahead of time [2, 40, 49].
Prior work typically uses some symbolic abstraction of the movement that allows for an anticipatory timeline presentation, e.g., the timeline of arrows indicating dance pad footstep sequences in Dance Dance Revolution [22] and the sparse key poses of Just Dance [46]. In some cases, such symbolic presentation is augmented with a model dancer demonstrating full-body movements for the player to mimic (e.g., Just Dance and Dance Central [43]). Dance Central additionally includes a practice mode that teaches the player to read the symbolic notation and execute the signified movements. However, dance game players may find such practice modes tedious or prefer not to switch modes during a social dance game session [30]. What is missing in dance games and more generally in VR dancing is a way to instruct realistic, non-abstracted choreography in real time, without a separate practice mode.
Contribution: We propose and evaluate WAVE, a new solution to the dance instruction problem. WAVE is a novel anticipatory movement visualization technique, illustrated in Figure 1. The core idea is that the user becomes part of a crowd of virtual dancers performing the instructed choreography with different time offsets, similar to spectators making waves in sports events. In dance pedagogy terms, we use the choreographic device of “canon” [16]. Our evaluation data (N=36) indicates that WAVE allows users to anticipate movements as they propagate through multiple virtual dancers, improving users’ accuracy in following the choreography in comparison to following a single model dancer.
2 Background and Related Work
Today, there are multiple different solutions for instructing movement and dance in virtual environments [
8]. Below, we review both dance games and non-game dance learning applications, dividing the discussion into non-VR and VR approaches. For a broader overview of HCI in the context of dance performance and related creative processes, we refer the reader to the review by Zhou et al. [
53].
A popular approach for using computers to instruct dance is to adapt the mimetic method, an established dance-teaching approach [
35]. Most applications using the mimetic method focus on teaching individual moves rather than whole choreographic pieces. Here, we have focused on studies using technologies similar to those available in Oculus Quest 2, the platform used in our application. These technologies include sound, visuals, and motion tracking of the user’s hands and head.
Chan et al. [
7] use a motion capture suit for dance training. The student receives three types of feedback. Firstly, the user’s pose, captured by the motion capture suit, is shown in real time next to an animated 3D model performing the desired movements. Secondly, a report displays the joints in which the player’s movements were incorrect. Lastly, a slow-motion replay allows the user to review their performance.
When focused on teaching specific genres of dance, rather than dance movement generally, studies have tended to follow a similar pattern [
18,
19,
51]. Teachers’ movements are recorded using motion capture, users try to copy those movements, and then the system evaluates their performance. Some studies of this kind have had promising results using Microsoft Kinect for real-time evaluation of full-body movement [
4,
37] and gamified movement instruction using Labanotation [
36].
The teaching approaches above rely heavily on information the user gets after performing; in effect, it is assumed that the user will learn gradually through multiple repetitions. In contrast, we strive to provide foresight of the desired movements so that players have a possibility of succeeding on the first try.
Optimizing instruction for the first try is also what commercial dance games appear to aim at. This is reasonable because optimizing the first-time user experience is of high importance in games [29, 33] and players have been found eager to skip tutorials [9]. However, most games simplify the instruction problem by specifying choreography only partially, abstracting away nuances. For instance, Dance Dance Revolution’s [22] arrows only specify footsteps and the key poses of Just Dance [46] do not indicate exactly how to transition between them. While there is evidence that dance games can teach dance skills [27], instructing realistic and nuanced dancing has remained non-trivial, requiring added complexity like the separate practice mode of Dance Central [43].
2.1 Dance Instruction in VR
VR has proven to be useful for instructing dance. For example, hip-hop students appreciated the way VR dance materials simplified movements and made them clear and easy to follow [47]. Similarly, Eaves et al. [12] found that information provided to users should not be too detailed. Feedback based on only four tracked joints worked better than twelve, in that users were unable to extract the relevant information when they were presented with too much data.
Some research indicates that learning dance with a partner can be beneficial, elevating users’ interest in learning dance [51] and improving performance [19]. It is no surprise that virtual dance partners are used in multiple VR dance studies. The study by Kirakosian et al. [21] did not measure how effective the method was for learning, but the users rated their enjoyment as high and most of them anticipated being more confident to lead someone in real life. Senecal et al. [41] similarly used virtual partners for salsa dance instruction and found that the movement patterns of users without prior dance experience became more similar to those of users with dance experience after using their system. They measured movement patterns using a number of features, including several specifically designed to capture core technical elements of salsa. Studies have also explored using multiple virtual model dancers to support dance instruction, as in the work of Kico et al. [20]. Here, we extend their work by having the virtual model dancers perform in canon instead of unison to provide anticipation of the next movements.
3 Design
The WAVE prototype evaluated in this study has three lines of dancers, including dancers positioned to the left and right of the user, as shown in Figure
1. Naturally, this is only one of many possible configurations of dancers, and Figure
2 shows alternatives tested during development. Below, we explain our design process.
3.1 Problem Definition
Based on our review of related work and its limitations, we defined two key requirements:
(1) The system can instruct choreography to the same level of full-body detail as can be instructed outside of VR (i.e., in naturalistic dance settings like the studio, stage or street), instead of relying on symbolic and/or abstracted dance notation.
(2) The user can follow the instructions on-the-fly, instead of having to first engage in a separate learning or memorization phase.
3.2 Design Principles
We derived design principles to help us satisfy the above requirements. Regarding the first requirement, we hypothesized that we should focus on instructing movements through demonstration. Demonstration is prevalent in dance teaching and even most non-dancers have engaged with mimicking demonstrated movements at least occasionally, e.g., during childhood. From this point of view, it is natural to focus on using the moving body to instruct the moving body, i.e., using animated dancer characters as a core visualization element. With earlier screen-based systems, choreography was typically limited by the user needing to face forward to see the screen. However, dance choreography generally involves moving and facing in multiple directions, which called for placing model dancers in multiple positions, rather than only immediately in front of the user.
To meet the second requirement, the user should be provided with a capability to anticipate/predict the upcoming movements. Executing the movement and timing demonstrated by a model in real-time is impossible; human reaction time is limited and the body has considerable inertia, so movements need to be planned and initiated ahead of time.
These design principles quite naturally lead to the core WAVE design idea of multiple dancers performing at different time offsets, which provides access to full-fidelity demonstrations of complex full-body movements with enough time for users to anticipate and then execute those movements at the target times.
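The time-offset idea can be sketched in a few lines of Python. This is a minimal illustration rather than the prototype's implementation: the function names, the toy `hand_height` choreography, and the 0.5-second row offset are all hypothetical values chosen for the example.

```python
from typing import List


def dancer_sample_times(t_user: float, rows_ahead: List[int], row_offset_s: float) -> List[float]:
    """Choreography time displayed by each model dancer.

    A dancer that is k rows "ahead" of the user shows the movement that the
    user should perform k * row_offset_s seconds from now (hypothetical layout).
    """
    return [t_user + k * row_offset_s for k in rows_ahead]


def hand_height(t: float) -> float:
    """Toy choreography: the hand rises from 0 m to 1 m between t = 1 s and t = 2 s."""
    return min(1.0, max(0.0, t - 1.0))


# At the user's time t = 1 s, dancers 0, 1 and 2 rows ahead sample the
# choreography at progressively later times.
times = dancer_sample_times(t_user=1.0, rows_ahead=[0, 1, 2], row_offset_s=0.5)
heights = [hand_height(t) for t in times]
print(times)    # [1.0, 1.5, 2.0]
print(heights)  # [0.0, 0.5, 1.0]
```

In this toy case the dancer two rows ahead has already raised the hand fully while the user's own target is still at rest, which is the visual cue that lets the user plan the movement before it is due.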
3.3 Dance Style and Content
Our study used an 84-second contemporary dance choreography, designed for beginners. The choreography was designed for us by a professional contemporary dance teacher with over 20 years of experience teaching students of different levels and creating choreography for them. We chose contemporary dance as our focus, as it is relatively underexplored in dance games, compared to styles like hip-hop or other forms of street dance.
3.4 Formations of Dancers
We considered the formation using straight parallel lines most promising for two main reasons. First, with straight parallel lines to the left and right, dancers could see the upcoming moves, even when turned sideways. Second, with the formations using curved lines, some dancers noticed themselves accidentally following movements too early, possibly because the far-future dancers are more directly visible.
Note that our choreographies have the user mostly facing forward and only occasionally turning around and sideways. We do not expect our chosen formation of virtual dancers to be effective for choreography in which the dancer turns to face the back; future work will need to address this limitation, perhaps having a wave coming towards the user from each direction. However, even our present design is more flexible than traditional dance visualizations requiring the user to face a screen.
3.5 VR Technology
We targeted our system for the Oculus Quest 2 standalone headset, both because it is prevalent on the VR market [
24] and because it supports accurate positional tracking of the user’s head and hands. The Quest 2 does not support tracking the user’s feet, but we deemed this an acceptable limitation, as the platform nevertheless has multiple dance and rhythm games, and our test choreography also largely focused on upper body movements.
4 Evaluation
We conducted a quantitative evaluation (
N=36) of our WAVE prototype, comparing against a baseline visualization with a single model dancer showing the movements in real-time. The two compared visualizations are shown in Figure
3.
4.1 Study design
We used a within-subjects design with two experimental conditions (WAVE & baseline), with the visualization type as the single categorical independent variable. Each participant danced the same 84-second choreography in both conditions. The order of experimental conditions was counterbalanced to mitigate the inevitable order effect caused by the participants remembering at least parts of the choreography.
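One simple way to counterbalance two conditions is to alternate the order by participant index. The sketch below assumes this alternating scheme for illustration only; the paper does not state how orders were assigned, and `condition_order` is a hypothetical helper.

```python
from typing import List


def condition_order(participant_index: int) -> List[str]:
    """Alternate which condition comes first (assumed scheme, not from the paper)."""
    conditions = ["WAVE", "baseline"]
    return conditions if participant_index % 2 == 0 else conditions[::-1]


orders = [condition_order(i) for i in range(36)]
wave_first = sum(order[0] == "WAVE" for order in orders)
print(wave_first, len(orders) - wave_first)  # 18 18
```

With an even number of participants, each order is used equally often, so any benefit from having already seen the choreography is spread evenly across the two visualizations.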
4.2 Hypotheses
We tested two hypotheses about the suitability of the proposed WAVE visualization technique for instructing dance using VR:
H1: WAVE allows players to perform choreography more accurately than the baseline. As discussed above, following choreographed movements requires the user to be able to anticipate upcoming movements, which WAVE is designed to facilitate.
H2: WAVE elicits higher subjective assessment of being able to perform the choreography correctly.
4.3 Sample Size
Since our hypotheses are directional, we used single-tailed tests. A priori power analysis using G*Power 3 [13] was used to determine the total sample size necessary. For single-tailed paired-samples $t$-tests, a sample of 27 participants is required to detect a medium effect size (Cohen's $d_z = 0.50$) with type I error rate 0.05 and 80% power.
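For orientation, the relationship that such a power analysis evaluates can be written out as follows. This is the standard noncentral-$t$ formulation for a one-tailed paired-samples test, not a description of G*Power's internals. Here $F_{t'}(\cdot \mid \nu, \delta)$ denotes the cumulative distribution function of the noncentral $t$ distribution with $\nu$ degrees of freedom and noncentrality $\delta$, and the numeric values are approximate.

```latex
\begin{align*}
\delta &= d_z \sqrt{n}, \qquad \nu = n - 1, \\
\text{power} &= 1 - F_{t'}\!\left( t_{1-\alpha,\,\nu} \,\middle|\, \nu, \delta \right), \\
n = 27:\quad \delta &= 0.5\sqrt{27} \approx 2.60, \quad t_{0.95,\,26} \approx 1.71, \quad \text{power} \approx 0.81 \geq 0.80 .
\end{align*}
```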
4.4 Participants
36 adult volunteers were recruited among the students and staff of Aalto University, using social media and by having a testing stand on campus. 17 participants were men, 18 women, and 1 preferred not to specify their gender. Mean participant age was 26 (SD = 5.5, min 20, max 43). The participants were somewhat experienced with VR (29 had tried VR before and 5 owned a VR device of their own). The participants were required to be comfortable with light exercise and to believe that they had sufficient vision for using the headset. The choreography was designed for dancers without any movement disabilities. Only a few participants had dance experience (19 had no experience, 11 had less than 5 years of experience, and 6 had 5 or more years of experience) or experience playing dance games (20 had 0 hours of experience, 9 had less than 15 hours of experience, and 7 had 15 or more hours of experience).
4.5 Procedure
After the video instructions, the participants put on the VR headset. During the experiment, the facilitator watched the user’s view on the laptop, allowing the facilitator to help the user get into position, if needed.
The VR software prompted the participant to input an ID provided by the facilitator (this ID was not input by the facilitator to avoid having to switch the headset between persons, for hygienic reasons). The participant was then asked to calibrate their height by standing straight and clicking on a virtual button; the height was used to scale the virtual dancers to make the visualizations more appropriate for each participant’s body. The system displayed a text instruction to click a “Done” virtual button after completing the calibration.
The participant then danced in both experimental conditions. At the start of each condition, the system instructed the participant to move to a marked position. Once the user was in the correct position, the system prompted the user to click a virtual button to start the choreography. After performing the choreography, the participant filled in the per-condition questionnaire.
After completing both experimental conditions, the participant removed the VR headset and filled in the final questionnaire.
4.6 Data Collection
The following data was collected:
• Demographics: age, gender, VR experience (has used before? owns a device?), dance experience (years of practice?), experience with dance games (total estimated hours played? which games?).
• During dancing: the rotation and translation of the player’s head and hands for each game frame were tracked using the VR headset and hand trackers. This data was collected to allow for quantitative comparison between the player’s movements and the desired choreography (see Section 4.7).
• At the end of each experimental condition: Users were instructed to indicate how they felt about their performance using two sliders: “I felt I was able to perform the choreography correctly” and “I felt I was able to time my movements correctly”. The sliders used a range from 0% to 100% and the order of the two items was randomized for each participant.
• Final questionnaire: The participants were asked which of the two game versions was their favourite and to give justification for their choice. They were also asked for any additional comments or feedback.
Our primary interest in this study was to test whether anticipatory visualizations support users in accurately following the model choreography. While building the prototype, we observed that slow movements are relatively easy to follow, even without extra visual aids. The first part of the choreography used in this experiment only included slow movements, which are less appropriate for testing our hypothesis. Further, first-time users may need time to get used to the visualization and position themselves. For these reasons, we excluded the very slow start of the choreography from our analyses. After this exclusion, 47 seconds of data remained for each participant.
4.7 Methods: How to Measure Movement Accuracy?
Our goal was for users’ movements to accurately reflect the provided choreography, so we considered high error between the target movement and the user’s actual movement as indicating low accuracy. We measured error in two different ways:
• Position-based movement error, the mean Euclidean distance between the tracked head and hand positions and their choreographed target positions, measured every frame. In the WAVE condition, the user’s goal is to move as the last dancer of the middle line, as shown in Figure 1 (communicated to the user as described in Section 4.5). Thus, the target timing corresponds to the dancers on the user’s left and right. In the baseline condition, the target timing corresponds to that of the single model dancer.
• Direction-based movement error, the mean cosine distance between the tracked and choreographed body-part velocities, measured every frame.
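The two error measures above can be sketched for a single tracked body part as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: it assumes that the direction-based measure compares per-frame displacement vectors, and it skips frames in which either the user or the target is stationary, since direction is undefined there. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vec3 = Sequence[float]


def position_error(tracked: List[Vec3], target: List[Vec3]) -> float:
    """Mean Euclidean distance between per-frame tracked and target positions."""
    return sum(math.dist(p, q) for p, q in zip(tracked, target)) / len(tracked)


def cosine_distance(u: Vec3, v: Vec3) -> float:
    """1 - cos(angle between u and v): 0 for same direction, 2 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.hypot(*u) * math.hypot(*v))


def direction_error(tracked: List[Vec3], target: List[Vec3]) -> float:
    """Mean cosine distance between per-frame displacement vectors (assumed definition)."""
    def deltas(frames: List[Vec3]) -> List[List[float]]:
        return [[b - a for a, b in zip(p, q)] for p, q in zip(frames, frames[1:])]

    # Assumption: frames with a zero-length displacement are skipped.
    pairs = [(u, v) for u, v in zip(deltas(tracked), deltas(target))
             if math.hypot(*u) > 0 and math.hypot(*v) > 0]
    if not pairs:
        raise ValueError("no frames with movement in both sequences")
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy data: the target moves along x while the user moves along y.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
tracked = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
pos_err = position_error(tracked, target)
dir_err = direction_error(tracked, target)
```

In the toy data the user moves perpendicular to the target, so the direction-based error is 1 regardless of how far apart the positions are, which illustrates why the two measures can respond differently to the same performance.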
4.8 Results
Position-based Movement Accuracy. A paired-samples t-test indicated that WAVE (M = 0.47, SD = 0.08) resulted in statistically significantly lower error than the baseline (M = 0.50, SD = 0.08); t = −2.18. The effect size, as measured by Cohen’s d, was d = 0.36, indicating a small effect. Boxplots of the data are shown in Figure 4A.
Direction-based Movement Accuracy. A paired-samples t-test indicated a statistically significant difference in direction-based movement error, with WAVE (M = 0.40, SD = 0.04) resulting in lower error than the baseline (M = 0.43, SD = 0.04); t = −4.87. The effect size, as measured by Cohen’s d, was d = 0.81, indicating a large effect. Boxplots are shown in Figure 4B.
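For readers who want to relate the reported statistics to each other, the sketch below computes a paired-samples t statistic and the paired effect size from per-participant scores, and converts a t value to an effect size. It assumes the paired variant of Cohen's d (mean difference divided by the standard deviation of the differences), which is an assumption on our part; the reported values are numerically consistent with it for N = 36. The function names and the toy data are hypothetical.

```python
import math
import statistics
from typing import List, Tuple


def paired_t_and_dz(a: List[float], b: List[float]) -> Tuple[float, float]:
    """Return (t, d_z) for paired samples a and b, using differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD (n - 1 denominator)
    t_value = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t_value, mean_diff / sd_diff


def dz_from_t(t_value: float, n_pairs: int) -> float:
    """Magnitude of the paired effect size implied by a t statistic."""
    return abs(t_value) / math.sqrt(n_pairs)


# Toy data (not from the study).
t_toy, dz_toy = paired_t_and_dz([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 5.0, 5.0])

print(round(dz_from_t(-2.18, 36), 2))  # 0.36
print(round(dz_from_t(-4.87, 36), 2))  # 0.81
```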
Subjective Performance. We used two virtual sliders at the end of each experimental condition to collect data about the participants’ subjective assessment of movement and timing accuracy. Paired-samples t-tests indicated no statistically significant differences between WAVE (Movement: M = 47.39, SD = 22.87; Timing: M = 48.42, SD = 23.03) and the baseline (Movement: M = 46.05, SD = 21.76; Timing: M = 48.05, SD = 22.65); Movement: t = 0.40; Timing: t = 0.10.
Preferred Visualization. In the final questionnaire, participants were asked which approach they preferred. 20 participants preferred the WAVE approach while 16 preferred the baseline. We observed a clear order effect: 78% of the users preferred the approach they tested later.
Additional analyses. The effect of WAVE on anticipating upcoming movements is visualized in Figure
5. The figure shows how the direction-based movement error changes when the choreography is shifted in time. With the baseline condition, error is minimized with a shift of 0.5 seconds, indicating that the users follow the choreography 0.5 seconds late, on average. With WAVE, users perform slightly ahead of the target time, on average.
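The time-shift analysis can be illustrated with a small sketch. It is a simplified stand-in rather than the study's analysis: it uses a one-dimensional signal and mean absolute difference in place of the direction-based error, and the function name and toy data are hypothetical.

```python
from typing import Dict, List


def error_by_shift(user: List[float], target: List[float], max_shift: int) -> Dict[int, float]:
    """Mean absolute error between user[i] and target[i - shift] for each shift.

    A positive shift compares the user with earlier choreography frames,
    so a minimum at a positive shift means the user is lagging behind.
    """
    errors = {}
    for shift in range(-max_shift, max_shift + 1):
        pairs = [(user[i], target[i - shift])
                 for i in range(len(user))
                 if 0 <= i - shift < len(target)]
        errors[shift] = sum(abs(u - c) for u, c in pairs) / len(pairs)
    return errors


target = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
user = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]   # same movement, two frames late
errors = error_by_shift(user, target, max_shift=3)
best_shift = min(errors, key=errors.get)
print(best_shift)  # 2
```

Dividing the best shift by the frame rate gives the lag in seconds; a negative best shift would correspond to a user who performs ahead of the target time.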
5 Discussion
5.1 Summary of Results
Our results suggest that WAVE provides a potentially useful visualization approach for VR dance designers. Supporting H1, both the position-based and direction-based movement error analyses indicate that users can match the choreography better when using WAVE than when using the baseline visualization (Figure 4). The effect is small for position-based movement error, but large for direction-based movement error. The majority of participants (20) also preferred WAVE over the baseline. The subjective performance ratings are inconclusive, however, providing no support for H2. This should be investigated in future work, although it may be that the subjective data is simply noisier than the objective movement-based measures.
5.2 Dancing Ahead of Time
Fig.
5 clearly shows that users are late in following choreography with the baseline visualization, as expected. More surprisingly, with WAVE, the users perform the choreography slightly ahead of time.
We hypothesize two explanations for this. First, in both the user study and the initial testing of different dancer configurations, we noticed that users occasionally tried to follow the “future” dancers instead of the dancers closest to them, which affects Fig.
5 to some degree. We hypothesize that this is an artefact of the user study focusing on first-time use; in our own experience, one may at first instinctively copy the “future” dancers when they make larger and faster movements that steal one’s attention.
Second, it may be that at least some users synchronize their movements with the dancer directly in front of them, instead of the dancers to the left and right, which only become the focus of attention when the choreography requires one to turn sideways. We did not explicitly ask our participants to synchronize with the dancers to the left and right, or to add a small delay in relation to the dancer in front of them. In the tested WAVE version, the correct delay would be 0.7 seconds.
Here, it should be noted that different choreographies and movements might require different formations of virtual dancers, e.g., in all directions around the user.
Making users feel competent is important to facilitate enjoyment and intrinsic motivation [
6,
31,
39,
45]. In addition to providing encouraging feedback, another way to support competence could be by manipulating the user’s perception of their own movements so that they appear more capable, e.g., through exaggerated jump height and flexibility [
15,
17,
28]. In our system, the user does not have a visual avatar except for small indicators of their current hand positions. In future work, an avatar could be visible in a mirror, which would reflect the real-life experience of many dance studios.
5.3 Wider Applicability
Presently, WAVE is designed for a single user. However, we could imagine applying WAVE in a setting like social VR, allowing dancers to emit their movements as waves that other users can try to follow. This might also mitigate the latency problems inherent in social VR dancing, for example, by matching the wave propagation time between two users to one musical bar, so that even though the “follower” is delayed with respect to the “leader”, the movements of both would feel right with the music.
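The bar-matching idea reduces to simple arithmetic, sketched below. The tempo, time signature, and network latency are hypothetical example values, and the function names are ours; this illustrates the proposal and is not a tested design.

```python
def bar_duration_s(tempo_bpm: float, beats_per_bar: int = 4) -> float:
    """Duration of one musical bar in seconds."""
    return beats_per_bar * 60.0 / tempo_bpm


def added_delay_s(network_latency_s: float, tempo_bpm: float, beats_per_bar: int = 4) -> float:
    """Extra buffering so the leader-to-follower delay equals exactly one bar."""
    bar = bar_duration_s(tempo_bpm, beats_per_bar)
    if network_latency_s > bar:
        raise ValueError("latency exceeds one bar; a longer wave would be needed")
    return bar - network_latency_s


# Hypothetical example: 120 BPM in 4/4 with 150 ms of network latency.
bar = bar_duration_s(120.0)
extra = added_delay_s(0.15, 120.0)
print(bar)  # 2.0
```

Because the follower then sees the leader's movement exactly one bar late, the movement still lands on the same beat positions within the music for both users.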
5.4 Methodological Limitations
We acknowledge that our choice of baseline only allows us to conclude that the WAVE visualization helps in timing and performing movements compared to not using any assistive visualizations at all. It does not allow determining whether WAVE is better than some other visualization technique.
We also tested WAVE with only one choreography, in one specific style of dance. In our own opinion, WAVE works best for relatively slow and continuous movements, whereas the fastest parts of our choreography feel less easy to follow. Hence, it may be that WAVE does not work for some other dance styles, though we hypothesize that careful timing of the wave propagation may support faster movements and should be explored in future work.
6 Conclusion
We have proposed and evaluated WAVE, a new VR movement visualization technique aimed at solving the on-the-fly dance instruction problem. We build on a metaphor of the user being part of a crowd making a wave in a sports event—we use multiple model dancers with different time offsets, allowing the player to both mimic the movements of a model dancer close to them and anticipate future movements through seeing other dancers perform those movements ahead of time. To minimize visual occlusion and allow the use of peripheral vision, we render multiple lines of dancers at different locations.
Our study comparing WAVE against a baseline (
N=36) provided evidence that WAVE helps users anticipate upcoming movements and perform choreography more accurately, particularly in terms of more closely matching the velocities of the head and hands as choreographed (e.g., direction-based movement error in Section
4.7). In future work, it should also be possible to extend WAVE to multi-user social VR dancing, e.g., by allowing dancers to emit their own movements as waves for other dancers to follow.