1 Introduction
Over the last three decades, devices that deliver immersive, digital experiences like
Virtual Reality (VR) and
Augmented Reality (AR) have shrunk from bulky hardware [
54,
76] to today’s consumer-friendly devices (e.g., Oculus Quest 2, Microsoft HoloLens 2). As a result, it has become easier to provide immersive experiences in a variety of professional [
15,
41] or social settings [
184,
321]. In the past, many of these experiences were created around specific manifestations of the Reality-Virtuality Continuum [
199], meaning they were limited to concrete technology classes. Examples include training in VR [
80,
100,
174], enhancing the real world with AR [
2,
81,
175,
250], and, conversely, enhancing virtual environments with parts of the real world using
Augmented Virtuality (AV) [
36,
192,
211]. However, due to recent technological advancements, experiences are not limited to concrete manifestations anymore. Users can interact across different manifestations (e.g., a novice user in AR on-site gets support from a remote expert in VR [
41]), or they can transition along the continuum and thereby experience different manifestations (e.g., a book that allows users to transition between reading and experiencing its content [
29]). Systems that power such experiences are called cross-reality systems [
273] as they involve different or changing actualities—meaning the manifestations that users experience can differ (e.g., one AR and one VR user) or users experience that their actuality is changing over time (e.g., an AR user is transitioning to VR).
Today, we see a trend toward cross-reality systems and research. While these systems provide great opportunities for novel experiences, they also introduce tremendous complexity. This complexity stems from the many users and their actualities, the possibility of bystanders, the different physical objects involved (e.g., keyboards in VR [
266]), and the surrounding environment that may be involved in the experience (e.g., walls in VR [
180] and physical forces from in-car VR [
118]). This highlights the uniqueness and complexity of cross-reality systems, making them hard to describe and compare. With clear terminology, researchers could compare existing cross-reality systems more easily, while design and implementation rules could guide developers and practitioners through their development process. This would allow a wider range of groups to contribute to the emerging field of cross-reality systems and foster a shared understanding among all involved groups and communities. However, a common language is not yet well established. Thus, it remains challenging to formalize, interpret, and compare cross-reality systems.
How can we align the language across communities and establish a solid foundation for future work that benefits both researchers and practitioners?
Motivated by this overarching question, we derive three sub-questions, which we answer in this work. First, we investigate: How to define the terminologies in the field of cross-reality systems? (RQ1), allowing for a common language. Second, we pose the question: Which design and implementation aspects of cross-reality systems form fundamental principles? (RQ2), allowing us to categorize current and future systems. Lastly, we go beyond past and present by targeting the challenges ahead. Here, we ask: What are the future trends of cross-reality systems? (RQ3), allowing us to support designers and practitioners in developing the next generation of cross-reality systems.
To answer our research questions, we conducted a scoping literature review that investigates cross-reality systems. We identified 306 papers as relevant and analyzed them to provide insight into the current state of cross-reality research. First, we gathered terms and concepts provided by previous research and present a definition of cross-reality systems that distinguishes between three different types (multiple types can apply to the same system): Type 1 (Transitional): subjects transitioning on the continuum experiencing a changing actuality; Type 2 (Substitutional): subjects interacting with objects that are repurposed for the subject’s actuality; and Type 3 (Multi-user): multiple subjects experiencing different actualities. Thereafter, we build up our literature corpus and analyze the introduced systems, following our three types of cross-reality systems. Our analysis reveals these systems are increasingly complex, often using implicit transitions that are hard to comprehend. Next, we present nine guiding principles extracted from previous findings that can guide researchers and developers while building cross-reality systems. Each principle addresses one of the three types of cross-reality systems and provides supportive studies. We conclude our work with research challenges and opportunities for future investigations of cross-reality systems.
Contribution. In this work, we propose definitions for cross-reality systems, categorizing them into three types. Furthermore, we present the results from an analysis of 306 cross-reality systems proposed in previous work, including the addressed research topics, involved actualities, and transitions. We postulate nine guiding principles that formalize the findings from previous studies to help researchers, developers, and practitioners to build better systems. Finally, we conclude with future research challenges and opportunities.
2 Cross-reality Systems
Immersive technologies such as AR and VR allow users to engage in digitally altered or synthesized realities. However, these technologies can isolate their users (e.g.,
head-mounted display (HMD) users) [
258] and exclude bystanders (e.g., non-HMD users) [
14,
105,
106]. To tackle these issues, a new research direction has formed—cross-reality systems [
273]—that aims to enable interaction across different degrees of virtuality along the Reality-Virtuality Continuum [
199].
In this work, we present a systematic review of cross-reality systems proposed in previous literature. However, as this research direction has formed recently, a fundamental terminology is not yet established. Thus, we first introduce existing terminology required to understand cross-reality systems (cf.
Mixed Reality (MR) [
278]). Thereafter, we contribute new terms to the existing terminology that allow the classification of these systems and their interactions in a more structured way. Similar to other research [
12,
278], we believe structuring the young field of cross-reality systems and introducing common terms helps future researchers, designers, and practitioners entering the field to compare cross-reality system research and develop novel experiences more easily.
2.1 The Reality-Virtuality Continuum
At the time of writing, almost 30 years have passed since Milgram and Kishino introduced the Reality-Virtuality Continuum in 1994 [
199]. Up to this point, the work has had a profound impact, coining terms that are frequently used in the field. According to
Google Scholar, the work has reached over 8,000 citations, which highlights its impact. During the last 3 years working on this survey, the article’s citations increased by over 3,000, demonstrating the rapid growth of interest in the wide range of related research topics and applications that can be classified using this continuum.
The Reality-Virtuality Continuum that spans between
reality and
virtuality allows the classification of different degrees of virtuality. On this continuum,
reality refers to the real world, in which every entity is real and subject to the laws of physics. On the other end,
virtuality refers to virtual environments, in which every entity is digital and generated by a computer. Certain degrees of virtuality can be referred to as manifestations [
199,
200] such as AR and AV. These manifestations allow one to refer to technology classes and the corresponding form of the generated experience that have been frequently researched in previous work and implemented in consumer devices. Each point on this continuum between
reality and
virtuality refers to a degree of virtuality, which incorporates a different amount of virtuality depending on the position on the continuum. Milgram and Kishino refer to all degrees of virtuality that are not the two extremes as MR.
2.2 Manifestations of the Continuum
Along the continuum, there are different areas that represent concrete technology classes, which we refer to as manifestations (e.g., AR [
200]). Theoretically, infinite manifestations could exist; however, only a few are distinctive enough to be frequently used in literature. In the following, we discuss these well-known manifestations. However, it should be noted that the Reality-Virtuality Continuum does not inherently define concrete locations or ranges to describe these manifestations. Instead, it specifies where they are positioned relative to one another [
199,
200].
Augmented reality (AR). AR alters reality by overlaying digital information. Superimposing information empowers users to interact with virtual objects within the real world [
200]. Thus, AR is the manifestation closest to
reality, as it results in users perceiving the physical environment to a stronger degree than they do virtual aspects. According to Azuma et al., AR has three characteristics that need to be fulfilled: AR (1) combines real and virtual elements, (2) is interactive in real time, and (3) is registered in 3D [
20]. A persistent challenge of AR systems is using and interacting with physical objects [
152,
343], which is of particular interest for cross-reality research.
Augmented virtuality (AV). In AV, users are immersed in a virtual environment; however, parts of reality are incorporated into the digital experience [
192,
200]. In comparison to AR, AV relates more to the virtual environment, while AR relates more to the real environment. With the support of see-through modes in current VR devices, AV has recently gained popularity and is, for example, used to configure the play area for the latest VR devices.
Virtual reality (VR). In VR, users experience an entirely virtual environment with as little interference from the real-world environment as possible. This digital world is not directly bound to the laws of physics and, therefore, can exceed these boundaries [
199]. Although one could argue that VR represents virtuality on the continuum, current VR experiences do not completely immerse the user into a virtual environment and, thus, do not represent
virtuality. For example, users may bump into walls or get motion sickness if the real-world and VR experiences do not align. Hence, we understand VR as a part of MR. VR can be seen as a mode of reality that exists together with the physical reality to provide its users new forms of experiences [
333].
Mixed reality (MR). MR is not a term describing a particular manifestation on the continuum; instead, it represents all possible manifestations on the continuum that involve both
reality and
virtuality to some extent. In other words, every experience that lies between
reality and
virtuality is considered to be MR [
198,
200]. Three years ago, Speicher et al. [
278] published a paper addressing the following question:
“What is Mixed Reality?” They conducted interviews with experts and analyzed 68 related papers, finding that different definitions of MR exist. Hence, in our article, we use MR as an umbrella term that represents all manifestations of the continuum, such as AR, AV, and VR. Furthermore, four experts interviewed by Speicher et al. stated that “five or ten years from now, we will not distinguish between AR, MR, and VR anymore.” In other words, there could be one merged category of devices that supports different manifestations. In the future, this category of devices will form the ultimate
cross-reality systems.
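The relative ordering of these manifestations, and the use of MR as an umbrella term in this article, can be illustrated with a minimal Python sketch. The names `Manifestation` and `is_mixed_reality` are hypothetical, and the integer values are an assumption that only encodes relative position, since the continuum itself defines no concrete locations.

```python
from enum import IntEnum


class Manifestation(IntEnum):
    """Hypothetical encoding: values express relative order only, not positions."""

    RW = 0  # reality (the real world)
    AR = 1  # augmented reality, the manifestation closest to reality
    AV = 2  # augmented virtuality
    VR = 3  # virtual reality as delivered by current devices


def is_mixed_reality(m: Manifestation) -> bool:
    """MR as the umbrella term used in this article: AR, AV, and VR."""
    return m is not Manifestation.RW


# Only the ordering is meaningful: AR is closer to reality than AV, and AV than VR.
assert Manifestation.RW < Manifestation.AR < Manifestation.AV < Manifestation.VR
```

An ordered enumeration is used here instead of numeric coordinates so that comparisons such as "closer to reality" are possible without implying that a manifestation occupies a fixed point or range.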
2.3 Actualities
Some cross-reality systems allow for seamless transitions on the continuum, for example, to allow users to transition from the real world into VR [
137,
258,
284] or to integrate parts of reality into their VR experience [
59,
111,
192]. Here, the existing term
manifestation is too inflexible to reflect such experiences and, more importantly, does not allow describing changes in these experiences over time. Moreover, reality and virtuality are used to describe the extremes, and thus, their use to describe such experiences could be ambiguous (e.g., the user’s reality). Thus, we argue for the term
actuality to depict the currently experienced reality of a user. The term
actuality goes back to the concept of
potentiality and actuality introduced by Aristotle [
260]. In short, Aristotle stated that potentiality is a not-yet-realized possibility of all possibilities that can happen, and actuality is the realization of a specific potentiality—the actual thing that became real. The English word
actuality is derived from the Latin word
actualitas, which translates to
in existence or
currently happening. Thus, an actuality describes the
current reality—the things that currently seem to be facts for a user. In the context of reality and virtuality and all their combinations, we can use the word
actuality to describe the actual experience of a user. For example, we can consider two users: one using VR and one just standing nearby. The actuality for the VR user would be a virtual, digital experience, while for the bystander, the actuality is just reality. Here, two actualities exist, each described by one point on the Reality-Virtuality Continuum. Moreover, when a user transitions, for example, from reality to VR, we can say that the actuality of this user changes over time. We use
actuality as the universal term to refer to the individual experiences that users of cross-reality systems are having at a specific point in time. Our definition aligns with Eissele et al., who suggest using
actuality to describe virtual experiences [
68].
2.4 Subjects and Objects
Cross-reality systems involve different entities: subjects and objects. The difference between both entities is that subjects have ways of perceiving their environment, while objects have no perception (e.g., a user, bystander, or animal would be a subject, while a table, keyboard, or vacuum cleaner would be an object). Hence, subjects can experience their environment; an actuality that describes their current experience exists. However, besides this difference, subjects and objects also have attributes in common. Primarily, both can exist physically in the real environment, digitally in the virtual environment, or simultaneously in both environments. In previous work, researchers focused mainly on the role of subjects in cross-reality systems. Nevertheless, we believe that objects also play an important role (cf. Section
2.5).
2.5 Definition of Cross-reality Systems
Simeone et al. categorized cross-reality systems into two types that either involve (1) a smooth transition between systems using different degrees of virtuality or (2) collaboration between users using different systems with different degrees of virtuality [
273]. Following this definition, the role that objects can play in cross-reality systems is somewhat neglected, as the definition focuses on the perspectives of the subjects. Nevertheless, the interaction between subjects and objects should be considered in cross-reality systems as well, especially if the object is not intended purely for the subject’s actuality but instead was repurposed and integrated into the user’s experience (substitutional reality). Under this extended view, a haptic prop specifically designed for a VR experience should not be considered a cross-reality system; however, if a real-world object such as a vacuum cleaner is repurposed for a VR experience, we consider it a cross-reality system (e.g., Wang et al. [
315]). Therefore, we distinguish three different types of cross-reality systems, which can be defined through the following definition.
3 Review Method
This scoping review [
233] presents the first compilation of a literature corpus that analyzes cross-reality systems and interactions. While the first publications describing cross-reality systems appeared recently (e.g., for the design space of transitional interfaces [
313]), they focus on specific types of cross-reality systems and do not provide a holistic overview of the topic. Following our definition of cross-reality systems, we considered a broader range of literature that focused on research involving:
(i)
A
subject changes its actuality (e.g., a user transitions into VR [
29,
30]):
Type 1 (Transitional).
(ii)
There is an interaction between at least one
subject and at least one
object that is repurposed for the current
actuality (e.g., a physical keyboard brought into VR for typing [
192]):
Type 2 (Substitutional).
(iii)
There is an interaction between at least one
subject and at least one other
subject, experiencing different actualities each (e.g., users collaborate using AR and VR [
41]):
Type 3 (Multi-user).
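The three inclusion criteria above can be sketched as a small classifier. This is a minimal illustration, not the procedure used in the review: the names `Subject`, `System`, and `cross_reality_types` are hypothetical, and the sketch assumes that all listed subjects interact with one another, that every listed object is a repurposed object a subject interacts with, and that the actuality lists of different subjects refer to the same time steps.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Subject:
    name: str
    # Actualities experienced at consecutive, shared time steps (assumption),
    # e.g. ["RW", "VR"] for a user who transitions into VR.
    actualities: list[str]


@dataclass
class System:
    subjects: list[Subject]
    # Objects repurposed for a subject's actuality (assumed to be interacted with).
    repurposed_objects: list[str] = field(default_factory=list)


def cross_reality_types(system: System) -> set[int]:
    """Return the set of types (1, 2, 3) that apply; several can apply at once."""
    types: set[int] = set()
    # Type 1 (Transitional): some subject's actuality changes over time.
    if any(len(set(s.actualities)) > 1 for s in system.subjects):
        types.add(1)
    # Type 2 (Substitutional): a subject interacts with a repurposed object.
    if system.subjects and system.repurposed_objects:
        types.add(2)
    # Type 3 (Multi-user): two subjects experience different actualities
    # at the same time step.
    for a, b in combinations(system.subjects, 2):
        if any(x != y for x, y in zip(a.actualities, b.actualities)):
            types.add(3)
            break
    return types


transition = System([Subject("user", ["RW", "VR"])])
vacuum = System([Subject("user", ["VR"])], ["vacuum cleaner"])
support = System([Subject("novice", ["AR"]), Subject("expert", ["VR"])])
print(cross_reality_types(transition), cross_reality_types(vacuum), cross_reality_types(support))
# prints: {1} {2} {3}
```

Returning a set rather than a single label reflects that multiple types can apply to the same system.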
An initial investigation revealed that a systematic search-term-based literature review (e.g., PRISMA) would not be possible, as terms to describe cross-reality systems are not yet fully established. Furthermore, relevant aspects are often hidden within a research prototype or system, are a smaller part of a broader research agenda, or seem too marginal for the scope of the corresponding publication to be described by the authors. An example would be the paper from Ruvimova et al. in which a user is distracted by the noise of an open office space and, therefore, transitions into VR for an isolated experience [
258]. Here, the developed system was not explicitly described as a cross-reality system; however, it is an intrinsic part of the approach. Hence, to present the most complete literature corpus, we individually screened our initial literature set manually.
For our literature review, we performed the following steps (see Figure
1):
(1)
We started by manually going through the proceedings from 2015 to 2022 of the five leading conferences in which related cross-reality system papers were published (in parentheses: corresponding publication count): ACM CHI (5,131), ACM UIST (767), ACM VRST (627), IEEE VR (1,539), and IEEE ISMAR (373). The corresponding digital libraries account for 8,437 entries for these venues in the given time frame. All authors together checked the title of each paper to identify off-topic research. We considered only full papers, while other types of publications were excluded (e.g., workshop publications, demos, and posters).
(2)
We then individually read the abstracts (and further sections if necessary) of all remaining publications to identify if the publications fit the scope of our literature review (meaning the three inclusion criteria hold; see Figure
1) and gathered them in a spreadsheet similar to Doherty and Doherty [
61]. If the relevance of a publication was not clear to the screening author, it was discussed with all authors and a mutual decision was made. In total, we identified 160 papers that are relevant for this review.
(3)
After that, we looked at all references and all citing papers of the already gathered literature to identify further relevant papers, an approach that others have also applied, e.g., Katsini et al. [
146]. We applied this process recursively, going through the references and citing papers of newly added ones until we could not find any more relevant publications. In this step, we went through 11,465 references and 13,620 citations and found 103 additional referenced papers and 43 additional cited papers (n = 146).
(4)
In total, we found 306 relevant papers describing a cross-reality system, which we further classified to extract their core features and identify common themes.
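The recursive step (3) and the counts reported in steps (1) to (4) can be sketched as follows. This is an illustrative sketch only: the function `snowball`, the toy citation graph, and the relevance predicate are hypothetical stand-ins for the manual screening described above, whereas the numbers in the final checks are taken from the text.

```python
from collections import deque


def snowball(seeds, references, cited_by, is_relevant):
    """Follow references and citing papers until no new relevant paper appears."""
    corpus = set(seeds)
    queue = deque(seeds)
    while queue:
        paper = queue.popleft()
        candidates = list(references.get(paper, [])) + list(cited_by.get(paper, []))
        for candidate in candidates:
            # In the review this decision was made manually by the authors.
            if candidate not in corpus and is_relevant(candidate):
                corpus.add(candidate)
                queue.append(candidate)  # newly added papers are searched as well
    return corpus


# Hypothetical toy graph: "X" is reachable but not relevant.
references = {"A": ["B", "X"], "B": ["C"]}
cited_by = {"A": ["D"], "C": ["E"]}
relevant = {"A", "B", "C", "D", "E"}
toy_corpus = snowball(["A"], references, cited_by, relevant.__contains__)

# Consistency checks on the counts reported in the steps above.
venue_counts = {"ACM CHI": 5131, "ACM UIST": 767, "ACM VRST": 627,
                "IEEE VR": 1539, "IEEE ISMAR": 373}
assert sum(venue_counts.values()) == 8437
seed_papers, from_references, from_citations = 160, 103, 43
assert from_references + from_citations == 146
assert seed_papers + from_references + from_citations == 306
```

A queue is used so that papers found in one round are themselves searched in later rounds, which mirrors the recursive application of the step until no further relevant publications turn up.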
The initial literature corpus was compiled using Google Scholar as the main search engine for citing papers while also relying heavily on the IEEE DL and ACM DL. At this point, it is worth mentioning that this strategy does not guarantee one will identify all relevant papers. We screened a tremendous number of publications, and while our literature corpus grew substantially in size, there is a chance that we missed some relevant publications due to human error. However, strict database queries suffer from similar issues, especially when the terminology of the research field is unclear or not yet fully established. Therefore, we argue that our approach was able to identify more relevant research publications than an automatic approach.
The final publication corpus (n = 306) served as the basis for understanding the interplay among different subjects and their actualities and corresponding objects that manifest across the Reality-Virtuality Continuum. For the publication corpus, we went through all publications and identified important features relevant to this survey to obtain a holistic view of the review corpus. Here, we identified features like the research topic and keywords that briefly describe the given research and involved scenarios as well as the purpose of the scenario (e.g., collaboration, leisure activity). Furthermore, we categorized the scenario together with involved subjects and objects. Therefore, we identified and quantified the involved entities (e.g., users, objects/artifacts) and how they were integrated into their scenarios (e.g., real-world objects brought into VR). Further, we extracted the form-factors (i.e., type of used devices) and modalities (i.e., visual, auditive, or haptic). We then identified how different entities relate to one another across the Reality-Virtuality Continuum and how they manifest on the continuum (e.g., VR, AV, AR). A complete version of our literature corpus, including a classification concerning different features, can be found as supplementary material.
Descriptive Summary of Literature Corpus. Over the last decade, we see a clear uptick of publications proposing cross-reality systems (see Figure
2(a)), indicating a growing interest in the research community. While the publication count before 2015 may be inaccurate because we did not screen conference proceedings before that year, a clear trend between 2015 and 2022 remains recognizable. Nevertheless, in 2021, a dip in publications is observable, which is likely an artifact of the global Covid-19 pandemic, as in the year after, the publication count recovers. Furthermore, besides the identified five leading conferences, we identified the IEEE journal
Transactions on Visualization and Computer Graphics and the ACM SIGGRAPH conference as highly relevant venues (see Figure
2(b)). Finally, our corpus revealed that a few authors have around 10 publications published on the topic already. Here, Mark Billinghurst is taking the lead with over 20 publications (see Figure
2(c)).
5 Analyzing Changing Actualities in Cross-reality Systems
When using a
Type 1 system, the actuality of a user changes over time due to a transition along the Reality-Virtuality Continuum. However, numerous systems in the literature are not introduced as cross-reality systems, nor are the transitions highlighted in particular because the presented research did not investigate the cross-reality aspects in itself but, for example, topics like user perception [
254] or collision avoidance [
1]. Therefore, we conducted an in-depth analysis of the literature to find
Type 1 systems and corresponding transitions that are not obvious to readers. We identified 118 relevant publications that introduced systems that changed the actualities of their users. Continuing our overview presented in Section
4.1.1, we present our in-depth analysis of these transitions in the following. First, we analyzed the involved manifestations in the described systems (see Section
5.1). Here, we limited ourselves to the distinct manifestations previously introduced: VR, AV, and AR, including transitions involving the
Real World (RW). Thereafter, we identify the cause of these transitions (see Section
5.2). Finally, we conclude with a summary (see Section
5.2.9).
5.1 Transitions between Manifestations
As seen in Table
8, subjects transition along the Reality-Virtuality Continuum from and to various manifestations. Here, the perception of the transition depends on the perspective of a subject, that is, on the subject’s actuality (e.g., a VR user experiencing VR or a bystander experiencing reality). For example, a bystander could walk by a VR user and be shown to the VR user in the virtual environment when close [
192]. The bystander’s actuality does not change as the bystander still perceives the RW while crossing the area around the VR user. However, the VR user sees the bystander in the virtual environment; therefore, the VR user’s actuality changes with a transition from VR to AV. This is because the virtual environment is augmented with elements from the real world (in this case, the bystander) and therefore is no longer purely virtual. In the following, we introduce the different manifestations involved in the transitions that we found in the literature.
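The bystander example can be expressed as a short sketch in which transitions are read off a per-subject timeline of actualities. The function names, the distance values, and the 1.5 m display threshold are hypothetical and chosen only for illustration; they do not come from the cited system.

```python
def vr_user_actuality(bystander_distance_m: float, show_within_m: float = 1.5) -> str:
    """Actuality of the VR user: AV once a nearby bystander is shown, else VR."""
    return "AV" if bystander_distance_m <= show_within_m else "VR"


def transitions(timeline: list[str]) -> list[tuple[str, str]]:
    """Return (from, to) pairs wherever consecutive actualities differ."""
    return [(a, b) for a, b in zip(timeline, timeline[1:]) if a != b]


# A bystander walks toward the VR user (hypothetical distances in meters).
distances = [3.0, 2.0, 1.0]
vr_timeline = [vr_user_actuality(d) for d in distances]
bystander_timeline = ["RW"] * len(distances)  # the bystander keeps perceiving the RW

print(transitions(vr_timeline))         # [('VR', 'AV')]
print(transitions(bystander_timeline))  # []
```

The same physical event therefore yields a transition for one subject and none for the other, which is why transitions are analyzed per subject.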
5.1.1 Transitions to Real World.
We found eight (2.61%) publications that involved a transition to the RW. Here, taking a glimpse at a bystander while being in VR results in a transition from VR to the real world [
40]. This can be useful when immersed VR users want to interact with surrounding persons for a brief moment. When using AR, obstacle detection and accompanying alerts that make users aware of obstacles to avoid collisions form a transition from AR to the RW [
141]. When taking off the VR HMD and thereby transitioning to the RW, users report that they felt, for example, disoriented [
157]. Therefore, gradual exit procedures could help VR users to exit their virtual experience more comfortably and safely. Likewise, one could use metaphors like a door to the real world to exit virtual experiences [
277].
5.1.2 Transitions to Augmented Reality.
We identified 12 (3.95%) publications that investigate switches from the RW to AR. Editing the real world with AR’s help can be seen as a transition from a real environment to AR [
338]. Likewise, overlaying virtual objects onto real ones lets a user transition from RW to AR as soon as the overlays are brought into place [
117]. Also, sharing content with a bystander can be seen as a transition from the RW to AR [
112]. Here, the bystander is the transitioning subject.
5.1.3 Transitions to Augmented Virtuality.
Overall, we found 60 (19.74%) publications that involved transitions to AV. Most of these publications investigate transitions from VR to AV (54, 17.76%). Bringing in real objects like a cup for drinking, a keyboard for typing [
192], or a smartphone [
59] when needed constitutes a transition from VR to AV. Also, integrating approaching bystanders into the virtual world results in a transition from pure VR to AV, whether to create awareness [
305] or to let the VR user actively interact with them [
104]. Further, partially showing the RW while in VR results in a transition from VR to AV [
111]. Transitions from VR to AV can also occur in a non-obvious manner and often rely heavily on the visual sense. For example, consider two users who use redirected walking to meet each other and shake hands while immersed in VR [
201]. As soon as they are redirected toward each other and shake hands, their VR is externally influenced through the handshake, which is part of the real world. In this case, they transition for a brief moment from VR to AV. Additionally, we found five (1.65%) publications that investigated transitions from the RW to AV. Here, a bystander could enter a VR user’s experience and thereby augment the virtual experience with their appearance [
308].
5.1.4 Transitions to Virtual Reality.
In sum, we found 37 (12.17%) publications that involved transitions to VR. We identified 10 (3.29%) publications that investigate transitions from AR to VR. Users could start in AR and then, for example, decide to transition to VR [
254,
256], to exchange information between the two manifestations [
253], or to collaborate [
99]. Further, we identified 20 publications (6.58%) involving a transition from RW to VR. For example, Steinicke et al. introduced an approach for transitioning into VR through a portal metaphor: they presented the user with a portal from the real environment to VR, which the user could enter to reach the virtual environment [
284]. Also, a smooth transition into VR has been shown to help users build awareness of the virtual environment [
303].
5.1.5 Transitions to Multiple Manifestations.
We found eight (2.63%) publications that focused on interfaces for transitions along the whole continuum from the RW to AR, then further to AV, and finally to VR. In these scenarios, users transitioned step by step from the real world to the virtual. Each step involved different objects or actions taken by the user [
255].
5.1.6 Summary.
We investigated 118 publications that introduce transitions on the continuum and identified involved manifestations. We found that most transitions (54) are from VR to AV, followed by transitions from the real world to VR (20). Some transition categories are underrepresented, like transitions from AR to the RW or from AR to AV. Moreover, the presented transitions can be non-obvious at first (e.g., VR users transitioning to AV when they meet and shake hands [
201]).
5.2 Causes of Transitions
Transitions on the Reality-Virtuality Continuum can have different causes. We identified several causes for transitions (see Table
9). In the following, we introduce these causes in greater detail.
5.2.1 Substitution of Physical Object.
We found 26 (8.55%) publications that substituted physical objects with virtual ones. For instance, providing a realistic walking experience and at the same time enhancing VR can be accomplished by constantly scanning the real-world environment and adapting the virtual world accordingly to let the user walk in the automatically generated world [
44]. Here, the user transitions from VR (while the virtual world is not adapted) to AV (once the virtual world is adapted to the surrounding physical environment); in other words, the physical environment is substituted by the virtual environment. Furthermore, real-world objects can be substituted to provide haptic feedback for virtual objects that share similar haptic properties [
117].
5.2.2 Change Actuality.
We found 22 (9.62%) publications that introduce transitions on the continuum that are deliberately caused by the user to access virtual objects or to enter a virtual environment. Such transitions can enhance presence [
284]. For example, when entering a virtual environment, transitioning gradually from the RW to VR makes users feel more present [
137]. This can be accomplished by gradually blending out real-world objects and at the same time blending in the virtual environment. Users may also exit VR, which causes a transition from VR to the real world. Here, Knibbe et al. investigated which factors influence transitions out of the virtual experience [
157]. The results indicated that the virtual experience influences users beyond the point of exit and therefore needs further consideration. To exit virtual experiences, metaphors like portals [
308] or curtains [
161] can be used to indicate the possibility of a transition between VR and the RW. Traversing the continuum can be accomplished through different user actions or by using objects [
255].
5.2.3 Bystander Inclusion.
Including bystanders can also be a cause for transitions. We identified 21 (6.91%) publications that investigate transitions caused by bystanders. For example, a transition from the real world to AV can be caused if the bystander enters the tracking space of a VR user [
305]. Here, the bystander is integrated visually into the virtual environment. A bystander could also cause a transition from the real world to AR when projections are used to give access to the virtual content that an AR user experiences [
112]. Breaking the VR isolation can be done by enabling bystanders to interact with the VR user [
104]. Here, the bystander can actively participate in the VR user’s activity and influence the virtual environment. In this scenario, the VR users transition from VR to AV when interacting physically with the bystander. From the perspective of the bystanders, they can see floor projections in the RW and can use a display to enter the virtual experience, which also can be seen as a transition from the RW to VR. Other ways to include bystanders into virtual experiences utilize audio to allow for communication between VR users and bystanders [
224].
5.2.4 Interaction with Physical Object.
We found that many transitions occur due to interactions with physical objects, with 19 (6.25%) publications falling into this category. Interaction with the real world can cause transitions, for example, from VR to AV [
192]. Users transition when they want to drink or eat something while experiencing VR [
37]. Further, we found that the usage of an external device causes transitions [
59]. Users could check a smartphone for messages [
3] or use a tablet [
125]. To use a smartphone, one could capture it on video in the RW. Then, the smartphone can be cropped out of the video feed and presented to the VR user. This augments the VR experience, making it AV. Similarly, using a physical object such as a keyboard in VR constitutes a cause for a transition [
266]. Here, the VR user is transitioning from VR to AV when using the keyboard.
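The smartphone example above (cropping a physical object out of a video feed and presenting it inside VR) amounts to mask-based compositing. The following sketch assumes that a per-pixel boolean mask marking the object is already available; how that mask is obtained (e.g., by tracking or segmentation) is outside its scope, and the function name `composite` is hypothetical.

```python
def composite(virtual_frame, camera_frame, mask):
    """Overlay camera pixels onto the virtual frame wherever the mask is True.

    All three arguments are row-major 2D lists of equal size. The mask is an
    assumed input that marks the pixels belonging to the physical object.
    """
    if not (len(virtual_frame) == len(camera_frame) == len(mask)):
        raise ValueError("frames and mask must have the same number of rows")
    result = []
    for v_row, c_row, m_row in zip(virtual_frame, camera_frame, mask):
        if not (len(v_row) == len(c_row) == len(m_row)):
            raise ValueError("rows must have the same length")
        # Keep the virtual pixel unless the mask selects the camera pixel.
        result.append([c if m else v for v, c, m in zip(v_row, c_row, m_row)])
    return result


if __name__ == "__main__":
    virtual = [["V", "V", "V"], ["V", "V", "V"]]
    camera = [["a", "b", "c"], ["d", "e", "f"]]
    phone_mask = [[False, True, False], [False, True, True]]
    for row in composite(virtual, camera, phone_mask):
        print(row)
```

With an all-False mask the user remains in VR; once any real-world pixels are shown, the experience corresponds to AV in the sense used in this section.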
5.2.5 Collision Avoidance.
We found 10 (3.27%) publications in which obstacle avoidance caused transitions of users. Providing such safety features can cause transitions along the continuum, like creating awareness of obstacles in the VR user’s proximity [
140,
322]. Modalities other than the visual were also investigated, e.g., auditory feedback, which lets the user transition from VR to AV as the virtual environment is augmented with auditory warnings about real-world objects [
1].
5.2.6 Collaboration.
We found eight (2.61%) publications in which the cause for a transition was the collaboration among users. Often, collaborators transition from AR to VR when creating a collaborative solution [
99,
156,
179]. For instance, they shape a maze in AR and then use the created maze to play a game in VR [
179].
5.2.7 Providing Haptic Feedback.
We found eight (2.63%) publications that introduced transitions when providing haptic feedback. For example, to enhance typing in VR, one can integrate a physical keyboard [
102,
159] or smartphone [
114]. Users also transition when using physical objects around them to mimic the haptics of virtual objects, for example, through haptic retargeting [
232].
5.2.8 Interacting with Virtual Objects.
We identified four (1.32%) publications that introduce transitions that allow for the interaction with virtual objects, for instance, when a real-world environment is scanned and edited in AR [
317]. Further, a transition can be caused when combining a physical environment with a virtual one [
55]. Likewise, when the real environment is occluded, a user could use a virtual copy of it to get a better overview [
23].
5.2.9 Summary.
We investigated 118 publications that introduce transitions on the continuum and identified their corresponding transition causes. We found that the largest group (26 publications) introduced transitions that occurred when physical objects were substituted in virtual experiences, for example, to design virtual environments on the basis of the physical world [
275]. This is followed by 22 publications that introduced transitions that occurred when there was the need to deliberately change the actuality, for example, when leaving a virtual experience [
157,
277]. The third highest cause of transitions was bystander inclusion into the virtual experience, with 21 publications. Here, bystanders were brought into the virtual experience of, for example, a VR user to create awareness of their presence, thereby making the VR experience an AV experience [
305].
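As a quick consistency check, the per-cause publication counts reported in the subsections above can be tallied and ranked. In the sketch below, the counts are copied from the text, whereas the label for the substitution category and the helper name `rank_causes` are our own. Note that the computed shares are relative to the 118 publications with transitions, so they differ from the percentages given in the subsections, which appear to be relative to the full corpus.

```python
# Publication counts per transition cause, copied from the subsections above.
# The label of the first entry is our own wording for the substitution category.
CAUSE_COUNTS = {
    "Substitution of physical objects": 26,
    "Change actuality": 22,
    "Bystander inclusion": 21,
    "Interaction with physical object": 19,
    "Collision avoidance": 10,
    "Collaboration": 8,
    "Providing haptic feedback": 8,
    "Interacting virtual object": 4,
}


def rank_causes(counts):
    """Return the total and a list of (cause, count, share in %) sorted by count."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return total, [(name, n, 100.0 * n / total) for name, n in ranked]


if __name__ == "__main__":
    total, ranked = rank_causes(CAUSE_COUNTS)
    print(f"{total} publications with transitions")
    for name, n, share in ranked:
        print(f"{name}: {n} ({share:.1f}% of transition publications)")
```

The counts sum to the 118 publications stated in this summary, and the three largest categories match the ranking described above.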
7 Research Challenges and Opportunities
Based on our literature review, it is evident that there has been an uptick in research around cross-reality systems (cf. Figure
2). In recent years, we can see a strongly increasing interest in this topic, with larger numbers of actualities involved and a trend toward more dynamic actualities that frequently change over time. Our literature review revealed that it is difficult to identify relevant research, especially
Type 1 (Transitional) cross-reality systems, as occurring transitions on the continuum are often not the focus of the work. Thus, they are not prominently described (see Section
7.1). Further, we found that cross-reality systems can become rather complex due to the different perspectives involved (see Section
7.2). Moreover, we identified that current cross-reality systems partially neglect AR devices (see Section
7.3) and a trend toward AV solutions becomes visible (see Section
7.4). To address the increasing complexity of cross-reality systems, we conclude this section by discussing novel prototyping methods of cross-reality systems as an opportunity to make the field more inclusive and allow for quicker iterations (see Section
7.5).
7.1 Implicit Transitions
Many of the surveyed papers contain transitions on the continuum, meaning they change users’ actuality over time. However, the presented evaluations often did not investigate the transition itself, or investigated it only vaguely; cf. [
91,
183]. Often, authors do not explicitly describe the transition that takes place on the continuum, for example, when the underlying research instead focuses on haptic feedback through the inclusion of real-world objects [
159,
275]. Nevertheless, these transitions can be manifold, as they potentially involve multiple actualities and can affect various subjects that interact with the cross-reality system. We refer to these transitions as implicit transitions since they are a byproduct of the proposed system and not the focus of the introduced research. As these implicit transitions between actualities are complex, we found that they are difficult to grasp and hard to articulate. However, because of their strong impact, they should be considered. Here, we found that common ground to describe these transitions has not yet been established. As a result, it is difficult to extract the transitions’ essence, making evaluation and comparison non-trivial. To make implicit transitions comprehensible and comparable, we recommend investigating visualization methods that enable one to convey the transitions taking place within a cross-reality system. Finally, publications on cross-reality systems often do not investigate the transitions that their proposed systems introduce. For example, research evaluating different approaches to display a physical keyboard in VR assumes the keyboard is always present [
159,
266]. These works thus focus on interacting with the keyboard in VR rather than on the transition between the keyboard being present and absent. While this focus is reasonable, the question of how to transition between these two states has received less attention.
7.2 Multiple Actualities
We identified several research topics that involve multiple users and bystanders (cf. Section
4.1.3), which we refer to as
Type 3 cross-reality systems. Here, both users and bystanders have different actualities and can transition along the continuum. Thereby, they can change their actuality, resulting in more complex interactions. For example, von Willich et al. introduced a cross-reality system in which from the VR user’s perspective, a bystander enters VR and thereby transitions closer to the VR user; however, from the bystander’s perspective, there is no transition into VR, meaning the bystander still experiences the real world [
305]. Thus, all perspectives need to be taken into account as they contribute to an all-encompassing understanding of the scenario. However, it remains challenging to grasp and convey users’ and bystanders’ perspectives and actualities to an audience that has not experienced the system itself. Again, we recommend investigating visualization methods; nevertheless, we emphasize that such visualizations need to consider the different actualities of the users involved in
Type 3 cross-reality systems.
7.3 Missing Research on Augmented Reality
We found that current research mainly focuses on cross-reality systems built around VR users. Only a smaller number of systems proposed cross-reality experiences with AR users (VR is present in 236 papers, while AR appears in only 111 papers, less than half as many). We believe that the tendency of immersive VR to blend out visual information from the real world while auditory or haptic sensations remain perceivable inherently offers more potential for conflict, which previous work has aimed to address. Nonetheless, previous work has demonstrated that AR suffers from similar problems, just to a smaller degree [
140,
141]. Still, neglecting these issues can cause severe problems, especially when cross-reality systems are operated in more dangerous environments (e.g., while navigating traffic [
136]). Hence, more investigations into head-mounted AR systems are needed, especially as these systems already make it easier to communicate with bystanders, while the digital content remains hidden from them, similar to VR systems. Novel approaches introduced conceptual solutions to these issues [
69]. However, especially for cross-reality systems that allow users to transition on the continuum, more hardware is required as only very few devices allow transitioning between AR and VR. Currently, these devices are also limited to video see-through AR.
7.4 Trend toward Augmented Virtuality
Current VR systems aim for immersive experiences; however, the physical environment of VR users continues to have an impact [
187]. For example, VR users need to be careful not to bump into bystanders or furniture [
192]. Thus, in recent years, research has shifted toward cross-reality systems that include parts of the VR users’ environment on demand, meaning they temporarily or permanently transition users toward AV. In this work, we define such systems as
Type 2 cross-reality systems (or
Type 3 if they include other users). Commercial products have followed this trend, for example,
Oculus with the release of its Pass-through API. Thereby, researchers have acknowledged the shortcomings of current VR systems and started embracing the opportunities that cross-reality systems offer. In the future, more research is needed to systematically investigate which aspects of users’ real environments need to be introduced to VR experiences and, more importantly, when and how users transition to AV with the goal of incorporating these aspects into their experiences. Finally, integrating real-world objects into the experience requires considering many different objects. If we manage to find computational approaches to integrate them automatically (e.g., [
117]), it will enable users to engage with more objects.
7.5 Prototyping Cross-reality Systems
Prototyping and developing cross-reality systems is still challenging [
214] and can be a time-intensive process that often requires software and hardware prototyping expertise [
12]. In particular, the creation of cross-reality hardware prototypes (e.g., [
44,
104,
105,
192]) has a high entry barrier and requires the use of various hardware components (e.g., displays, projectors, sensors), engineering skills (e.g., electrical engineering, software development), and design expertise (e.g., rapid prototyping). Enabling fast and low-effort prototyping of cross-reality systems could support researchers, developers, and designers of cross-reality systems to quickly iterate their ideas and designs without the need to fully implement the entire system in both software and hardware (e.g., by avoiding a hardware implementation). We argue that more novel prototyping methods are required to help develop cross-reality systems. Recently, Gruenefeld et al. published
VRception, a prototyping concept and toolkit that allows for rapid creation of cross-reality systems entirely in VR [
103]. With this system, multiple users can remotely join one virtual environment. In this environment, they can use various pre-defined virtual components to build cross-reality systems and prototype their functionality in VR. A useful addition to this would be a modular hardware system that allows users to create cross-reality systems with less effort and without the need for extensive software and hardware experience. Such a system could include modular hardware components that can be easily integrated with each other (e.g., small projectors, displays, cameras) and software components that allow for easy integration into virtual environments. Moreover, researchers have proposed various prototyping tools relevant to cross-reality systems [
214]. For example, they have presented approaches utilizing VR to prototype AR applications [
97,
166] or to enact futuristic interfaces [
272]. While these approaches are not directly targeting cross-reality systems, they can still be valuable for the prototyping process of these systems.
8 General Discussion
In this section, we discuss the current state of cross-reality system research, thereby answering our guiding question: How can we align the language across communities and establish a solid foundation for future work that benefits both researchers and practitioners? For each extracted research question, we have a dedicated paragraph below that aims at discussing our related findings.
Classification of Cross-reality Systems. The field of cross-reality systems is a relatively young research area. Hence, a well-established terminology is not yet present in the relevant research communities. We argue that it is timely to establish a common terminology as we see an increasing number of publications that introduce cross-reality systems and research. Through our review, we aimed to provide a terminology that allows one to classify cross-reality systems. This can foster research by providing terms that make such systems more comparable or ease the communication of novel ideas. In this context, we argued for the term
actuality to describe the current experience of cross-reality system users. Through this term, we can clearly describe what a user is currently experiencing (e.g., the actuality of a user is VR). Further, we introduced a clear distinction between subjects and objects. Subjects are conscious and can perceive their environment, or in other words, they have an actuality. For example, a person in the real world perceives the physical environment; therefore, the actuality for this person is the real world. When the person uses a VR-HMD, the actuality would be VR. To describe cross-reality systems that allow one to transition between different manifestations on the Reality-Virtuality Continuum [
200], we introduced
Type 1 cross-reality systems. Transitional interfaces [
29,
30,
300] can be classified as Type 1 cross-reality systems as they allow their users to transition between various manifestations (e.g., the real world, AR, VR), thereby changing the actuality of their users. Objects play a key role and, with their utilization, form an important new category within cross-reality systems. We have identified a large number of publications that utilize objects within cross-reality systems (158 out of 306 publications). Therefore, we introduced
Type 2 cross-reality systems. These types of systems allow one to repurpose objects, for example, from the real world in virtual experiences [
192]. Through Type 2 systems, we can describe all systems that integrate objects from another manifestation into the current actuality (e.g., a smartphone into VR [
6]). We do not limit ourselves to physical, tangible objects; systems that make use of physical phenomena like heat [
291] or motion [
48,
193,
194] can be categorized as Type 2 cross-reality systems. To describe systems that involve multiple subjects, each of which experiences different actualities, we introduced
Type 3 cross-reality systems. A typical scenario would be users collaborating using AR and VR [
220] or bystander inclusion [
104,
192,
305]. We argue that this classification allows for structuring the field of cross-reality systems, thereby allowing one to get a better understanding of current trends and even recognize research that is not explicitly introduced as part of the cross-reality domain, for instance, utilizing objects within the user’s actuality for haptics [
206] or integrating real-world motion into VR [
48]. We believe that along these types, we can establish useful terminology and guidelines for researchers and practitioners in the area of cross-reality systems. In this sense, we introduced nine guiding principles for the design of cross-reality systems.
Nine Guiding Principles for Cross-reality Systems. As suggested by the literature, there are entry barriers for the development of AR/VR applications [
12]. At the same time, MR applications are envisioned to become more relevant in the future [
278]. Through our review, we observed a strong rise in contributions to the field of cross-reality systems, yet we lack guidelines that help to design and implement novel cross-reality systems and experiences. At this time, we strongly believe that it is important to propose a set of rules for cross-reality system design. With our nine guiding principles, we proposed such a fundamental set along our three types of cross-reality systems that are grounded in a large literature corpus. Although these rules may be partly familiar to cross-reality experts, formalizing and communicating such a set can benefit the field of cross-reality systems. Novice researchers or practitioners can benefit from years of research distilled into a crisp set of rules that serve as useful guidelines in many practical and educational contexts. The nine guiding principles we have proposed are backed by our extensive literature review. Nevertheless, they are not verified through empirical evaluations. In this sense, future research is necessary to assess their overall applicability. Still, we strongly believe that the rules in their current state form an important starting point for future and well-established guidelines.
Research Challenges and Opportunities. We extracted promising research challenges and opportunities for future work through our literature review. The field of cross-reality systems is manifold, ranging from introducing implicit transitions that were not part of the underlying research question [
159] to bystander inclusion that focuses primarily on immersed users and less on bystanders [
305]. Therefore, little is known about their effects on the corresponding scenario. We see numerous research opportunities here that can help to shape the understanding of cross-reality systems and their effects on all involved users.
Limitations. We acknowledge the following limitations to our survey. We intentionally opted for a manual screening approach to compile our literature corpus because it allowed us to include a larger, more diverse set of publications. On the one hand, this procedure can introduce human error (e.g., overlooking a publication) as our corpus grew substantially in size (overall we screened 33,522 publications). On the other hand, our manual approach allowed for the identification of publications that investigated cross-reality systems but did not use common terminology or present the research as a cross-reality-related evaluation. An automated approach like a database query would have suffered from the same limitations. Hence, we believe that our manual approach led to the compilation of a literature corpus that represents current research in greater detail than an automated one. Further, we compiled the literature corpus starting with HCI-related conferences. Consequently, literature that introduced cross-reality systems in other venues might not be considered in our literature corpus. As this survey approaches cross-reality systems from an interaction perspective, we started with HCI venues. Other venues (e.g., TVCG or SIGGRAPH) often present graphics-focused publications and might lack the interaction part that is of interest to this survey. Nonetheless, through checking references and citing papers iteratively, we identified a large number of cross-reality systems published in other venues. Finally, we did not investigate the underlying population of the corresponding user studies in the reviewed papers. Therefore, our survey does not address possible novelty effects introduced by the presented systems.
9 Conclusion
Due to the increasing interest in cross-reality systems, we conducted a scoping literature review, surveying existing publications that propose such systems. Here, we conducted an in-depth literature review by surveying more than 8,437 papers as an initial pool of papers in this domain, from 2015 to 2022. By following their referenced papers and papers that cited them, we surveyed around 25,000 additional papers (as citing or referenced publications). In sum, we identified 306 papers that describe implementations of cross-reality systems (e.g., [
137,
192,
255]). These served as a corpus for classifying their research topics and identifying shared properties. While we see a growing interest in cross-reality systems, we could not identify common terminology. However, to describe cross-reality systems and the aforementioned interplay among different actualities, such terminology should be established. Hence, in our work, we answer the following research question:
How can we align the language across communities and establish a solid foundation for future work that benefits both researchers and practitioners? In particular, we contribute a classification of cross-reality systems into three different types:
Type 1: Subjects transitioning on the continuum experiencing a changing actuality;
Type 2: Subjects interacting with objects that are repurposed for the subject’s actuality; and
Type 3: Multiple subjects experiencing different actualities. Furthermore, we contribute to a better understanding of these systems by identifying shared properties and providing nine guiding principles that should be followed when implementing these systems. Finally, we conclude our work with research challenges and opportunities that can benefit cross-reality systems. Here, we address current shortcomings and propose future research perspectives, including visualization and prototyping methods for these systems.
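To make the three types more concrete, the following sketch encodes them as simple checks over a hypothetical data model. The `Subject` class, its fields, and the `classify` function are illustrative assumptions and not part of the survey. In particular, the sketch assumes that each subject's actualities are recorded as time-aligned lists and that a system may match more than one type.

```python
from dataclasses import dataclass, field
from enum import Enum


class Actuality(Enum):
    RW = "real world"
    AR = "augmented reality"
    AV = "augmented virtuality"
    VR = "virtual reality"


@dataclass
class Subject:
    """Hypothetical model of a user or bystander in a cross-reality system."""
    name: str
    actualities: list  # actualities experienced over time, in order
    repurposed_objects: list = field(default_factory=list)


def classify(subjects):
    """Return the set of types (1, 2, 3) that a described system matches."""
    types = set()
    # Type 1: at least one subject experiences a changing actuality.
    if any(len(set(s.actualities)) > 1 for s in subjects):
        types.add(1)
    # Type 2: at least one subject interacts with a repurposed object.
    if any(s.repurposed_objects for s in subjects):
        types.add(2)
    # Type 3: multiple subjects whose (time-aligned) actualities differ.
    if len({tuple(s.actualities) for s in subjects}) > 1:
        types.add(3)
    return types


if __name__ == "__main__":
    typist = Subject("typist", [Actuality.VR, Actuality.AV], ["keyboard"])
    print(classify([typist]))
    novice = Subject("novice", [Actuality.AR])
    expert = Subject("expert", [Actuality.VR])
    print(classify([novice, expert]))
```

In this model, a single VR user who brings a physical keyboard into the experience matches Types 1 and 2, whereas an AR user collaborating with a VR user matches Type 3.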