1. Introduction
The Nouvelle Aquitaine DA3T regional project (in English, Device for the Analysis of Digital Tracks for the Valuation of Touristic Territories) aims at improving the management and the valorization of the coastal tourist territories of Nouvelle Aquitaine by exploiting both indoor and outdoor touristic mobility tracks and contextual data. We have datasets from various sources describing touristic trips (e.g., mobility tracks in the city of La Rochelle or mobility tracks within the Bordeaux Wine City). In order to meet the project objective, geographers and computer scientists wish to devise methods and tools that will allow manual and automatic processing of mobility tracks. Our results will allow the extraction of knowledge that is useful to the actors of territorial planning.
In this paper, we use outdoor mobility tracks from volunteer tourists. They agreed to use Geoluciole, our mobile application for capturing spatio-temporal positions, during their visit to the city, and to take part in an oral interview conducted at the end of their stay in order to complete the raw mobility tracks. In our work, we need mobility representations of various granularities (e.g., sequences of positions, sets of trajectories, sequences of trajectory segments, etc.). The representation of computed data (e.g., speed) or contextual data from external resources (e.g., weather) is also important to enrich the movement. These data are associated with positions or trajectory segments. Thus, we describe a mobility track through different sequences of trajectory segments annotated with semantic features. These segments represent movement characteristics and are either (i) disjoint semantic segments (e.g., “the tourist walks in the street then goes to the museum”) or (ii) imbricated (i.e., nested) semantic segments (e.g., “at the end of their visit, the tourist buys a book in the museum store”). The model we propose addresses two main challenges: the description of raw mobility data, on the one hand, and the description of the associated semantic features, on the other hand. A semantic trajectory is therefore described as a sequence of disjoint or imbricated semantic segments. As these semantic segments can be imbricated, we speak of a multi-level trajectory. Each segment has semantic features with spatial, temporal and thematic dimensions, so any type of real-world phenomenon can be described to enrich trajectories; we therefore speak of a multiple aspect trajectory.
The article is organized as follows.
Section 2 describes the work on mobility data by first stating some key definitions and then discussing the modeling of semantic trajectories in the literature.
Section 3 outlines the motivating scenario of the DA3T project by describing the data manipulated (i.e., touristic mobility tracks as well as potential enrichment data) and then specifying the needs of geographers and planners in terms of semantic trajectory modeling. The research challenges are then presented.
Section 4 presents our contribution: the DA3T model dedicated to the representation of semantic trajectories.
Section 5 tests the DA3T model through several experiments: two experiments, based on visit data from the city of La Rochelle, aim at validating the multi-level and multiple aspect dimensions of our model; two other experiments, based on data from other application domains, test the genericity of the model.
Finally, Section 6 concludes and presents perspectives for future work.
3. Motivation Scenario
The motivation scenario from the DA3T project is divided into three parts. The first part describes the characteristics of the data manipulated in the framework of the project. The second part expresses the requirements of geographers and planners in terms of representation and processing of these data. Lastly, the third part highlights the corresponding scientific challenges.
3.1. Tourist Mobility Tracks and Enrichment Data
We collected data from three distinct source categories: (i) mobility tracks of tourists visiting the city of La Rochelle, captured by a mobile application, (ii) transcripts of interviews with all or some of these same tourists, and (iii) contextual data allowing the enrichment of this dataset (e.g., weather, division of the city into districts, events, points of interest, etc.), collected or extracted from Open Data.
Our data collection process went through several steps: (1) promotion of our project to tourists in the tourist offices of La Rochelle; (2) collection of mobility tracks of volunteer tourists through the mobile application Geoluciole, which captures the position of the phone (i.e., of its bearer) at regular time intervals; (3) follow-up semi-structured interviews with these same tourists, asking for details and explanations about their movements; and, finally, (4) a search of Open Data sources for contextual data related to the city. At the end of the process, we had a set of raw mobility tracks collected through Geoluciole, interviews and various contextual data.
Figure 1 shows that the track (cf. Figure 1, block a), completed by contextual data such as nearby points of interest (cf. Figure 1, block b), allowed a first interpretation such as “they must have visited the Lantern Tower and the Chain Tower”. The interview extract that corresponded to this track was: “On 28/07, visit of the towers: first the Chain tower, then we wanted to go to the Lantern tower but it was too late so we turned back and went to buy an ice cream across the street.” This interview (cf. Figure 1, block c) made it possible to confirm certain interpretations (e.g., “visit to the Chain tower”) or to correct others (e.g., “we wanted to go to the Lantern tower but it was too late”). It also revealed other activities (e.g., “buy an ice cream”). Thus, aggregating these data, from three distinct source categories, allowed us to be more precise in our analyses. In short, an interview could fill the gaps in the corresponding mobility track and vice versa.
3.2. Definition of Geographers’ Requirements
The DA3T project is a multidisciplinary project in which geographers and planners use tourist mobility data, associated with interview data and contextual data, according to the representations illustrated in Figure 1. These enrichment data are an important added value for geographers and planners because they give meaning to the tourist’s movement and thus make it easier to understand.
In order to highlight the requirements of geographers, let us first use Figure 1. A track could be associated with enrichment data from Open Data or interviews. Each enrichment datum corresponds to a subset of the track’s positions. On the one hand, data from Open Data enriched the track of Figure 1 with the different points of interest passed by the tourist (e.g., the sequence <“Chain tower”, “Lantern tower”, “Ze’bar”>). On the other hand, interview data enriched the track of Figure 1 with the points of interest that were actually visited (e.g., <“Chain tower”, “Ze’bar”>) or with touristic activities (e.g., “eat an ice cream”). These two different enrichments could each complete or correct the semantics associated with the track.
It should also be noted that the triptych of spatial, temporal and thematic dimensions was used to describe enrichment data (e.g., the “Francofolies” event could be described by a location, a duration and a name). Hence, it is useful to associate such data with a subpart of the track on the basis of these dimensions.
Finally, a sequence of enrichment data enriching a track could be recursively decomposed into sub-levels. For example, a touristic activity “discovery” mentioned in the interview could be decomposed into two new touristic activities with a more precise meaning like “visit” and “restauration”.
3.3. From Touristic Mobility Track to Semantic Trajectory: Locks and Working Hypotheses
The characteristics of tourist mobility data and the specific expectations of the geographers and planners who use them highlight specific modeling requirements. We must, in the same model, be able to describe raw mobility tracks, raw trajectories and semantic trajectories, integrating the characteristics described in Section 3.2. Thus, after processing a mobility track (i.e., pre-processing, enrichment, filtering, etc.), we obtained one or more raw trajectories (i.e., segments of the mobility track that were of interest for a given application) or semantic trajectories (i.e., raw trajectories enriched with one or more specific interpretations based on contextual data from various sources). Semantic trajectory enrichment data were derived from interviews or from contextual data extracted from external resources.
The challenges lay in (i) the modeling of semantic trajectories with enrichment data associated with positions or trajectory segments (potentially structured in sub-segments); (ii) the generic definition of enrichment data integrating spatial, temporal and thematic dimensions; (iii) the structuring of such enrichment data according to a hierarchical organization. We hypothesized that the notion of multi-level episodes [8,9] could be reused and extended to address challenge (i), and that the notion of multiple aspects [10], with an aspect description enriched with spatial, temporal and thematic dimensions, could address the two other challenges (ii) and (iii). To our knowledge, there is no semantic trajectory model that integrates these different notions simultaneously.
4. DA3T Semantic Trajectory Model
Figure 2 shows our model of semantic trajectories, which is broken down into three distinct parts:
The raw data level part (cf. Figure 2, block c) gathers the classes representing the collected raw data. General data related to mobile objects are in the MobileObject class, and more specific data related to a particular category of mobile objects (e.g., the volunteer tourists of our project) are in the corresponding classes that inherit from it (e.g., the GeolucioleVisitor class). A mobile object has a sequence of positions (Position) that describes the entire captured movement, i.e., its mobility track. There may be several types of positions (e.g., in our project, IndoorPosition for visitor positions in museums and OutdoorPosition for positions collected with Geoluciole) that have different attributes, but all inherit from the generic Position class. A trajectory (Trajectory) is a sub-part of this mobility track that is of interest for the application. It can be constructed on temporal, spatial or spatio-temporal criteria (e.g., from a mobility track describing a tourist’s 1-week stay, one may construct a set of trajectories of one-day duration). This model is generic and extensible according to the application context. The parts that can be extended are the MobileObject class, to which child classes representing new types of mobile objects can be added (e.g., a mobile object class Vehicle inheriting from MobileObject), and the Position class, to which child classes representing new types of positions can be added (e.g., a position class GPSPosition inheriting from Position).
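As an illustration only, the raw data level could be sketched in Python as follows. The class names come from the model; every attribute (identifier, timestamp, lat, lon, room), the helper split_by_day and the sample values are assumptions made for this sketch, not part of the published model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import groupby
from typing import List


@dataclass
class Position:
    """Generic position; only a timestamp is assumed to be shared."""
    timestamp: datetime


@dataclass
class OutdoorPosition(Position):
    lat: float  # assumed attribute
    lon: float  # assumed attribute


@dataclass
class IndoorPosition(Position):
    room: str  # assumed attribute


@dataclass
class MobileObject:
    identifier: str  # assumed attribute
    track: List[Position] = field(default_factory=list)  # full mobility track


@dataclass
class GeolucioleVisitor(MobileObject):
    """Volunteer tourist whose track was captured with Geoluciole."""


@dataclass
class Trajectory:
    mobile_object: MobileObject
    positions: List[Position]


def split_by_day(mobile_object: MobileObject) -> List[Trajectory]:
    """Temporal criterion: build one trajectory per calendar day."""
    ordered = sorted(mobile_object.track, key=lambda p: p.timestamp)
    return [
        Trajectory(mobile_object, list(day_positions))
        for _, day_positions in groupby(ordered, key=lambda p: p.timestamp.date())
    ]


# Made-up sample track spanning two days.
visitor = GeolucioleVisitor("tourist-1", [
    OutdoorPosition(datetime(2020, 7, 28, 10, 0), 46.1558, -1.1569),
    OutdoorPosition(datetime(2020, 7, 28, 10, 5), 46.1560, -1.1571),
    OutdoorPosition(datetime(2020, 7, 29, 9, 0), 46.1600, -1.1500),
])
print([len(t.positions) for t in split_by_day(visitor)])
```

Adding a new kind of mobile object or position (e.g., Vehicle or GPSPosition) then amounts to declaring one more subclass, which is the extension mechanism described above.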
The semantic data level part (cf. Figure 2, block a) comprises the classes describing the enrichment data. As in the MASTER model [10], we wish to represent semantic data in the form of semantic aspects. Our model is therefore called a multiple aspect model. Four main classes represent these aspects (Aspect, AspectType, AspectAttribute and Value). The aspect type represents the category of the enrichment data (e.g., “touristic activity” for a “swimming” aspect, “point of interest” for a “central place” aspect, etc.) and has specific attributes (e.g., aspects of “point of interest” type are each characterized by a name, a location, a type, etc.). When an aspect is created, at least one aspect type is associated with it, and each attribute associated with that type is instantiated through the association class Value (e.g., the “Lantern tower” aspect is a point of interest that has, for name: Lantern Tower; for location: [46.1558333, −1.1569444]; for type: tower; etc.). Finally, we note that the AspectType class includes an isSubTypeOf relationship that allows enrichment data to be structured hierarchically. Thus, for example, touristic activities can be defined iteratively at different levels of detail, as outlined in the requirements. In the MASTER model, attributes are always instantiated as strings. In our model, we specify the classes TemporalValue, SpatialValue and ThematicValue to distinguish between temporal, spatial and thematic attributes. Moreover, at instantiation, some instances of the model can be linked to concepts coming from external ontologies thanks to a uri attribute (e.g., to describe aspects of “point of interest” type, we can rely on external ontologies such as DataTourism: https://framagit.org/datatourisme/ontology, accessed on 6 June 2021). Thus, a URI can describe an aspect type, an aspect attribute, or a particular value of an aspect attribute.
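The semantic data level can be sketched in the same spirit. The class names and the isSubTypeOf and uri members follow the description above; the field layout, the content field, the lineage helper and the placement of “discovery” under “touristic activity” are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class AspectAttribute:
    name: str
    uri: Optional[str] = None  # optional link to an external ontology concept


@dataclass
class AspectType:
    name: str
    attributes: List[AspectAttribute] = field(default_factory=list)
    is_sub_type_of: Optional["AspectType"] = None  # isSubTypeOf relationship
    uri: Optional[str] = None

    def lineage(self) -> List[str]:
        """Names from this type up to its root type (hypothetical helper)."""
        names, current = [], self
        while current is not None:
            names.append(current.name)
            current = current.is_sub_type_of
        return names


@dataclass
class Value:
    attribute: AspectAttribute
    content: Any  # assumed field holding the actual value
    uri: Optional[str] = None


class SpatialValue(Value):
    """Value of a spatial attribute (e.g., coordinates)."""


class TemporalValue(Value):
    """Value of a temporal attribute (e.g., an interval)."""


class ThematicValue(Value):
    """Value of a thematic attribute (e.g., a name)."""


@dataclass
class Aspect:
    types: List[AspectType]
    values: List[Value]

    def __post_init__(self) -> None:
        # The model requires at least one aspect type per aspect.
        if not self.types:
            raise ValueError("an aspect needs at least one aspect type")


# The "Lantern tower" point of interest from the text.
name, location, kind = (AspectAttribute(n) for n in ("name", "location", "type"))
point_of_interest = AspectType("point of interest", [name, location, kind])
lantern_tower = Aspect([point_of_interest], [
    ThematicValue(name, "Lantern Tower"),
    SpatialValue(location, (46.1558333, -1.1569444)),
    ThematicValue(kind, "tower"),
])

# Assumed hierarchy: "visit" refines "discovery", itself a touristic activity.
activity = AspectType("touristic activity")
discovery = AspectType("discovery", is_sub_type_of=activity)
visit = AspectType("visit", is_sub_type_of=discovery)
print(visit.lineage())
```

Keeping three Value subclasses, rather than plain strings, lets a processing service select the spatial or temporal values of an aspect by type when it needs to match them against positions.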
The interpretation level part (cf. Figure 2, block b) serves as a link between the raw data and the semantic data. An enrichment datum (i.e., an aspect in this model) can be directly linked to a particular position (Position). The link between a position and an aspect can be based on the attributes of the aspect having spatial (e.g., the spatial footprint of a district), temporal (e.g., the temporal interval of a holiday), or spatio-temporal (e.g., the spatial footprint and temporal interval of an event) dimensions. This link can also be computed for some aspects (e.g., the mode of locomotion can be computed using the speed of travel and knowledge of the road or pedestrian network) or manually defined (e.g., the touristic activities specified by the tourist during the interview). A set of positions enriched with the same enrichment data can result in the creation of a particular episode (Episode), i.e., a sequence of consecutive positions related to the same aspect. The granularity of episodes is expressed through the recursive composition link isComposedOf, meaning that an episode can be refined by other episodes. Thus, our model is referred to as a multi-level model [8]. A specific trajectory can be linked to one or more interpretations of the Interpretation class. An interpretation is a particular sequence of episodes (e.g., a tourist’s trajectory may have one interpretation to describe the weather during travel, another interpretation to describe touristic activity, etc.) [3]. The Pattern class allows the grouping of trajectories that share spatial, temporal, or semantic characteristics (e.g., trajectories that correspond to the same activities, trajectories that pass through the same districts, etc.). Finally, the MobileObjectCategory class allows one to classify mobile objects into particular categories according to their behaviors (e.g., tourists, residents, etc.).
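A minimal sketch of how episodes might be derived, assuming each position has already been linked to one aspect label for a given interpretation. Episode, Interpretation and isComposedOf come from the model; the function build_episodes, the use of position indices and the sample labels are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import groupby
from typing import List, Sequence


@dataclass
class Episode:
    aspect: str                # label of the aspect shared by the positions
    positions: List[int]       # indices of consecutive trajectory positions
    is_composed_of: List["Episode"] = field(default_factory=list)


@dataclass
class Interpretation:
    name: str
    episodes: List[Episode]


def build_episodes(labels: Sequence[str]) -> List[Episode]:
    """Group consecutive positions that carry the same aspect label."""
    episodes, start = [], 0
    for label, run in groupby(labels):
        size = len(list(run))
        episodes.append(Episode(label, list(range(start, start + size))))
        start += size
    return episodes


# One made-up label per position of a six-position trajectory.
labels = ["walk", "walk", "visit", "visit", "visit", "walk"]
activity = Interpretation("touristic activity", build_episodes(labels))

# Multi-level: the last position of the visit is refined by a nested episode.
activity.episodes[1].is_composed_of.append(Episode("shopping", [4]))

for episode in activity.episodes:
    print(episode.aspect, episode.positions,
          [sub.aspect for sub in episode.is_composed_of])
```

A second Interpretation (e.g., for the weather) could be built over the same positions from a different label sequence, which is how one trajectory carries several semantic readings.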
We have presented a multiple aspect, multi-level model that is generic and extensible. Let us test it now on real scenarios.
6. Conclusions and Perspectives
The DA3T project aims to propose models, methods and tools dedicated to the processing of tourist mobility data for the development and enhancement of the touristic territories of the French Atlantic coast. It is based on a mixed methodological approach to collecting tourist mobility data. These data are both heterogeneous and multidimensional (space, time and theme): digital tracks from GPS tracking, information on mobility and touristic activities from interviews, and contextual data collected from Open Data. The linking of these data requires the design of a multiple aspect and multi-level trajectory model. Therefore, the main contribution of this paper is to present the DA3T semantic trajectory model, which integrates multi-level and multiple aspect features, as well as its exploitation in processing chains. Multiple aspect models propose to link semantic aspects directly to points of the trajectory or to the whole trajectory. In our model, we propose (1) to link aspects to episodes of the trajectory. Moreover, our model reflects what geographers are used to doing manually to analyze trajectories, that is, analyzing the trajectory along different semantic axes in order to draw a global conclusion about the movement of an individual. In our model, these semantic axes are represented by interpretations (i.e., Interpretation class) associated with a trajectory. Our model also integrates (2) the notion of ordered episodes (i.e., Episode class), where an episode can itself be composed of other episodes (multi-level approach). Each episode, disjoint or imbricated, is enriched with (3) semantic aspects (i.e., Aspect class) that allow one to represent real-world objects using attributes of spatial, temporal and thematic dimensions (i.e., SpatialValue, ThematicValue and TemporalValue classes).
Through four scenarios, we illustrated these characteristics and also the genericity of our model by confronting it with mobility data from other application domains (i.e., Foursquare data and migratory bird data). These experiments allowed us to validate our approach and to show that different processing chains can be implemented to facilitate the processing, analysis and visualization of heterogeneous mobility data. For the semantics, we drew on several types of resources: ontologies (concerning, for example, touristic activities) and Open Data (for weather, points of interest or city districts) can be integrated.
The model is already implemented in the customized DA3T processing platform, where the input and output data of each service are instances of the model. For now, we have tested and validated it on outdoor mobility data. We now want to apply it to indoor mobility data corresponding to museum visits. Moreover, we are experimenting with services that aggregate trajectories and tourists on the basis of semantic similarities (Pattern and MobileObjectCategory classes). Finally, we are working on services exploiting tourists’ interviews to enrich their trajectories. In the longer term, the goal is to develop a platform that can be fully customized by end-users who are not computer scientists.