The proposed ML-based alignment methodology was evaluated in simulations and in field experiments with smartphones. The following sections present the main simulation and experimental results for both classical CA and MLCA. The MLCA results reported in the tables are those obtained after feature selection, hyper-parameter tuning, and window size optimization for each method.
4.1. Simulations
A simulation environment was developed to emulate the readings of low-accuracy accelerometers with a velocity random walk (VRW) error model:
$$\tilde{\mathbf{f}} = \mathbf{f} + \mathbf{w}$$
where $\tilde{\mathbf{f}}$ is the measured specific force, $\mathbf{f}$ is the nominal value of the specific force, and $\mathbf{w}$ is the inertial sensor random noise, defined as zero-mean white Gaussian noise. To that end, a velocity random walk value of 0.05 was used.
In stationary conditions, the deterministic parts of the bias, scale-factor, and misalignment error terms can mostly be removed. Therefore, they were not addressed in the simulations; later, in the experiments, all error terms are accounted for. Using this simulation, noisy accelerometer readings were produced, each scenario lasting 30 s at a sampling rate of 100 Hz and labeled with the nominal reference roll and pitch angles. In total, 138.9 million samples (from the three accelerometers) were created, of which 138.3 million formed the train datasets and 600 K the test datasets.
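The simulation described above can be sketched as follows. This is a minimal illustration, not the authors' simulator: the body-frame sign convention (z axis down), the gravity magnitude `G`, the per-sample noise standard deviation `noise_std`, and all function names are assumptions introduced here, since the text does not state how the VRW value maps to per-sample noise. The classical leveling step is the textbook coarse alignment applied to the averaged specific force.

```python
import math
import random

G = 9.81  # assumed gravity magnitude in m/s^2


def nominal_specific_force(roll, pitch, g=G):
    """Nominal stationary specific force in the body frame (z-down convention assumed)."""
    return (
        g * math.sin(pitch),
        -g * math.cos(pitch) * math.sin(roll),
        -g * math.cos(pitch) * math.cos(roll),
    )


def simulate_recording(roll, pitch, noise_std, duration_s=30.0, rate_hz=100.0, rng=None):
    """Nominal specific force plus zero-mean white Gaussian noise on each axis."""
    rng = rng or random.Random()
    f = nominal_specific_force(roll, pitch)
    n = int(round(duration_s * rate_hz))
    return [tuple(fi + rng.gauss(0.0, noise_std) for fi in f) for _ in range(n)]


def classical_ca(samples):
    """Textbook leveling: roll and pitch (rad) from the averaged specific force."""
    n = len(samples)
    fx = sum(s[0] for s in samples) / n
    fy = sum(s[1] for s in samples) / n
    fz = sum(s[2] for s in samples) / n
    roll = math.atan2(-fy, -fz)
    pitch = math.atan2(fx, math.hypot(fy, fz))
    return roll, pitch


# Usage: one 30 s recording at 100 Hz, leveled from its first 3 s (300 samples).
recording = simulate_recording(
    math.radians(0.5), math.radians(-0.3), noise_std=0.05, rng=random.Random(0)
)
roll_est, pitch_est = classical_ca(recording[:300])
print(math.degrees(roll_est), math.degrees(pitch_est))
```

Each recording carries its nominal roll and pitch as labels, which is what allows the same data to serve both the classical method and the supervised ML models.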
Four train sets in two representative angle ranges, differing by their resolution, were chosen for the analysis: first, a narrow range of −1° ≤ roll, pitch ≤ 1° with relatively high resolutions of 0.01 and 0.05 degrees, containing 120 million and 4.8 million samples, respectively; and second, a wider range of −15° < roll, pitch < 15° with lower resolutions of 0.5 and 1 degrees, containing 10.8 million and 2.7 million samples, respectively. For example, the narrow range train set of −1° ≤ roll, pitch ≤ 1° with a resolution of 0.01° contains 40 K combinations of 200 roll and 200 pitch angles. Each simulation was recorded for 30 s at 100 Hz, which means that this data set contains a total of 120 million measurements of the three-axis accelerometers. Each test set comprised 100 additional simulated recordings, each for 30 s at 100 Hz, in the narrow and wide ranges, respectively.
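The data set sizes quoted above follow from simple arithmetic, sketched below. The helper name `train_set_size` is introduced here for illustration, and the number of grid steps per axis is taken as span divided by resolution, which matches the 200 roll by 200 pitch example in the text.

```python
SAMPLES_PER_RECORDING = 30 * 100  # 30 s at 100 Hz


def train_set_size(span_deg, resolution_deg):
    """Three-axis samples in a roll/pitch grid covering span_deg on each axis."""
    steps = round(span_deg / resolution_deg)
    return steps * steps * SAMPLES_PER_RECORDING


train_sizes = {
    "narrow, 0.01 deg": train_set_size(2, 0.01),
    "narrow, 0.05 deg": train_set_size(2, 0.05),
    "wide, 0.5 deg": train_set_size(30, 0.5),
    "wide, 1 deg": train_set_size(30, 1),
}
test_size = 2 * 100 * SAMPLES_PER_RECORDING  # 100 recordings in each of the two ranges

for name, size in train_sizes.items():
    print(f"{name}: {size / 1e6:.1f} million")
print(f"train total: {sum(train_sizes.values()) / 1e6:.1f} million")
print(f"overall total: {(sum(train_sizes.values()) + test_size) / 1e6:.1f} million")
```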
The motivation for this strategy was to cope, on a standard CPU computer, with the huge amount of data produced by the simulated accelerometers operating at 100 Hz, while still evaluating MLCA in both the wide and the narrow angle ranges.
For the narrow range tests, the ML models were trained twice on the entire simulated train sets, once at each of the two resolutions of 0.05° and 0.01°, for roll and pitch angles in the range of 0 ± 1°. The test set comprised 100 additional simulated recordings, each with a different random orientation in the range of 0 ± 1° for the roll and pitch angles. Given the train and test sets, a total of 73 features (as presented in Section 3.3) were calculated. Then, six ML approaches and the classical CA approach were applied to the data.
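To illustrate how a window of raw samples becomes one row of model inputs, the sketch below computes a few summary statistics per axis. It is only an illustration of the idea: the four statistics and the names `window_features`, `fx`, `fy`, and `fz` are assumptions made here, and they are not the 73 features of Section 3.3.

```python
import statistics


def window_features(samples, window):
    """Summary statistics over the first `window` three-axis samples."""
    if window < 2 or window > len(samples):
        raise ValueError("window must be between 2 and the number of samples")
    features = {}
    for axis, name in enumerate(("fx", "fy", "fz")):
        values = [s[axis] for s in samples[:window]]
        features[f"{name}_mean"] = statistics.fmean(values)
        features[f"{name}_std"] = statistics.stdev(values)
        features[f"{name}_min"] = min(values)
        features[f"{name}_max"] = max(values)
    return features


# Usage with three hand-written samples (m/s^2).
demo = [(0.0, 1.0, -9.0), (2.0, 1.0, -10.0), (4.0, 1.0, -11.0)]
print(window_features(demo, window=3))
```

The `window` argument plays the role of the prediction time: at 100 Hz, a 3 s prediction corresponds to `window=300`.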
Table 5 shows prediction results for the random orientations test set, as obtained from the ML models trained in the resolution of 0.01°, for a prediction duration of 3 s (300 accelerometer measurements). The traditional CA errors for the roll and pitch angles are 0.336 mrad and 0.323 mrad, respectively, while MLCA achieved better results with all methods. The maximum improvement was 42.75% for the roll angle with the XGBoost method and 42.42% for the pitch angle with the LightGBM method.
The results show a clear advantage of the ML model predictions over the traditional CA method: all models improved on the classical method by roughly 40% or more for both the roll and the pitch.
The required CA prediction time has a strong influence on the performance. In general, as the required time increases, the results improve for both the classical and the ML approaches. Moreover, as the CA time increases, the improvement of the ML approach over the classical CA method grows.
Figure 5 shows the percentage improvement achieved by the XGBoost model, trained in the lower resolution of 0.05° in the range of 0 ± 1° for the roll and pitch angles, relative to traditional CA, for several prediction time durations. The MAE improvement grew from 3% for the roll and 2% for the pitch at a 0.5 s prediction time to 26% for the roll and 19% for the pitch at a prediction time of 3 s.
To better show the improvement that can be achieved using MLCA, the LightGBM model performance is further evaluated.
Figure 6 shows the roll and pitch prediction errors for the LightGBM model trained in the resolution of 0.01° in the range of 0 ± 1° for the time required for prediction in the range of 0.1–3.5 s, which corresponds to 10–350 accelerometer measurements in each axis.
The results show two benefits of MLCA: (1) the ability to obtain lower errors at the same prediction time and (2) achieving a required CA error level in a shorter prediction time. For example, the traditional CA error for the roll angle in Figure 6 is 1.173 mrad after 13 accelerometer measurements, while MLCA using LightGBM obtained a better MAE of only 0.9 mrad, a 23% improvement. Moreover, the LightGBM MAE converged to 0.7 mrad after 23 accelerometer measurements, while CA needs 35 measurements to achieve such accuracy, that is, 12 more measurements, or another 120 ms at 100 Hz, to reach a similar prediction accuracy. In some applications, this is a meaningful time duration. The LightGBM model also shows predictive stability across all prediction times, whereas the classical method is heavily influenced by local errors at some of the higher prediction times, where its MAE increases sharply.
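The two benefits can be quantified with a pair of small helpers. The function names are introduced here for illustration; the relative improvement is taken as the reduction in MAE divided by the classical MAE, which reproduces the 23% figure quoted for Figure 6, and the 100 Hz sampling rate is the one used in the simulations.

```python
def relative_improvement(baseline_mae, model_mae):
    """Percentage reduction of the MAE relative to the baseline."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


def time_saved_ms(baseline_samples, model_samples, rate_hz=100.0):
    """Time gained when fewer samples reach the same accuracy."""
    return 1000.0 * (baseline_samples - model_samples) / rate_hz


# Roll error after 13 measurements: classical CA versus LightGBM.
print(f"{relative_improvement(1.173, 0.9):.0f}%")  # prints 23%

# Measurements needed to reach 0.7 mrad: 35 for classical CA, 23 for LightGBM.
print(time_saved_ms(35, 23), "ms")
```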
After validating that MLCA predicts better than the classical method in the narrow range, prediction with wide-range, low-resolution ML models is now addressed. ML models were trained on simulated train sets with roll and pitch values in two resolutions of 0.5° and 1° in a wide range of 0 ± 15°. The test stage used an additional set of 100 simulated recordings, each with a different random orientation in the range of 0 ± 15° for the roll and pitch angles.
Table 6 shows prediction results obtained from the ML models trained in the resolutions of 0.5° and 1°, for a prediction time of 3 s, which corresponds to 300 accelerometer measurements in each axis.
As in the narrow range, increasing the angular resolution of the train set clearly and significantly improves the results in the wide range as well. The resolution that can be used is, however, limited by memory and CPU constraints. Still, the results show a reasonable ability to predict the angles in the wide range with all the ML methods tested, which enables us to focus the search on the relevant narrow range model, as presented before. Referring to Table 6, for ML models trained in the resolution of 0.5°, the worst results for a 3 s prediction time are 2.03 mrad (±1.56) for the roll and 2.26 mrad (±1.1) for the pitch, that is, 0.12° (±0.09) for the roll and 0.13° (±0.06) for the pitch, obtained with the XGBoost method. Those results can still easily guide us to the relevant narrow range model. Among the ML methods tested in the wide range, the CatBoost, LightGBM, and ExtraTrees methods generally have an advantage, with CatBoost usually providing the best and most stable prediction. The wide range also shows the effect of the prediction time on both the classical method and the ML models: in general, a longer time allows for better results.
Figure 7 shows this in a more detailed comparison of the classical method results versus the CatBoost method trained in the resolution of 0.5°.
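The milliradian-to-degree conversions quoted for the worst wide-range results of Table 6 can be checked with a short script. The helper name `mrad_to_deg` is introduced here for illustration.

```python
import math


def mrad_to_deg(value_mrad):
    """Convert milliradians to degrees."""
    return math.degrees(value_mrad * 1e-3)


# Worst wide-range results quoted from Table 6 (XGBoost, 0.5 deg resolution).
for label, mae_mrad, std_mrad in (("roll", 2.03, 1.56), ("pitch", 2.26, 1.1)):
    print(f"{label}: {mrad_to_deg(mae_mrad):.2f} deg (±{mrad_to_deg(std_mrad):.2f})")
# prints:
# roll: 0.12 deg (±0.09)
# pitch: 0.13 deg (±0.06)
```

Both errors are far smaller than the 2° span of the narrow range, which is why a wide-range estimate of this quality is enough to pick the narrow range model.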
To summarize, classical CA keeps the same level of performance in both the narrow and the wide range. For MLCA, the wide range is used to direct and focus on the narrow range CA, which yields the roll and pitch estimation. It was shown that the overall time required by MLCA to obtain the same accuracy is much lower than that of traditional CA, and that its overall accuracy is better.
4.2. Experiments
Field experiments were conducted using smartphone-based inertial sensors under real operating conditions. The MLCA predictive models were tested in a set of stationary INS alignment scenarios. The raw sensor data was recorded using the 'Sensor Fusion' Android application, which was developed at Linköping University (LiU) in Sweden [42]. Screenshots of the application are presented in Figure 8. The recorded set of real inertial sensor measurements, with their errors and random noise, was then used instead of the simulated data as input for the performance comparison between traditional CA and MLCA. The Sensor Fusion Android application was installed on a Samsung Galaxy phone and configured to record the specific force vector and the smartphone orientation.
To calculate the attitude ground truth (GT), recordings of three minutes in stationary conditions were made, and the attitude solution provided by the application was employed. The attitude is calculated in an Attitude and Heading Reference System (AHRS) framework using the well-known Madgwick algorithm [43], which is based on both gyroscope and accelerometer readings. This algorithm provided the attitude estimate over the three-minute duration of each recorded raw time series, and the GT was taken as the average attitude value. Averaging reduces the influence of noise on the solution: assuming zero-mean white Gaussian noise, the more samples are used, the better the noise reduction. Prior sensor calibration was not conducted; thus, misalignment errors were not removed. In the following experiments, CA and MLCA were compared for a time duration of one second, so noise reduction has less influence there than in the GT. Both traditional CA and the proposed MLCA were compared to the same GT.
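Under the white-noise assumption stated above, the effect of averaging can be written explicitly. In this sketch, $\sigma$ is the per-sample noise standard deviation, $N$ the number of averaged samples, and $f_s$ the sampling rate, assumed here to be the same for the three-minute GT recordings and the one-second test windows; the symbols are introduced for illustration only.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}},
\qquad
\frac{\sigma_{\bar{x}}(1\,\mathrm{s})}{\sigma_{\bar{x}}(180\,\mathrm{s})}
  = \sqrt{\frac{180\, f_s}{1\, f_s}}
  = \sqrt{180} \approx 13.4
```

That is, under these assumptions the noise remaining in the averaged GT is roughly 13 times smaller than the noise remaining in a one-second window.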
The MLCA performance in the field experiment was first evaluated in a narrow range of 0–1° for the roll and pitch angles. The ML models were first trained and tested on a dataset of raw 3-min recordings at varying random angles within this narrow range.
Figure 9 presents the distribution of the recorded orientations. 70% of each of the recordings in the dataset was used as a training set, and the rest was used as the test set.
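The per-recording split can be sketched as below. The text only states that 70% of each recording went to the train set; taking the first 70% in time order is an assumption made here, and `split_recording` is a hypothetical helper name.

```python
def split_recording(samples, train_fraction=0.7):
    """Split one recording into train and test parts, keeping time order."""
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Usage: a 3-min recording at roughly 100 Hz has about 18,000 samples.
recording = list(range(18_000))
train_part, test_part = split_recording(recording)
print(len(train_part), len(test_part))  # prints 12600 5400
```

Splitting every recording, rather than holding out whole recordings, means each recorded orientation appears in both parts, which is why the models are later also tested on a newly recorded orientation.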
Table 7 shows the experimental results for the low-accuracy accelerometers in the narrow range scenario, produced for a one-second prediction time, which corresponds to approximately 100 accelerometer measurements. The CA errors for the roll and pitch angles are 0.233 mrad and 0.255 mrad, respectively, while MLCA achieved significantly better results with all presented methods, with up to 85.89% and 78.03% relative improvement for the roll and pitch angles using the LightGBM method.
That is, on the real experimental data sets, MLCA showed remarkable results: LightGBM predicted the roll and pitch angles better than the classical method on a known set of angles, with up to 86% and 78% relative MAE improvement, while CatBoost also did well, with up to 67% and 76% relative improvement, for a one-second prediction time.
Similar experimental results were also achieved when the same narrow range ML models were trained on the full dataset of recordings in the range of 0–1° and tested against a newly recorded data set with an orientation different from those in the train set.
Table 8 presents the prediction results on a new set of recordings at a randomly chosen orientation of 0.93° for the roll and 0.52° for the pitch angle, produced for a prediction time of one second. The CA errors for the roll and pitch angles are 0.177 mrad and 0.153 mrad, respectively, while MLCA achieved significantly better results with most presented methods, with up to 70.94% relative improvement for the roll using LightGBM and 45.07% for the pitch angle when the RF method was used.
The results show that even when tested on new angles, there is a clear improvement in favor of the MLCA methods over the classical method. LightGBM and RF stood out with the best results and showed an accuracy improvement for new angles by 71% and 70% for the roll and by 45% and 37% for pitch, respectively, versus the classical CA.
Next, CA prediction in a wide range low-resolution dataset was evaluated. This is a necessary step in order to validate the possibility to later focus on a specific area of narrow range angles. At this stage, 3-min recordings were collected at varying angles in the wide range of 0–15°.
Figure 10 presents the distribution of data set recordings in the wide range.
The ML models were initially trained and tested on this data set, where 70% of each recording in the dataset was used in the training set and the rest in the test set. Then, the same wide range ML models, trained on the full dataset of recordings in the range of 0–15° for the roll and pitch angles, were tested against a newly recorded data set with an orientation different from those in the train set.
Table 9 presents a comparison of the prediction results of the ML models on a new random orientation recording of 10.88° for the roll and 3.01° for the pitch angle that was not present in the train set. The results were produced for a prediction time of one second. The comparison is between classical CA and MLCA, including the precision in terms of MAE results.
Similar to the simulation results, the ML models trained in a wide range did not achieve better results than the classical method. But again, the achieved values, including the STD values, easily allow us to focus on a narrow range and run the relevant model that has been trained for that range with better resolution, and so get overall better results (in shorter times) compared to the classical method. For example, from the results in Table 9 for the tested ML methods, the worst errors were obtained with CatBoost, with an MAE of 9.41 mrad (0.54° ± 0.26) for the roll and 5.202 mrad (0.3° ± 0.05) for the pitch. All other methods obtained much higher accuracy. For example, ExtraTrees achieved 2.076 (±0.22) mrad and 1.092 (±0.52) mrad for the roll and pitch angles, respectively, that is, 0.12° (±0.01) for the roll and 0.06° (±0.03) for the pitch, which can easily guide us to the relevant narrow range model.
To summarize the presented experimental results, the prediction performance of the wide range MLCA models proved more than sufficient to direct and focus on the relevant tested narrow range of 0–1°, with an overall accuracy of better than 0.2°. Thus, the overall accuracy achievable with the proposed MLCA is determined by the performance of the ML models in the narrow range stage. In the narrow range tests, MLCA outperformed traditional CA, with an accuracy improvement for a new test angle of 71% for the roll and 45% for the pitch when using the LightGBM method. Given these results, it is possible to compose the best-performing ML models for each of the pyramidal methodology stages: the ExtraTrees model for the wide range predictions and the LightGBM model for the narrow range predictions. This result is illustrated in Figure 11.
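The two-stage (pyramidal) flow can be sketched as follows. This is a hedged illustration only: the text does not specify how the wide-range estimate selects the narrow range model, so the nearest-centre rule below is an assumption, and `pyramidal_predict`, `wide_model`, and `narrow_models` are hypothetical names. The toy callables stand in for trained models such as ExtraTrees (wide range) and LightGBM (narrow range).

```python
import math


def pyramidal_predict(features, wide_model, narrow_models):
    """Two-stage prediction: the coarse estimate selects a narrow range model.

    wide_model: callable returning a coarse (roll, pitch) in degrees.
    narrow_models: dict mapping a range centre (roll, pitch) in degrees to a
        callable trained for the narrow range around that centre.
    """
    coarse_roll, coarse_pitch = wide_model(features)
    # Assumed selection rule: pick the range centre nearest to the coarse estimate.
    centre = min(
        narrow_models,
        key=lambda c: math.hypot(c[0] - coarse_roll, c[1] - coarse_pitch),
    )
    return narrow_models[centre](features)


# Toy stand-ins for trained models (illustration only).
def toy_wide_model(features):
    return (0.8, 0.4)


toy_narrow_models = {
    (0.5, 0.5): lambda features: (0.93, 0.52),
    (10.5, 3.5): lambda features: (10.88, 3.01),
}

print(pyramidal_predict({}, toy_wide_model, toy_narrow_models))
```

With this structure, the final accuracy depends only on the selected narrow range model, provided the wide-range error is small enough to select the correct range.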