X-CANIDS Dataset (In-Vehicle Signal Dataset)

Citation Author(s):
Seonghoon Jeong, Korea University
Hyunjae Kang, Korea University
Huy Kang Kim, Korea University
Submitted by: Seonghoon Jeong
Last updated: Fri, 04/26/2024 - 07:22
DOI: 10.21227/epsj-y384

Abstract 

In March 2024, our article "X-CANIDS: Signal-Aware Explainable Intrusion Detection System for Controller Area Network-Based In-Vehicle Network" was published in IEEE Transactions on Vehicular Technology. Here we publish the dataset used in that article. We hope it facilitates further research using deserialized signals as well as raw CAN messages.

Real-world data collection. Our benign driving dataset is unique in that it has been collected from real-world environments.

Signal deserialization. We offer our dataset in two formats, raw CAN messages and deserialized signals, which enables the development of both message-based and signal-based models and a comparison of their performance.

For more specifications regarding this dataset, please refer to ReadMe.md in DOCUMENTATION.

 

Benign driving dataset (raw/dump*.parquet)

  • The dataset contains CAN messages from a Hyundai LF Sonata 2017 e-VGT. Our interface, a Kvaser Memorator Professional HS/HS (built-in clock with 1 μs resolution), was connected to the vehicle via the OBD-II port.
  • Six driving datasets and one idling dataset.
  • Each dump contains 62-64 arbitration IDs. Because our dataset was collected from a real commercial vehicle, it contains more arbitration IDs than other public datasets that were generated synthetically.
  • What makes our dataset special? Its potential can be maximized with open-source CAN databases. A CAN database is a formal description used for payload deserialization (or dissection). Using hyundai_2015_ccan.dbc (see https://github.com/commaai/opendbc/blob/master/hyundai_2015_ccan.dbc), you can obtain 688 signals from the raw payloads of the CAN messages.
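To illustrate what payload deserialization involves, here is a minimal sketch that extracts one little-endian (Intel byte order) signal from a raw payload. The start bit, length, scale, and offset used in the example are hypothetical and are not taken from hyundai_2015_ccan.dbc; real CAN databases also define big-endian (Motorola) signals, which this sketch does not handle. In practice a DBC parsing library would normally perform this step.

```python
def extract_signal(payload: bytes, start_bit: int, length: int,
                   scale: float = 1.0, offset: float = 0.0,
                   signed: bool = False) -> float:
    """Decode one little-endian (Intel) signal from a CAN payload.

    start_bit is the position of the signal's least significant bit,
    counting bit 0 as the LSB of byte 0.
    """
    # Interpret the whole payload as one little-endian integer so that
    # bit i of the integer is bit (i % 8) of byte (i // 8).
    frame = int.from_bytes(payload, byteorder="little")
    raw = (frame >> start_bit) & ((1 << length) - 1)
    # Two's-complement sign extension for signed signals.
    if signed and raw & (1 << (length - 1)):
        raw -= 1 << length
    # Physical value = raw * scale + offset, as in a DBC signal definition.
    return raw * scale + offset


# Hypothetical 16-bit signal at bit 0 with a scale of 0.25.
payload = bytes([0x10, 0x27, 0, 0, 0, 0, 0, 0])  # raw value 0x2710 = 10000
print(extract_signal(payload, start_bit=0, length=16, scale=0.25))  # → 2500.0
```

Applying one such definition per signal to every message with a matching arbitration ID is, in essence, what turns the raw dumps into signal-level data.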

 

Intrusion dataset (raw/dump6-*.parquet)

The intrusion dataset is derived from the Dump 6 dataset. We conducted attack simulations in the period 480-1440 s, which is half of the capture period of the dataset, to obtain label-balanced data.

Attack type description

  • Fuzzing Attack (fuzz): It manipulates various ECUs with random payloads and can be performed with CAN messages that contain random arbitration IDs (AIDs) and payloads. The attack can cause a malfunction of the target vehicle even if the adversary has no prior knowledge of the in-vehicle communications.
  • Fabrication Attack (fabr): A specific ECU is manipulated as the adversary intends. The attack can be performed using well-crafted CAN messages with a specific AID and payload. Because a legitimate ECU periodically transmits CAN messages with the same AID, an adversary can transmit their CAN message directly after every benign message.
  • Suspension Attack (susp): It neutralizes an ECU by exploiting the error-handling mechanism of CAN. The target ECU does not transmit any CAN messages during the attack.
  • Masquerade Attack (masq): It is a combination of the fabrication and suspension attacks. During the attack, the stream from a specific ECU is replaced with arbitrary messages generated by the adversary.
  • Replay Attack (repl): An adversary captures legitimate CAN messages over a certain period and then retransmits them on the CAN bus. The attack can cause the target vehicle to repeat behavior that it performed during the capture period.
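The five attack types can be sketched as transformations of a message trace. The following is a simplified illustration only, not the simulation procedure used to build this dataset: the Msg record, the 0/1 label field, the injection delay, and the toy trace are all assumptions made for the example. Only the 480-1440 s window comes from the description above.

```python
import random
from typing import NamedTuple

ATTACK_START, ATTACK_END = 480.0, 1440.0  # attack window in seconds


class Msg(NamedTuple):
    t: float     # timestamp in seconds
    aid: int     # arbitration ID (AID)
    data: bytes  # payload, up to 8 bytes
    label: int   # 0 = benign, 1 = attack (hypothetical labeling scheme)


def fuzzing(trace, rng, interval, start=ATTACK_START, end=ATTACK_END):
    """Inject messages with random 11-bit AIDs and random 8-byte payloads."""
    count = int((end - start) / interval)
    injected = [Msg(start + i * interval, rng.randrange(0x800),
                    bytes(rng.randrange(256) for _ in range(8)), 1)
                for i in range(count)]
    return sorted(list(trace) + injected, key=lambda m: m.t)


def fabrication(trace, target_aid, payload,
                start=ATTACK_START, end=ATTACK_END, delay=1e-4):
    """Send a crafted message directly after every benign target message."""
    out = []
    for m in trace:
        out.append(m)
        if m.aid == target_aid and start <= m.t < end:
            out.append(Msg(m.t + delay, target_aid, payload, 1))
    return sorted(out, key=lambda m: m.t)


def suspension(trace, target_aid, start=ATTACK_START, end=ATTACK_END):
    """Drop every message of the target AID inside the attack window."""
    return [m for m in trace
            if not (m.aid == target_aid and start <= m.t < end)]


def masquerade(trace, target_aid, payload,
               start=ATTACK_START, end=ATTACK_END):
    """Replace the target stream with adversary payloads (same timing)."""
    return [Msg(m.t, m.aid, payload, 1)
            if m.aid == target_aid and start <= m.t < end else m
            for m in trace]


def replay(trace, cap_start, cap_end, replay_at):
    """Capture messages in [cap_start, cap_end) and resend them later."""
    shifted = [Msg(m.t - cap_start + replay_at, m.aid, m.data, 1)
               for m in trace if cap_start <= m.t < cap_end]
    return sorted(list(trace) + shifted, key=lambda m: m.t)


# Toy trace: AID 0x100 once per second for 10 s, all benign.
trace = [Msg(float(i), 0x100, bytes(8), 0) for i in range(10)]
attacked = fabrication(trace, 0x100, b"\xff" * 8, start=2.0, end=5.0)
print(sum(m.label for m in attacked))  # → 3
```

Note that the suspension attack leaves no attack-labeled messages behind, because its effect is the absence of the target stream. How such periods are labeled in the dataset files is described in the ReadMe.md and the paper, not in this sketch.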

For more information regarding this intrusion dataset, please refer to our original paper, X-CANIDS.

 

Signal dataset (sig/*.parquet)

We also provide deserialized signals for every file in the Benign driving dataset and the Intrusion dataset.

 

Citation

If this dataset helps your research, please consider citing it as well as the original article, X-CANIDS.

@ARTICLE{JeongLLK24X-CANIDS, 
  author={Jeong, Seonghoon and Lee, Sangho and Lee, Hwejae and Kim, Huy Kang},
  journal={IEEE Transactions on Vehicular Technology},
  title={X-CANIDS: Signal-Aware Explainable Intrusion Detection System for Controller Area Network-Based In-Vehicle Network},
  year={2024},
  volume={73},
  number={3},
  pages={3230--3246},
  doi={10.1109/TVT.2023.3327275}
}

Documentation

Attachment: ReadMe.md (91.19 KB)