Connected Driving Pipelines
Introduction
I created several connected driving pipelines, each improving on the one before it. They are listed here from newest to oldest:
ConnectedDrivingPipelineV4 (4th Pipeline)
This is the latest connected driving pipeline. It has new features such as automatic caching, dependency injection, and context/path providers.
Summary
This pipeline was created to end the continuous cycle of making machine learning (ML) pipelines for connected driving.
The pipeline was created for connected driving research. The goal is to create a better, more realistic dataset for training systems that detect malicious Basic Safety Messages (BSMs) in the connected driving space. Current datasets, such as Veremi and Veremi Extension, are too easy to train detection models on and don't model the world realistically.
We took the Wyoming CV Pilot Basic Safety Message One Day Sample dataset from OpenDataNetwork as our original, real data.
This pipeline creates malicious datasets based on customizable attacks and then tests them using machine learning models such as Random Forest, Decision Tree and K-Nearest-Neighbours.
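As a rough illustration of the attack-injection step described above, here is a minimal sketch using only the Python standard library. It is not the pipeline's actual code: the `BSM` record, its field names, the `constant_offset_attack` function, and the choice to pick attackers per vehicle are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class BSM:
    """Hypothetical, simplified Basic Safety Message record."""
    vehicle_id: int
    x: float          # position in metres (assumed local coordinates)
    y: float
    speed: float      # metres per second
    is_malicious: bool = False


def constant_offset_attack(bsm: BSM, dx: float, dy: float) -> BSM:
    """Example attack: shift the reported position by a fixed offset."""
    return replace(bsm, x=bsm.x + dx, y=bsm.y + dy, is_malicious=True)


def inject_attack(
    bsms: List[BSM],
    attacker_ratio: float,
    attack: Callable[[BSM], BSM],
    seed: int = 0,
) -> List[BSM]:
    """Mark a fraction of vehicles as attackers and alter all their messages."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    vehicle_ids = sorted({b.vehicle_id for b in bsms})
    n_attackers = round(len(vehicle_ids) * attacker_ratio)
    attackers = set(rng.sample(vehicle_ids, n_attackers))
    return [attack(b) if b.vehicle_id in attackers else b for b in bsms]


# Build 100 clean messages from 10 vehicles, then attack 30% of the vehicles.
clean = [BSM(vehicle_id=i % 10, x=float(i), y=0.0, speed=20.0) for i in range(100)]
labelled = inject_attack(
    clean, 0.3, lambda b: constant_offset_attack(b, dx=50.0, dy=-50.0)
)
```

The resulting `is_malicious` flag can serve as the label when the records are passed to a classifier such as a Random Forest, Decision Tree, or K-Nearest-Neighbours model. Swapping in a different `attack` callable is what makes the attack customizable in this sketch.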
You can visit the documentation at www.aaroncollins.info/ConnectedDrivingPipelineV4/ and the repository at github.com/aaron777collins/ConnectedDrivingPipelineV4.
connecteddrivingresearch (3rd Pipeline)
Summary
This pipeline was created to start generating BSMs rather than relying on the Veremi dataset. It was messy and poorly documented, and the 4th version was created to fix these issues.
This pipeline used the Wyoming CV Pilot Basic Safety Message One Day Sample dataset from OpenDataNetwork as its original, real data. We added malicious data to some of the BSMs and then trained machine learning models to detect the malicious BSMs.
You can see the repository at github.com/aaron777collins/connecteddrivingresearch.
ConnectedDrivingMachineLearningPipeline (2nd Pipeline)
Summary
This pipeline used the Veremi dataset to train machine learning models to detect malicious data within BSMs. It was a good start, but the dataset was too easy to train models on, so the 3rd pipeline was created to generate more realistic data.
You can see the repository at github.com/aaron777collins/ConnectedDrivingMachineLearningPipeline.
ConnectedDrivingDataGenerator (1st Pipeline)
Summary
This pipeline was built on VEINS to simulate cars talking to each other in a virtual environment. Most of the code was taken from the VEINS tutorial and modified to fit our needs. This pipeline was abandoned because it was difficult to set up and to change to fit our needs. The steep learning curve was not worth it when we could just use the Veremi dataset. No machine learning was done in this pipeline.
You can see the repository at github.com/aaron777collins/ConnectedDrivingDataGenerator.
Links
- ConnectedDrivingPipelineV4 Docs
- ConnectedDrivingPipelineV4 GitHub Repo
- connecteddrivingresearch GitHub Repo
- ConnectedDrivingMachineLearningPipeline GitHub Repo
- ConnectedDrivingDataGenerator GitHub Repo
- VEINS
- Veremi
- Veremi Extension
- Wyoming CV Pilot Basic Safety Message One Day Sample
- Random Forest Classifier
- Decision Tree Classifier
- K-Nearest-Neighbours Classifier