3.2.1. Server Software
In the central server, as shown in Figure 1c, three server programs run: an MQTT broker, an OTA server, and a MySQL database server. The database has been described in our previous work [13]. The MQTT broker and the OTA server are discussed below.
Message Queuing Telemetry Transport (MQTT) Broker: In our previous work [13], the Transmission Control Protocol (TCP) server required no authentication and was therefore exposed to cyber-attacks. To solve this, a secure MQTT [33] transmission protocol, designed for Internet of Things (IoT) applications, is used for bidirectional communication among the gunshot detector devices, the central server, and the smartphone apps. The MQTT protocol offers an efficient and lightweight approach to messaging through a publish/subscribe model. In this model, the transmitting client publishes a message to a topic on the broker server, and the broker server then forwards the message to the receiving clients that have subscribed to that particular topic.
In this project, Mosquitto [34] is installed on a Windows computer as the MQTT broker server. To increase security, username and password authentication and Transport Layer Security (TLS) certificates are configured. Username and password authentication ensures that only authorized users can access the broker. To create the password file and add a new user, the Mosquitto password utility (mosquitto_passwd) [35] is used. One of its key security features is that it stores passwords in a hashed format rather than in plain text. Password-hashing schemes built on SHA512, SHA256, or PBKDF2 are designed to be one-way functions, meaning it is computationally infeasible to reverse the hash to obtain the original password. This ensures that even if someone gains access to the password file, they cannot easily retrieve the actual passwords.
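The one-way property described above can be illustrated with a minimal Python sketch. It uses PBKDF2-HMAC-SHA512 from the standard library; the iteration count, the salt length, and the function names are assumptions made for illustration and do not reproduce the exact format of the Mosquitto password file.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # hypothetical work factor chosen for this sketch


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA512."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest need to be stored; verification repeats the derivation rather than decrypting anything.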
The purpose of using TLS certificates [36] in an MQTT broker and client device is to establish a secure, encrypted communication channel between them. TLS certificates allow the parties to authenticate each other before exchanging data: the client always verifies the broker’s certificate, and the broker can additionally verify a client certificate if one is issued. TLS ensures that data transmitted over the network are encrypted, protecting them from eavesdropping, tampering, or man-in-the-middle attacks. When TLS is used in the Mosquitto MQTT broker, common algorithms include Advanced Encryption Standard (AES) for symmetric encryption, Rivest–Shamir–Adleman (RSA) or Elliptic Curve Cryptography (ECC) for key exchange and authentication, and SHA-256 for hashing to ensure data integrity, depending on the negotiated cipher suite. To configure TLS [37] in the Mosquitto MQTT broker, we generate the necessary files using OpenSSL [38]: a CA certificate, a server certificate, and the corresponding private key. These certificates are used to secure communication between the broker and clients. The files are then copied to a secure location on the server. The Mosquitto configuration file is modified to enable TLS by specifying the paths to the CA certificate, server certificate, and private key, and by setting the listener port to use TLS. When connecting, clients must also be configured to use TLS and must use the same CA certificate to verify the broker’s identity.
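On the client side, the configuration described above amounts to a TLS context that trusts the project CA and refuses unverified servers. The following is a minimal sketch using Python’s standard ssl module; the function name, the minimum TLS version, and the optional ca_file argument are assumptions made for illustration.

```python
import ssl


def make_tls_client_context(ca_file=None):
    """Build a client-side TLS context that verifies the server certificate."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy for this sketch
    if ca_file is not None:
        # Trust only the CA that signed the broker's certificate.
        context.load_verify_locations(cafile=ca_file)
    return context
```

An MQTT client library that accepts an ssl.SSLContext (paho-mqtt, for example, provides tls_set_context) could then be given this context before connecting to the TLS listener port.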
To enable public access to the MQTT broker server from any location, it needs a fixed address and port number. The host computer’s private IP is dynamically assigned by the router’s dynamic host configuration protocol (DHCP) server, and it can change based on the devices connected to the local network. To resolve this, the private IP of the host computer is made static, and port forwarding is set up in the router [
39]. The port forwarding mechanism directs incoming data packets from the Internet to the MQTT broker server. Additionally, the listener port is opened in the Windows Firewall settings [
40]. A memorable and user-friendly name for the router’s public IP is assigned using No-IP—a free dynamic domain name system (DDNS) service [
41]. Although the router’s public IP, assigned by the Internet service provider (ISP), does not change often, it may change after a few months or when the modem is restarted. To handle this, Dynamic DNS Update Client software [
42] is installed on the host computer, which continuously checks for changes in the public IP and automatically updates the DNS at No-IP when necessary.
Over-the-Air (OTA) Server: An OTA server is essential for managing firmware and security updates for IoT devices like gunshot detectors, without requiring physical access to the devices. Once the devices are installed, especially in hard-to-reach or critical locations such as schools or offices, manually updating them can be costly and time-consuming. OTA updates allow system administrators to push new firmware or security patches remotely, ensuring that the devices remain secure and functional over time. This approach minimizes downtime, enhances device performance, and ensures quick deployment of updates in response to newly discovered vulnerabilities, all without having to remove or manually reconfigure each device.
In this project, the OTA server is written in Python and hosted on the same computer where the MQTT broker is hosted. To facilitate production-level deployment, the server uses the Waitress WSGI HTTP server [
43] to handle requests. By using Waitress, the server can handle multiple requests simultaneously, making it suitable for environments where numerous devices may request firmware updates concurrently. The server is configured to listen on a particular port number and the port is opened in the Windows Firewall settings [
40]. To access the server publicly, port forwarding is configured in the router [
39].
The OTA update server implements HTTP Basic Authentication [44] to restrict access to its endpoints, protecting sensitive data such as firmware files. It verifies client credentials against an external file (e.g., ‘users.txt’). Usernames and hashed passwords are loaded from this file, allowing users to be added or changed without modifying the server code. Server administrators use a separate Python script to hash the passwords of new users and append the entries to the file. Upon successful verification by the server, the client device is granted access to the server’s resources, ensuring that only authorized devices can access version information or trigger firmware updates.
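The credential check can be sketched as follows. This is an illustration under stated assumptions, not the server’s actual code: the line format of the users file (username:salt_hex:hash_hex), the PBKDF2 parameters, and all function names are hypothetical.

```python
import base64
import hashlib
import hmac


def hash_pw(password, salt):
    """Hash a password with PBKDF2-HMAC-SHA256 (assumed scheme) and return hex."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 50_000).hex()


def load_users(text):
    """Parse lines of the hypothetical form 'username:salt_hex:hash_hex'."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, salt_hex, digest_hex = line.split(":")
        users[name] = (bytes.fromhex(salt_hex), digest_hex)
    return users


def check_basic_auth(header, users):
    """Return True if an 'Authorization: Basic ...' header value is valid."""
    scheme, _, token = (header or "").partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return False
    username, sep, password = decoded.partition(":")
    if not sep or username not in users:
        return False
    salt, expected = users[username]
    return hmac.compare_digest(hash_pw(password, salt), expected)
```

Because Basic Authentication only base64-encodes the credentials, a check like this protects the endpoints only when the connection itself is encrypted.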
The server provides a dedicated endpoint for retrieving the current firmware version of a specified hardware version. The /get_fw_ver/<hw_ver> endpoint serves as a mechanism to query the version of firmware that must be deployed for a particular device type. The server reads the firmware version from a text file (e.g., ver.txt) stored in a directory specific to each hardware version. This design enables efficient version control and ensures that the correct firmware version is served to requesting devices.
The core functionality of the OTA server is delivered through the /get_fw/<hw_ver> endpoint, which provides the actual firmware update files. Upon receiving a request, the server first checks the corresponding ver.txt file to determine the latest firmware version for the specified hardware. It then locates the appropriate firmware package in .zip format (e.g., fw_v1.zip). The Zip file contains three files: the Python code for gunshot detection and notification, the Python code for the device’s web server that provides access to the recorded audio files, and the TensorFlow Lite deep learning model. If the Zip file is found, it is securely transmitted to the client using Flask’s send_file function [45]. Error handling and logging are implemented to manage scenarios where files are missing or inaccessible.
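The lookup performed by the two endpoints can be sketched with the standard library alone. The directory layout (one folder per hardware version) follows the description above, while the mapping from the version string to the file name fw_v<version>.zip and the function name are assumptions based on the fw_v1.zip example.

```python
from pathlib import Path


def latest_firmware(root, hw_ver):
    """Return (version, zip_path) for a hardware version, or None if missing."""
    hw_dir = Path(root) / hw_ver
    ver_file = hw_dir / "ver.txt"
    if not ver_file.is_file():
        return None  # unknown hardware version or missing version file
    version = ver_file.read_text().strip()
    zip_path = hw_dir / f"fw_v{version}.zip"  # assumed naming scheme
    if not zip_path.is_file():
        return None  # version file exists but the package is missing
    return version, zip_path
```

A request handler would return the version string for /get_fw_ver/<hw_ver>, send the file for /get_fw/<hw_ver>, and log and report an error when the function returns None. In a real server, hw_ver should also be validated before it is used to build a path.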
3.2.2. Device for Gunshot Detection and Notification
The gunshot detection device monitors environmental sounds and classifies them as either gunshot or non-gunshot. Upon detecting a gunshot, the device sends a notification to smartphones over the Internet using the MQTT protocol and also saves the gunshot sound files locally on the device. The initial Wi-Fi and user configuration of the device, as well as device control such as enabling and disabling detection, are managed through the developed smartphone app. A brief overview of the device’s hardware and firmware is provided below.
Hardware: The hardware block diagram of the gunshot detection and notification device is shown in Figure 6. The Raspberry Pi (RPi) Zero 2W [46] is used as the main processing and communication unit. It has a 1 GHz quad-core 64-bit Arm Cortex-A53 CPU, 512 MB of SDRAM, 2.4 GHz 802.11 b/g/n Wi-Fi, Bluetooth Low Energy (BLE), an onboard antenna, a microSD card slot, and a Hardware Attached on Top (HAT) compatible 40-pin GPIO header, all in a compact 65 mm × 30 mm form factor. A MEMS microphone breakout [17] is interfaced with the Raspberry Pi Zero 2W using the I2S interface for mono channel (e.g., left channel) input. The breakout board contains a compact, low-power microphone comprising a high-performance SiSonic™ acoustic sensor, a serial analog-to-digital converter, and a signal conditioning interface that outputs audio in the standard 24-bit I2S format. The I2S interface facilitates integration with digital processors, eliminating the need for an external audio codec or sound card. The Acoustic Overload Point (AOP) of this microphone is 120 dB, a sound level comparable to very loud noises such as a rock concert, a jet engine at a short distance, or a gunshot. Thus, the microphone can capture sound levels up to 120 dB without significant distortion. Three LEDs are connected to the GPIO ports of the Raspberry Pi in the active-low configuration: a yellow LED to indicate listening mode, a green LED to indicate Internet connectivity, and a red LED to indicate a gunshot event. Three 330 Ω current-limiting resistors, R1, R2, and R3, are used for the LEDs. A push-button switch is interfaced with a GPIO pin so that the user can reset the device manually. For the power supply, a 100–240 V AC to 5 V DC converter module [47] is used. It can provide a maximum current of 600 mA and a maximum power of 3 W. A printed circuit board (PCB) containing the MEMS microphone connector, the three LEDs with resistors, and the reset switch was developed and connected to the 40-pin header of the Raspberry Pi as a HAT. A casing with a wall-outlet AC plug [48] is used to hold the electronics.
Figure 6.
Hardware block diagram of the gunshot detection and notification device.
Firmware: The Raspberry Pi Zero 2W is equipped with a 32 GB SD card running the 32-bit Raspberry Pi OS, which supports inference with TensorFlow Lite models [49]. The application software is written in Python, with all necessary packages installed on the system. After boot, two Python programs operate concurrently in separate threads: one for initializing the device, Wi-Fi provisioning, detecting gunshots, and OTA update management, and the other for accessing the recorded gunshot sounds.
Initializing device: The device initialization process involves setting up the hardware and software components necessary for real-time gunshot detection. Upon boot, the Raspberry Pi Zero 2W configures its GPIO pins for various LED indicators and a reset button, ensuring proper signaling during operation. The system also loads the pre-trained TensorFlow Lite model into memory, preparing the neural network for real-time classification of audio data [
49]. Additionally, the sound input subsystem is configured by initializing the I2S microphone and setting up the audio stream for continuous capture at the 44.1 kHz sampling rate [
50,
51].
The MQTT client is also set up for communication with the MQTT broker, ensuring that detected gunshot events are transmitted in real-time [52]. To ensure communication security, the MQTT client is configured with the same CA certificate as discussed in Section 3.2.1, which provides encrypted communication over TLS. Upon successful connection to the MQTT broker, the device immediately sends a device status update to the user’s smartphone indicating its connected status. In addition, the device subscribes to its specific MQTT topic, GSD_DEVICE_CMD/<DeviceID>, where <DeviceID> is the unique identifier for the device, allowing it to receive control commands from the user’s smartphone. A “last will” message is configured within the MQTT client, allowing the MQTT broker to notify the users if the device disconnects unexpectedly. To maintain connection stability, the MQTT client is configured with a keep-alive interval of 10 s. This setting ensures that the device periodically sends ping messages to the broker to verify that the connection remains active. If the broker receives no communication from the device within one and a half times this interval (15 s), as the MQTT specification allows, it treats the device as disconnected and publishes the “last will” message.
A callback function, MQTT_on_message(), is a key component of the system’s communication architecture, enabling real-time commands from the user’s smartphone to the device. This function is triggered whenever the device receives a message from the MQTT broker on the topic GSD_DEVICE_CMD/<DeviceID>. The smartphone app can send various commands to the device, such as enabling or disabling the gunshot detection functionality, requesting the device’s local IP address, or initiating an OTA update. Upon receiving a command, the function parses the payload and executes the appropriate action. For example, if the command is to enable the device, the system activates its detection mode by setting the isDeviceEnabled flag.
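The parse-and-dispatch step of such a callback can be sketched independently of any MQTT library. The command strings (ENABLE, DISABLE, OTA), the DeviceState class, and the ota_requested attribute are hypothetical; only the isDeviceEnabled flag comes from the description above.

```python
class DeviceState:
    """Minimal stand-in for the device's runtime flags."""

    def __init__(self):
        self.isDeviceEnabled = False
        self.ota_requested = False  # hypothetical flag for this sketch


def handle_command(state, payload: bytes) -> str:
    """Parse a command payload and apply it to the device state."""
    command = payload.decode("utf-8", errors="replace").strip().upper()
    if command == "ENABLE":
        state.isDeviceEnabled = True
    elif command == "DISABLE":
        state.isDeviceEnabled = False
    elif command == "OTA":
        state.ota_requested = True
    else:
        return "IGNORED"  # unknown commands leave the state untouched
    return command
```

Keeping the dispatch logic in a plain function like this makes it testable without a broker; the library callback would simply pass the received payload to it.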
Upon boot, the init_ota() function determines whether all tasks of the OTA update have been completed. Lastly, any previous device status, such as enabled or disabled states, is read from the system files to determine whether the device should start in an active or inactive mode.
Wi-Fi provisioning: This function in the RPi device initiates the process of establishing a network connection, either by connecting to a pre-configured Wi-Fi network or setting up a hotspot for Wi-Fi provisioning. If the system detects that no active Wi-Fi connection is available, it automatically starts a hotspot [
53], turns on all three LEDs, and waits for a smartphone app to provide Wi-Fi configuration information. The communication protocol between the RPi and the smartphone app is established through a socket connection [
54], allowing the app to send Wi-Fi credentials securely to the device.
The smartphone app is responsible for scanning available Wi-Fi networks and allowing the user to select the desired network and enter its password if required. Once the user provides this information, the app opens a socket connection to the device hotspot and transmits the Wi-Fi configuration data in a structured format (e.g., security type, SSID, and password). The Python program running on the device receives these data through the socket, parses them, and attempts to connect to the specified Wi-Fi network. If successful, the device switches from hotspot mode to the configured network, turns off all the LEDs, and continues normal operation. The app verifies the successful Wi-Fi connection of the device by receiving the device serial number, <DeviceID>, which is needed by the smartphone to send commands to the device using MQTT.
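The parsing step on the device can be sketched as below. The wire format is not specified in the text, so a JSON object with the keys security, ssid, and password is assumed here purely for illustration; the WifiConfig class and the function name are likewise hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class WifiConfig:
    security: str
    ssid: str
    password: str


def parse_wifi_config(raw: bytes) -> WifiConfig:
    """Decode a provisioning message; raise ValueError if it is malformed."""
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("malformed provisioning message") from exc
    if not isinstance(data, dict) or not data.get("ssid"):
        raise ValueError("missing SSID")
    security = str(data.get("security", "open")).lower()
    password = str(data.get("password", ""))
    if security != "open" and not password:
        raise ValueError("password required for secured network")
    return WifiConfig(security, str(data["ssid"]), password)
```

Rejecting malformed messages before attempting to join a network lets the device stay in hotspot mode and wait for a corrected configuration.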
Detecting gunshots: The pseudo-code of the main loop is shown in
Figure 7. This loop governs the main operations, including audio recording, feature extraction, gunshot classification, and system state management.
At the beginning of each iteration of the loop, the device first checks whether it is enabled by reading the isDeviceEnabled flag. If enabled, the system captures an audio sample using the RecordAudioSample() function. This function turns on the yellow LED, waits until 1 s of audio samples has been read, turns off the yellow LED, and returns. The function does not terminate the recording process once it finishes reading the data. Instead, the audio stream continues running in the background, allowing continuous audio capture during feature extraction and classification. The average delay between successive calls to this function is 35.2 ms, far below the 1 s sample length. This design ensures that no audio data are lost due to the delay in the remaining code of the loop. The audio data are then passed to the GenerateFeatures() function, which extracts both time-domain and frequency-domain features, as discussed in Section 3.1.2.
After generating the 2D feature matrix, f, the system processes it using overlapping windows, a technique that enhances the robustness of detection. Here, the number of overlaps, v, is set to 4 and the offset to 16, calculated by dividing the total number of feature columns (64) by v = 4. Unlike the approach in our previous work [13], where each iteration analyzes only new 1 s audio data, this system creates overlap by combining the last columns of the previously generated feature matrix, fp, from the last audio sample with the first columns of the current feature matrix, f, from the new audio sample. For instance, the last 3/4 of the columns of fp are concatenated with the first 1/4 of the columns of f in the first iteration, then the last 2/4 of the columns of fp are concatenated with the first 2/4 of the columns of f in the second iteration, and so on. Finally, fp is set to f for the next cycle. This overlapping mechanism is essential for capturing transient audio events, such as gunshots, which may not be fully captured within a single window. By combining features from consecutive audio samples, the system increases the likelihood that the important characteristics of a gunshot are preserved across multiple windows, reducing the chances of missing critical audio patterns at window boundaries.
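The column-wise concatenation can be sketched in plain Python, with each feature matrix represented as a list of rows. The function name is hypothetical, and the sketch assumes that the fourth and final window consists of f alone, which is one reading of “and so on” above.

```python
def overlapping_windows(fp, f, v=4):
    """Yield v feature matrices that slide from fp toward f.

    fp and f are lists of rows with the same number of columns.
    """
    n_cols = len(f[0])
    offset = n_cols // v  # 64 // 4 = 16 columns per step
    for k in range(1, v + 1):
        shift = k * offset
        # Last (n_cols - shift) columns of fp followed by the first `shift` columns of f.
        yield [prev_row[shift:] + cur_row[:shift] for prev_row, cur_row in zip(fp, f)]
```

With 64 columns and v = 4, the windows start at columns 16, 32, 48, and 64 of the concatenated sequence, so every window keeps the 64-column shape expected by the classifier.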
Once the feature matrix is constructed, it is passed to the ClassifySound() function. This function employs the pre-trained convolutional neural network (CNN) model to classify the sound based on the extracted features. The CNN, optimized for real-time execution via TensorFlow Lite [
49], analyzes the combined feature set and generates a probability score indicating the likelihood of the detected sound being a gunshot. If the score exceeds the 0.5 probability threshold, the sound is classified as a gunshot. If a gunshot is detected, the system immediately triggers the HandleGunshotEvent() function, which manages all subsequent actions. These include publishing an MQTT message on the topic GSD_OBSERVER/<DeviceID> containing a timestamp, saving the audio data as a WAV file for forensic purposes, and turning on the red LED as a visual indicator.
In the main loop, the UpdateMQTTConnectionStatus() function continuously checks whether the device is connected to the MQTT broker and updates the status of the green LED accordingly. If the connection is lost, the device attempts to reconnect automatically. Finally, the CheckResetButton() function monitors the physical reset button connected to the Raspberry Pi’s GPIO. If the button is pressed, the device measures the duration of the press to determine whether to perform a soft reboot or shut down the system. A short press triggers a reboot, allowing the system to restart. A long press, on the other hand, results in a complete shutdown.
OTA update management: The CheckForOTAUpdate() function ensures that the system stays up-to-date with the latest firmware version without manual intervention. In each iteration of the main loop, it compares the current time with the scheduled OTA update time. The scheduled OTA time is randomized based on the unique identity of the device to avoid overloading the server with many requests at the same time. If the current time matches the scheduled OTA time, the system initiates a series of operations with the OTA server: verifying the availability of a new firmware version, downloading the update, installing it, restarting, and deleting old files.
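One way to derive a per-device update time is to hash the device identifier into a minute of the day, as in the sketch below. The text does not state how the randomization is performed, so the use of SHA-256, the one-minute granularity, and both function names are assumptions.

```python
import hashlib


def scheduled_ota_minute(device_id: str) -> int:
    """Map a device ID to a stable minute of the day in [0, 1440)."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 1440


def is_ota_time(device_id: str, hour: int, minute: int) -> bool:
    """Return True when the given clock time matches the device's slot."""
    return hour * 60 + minute == scheduled_ota_minute(device_id)
```

Because the slot depends only on the identifier, each device checks at the same time every day while the fleet as a whole is spread across the 1440 available minutes.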
The first step in the OTA update process is to determine whether a new firmware version is available. When the update process is triggered, the device establishes a secure HTTP connection with the OTA server. This communication is facilitated through the Flask-based OTA server running on a designated IP and port, as discussed in Section 3.2.1. The device then sends a GET request to the server’s /get_fw_ver/<hw_ver> endpoint, where <hw_ver> represents the hardware version of the device. This endpoint checks the ver.txt file on the server, which contains the current firmware version for the specified hardware version. Upon receiving the request, the OTA server verifies the identity of the device using HTTP Basic Authentication [44]. If authentication is successful, the server reads the ver.txt file, retrieves the current firmware version, and sends it back to the device in the HTTP response. The device then compares the received firmware version with its own. If the versions differ, indicating that a new update is available, the device proceeds with the update process.
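The client side of this exchange can be sketched with the standard library. The request is only constructed here, not sent; the base URL, the credentials, and the function names are placeholders, and the comparison follows the “versions differ” rule stated above.

```python
import base64
import urllib.request


def build_version_request(base_url, hw_ver, username, password):
    """Build an authenticated GET request for the firmware-version endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/get_fw_ver/{hw_ver}",
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


def update_needed(local_version: str, server_version: str) -> bool:
    """An update is needed whenever the two version strings differ."""
    return local_version.strip() != server_version.strip()
```

The device would pass the request to urllib.request.urlopen, read the version string from the response body, and call update_needed with its own version.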
Once the system confirms that a new firmware version is available, the device initiates the downloading and installation of the update. The device establishes another secure HTTP connection with the OTA server, targeting the /get_fw/<hw_ver> endpoint. The firmware files are packaged based on hardware version compatibility, preventing the installation of firmware on incompatible devices. The OTA server responds by sending the firmware file packaged as a Zip archive. The Zip file is streamed directly to the device, ensuring minimal latency. Upon receiving the Zip archive, the device uses Python’s zipfile module to extract the firmware files into the appropriate directory on the device.
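The extraction step can be sketched with the zipfile module as follows. The function name and the integrity check before extraction are additions made for illustration; the text only states that the archive is extracted into the appropriate directory.

```python
import io
import zipfile
from pathlib import Path


def install_firmware(zip_bytes: bytes, target_dir) -> list:
    """Extract a firmware archive and return the extracted file names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        bad = archive.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise ValueError(f"corrupt member in archive: {bad}")
        archive.extractall(target)
        return archive.namelist()
```

Checking the archive before extraction avoids leaving a partially written firmware set on the device if the download was damaged in transit.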
After extracting the files, the device updates a shell script that lists the Python files to run after the system boots [55], replacing the old filenames with the new ones. It also writes two items to an ota.dat file: the version number of its current firmware and the flag isUpdatingDone set to False, indicating that the current firmware files still need to be deleted by the new firmware after the boot, as the currently running Python script cannot delete itself. Finally, the system triggers a reboot to apply the firmware changes, ensuring that the updated firmware is executed on the next boot. Upon boot, the new firmware checks the ota.dat file to determine whether all tasks of the update have been completed. If the flag isUpdatingDone is set to False, it reads the version of the previous firmware, which can now be deleted. The system then removes the old version’s files, such as outdated Python scripts and model files, and sets isUpdatingDone to True. This cleanup is crucial for freeing up space for future updates.
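The post-boot cleanup can be sketched as below. The on-disk format of ota.dat is not given in the text, so JSON is assumed; the file naming scheme (a _v<version> suffix before the extension) and both function names are also hypothetical.

```python
import json
from pathlib import Path


def write_ota_state(path, old_version, done):
    """Record the previous firmware version and whether cleanup is finished."""
    Path(path).write_text(json.dumps({"version": old_version, "isUpdatingDone": done}))


def finish_pending_update(state_path, fw_dir):
    """Delete files of the previous firmware if an update is unfinished.

    Returns the list of removed file names (empty if nothing was pending).
    """
    state_file = Path(state_path)
    if not state_file.is_file():
        return []
    state = json.loads(state_file.read_text())
    if state["isUpdatingDone"]:
        return []
    removed = []
    # Assumed naming scheme: old files end with '_v<version>' before the suffix.
    for old in sorted(Path(fw_dir).glob(f"*_v{state['version']}.*")):
        old.unlink()
        removed.append(old.name)
    write_ota_state(state_file, state["version"], True)
    return removed
```

Setting the flag only after the files are removed means an interrupted cleanup is simply retried on the next boot.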
Secure gunshot audio file access via local network: A secure web server using Flask on the RPi device is implemented to allow access to recorded gunshot audio files over the local Wi-Fi network. The server employs HTTP Basic Authentication with password hashing to ensure that only authorized users can access the files. SSL/TLS encryption ensures secure transmission of data between the client (smartphone or desktop) and the server, protecting sensitive information like passwords and audio files. The server also features a password change functionality. Users can update the administrator password through a web form, and the updated password is securely hashed and stored in a file.
Gunshot audio files, stored in a specific directory, are dynamically listed on the homepage, and users can click links to download the files. File access is handled by Flask’s ‘send_from_directory()’ method, ensuring that only files in the designated directory are accessible.
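The kind of containment check that restricts downloads to one directory can be illustrated with pathlib. This sketch is not Flask’s implementation; the function name is hypothetical, and it only shows the idea of rejecting any requested name that resolves outside the designated folder.

```python
from pathlib import Path


def resolve_download(base_dir, requested_name):
    """Return the resolved path if it lies inside base_dir, else None."""
    base = Path(base_dir).resolve()
    candidate = (base / requested_name).resolve()
    # Accept only paths strictly below the base directory.
    if candidate != base and base in candidate.parents:
        return candidate
    return None
```

Names containing parent-directory components or absolute paths resolve outside the base directory and are therefore refused.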