Malcolm is a powerful network traffic analysis tool suite designed with the following goals in mind:
- Easy to use – Malcolm accepts network traffic data in the form of full packet capture (PCAP) files and Zeek (formerly Bro) logs. These artifacts can be uploaded via a simple browser-based interface or captured live and forwarded to Malcolm using lightweight forwarders. In either case, the data is automatically normalized, enriched, and correlated for analysis.
- Powerful traffic analysis – Visibility into network communications is provided through two intuitive interfaces: OpenSearch Dashboards, a flexible data visualization plugin with dozens of prebuilt dashboards providing an at-a-glance overview of network protocols; and Arkime (formerly Moloch), a powerful tool for finding and identifying the network sessions comprising suspected security incidents.
- Streamlined deployment – Malcolm operates as a cluster of Docker containers: isolated sandboxes, each of which serves a dedicated function of the system. This Docker-based deployment model, combined with a few simple scripts for setup and run-time management, makes Malcolm suitable to be deployed quickly across a variety of platforms and use cases, whether it be for long-term deployment on a Linux server in a security operations center (SOC) or for incident response on a MacBook for an individual engagement.
- Secure communications – All communications with Malcolm, both from the user interface and from remote log forwarders, are secured with industry standard encryption protocols.
- Permissive license – Malcolm comprises several widely used open source tools, making it an attractive alternative to security solutions requiring paid licenses.
- Expanding control systems visibility – While Malcolm is great for general-purpose network traffic analysis, its creators see a particular need in the community for tools providing insight into protocols used in industrial control systems (ICS) environments. Ongoing Malcolm development will aim to provide additional parsers for common ICS protocols.
Although all of the open source tools which make up Malcolm are already available and in general use, Malcolm provides a framework of interconnectivity which makes it greater than the sum of its parts. And while there are many other network traffic analysis solutions out there, ranging from complete Linux distributions like Security Onion to licensed products like Splunk Enterprise Security, the creators of Malcolm feel its easy deployment and robust combination of tools fill a void in the network security space that will make network traffic analysis accessible to many in both the public and private sectors as well as individual enthusiasts.
In short, Malcolm provides an easily deployable network analysis tool suite for full packet capture artifacts (PCAP files) and Zeek logs. While Internet access is required to build it, it is not required at runtime.
You can help steer Malcolm's development by sharing your ideas and feedback. Please take a few minutes to complete this survey (hosted on Google Forms) so we can understand the members of the Malcolm community and their use cases for this tool.
- Automated Build Workflows Status
- Quick start
- Overview
- Components
- Supported Protocols
- Development
- Pre-Packaged installation files
- Preparing your system
- Running Malcolm
- Capture file and log archive upload
- Live analysis
- Arkime
- OpenSearch Dashboards
- Search Queries in Arkime and OpenSearch
- Other Malcolm features
- Ingesting Third-party Logs
- Malcolm installer ISO
- Installation example using Ubuntu 22.04 LTS
- Upgrading Malcolm
- Modifying or Contributing to Malcolm
- Copyright
- Contact
See Building from source to read how you can use GitHub workflow files to build Malcolm.
For a TL;DR
example of downloading, configuring, and running Malcolm on a Linux platform, see Installation example using Ubuntu 22.04 LTS.
The scripts to control Malcolm require Python 3. The install.py
script requires the requests module for Python 3, and will make use of the pythondialog module for user interaction (on Linux) if it is available.
The files required to build and run Malcolm are available on its GitHub page. Malcolm's source code is released under the terms of a permissive open source software license (see License.txt
for the terms of its release).
The build.sh
script can build Malcolm's Docker images from scratch. See Building from source for more information.
You must run auth_setup
prior to pulling Malcolm's Docker images. You should also ensure your system configuration and docker-compose.yml
settings are tuned by running ./scripts/install.py
or ./scripts/install.py --configure
(see System configuration and tuning).
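Putting those steps together, a typical first-time setup sequence (before pulling or starting anything) looks like this:
$ ./scripts/auth_setup
$ ./scripts/install.py --configure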
Malcolm's Docker images are periodically built and hosted on Docker Hub. If you already have Docker and Docker Compose, these prebuilt images can be pulled by navigating into the Malcolm directory (containing the docker-compose.yml
file) and running docker-compose pull
like this:
$ docker-compose pull
Pulling api ... done
Pulling arkime ... done
Pulling dashboards ... done
Pulling dashboards-helper ... done
Pulling file-monitor ... done
Pulling filebeat ... done
Pulling freq ... done
Pulling htadmin ... done
Pulling logstash ... done
Pulling name-map-ui ... done
Pulling nginx-proxy ... done
Pulling opensearch ... done
Pulling pcap-capture ... done
Pulling pcap-monitor ... done
Pulling suricata ... done
Pulling upload ... done
Pulling zeek ... done
You can then observe that the images have been retrieved by running `docker images`:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
malcolmnetsec/api 6.3.0 xxxxxxxxxxxx 3 days ago 158MB
malcolmnetsec/arkime 6.3.0 xxxxxxxxxxxx 3 days ago 816MB
malcolmnetsec/dashboards 6.3.0 xxxxxxxxxxxx 3 days ago 1.02GB
malcolmnetsec/dashboards-helper 6.3.0 xxxxxxxxxxxx 3 days ago 184MB
malcolmnetsec/filebeat-oss 6.3.0 xxxxxxxxxxxx 3 days ago 624MB
malcolmnetsec/file-monitor 6.3.0 xxxxxxxxxxxx 3 days ago 588MB
malcolmnetsec/file-upload 6.3.0 xxxxxxxxxxxx 3 days ago 259MB
malcolmnetsec/freq 6.3.0 xxxxxxxxxxxx 3 days ago 132MB
malcolmnetsec/htadmin 6.3.0 xxxxxxxxxxxx 3 days ago 242MB
malcolmnetsec/logstash-oss 6.3.0 xxxxxxxxxxxx 3 days ago 1.35GB
malcolmnetsec/name-map-ui 6.3.0 xxxxxxxxxxxx 3 days ago 143MB
malcolmnetsec/nginx-proxy 6.3.0 xxxxxxxxxxxx 3 days ago 121MB
malcolmnetsec/opensearch 6.3.0 xxxxxxxxxxxx 3 days ago 1.17GB
malcolmnetsec/pcap-capture 6.3.0 xxxxxxxxxxxx 3 days ago 121MB
malcolmnetsec/pcap-monitor 6.3.0 xxxxxxxxxxxx 3 days ago 213MB
malcolmnetsec/suricata 6.3.0 xxxxxxxxxxxx 3 days ago 278MB
malcolmnetsec/zeek 6.3.0 xxxxxxxxxxxx 3 days ago 1GB
Once built, the malcolm_appliance_packager.sh
script can be used to create pre-packaged Malcolm tarballs for import on another machine. See Pre-Packaged Installation Files for more information.
Use the scripts in the scripts/
directory to start and stop Malcolm, view debug logs of a currently running
instance, wipe the database and restore Malcolm to a fresh state, etc.
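For example:
$ ./scripts/start     # start Malcolm
$ ./scripts/logs      # follow the debug logs of the running instance
$ ./scripts/stop      # stop Malcolm
$ ./scripts/wipe      # stop Malcolm and wipe its database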
A few minutes after starting Malcolm (probably 5 to 10 minutes for Logstash to be completely up, depending on the system), the following services will be accessible:
- Arkime: https://localhost:443
- OpenSearch Dashboards: https://localhost/dashboards/ or https://localhost:5601
- Capture File and Log Archive Upload (Web): https://localhost/upload/
- Capture File and Log Archive Upload (SFTP):
sftp://<username>@127.0.0.1:8022/files
- Host and Subnet Name Mapping Editor: https://localhost/name-map-ui/
- Account Management: https://localhost:488
Malcolm processes network traffic data in the form of packet capture (PCAP) files or Zeek logs. A sensor (packet capture appliance) monitors network traffic mirrored to it over a SPAN port on a network switch or router, or using a network TAP device. Zeek logs and Arkime sessions are generated containing important session metadata from the traffic observed, which are then securely forwarded to a Malcolm instance. Full PCAP files are optionally stored locally on the sensor device for examination later.
Malcolm parses the network session data and enriches it with additional lookups and mappings including GeoIP mapping, hardware manufacturer lookups from organizationally unique identifiers (OUI) in MAC addresses, assigning names to network segments and hosts based on user-defined IP address and MAC mappings, performing TLS fingerprinting, and many others.
The enriched data is stored in an OpenSearch document store in a format suitable for analysis through two intuitive interfaces: OpenSearch Dashboards, a flexible data visualization plugin with dozens of prebuilt dashboards providing an at-a-glance overview of network protocols; and Arkime, a powerful tool for finding and identifying the network sessions comprising suspected security incidents. These tools can be accessed through a web browser from analyst workstations or for display in a security operations center (SOC). Logs can also optionally be forwarded on to another instance of Malcolm.
For smaller networks, use at home by network security enthusiasts, or in the field for incident response engagements, Malcolm can also easily be deployed locally on an ordinary consumer workstation or laptop. Malcolm can process local artifacts such as locally-generated Zeek logs, locally-captured PCAP files, and PCAP files collected offline without the use of a dedicated sensor appliance.
Malcolm leverages the following excellent open source tools, among others.
- Arkime (formerly Moloch) - for PCAP file processing, browsing, searching, analysis, and carving/exporting; Arkime itself consists of two parts:
- OpenSearch - a search and analytics engine for indexing and querying network traffic session metadata
- Logstash and Filebeat - for parsing Zeek log files and ingesting them into OpenSearch in a format that Arkime understands in the same way it natively understands PCAP data
- OpenSearch Dashboards - for creating additional ad-hoc visualizations and dashboards beyond that which is provided by Arkime viewer
- Zeek - a network analysis framework and IDS
- Suricata - an IDS and threat detection engine
- Yara - a tool used to identify and classify malware samples
- Capa - a tool for detecting capabilities in executable files
- ClamAV - an antivirus engine for scanning files extracted by Zeek
- CyberChef - a "swiss-army knife" data conversion tool
- jQuery File Upload - for uploading PCAP files and Zeek logs for processing
- List.js - for the host and subnet name mapping interface
- Docker and Docker Compose - for simple, reproducible deployment of the Malcolm appliance across environments and to coordinate communication between its various components
- Nginx - for HTTPS and reverse proxying Malcolm components
- nginx-auth-ldap - an LDAP authentication module for nginx
- Fluent Bit - for forwarding metrics to Malcolm from network sensors (packet capture appliances)
- Mark Baggett's freq - a tool for calculating entropy of strings
- Florian Roth's Signature-Base Yara ruleset
- These Zeek plugins:
- some of Amazon.com, Inc.'s ICS protocol analyzers
- Andrew Klaus's Sniffpass plugin for detecting cleartext passwords in HTTP POST requests
- Andrew Klaus's zeek-httpattacks plugin for detecting noncompliant HTTP requests
- ICS protocol analyzers for Zeek published by DHS CISA and Idaho National Lab
- Corelight's "bad neighbor" (CVE-2020-16898) plugin
- Corelight's "Log4Shell" (CVE-2021-44228) plugin
- Corelight's "OMIGOD" (CVE-2021-38647) plugin
- Corelight's Apache HTTP server 2.4.49-2.4.50 path traversal/RCE vulnerability (CVE-2021-41773) plugin
- Corelight's bro-xor-exe plugin
- Corelight's callstranger-detector plugin
- Corelight's community ID flow hashing plugin
- Corelight's DCE/RPC remote code execution vulnerability (CVE-2022-26809) plugin
- Corelight's HTTP More Filenames plugin
- Corelight's HTTP protocol stack vulnerability (CVE-2021-31166) plugin
- Corelight's pingback plugin
- Corelight's ripple20 plugin
- Corelight's SIGred plugin
- Corelight's VMware Workspace ONE Access and Identity Manager RCE vulnerability (CVE-2022-22954) plugin
- Corelight's Zerologon plugin
- Corelight's Microsoft Excel privilege escalation detection (CVE-2021-42292) plugin
- J-Gras' Zeek::AF_Packet plugin
- Johanna Amann's CVE-2020-0601 ECC certificate validation plugin and CVE-2020-13777 GnuTLS unencrypted session ticket detection plugin
- Lexi Brent's EternalSafety plugin
- MITRE Cyber Analytics Repository's Bro/Zeek ATT&CK®-Based Analytics (BZAR) script
- Salesforce's gQUIC analyzer
- Salesforce's HASSH SSH fingerprinting plugin
- Salesforce's JA3 TLS fingerprinting plugin
- Zeek's Spicy plugin framework
- GeoLite2 - Malcolm includes GeoLite2 data created by MaxMind
Malcolm uses Zeek and Arkime to analyze network traffic. These tools provide varying degrees of visibility into traffic transmitted over the following network protocols:
Traffic | Wiki | Organization/Specification | Arkime | Zeek |
---|---|---|---|---|
Internet layer | 🔗 | 🔗 | ✓ | ✓ |
Border Gateway Protocol (BGP) | 🔗 | 🔗 | ✓ | |
Building Automation and Control (BACnet) | 🔗 | 🔗 | | ✓ |
Bristol Standard Asynchronous Protocol (BSAP) | 🔗 | 🔗 🔗 | | ✓ |
Distributed Computing Environment / Remote Procedure Calls (DCE/RPC) | 🔗 | 🔗 | | ✓ |
Dynamic Host Configuration Protocol (DHCP) | 🔗 | 🔗 | ✓ | ✓ |
Distributed Network Protocol 3 (DNP3) | 🔗 | 🔗 | ✓ | ✓ |
Domain Name System (DNS) | 🔗 | 🔗 | ✓ | ✓ |
EtherCAT | 🔗 | 🔗 | | ✓ |
EtherNet/IP / Common Industrial Protocol (CIP) | 🔗 🔗 | 🔗 | | ✓ |
FTP (File Transfer Protocol) | 🔗 | 🔗 | | ✓ |
GENISYS | | 🔗 🔗 | | ✓ |
Google Quick UDP Internet Connections (gQUIC) | 🔗 | 🔗 | ✓ | ✓ |
Hypertext Transfer Protocol (HTTP) | 🔗 | 🔗 | ✓ | ✓ |
IPsec | 🔗 | 🔗 | | ✓ |
Internet Relay Chat (IRC) | 🔗 | 🔗 | ✓ | ✓ |
Lightweight Directory Access Protocol (LDAP) | 🔗 | 🔗 | ✓ | ✓ |
Kerberos | 🔗 | 🔗 | ✓ | ✓ |
Modbus | 🔗 | 🔗 | ✓ | ✓ |
MQ Telemetry Transport (MQTT) | 🔗 | 🔗 | | ✓ |
MySQL | 🔗 | 🔗 | ✓ | ✓ |
NT Lan Manager (NTLM) | 🔗 | 🔗 | | ✓ |
Network Time Protocol (NTP) | 🔗 | 🔗 | | ✓ |
Oracle | 🔗 | 🔗 | ✓ | |
Open Platform Communications Unified Architecture (OPC UA) Binary | 🔗 | 🔗 | | ✓ |
Open Shortest Path First (OSPF) | 🔗 | 🔗 🔗 🔗 | | ✓ |
OpenVPN | 🔗 | 🔗 🔗 | | ✓ |
PostgreSQL | 🔗 | 🔗 | ✓ | |
Process Field Net (PROFINET) | 🔗 | 🔗 | | ✓ |
Remote Authentication Dial-In User Service (RADIUS) | 🔗 | 🔗 | ✓ | ✓ |
Remote Desktop Protocol (RDP) | 🔗 | 🔗 | | ✓ |
Remote Framebuffer (RFB) | 🔗 | 🔗 | | ✓ |
S7comm / Connection Oriented Transport Protocol (COTP) | 🔗 🔗 | 🔗 🔗 | | ✓ |
Secure Shell (SSH) | 🔗 | 🔗 | ✓ | ✓ |
Secure Sockets Layer (SSL) / Transport Layer Security (TLS) | 🔗 | 🔗 | ✓ | ✓ |
Session Initiation Protocol (SIP) | 🔗 | 🔗 | | ✓ |
Server Message Block (SMB) / Common Internet File System (CIFS) | 🔗 | 🔗 | ✓ | ✓ |
Simple Mail Transfer Protocol (SMTP) | 🔗 | 🔗 | ✓ | ✓ |
Simple Network Management Protocol (SNMP) | 🔗 | 🔗 | ✓ | ✓ |
SOCKS | 🔗 | 🔗 | ✓ | ✓ |
STUN (Session Traversal Utilities for NAT) | 🔗 | 🔗 | ✓ | ✓ |
Syslog | 🔗 | 🔗 | ✓ | ✓ |
Tabular Data Stream (TDS) | 🔗 | 🔗 🔗 | ✓ | ✓ |
Telnet / remote shell (rsh) / remote login (rlogin) | 🔗 🔗 | 🔗 🔗 | ✓ | ✓❋ |
TFTP (Trivial File Transfer Protocol) | 🔗 | 🔗 | | ✓ |
WireGuard | 🔗 | 🔗 🔗 | | ✓ |
various tunnel protocols (e.g., GTP, GRE, Teredo, AYIYA, IP-in-IP, etc.) | 🔗 | | ✓ | ✓ |
Additionally, Zeek is able to detect and, where possible, log the type, vendor and version of various other software protocols.
As part of its network traffic analysis, Zeek can extract and analyze files transferred across the protocols it understands. In addition to generating logs for transferred files, deeper analysis is done into the following file types:
- Portable executable files
- X.509 certificates
See automatic file extraction and scanning for additional features related to file scanning.
See Zeek log integration for more information on how Malcolm integrates Arkime sessions and Zeek logs for analysis.
Checking out the Malcolm source code results in the following subdirectories in your malcolm/
working copy:
- `api` - code and configuration for the `api` container which provides a REST API to query Malcolm
- `arkime` - code and configuration for the `arkime` container which processes PCAP files using `capture` and which serves the Viewer application
- `arkime-logs` - an initially empty directory to which the `arkime` container will write some debug log files
- `arkime-raw` - an initially empty directory to which the `arkime` container will write captured PCAP files; because Arkime, as employed by Malcolm, is currently used for processing previously-captured PCAP files, this directory is currently unused
- `Dockerfiles` - a directory containing build instructions for Malcolm's Docker images
- `docs` - a directory containing instructions and documentation
- `opensearch` - an initially empty directory where the OpenSearch database instance will reside
- `opensearch-backup` - an initially empty directory for storing OpenSearch index snapshots
- `filebeat` - code and configuration for the `filebeat` container which ingests Zeek logs and forwards them to the `logstash` container
- `file-monitor` - code and configuration for the `file-monitor` container which can scan files extracted by Zeek
- `file-upload` - code and configuration for the `upload` container which serves a web browser-based upload form for uploading PCAP files and Zeek logs, and which serves an SFTP share as an alternate method for upload
- `freq-server` - code and configuration for the `freq` container used for calculating entropy of strings
- `htadmin` - configuration for the `htadmin` user account management container
- `dashboards` - code and configuration for the `dashboards` container for creating additional ad-hoc visualizations and dashboards beyond those provided by Arkime Viewer
- `logstash` - code and configuration for the `logstash` container which parses Zeek logs and forwards them to the `opensearch` container
- `malcolm-iso` - code and configuration for building an installer ISO for a minimal Debian-based Linux installation for running Malcolm
- `name-map-ui` - code and configuration for the `name-map-ui` container which provides the host and subnet name mapping interface
- `nginx` - configuration for the `nginx` reverse proxy container
- `pcap` - an initially empty directory for PCAP files to be uploaded, processed, and stored
- `pcap-capture` - code and configuration for the `pcap-capture` container which can capture network traffic
- `pcap-monitor` - code and configuration for the `pcap-monitor` container which watches for new or uploaded PCAP files and notifies the other services to process them
- `scripts` - control scripts for starting, stopping, restarting, etc. Malcolm
- `sensor-iso` - code and configuration for building a Hedgehog Linux ISO
- `shared` - miscellaneous code used by various Malcolm components
- `suricata` - code and configuration for the `suricata` container which handles PCAP processing using Suricata
- `suricata-logs` - an initially empty directory for Suricata logs to be uploaded, processed, and stored
- `zeek` - code and configuration for the `zeek` container which handles PCAP processing using Zeek
- `zeek-logs` - an initially empty directory for Zeek logs to be uploaded, processed, and stored
and the following files of special note:
- `auth.env` - the script `./scripts/auth_setup` prompts the user for the administrator credentials used by the Malcolm appliance, and `auth.env` is the environment file where those values are stored
- `cidr-map.txt` - specify custom IP address to network segment mapping
- `host-map.txt` - specify custom IP and/or MAC address to host mapping
- `net-map.json` - an alternative to `cidr-map.txt` and `host-map.txt`, mapping hosts and network segments to their names in a JSON-formatted file
- `docker-compose.yml` - the configuration file used by `docker-compose` to build, start, and stop an instance of the Malcolm appliance
- `docker-compose-standalone.yml` - similar to `docker-compose.yml`, only used for the "packaged" installation of Malcolm
Building the Malcolm docker images from scratch requires internet access to pull source files for its components. Once internet access is available, execute the following command to build all of the Docker images used by the Malcolm appliance:
$ ./scripts/build.sh
Then, go take a walk or something since it will be a while. When you're done, you can run docker images
and see that you have fresh images for:
- `malcolmnetsec/api` (based on `python:3-slim`)
- `malcolmnetsec/arkime` (based on `debian:11-slim`)
- `malcolmnetsec/dashboards-helper` (based on `alpine:3.16`)
- `malcolmnetsec/dashboards` (based on `opensearchproject/opensearch-dashboards`)
- `malcolmnetsec/file-monitor` (based on `debian:11-slim`)
- `malcolmnetsec/file-upload` (based on `debian:11-slim`)
- `malcolmnetsec/filebeat-oss` (based on `docker.elastic.co/beats/filebeat-oss`)
- `malcolmnetsec/freq` (based on `debian:11-slim`)
- `malcolmnetsec/htadmin` (based on `debian:11-slim`)
- `malcolmnetsec/logstash-oss` (based on `opensearchproject/logstash-oss-with-opensearch-output-plugin`)
- `malcolmnetsec/name-map-ui` (based on `alpine:3.16`)
- `malcolmnetsec/nginx-proxy` (based on `alpine:3.16`)
- `malcolmnetsec/opensearch` (based on `opensearchproject/opensearch`)
- `malcolmnetsec/pcap-capture` (based on `debian:11-slim`)
- `malcolmnetsec/pcap-monitor` (based on `debian:11-slim`)
- `malcolmnetsec/suricata` (based on `debian:11-slim`)
- `malcolmnetsec/zeek` (based on `debian:11-slim`)
Alternately, if you have forked Malcolm on GitHub, workflow files are provided which contain instructions for GitHub to build the docker images and sensor and Malcolm installer ISOs. The resulting images are named according to the pattern ghcr.io/owner/malcolmnetsec/image:branch
(e.g., if you've forked Malcolm with the github user romeogdetlevjr
, the arkime container built for the main branch would be named ghcr.io/romeogdetlevjr/malcolmnetsec/arkime:main
). To run your local instance of Malcolm using these images instead of the official ones, you'll need to edit your docker-compose.yml
file(s) and replace the image:
tags according to this new pattern, or use the bash helper script ./shared/bin/github_image_helper.sh
to pull and re-tag the images.
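For instance, continuing the romeogdetlevjr example above, a quick-and-dirty re-tagging could be done with a sed one-liner (a sketch only; it assumes one image: line per service in docker-compose.yml, so verify the result, or prefer the github_image_helper.sh helper script):
$ sed -i -E 's|image: ?malcolmnetsec/([^:]+):.*|image: ghcr.io/romeogdetlevjr/malcolmnetsec/\1:main|' docker-compose.yml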
scripts/malcolm_appliance_packager.sh
can be run to package up the configuration files (and, if necessary, the Docker images) which can be copied to a network share or USB drive for distribution to non-networked machines. For example:
$ ./scripts/malcolm_appliance_packager.sh
You must set a username and password for Malcolm, and self-signed X.509 certificates will be generated
Store administrator username/password for local Malcolm access? (Y/n): y
Administrator username: analyst
analyst password:
analyst password (again):
(Re)generate self-signed certificates for HTTPS access (Y/n): y
(Re)generate self-signed certificates for a remote log forwarder (Y/n): y
Store username/password for primary remote OpenSearch instance? (y/N): n
Store username/password for secondary remote OpenSearch instance? (y/N): n
Store username/password for email alert sender account? (y/N): n
Packaged Malcolm to "/home/user/tmp/malcolm_20190513_101117_f0d052c.tar.gz"
Do you need to package docker images also [y/N]? y
This might take a few minutes...
Packaged Malcolm docker images to "/home/user/tmp/malcolm_20190513_101117_f0d052c_images.tar.gz"
To install Malcolm:
1. Run install.py
2. Follow the prompts
To start, stop, restart, etc. Malcolm:
Use the control scripts in the "scripts/" directory:
- start (start Malcolm)
- stop (stop Malcolm)
- restart (restart Malcolm)
- logs (monitor Malcolm logs)
- wipe (stop Malcolm and clear its database)
- auth_setup (change authentication-related settings)
A minute or so after starting Malcolm, the following services will be accessible:
- Arkime: https://localhost/
- OpenSearch Dashboards: https://localhost/dashboards/
- PCAP upload (web): https://localhost/upload/
- PCAP upload (sftp): sftp://[email protected]:8022/files/
- Host and subnet name mapping editor: https://localhost/name-map-ui/
- Account management: https://localhost:488/
The above example will result in the following artifacts for distribution as explained in the script's output:
$ ls -lh
total 2.0G
-rwxr-xr-x 1 user user 61k May 13 11:32 install.py
-rw-r--r-- 1 user user 2.0G May 13 11:37 malcolm_20190513_101117_f0d052c_images.tar.gz
-rw-r--r-- 1 user user 683 May 13 11:37 malcolm_20190513_101117_f0d052c.README.txt
-rw-r--r-- 1 user user 183k May 13 11:32 malcolm_20190513_101117_f0d052c.tar.gz
If you have obtained pre-packaged installation files to install Malcolm on a non-networked machine via an internal network share or on a USB key, you likely have the following files:
- `malcolm_YYYYMMDD_HHNNSS_xxxxxxx.README.txt` - this readme file contains minimal setup instructions for extracting the contents of the other tarballs and running the Malcolm appliance
- `malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz` - this tarball contains the configuration files and directory configuration used by an instance of Malcolm. It can be extracted via `tar -xf malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz`, upon which a directory will be created (named similarly to the tarball) containing the directories and configuration files. Alternatively, `install.py` can accept this filename as an argument and handle its extraction and initial configuration for you.
- `malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz` - this tarball contains the Docker images used by Malcolm. It can be imported manually via `docker load -i malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz`
- `install.py` - this install script can load the Docker images and extract Malcolm configuration files from the aforementioned tarballs and do some initial configuration for you
Run install.py malcolm_XXXXXXXX_XXXXXX_XXXXXXX.tar.gz
and follow the prompts. If you do not already have Docker and Docker Compose installed, the install.py
script will help you install them.
Malcolm runs on top of Docker which runs on recent releases of Linux, Apple macOS and Microsoft Windows 10.
To quote the Elasticsearch documentation, "If there is one resource that you will run out of first, it will likely be memory." The same is true for Malcolm: you will want at least 16 gigabytes of RAM to run Malcolm comfortably. For processing large volumes of traffic, I'd recommend at a bare minimum a dedicated server with 16 cores and 16 gigabytes of RAM. Malcolm can run on less, but more is better. You're going to want as much hard drive space as possible, of course, as the amount of PCAP data you're able to analyze and store will be limited by your hard drive.
Arkime's wiki has a couple of documents (here and here and here and a calculator here) which may be helpful, although not everything in those documents will apply to a Docker-based setup like Malcolm.
If you already have Docker and Docker Compose installed, the install.py
script can still help you tune system configuration and docker-compose.yml
parameters for Malcolm. To run it in "configuration only" mode, bypassing the steps to install Docker and Docker Compose, run it like this:
./scripts/install.py --configure
Although install.py
will attempt to automate many of the following configuration and tuning parameters, they are nonetheless listed in the following sections for reference:
Edit docker-compose.yml
and search for the OPENSEARCH_JAVA_OPTS
key. Edit the -Xms4g -Xmx4g
values, replacing 4g
with a number that is half of your total system memory, or just under 32 gigabytes, whichever is less. So, for example, if I had 64 gigabytes of memory I would edit those values to be -Xms31g -Xmx31g
. This indicates how much memory can be allocated to the OpenSearch heaps. For a pleasant experience, I would suggest not using a value under 10 gigabytes. Similar values can be modified for Logstash with LS_JAVA_OPTS
, where using 3 or 4 gigabytes is recommended.
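For example, on a host with 64 gigabytes of memory, the heap bump described above could be applied with a quick in-place edit (a sketch; it assumes the stock -Xms4g -Xmx4g values are still present in docker-compose.yml):
# raise the OpenSearch heap from the 4 GB default to 31 GB on a 64 GB host
$ sed -i 's/-Xms4g -Xmx4g/-Xms31g -Xmx31g/' docker-compose.yml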
Various other environment variables inside of docker-compose.yml
can be tweaked to control aspects of how Malcolm behaves, particularly with regards to processing PCAP files and Zeek logs. The environment variables of particular interest are located near the top of that file under Commonly tweaked configuration options, which include:
- `ARKIME_ANALYZE_PCAP_THREADS` – the number of threads available to Arkime for analyzing PCAP files (default `1`)
- `AUTO_TAG` – if set to `true`, Malcolm will automatically create Arkime sessions and Zeek logs with tags based on the filename, as described in Tagging (default `true`)
- `BEATS_SSL` – if set to `true`, Logstash will require encrypted communications for any external Beats-based forwarders from which it will accept logs (default `true`)
- `CONNECTION_SECONDS_SEVERITY_THRESHOLD` - when severity scoring is enabled, this variable indicates the duration threshold (in seconds) for assigning severity to long connections (default `3600`)
- `EXTRACTED_FILE_CAPA_VERBOSE` – if set to `true`, all Capa rule hits will be logged; otherwise (`false`) only MITRE ATT&CK® technique classifications will be logged
- `EXTRACTED_FILE_ENABLE_CAPA` – if set to `true`, Zeek-extracted files that are determined to be PE (portable executable) files will be scanned with Capa
- `EXTRACTED_FILE_ENABLE_CLAMAV` – if set to `true`, Zeek-extracted files will be scanned with ClamAV
- `EXTRACTED_FILE_ENABLE_YARA` – if set to `true`, Zeek-extracted files will be scanned with Yara
- `EXTRACTED_FILE_HTTP_SERVER_ENABLE` – if set to `true`, the directory containing Zeek-extracted files will be served over HTTP at `./extracted-files/` (e.g., https://localhost/extracted-files/ if you are connecting locally)
- `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` – if set to `true`, those Zeek-extracted files will be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`)
- `EXTRACTED_FILE_HTTP_SERVER_KEY` – specifies the AES-256-CBC decryption password for encrypted Zeek-extracted files; used in conjunction with `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT`
- `EXTRACTED_FILE_IGNORE_EXISTING` – if set to `true`, files already present in the `./zeek-logs/extract_files/` directory will be ignored on startup rather than scanned
- `EXTRACTED_FILE_PRESERVATION` – determines behavior for preservation of Zeek-extracted files
- `EXTRACTED_FILE_UPDATE_RULES` – if set to `true`, file scanner engines (e.g., ClamAV, Capa, Yara) will periodically update their rule definitions
- `EXTRACTED_FILE_YARA_CUSTOM_ONLY` – if set to `true`, Malcolm will bypass the default Yara ruleset and use only user-defined rules in `./yara/rules`
- `FREQ_LOOKUP` - if set to `true`, domain names (from DNS queries and SSL server names) will be assigned entropy scores as calculated by `freq` (default `false`)
- `FREQ_SEVERITY_THRESHOLD` - when severity scoring is enabled, this variable indicates the entropy threshold for assigning severity to events with entropy scores calculated by `freq`; a lower value will assign severity scores only to domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`) (default `2.0`)
- `LOGSTASH_OUI_LOOKUP` – if set to `true`, Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs (default `true`)
- `LOGSTASH_REVERSE_DNS` – if set to `true`, Logstash will perform a reverse DNS lookup for all external source and destination IP address values when analyzing Zeek logs (default `false`)
- `LOGSTASH_SEVERITY_SCORING` - if set to `true`, Logstash will perform severity scoring when analyzing Zeek logs (default `true`)
- `MANAGE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default `false`)
- `MAXMIND_GEOIP_DB_LICENSE_KEY` - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are no longer available for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge from MaxMind). The license key can be specified here for GeoIP database downloads during build- and run-time.
- `OPENSEARCH_LOCAL` - if set to `true`, Malcolm will use its own internal OpenSearch instance (default `true`)
- `OPENSEARCH_URL` - when using Malcolm's internal OpenSearch instance (i.e., `OPENSEARCH_LOCAL` is `true`) this should be `http://opensearch:9200`, otherwise this value specifies the primary remote instance URL in the format `protocol://host:port` (default `http://opensearch:9200`)
- `OPENSEARCH_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the primary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`)
- `OPENSEARCH_SECONDARY` - if set to `true`, Malcolm will forward logs to a secondary remote OpenSearch instance in addition to the primary (local or remote) OpenSearch instance (default `false`)
- `OPENSEARCH_SECONDARY_URL` - when forwarding to a secondary remote OpenSearch instance (i.e., `OPENSEARCH_SECONDARY` is `true`) this value specifies the secondary remote instance URL in the format `protocol://host:port`
- `OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the secondary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`)
- `NGINX_BASIC_AUTH` - if set to `true`, use TLS-encrypted HTTP basic authentication (default); if set to `false`, use Lightweight Directory Access Protocol (LDAP) authentication
- `NGINX_LOG_ACCESS_AND_ERRORS` - if set to `true`, all access to Malcolm via its web interfaces will be logged to OpenSearch (default `false`)
- `NGINX_SSL` - if set to `true`, require HTTPS connections to Malcolm's `nginx-proxy` container (default); if set to `false`, use unencrypted HTTP connections (using unsecured HTTP connections is NOT recommended unless you are running Malcolm behind another reverse proxy like Traefik, Caddy, etc.)
- `PCAP_ENABLE_NETSNIFF` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using netsniff-ng
- `PCAP_ENABLE_TCPDUMP` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using tcpdump; there is no reason to enable both `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`
- `PCAP_FILTER` – specifies a tcpdump-style filter expression for local packet capture; leave blank to capture all traffic
- `PCAP_IFACE` – used to specify the network interface(s) for local packet capture if `PCAP_ENABLE_NETSNIFF`, `PCAP_ENABLE_TCPDUMP`, `ZEEK_LIVE_CAPTURE` or `SURICATA_LIVE_CAPTURE` are enabled; for multiple interfaces, separate the interface names with a comma (e.g., `'enp0s25'` or `'enp10s0,enp11s0'`)
- `PCAP_IFACE_TWEAK` - if set to `true`, Malcolm will use `ethtool` to disable NIC hardware offloading features and adjust ring buffer sizes for capture interface(s); this should be `true` if the interface(s) are being used for capture only, `false` if they are being used for management/communication
- `PCAP_ROTATE_MEGABYTES` – used to specify how large a locally-captured PCAP file can become (in megabytes) before it is closed for processing and a new PCAP file created
- `PCAP_ROTATE_MINUTES` – used to specify a time interval (in minutes) after which a locally-captured PCAP file will be closed for processing and a new PCAP file created
- `pipeline.workers`, `pipeline.batch.size` and `pipeline.batch.delay` - these settings are used to tune the performance and resource utilization of the `logstash` container; see Tuning and Profiling Logstash Performance, `logstash.yml` and Multiple Pipelines
- `PUID` and `PGID` - Docker runs all of its containers as the privileged `root` user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The `PUID` (process user ID) and `PGID` (process group ID) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding user account on the host.
- `SENSITIVE_COUNTRY_CODES` - when severity scoring is enabled, this variable defines a comma-separated list of sensitive countries (using ISO 3166-1 alpha-2 codes) (default `'AM,AZ,BY,CN,CU,DZ,GE,HK,IL,IN,IQ,IR,KG,KP,KZ,LY,MD,MO,PK,RU,SD,SS,SY,TJ,TM,TW,UA,UZ'`, taken from the U.S. Department of Energy Sensitive Country List)
- `SURICATA_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Suricata, and the resulting logs will also be imported (default `false`)
- `SURICATA_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Suricata logs (default `1`)
- `SURICATA_CUSTOM_RULES_ONLY` – if set to `true`, Malcolm will bypass the default Suricata ruleset and use only user-defined rules (`./suricata/rules/*.rules`)
- `SURICATA_UPDATE_RULES` – if set to `true`, Suricata signatures will periodically be updated (default `false`)
- `SURICATA_LIVE_CAPTURE` - if set to `true`, Suricata will monitor live traffic on the local interface(s) defined by `PCAP_IFACE`
- `SURICATA_ROTATED_PCAP` - if set to `true`, Suricata can analyze PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `SURICATA_AUTO_ANALYZE_PCAP_FILES`); if `SURICATA_LIVE_CAPTURE` is `true`, this should be `false`; otherwise Suricata will see duplicate traffic
- `SURICATA_…` - the `suricata` container entrypoint script can use many more environment variables to tweak `suricata.yaml`; in that script, `DEFAULT_VARS` defines those variables (albeit without the `SURICATA_` prefix you must add to each for use)
- `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` - when severity scoring is enabled, this variable indicates the size threshold (in megabytes) for assigning severity to large connections or file transfers (default `1000`)
- `VTOT_API2_KEY` – used to specify a VirusTotal Public API v2.0 key, which, if specified, will be used to submit hashes of Zeek-extracted files to VirusTotal
- `ZEEK_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default `false`)
- `ZEEK_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Zeek logs (default `1`)
- `ZEEK_DISABLE_…` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext passwords)
- `ZEEK_DISABLE_BEST_GUESS_ICS` - see "Best Guess" Fingerprinting for ICS Protocols
- `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see Automatic file extraction and scanning for more details
- `ZEEK_INTEL_FEED_SINCE` - when querying a TAXII or MISP feed, only process threat indicators that have been created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or a relative interval (`30 days ago`)
- `ZEEK_INTEL_ITEM_EXPIRATION` - specifies the value for Zeek's `Intel::item_expiration` timeout as used by the Zeek Intelligence Framework (default `-1min`, which disables item expiration)
- `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a cron expression indicating the refresh interval for generating the Zeek Intelligence Framework files (defaults to empty, which disables automatic refresh)
- `ZEEK_LIVE_CAPTURE` - if set to `true`, Zeek will monitor live traffic on the local interface(s) defined by `PCAP_IFACE`
- `ZEEK_ROTATED_PCAP` - if set to `true`, Zeek can analyze PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `ZEEK_AUTO_ANALYZE_PCAP_FILES`); if `ZEEK_LIVE_CAPTURE` is `true`, this should be `false`; otherwise Zeek will see duplicate traffic
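To illustrate tweaking one of these values, the following enables automatic Zeek analysis of uploaded PCAP files (a sketch; it assumes the variable appears in docker-compose.yml as ZEEK_AUTO_ANALYZE_PCAP_FILES : 'false', so check your file's exact formatting first):
$ sed -i "s/\(ZEEK_AUTO_ANALYZE_PCAP_FILES *: *\)'false'/\1'true'/" docker-compose.yml
Restart Malcolm afterwards for the change to take effect.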
Docker installation instructions vary slightly by distribution. Please follow the links below to docker.com to find the instructions specific to your distribution:
After installing Docker, because Malcolm should be run as a non-root user, add your user to the docker
group with something like:
$ sudo usermod -aG docker yourusername
Following this, either reboot or log out then log back in.
Docker starts automatically on DEB-based distributions. On RPM-based distributions, you need to start it manually or enable it using the appropriate systemctl
or service
command(s).
You can test docker by running docker info
, or (assuming you have internet access), docker run --rm hello-world
.
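A healthy installation responds to the latter with output that includes the following (abridged):
$ docker run --rm hello-world
...
Hello from Docker!
This message shows that your installation appears to be working correctly.
...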
Please follow this link on docker.com for instructions on installing docker-compose.
The host system (i.e., the one running Docker) will need to be configured for the best possible OpenSearch performance. Here are a few suggestions for Linux hosts (these may vary from distribution to distribution):
- Append the following lines to
/etc/sysctl.conf
:
# the maximum number of open file handles
fs.file-max=2097152
# increase maximums for inotify watches
fs.inotify.max_user_watches=131072
fs.inotify.max_queued_events=131072
fs.inotify.max_user_instances=512
# the maximum number of memory map areas a process may have
vm.max_map_count=262144
# decrease "swappiness" (swapping out runtime memory vs. dropping pages)
vm.swappiness=1
# the maximum number of incoming connections
net.core.somaxconn=65535
# the % of system memory fillable with "dirty" pages before flushing
vm.dirty_background_ratio=40
# maximum % of dirty system memory before committing everything
vm.dirty_ratio=80
- Depending on your distribution, create either the file
/etc/security/limits.d/limits.conf
containing:
# the maximum number of open file handles
* soft nofile 65535
* hard nofile 65535
# do not limit the size of memory that can be locked
* soft memlock unlimited
* hard memlock unlimited
OR the file /etc/systemd/system.conf.d/limits.conf
containing:
[Manager]
# the maximum number of open file handles
DefaultLimitNOFILE=65535:65535
# do not limit the size of memory that can be locked
DefaultLimitMEMLOCK=infinity
- Change the readahead value for the disk where the OpenSearch data will be stored. There are a few ways to do this. For example, you could add this line to
/etc/rc.local
(replacing/dev/sda
with your disk block descriptor):
# change disk read-ahead value (# of blocks)
blockdev --setra 512 /dev/sda
- Change the I/O scheduler to `deadline` or `noop`. Again, this can be done in a variety of ways. The simplest is to add `elevator=deadline` to the arguments in `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, then running `sudo update-grub2`
- If you are planning on using very large data sets, consider formatting the drive containing the `opensearch` volume as XFS.
After making all of these changes, do a reboot for good measure!
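If you'd like to confirm that the kernel settings above survived the reboot, sysctl can read them back:
$ sysctl vm.max_map_count vm.swappiness
vm.max_map_count = 262144
vm.swappiness = 1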
The install.py
script will attempt to guide you through the installation of Docker and Docker Compose if they are not present. If that works for you, you can skip ahead to Configure docker daemon option in this section.
The easiest way to install and maintain docker on Mac is using the Homebrew cask. Execute the following in a terminal.
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
$ brew install cask
$ brew tap homebrew/cask-versions
$ brew cask install docker-edge
This will install the latest version of docker and docker-compose. It can be upgraded later using brew
as well:
$ brew cask upgrade --no-quarantine docker-edge
You can now run docker from the Applications folder.
Some changes should be made for performance (this link gives a good succinct overview).
- Resource allocation - For a good experience, you likely need at least a quad-core MacBook Pro with 16GB RAM and an SSD. I have run Malcolm on an older 2013 MacBook Pro with 8GB of RAM, but the more the better. Go in your system tray and select Docker → Preferences → Advanced. Set the resources available to docker to at least 4 CPUs and 8GB of RAM (>= 16GB is preferable).
- Volume mount performance - You can speed up performance of volume mounts by removing unused paths from Docker → Preferences → File Sharing. For example, if you're only going to be mounting volumes under your home directory, you could share `/Users` but remove other paths.
After making these changes, right click on the Docker 🐋 icon in the system tray and select Restart.
Installing and configuring Docker to run under Windows must be done manually, rather than through the install.py
script as is done for Linux and macOS.
- Be running Windows 10, version 1903 or higher
- Prepare your system and install WSL and a Linux distribution by running
wsl --install -d Debian
in PowerShell as Administrator (these instructions are tested with Debian, but may work with other distributions) - Install Docker Desktop for Windows either by downloading the installer from the official Docker site or installing it through chocolatey.
- Follow the Docker Desktop WSL 2 backend instructions to finish configuration and review best practices
- Reboot
- Open the WSL distribution's terminal and run `docker info` to make sure Docker is running
Once Docker is installed, configured and running as described in the previous section, run ./scripts/install.py --configure
to finish configuration of the local Malcolm installation. Malcolm will be controlled and run from within your WSL distribution's terminal environment.
Malcolm's default standalone configuration is to use a local OpenSearch instance in a Docker container to index and search network traffic metadata. OpenSearch can also run as a cluster with instances distributed across multiple nodes with dedicated roles like cluster manager, data node, ingest node, etc.
As the permutations of OpenSearch cluster configurations are numerous, it is beyond Malcolm's scope to set up multi-node clusters. However, Malcolm can be configured to use a remote OpenSearch cluster rather than its own internal instance.
The OPENSEARCH_…
environment variables in docker-compose.yml
control whether Malcolm uses its own local OpenSearch instance or a remote OpenSearch instance as its primary data store. The configuration portion of Malcolm install script (./scripts/install.py --configure
) can help you configure these options.
For example, to use the default standalone configuration, answer Y
when prompted Should Malcolm use and maintain its own OpenSearch instance?
.
Or, to use a remote OpenSearch cluster:
…
Should Malcolm use and maintain its own OpenSearch instance? (Y/n): n
Enter primary remote OpenSearch connection URL (e.g., https://192.168.1.123:9200): https://192.168.1.123:9200
Require SSL certificate validation for communication with primary OpenSearch instance? (y/N): n
You must run auth_setup after install.py to store OpenSearch connection credentials.
…
Whether the primary OpenSearch instance is a locally maintained single-node instance or is a remote cluster, Malcolm can additionally be configured to forward logs to a secondary remote OpenSearch instance. The OPENSEARCH_SECONDARY_…
environment variables in docker-compose.yml
control this behavior. Configuration of a remote secondary OpenSearch instance is similar to that of a remote primary OpenSearch instance:
…
Forward Logstash logs to a secondary remote OpenSearch instance? (y/N): y
Enter secondary remote OpenSearch connection URL (e.g., https://192.168.1.123:9200): https://192.168.1.124:9200
Require SSL certificate validation for communication with secondary OpenSearch instance? (y/N): n
You must run auth_setup after install.py to store OpenSearch connection credentials.
…
In addition to setting the environment variables in docker-compose.yml
as described above, you must provide Malcolm with credentials for it to be able to communicate with remote OpenSearch instances. These credentials are stored in the Malcolm installation directory as .opensearch.primary.curlrc
and .opensearch.secondary.curlrc
for the primary and secondary OpenSearch connections, respectively, and are bind mounted into the Docker containers which need to communicate with OpenSearch. These cURL-formatted config files can be generated for you by the auth_setup
script as illustrated:
$ ./scripts/auth_setup
…
Store username/password for primary remote OpenSearch instance? (y/N): y
OpenSearch username: servicedb
servicedb password:
servicedb password (again):
Require SSL certificate validation for OpenSearch communication? (Y/n): n
Store username/password for secondary remote OpenSearch instance? (y/N): y
OpenSearch username: remotedb
remotedb password:
remotedb password (again):
Require SSL certificate validation for OpenSearch communication? (Y/n): n
…
These files are created with permissions such that only the user account running Malcolm can access them:
$ ls -la .opensearch.*.curlrc
-rw------- 1 user user 36 Aug 22 14:17 .opensearch.primary.curlrc
-rw------- 1 user user 35 Aug 22 14:18 .opensearch.secondary.curlrc
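These files use standard cURL configuration syntax. Purely as an illustration (this is not the literal output of auth_setup), a primary connection file storing the servicedb account from the example above might look like:
# .opensearch.primary.curlrc (illustrative)
user = "servicedb:their_password"
insecure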
One caveat with Malcolm using a remote OpenSearch cluster as its primary document store is that the accounts used to access Malcolm's web interfaces, particularly OpenSearch Dashboards, are in some instances passed directly through to OpenSearch itself. For this reason, both Malcolm and the remote primary OpenSearch instance must have the same account information. The easiest way to accomplish this is to use an Active Directory/LDAP server that both Malcolm and OpenSearch use as a common authentication backend.
See the OpenSearch documentation on access control for more information.
Malcolm requires authentication to access the user interface. Nginx can authenticate users with either local TLS-encrypted HTTP basic authentication or using a remote Lightweight Directory Access Protocol (LDAP) authentication server.
With the local basic authentication method, user accounts are managed by Malcolm and can be created, modified, and deleted using a user management web interface. This method is suitable in instances where accounts and credentials do not need to be synced across many Malcolm installations.
With LDAP authentication, user accounts are managed on a remote directory service, such as Microsoft Active Directory Domain Services or OpenLDAP.
Malcolm's authentication method is defined in the x-auth-variables
section near the top of the docker-compose.yml
file with the NGINX_BASIC_AUTH
environment variable: true
for local TLS-encrypted HTTP basic authentication, false
for LDAP authentication.
In either case, you must run ./scripts/auth_setup
before starting Malcolm for the first time in order to:
- define the local Malcolm administrator account username and password (although these credentials will only be used for basic authentication, not LDAP authentication)
- specify whether or not to (re)generate the self-signed certificates used for HTTPS access
    - key and certificate files are located in the `nginx/certs/` directory
- specify whether or not to (re)generate the self-signed certificates used by a remote log forwarder (see the `BEATS_SSL` environment variable above)
    - certificate authority, certificate, and key files for Malcolm's Logstash instance are located in the `logstash/certs/` directory
    - certificate authority, certificate, and key files to be copied to and used by the remote log forwarder are located in the `filebeat/certs/` directory; if using Hedgehog Linux, these certificates should be copied to the `/opt/sensor/sensor_ctl/logstash-client-certificates` directory on the sensor
- specify whether or not to store the username/password for email alert senders
    - these parameters are stored securely in the OpenSearch keystore file `opensearch/opensearch.keystore`
auth_setup
is used to define the username and password for the administrator account. Once Malcolm is running, the administrator account can be used to manage other user accounts via a Malcolm User Management page served over HTTPS on port 488 (e.g., https://localhost:488 if you are connecting locally).
Malcolm user accounts can be used to access the interfaces of all of its components, including Arkime. Arkime uses its own internal database of user accounts, so when a Malcolm user account logs in to Arkime for the first time Malcolm creates a corresponding Arkime user account automatically. This being the case, it is not recommended to use the Arkime Users settings page or change the password via the Password form under the Arkime Settings page, as those settings would not be consistently used across Malcolm.
Users may change their passwords via the Malcolm User Management page by clicking User Self Service. A forgotten password can also be reset via an emailed link, though this requires SMTP server settings to be specified in htadmin/config.ini
in the Malcolm installation directory.
The nginx-auth-ldap module serves as the interface between Malcolm's Nginx web server and a remote LDAP server. When you run auth_setup
for the first time, a sample LDAP configuration file is created at nginx/nginx_ldap.conf
.
# This is a sample configuration for the ldap_server section of nginx.conf.
# Yours will vary depending on how your Active Directory/LDAP server is configured.
# See https://github.com/kvspb/nginx-auth-ldap#available-config-parameters for options.
ldap_server ad_server {
url "ldap://ds.example.com:3268/DC=ds,DC=example,DC=com?sAMAccountName?sub?(objectClass=person)";
binddn "bind_dn";
binddn_passwd "bind_dn_password";
group_attribute member;
group_attribute_is_dn on;
require group "CN=Malcolm,CN=Users,DC=ds,DC=example,DC=com";
require valid_user;
satisfy all;
}
auth_ldap_cache_enabled on;
auth_ldap_cache_expiration_time 10000;
auth_ldap_cache_size 1000;
This file is mounted into the nginx
container when Malcolm is started to provide connection information for the LDAP server.
The contents of nginx_ldap.conf
will vary depending on how the LDAP server is configured. Some of the available parameters in that file include:
- `url` - the `ldap://` or `ldaps://` connection URL for the remote LDAP server, which has the following syntax: `ldap[s]://<hostname>:<port>/<base_dn>?<attributes>?<scope>?<filter>`
- `binddn` and `binddn_password` - the account credentials used to query the LDAP directory
- `group_attribute` - the group attribute name which contains the member object (e.g., `member` or `memberUid`)
- `group_attribute_is_dn` - whether or not to search for the user's full distinguished name as the value in the group's member attribute
- `require` and `satisfy` - `require user`, `require group` and `require valid_user` can be used in conjunction with `satisfy any` or `satisfy all` to limit the users that are allowed to access the Malcolm instance
Before starting Malcolm, edit nginx/nginx_ldap.conf
according to the specifics of your LDAP server and directory tree structure. Using an LDAP search tool such as ldapsearch
in Linux or dsquery
in Windows may be of help as you formulate the configuration. Your changes should be made within the curly braces of the ldap_server ad_server { … }
section. You can troubleshoot configuration file syntax errors and LDAP connection or credentials issues by running ./scripts/logs
(or docker-compose logs nginx
) and examining the output of the nginx
container.
The Malcolm User Management page described above is not available when using LDAP authentication.
Authentication over LDAP can be done in one of three ways, two of which offer data confidentiality protection:
- StartTLS - the standard extension to the LDAP protocol to establish an encrypted SSL/TLS connection within an already established LDAP connection
- LDAPS - a commonly used (though unofficial and considered deprecated) method in which SSL negotiation takes place before any commands are sent from the client to the server
- Unencrypted (cleartext) (not recommended)
In addition to the NGINX_BASIC_AUTH
environment variable being set to false
in the x-auth-variables
section near the top of the docker-compose.yml
file, the NGINX_LDAP_TLS_STUNNEL
environment variable is used in conjunction with the values in nginx/nginx_ldap.conf
to define the LDAP connection security level. Use the following combinations of values to achieve the connection security methods above, respectively:
- StartTLS
    - `NGINX_LDAP_TLS_STUNNEL` set to `true` in `docker-compose.yml`
    - `url` should begin with `ldap://` and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in `nginx/nginx_ldap.conf`
- LDAPS
    - `NGINX_LDAP_TLS_STUNNEL` set to `false` in `docker-compose.yml`
    - `url` should begin with `ldaps://` and its port should be either the default LDAPS port (636) or the default LDAPS Global Catalog port (3269) in `nginx/nginx_ldap.conf`
- Unencrypted (cleartext) (not recommended)
    - `NGINX_LDAP_TLS_STUNNEL` set to `false` in `docker-compose.yml`
    - `url` should begin with `ldap://` and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in `nginx/nginx_ldap.conf`
For encrypted connections (whether using StartTLS or LDAPS), Malcolm will require and verify certificates when one or more trusted CA certificate files are placed in the nginx/ca-trust/
directory. Otherwise, any certificate presented by the domain server will be accepted.
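For example, to trust an internal certificate authority, you might copy its certificate into place and restart Malcolm (the certificate path and file name here are illustrative):

$ mkdir -p ./nginx/ca-trust
$ cp /path/to/internal-root-ca.crt ./nginx/ca-trust/
$ ./scripts/restart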
When you set up authentication for Malcolm, a set of unique self-signed TLS certificates is created; these are used to secure the connection between clients (e.g., your web browser) and Malcolm's browser-based interface. This is adequate for most Malcolm instances, as they are often run locally or on internal networks, although your browser will most likely require you to add a security exception for the certificate the first time you connect to Malcolm.
Another option is to generate your own certificates (or have them issued to you) and place them in the `nginx/certs/` directory. The certificate and key file should be named `cert.pem` and `key.pem`, respectively.
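If you want to generate such a certificate and key pair yourself, one possible approach is a self-signed certificate via openssl (substitute your own hostname for the placeholder malcolm.example.org):

$ openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
    -keyout ./nginx/certs/key.pem -out ./nginx/certs/cert.pem \
    -subj "/CN=malcolm.example.org"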
A third possibility is to use a third-party reverse proxy (e.g., Traefik or Caddy) to handle the issuance of the certificates for you and to broker the connections between clients and Malcolm. Reverse proxies such as these often implement the ACME protocol for domain name authentication and can be used to request certificates from certificate authorities like Let's Encrypt. In this configuration, the reverse proxy will be encrypting the connections instead of Malcolm, so you'll need to set the `NGINX_SSL` environment variable to `false` in `docker-compose.yml` (or answer `no` to the "Require encrypted HTTPS connections?" question posed by `install.py`). If you are setting `NGINX_SSL` to `false`, make sure you understand what you are doing and ensure that external connections cannot reach ports over which Malcolm will be communicating without encryption, including verifying your local firewall configuration.
Docker Compose is used to coordinate running the Docker containers. To start Malcolm, navigate to the directory containing `docker-compose.yml` and run:
$ ./scripts/start
This will create the containers' virtual network and instantiate them, then leave them running in the background. The Malcolm containers may take several minutes to start up completely. To follow the debug output for an already-running Malcolm instance, run:
$ ./scripts/logs
You can also use `docker stats` to monitor the resource utilization of running containers.
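For example, to take a single snapshot of per-container CPU and memory usage rather than a continuously updating display:

$ docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"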
You can run `./scripts/stop` to stop the Docker containers and remove their virtual network. Alternatively, `./scripts/restart` will restart an instance of Malcolm. Because the data on disk is stored on the host in Docker volumes, these operations will not result in loss of data.
Malcolm can be configured to restart automatically when the Docker system daemon restarts (for example, on system reboot). This behavior depends on the value of the `restart:` setting for each service in the `docker-compose.yml` file. This value can be set by running `./scripts/install.py --configure` and answering "yes" to "Restart Malcolm upon system or Docker daemon restart?".
Run `./scripts/wipe` to stop the Malcolm instance and wipe its OpenSearch database (including index snapshots, index management policies, and alerting configuration).
To temporarily set the Malcolm user interfaces into a read-only configuration, run the following commands from the Malcolm installation directory.
First, to configure Nginx to disable access to the upload and other interfaces for changing Malcolm settings, and to deny HTTP methods other than `GET` and `POST`:
docker-compose exec nginx-proxy bash -c "cp /etc/nginx/nginx_readonly.conf /etc/nginx/nginx.conf && nginx -s reload"
Second, to set the existing OpenSearch data store to read-only:
docker-compose exec dashboards-helper /data/opensearch_read_only.py -i _cluster
These commands must be re-run every time you restart Malcolm.
Note that after you run these commands you may see an increase of error messages in the Malcolm containers' output as various background processes will fail due to the read-only nature of the indices. Additionally, some features such as Arkime's Hunt and building your own visualizations and dashboards in OpenSearch Dashboards will not function correctly in read-only mode.
Malcolm serves a web browser-based upload form for uploading PCAP files and Zeek logs at https://localhost/upload/ if you are connecting locally.
Additionally, there is a writable `files` directory on an SFTP server served on port 8022 (e.g., `sftp://USERNAME@localhost:8022/files/` if you are connecting locally).
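For example, uploading a capture over SFTP from the command line might look like this (`example.pcap` is a placeholder file name):

$ sftp -P 8022 USERNAME@localhost
sftp> cd files
sftp> put example.pcap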
The types of files supported are:
- PCAP files (of mime type `application/vnd.tcpdump.pcap` or `application/x-pcapng`)
  - PCAPNG files are partially supported: Zeek is able to process PCAPNG files, but not all of Arkime's packet examination features work correctly
- Zeek logs in archive files (`application/gzip`, `application/x-gzip`, `application/x-7z-compressed`, `application/x-bzip2`, `application/x-cpio`, `application/x-lzip`, `application/x-lzma`, `application/x-rar-compressed`, `application/x-tar`, `application/x-xz`, or `application/zip`)
  - where the Zeek logs are found in the internal directory structure of the archive file does not matter
Files uploaded via these methods are monitored and moved automatically to other directories for processing to begin, generally within one minute of completion of the upload.
In addition to being processed for ingestion, Malcolm events will be tagged according to the components of the filenames of the PCAP files or Zeek log archive files from which the events were parsed. For example, records created from a PCAP file named `ACME_Scada_VLAN10.pcap` would be tagged with `ACME`, `Scada`, and `VLAN10`. Tags are extracted from filenames by splitting on the characters "," (comma), "-" (dash), and "_" (underscore). These tags are viewable and searchable (via the `tags` field) in Arkime and OpenSearch Dashboards. This behavior can be changed by modifying the `AUTO_TAG` environment variable in `docker-compose.yml`.
Tags may also be specified manually with the browser-based upload form.
The Analyze with Zeek and Analyze with Suricata checkboxes may be used when uploading PCAP files to cause them to be analyzed by Zeek and Suricata, respectively. This is functionally equivalent to the `ZEEK_AUTO_ANALYZE_PCAP_FILES` and `SURICATA_AUTO_ANALYZE_PCAP_FILES` environment variables described above, only on a per-upload basis. Zeek can also automatically carve out files from file transfers; see Automatic file extraction and scanning for more details.
A dedicated network sensor appliance is the recommended method for capturing and analyzing live network traffic when performance and throughput are of utmost importance. Hedgehog Linux is a custom Debian-based operating system built to:
- monitor network interfaces
- capture packets to PCAP files
- detect file transfers in network traffic and extract and scan those files for threats
- generate and forward Zeek and Suricata logs, Arkime sessions, and other information to Malcolm
Please see the Hedgehog Linux README for more information.
Malcolm's `pcap-capture`, `suricata-live` and `zeek-live` containers can monitor one or more local network interfaces, specified by the `PCAP_IFACE` environment variable in `docker-compose.yml`. These containers are started with additional privileges (`IPC_LOCK`, `NET_ADMIN`, `NET_RAW`, and `SYS_ADMIN`) to allow opening network interfaces in promiscuous mode for capture.
The instances of Zeek and Suricata (in the `suricata-live` and `zeek-live` containers when the `SURICATA_LIVE_CAPTURE` and `ZEEK_LIVE_CAPTURE` environment variables in `docker-compose.yml` are set to `true`, respectively) analyze traffic on-the-fly and generate log files containing network session metadata. These log files are in turn scanned by Filebeat and forwarded to Logstash for enrichment and indexing into the OpenSearch document store.
In contrast, the `pcap-capture` container buffers traffic to PCAP files and periodically rotates these files for processing (by Arkime's `capture` utility in the `arkime` container) according to the thresholds defined by the `PCAP_ROTATE_MEGABYTES` and `PCAP_ROTATE_MINUTES` environment variables in `docker-compose.yml`. If for some reason (e.g., a low-resources environment) you also want Zeek and Suricata to process these intermediate PCAP files rather than monitoring the network interfaces directly, you can set `SURICATA_ROTATED_PCAP`/`ZEEK_ROTATED_PCAP` to `true` and `SURICATA_LIVE_CAPTURE`/`ZEEK_LIVE_CAPTURE` to `false`.
These various options for monitoring traffic on local network interfaces can also be configured by running `./scripts/install.py --configure`.
Note that currently Microsoft Windows and Apple macOS platforms run Docker inside of a virtualized environment. Live traffic capture and analysis on those platforms would require additional configuration of virtual interfaces and port forwarding in Docker which is outside of the scope of this document.
Malcolm's Logstash instance can also be configured to accept logs from a remote forwarder by running `./scripts/install.py --configure` and answering "yes" to "Expose Logstash port to external hosts?". Enabling encrypted transport of these log files is discussed in Configure authentication and the description of the `BEATS_SSL` environment variable in the `docker-compose.yml` file.
Configuring Filebeat to forward Zeek logs to Malcolm might look something like this example filebeat.yml
:
filebeat.inputs:
- type: log
paths:
- /var/zeek/*.log
fields_under_root: true
compression_level: 0
exclude_lines: ['^\s*#']
scan_frequency: 10s
clean_inactive: 180m
ignore_older: 120m
close_inactive: 90m
close_renamed: true
close_removed: true
close_eof: false
clean_renamed: true
clean_removed: true
output.logstash:
hosts: ["192.0.2.123:5044"]
ssl.enabled: true
ssl.certificate_authorities: ["/foo/bar/ca.crt"]
ssl.certificate: "/foo/bar/client.crt"
ssl.key: "/foo/bar/client.key"
ssl.supported_protocols: "TLSv1.2"
ssl.verification_mode: "none"
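Before starting the forwarder, it may help to sanity-check this configuration with Filebeat's built-in test commands (assuming the configuration above is saved as filebeat.yml in the current directory):

$ filebeat test config -c filebeat.yml
$ filebeat test output -c filebeat.yml

The first command verifies that the file parses; the second attempts a connection to the configured Logstash output.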
The Arkime interface will be accessible over HTTPS on port 443 at the Docker host's IP address (e.g., https://localhost if you are connecting locally).
A stock installation of Arkime extracts all of its network connection ("session") metadata ("SPI" or "Session Profile Information") from full packet capture artifacts (PCAP files). Zeek (formerly Bro) generates similar session metadata, linking network events to sessions via a connection UID. Malcolm aims to facilitate analysis of Zeek logs by mapping values from Zeek logs to the Arkime session database schema for equivalent fields, and by creating new "native" Arkime database fields for all the other Zeek log values for which there is not currently an equivalent in Arkime:
In this way, when full packet capture is an option, analysis of PCAP files can be enhanced by the additional information Zeek provides. When full packet capture is not an option, similar analysis can still be performed using the same interfaces and processes using the Zeek logs alone.
A few values of particular mention include Data Source (`event.provider` in OpenSearch), which can be used to distinguish among the sources of a network traffic metadata record (e.g., `zeek` for Zeek logs and `arkime` for Arkime sessions); and Log Type (`event.dataset` in OpenSearch), which corresponds to the kind of Zeek `.log` file from which the record was created. In other words, a search could be restricted to records from `conn.log` by searching `event.provider == zeek && event.dataset == conn`, or restricted to records from `weird.log` by searching `event.provider == zeek && event.dataset == weird`.
Click the icon of the owl 🦉 in the upper-left hand corner to access the Arkime usage documentation (accessible at https://localhost/help if you are connecting locally), click the Fields label in the navigation pane, then search for `zeek` to see a list of the other Zeek log types and fields available to Malcolm.
The values of records created from Zeek logs can be expanded and viewed like any native Arkime session by clicking the plus ➕ icon to the left of the record in the Sessions view. However, note that when dealing with these Zeek records the full packet contents are not available, so buttons dealing with viewing and exporting PCAP information will not behave as they would for records from PCAP files. Other than that, Zeek records and their values are usable in Malcolm just like native PCAP session records.
The Arkime interface displays both Zeek logs and Arkime sessions alongside each other. Using fields common to both data sources, one can craft queries to filter results matching desired criteria.
A few fields of particular mention that help limit returned results to those Zeek logs and Arkime session records generated from the same network connection are Community ID (`network.community_id`) and Zeek's connection UID (`zeek.uid`), which Malcolm maps to both Arkime's `rootId` field and the ECS `event.id` field.
Community ID is a specification for standard flow hashing published by Corelight with the intent of making it easier to pivot from one dataset (e.g., Arkime sessions) to another (e.g., Zeek `conn.log` entries). In Malcolm both Arkime and Zeek populate this value, which makes it possible to filter for a specific network connection and see both data sources' results for that connection.
The `rootId` field is used by Arkime to link session records together when a particular session has too many packets to be represented by a single session record. When normalizing Zeek logs to Arkime's schema, Malcolm piggybacks on `rootId` to store Zeek's connection UID, cross-referencing entries across Zeek log types. The connection UID is also stored in `zeek.uid`.
Filtering on community ID OR'ed with zeek UID (e.g., `network.community_id == "1:r7tGG//fXP1P0+BXH3zXETCtEFI=" || rootId == "CQcoro2z6adgtGlk42"`) is an effective way to see both the Arkime sessions and Zeek logs generated by a particular network connection.
Click the icon of the owl 🦉 in the upper-left hand corner to access the Arkime usage documentation (accessible at https://localhost/help if you are connecting locally), which includes such topics as search syntax, the Sessions view, SPIView, SPIGraph, and the Connections graph.
The Sessions view provides low-level details of the sessions being investigated, whether they be Arkime sessions created from PCAP files or Zeek logs mapped to the Arkime session database schema.
The Sessions view contains many controls for filtering the sessions displayed from all sessions down to sessions of interest:
- search bar: Indicated by the magnifying glass 🔍 icon, the search bar allows defining filters on session/log metadata
- time bounding controls: The 🕘, Start, End, Bounding, and Interval fields, and the date histogram can be used to visually zoom and pan the time range being examined.
- search button: The Search button re-runs the sessions query with the filters currently specified.
- views button: Indicated by the eyeball 👁 icon, views allow overlaying additional previously-specified filters onto the current sessions filters. For convenience, Malcolm provides several preconfigured Arkime views, including filtering on the `event.dataset` field.
- map: A global map can be expanded by clicking the globe 🌎 icon. This allows filtering sessions by IP-based geolocation when possible.
Some of these filter controls are also available on other Arkime pages (such as SPIView, SPIGraph, Connections, and Hunt).
The number of sessions displayed per page, as well as the page currently displayed, can be specified using the paging controls underneath the time bounding controls.
The sessions table is displayed below the filter controls. This table contains the sessions/logs matching the specified filters.
To the left of the column headers are two buttons. The Toggle visible columns button, indicated by a grid ⊞ icon, allows toggling which columns are displayed in the sessions table. The Save or load custom column configuration button, indicated by a columns ◫ icon, allows saving the current displayed columns or loading previously-saved configurations. This is useful for customizing which columns are displayed when investigating different types of traffic. Column headers can also be clicked to sort the results in the table, and column widths may be adjusted by dragging the separators between column headers.
Details for individual sessions/logs can be expanded by clicking the plus ➕ icon on the left of each row. Each row may contain multiple sections and controls, depending on whether the row represents an Arkime session or a Zeek log. Clicking the field names and values in the details sections allows additional filters to be specified or summary lists of unique values to be exported.
When viewing Arkime session details (i.e., a session generated from a PCAP file), an additional packets section will be visible underneath the metadata sections. When the details of a session of this type are expanded, Arkime will read the packet(s) comprising the session for display here. Various controls can be used to adjust how the packet is displayed (enabling natural decoding and enabling Show Images & Files may produce visually pleasing results), and other options (including PCAP download, carving images and files, applying decoding filters, and examining payloads in CyberChef) are available.
See also Arkime's usage documentation for more information on the Sessions view.
Clicking the down arrow ▼ icon to the far right of the search bar presents a list of actions including PCAP Export (see Arkime's sessions help for information on the other actions). When full PCAP sessions are displayed, the PCAP Export feature allows you to create a new PCAP file from the matching Arkime sessions, including controls for which sessions are included (open items, visible items, or all matching items) and whether or not to include linked segments. Click the Export PCAP button to generate the PCAP, after which you'll be presented with a browser download dialog to save or open the file. Note that depending on the scope of the filters specified this might take a long time (or possibly even time out).
See the issues section of this document for an error that can occur using this feature when Zeek log sessions are displayed.
Arkime's SPI (Session Profile Information) View provides a quick and easy-to-use interface for exploring session/log metrics. The SPIView page lists categories for general session metrics (e.g., protocol, source and destination IP addresses, source and destination ports, etc.) as well as for all of the various types of network traffic understood by Malcolm. These categories can be expanded and the top n values displayed, along with each value's cardinality, for the fields of interest they contain.
Click the plus ➕ icon to the right of a category to expand it. The values for specific fields are displayed by clicking the field description in the field list underneath the category name. The list of field names can be filtered by typing part of the field name in the Search for fields to display in this category text input. The Load All and Unload All buttons can be used to toggle display of all of the fields belonging to that category. Once displayed, a field's name or one of its values may be clicked to provide further actions for filtering or displaying that field or its values. Of particular interest may be the Open [fieldname] SPI Graph option when clicking on a field's name. This will open a new tab with the SPI Graph (see below) populated with the field's top values.
Note that because the SPIView page can potentially run many queries, SPIView limits the search domain to seven days (in other words, seven indices, as each index represents one day's worth of data). When using SPIView, you will have best results if you limit your search time frame to less than or equal to seven days. This limit can be adjusted by editing the `spiDataMaxIndices` setting in config.ini and rebuilding the `malcolmnetsec/arkime` docker container.
See also Arkime's usage documentation for more information on SPIView.
Arkime's SPI (Session Profile Information) Graph visualizes the occurrence of some field's top n values over time, and (optionally) geographically. This is particularly useful for identifying trends in a particular type of communication over time: traffic using a particular protocol when seen sparsely at regular intervals on that protocol's date histogram in the SPIGraph may indicate a connection check, polling, or beaconing (for example, see the `llmnr` protocol in the screenshot below).
Controls can be found underneath the time bounding controls for selecting the field of interest, the number of elements to be displayed, the sort order, and a periodic refresh of the data.
See also Arkime's usage documentation for more information on SPIGraph.
The Connections page presents network communications via a force-directed graph, making it easy to visualize logical relationships between network hosts.
Controls are available for specifying the query size (where smaller values will execute more quickly but may only contain an incomplete representation of the top n sessions, and larger values may take longer to execute but will be more complete), which fields to use as the source and destination for node values, a minimum connections threshold, and the method for determining the "weight" of the link between two nodes. As is the case with most other visualizations in Arkime, the graph is interactive: clicking on a node or the link between two nodes can be used to modify query filters, and the nodes themselves may be repositioned by dragging and dropping them. A node's color indicates whether it communicated as a source/originator, a destination/responder, or both.
While the default source and destination fields are Src IP and Dst IP:Dst Port, the Connections view is able to use any combination of fields. For example:
- Src OUI and Dst OUI (hardware manufacturers)
- Src IP and Protocols
- Originating Network Segment and Responding Network Segment (see CIDR subnet to network segment name mapping)
- Originating GeoIP City and Responding GeoIP City
or any other combination of these or other fields.
See also Arkime's usage documentation for more information on the Connections graph.
Arkime's Hunt feature allows an analyst to search within the packets themselves (including payload data) rather than simply searching the session metadata. The search string may be specified using ASCII (with or without case sensitivity), hex codes, or regular expressions. Once a hunt job is complete, matching sessions can be viewed in the Sessions view.
Clicking the Create a packet search job button on the Hunt page will allow you to specify the following parameters for a new hunt job:
- a packet search job name
- a maximum number of packets to examine per session
- the search string and its format (ascii, ascii (case sensitive), hex, regex, or hex regex)
- whether to search source packets, destination packets, or both
- whether to search raw or reassembled packets
Click the ➕ Create button to begin the search. Arkime will scan the source PCAP files from which the sessions were created according to the search criteria. Note that whatever filters were specified when the hunt job is executed will apply to the hunt job as well; the number of sessions matching the current filters will be displayed above the hunt job parameters with text like "ⓘ Creating a new packet search job will search the packets of # sessions."
Once a hunt job is submitted, it will be assigned a unique hunt ID (a long unique string of characters like `yuBHAGsBdljYmwGkbEMm`) and its progress will be updated periodically in the Hunt Job Queue with the execution percent complete, the number of matches found so far, and the other parameters with which the job was submitted. More details for the hunt job can be viewed by expanding its row with the plus ➕ icon on the left.
Once the hunt job is complete (and a minute or so has passed, as the `huntId` must be added to the matching session records in the database), click the folder 📂 icon on the right side of the hunt job row to open a new Sessions tab with the search bar prepopulated to filter to sessions with packets matching the search criteria.
From this list of filtered sessions you can expand session details and explore packet payloads which matched the hunt search criteria.
The hunt feature is available only for sessions created from full packet capture data, not Zeek logs. This being the case, it is a good idea to click the eyeball 👁 icon and select the Arkime Sessions view to exclude Zeek logs from candidate sessions prior to using the hunt feature.
See also Arkime's usage documentation for more information on the hunt feature.
Arkime provides several other reports which show information about the state of Arkime and the underlying OpenSearch database.
The Files list displays a list of PCAP files processed by Arkime, the date and time of the earliest packet in each file, and the file size:
The ES Indices list (available under the Stats page) lists the OpenSearch indices within which log data is contained:
The History view provides a historical list of queries issued to Arkime and the details of those queries:
See also Arkime's usage documentation for more information on the Files list, statistics, and history.
The Settings page can be used to tweak Arkime preferences, define additional custom views and column configurations, tweak the color theme, and more.
See Arkime's usage documentation for more information on settings.
While Arkime provides very nice visualizations, especially for network traffic, OpenSearch Dashboards (an open source general-purpose data visualization tool for OpenSearch) can be used to create custom visualizations (tables, charts, graphs, dashboards, etc.) using the same data.
The OpenSearch Dashboards container can be accessed at https://localhost/dashboards/ if you are connecting locally. Several preconfigured dashboards for Zeek logs are included in Malcolm's OpenSearch Dashboards configuration.
OpenSearch Dashboards has several components for data searching and visualization:
The Discover view enables you to view events on a record-by-record basis (similar to a session record in Arkime or an individual line from a Zeek log). See the official Kibana User Guide (OpenSearch Dashboards is an open-source fork of Kibana, which is no longer open-source software) for information on using the Discover view:
Malcolm comes with dozens of prebuilt visualizations and dashboards for the network traffic represented by each of the Zeek log types. Click Dashboard to see a list of these dashboards. As is the case with all OpenSearch Dashboards visualizations, all of the charts, graphs, maps, and tables are interactive and can be clicked on to narrow or expand the scope of the data you are investigating. Similarly, click Visualize to explore the prebuilt visualizations used to build the dashboards.
Many of Malcolm's prebuilt visualizations for Zeek logs were originally inspired by the excellent Kibana Dashboards that are part of Security Onion.
See the official Kibana User Guide and the OpenSearch Dashboards documentation (OpenSearch Dashboards is an open-source fork of Kibana, which is no longer open-source software) for information on creating your own visualizations and dashboards:
OpenSearch Dashboards supports two query syntaxes: the legacy Lucene syntax and Dashboards Query Language (DQL), both of which are somewhat different than Arkime's query syntax (see the help at https://localhost/help#search if you are connecting locally). The Arkime interface is for searching and visualizing both Arkime sessions and Zeek logs. The prebuilt dashboards in the OpenSearch Dashboards interface are for searching and visualizing Zeek logs, but will not include Arkime sessions. Here are some common patterns used in building search query strings for Arkime and OpenSearch Dashboards, respectively. See the links provided for further documentation.
| | Arkime Search String | OpenSearch Dashboards Search String (Lucene) | OpenSearch Dashboards Search String (DQL) |
|---|---|---|---|
| Field exists | `event.dataset == EXISTS!` | `_exists_:event.dataset` | `event.dataset:*` |
| Field does not exist | `event.dataset != EXISTS!` | `NOT _exists_:event.dataset` | `NOT event.dataset:*` |
| Field matches a value | `port.dst == 22` | `destination.port:22` | `destination.port:22` |
| Field does not match a value | `port.dst != 22` | `NOT destination.port:22` | `NOT destination.port:22` |
| Field matches at least one of a list of values | `tags == [foo, bar]` | `tags:(foo OR bar)` | `tags:(foo or bar)` |
| Field range (inclusive) | `http.statuscode >= 200 && http.statuscode <= 300` | `http.statuscode:[200 TO 300]` | `http.statuscode >= 200 and http.statuscode <= 300` |
| Field range (exclusive) | `http.statuscode > 200 && http.statuscode < 300` | `http.statuscode:{200 TO 300}` | `http.statuscode > 200 and http.statuscode < 300` |
| Field range (mixed exclusivity) | `http.statuscode >= 200 && http.statuscode < 300` | `http.statuscode:[200 TO 300}` | `http.statuscode >= 200 and http.statuscode < 300` |
| Match all search terms (AND) | `(tags == [foo, bar]) && (http.statuscode == 401)` | `tags:(foo OR bar) AND http.statuscode:401` | `tags:(foo or bar) and http.statuscode:401` |
| Match any search terms (OR) | `(zeek.ftp.password == EXISTS!) \|\| (zeek.http.password == EXISTS!)` | `_exists_:zeek.ftp.password OR _exists_:zeek.http.password` | `zeek.ftp.password:* or zeek.http.password:*` |
| Global string search (anywhere in the document) | all Arkime search expressions are field-based | `microsoft` | `microsoft` |
| Wildcards | `host.dns == "*micro?oft*"` (`?` for single character, `*` for any characters) | `dns.host:*micro?oft*` (`?` for single character, `*` for any characters) | `dns.host:*micro*ft*` (`*` for any characters) |
| Regex | `host.http == /.*www\.f.*k\.com.*/` | `zeek.http.host:/.*www\.f.*k\.com.*/` | DQL does not support regex |
| IPv4 values | `ip == 0.0.0.0/0` | `source.ip:"0.0.0.0/0" OR destination.ip:"0.0.0.0/0"` | `source.ip:"0.0.0.0/0" OR destination.ip:"0.0.0.0/0"` |
| IPv6 values | `(ip.src == EXISTS! \|\| ip.dst == EXISTS!) && (ip != 0.0.0.0/0)` | `(_exists_:source.ip AND NOT source.ip:"0.0.0.0/0") OR (_exists_:destination.ip AND NOT destination.ip:"0.0.0.0/0")` | `(source.ip:* and not source.ip:"0.0.0.0/0") or (destination.ip:* and not destination.ip:"0.0.0.0/0")` |
| GeoIP information available | `country == EXISTS!` | `_exists_:destination.geo OR _exists_:source.geo` | `destination.geo:* or source.geo:*` |
| Zeek log type | `event.dataset == notice` | `event.dataset:notice` | `event.dataset:notice` |
| IP CIDR Subnets | `ip.src == 172.16.0.0/12` | `source.ip:"172.16.0.0/12"` | `source.ip:"172.16.0.0/12"` |
| Search time frame | Use Arkime time bounding controls under the search bar | Use OpenSearch Dashboards time range controls in the upper right-hand corner | Use OpenSearch Dashboards time range controls in the upper right-hand corner |
When building complex queries, it is strongly recommended that you enclose search terms and expressions in parentheses to control order of operations.
As Zeek logs are ingested, Malcolm parses and normalizes the logs' fields to match Arkime's underlying OpenSearch schema. A complete list of these fields can be found in the Arkime help (accessible at https://localhost/help#fields if you are connecting locally).
Whenever possible, Zeek fields are mapped to existing corresponding Arkime fields: for example, the `orig_h` field in Zeek is mapped to Arkime's `source.ip` field. The original Zeek fields are also left intact. To complicate the issue, the Arkime interface uses its own aliases to reference those fields: the source IP field is referenced as `ip.src` (Arkime's alias) in Arkime and as `source.ip` in OpenSearch Dashboards.
The table below shows the mapping of some of these fields.
| Field Description | Arkime Field Alias(es) | Arkime-mapped Zeek Field(s) | Zeek Field(s) |
|---|---|---|---|
| Community ID Flow Hash | `network.community_id` | `network.community_id` | |
| Destination IP | `ip.dst` | `destination.ip` | `destination.ip` |
| Destination MAC | `mac.dst` | `destination.mac` | `destination.mac` |
| Destination Port | `port.dst` | `destination.port` | `destination.port` |
| Duration | `session.length` | `length` | `zeek.conn.duration` |
| First Packet Time | `starttime` | `firstPacket` | `zeek.ts`, `@timestamp` |
| IP Protocol | `ip.protocol` | `ipProtocol` | `network.transport` |
| Last Packet Time | `stoptime` | `lastPacket` | |
| MIME Type | `email.bodymagic`, `http.bodymagic` | `http.bodyMagic` | `file.mime_type`, `zeek.files.mime_type`, `zeek.ftp.mime_type`, `zeek.http.orig_mime_types`, `zeek.http.resp_mime_types`, `zeek.irc.dcc_mime_type` |
| Protocol/Service | `protocols` | `protocol` | `network.transport`, `network.protocol` |
| Request Bytes | `databytes.src`, `bytes.src` | `source.bytes`, `client.bytes` | `zeek.conn.orig_bytes`, `zeek.conn.orig_ip_bytes` |
| Request Packets | `packets.src` | `source.packets` | `zeek.conn.orig_pkts` |
| Response Bytes | `databytes.dst`, `bytes.dst` | `destination.bytes`, `server.bytes` | `zeek.conn.resp_bytes`, `zeek.conn.resp_ip_bytes` |
| Response Packets | `packets.dst` | `destination.packets` | `zeek.conn.resp_pkts` |
| Source IP | `ip.src` | `source.ip` | `source.ip` |
| Source MAC | `mac.src` | `source.mac` | `source.mac` |
| Source Port | `port.src` | `source.port` | `source.port` |
| Total Bytes | `databytes`, `bytes` | `totDataBytes`, `network.bytes` | |
| Total Packets | `packets` | `network.packets` | |
| Username | `user` | `user` | `related.user` |
| Zeek Connection UID | `zeek.uid`, `event.id` | | |
| Zeek File UID | `zeek.fuid`, `event.id` | | |
| Zeek Log Type | `event.dataset` | | |
In addition to the fields listed above, Arkime provides several special field aliases for matching any field of a particular type. While these aliases do not exist in OpenSearch Dashboards per se, they can be approximated as illustrated below.
| Matches Any | Arkime Special Field Example | OpenSearch Dashboards/Zeek Equivalent Example |
|---|---|---|
| IP Address | `ip == 192.168.0.1` | `source.ip:192.168.0.1 OR destination.ip:192.168.0.1` |
| Port | `port == [80, 443, 8080, 8443]` | `source.port:(80 OR 443 OR 8080 OR 8443) OR destination.port:(80 OR 443 OR 8080 OR 8443)` |
| Country (code) | `country == [RU,CN]` | `destination.geo.country_code2:(RU OR CN) OR source.geo.country_code2:(RU OR CN) OR dns.GEO:(RU OR CN)` |
| Country (name) | | `destination.geo.country_name:(Russia OR China) OR source.geo.country_name:(Russia OR China)` |
| ASN | `asn == "*Mozilla*"` | `source.as.full:*Mozilla* OR destination.as.full:*Mozilla* OR dns.ASN:*Mozilla*` |
| Host | `host == www.microsoft.com` | `zeek.http.host:www.microsoft.com` (or `zeek.dhcp.host_name`, `zeek.dns.host`, `zeek.ntlm.host`, `smb.host`, etc.) |
| Protocol (layers >= 4) | `protocols == tls` | `protocol:tls` |
| User | `user == EXISTS! && user != anonymous` | `_exists_:user AND (NOT user:anonymous)` |
For details on how to filter both Zeek logs and Arkime session records for a particular connection, see Correlating Zeek logs and Arkime sessions.
Malcolm can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the `ZEEK_EXTRACTOR_MODE` environment variable in `docker-compose.yml`, or on a per-upload basis for PCAP files uploaded via the browser-based upload form when Analyze with Zeek is selected.
To specify which files should be extracted, the following values are acceptable in `ZEEK_EXTRACTOR_MODE`:
- `none`: no file extraction
- `interesting`: extraction of files with mime types of common attack vectors
- `mapped`: extraction of files with recognized mime types
- `known`: extraction of files for which any mime type can be determined
- `all`: extract all files
Extracted files can be examined through any of the following methods:
- submitting file hashes to VirusTotal; to enable this method, specify the `VTOT_API2_KEY` environment variable in `docker-compose.yml`
- scanning files with ClamAV; to enable this method, set the `EXTRACTED_FILE_ENABLE_CLAMAV` environment variable in `docker-compose.yml` to `true`
- scanning files with Yara; to enable this method, set the `EXTRACTED_FILE_ENABLE_YARA` environment variable in `docker-compose.yml` to `true`
- scanning PE (portable executable) files with Capa; to enable this method, set the `EXTRACTED_FILE_ENABLE_CAPA` environment variable in `docker-compose.yml` to `true`
Files which are flagged via any of these methods will be logged as Zeek `signatures.log` entries, and can be viewed in the Signatures dashboard in OpenSearch Dashboards.
The `EXTRACTED_FILE_PRESERVATION` environment variable in `docker-compose.yml` determines the behavior for preservation of Zeek-extracted files:
- `quarantined`: preserve only flagged files in `./zeek-logs/extract_files/quarantine`
- `all`: preserve flagged files in `./zeek-logs/extract_files/quarantine` and all other extracted files in `./zeek-logs/extract_files/preserved`
- `none`: preserve no extracted files
The `EXTRACTED_FILE_HTTP_SERVER_…` environment variables in `docker-compose.yml` configure access to the Zeek-extracted files path by means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download.
The `host-map.txt` file in the Malcolm installation directory can be used to define names for network hosts based on IP and/or MAC addresses in Zeek logs. The default empty configuration looks like this:
# IP or MAC address to host name map:
# address|host name|required tag
#
# where:
# address: comma-separated list of IPv4, IPv6, or MAC addresses
# e.g., 172.16.10.41, 02:42:45:dc:a2:96, 2001:0db8:85a3:0000:0000:8a2e:0370:7334
#
# host name: host name to be assigned when event address(es) match
#
# required tag (optional): only check match and apply host name if the event
# contains this tag
#
Each non-comment line (i.e., a line not beginning with a `#`) defines an address-to-name mapping for a network host. For example:
127.0.0.1,127.0.1.1,::1|localhost|
192.168.10.10|office-laptop.intranet.lan|
06:46:0b:a6:16:bf|serial-host.intranet.lan|testbed
Each line consists of three `|`-separated fields: address(es), hostname, and, optionally, a tag which, if specified, must belong to a log for the matching to occur.
As Zeek logs are processed into Malcolm's OpenSearch instance, the log's source and destination IP and MAC address fields (`source.ip`, `destination.ip`, `source.mac`, and `destination.mac`, respectively) are compared against the lists of addresses in `host-map.txt`. When a match is found, a new field is added to the log: `source.hostname` or `destination.hostname`, depending on whether the matching address belongs to the originating or responding host. If the third field (the "required tag" field) is specified, a log must also contain that value in its `tags` field in addition to matching the IP or MAC address specified in order for the corresponding `_hostname` field to be added.
`source.hostname` and `destination.hostname` may each contain multiple values. For example, if both a host's source IP address and source MAC address were matched by two different lines, `source.hostname` would contain the hostname values from both matching lines.
The `cidr-map.txt` file in the Malcolm installation directory can be used to define names for network segments based on IP addresses in Zeek logs. The default empty configuration looks like this:
# CIDR to network segment format:
# IP(s)|segment name|required tag
#
# where:
# IP(s): comma-separated list of CIDR-formatted network IP addresses
# e.g., 10.0.0.0/8, 169.254.0.0/16, 172.16.10.41
#
# segment name: segment name to be assigned when event IP address(es) match
#
# required tag (optional): only check match and apply segment name if the event
# contains this tag
#
Each non-comment line (i.e., a line not beginning with a `#`) defines a subnet-to-name mapping for a network segment. For example:
192.168.50.0/24,192.168.40.0/24,10.0.0.0/8|corporate|
192.168.100.0/24|control|
192.168.200.0/24|dmz|
172.16.0.0/12|virtualized|testbed
Each line consists of three `|`-separated fields: CIDR-formatted subnet IP range(s), subnet name, and, optionally, a tag which, if specified, must belong to a log for the matching to occur.
As Zeek logs are processed into Malcolm's OpenSearch instance, the log's source and destination IP address fields (`source.ip` and `destination.ip`, respectively) are compared against the lists of addresses in `cidr-map.txt`. When a match is found, a new field is added to the log: `source.segment` or `destination.segment`, depending on whether the matching address belongs to the originating or responding host. If the third field (the "required tag" field) is specified, a log must also contain that value in its `tags` field in addition to its IP address falling within the subnet specified in order for the corresponding `_segment` field to be added.
`source.segment` and `destination.segment` may each contain multiple values. For example, if `cidr-map.txt` specifies multiple overlapping subnets on different lines, `source.segment` would contain the segment names from both matching lines if `source.ip` belonged to both subnets.
If both `source.segment` and `destination.segment` are added to a log, and if they contain different values, the tag `cross_segment` will be added to the log's `tags` field for convenient identification of cross-segment traffic. This traffic could be easily visualized using Arkime's Connections graph, by setting the Src: value to Originating Network Segment and the Dst: value to Responding Network Segment:
As an alternative to manually editing `cidr-map.txt` and `host-map.txt`, a Host and Subnet Name Mapping editor is available at https://localhost/name-map-ui/ if you are connecting locally. Upon loading, the editor is populated from `cidr-map.txt`, `host-map.txt` and `net-map.json`.
This editor provides the following controls:
- 🔎 Search mappings - narrow the list of visible items using a search filter
- Type, Address, Name and Tag (column headings) - sort the list of items by clicking a column header
- 📝 (per item) - modify the selected item
- 🚫 (per item) - remove the selected item
- 🖳 host / 🖧 segment, Address, Name, Tag (optional) and 💾 - save the item with these values (either adding a new item or updating the item being modified)
- 📥 Import - clear the list and replace it with the contents of an uploaded `net-map.json` file
- 📤 Export - format and download the list as a `net-map.json` file
- 💾 Save Mappings - format and store `net-map.json` in the Malcolm directory (replacing the existing `net-map.json` file)
- 🔁 Restart Logstash - restart log ingestion, parsing and enrichment
When changes are made to either `cidr-map.txt`, `host-map.txt` or `net-map.json`, Malcolm's Logstash container must be restarted. The easiest way to do this is to restart Malcolm via `./scripts/restart` (see Stopping and restarting Malcolm) or by clicking the 🔁 Restart Logstash button in the name mapping interface.
Restarting Logstash may take several minutes, after which log ingestion will resume.
Malcolm releases prior to v6.2.0 used environment variables to configure OpenSearch Index State Management policies.
Since then, OpenSearch Dashboards has developed and released plugins with UIs for Index State Management and Snapshot Management. Because these plugins provide more comprehensive and user-friendly interfaces for these features, the old environment variable-based configuration code has been removed from Malcolm, with the exception of the code that uses `OPENSEARCH_INDEX_SIZE_PRUNE_LIMIT` and `OPENSEARCH_INDEX_SIZE_PRUNE_NAME_SORT`, which deals with deleting the oldest network session metadata indices when the database exceeds a certain size.
Note that OpenSearch index state management and snapshot management only deal with disk space consumed by OpenSearch indices: they do not have anything to do with PCAP file storage. The `MANAGE_PCAP_FILES` environment variable in the `docker-compose.yml` file can be used to allow Arkime to prune old PCAP files based on available disk space.
As Zeek logs are parsed and enriched prior to indexing, a severity score up to `100` (a higher score indicating a more severe event) can be assigned when one or more of the following conditions are met:
- cross-segment network traffic (if network subnets were defined)
- connection origination and destination (e.g., inbound, outbound, external, internal)
- traffic to or from sensitive countries
  - The comma-separated list of countries (by ISO 3166-1 alpha-2 code) can be customized by setting the `SENSITIVE_COUNTRY_CODES` environment variable in `docker-compose.yml`.
- domain names (from DNS queries and SSL server names) with high entropy as calculated by freq
  - The entropy threshold for this condition to trigger can be adjusted by setting the `FREQ_SEVERITY_THRESHOLD` environment variable in `docker-compose.yml`. A lower value will assign severity scores only to domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`).
- file transfers (categorized by mime type)
- `notice.log`, `intel.log` and `weird.log` entries, including those generated by Zeek plugins detecting vulnerabilities (see the list of Zeek plugins under Components)
- detection of cleartext passwords
- use of insecure or outdated protocols
- tunneled traffic or use of VPN protocols
- rejected or aborted connections
- common network services communicating over non-standard ports
- file scanning engine hits on extracted files
- large connection or file transfer
  - The size (in megabytes) threshold for this condition to trigger can be adjusted by setting the `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` environment variable in `docker-compose.yml`.
- long connection duration
  - The duration (in seconds) threshold for this condition to trigger can be adjusted by setting the `CONNECTION_SECONDS_SEVERITY_THRESHOLD` environment variable in `docker-compose.yml`.
As this feature is improved it's expected that additional categories will be identified and implemented for severity scoring.
When a Zeek log satisfies more than one of these conditions its severity scores will be summed, with a maximum score of `100`. A Zeek log's severity score is indexed in the `event.severity` field and the conditions which contributed to its score are indexed in `event.severity_tags`.
These categories' severity scores can be customized by editing `logstash/maps/malcolm_severity.yaml`:
- Each category can be assigned a number between `1` and `100` for severity scoring.
- Any category may be disabled by assigning it a score of `0`.
- A severity score can be assigned for any supported protocol by adding an entry with the key formatted like `"PROTOCOL_XYZ"`, where `XYZ` is the uppercased value of the protocol as stored in the `network.protocol` field. For example, to assign a score of `40` to Zeek logs generated for SSH traffic, you could add the following line to `malcolm_severity.yaml`:
"PROTOCOL_SSH": 40
Restart Logstash after modifying `malcolm_severity.yaml` for the changes to take effect. The hostname and CIDR subnet names interface provides a convenient button for restarting Logstash.
Severity scoring can be disabled globally by setting the `LOGSTASH_SEVERITY_SCORING` environment variable to `false` in the `docker-compose.yml` file and restarting Malcolm.
To quote Zeek's Intelligence Framework documentation, "The goals of Zeek’s Intelligence Framework are to consume intelligence data, make it available for matching, and provide infrastructure to improve performance and memory utilization. Data in the Intelligence Framework is an atomic piece of intelligence such as an IP address or an e-mail address. This atomic data will be packed with metadata such as a freeform source field, a freeform descriptive field, and a URL which might lead to more information about the specific item." Zeek intelligence indicator types include IP addresses, URLs, file names, hashes, email addresses, and more.
Malcolm doesn't come bundled with intelligence files from any particular feed, but they can be easily included into your local instance. On startup, Malcolm's `malcolmnetsec/zeek` docker container enumerates the subdirectories under `./zeek/intel` (which is bind mounted into the container's runtime) and configures Zeek so that those intelligence files will be automatically included in its local policy. Subdirectories under `./zeek/intel` which contain their own `__load__.zeek` file will be `@load`-ed as-is, while subdirectories containing "loose" intelligence files will be loaded automatically with a `redef Intel::read_files` directive.
Note that Malcolm does not manage updates for these intelligence files. You should use the update mechanism suggested by your feeds' maintainers to keep them up to date, or use a TAXII or MISP feed as described below.
Adding and deleting intelligence files under this directory will take effect upon restarting Malcolm. Alternatively, the `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` environment variable can contain a cron expression specifying the interval at which the intel files should be refreshed. The refresh can also be triggered manually without restarting Malcolm by running the following command from the Malcolm installation directory:
docker-compose exec --user $(id -u) zeek /usr/local/bin/entrypoint.sh true
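For the cron-based refresh mentioned above, a hypothetical value for that variable in the zeek service's environment in `docker-compose.yml` (refreshing the intelligence files at midnight and noon each day) might look like this; the exact placement and quoting in your `docker-compose.yml` may vary:

ZEEK_INTEL_REFRESH_CRON_EXPRESSION : '0 0,12 * * *'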
For a public example of Zeek intelligence files, see Critical Path Security's repository which aggregates data from various other threat feeds into Zeek's format.
In addition to loading Zeek intelligence files, on startup Malcolm will automatically generate a Zeek intelligence file for all Structured Threat Information Expression (STIX™) v2.0/v2.1 JSON files found under `./zeek/intel/STIX`.
Additionally, if a special text file named `.stix_input.txt` is found in `./zeek/intel/STIX`, that file will be read and processed as a list of TAXII™ 2.0/2.1 feeds, one per line, according to the following format (the username and password are optional):
taxii|version|discovery_url|collection_name|username|password
For example:
taxii|2.0|http://example.org/taxii/|IP Blocklist|guest|guest
taxii|2.1|https://example.com/taxii/api2/|URL Blocklist
…
Malcolm will attempt to query the TAXII feed(s) for `indicator` STIX objects and convert them to the Zeek intelligence format as described above. There are publicly available TAXII 2.x-compatible services provided by a number of organizations including Anomali Labs and MITRE, or you may choose from several open-source offerings to roll your own TAXII 2 server (e.g., oasis-open/cti-taxii-server, freetaxii/server, StephenOTT/TAXII-Server, etc.).
Note that only indicators of cyber-observable objects matched with the equals (`=`) comparison operator against a single value can be expressed as Zeek intelligence items. More complex STIX indicators will be silently ignored.
In addition to loading Zeek intelligence files, on startup Malcolm will automatically generate a Zeek intelligence file for all Malware Information Sharing Platform (MISP) JSON files found under `./zeek/intel/MISP`.
Additionally, if a special text file named `.misp_input.txt` is found in `./zeek/intel/MISP`, that file will be read and processed as a list of MISP feed URLs, one per line, according to the following format (the authentication key is optional):
misp|manifest_url|auth_key
For example:
misp|https://example.com/data/feed-osint/manifest.json|df97338db644c64fbfd90f3e03ba8870
…
Malcolm will attempt to connect to the MISP feed(s), retrieve `Attribute` objects of MISP events, and convert them to the Zeek intelligence format as described above. There are publicly available MISP feeds and communities, or you may run your own MISP instance.
Note that only a subset of MISP attribute types can be expressed with the Zeek intelligence indicator types. MISP attributes with other types will be silently ignored.
Malcolm uses the Anomaly Detection plugins for OpenSearch and OpenSearch Dashboards to identify anomalous log data in near real-time using the Random Cut Forest (RCF) algorithm. This can be paired with Alerting to automatically notify when anomalies are found. See Anomaly detection in the OpenSearch documentation for usage instructions on how to create detectors for any of the many fields Malcolm supports.
A fresh installation of Malcolm configures several detectors for detecting anomalous network traffic:
- network_protocol - Detects anomalies based on application protocol (`network.protocol`)
- action_result_user - Detects anomalies in action (`event.action`), result (`event.result`) and user (`related.user`) within application protocols (`network.protocol`)
- file_mime_type - Detects anomalies based on transferred file type (`file.mime_type`)
- total_bytes - Detects anomalies based on traffic size (sum of `network.bytes`)
These detectors are disabled by default, but may be enabled for anomaly detection over streaming or historical data.
Malcolm uses the Alerting plugins for OpenSearch and OpenSearch Dashboards. See Alerting in the OpenSearch documentation for usage instructions.
A fresh installation of Malcolm configures an example custom webhook destination named Malcolm API Loopback Webhook that directs the triggered alerts back into the Malcolm API to be reindexed as a session record with `event.dataset` set to `alerting`. The corresponding monitor Malcolm API Loopback Monitor is disabled by default, as you'll likely want to configure the trigger conditions to suit your needs. These examples are provided to illustrate how triggers and monitors can interact with a custom webhook to process alerts.
When using an email account to send alerts, you must authenticate each sender account before you can send an email. The `auth_setup` script can be used to securely store the email account credentials:
./scripts/auth_setup
Store administrator username/password for local Malcolm access? (Y/n): n
(Re)generate self-signed certificates for HTTPS access (Y/n): n
(Re)generate self-signed certificates for a remote log forwarder (Y/n): n
Store username/password for primary remote OpenSearch instance? (y/N): n
Store username/password for secondary remote OpenSearch instance? (y/N): n
Store username/password for email alert sender account? (y/N): y
Email account username: [email protected]
[email protected] password:
[email protected] password (again):
Email alert sender account variables stored: opensearch.alerting.destination.email.destination_alpha.password, opensearch.alerting.destination.email.destination_alpha.username
This action should only be performed while Malcolm is stopped: otherwise the credentials will not be stored correctly.
There are many ICS (industrial control systems) protocols. While Malcolm's collection of protocol parsers includes a number of them, many, particularly those that are proprietary or less common, are unlikely to be supported with a full parser in the foreseeable future.
In an effort to help identify more ICS traffic, Malcolm can use a "best guess" method based on transport protocol (e.g., TCP or UDP) and port(s) to categorize potential traffic communicating over some ICS protocols without full parser support. This feature involves a mapping table and a Zeek script to look up the transport protocol and destination and/or source port to make a best guess at whether a connection belongs to one of those protocols. These potential ICS communications are categorized by vendor where possible.
Naturally, these lookups could produce false positives, so these connections are displayed in their own dashboard (the Best Guess dashboard found under the ICS section of Malcolm's OpenSearch Dashboards navigation pane). Values such as IP addresses, ports, or UID can be used to pivot to other dashboards to investigate further.
This feature is disabled by default, but it can be enabled by clearing (setting to `''`) the value of the `ZEEK_DISABLE_BEST_GUESS_ICS` environment variable in `docker-compose.yml`.
Malcolm provides a REST API that can be used to programmatically query some aspects of Malcolm's status and data. Malcolm's API is not to be confused with the Viewer API provided by Arkime, although there may be some overlap in functionality.
`GET` - `/mapi/ping`

Returns `pong` (for a simple "up" check).
Example output:
{"ping":"pong"}
`GET` - `/mapi/version`

Returns version information about Malcolm and version/health information about the underlying OpenSearch instance.
Example output:
{
"built": "2022-01-18T16:10:39Z",
"opensearch": {
"cluster_name": "docker-cluster",
"cluster_uuid": "TcSiEaOgTdO_l1IivYz2gA",
"name": "opensearch",
"tagline": "The OpenSearch Project: https://opensearch.org/",
"version": {
"build_date": "2021-12-21T01:36:21.407473Z",
"build_hash": "8a529d77c7432bc45b005ac1c4ba3b2741b57d4a",
"build_snapshot": false,
"build_type": "tar",
"lucene_version": "8.10.1",
"minimum_index_compatibility_version": "6.0.0-beta1",
"minimum_wire_compatibility_version": "6.8.0",
"number": "7.10.2"
}
},
"opensearch_health": {
"active_primary_shards": 29,
"active_shards": 29,
"active_shards_percent_as_number": 82.85714285714286,
"cluster_name": "docker-cluster",
"delayed_unassigned_shards": 0,
"discovered_master": true,
"initializing_shards": 0,
"number_of_data_nodes": 1,
"number_of_in_flight_fetch": 0,
"number_of_nodes": 1,
"number_of_pending_tasks": 0,
"relocating_shards": 0,
"status": "yellow",
"task_max_waiting_in_queue_millis": 0,
"timed_out": false,
"unassigned_shards": 6
},
"sha": "8ddbbf4",
"version": "5.2.0"
}
GET - `/mapi/fields`
Returns the (very long) list of fields known to Malcolm, comprised of data from Arkime's `fields` table, the Malcolm OpenSearch template, and the OpenSearch Dashboards index pattern API.
Example output:
{
"fields": {
"@timestamp": {
"type": "date"
},
…
"zeek.x509.san_uri": {
"description": "Subject Alternative Name URI",
"type": "string"
},
"zeek.x509.san_uri.text": {
"type": "string"
}
},
"total": 2005
}
GET - `/mapi/indices`
Lists information related to the underlying OpenSearch indices, similar to Arkime's esindices API.
Example output:
{
"indices": [
…
{
"docs.count": "2268613",
"docs.deleted": "0",
"health": "green",
"index": "arkime_sessions3-210301",
"pri": "1",
"pri.store.size": "1.8gb",
"rep": "0",
"status": "open",
"store.size": "1.8gb",
"uuid": "w-4Q0ofBTdWO9KqeIIAAWg"
},
…
]
}
GET or POST - `/mapi/agg/<fieldname>`
Executes an OpenSearch bucket aggregation query for the requested fields across all of Malcolm's indexed network traffic metadata.
Parameters:
- `fieldname` (URL parameter) - the name(s) of the field(s) to be queried (comma-separated if multiple fields) (default: `event.provider`)
- `limit` (query parameter) - the maximum number of records to return at each level of aggregation (default: 500)
- `from` (query parameter) - the time frame (`gte`) for the beginning of the search based on the session's `firstPacket` field value, in a format supported by the dateparser library (default: "1 day ago")
- `to` (query parameter) - the time frame (`lte`) for the end of the search based on the session's `firstPacket` field value, in a format supported by the dateparser library (default: "now")
- `filter` (query parameter) - field filters formatted as a JSON dictionary
The `from`, `to`, and `filter` parameters can be used to further restrict the range of documents returned. The `filter` dictionary should be formatted such that its keys are field names and its values are the values for which to filter. A field name may be prepended with a `!` to negate the filter (e.g., `{"event.provider":"zeek"}` vs. `{"!event.provider":"zeek"}`). Filtering for the value `null` implies "is not set" or "does not exist" (e.g., `{"event.dataset":null}` means "the field `event.dataset` is `null`/is not set" while `{"!event.dataset":null}` means "the field `event.dataset` is not `null`/is set").
Examples of the `filter` parameter:
- `{"!network.transport":"icmp"}` - `network.transport` is not `icmp`
- `{"network.direction":["inbound","outbound"]}` - `network.direction` is either `inbound` or `outbound`
- `{"event.provider":"zeek","event.dataset":["conn","dns"]}` - `event.provider` is `zeek` and `event.dataset` is either `conn` or `dns`
- `{"!event.dataset":null}` - `event.dataset` is set (is not `null`)
See Examples for more examples of `filter` and corresponding output.
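As a sketch of combining these query parameters in a single GET request (curl's `-G` with `--data-urlencode` is simply one convenient way to URL-encode the JSON `filter` value):
$ curl -k -u username -L -G 'https://localhost/mapi/agg/network.protocol' \
    --data-urlencode 'from=1 week ago' \
    --data-urlencode 'filter={"!network.transport":"icmp"}'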
GET or POST - `/mapi/document`
Executes an OpenSearch query for the matching documents across all of Malcolm's indexed network traffic metadata.
Parameters:
- `limit` (query parameter) - the maximum number of documents to return (default: 500)
- `from` (query parameter) - the time frame (`gte`) for the beginning of the search based on the session's `firstPacket` field value, in a format supported by the dateparser library (default: the UNIX epoch)
- `to` (query parameter) - the time frame (`lte`) for the end of the search based on the session's `firstPacket` field value, in a format supported by the dateparser library (default: "now")
- `filter` (query parameter) - field filters formatted as a JSON dictionary (see Field Aggregations for examples)
Example cURL command and output:
$ curl -k -u username -L -XPOST -H 'Content-Type: application/json' \
'https://localhost/mapi/document' \
-d '{"limit": 10, filter":{"zeek.uid":"CYeji2z7CKmPRGyga"}}'
{
"filter": {
"zeek.uid": "CYeji2z7CKmPRGyga"
},
"range": [
0,
1643056677
],
"results": [
{
"_id": "220124-CYeji2z7CKmPRGyga-http-7677",
"_index": "arkime_sessions3-220124",
"_score": 0.0,
"_source": {
"@timestamp": "2022-01-24T20:31:01.846Z",
"@version": "1",
"agent": {
"hostname": "filebeat",
"id": "bc25716b-8fe7-4de6-a357-65c7d3c15c33",
"name": "filebeat",
"type": "filebeat",
"version": "7.10.2"
},
"client": {
"bytes": 0
},
"destination": {
"as": {
"full": "AS54113 Fastly"
},
"geo": {
"city_name": "Seattle",
"continent_code": "NA",
"country_code2": "US",
"country_code3": "US",
"country_iso_code": "US",
"country_name": "United States",
"dma_code": 819,
"ip": "151.101.54.132",
"latitude": 47.6092,
"location": {
"lat": 47.6092,
"lon": -122.3314
},
"longitude": -122.3314,
"postal_code": "98111",
"region_code": "WA",
"region_name": "Washington",
"timezone": "America/Los_Angeles"
},
"ip": "151.101.54.132",
"port": 80
},
"ecs": {
"version": "1.6.0"
},
"event": {
"action": [
"GET"
],
"category": [
"web",
"network"
],
…
Some security-related API examples:
Protocols
/mapi/agg/network.type,network.transport,network.protocol,network.protocol_version
{
"fields": [
"network.type",
"network.transport",
"network.protocol",
"network.protocol_version"
],
"filter": null,
"range": [
1970,
1643067256
],
"urls": [
"/dashboards/app/dashboards#/view/abdd7550-2c7c-40dc-947e-f6d186a158c4?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 442240,
"key": "ipv4",
"values": {
"buckets": [
{
"doc_count": 279538,
"key": "udp",
"values": {
"buckets": [
{
"doc_count": 266527,
"key": "bacnet",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 12365,
"key": "dns",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 78,
"key": "dhcp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 44,
"key": "ntp",
"values": {
"buckets": [
{
"doc_count": 22,
"key": "4"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 3,
"key": "enip",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "krb",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "syslog",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 30824,
"key": "tcp",
"values": {
"buckets": [
{
"doc_count": 7097,
"key": "smb",
"values": {
"buckets": [
{
"doc_count": 4244,
"key": "1"
},
{
"doc_count": 1438,
"key": "2"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1792,
"key": "http",
"values": {
"buckets": [
{
"doc_count": 829,
"key": "1.0"
},
{
"doc_count": 230,
"key": "1.1"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1280,
"key": "dce_rpc",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 857,
"key": "s7comm",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 426,
"key": "ntlm",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 378,
"key": "gssapi",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 146,
"key": "tds",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 125,
"key": "ssl",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 91,
"key": "tls",
"values": {
"buckets": [
{
"doc_count": 48,
"key": "TLSv13"
},
{
"doc_count": 28,
"key": "TLSv12"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 29,
"key": "ssh",
"values": {
"buckets": [
{
"doc_count": 18,
"key": "2"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 26,
"key": "modbus",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 17,
"key": "iso_cotp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 8,
"key": "enip",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 6,
"key": "rdp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 4,
"key": "ftp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 4,
"key": "krb",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 4,
"key": "rfb",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 3,
"key": "ldap",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "telnet",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 848,
"key": "icmp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1573,
"key": "ipv6",
"values": {
"buckets": [
{
"doc_count": 1486,
"key": "udp",
"values": {
"buckets": [
{
"doc_count": 1433,
"key": "dns",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 80,
"key": "icmp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
Software
/mapi/agg/zeek.software.name,zeek.software.unparsed_version
{
"fields": [
"zeek.software.name",
"zeek.software.unparsed_version"
],
"filter": null,
"range": [
1970,
1643067759
],
"urls": [
"/dashboards/app/dashboards#/view/87d990cc-9e0b-41e5-b8fe-b10ae1da0c85?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 6,
"key": "Chrome",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.36 Safari/525.19"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 6,
"key": "Nmap-SSH",
"values": {
"buckets": [
{
"doc_count": 3,
"key": "Nmap-SSH1-Hostkey"
},
{
"doc_count": 3,
"key": "Nmap-SSH2-Hostkey"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 5,
"key": "MSIE",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
},
{
"doc_count": 1,
"key": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Win64; x64; Trident/7.0; .NET4.0C; .NET4.0E)"
},
{
"doc_count": 1,
"key": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 4,
"key": "Firefox",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:34.0) Gecko/20100101 Firefox/34.0"
},
{
"doc_count": 1,
"key": "Mozilla/5.0 (X11; Linux x86_64; rv:96.0) Gecko/20100101 Firefox/96.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 3,
"key": "ECS (sec",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "ECS (sec/96EE)"
},
{
"doc_count": 1,
"key": "ECS (sec/97A6)"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 3,
"key": "NmapNSE",
"values": {
"buckets": [
{
"doc_count": 3,
"key": "NmapNSE_1.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "<unknown browser>",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "Microsoft-Windows",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Microsoft-Windows/6.1 UPnP/1.0 Windows-Media-Player-DMS/12.0.7601.17514 DLNADOC/1.50"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "Microsoft-Windows-NT",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Microsoft-Windows-NT/5.1 UPnP/1.0 UPnP-Device-Host/1.0 Microsoft-HTTPAPI/2.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "SimpleHTTP",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "SimpleHTTP/0.6 Python/2.7.17"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "Windows-Media-Player-DMS",
"values": {
"buckets": [
{
"doc_count": 2,
"key": "Windows-Media-Player-DMS/12.0.7601.17514"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "A-B WWW",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "A-B WWW/0.1"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "CONF-CTR-NAE1",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "CONF-CTR-NAE1"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "ClearSCADA",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "ClearSCADA/6.72.4644.1"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "GoAhead-Webs",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "GoAhead-Webs"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "MSFT",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "MSFT 5.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Microsoft-IIS",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Microsoft-IIS/7.5"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Microsoft-WebDAV-MiniRedir",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Microsoft-WebDAV-MiniRedir/6.1.7601"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Python-urllib",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Python-urllib/2.7"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Schneider-WEB/V",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Schneider-WEB/V2.1.4"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Version",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Version_1.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "nginx",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "nginx"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "sublime-license-check",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "sublime-license-check/3.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
User agent
/mapi/agg/user_agent.original
{
"fields": [
"user_agent.original"
],
"filter": null,
"range": [
1970,
1643067845
],
"values": {
"buckets": [
{
"doc_count": 230,
"key": "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
},
{
"doc_count": 142,
"key": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
},
{
"doc_count": 114,
"key": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
},
{
"doc_count": 50,
"key": "Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)"
},
{
"doc_count": 48,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 43,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 33,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:34.0) Gecko/20100101 Firefox/34.0"
},
{
"doc_count": 17,
"key": "Python-urllib/2.7"
},
{
"doc_count": 12,
"key": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
},
{
"doc_count": 9,
"key": "Microsoft-Windows/6.1 UPnP/1.0 Windows-Media-Player-DMS/12.0.7601.17514 DLNADOC/1.50"
},
{
"doc_count": 9,
"key": "Windows-Media-Player-DMS/12.0.7601.17514"
},
{
"doc_count": 8,
"key": "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
},
{
"doc_count": 5,
"key": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 5,
"key": "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.36 Safari/525.19"
},
{
"doc_count": 3,
"key": "Mozilla/5.0 (X11; Linux x86_64; rv:96.0) Gecko/20100101 Firefox/96.0"
},
{
"doc_count": 2,
"key": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
},
{
"doc_count": 1,
"key": "Microsoft-WebDAV-MiniRedir/6.1.7601"
},
{
"doc_count": 1,
"key": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Win64; x64; Trident/7.0; .NET4.0C; .NET4.0E)"
},
{
"doc_count": 1,
"key": "sublime-license-check/3.0"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
External traffic (outbound/inbound)
$ curl -k -u username -L -XPOST -H 'Content-Type: application/json' \
'https://localhost/mapi/agg/network.protocol' \
-d '{"filter":{"network.direction":["inbound","outbound"]}}'
{
"fields": [
"network.protocol"
],
"filter": {
"network.direction": [
"inbound",
"outbound"
]
},
"range": [
1970,
1643068000
],
"urls": [
"/dashboards/app/dashboards#/view/abdd7550-2c7c-40dc-947e-f6d186a158c4?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 202597,
"key": "bacnet"
},
{
"doc_count": 129,
"key": "tls"
},
{
"doc_count": 128,
"key": "ssl"
},
{
"doc_count": 33,
"key": "http"
},
{
"doc_count": 33,
"key": "ntp"
},
{
"doc_count": 20,
"key": "dns"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
Cross-segment traffic
$ curl -k -u username -L -XPOST -H 'Content-Type: application/json' \
'https://localhost/mapi/agg/source.segment,destination.segment,network.protocol' \
-d '{"filter":{"tags":"cross_segment"}}'
{
"fields": [
"source.segment",
"destination.segment",
"network.protocol"
],
"filter": {
"tags": "cross_segment"
},
"range": [
1970,
1643068080
],
"urls": [
"/dashboards/app/dashboards#/view/abdd7550-2c7c-40dc-947e-f6d186a158c4?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 6893,
"key": "Corporate",
"values": {
"buckets": [
{
"doc_count": 6893,
"key": "OT",
"values": {
"buckets": [
{
"doc_count": 891,
"key": "enip"
},
{
"doc_count": 889,
"key": "cip"
},
{
"doc_count": 202,
"key": "http"
},
{
"doc_count": 146,
"key": "modbus"
},
{
"doc_count": 1,
"key": "ftp"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 189,
"key": "OT",
"values": {
"buckets": [
{
"doc_count": 138,
"key": "Corporate",
"values": {
"buckets": [
{
"doc_count": 128,
"key": "http"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 51,
"key": "DMZ",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 28,
"key": "Battery Network",
"values": {
"buckets": [
{
"doc_count": 25,
"key": "Combined Cycle BOP",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 3,
"key": "Solar Panel Network",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 20,
"key": "Combined Cycle BOP",
"values": {
"buckets": [
{
"doc_count": 11,
"key": "Battery Network",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 9,
"key": "Solar Panel Network",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Solar Panel Network",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Combined Cycle BOP",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
Plaintext password
$ curl -k -u username -L -XPOST -H 'Content-Type: application/json' \
'https://localhost/mapi/agg/network.protocol' \
-d '{"filter":{"!related.password":null}}'
{
"fields": [
"network.protocol"
],
"filter": {
"!related.password": null
},
"range": [
1970,
1643068162
],
"urls": [
"/dashboards/app/dashboards#/view/abdd7550-2c7c-40dc-947e-f6d186a158c4?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 20,
"key": "http"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
Insecure/outdated protocols
$ curl -k -u username -L -XPOST -H 'Content-Type: application/json' \
'https://localhost/mapi/agg/network.protocol,network.protocol_version' \
-d '{"filter":{"event.severity_tags":"Insecure or outdated protocol"}}'
{
"fields": [
"network.protocol",
"network.protocol_version"
],
"filter": {
"event.severity_tags": "Insecure or outdated protocol"
},
"range": [
1970,
1643068248
],
"urls": [
"/dashboards/app/dashboards#/view/abdd7550-2c7c-40dc-947e-f6d186a158c4?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 4244,
"key": "smb",
"values": {
"buckets": [
{
"doc_count": 4244,
"key": "1"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "ftp",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "rdp",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "5.1"
},
{
"doc_count": 1,
"key": "5.2"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 2,
"key": "telnet",
"values": {
"buckets": [],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
Notice categories
/mapi/agg/zeek.notice.category,zeek.notice.sub_category
{
"fields": [
"zeek.notice.category",
"zeek.notice.sub_category"
],
"filter": null,
"range": [
1970,
1643068300
],
"urls": [
"/dashboards/app/dashboards#/view/f1f09567-fc7f-450b-a341-19d2f2bb468b?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))",
"/dashboards/app/dashboards#/view/95479950-41f2-11ea-88fa-7151df485405?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 100,
"key": "ATTACK",
"values": {
"buckets": [
{
"doc_count": 42,
"key": "Lateral_Movement_Extracted_File"
},
{
"doc_count": 30,
"key": "Lateral_Movement"
},
{
"doc_count": 17,
"key": "Discovery"
},
{
"doc_count": 5,
"key": "Execution"
},
{
"doc_count": 5,
"key": "Lateral_Movement_Multiple_Attempts"
},
{
"doc_count": 1,
"key": "Lateral_Movement_and_Execution"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 14,
"key": "EternalSafety",
"values": {
"buckets": [
{
"doc_count": 11,
"key": "EternalSynergy"
},
{
"doc_count": 3,
"key": "ViolationPidMid"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 6,
"key": "Scan",
"values": {
"buckets": [
{
"doc_count": 6,
"key": "Port_Scan"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
},
{
"doc_count": 1,
"key": "Ripple20",
"values": {
"buckets": [
{
"doc_count": 1,
"key": "Treck_TCP_observed"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
Severity tags
/mapi/agg/event.severity_tags
{
"fields": [
"event.severity_tags"
],
"filter": null,
"range": [
1970,
1643068363
],
"urls": [
"/dashboards/app/dashboards#/view/d2dd0180-06b1-11ec-8c6b-353266ade330?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))",
"/dashboards/app/dashboards#/view/95479950-41f2-11ea-88fa-7151df485405?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'1970-01-01T00:32:50Z',to:now))"
],
"values": {
"buckets": [
{
"doc_count": 160180,
"key": "Outbound traffic"
},
{
"doc_count": 43059,
"key": "Inbound traffic"
},
{
"doc_count": 11091,
"key": "Connection attempt rejected"
},
{
"doc_count": 8967,
"key": "Connection attempt, no reply"
},
{
"doc_count": 7131,
"key": "Cross-segment traffic"
},
{
"doc_count": 4250,
"key": "Insecure or outdated protocol"
},
{
"doc_count": 2219,
"key": "External traffic"
},
{
"doc_count": 1985,
"key": "Sensitive country"
},
{
"doc_count": 760,
"key": "Weird"
},
{
"doc_count": 537,
"key": "Connection aborted (originator)"
},
{
"doc_count": 474,
"key": "Connection aborted (responder)"
},
{
"doc_count": 206,
"key": "File transfer (high concern)"
},
{
"doc_count": 100,
"key": "MITRE ATT&CK framework technique"
},
{
"doc_count": 66,
"key": "Service on non-standard port"
},
{
"doc_count": 64,
"key": "Signature (capa)"
},
{
"doc_count": 30,
"key": "Signature (YARA)"
},
{
"doc_count": 25,
"key": "Signature (ClamAV)"
},
{
"doc_count": 20,
"key": "Cleartext password"
},
{
"doc_count": 19,
"key": "Long connection"
},
{
"doc_count": 15,
"key": "Notice (vulnerability)"
},
{
"doc_count": 13,
"key": "File transfer (medium concern)"
},
{
"doc_count": 6,
"key": "Notice (scan)"
},
{
"doc_count": 1,
"key": "High volume connection"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
POST - `/mapi/event`
A webhook that accepts alert data to be reindexed into OpenSearch as session records for viewing in Malcolm's dashboards. See Alerting for more details and an example of how this API is used.
Example input:
{
"alert": {
"monitor": {
"name": "Malcolm API Loopback Monitor"
},
"trigger": {
"name": "Malcolm API Loopback Trigger",
"severity": 4
},
"period": {
"start": "2022-03-08T18:03:30.576Z",
"end": "2022-03-08T18:04:30.576Z"
},
"results": [
{
"_shards": {
"total": 5,
"failed": 0,
"successful": 5,
"skipped": 0
},
"hits": {
"hits": [],
"total": {
"value": 697,
"relation": "eq"
},
"max_score": null
},
"took": 1,
"timed_out": false
}
],
"body": "",
"alert": "PLauan8BaL6eY1yCu9Xj",
"error": ""
}
}
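Assuming the example input above has been saved to a file named `alert.json` (a hypothetical filename), POSTing it to this webhook might look like:
$ curl -k -u username -L -XPOST -H 'Content-Type: application/json' \
    'https://localhost/mapi/event' \
    -d @alert.json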
Example output:
{
"_index": "arkime_sessions3-220308",
"_type": "_doc",
"_id": "220308-PLauan8BaL6eY1yCu9Xj",
"_version": 4,
"result": "updated",
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"_seq_no": 9045,
"_primary_term": 1
}
Malcolm uses OpenSearch and OpenSearch Dashboards for data storage, search, and visualization, and Logstash for log processing. Because these tools are data agnostic, Malcolm can be configured to accept various host logs and other third-party logs sent from log forwarders such as Fluent Bit and Beats. Some examples of the types of logs these forwarders might send include:
- System resource utilization metrics (CPU, memory, disk, network, etc.)
- System temperatures
- Linux system logs
- Windows event logs
- Process or service health status
- Logs appended to textual log files (e.g., `tail`-ing a log file)
- The output of an external script or program
- Messages in the form of MQTT control packets
- many more…
Refer to Forwarding Third-Party Logs to Malcolm for more information.
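As an illustration only (the input plugin, tag, host, port, and TLS settings here are placeholders; see the linked documentation for the values appropriate to your Malcolm instance), a Fluent Bit invocation forwarding CPU metrics might look something like:
$ fluent-bit -i cpu -t cpu.local \
    -o tcp://malcolm.example.org:5045 -m '*' \
    -p format=json_lines -p tls=on -p tls.verify=off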
Malcolm's Docker-based deployment model enables it to run on a variety of platforms. However, in some circumstances (for example, as a long-running appliance as part of a security operations center, or inside of a virtual machine) it may be desirable to install Malcolm as a dedicated standalone installation.
Malcolm can be packaged into an installer ISO based on the current stable release of Debian. This customized Debian installation is preconfigured with the bare minimum software needed to run Malcolm.
Official downloads of the Malcolm installer ISO are not provided; however, it can be built easily on an internet-connected Linux host with Vagrant. Building the ISO requires:
- Vagrant
- `vagrant-reload` plugin
- `vagrant-sshfs` plugin
- `bento/debian-11` Vagrant box
The build should work with either the VirtualBox provider or the libvirt provider:
- VirtualBox provider
  - `vagrant-vbguest` plugin
- libvirt provider
  - `vagrant-libvirt` provider plugin
  - `vagrant-mutate` plugin to convert the `bento/debian-11` Vagrant box to `libvirt` format
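If any of these prerequisites are missing, they can typically be installed from the command line; a sketch (plugin and box names are those listed above, and the `vagrant mutate` step applies only to the libvirt provider):
$ vagrant plugin install vagrant-reload vagrant-sshfs
$ vagrant plugin install vagrant-vbguest
$ vagrant plugin install vagrant-libvirt vagrant-mutate
$ vagrant box add bento/debian-11
$ vagrant mutate bento/debian-11 libvirt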
To perform a clean build of the Malcolm installer ISO, navigate to your local Malcolm working copy and run:
$ ./malcolm-iso/build_via_vagrant.sh -f
…
Starting build machine...
Bringing machine 'default' up with 'virtualbox' provider...
…
Building the ISO may take 30 minutes or more depending on your system. As the build finishes, you will see the following message indicating success:
…
Finished, created "/malcolm-build/malcolm-iso/malcolm-6.3.0.iso"
…
By default, Malcolm's Docker images are not packaged with the installer ISO; it is assumed instead that you will pull the latest images with a `docker-compose pull` command as described in the Quick start section. If you wish to build an ISO with the latest Malcolm images included, follow the directions to create pre-packaged installation files, which include a tarball with a name like `malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz`. Then, pass that images tarball to the ISO build script with the `-d` option, like this:
$ ./malcolm-iso/build_via_vagrant.sh -f -d malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz
…
A system installed from the resulting ISO will load the Malcolm Docker images upon first boot. This method is desirable when the ISO is to be installed in an "air gapped" environment or for distribution to non-networked machines.
Alternately, if you have forked Malcolm on GitHub, workflow files are provided which contain instructions for GitHub to build the Docker images as well as the sensor and Malcolm installer ISOs, specifically `malcolm-iso-build-docker-wrap-push-ghcr.yml` for the Malcolm ISO. You'll need to run the workflows to build and push your fork's Malcolm Docker images before building the ISO. The resulting ISO file is wrapped in a Docker image that provides an HTTP server from which the ISO may be downloaded.
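For example, with the GitHub CLI installed, the ISO workflow could be triggered from the command line against your fork (the fork name here is a placeholder; workflows can also be run from the repository's Actions tab):
$ gh workflow run malcolm-iso-build-docker-wrap-push-ghcr.yml --repo youruser/Malcolm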
The installer is designed to require as little user input as possible. For this reason, there are NO user prompts and confirmations about partitioning and reformatting hard disks for use by the operating system. The installer assumes that all non-removable storage media (e.g., SSD, HDD, NVMe, etc.) are available for use and ⛔🆘😭💀 will partition and format them without warning 💀😭🆘⛔.
The installer will ask for several pieces of information prior to installing the Malcolm base operating system:
- Hostname
- Domain name
- Root password – (optional) a password for the privileged root account which is rarely needed
- User name – the name for the non-privileged service account under which Malcolm runs
- User password – a password for the non-privileged service account
- Encryption password (optional) – if the encrypted installation option was selected at boot time, the encryption password must be entered every time the system boots
At the end of the installation process, you will be prompted with a few self-explanatory yes/no questions:
- Disable IPv6?
- Automatically login to the GUI session?
- Should the GUI session be locked due to inactivity?
- Display the Standard Mandatory DoD Notice and Consent Banner? (only applies when installed on U.S. government information systems)
Following these prompts, the installer will reboot and the Malcolm base operating system will boot.
When the system boots for the first time, the Malcolm Docker images will load if the installer was built with pre-packaged installation files as described above. Wait for this operation to complete (the progress dialog will disappear when the images have finished loading) before continuing the setup.
Open a terminal (click the red terminal 🗔 icon next to the Debian swirl logo 🍥 menu button in the menu bar). At this point, setup is similar to the steps described in the Quick start section. Navigate to the Malcolm directory (`cd ~/Malcolm`) and run `auth_setup` to configure authentication. If the ISO didn't have pre-packaged Malcolm images, or if you'd like to retrieve the latest updates, run `docker-compose pull`. Finalize your configuration by running `scripts/install.py --configure` and follow the prompts as illustrated in the installation example.
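Taken together, the first-boot configuration might look something like the following sketch (the prompts presented by each command are described in the sections referenced above):
user@malcolm:~$ cd ~/Malcolm
user@malcolm:~/Malcolm$ ./scripts/auth_setup
user@malcolm:~/Malcolm$ docker-compose pull
user@malcolm:~/Malcolm$ ./scripts/install.py --configure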
Once Malcolm is configured, you can start Malcolm via the command line or by clicking the circular yellow Malcolm icon in the menu bar.
If you wish to set up time synchronization via NTP or `htpdate`, open a terminal and run `sudo configure-interfaces.py`. Select Continue, then choose Time Sync. Here you can configure the operating system to keep its time synchronized with either an NTP server (using the NTP protocol), another Malcolm instance, or another HTTP/HTTPS server. On the next dialog, choose the time synchronization method you wish to configure.
If htpdate is selected, you will be prompted to enter the IP address or hostname and port of an HTTP/HTTPS server (for a Malcolm instance, port `9200` may be used) and the time synchronization check frequency in minutes. A test connection will be made to determine if the time can be retrieved from the server.
If ntpdate is selected, you will be prompted to enter the IP address or hostname of the NTP server.
Upon configuring time synchronization, a "Time synchronization configured successfully!" message will be displayed.
The Malcolm aggregator base operating system targets the following guidelines for establishing a secure configuration posture:
- DISA STIG (Security Technical Implementation Guides) ported from DISA RHEL 7 STIG v1r1 to a Debian 9 base platform
- CIS Debian Linux 9 Benchmark with additional recommendations by the hardenedlinux/harbian-audit project
Currently there are 158 compliance checks that can be verified automatically and 23 compliance checks that must be verified manually.
The Malcolm aggregator base operating system claims the following exceptions to STIG compliance:
# | ID | Title | Justification |
---|---|---|---|
1 | SV-86535r1 | When passwords are changed a minimum of eight of the total number of characters must be changed. | Account/password policy exception: As an aggregator running Malcolm is intended to be used as an appliance rather than a general user-facing software platform, some exceptions to password enforcement policies are claimed. |
2 | SV-86537r1 | When passwords are changed a minimum of four character classes must be changed. | Account/password policy exception |
3 | SV-86549r1 | Passwords for new users must be restricted to a 24 hours/1 day minimum lifetime. | Account/password policy exception |
4 | SV-86551r1 | Passwords must be restricted to a 24 hours/1 day minimum lifetime. | Account/password policy exception |
5 | SV-86553r1 | Passwords for new users must be restricted to a 60-day maximum lifetime. | Account/password policy exception |
6 | SV-86555r1 | Existing passwords must be restricted to a 60-day maximum lifetime. | Account/password policy exception |
7 | SV-86557r1 | Passwords must be prohibited from reuse for a minimum of five generations. | Account/password policy exception |
8 | SV-86565r1 | The operating system must disable account identifiers (individuals, groups, roles, and devices) if the password expires. | Account/password policy exception |
9 | SV-86567r2 | Accounts subject to three unsuccessful logon attempts within 15 minutes must be locked for the maximum configurable period. | Account/password policy exception |
10 | SV-86569r1 | If three unsuccessful root logon attempts within 15 minutes occur the associated account must be locked. | Account/password policy exception |
11 | SV-86603r1 | The … operating system must prevent the installation of software, patches, service packs, device drivers, or operating system components of local packages without verification they have been digitally signed using a certificate that is issued by a Certificate Authority (CA) that is recognized and approved by the organization. | As the base distribution is not using embedded signatures, debsig-verify would reject all packages (see comment in /etc/dpkg/dpkg.cfg ). Enabling it after installation would disallow any future updates. |
12 | SV-86607r1 | USB mass storage must be disabled. | The ability to ingest data (such as PCAP files) from a mounted USB mass storage device is a requirement of the system. |
13 | SV-86609r1 | File system automounter must be disabled unless required. | The ability to ingest data (such as PCAP files) from a mounted USB mass storage device is a requirement of the system. |
14 | SV-86705r1 | The operating system must shut down upon audit processing failure, unless availability is an overriding concern. If availability is a concern, the system must alert the designated staff (System Administrator [SA] and Information System Security Officer [ISSO] at a minimum) in the event of an audit processing failure. | As maximizing availability is a system requirement, audit processing failures will be logged on the device rather than halting the system. |
15 | SV-86713r1 | The operating system must immediately notify the System Administrator (SA) and Information System Security Officer ISSO (at a minimum) when allocated audit record storage volume reaches 75% of the repository maximum audit record storage capacity. | same as above |
16 | SV-86715r1 | The operating system must immediately notify the System Administrator (SA) and Information System Security Officer (ISSO) (at a minimum) when the threshold for the repository maximum audit record storage capacity is reached. | same as above |
17 | SV-86597r1 | A file integrity tool must verify the baseline operating system configuration at least weekly. | This functionality is not configured by default, but it can be configured post-install using the aide tool |
18 | SV-86697r2 | The file integrity tool must use FIPS 140-2 approved cryptographic hashes for validating file contents and directories. | same as above |
19 | SV-86707r1 | The operating system must off-load audit records onto a different system or media from the system being audited. | same as above |
20 | SV-86709r1 | The operating system must encrypt the transfer of audit records off-loaded onto a different system or media from the system being audited. | same as above |
21 | SV-86833r1 | The system must send rsyslog output to a log aggregation server. | same as above |
22 | SV-87815r2 | The audit system must take appropriate action when there is an error sending audit records to a remote system. | same as above |
23 | SV-86693r2 | The file integrity tool must be configured to verify Access Control Lists (ACLs). | As this is not a multi-user system, the ACL check would be irrelevant. |
24 | SV-86837r1 | The system must use and update a DoD-approved virus scan program. | As this is a network traffic analysis appliance rather than an end-user device, regular user files will not be created. A virus scan program would impact device performance and would be unnecessary. |
25 | SV-86839r1 | The system must update the virus scan program every seven days or more frequently. | As this is a network traffic analysis appliance rather than an end-user device, regular user files will not be created. A virus scan program would impact device performance and would be unnecessary. |
26 | SV-86847r2 | All network connections associated with a communication session must be terminated at the end of the session or after 10 minutes of inactivity from the user at a command prompt, except to fulfill documented and validated mission requirements. | Malcolm may be controlled from the command line in a manual capture scenario, so timing out a session based on command prompt inactivity would be inadvisable. |
27 | SV-86893r2 | The operating system must, for networked systems, synchronize clocks with a server that is synchronized to one of the redundant United States Naval Observatory (USNO) time servers, a time server designated for the appropriate DoD network (NIPRNet/SIPRNet), and/or the Global Positioning System (GPS). | While time synchronization is supported on the Malcolm aggregator base operating system, an exception is claimed for this rule as the device may be configured to sync to servers other than the ones listed in the STIG. |
28 | SV-86905r1 | For systems using DNS resolution, at least two name servers must be configured. | STIG recommendations for DNS servers are not enforced on the Malcolm aggregator base operating system to allow for use in a variety of network scenarios. |
29 | SV-86919r1 | Network interfaces must not be in promiscuous mode. | One purpose of the Malcolm aggregator base operating system is to sniff and capture network traffic. |
30 | SV-86931r2 | An X Windows display manager must not be installed unless approved. | A locked-down X Windows session is required for the sensor's kiosk display. |
31 | SV-86519r3 | The operating system must set the idle delay setting for all connection types. | As this is a network traffic aggregation and analysis appliance rather than an end-user device, timing out displays or connections would not be desirable. |
32 | SV-86523r1 | The operating system must initiate a session lock for the screensaver after a period of inactivity for graphical user interfaces. | This option is configurable during install time. Some installations of the Malcolm aggregator base operating system may be on appliance hardware not equipped with a keyboard by default, in which case it may not be desirable to lock the session. |
33 | SV-86525r1 | The operating system must initiate a session lock for graphical user interfaces when the screensaver is activated. | This option is configurable during install time. Some installations of the Malcolm aggregator base operating system may be on appliance hardware not equipped with a keyboard by default, in which case it may not be desirable to lock the session. |
34 | SV-86589r1 | The operating system must uniquely identify and must authenticate organizational users (or processes acting on behalf of organizational users) using multifactor authentication. | As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable. |
35 | SV-86921r2 | The system must be configured to prevent unrestricted mail relaying. | Does not apply as the Malcolm aggregator base operating system does not run a mail service. |
36 | SV-86929r1 | If the Trivial File Transfer Protocol (TFTP) server is required, the TFTP daemon must be configured to operate in secure mode. | Does not apply as the Malcolm aggregator base operating system does not run a TFTP server. |
37 | SV-86935r3 | The Network File System (NFS) must be configured to use RPCSEC_GSS. | Does not apply as the Malcolm aggregator base operating system does not run an NFS server. |
38 | SV-87041r2 | The operating system must have the required packages for multifactor authentication installed. | As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable. |
39 | SV-87051r2 | The operating system must implement multifactor authentication for access to privileged accounts via pluggable authentication modules (PAM). | As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable. |
40 | SV-87059r2 | The operating system must implement smart card logons for multifactor authentication for access to privileged accounts. | As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable. |
41 | SV-87829r1 | Wireless network adapters must be disabled. | As an appliance intended to capture network traffic in a variety of network environments, wireless adapters may be needed to capture and/or report wireless traffic. |
42 | SV-86699r1 | The system must not allow removable media to be used as the boot loader unless approved. | The Malcolm aggregator base operating system supports a live boot mode that can be booted from removable media. |
Please review the notes for these additional rules. While not claiming an exception, they may be implemented or checked in a different way than outlined by the RHEL STIG as the Malcolm aggregator base operating system is not built on RHEL or for other reasons.
# | ID | Title | Note |
---|---|---|---|
1 | SV-86585r1 | Systems with a Basic Input/Output System (BIOS) must require authentication upon booting into single-user and maintenance modes. | Although the compliance check script does not detect it, booting into recovery mode does in fact require the root password. |
2 | SV-86587r1 | Systems using Unified Extensible Firmware Interface (UEFI) must require authentication upon booting into single-user and maintenance modes. | Although the compliance check script does not detect it, booting into recovery mode does in fact require the root password. |
3 | SV-86651r1 | All files and directories contained in local interactive user home directories must have mode 0750 or less permissive. | Depending on when the compliance check script is run, some ephemeral files may exist in the service account's home directory which will cause this check to fail. For practical purposes the Malcolm aggregator base operating system's configuration does, however, comply. |
4 | SV-86623r3 | Vendor packaged system security patches and updates must be installed and up to date. | When the Malcolm aggregator base operating system appliance software is built, all of the latest applicable security patches and updates are included in it. How future updates are to be handled is still in design. |
6 | SV-86691r2 | The operating system must implement NIST FIPS-validated cryptography for the following: to provision digital signatures, to generate cryptographic hashes, and to protect data requiring data-at-rest protections in accordance with applicable federal laws, Executive Orders, directives, policies, regulations, and standards. | The Malcolm aggregator base operating system does use FIPS-compatible libraries for cryptographic functions. However, the kernel parameter being checked by the compliance check script is incompatible with some of the system's initialization scripts. |
In addition, DISA STIG rules SV-86663r1, SV-86695r2, SV-86759r3, SV-86761r3, SV-86763r3, SV-86765r3, SV-86595r1, and SV-86615r2 relate to the SELinux kernel which is not used in the Malcolm aggregator base operating system, and are thus skipped.
Currently there are 271 checks to determine compliance with the CIS Debian Linux 9 Benchmark.
The Malcolm aggregator base operating system claims exceptions from the recommendations in this benchmark in the following categories:
1.1 Install Updates, Patches and Additional Security Software - When the Malcolm aggregator appliance software is built, all of the latest applicable security patches and updates are included in it. How future updates are to be handled is still in design.
1.3 Enable verify the signature of local packages - As the base distribution is not using embedded signatures, `debsig-verify` would reject all packages (see comment in `/etc/dpkg/dpkg.cfg`). Enabling it after installation would disallow any future updates.
2.14 Add nodev option to /run/shm Partition, 2.15 Add nosuid Option to /run/shm Partition, 2.16 Add noexec Option to /run/shm Partition - The Malcolm aggregator base operating system does not mount `/run/shm` as a separate partition, so these recommendations do not apply.
2.18 Disable Mounting of cramfs Filesystems, 2.19 Disable Mounting of freevxfs Filesystems, 2.20 Disable Mounting of jffs2 Filesystems, 2.21 Disable Mounting of hfs Filesystems, 2.22 Disable Mounting of hfsplus Filesystems, 2.23 Disable Mounting of squashfs Filesystems, 2.24 Disable Mounting of udf Filesystems - The Malcolm aggregator base operating system does not compile a custom Linux kernel, so these filesystems are inherently supported as they are part of Debian Linux's default kernel.
4.6 Disable USB Devices - The ability to ingest data (such as PCAP files) from a mounted USB mass storage device is a requirement of the system.
6.1 Ensure the X Window system is not installed, 6.2 Ensure Avahi Server is not enabled, 6.3 Ensure print server is not enabled - An X Windows session is provided for displaying dashboards. The library packages `libavahi-common-data`, `libavahi-common3`, and `libcups2` are dependencies of some of the X components used by the Malcolm aggregator base operating system, but the `avahi` and `cups` services themselves are disabled.
6.17 Ensure virus scan Server is enabled, 6.18 Ensure virus scan Server update is enabled - As this is a network traffic analysis appliance rather than an end-user device, regular user files will not be created. A virus scan program would impact device performance and would be unnecessary.
7.2.4 Log Suspicious Packets, 7.2.7 Enable RFC-recommended Source Route Validation, 7.4.1 Install TCP Wrappers - As Malcolm may operate as a network traffic capture appliance sniffing packets on a network interface configured in promiscuous mode, these recommendations do not apply.
8.4.1 Install aide package and 8.4.2 Implement Periodic Execution of File Integrity - This functionality is not configured by default, but it could be configured post-install using `aide`.
8.1.1.2 Disable System on Audit Log Full, 8.1.1.3 Keep All Auditing Information, 8.1.1.5 Ensure set remote_server for audit service, 8.1.1.6 Ensure enable_krb5 set to yes for remote audit service, 8.1.1.7 Ensure set action for audit storage volume is fulled, 8.1.1.9 Set space left for auditd service, a few other audit-related items under section 8.1, 8.2.5 Configure rsyslog to Send Logs to a Remote Log Host - As maximizing availability is a system requirement, audit processing failures will be logged on the device rather than halting the system. `auditd` is set up to syslog when its local storage capacity is reached.
Password-related recommendations under 9.2 and 10.1 - The library package `libpam-pwquality` is used in favor of `libpam-cracklib`, which is what the compliance scripts are looking for. Also, as a system running Malcolm is intended to be used as an appliance rather than a general user-facing software platform, some exceptions to password enforcement policies are claimed.
9.3.13 Limit Access via SSH - The Malcolm aggregator base operating system does not create multiple regular user accounts: only `root` and an aggregator service account are used. SSH access for `root` is disabled. SSH login with a password is also disallowed: only key-based authentication is accepted. The service account accepts no keys by default. As such, the `AllowUsers`, `AllowGroups`, `DenyUsers`, and `DenyGroups` values in `sshd_config` do not apply.
9.5 Restrict Access to the su Command - The Malcolm aggregator base operating system does not create multiple regular user accounts: only `root` and an aggregator service account are used.
10.1.10 Set maxlogins for all accounts and 10.5 Set Timeout on ttys - The Malcolm aggregator base operating system does not create multiple regular user accounts: only `root` and an aggregator service account are used.
12.10 Find SUID System Executables, 12.11 Find SGID System Executables - The few files found by these scripts are valid exceptions required by the Malcolm aggregator base operating system's core requirements.
Please review the notes for these additional guidelines. While not claiming an exception, the Malcolm aggregator base operating system may implement them in a manner different than is described by the CIS Debian Linux 9 Benchmark or the hardenedlinux/harbian-audit audit scripts.
4.1 Restrict Core Dumps - The Malcolm aggregator base operating system disables core dumps using a configuration file for `ulimit` named `/etc/security/limits.d/limits.conf`. The audit script checking for this does not check the `limits.d` subdirectory, which is why this is incorrectly flagged as noncompliant.
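A core-dump-disabling entry in that file follows standard limits.conf syntax, something like the sketch below (illustrative only; the actual file also raises other limits as described in the installation example later in this document):
$ cat /etc/security/limits.d/limits.conf
* hard core 0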
5.4 Ensure ctrl-alt-del is disabled - The Malcolm aggregator base operating system disables the `ctrl+alt+delete` key sequence by executing `systemctl disable ctrl-alt-del.target` during installation and the command `systemctl mask ctrl-alt-del.target` at boot time.
6.19 Configure Network Time Protocol (NTP) - While time synchronization is supported on the Malcolm aggregator base operating system, an exception is claimed for this rule as the network sensor device may be configured to sync to servers in a different way than specified in the benchmark.
7.4.4 Create /etc/hosts.deny, 7.7.1 Ensure Firewall is active, 7.7.4.1 Ensure default deny firewall policy, 7.7.4.3 Ensure default deny firewall policy, 7.7.4.4 Ensure outbound and established connections are configured - The Malcolm aggregator base operating system is configured with an appropriately locked-down software firewall (managed by "Uncomplicated Firewall" `ufw`). However, the methods outlined in the CIS benchmark recommendations do not account for this configuration.
8.7 Verifies integrity all packages - The script which verifies package integrity only "fails" because of missing (status `??5??????` displayed by the utility) language ("locale") files, which are removed as part of the Malcolm aggregator base operating system's trimming-down process. All non-locale-related system files pass integrity checks.
Here's a step-by-step example of getting Malcolm from GitHub, configuring your system and your Malcolm instance, and running it on a system running Ubuntu Linux. Your mileage may vary depending on your individual system configuration, but this should be a good starting point.
The commands in this example should be executed as a non-root user.
You can use `git` to clone Malcolm into a local working copy, or you can download and extract the artifacts from the latest release.
To install Malcolm from the latest Malcolm release, browse to the Malcolm releases page on GitHub and download at a minimum `install.py` and the `malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz` file, then navigate to your downloads directory:
user@host:~$ cd Downloads/
user@host:~/Downloads$ ls
malcolm_common.py install.py malcolm_20190611_095410_ce2d8de.tar.gz
If you are obtaining Malcolm using `git` instead, run the following command to clone Malcolm into a local working copy:
user@host:~$ git clone https://github.com/cisagov/Malcolm
Cloning into 'Malcolm'...
remote: Enumerating objects: 443, done.
remote: Counting objects: 100% (443/443), done.
remote: Compressing objects: 100% (310/310), done.
remote: Total 443 (delta 81), reused 441 (delta 79), pack-reused 0
Receiving objects: 100% (443/443), 6.87 MiB | 18.86 MiB/s, done.
Resolving deltas: 100% (81/81), done.
user@host:~$ cd Malcolm/
Next, run the `install.py` script to configure your system. Replace `user` in this example with your local account username, and follow the prompts. Most questions have an acceptable default you can accept by pressing the `Enter` key. Depending on whether you are installing Malcolm from the release tarball or inside of a git working copy, the questions below will be slightly different, but for the most part they are the same.
user@host:~/Malcolm$ sudo ./scripts/install.py
Installing required packages: ['apache2-utils', 'make', 'openssl', 'python3-dialog']
"docker info" failed, attempt to install Docker? (Y/n): y
Attempt to install Docker using official repositories? (Y/n): y
Installing required packages: ['apt-transport-https', 'ca-certificates', 'curl', 'gnupg-agent', 'software-properties-common']
Installing docker packages: ['docker-ce', 'docker-ce-cli', 'containerd.io']
Installation of docker packages apparently succeeded
Add a non-root user to the "docker" group?: y
Enter user account: user
Add another non-root user to the "docker" group?: n
"docker-compose version" failed, attempt to install docker-compose? (Y/n): y
Install docker-compose directly from docker github? (Y/n): y
Download and installation of docker-compose apparently succeeded
fs.file-max increases allowed maximum for file handles
fs.file-max= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
fs.inotify.max_user_watches increases allowed maximum for monitored files
fs.inotify.max_user_watches= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
fs.inotify.max_queued_events increases queue size for monitored files
fs.inotify.max_queued_events= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
fs.inotify.max_user_instances increases allowed maximum monitor file watchers
fs.inotify.max_user_instances= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
vm.max_map_count increases allowed maximum for memory segments
vm.max_map_count= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
net.core.somaxconn increases allowed maximum for socket connections
net.core.somaxconn= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
vm.swappiness adjusts the preference of the system to swap vs. drop runtime memory pages
vm.swappiness= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
vm.dirty_background_ratio defines the percentage of system memory fillable with "dirty" pages before flushing
vm.dirty_background_ratio= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
vm.dirty_ratio defines the maximum percentage of dirty system memory before committing everything
vm.dirty_ratio= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y
/etc/security/limits.d/limits.conf increases the allowed maximums for file handles and memlocked segments
/etc/security/limits.d/limits.conf does not exist, create it? (Y/n): y
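For reference, the entries `install.py` appends to `/etc/sysctl.conf` take the usual `key=value` form; the sketch below shows the shape of the result, with illustrative values only (the script supplies its own defaults):

```
# /etc/sysctl.conf additions (illustrative values only)
fs.file-max=518144
fs.inotify.max_user_watches=131072
fs.inotify.max_queued_events=131072
fs.inotify.max_user_instances=512
vm.max_map_count=262144
net.core.somaxconn=65535
vm.swappiness=1
vm.dirty_background_ratio=40
vm.dirty_ratio=80
```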
If you are configuring Malcolm from within a git working copy, `install.py` will now exit. Run `install.py` again like you did at the beginning of the example, only remove the `sudo` and add `--configure` to run `install.py` in "configuration only" mode.
user@host:~/Malcolm$ ./scripts/install.py --configure
Alternatively, if you are configuring Malcolm from the release tarball, you will be asked whether you would like to extract the contents of the tarball and to specify the installation directory, and `install.py` will continue:
Extract Malcolm runtime files from /home/user/Downloads/malcolm_20190611_095410_ce2d8de.tar.gz (Y/n): y
Enter installation path for Malcolm [/home/user/Downloads/malcolm]: /home/user/Malcolm
Malcolm runtime files extracted to /home/user/Malcolm
Now that any necessary system configuration changes have been made, the local Malcolm instance will be configured:
Malcolm processes will run as UID 1000 and GID 1000. Is this OK? (Y/n): y
Should Malcolm use and maintain its own OpenSearch instance? (Y/n): y
Forward Logstash logs to a secondary remote OpenSearch instance? (y/N): n
Setting 10g for OpenSearch and 3g for Logstash. Is this OK? (Y/n): y
Setting 3 workers for Logstash pipelines. Is this OK? (Y/n): y
Restart Malcolm upon system or Docker daemon restart? (y/N): y
1: no
2: on-failure
3: always
4: unless-stopped
Select Malcolm restart behavior (unless-stopped): 4
Require encrypted HTTPS connections? (Y/n): y
Will Malcolm be running behind another reverse proxy (Traefik, Caddy, etc.)? (y/N): n
Specify external Docker network name (or leave blank for default networking) ():
Authenticate against Lightweight Directory Access Protocol (LDAP) server? (y/N): n
Store OpenSearch index snapshots locally in /home/user/Malcolm/opensearch-backup? (Y/n): y
Compress OpenSearch index snapshots? (y/N): n
Delete the oldest indices when the database exceeds a certain size? (y/N): n
Automatically analyze all PCAP files with Suricata? (Y/n): y
Download updated Suricata signatures periodically? (Y/n): y
Automatically analyze all PCAP files with Zeek? (Y/n): y
Perform reverse DNS lookup locally for source and destination IP addresses in logs? (y/N): n
Perform hardware vendor OUI lookups for MAC addresses? (Y/n): y
Perform string randomness scoring on some fields? (Y/n): y
Expose OpenSearch port to external hosts? (y/N): n
Expose Logstash port to external hosts? (y/N): n
Expose Filebeat TCP port to external hosts? (y/N): y
1: json
2: raw
Select log format for messages sent to Filebeat TCP listener (json): 1
Source field to parse for messages sent to Filebeat TCP listener (message): message
Target field under which to store decoded JSON fields for messages sent to Filebeat TCP listener (miscbeat): miscbeat
Field to drop from events sent to Filebeat TCP listener (message): message
Tag to apply to messages sent to Filebeat TCP listener (_malcolm_beats): _malcolm_beats
Expose SFTP server (for PCAP upload) to external hosts? (y/N): n
Enable file extraction with Zeek? (y/N): y
1: none
2: known
3: mapped
4: all
5: interesting
Select file extraction behavior (none): 5
1: quarantined
2: all
3: none
Select file preservation behavior (quarantined): 1
Scan extracted files with ClamAV? (y/N): y
Scan extracted files with Yara? (y/N): y
Scan extracted PE files with Capa? (y/N): y
Lookup extracted file hashes with VirusTotal? (y/N): n
Download updated file scanner signatures periodically? (Y/n): y
Should Malcolm capture live network traffic to PCAP files for analysis with Arkime? (y/N): y
Capture packets using netsniff-ng? (Y/n): y
Capture packets using tcpdump? (y/N): n
Should Malcolm analyze live network traffic with Suricata? (y/N): y
Should Malcolm analyze live network traffic with Zeek? (y/N): y
Specify capture interface(s) (comma-separated): eth0
Capture filter (tcpdump-like filter expression; leave blank to capture all traffic) (): not port 5044 and not port 8005 and not port 9200
Disable capture interface hardware offloading and adjust ring buffer sizes? (y/N): n
Malcolm has been installed to /home/user/Malcolm. See README.md for more information.
Scripts for starting and stopping Malcolm and changing authentication-related settings can be found in /home/user/Malcolm/scripts.
At this point you should reboot your computer so that the new system settings can be applied. After rebooting, log back in and return to the directory to which Malcolm was installed (or to which the git working copy was cloned).
Now we need to set up authentication and generate some unique self-signed TLS certificates. You can replace `analyst` in this example with whatever username you wish to use to log in to the Malcolm web interface.
user@host:~/Malcolm$ ./scripts/auth_setup
Store administrator username/password for local Malcolm access? (Y/n): y
Administrator username: analyst
analyst password:
analyst password (again):
(Re)generate self-signed certificates for HTTPS access (Y/n): y
(Re)generate self-signed certificates for a remote log forwarder (Y/n): y
Store username/password for primary remote OpenSearch instance? (y/N): n
Store username/password for secondary remote OpenSearch instance? (y/N): n
Store username/password for email alert sender account? (y/N): n
For now, rather than build Malcolm from scratch, we'll pull images from Docker Hub:
user@host:~/Malcolm$ docker-compose pull
Pulling api ... done
Pulling arkime ... done
Pulling dashboards ... done
Pulling dashboards-helper ... done
Pulling file-monitor ... done
Pulling filebeat ... done
Pulling freq ... done
Pulling htadmin ... done
Pulling logstash ... done
Pulling name-map-ui ... done
Pulling nginx-proxy ... done
Pulling opensearch ... done
Pulling pcap-capture ... done
Pulling pcap-monitor ... done
Pulling suricata ... done
Pulling upload ... done
Pulling zeek ... done
user@host:~/Malcolm$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
malcolmnetsec/api 6.3.0 xxxxxxxxxxxx 3 days ago 158MB
malcolmnetsec/arkime 6.3.0 xxxxxxxxxxxx 3 days ago 816MB
malcolmnetsec/dashboards 6.3.0 xxxxxxxxxxxx 3 days ago 1.02GB
malcolmnetsec/dashboards-helper 6.3.0 xxxxxxxxxxxx 3 days ago 184MB
malcolmnetsec/filebeat-oss 6.3.0 xxxxxxxxxxxx 3 days ago 624MB
malcolmnetsec/file-monitor 6.3.0 xxxxxxxxxxxx 3 days ago 588MB
malcolmnetsec/file-upload 6.3.0 xxxxxxxxxxxx 3 days ago 259MB
malcolmnetsec/freq 6.3.0 xxxxxxxxxxxx 3 days ago 132MB
malcolmnetsec/htadmin 6.3.0 xxxxxxxxxxxx 3 days ago 242MB
malcolmnetsec/logstash-oss 6.3.0 xxxxxxxxxxxx 3 days ago 1.35GB
malcolmnetsec/name-map-ui 6.3.0 xxxxxxxxxxxx 3 days ago 143MB
malcolmnetsec/nginx-proxy 6.3.0 xxxxxxxxxxxx 3 days ago 121MB
malcolmnetsec/opensearch 6.3.0 xxxxxxxxxxxx 3 days ago 1.17GB
malcolmnetsec/pcap-capture 6.3.0 xxxxxxxxxxxx 3 days ago 121MB
malcolmnetsec/pcap-monitor 6.3.0 xxxxxxxxxxxx 3 days ago 213MB
malcolmnetsec/suricata 6.3.0 xxxxxxxxxxxx 3 days ago 278MB
malcolmnetsec/zeek 6.3.0 xxxxxxxxxxxx 3 days ago 1GB
Finally, we can start Malcolm. When Malcolm starts it will stream informational and debug messages to the console. If you wish, you can safely close the console or use `Ctrl+C` to stop these messages; Malcolm will continue running in the background.
user@host:~/Malcolm$ ./scripts/start
In a few minutes, Malcolm services will be accessible via the following URLs:
------------------------------------------------------------------------------
- Arkime: https://localhost/
- OpenSearch Dashboards: https://localhost/dashboards/
- PCAP upload (web): https://localhost/upload/
- PCAP upload (sftp): sftp://[email protected]:8022/files/
- Host and subnet name mapping editor: https://localhost/name-map-ui/
- Account management: https://localhost:488/
NAME COMMAND SERVICE STATUS PORTS
malcolm-api-1 "/usr/local/bin/dock…" api running (starting) …
malcolm-arkime-1 "/usr/local/bin/dock…" arkime running (starting) …
malcolm-dashboards-1 "/usr/local/bin/dock…" dashboards running (starting) …
malcolm-dashboards-helper-1 "/usr/local/bin/dock…" dashboards-helper running (starting) …
malcolm-file-monitor-1 "/usr/local/bin/dock…" file-monitor running (starting) …
malcolm-filebeat-1 "/usr/local/bin/dock…" filebeat running (starting) …
malcolm-freq-1 "/usr/local/bin/dock…" freq running (starting) …
malcolm-htadmin-1 "/usr/local/bin/dock…" htadmin running (starting) …
malcolm-logstash-1 "/usr/local/bin/dock…" logstash running (starting) …
malcolm-name-map-ui-1 "/usr/local/bin/dock…" name-map-ui running (starting) …
malcolm-nginx-proxy-1 "/usr/local/bin/dock…" nginx-proxy running (starting) …
malcolm-opensearch-1 "/usr/local/bin/dock…" opensearch running (starting) …
malcolm-pcap-capture-1 "/usr/local/bin/dock…" pcap-capture running …
malcolm-pcap-monitor-1 "/usr/local/bin/dock…" pcap-monitor running (starting) …
malcolm-suricata-1 "/usr/local/bin/dock…" suricata running (starting) …
malcolm-suricata-live-1 "/usr/local/bin/dock…" suricata-live running …
malcolm-upload-1 "/usr/local/bin/dock…" upload running (starting) …
malcolm-zeek-1 "/usr/local/bin/dock…" zeek running (starting) …
malcolm-zeek-live-1 "/usr/local/bin/dock…" zeek-live running …
…
It will take several minutes for all of Malcolm's components to start up. Logstash will take the longest, probably 3 to 5 minutes. You'll know Logstash is fully ready when it prints a series of startup messages ending with this:
…
malcolm-logstash-1 | [2022-07-27T20:27:52,056][INFO ][logstash.agent ] Pipelines running {:count=>6, :running_pipelines=>[:"malcolm-input", :"malcolm-output", :"malcolm-beats", :"malcolm-suricata", :"malcolm-enrichment", :"malcolm-zeek"], :non_running_pipelines=>[]}
…
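If you'd rather not watch the full stream, a convenience sketch (not an official Malcolm command) is to follow just the Logstash container's logs and wait for that message:

```
user@host:~/Malcolm$ docker-compose logs -f logstash | grep --line-buffered 'Pipelines running'
```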
You can now open a web browser and navigate to one of the Malcolm user interfaces.
At this time there is not an "official" upgrade procedure to get from one version of Malcolm to the next, as it may vary from platform to platform. However, the process is fairly simple and can be done by following these steps:
You may wish to get the official updates for the underlying system's software packages before you proceed. Consult the documentation of your operating system for how to do this.
If you are upgrading a Malcolm instance installed from the Malcolm installation ISO, follow the second scenario below (upgrading from pre-packaged installation files). Due to the Malcolm base operating system's hardened configuration, when updating the underlying system, temporarily set the umask value to the Debian default (`umask 0022` in the root shell in which updates are being performed) instead of the more restrictive Malcolm default. This will allow updates to be applied with the correct permissions.
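For example, a root shell session applying the underlying system's updates on an ISO-installed instance might look like the following sketch (the `umask` change only affects the current shell):

```
# in the root shell where updates will be applied
umask 0022                         # temporarily use the Debian default umask
apt-get update && apt-get upgrade  # apply the underlying OS updates
```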
If you checked out a working copy of the Malcolm repository from GitHub with a `git clone` command, here are the basic steps to performing an upgrade (a consolidated sketch of the commands follows this list):
- stop Malcolm: `./scripts/stop`
- stash changes to `docker-compose.yml` and other files: `git stash save "pre-upgrade Malcolm configuration changes"`
- pull changes from the GitHub repository: `git pull --rebase`
- pull new Docker images (this will take a while): `docker-compose pull`
- apply the saved configuration changes stashed earlier: `git stash pop`
- if you see `Merge conflict` messages, resolve the conflicts with your favorite text editor
- you may wish to re-run `install.py --configure` as described in System configuration and tuning in case there are any new `docker-compose.yml` parameters for Malcolm that need to be set up
- start Malcolm: `./scripts/start`
- you may be prompted to configure authentication if there are new authentication-related files that need to be generated
- you probably do not need to re-generate self-signed certificates
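Consolidated, that sequence looks something like the following sketch (pause between steps to resolve merge conflicts or re-run `install.py --configure` as needed):

```
user@host:~/Malcolm$ ./scripts/stop
user@host:~/Malcolm$ git stash save "pre-upgrade Malcolm configuration changes"
user@host:~/Malcolm$ git pull --rebase
user@host:~/Malcolm$ docker-compose pull
user@host:~/Malcolm$ git stash pop
user@host:~/Malcolm$ ./scripts/start
```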
If you installed Malcolm from pre-packaged installation files, here are the basic steps to perform an upgrade:
- stop Malcolm: `./scripts/stop`
- uncompress the new pre-packaged installation files (using `malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz` as an example; the file and/or directory names will differ depending on the release): `tar xf malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz`
- back up the current Malcolm scripts, configuration files, and certificates:
  - `mkdir -p ./upgrade_backup_$(date +%Y-%m-%d)`
  - `cp -r filebeat/ htadmin/ logstash/ nginx/ auth.env cidr-map.txt docker-compose.yml host-map.txt net-map.json ./scripts ./README.md ./upgrade_backup_$(date +%Y-%m-%d)/`
- replace the scripts and local documentation in your existing installation with the new ones:
  - `rm -rf ./scripts ./README.md`
  - `cp -r ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/scripts ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/README.md ./`
- replace (overwrite) the `docker-compose.yml` file with the new version: `cp ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/docker-compose.yml ./docker-compose.yml`
- re-run `./scripts/install.py --configure` as described in System configuration and tuning
- using a file comparison tool (e.g., `diff`, `meld`, `Beyond Compare`, etc.), compare the new `docker-compose.yml` with the `docker-compose.yml` file you saved in the backup step, and manually migrate over any customizations you wish to preserve from that file (e.g., `PCAP_FILTER`, `MAXMIND_GEOIP_DB_LICENSE_KEY`, `MANAGE_PCAP_FILES`; anything else you may have edited by hand in `docker-compose.yml` that's not prompted for in `install.py --configure`); see the example after this list
- pull the new Docker images (this will take a while): `docker-compose pull` to pull them from Docker Hub, or `docker load -i malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz` if you have an offline tarball of the Malcolm Docker images
- start Malcolm: `./scripts/start`
- you may be prompted to configure authentication if there are new authentication-related files that need to be generated
- you probably do not need to re-generate self-signed certificates
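As an example of the comparison step above, `diff` can quickly surface customizations in the backed-up file that should be carried over (replace `YYYY-MM-DD` with the date of your backup directory):

```
# compare the backed-up compose file against the newly installed one
user@host:~/Malcolm$ diff -u ./upgrade_backup_YYYY-MM-DD/docker-compose.yml ./docker-compose.yml
```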
If you are technically minded, you may wish to follow the debug output provided by `./scripts/start` (or `./scripts/logs` if you need to re-open the log stream after you've closed it), although there is a lot of output and it may be hard to tell at a glance whether or not something is okay.
Running `docker-compose ps -a` should give you a good idea of whether all of Malcolm's Docker containers started up and, in some cases, may indicate whether the containers are "healthy" or not.
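If a container exposes a health check, you can also query its status directly; a sketch using one of the container names shown earlier:

```
user@host:~/Malcolm$ docker inspect --format '{{ .State.Health.Status }}' malcolm-arkime-1
healthy
```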
After upgrading via one of the procedures above, give Malcolm several minutes to get started. Once things are up and running, open one of Malcolm's web interfaces to verify that things are working.
Once the upgraded Malcolm instance has started up, you'll probably want to import the new dashboards and visualizations for OpenSearch Dashboards. You can signal Malcolm to load the new visualizations by opening OpenSearch Dashboards, clicking Management → Index Patterns, then selecting the `arkime_sessions3-*` index pattern and clicking the delete 🗑 button near the upper-right of the window. Confirm the Delete index pattern? prompt by clicking Delete. Close the OpenSearch Dashboards browser window. After a few minutes the missing index pattern will be detected, and OpenSearch Dashboards will be signaled to load its new dashboards and visualizations.
The Malcolm project uses semantic versioning when choosing version numbers. If you are moving between major releases (e.g., from v4.0.1 to v5.0.0), you're likely to find that there are enough major backwards compatibility-breaking changes that upgrading may not be worth the time and trouble. A fresh install is strongly recommended between major releases.
If you are interested in contributing to the Malcolm project, please read the Malcolm Contributor Guide.
Malcolm is Copyright 2022 Battelle Energy Alliance, LLC, and is developed and released through the cooperation of the Cybersecurity and Infrastructure Security Agency of the U.S. Department of Homeland Security.
See `License.txt` for the terms of its release.