Every security team eventually faces the same question: what exactly does this file do? A suspicious attachment lands in a phishing report, an EDR alert fires on an unknown binary, or IR triage surfaces a script with an obfuscated payload. The instinct is to submit it to VirusTotal and move on, but detection rates on targeted samples are often poor, and vendor labels like "Trojan.GenericKD.12345" tell you nothing about capability, infrastructure, or intended victim. Malware analysis fills that gap.

At its core, malware analysis is about answering three questions: what does this file do, how does it do it, and what does it communicate with? The discipline ranges from quick triage performed in minutes to deep reverse engineering that takes days. This guide focuses on the entry point: building a safe analysis environment and developing a repeatable workflow for basic static and dynamic analysis — the skills that directly improve incident response quality without requiring a background in assembly language.

The tooling is largely free. The main investment is time and discipline — especially discipline around safety, because the files you will analyze are designed to compromise systems.

Why Learn Malware Analysis

Malware analysis sits at the intersection of several high-value security skills, and even a foundational capability produces significant returns across the incident response lifecycle.

From a career perspective, analysts who can characterize malware behavior are significantly more valuable during active incidents. The ability to quickly determine whether a suspicious file is a dropper, a loader, a stealer, or a lateral movement tool shapes every subsequent response decision — what systems to prioritize, what network traffic to hunt for, what persistence mechanisms to check. That analysis capability is not automatic; it is built through practice.

From a detection perspective, malware analysis generates artifacts that translate directly into detections. Understanding the file system paths, registry keys, mutex names, network indicators, and behavioral signatures of a specific sample lets you build YARA rules, Sigma rules, and EDR detections tuned to that exact threat rather than relying on generic signatures. Defenders who have analyzed the malware they are hunting write dramatically better detections than those who have not.

From a broader intelligence perspective, analysis connects individual samples to campaigns, threat actors, and toolkits. Shared code patterns, infrastructure overlaps, and behavioral similarities link what looks like an isolated incident to a wider operation. That context drives better prioritization and more effective communication with leadership.

Lab Setup

Before touching a malware sample, your environment must be prepared. Analyzing malware on your primary workstation or in a connected corporate environment is not a calculated risk — it is an uncontrolled experiment with unpredictable consequences. The lab environment is not optional infrastructure; it is the foundation that makes everything else safe.

Isolated Virtual Machines

The standard approach is a dedicated analysis VM running on a hypervisor that you control. VMware Workstation Pro, VMware Fusion, and VirtualBox are all viable. The critical requirement is that the VM is isolated from your production network. This means:

Network Segmentation

When you need to observe network behavior, a two-VM setup is more controlled than opening the analysis VM to the internet. The analysis VM connects to a second VM running a fake network services host (such as a machine running INetSim or Fakenet-NG), which responds to DNS queries, HTTP requests, and other common protocol interactions with plausible but benign responses. The malware believes it has internet connectivity and executes its network routines; you capture the traffic without actually connecting to live command-and-control infrastructure.

A separate physical machine running on a dedicated VLAN — firewalled from everything else on your network — is preferable for high-confidence detonation of sophisticated or potentially VM-aware samples. For most practitioner-level work, a well-configured VM setup is sufficient.

REMnux and FlareVM

Rather than assembling an analysis environment from scratch, two pre-built distributions are the standard starting point in the community:

Start with FlareVM as your Windows analysis VM and REMnux as your network host. Snapshot both in their clean states and you have a functional analysis lab.

Safety First

Safety discipline is not a formality — it is the difference between a controlled analysis and an incident that requires its own incident response. Several practices are non-negotiable:

Static Analysis Workflow

Static analysis examines the file without executing it. It is fast, safe, and provides the initial context that shapes everything else. Even if you eventually run the sample dynamically, static analysis first gives you hypotheses to validate and indicators to watch for during execution.

File Type Identification

Start by determining what the file actually is, independent of its extension. Malware frequently uses misleading extensions: a .pdf that is actually a PE executable, a .docx that contains a malicious macro, or a .jpg that is a renamed ZIP archive. The file command on Linux/REMnux identifies file type from magic bytes rather than extension. The TrID tool provides more detailed format identification across a broader signature database.
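The idea behind magic-byte identification can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for file or TrID: the signature table is a small hand-picked subset, and the names identify_magic and identify_file are hypothetical helpers introduced here.

```python
# Sketch: identify a file by its leading "magic" bytes, ignoring the extension.
# The table below is an illustrative subset, not a complete signature database.
MAGIC_SIGNATURES = [
    (b"MZ", "PE/DOS executable"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (includes OOXML such as .docx/.docm/.xlsm)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound document (legacy .doc/.xls)"),
    (b"\x7fELF", "ELF executable"),
    (b"\xff\xd8\xff", "JPEG image"),
]


def identify_magic(header: bytes) -> str:
    """Return a label for the first signature that matches the header bytes."""
    for signature, label in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first bytes of the file; the sample is never executed."""
    with open(path, "rb") as handle:
        return identify_magic(handle.read(16))
```

A file named invoice.pdf whose first two bytes are MZ would be reported as a PE/DOS executable, which is exactly the mismatch worth flagging.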

For Windows PE files (executables and DLLs), note the architecture (32-bit or 64-bit), subsystem (GUI vs. console), and compilation timestamp. The timestamp may be forged but is worth recording; a compilation date hours before a phishing campaign is a useful correlation point.
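As a rough sketch of where those three fields live, the following reads them directly from the PE headers with the standard library. It assumes a well-formed PE file and covers only a few machine and subsystem values; the function name pe_summary is hypothetical, and in practice a parser such as pefile handles the many edge cases this ignores.

```python
import struct
from datetime import datetime, timezone

# Small illustrative subsets of the values defined in the PE format.
MACHINES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)", 0xAA64: "ARM64"}
SUBSYSTEMS = {1: "native", 2: "Windows GUI", 3: "Windows console"}
FORMATS = {0x10B: "PE32", 0x20B: "PE32+"}


def pe_summary(data: bytes) -> dict:
    """Extract architecture, subsystem, and compile timestamp from PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header: Machine, NumberOfSections, TimeDateStamp.
    machine, _sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, optional)
    # Subsystem sits at offset 68 of the optional header in both PE32 and PE32+.
    (subsystem,) = struct.unpack_from("<H", data, optional + 68)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "format": FORMATS.get(magic, hex(magic)),
        "subsystem": SUBSYSTEMS.get(subsystem, str(subsystem)),
        # The timestamp is attacker-controlled, so treat it as a lead, not a fact.
        "compiled_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```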

Hash and Threat Intelligence Lookup

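Compute the MD5, SHA-1, and SHA-256 hashes of the sample. Look up the SHA-256 on VirusTotal, MalwareBazaar, and your internal threat intelligence platform. On VirusTotal, search by hash rather than uploading the file: uploaded samples become available to other VirusTotal users, which can expose sensitive content embedded in the file and can signal to an attacker that the sample has been discovered. If the hash is known, you often get immediate context: malware family, campaign associations, and behavioral reports from previous analyses. If it is unknown, that is also meaningful: a new or customized sample warrants more thorough analysis.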
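Computing all three hashes in a single pass takes only the standard library. The sketch below is one way to do it; hash_chunks, read_chunks, and hash_file are hypothetical helper names, and the 1 MiB chunk size is an arbitrary choice.

```python
import hashlib
from typing import Iterable, Iterator


def hash_chunks(chunks: Iterable[bytes]) -> dict:
    """Feed every chunk to MD5, SHA-1, and SHA-256 and return hex digests."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    for chunk in chunks:
        for digest in digests.values():
            digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


def read_chunks(path: str, size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the file in fixed-size chunks so large samples fit in memory."""
    with open(path, "rb") as handle:
        yield from iter(lambda: handle.read(size), b"")


def hash_file(path: str) -> dict:
    """Hash a sample on disk without executing or parsing it."""
    return hash_chunks(read_chunks(path))
```

Calling hash_file("sample.bin") returns a dictionary whose sha256 value can be pasted directly into a hash search.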

Strings Extraction

The strings command extracts printable character sequences from a binary. Even in packed or obfuscated files, strings often reveals useful artifacts: URLs, domain names, IP addresses, file paths, registry keys, error messages, and hardcoded credentials. Windows samples store many strings as UTF-16LE ("Unicode"), so extract both encodings. With GNU strings on REMnux this takes two passes: strings -a for single-byte strings and strings -a -el for 16-bit little-endian strings. The Sysinternals version of strings on Windows searches for both by default. The FLOSS tool from Mandiant extends this by automatically recovering strings that simple extraction misses, such as stack strings and strings decoded at runtime.
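To make the two-encoding point concrete, here is a minimal Python sketch of what a strings-style pass does for ASCII and UTF-16LE. It is a simplification under stated assumptions (printable ASCII range only, minimum length of four); extract_strings is a hypothetical helper, not the implementation of strings or FLOSS.

```python
import re

# Runs of at least four printable ASCII bytes.
ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")
# The same characters stored as UTF-16LE: each printable byte followed by 0x00.
WIDE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")


def extract_strings(data: bytes):
    """Return (ascii_strings, utf16le_strings) found in the raw bytes."""
    ascii_hits = [m.group().decode("ascii") for m in ASCII_RE.finditer(data)]
    wide_hits = [m.group().decode("utf-16le") for m in WIDE_RE.finditer(data)]
    return ascii_hits, wide_hits
```

A single-byte pass alone would miss a wide string such as cmd.exe entirely, because every character is separated by a null byte.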

Look specifically for:

PE Header Analysis

For PE files, the header contains rich metadata about the binary's structure and intended behavior. Tools like pestudio, PE-bear, and pefile (Python library) parse PE headers and present this information in readable form.

Key header fields to examine:

Packing Detection

Many malware samples are packed — compressed, encrypted, or obfuscated — to evade static detection and analysis. Packed binaries typically have very few imports (just enough to unpack themselves), high entropy sections, and limited readable strings. Tools like Detect-It-Easy (DIE) and ExeinfoPE identify common packers by signature. If a sample is packed with a known packer like UPX, you may be able to unpack it automatically and then analyze the unpacked binary. Custom packers require dynamic analysis to capture the unpacked payload from memory.
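The "high entropy" signal can be checked directly. The sketch below computes Shannon entropy in bits per byte; values range from 0 (a single repeated byte) to 8 (uniformly distributed bytes). Treating values above roughly 7 as a hint of compression or encryption is a common rule of thumb rather than a fixed threshold, and shannon_entropy is a hypothetical helper name.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over every byte value that actually occurs.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )
```

Applied per section rather than to the whole file, this helps distinguish a packed code section from ordinary compiled code, which usually scores noticeably lower.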

Dynamic Analysis Workflow

Dynamic analysis executes the sample in a controlled environment and observes its behavior. It reveals what the malware actually does at runtime — including behavior that was hidden by packing, obfuscation, or encryption that static analysis could not see through. The cost is that you are running live malware, which is why the lab environment and safety practices from earlier sections are prerequisites.

Baseline Before Execution

Before detonating the sample, take a snapshot and establish a behavioral baseline: running processes, active network connections, loaded services, and relevant registry key values. Tools like Autoruns (for persistence locations) and TCPView (for network connections) give you a clean pre-execution reference. This makes post-execution comparison far more efficient.

Process Monitoring

Process Monitor (ProcMon) from Sysinternals is the single most useful dynamic analysis tool for Windows. It captures file system, registry, process and thread, and network activity on the system in real time. With ProcMon running and filtered to your sample's process tree, you see exactly which files it creates, which registry keys it modifies, which child processes it spawns, and which network endpoints it contacts. ProcMon does not record individual API or system calls, so do not expect that level of detail from it. The filter capability is essential: a single malware execution can generate tens of thousands of events, and filtering to the sample's PID and its children makes the output tractable.
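When working with exported events outside ProcMon, the same "sample PID plus descendants" filter can be reproduced in a few lines. The sketch below assumes you already have a mapping of PID to parent PID and a list of event records with a pid field; both structures, and the names process_tree and filter_events, are hypothetical and do not correspond to a specific ProcMon export format.

```python
from collections import defaultdict, deque


def process_tree(parent_of: dict, root_pid: int) -> set:
    """Return root_pid plus every descendant, given a {pid: parent_pid} map."""
    children = defaultdict(list)
    for pid, parent in parent_of.items():
        children[parent].append(pid)
    seen = {root_pid}
    queue = deque([root_pid])
    # Breadth-first walk from the sample's process down through its children.
    while queue:
        for child in children[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def filter_events(events: list, pids: set) -> list:
    """Keep only events whose 'pid' field belongs to the process tree."""
    return [event for event in events if event["pid"] in pids]
```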

Network Traffic Capture

Run Wireshark on the network interface (or on the REMnux host in a two-VM setup) before executing the sample. Network traffic reveals C2 communication patterns, DNS queries, exfiltration attempts, and download behavior. Even if the malware cannot reach live infrastructure (because of your network isolation), it will still attempt connections that Wireshark captures. DNS queries to C2 domains, HTTP POST requests with encoded data, and IRC or custom protocol traffic are all visible in the packet capture.

If you are using Fakenet-NG or INetSim as a network simulation layer, these tools log every connection attempt and serve plausible responses that keep the malware executing its network routines rather than failing and exiting early.
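DNS queries are often the first network indicator to pull out of a capture. Wireshark decodes them for you, but the underlying structure is simple enough to sketch: the function below reads the first question name from a raw DNS message (the UDP payload). It assumes an uncompressed question name, which is the normal case for queries, and dns_question_name is a hypothetical helper.

```python
def dns_question_name(message: bytes) -> str:
    """Return the first question name from a raw DNS message."""
    if len(message) < 13:
        raise ValueError("message too short to contain a question")
    # Bytes 4-5 of the 12-byte header hold the question count (big-endian).
    if int.from_bytes(message[4:6], "big") == 0:
        raise ValueError("message contains no question")
    labels = []
    position = 12  # the question section starts right after the header
    while True:
        length = message[position]
        if length == 0:  # a zero-length label terminates the name
            break
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        label = message[position + 1:position + 1 + length]
        labels.append(label.decode("ascii", errors="replace"))
        position += 1 + length
    return ".".join(labels)
```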

Registry and File System Changes

After execution, compare the current system state against your pre-execution baseline. Autoruns highlights any new persistence mechanisms added to the registry or startup locations. A manual or scripted comparison of the file system against a pre-execution snapshot reveals dropped files, modified binaries, and created directories. Focus particularly on persistence locations: the Run key at HKCU\Software\Microsoft\Windows\CurrentVersion\Run and its HKLM counterpart, the Startup folder, scheduled tasks, and services.
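A scripted file system comparison can be as simple as hashing a directory tree before and after execution and diffing the two results. The following is a minimal sketch under that assumption; snapshot_tree and diff_snapshots are hypothetical helpers, and a real baseline would typically cover specific directories of interest rather than an entire drive.

```python
import hashlib
import os


def snapshot_tree(root: str) -> dict:
    """Map every file path under root to the SHA-256 of its contents."""
    snapshot = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    snapshot[path] = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                # Locked or vanished files are recorded rather than skipped.
                snapshot[path] = "unreadable"
    return snapshot


def diff_snapshots(before: dict, after: dict) -> dict:
    """Report paths created, deleted, or modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

Take the first snapshot alongside the VM snapshot, store it outside the analysis VM, and run the second after detonation.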

Essential Tools Reference

A functional analysis environment does not require dozens of specialized tools. The following set covers the majority of initial analysis needs:

Your First Analysis: A Macro-Enabled Document

Document-based malware — particularly Office documents with embedded macros — remains one of the most common initial access vectors. Walking through a macro document analysis illustrates how static and dynamic techniques combine in practice.

Static Examination

Start with file to confirm the format. A modern .docm or .xlsm file is an OOXML ZIP archive; an older .doc with macros is OLE2 compound document format. Use olevba (part of the oletools suite) to extract and examine VBA macro code from Office documents without opening them in Word or Excel:

olevba suspicious.docm

The output displays all extracted VBA modules, flags suspicious patterns (Shell calls, WScript usage, base64 strings, PowerShell invocations), and produces an IOC summary. Examine the macro code for:

If the macro contains base64-encoded content, decode it with CyberChef (the browser-based tool from GCHQ) or from the command line. The decoded content often reveals the next stage: a PowerShell script, a PE executable, or another encoded layer.
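For the command-line route, a short Python helper is often enough. The sketch below decodes one base64 layer and guesses the text encoding: PowerShell's -EncodedCommand argument is base64 over UTF-16LE text, so the result frequently has a null byte after every character. The detection heuristic and the name decode_base64_layer are assumptions made for this illustration.

```python
import base64


def decode_base64_layer(text: str) -> str:
    """Decode one base64 layer and return it as text for inspection."""
    # Macros often split the blob across lines, so strip all whitespace first.
    raw = base64.b64decode("".join(text.split()), validate=True)
    # Heuristic: UTF-16LE ASCII text has 0x00 in most odd byte positions.
    looks_utf16 = (
        len(raw) >= 2
        and len(raw) % 2 == 0
        and raw[1::2].count(0) > len(raw) // 4
    )
    encoding = "utf-16le" if looks_utf16 else "utf-8"
    # errors="replace" keeps binary payloads from raising during triage.
    return raw.decode(encoding, errors="replace")
```

If the decoded output is mostly replacement characters, the layer is probably binary (for example a PE file or compressed data) and should be written to disk and identified by its magic bytes instead.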

Dynamic Examination

With ProcMon filtering active, Wireshark capturing on the network interface, and Fakenet-NG running on the network host, open the document in Microsoft Word within the analysis VM. If the macro is set to auto-execute, it will run immediately. If it requires the user to enable macros (the "Enable Content" prompt), click it — you are the analyst, not a victim; you want to see the behavior.

In ProcMon, watch for:

In Wireshark, watch for DNS queries immediately after the document opens. Word and Windows generate some DNS traffic of their own, so compare against what you saw at baseline; a domain that is queried only after the macro runs is most often a C2 domain or a hosting location for the next-stage payload. Note the full query, the response (which Fakenet-NG will have served), and any subsequent HTTP requests. The User-Agent string in HTTP requests sometimes contains version or campaign identifiers hardcoded by the malware author.

After the macro has executed, use Process Hacker to examine the memory of any child processes that were spawned. Injected shellcode or a reflectively loaded DLL will appear as executable memory regions without a backing file on disk — the same pattern that windows.malfind looks for in Volatility.

Next Steps: Deepening Your Analysis Capability

Static and dynamic analysis answer what a sample does behaviorally. Understanding how — the underlying implementation, the specific algorithms, the code structure — requires moving into reverse engineering and code-level analysis. Several natural progressions build on the foundation established here.

Malware analysis is a compounding skill. The first few samples feel slow and uncertain; the thirtieth produces pattern recognition that makes analysis dramatically faster. Every sample analyzed is a vocabulary entry — a technique, an IOC pattern, a behavioral signature — that applies to the next one. The investment pays dividends across every security function: detection engineering, incident response, threat intelligence, and red team understanding.

For analysts building out their forensic capabilities alongside malware analysis, the Memory Forensics Fundamentals guide covers the volatile evidence collection and Volatility analysis workflow that complements dynamic malware analysis — particularly for investigating samples that rely on in-memory execution to evade disk-based detection.

