Static configuration extractor engine
Module interface
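- class malduck.extractor.Extractor(parent)
Base class for extractor modules
The following class attributes need to be defined: family, yara_rules and, optionally, overrides (each is described further down in this class reference).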
Example extractor code for Citadel:
    from malduck import Extractor


    class Citadel(Extractor):
        family = "citadel"
        yara_rules = ("citadel",)
        overrides = ("zeus",)

        @Extractor.string("briankerbs")
        def citadel_found(self, p, addr, match):
            # Called for each $briankerbs hit; True marks the family as matched
            self.log.info('[+] `Coded by Brian Krebs` str @ %X' % addr)
            return True

        @Extractor.string
        def cit_login(self, p, addr, match):
            # Called for each $cit_login hit
            self.log.info('[+] Found login_key xor @ %X' % addr)
            # Try the pointer at offset 4 first
            hit = p.uint32v(addr + 4)
            print(hex(hit))
            if p.is_addr(hit):
                return {'login_key': p.asciiz(hit)}

            # Fall back to the pointer at offset 5
            hit = p.uint32v(addr + 5)
            print(hex(hit))
            if p.is_addr(hit):
                return {'login_key': p.asciiz(hit)}
Decorated methods are always called in this order:
1. @Extractor.extractor methods
2. @Extractor.string methods
3. @Extractor.rule methods
4. @Extractor.final methods
- @string
Decorator for string-based extractor methods. The method is called each time a string with the same identifier as the method name matches.
The method can be triggered by several number-suffixed strings, e.g. both $keyex1 and $keyex2 will call the keyex method.
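As an illustration of the number-suffix rule, the sketch below maps a YARA string identifier to the method name it would trigger. The helper name `method_name_for` and the regular expression are assumptions made for this example; malduck's internal normalization may differ in detail.

```python
import re


def method_name_for(string_identifier: str) -> str:
    """Map a YARA string identifier such as "$keyex1" to a method name.

    Hypothetical helper: strips the leading "$" and any trailing digits.
    """
    name = string_identifier.lstrip("$")
    return re.sub(r"\d+$", "", name)


print(method_name_for("$keyex1"))  # keyex
print(method_name_for("$keyex2"))  # keyex
```

Under this model, `$keyex1` and `$keyex2` both resolve to `keyex`, which is why a single method can serve every numbered variant of a string.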
You can optionally provide the actual string identifier as an argument if you don’t want to name your method after the string identifier.
Signature of decorated method:
    @Extractor.string
    def string_identifier(self, p: ProcessMemory, addr: int, match: YaraStringMatch) -> Config:
        # p: ProcessMemory object that contains matched file/dump representation
        # addr: Virtual address of matched string
        # Called for each "$string_identifier" hit
        ...
If you want to use the same method for multiple differently named strings, you can pass multiple identifiers as @Extractor.string decorator arguments.
Extractor methods should return a dict with the extracted part of the configuration, True to indicate a match, or False/None when the family has not been matched.
For strong methods: truthy values are transformed to a dict that includes the {“family”: self.family} key.
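The return-value rules above can be modelled in a few lines. This is a sketch under stated assumptions, not malduck's implementation: the function name `normalize_result` and the `weak` flag are hypothetical, and a truthy non-dict result is treated the same as True.

```python
from typing import Any, Dict, Optional


def normalize_result(result: Any, family: str, weak: bool = False) -> Optional[Dict[str, Any]]:
    """Turn an extractor method's return value into a partial config.

    Hypothetical model of the documented rules:
    - falsy values (False, None, empty dict) mean "nothing extracted"
    - a dict carries extracted fields; True carries none
    - strong methods add the family key, weak methods do not
    """
    if not result:
        return None
    config = dict(result) if isinstance(result, dict) else {}
    if not weak:
        config["family"] = family
    return config


print(normalize_result(True, "citadel"))
print(normalize_result({"login_key": "k"}, "citadel"))
print(normalize_result({"dga_seed": 1}, "evil", weak=True))
```

The weak case is included only to show the contrast: a weak result contributes fields but, in this model, never carries the family key.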
New in version 4.0.0: Added @Extractor.string as extended version of @Extractor.extractor
- Parameters:
strings_or_method (str, optional) – If method name doesn’t match the string identifier, pass yara string identifier as decorator argument. Multiple strings are accepted
- @extractor
Simplified variant of @Extractor.string.
Doesn’t accept multiple strings and passes only string offset to the extractor method.
    from malduck import Extractor


    class Citadel(Extractor):
        family = "citadel"
        yara_rules = ("citadel",)
        overrides = ("zeus",)

        @Extractor.extractor("briankerbs")
        def citadel_found(self, p, addr):
            # Called for each $briankerbs hit
            ...

        @Extractor.extractor
        def cit_login(self, p, addr):
            # Called for each $cit_login1, $cit_login2 hit
            ...
- @rule
Decorator for rule-based extractor methods, called once per rule match after the string-based extraction methods.
The method is called each time a rule with the same identifier as the method name matches.
You can optionally provide the actual rule identifier as an argument if you don’t want to name your method after the rule identifier.
Rule identifier must appear in yara_rules tuple.
Signature of decorated method:
    @Extractor.rule
    def rule_identifier(self, p: ProcessMemory, matches: YaraMatch) -> Config:
        # p: ProcessMemory object that contains matched file/dump representation
        # matches: YaraMatch object with offsets of all matched strings related with the rule
        # Called for matched rule named "rule_identifier".
        ...
New in version 4.0.0: Added @Extractor.rule decorator
    from malduck import Extractor


    class Evil(Extractor):
        yara_rules = ("evil", "weird")
        family = "evil"

        ...

        @Extractor.rule
        def evil(self, p, matches):
            # This will be called each time the "evil" rule matches.
            # `matches` is a YaraMatch object that contains information about
            # all string matches related to this rule.
            ...
- Parameters:
string_or_method (str, optional) – If the method name doesn’t match the rule identifier, pass the yara rule identifier as the decorator argument
- @final
Decorator for final extractor methods, called once for each single rule match after other extraction methods.
Behaves similarly to the @rule-decorated methods but is called for each rule match regardless of the rule identifier.
Signature of decorated method:
    @Extractor.final
    def get_config(self, p: ProcessMemory) -> Config:
        # p: ProcessMemory object that contains matched file/dump representation
        # Called for each matched rule in self.yara_rules
        ...
    from malduck import Extractor


    class Evil(Extractor):
        yara_rules = ("evil", "weird")
        family = "evil"

        ...

        @Extractor.needs_pe
        @Extractor.final
        def get_config(self, p):
            # This will be called each time the "evil" or "weird" rule matches
            cfg = {"urls": self.get_cncs_from_rsrc(p)}
            if "role" not in self.collected_config:
                cfg["role"] = "loader"
            return cfg
- @weak
Use this decorator for extractors when successful extraction is not sufficient to mark family as matched.
All “weak configs” will be flushed when “strong config” appears.
Changed in version 4.0.0: Method must be decorated first with @extractor, @rule or @final decorator
    from malduck import Extractor


    class Evil(Extractor):
        yara_rules = ("evil", "weird")
        family = "evil"

        ...

        @Extractor.weak
        @Extractor.extractor
        def dga_seed(self, p, hit):
            # Even if we're able to get the DGA seed, the extractor won't produce
            # a config until is_it_really_evil matches as well
            dga_config = p.readv(hit, 128)
            seed = self._get_dga_seed(dga_config)
            if seed is not None:
                return {"dga_seed": seed}

        @Extractor.final
        def is_it_really_evil(self, p):
            # If p starts with 'evil', we can produce config
            return p.read(p.imgbase, 4) == b'evil'
- @needs_pe
Use this decorator for extractors that need a PE instance (p is guaranteed to be a malduck.procmem.ProcessMemoryPE).
Changed in version 4.0.0: Method must be decorated first with @extractor, @rule or @final decorator.
- @needs_elf
Use this decorator for extractors that need an ELF instance (p is guaranteed to be a malduck.procmem.ProcessMemoryELF).
Changed in version 4.0.0: Method must be decorated first with @extractor, @rule or @final decorator.
- property collected_config
Shows collected config so far (useful in “final” extractors)
- Return type:
dict
- family = None
Extracted malware family, automatically added to “family” key for strong extraction methods
- property globals
Container for global variables associated with analysis
- Return type:
dict
- handle_match(p, match)
Override this if you don’t want to use decorators and want to customize the ripping process (e.g. yara-independent, brute-force techniques).
Called for each rule hit listed in Extractor.yara_rules.
Overriding this method means that all Yara hits must be processed within this method. Ripped configurations must be reported using the push_config() method.
- Parameters:
p (malduck.procmem.ProcessMemory) – ProcessMemory object
match (malduck.yara.YaraRuleMatch) – Found yara matches for currently matched rule
- property log
Logger instance for Extractor methods
- Returns:
logging.Logger
- property matched
Returns True if family has been matched so far
- Return type:
bool
- on_error(exc, method_name)
Handler for all exceptions raised by extractor methods.
- Parameters:
exc (Exception) – Exception object
method_name (str) – Name of method which raised the exception
- overrides = []
Family match overrides another match e.g. citadel overrides zeus
- push_config(config)
Push partial config (used by Extractor.handle_match())
- Parameters:
config (dict) – Partial config element
- push_procmem(procmem: ProcessMemory, **info)
Push extracted procmem object for further analysis
- Parameters:
procmem (malduck.procmem.ProcessMemory) – ProcessMemory object
info – Additional info about object
- yara_rules = ()
Names of Yara rules for which handle_match is called
- class malduck.extractor.ExtractManager(modules: ExtractorModules)
Multi-dump extraction context. Handles merging configs from different dumps, additional dropped families etc.
- Parameters:
modules (ExtractorModules) – Object with loaded extractor modules
- carve_procmem(p: ProcessMemory) -> List[ProcessMemoryBinary]
Carves binaries from ProcessMemory to try configuration extraction using every possible address mapping.
- property config: List[Dict[str, Any]]
Extracted configuration (list of configs for each extracted family)
- property extractors: List[Type[Extractor]]
Bound extractor modules
- Return type:
List[Type[malduck.extractor.Extractor]]
- match_procmem(p: ProcessMemory) -> YaraRulesetMatch
Performs Yara matching on ProcessMemory using modules bound with current ExtractManager.
- on_error(exc: Exception, extractor: Extractor) -> None
Handler for all exceptions raised by Extractor.handle_yara().
Deprecated since version 2.1.0: Look at ExtractManager.on_extractor_error() instead.
- Parameters:
exc (Exception) – Exception object
extractor (malduck.extractor.Extractor) – Extractor object which raised the exception
- on_extractor_error(exc: Exception, extractor: Extractor, method_name: str) -> None
Handler for all exceptions raised by extractor methods (including Extractor.handle_yara()).
Override this method if you want to set your own error handler.
- Parameters:
exc (Exception) – Exception object
extractor (extractor.Extractor) – Extractor instance
method_name (str) – Name of method which raised the exception
- push_file(filepath: str, base: int = 0) -> str | None
Pushes file for extraction. Config extractor entrypoint.
- Parameters:
filepath (str) – Path to extracted file
base (int) – Memory dump base address
- Returns:
Detected family if configuration looks better than already stored one
- push_procmem(p: ProcessMemory, rip_binaries: bool = False) -> str | None
Pushes ProcessMemory object for extraction
- Parameters:
p (malduck.procmem.ProcessMemory) – ProcessMemory object
rip_binaries (bool (default: False)) – Look for binaries (PE, ELF) in provided ProcessMemory and try to perform extraction using specialized variants (ProcessMemoryPE, ProcessMemoryELF)
- Returns:
Detected family if configuration looks better than already stored one
- property rules: Yara
Bound Yara rules
- Return type:
malduck.yara.Yara
- class malduck.extractor.ExtractorModules(modules_path: str | None = None)
Configuration object with loaded Extractor modules for ExtractManager
- Parameters:
modules_path (str) – Path with module files (Extractor classes and Yara files, default ‘~/.malduck’)
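A minimal usage sketch tying ExtractorModules and ExtractManager together, assuming malduck is installed and that `modules_path` points to a directory with Extractor classes and Yara files. The helper name `extract_configs` is hypothetical; it relies only on the constructors, push_file() and the config property documented above.

```python
from typing import Any, Dict, List


def extract_configs(sample_path: str, modules_path: str, base: int = 0) -> List[Dict[str, Any]]:
    """Run all loaded extractor modules against one file and return the configs."""
    # Imported lazily so this helper can be defined without malduck installed.
    from malduck.extractor import ExtractManager, ExtractorModules

    modules = ExtractorModules(modules_path)  # load Extractor classes and Yara rules
    manager = ExtractManager(modules)         # multi-dump extraction context

    # push_file() returns the detected family, or None if nothing better was found.
    family = manager.push_file(sample_path, base=base)
    if family is not None:
        print(f"matched family: {family}")

    # One config dict per extracted family.
    return manager.config
```

For memory dumps, `base` would be the address the dump was mapped at; for a ProcessMemory object that already exists, push_procmem() is the corresponding entry point.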
Internally used classes and routines
- class malduck.extractor.extract_manager.ExtractionContext(parent: ExtractManager)
Single-dump extraction context (single family)
- collected_config: Dict[str, Any]
Collected configuration so far (especially useful for “final” extractors)
- property config: Dict[str, Any]
Returns the collected config, or an empty dict if the family is not matched. The family is not included in the config itself; look at ProcmemExtractManager.family.
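The relationship between collected_config, family and config can be pictured with a toy model. This is a sketch only: `ContextModel` is a hypothetical stand-in, not malduck's ExtractionContext, and merging partial configs with a plain `dict.update` is an assumption.

```python
from typing import Any, Dict, Optional


class ContextModel:
    """Toy stand-in for a single-dump context; not malduck's real class."""

    def __init__(self) -> None:
        self.family: Optional[str] = None
        self.collected_config: Dict[str, Any] = {}

    def push(self, partial: Dict[str, Any]) -> None:
        partial = dict(partial)  # do not mutate the caller's dict
        family = partial.pop("family", None)  # family is tracked separately
        if family is not None:
            self.family = family
        self.collected_config.update(partial)

    @property
    def config(self) -> Dict[str, Any]:
        # Nothing is exposed until a family has been matched.
        return dict(self.collected_config) if self.family else {}


ctx = ContextModel()
ctx.push({"dga_seed": 1234})   # weak result: no family key
print(ctx.config)              # {}
ctx.push({"family": "evil"})   # strong result marks the family as matched
print(ctx.config)              # {'dga_seed': 1234}
```

In this model the weak result is retained in collected_config all along and only becomes visible through config once a strong result sets the family.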
- property family: str | None
Matched family
- on_extractor_error(exc: Exception, extractor: Extractor, method_name: str) -> None
Handler for all exceptions raised by extractor methods.
- Parameters:
exc (Exception) – Exception object
extractor (extractor.Extractor) – Extractor instance
method_name (str) – Name of method which raised the exception
- parent
Bound ExtractManager instance
- push_config(config: Dict[str, Any], extractor: Extractor) -> None
Pushes new partial config
If a strong config provides a different family than the one stored so far, and that family overrides the stored family, the stored family is replaced. Example: citadel overrides zeus.
- Parameters:
config (dict) – Partial config object
extractor (malduck.extractor.Extractor) – Extractor object reference
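The override rule can be sketched as a small pure function. The name `resolve_family` and its parameters are hypothetical, and keeping the stored family when no override applies is an assumption about the behaviour rather than something stated above.

```python
from typing import Optional, Sequence


def resolve_family(stored: Optional[str], new: str, new_overrides: Sequence[str]) -> str:
    """Decide which family to keep after a strong config reports `new`.

    Hypothetical model: `new_overrides` is the `overrides` attribute of the
    extractor that produced the strong config.
    """
    if stored is None or stored == new:
        return new
    if stored in new_overrides:
        return new  # e.g. citadel overrides zeus
    return stored   # assumption: otherwise the stored family is kept


print(resolve_family(None, "zeus", ()))                # zeus
print(resolve_family("zeus", "citadel", ("zeus",)))    # citadel
print(resolve_family("citadel", "zeus", ()))           # citadel
```

The third call shows why the order of matches does not matter in this model: once citadel is stored, a later zeus match cannot displace it.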
- push_procmem(p: ProcessMemory, _matches: YaraRulesetMatch | None = None) -> None
Pushes ProcessMemory object for extraction
- Parameters:
p (malduck.procmem.ProcessMemory) – ProcessMemory object
_matches (malduck.yara.YaraRulesetMatch) – YaraRulesetMatch object (used internally)