Static configuration extractor engine

Module interface

class malduck.extractor.Extractor(parent)

Base class for extractor modules

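The following class attributes need to be defined:

  • family (see Extractor.family)

  • yara_rules

  • overrides (optional, see Extractor.overrides)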

Example extractor code for Citadel:

from malduck import Extractor

class Citadel(Extractor):
    family = "citadel"
    yara_rules = ("citadel",)
    overrides = ("zeus",)

    @Extractor.string("briankerbs")
    def citadel_found(self, p, addr, match):
        # Called for each $briankerbs hit; returning True marks the family as matched
        self.log.info('[+] `Coded by Brian Krebs` str @ %X', addr)
        return True

    @Extractor.string
    def cit_login(self, p, addr, match):
        self.log.info('[+] Found login_key xor @ %X', addr)
        # Try the pointer at offset 4 first, then at offset 5
        for offset in (4, 5):
            hit = p.uint32v(addr + offset)
            self.log.debug('candidate login_key pointer: %r', hit)
            if hit is not None and p.is_addr(hit):
                return {'login_key': p.asciiz(hit)}

Decorated methods are always called in the following order:

  • @Extractor.extractor methods

  • @Extractor.string methods

  • @Extractor.rule methods

  • @Extractor.final methods
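The ordering above can be pictured with a small standalone model. It does not import malduck: `STAGE_ORDER`, `run_in_order` and the handler tuples are hypothetical names used only for illustration, and the real dispatcher may be structured differently.

```python
# Standalone model of the documented call order; not malduck's actual dispatcher.
STAGE_ORDER = ["extractor", "string", "rule", "final"]  # order stated above


def run_in_order(handlers):
    """Call (stage, name, func) handlers grouped by the documented stage order.

    Handlers that share a stage keep their relative order because
    sorted() is stable.
    """
    calls = []
    ranked = sorted(handlers, key=lambda h: STAGE_ORDER.index(h[0]))
    for stage, name, func in ranked:
        func()
        calls.append((stage, name))
    return calls


# Hypothetical handlers, registered in an arbitrary order.
handlers = [
    ("final", "get_config", lambda: None),
    ("rule", "evil", lambda: None),
    ("string", "cit_login", lambda: None),
    ("extractor", "citadel_found", lambda: None),
]

for stage, name in run_in_order(handlers):
    print(f"{stage:<9} -> {name}")
```

Registration order does not matter in this model; only the stage a method was decorated for decides when it runs.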

@string

Decorator for string-based extractor methods. The method is called each time a string with the same identifier as the method name matches.

A method can be called for several number-suffixed strings, e.g. hits on $keyex1 and $keyex2 both call the keyex method.
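The suffix handling can be sketched as a simple name mapping. This is only an illustration: `method_name_for` is a hypothetical helper, and the rule it applies (drop the leading "$", strip trailing digits) is an assumption based on the $keyex1/$keyex2 example rather than on malduck's source.

```python
import re


def method_name_for(string_identifier: str) -> str:
    """Map a Yara string identifier to a method name (illustrative rule only).

    Assumption: the leading "$" is dropped and trailing digits are stripped.
    """
    name = string_identifier.lstrip("$")
    return re.sub(r"\d+$", "", name)


for ident in ("$keyex1", "$keyex2", "$keyex"):
    print(ident, "->", method_name_for(ident))
```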

You can optionally provide the actual string identifier as an argument if you don’t want to name your method after the string identifier.

Signature of decorated method:

@Extractor.string
def string_identifier(self, p: ProcessMemory, addr: int, match: YaraStringMatch) -> Config:
    # p: ProcessMemory object that contains matched file/dump representation
    # addr: Virtual address of matched string
    # Called for each "$string_identifier" hit
    ...

If you want to use the same method for several differently named strings, pass multiple identifiers as arguments to the @Extractor.string decorator.

Extractor methods should return a dict with the extracted part of the configuration, True to indicate a match, or False/None when the family has not been matched.

For strong methods, truthy values are transformed into a dict that includes the {"family": self.family} key.
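The return-value rules can be modelled in a few lines. This is a standalone sketch under stated assumptions, not malduck's implementation: `normalize_result` and its `weak` flag are hypothetical, falsy results are simply dropped, and True is treated as an empty partial config.

```python
from typing import Any, Dict, Optional


def normalize_result(result: Any, family: str, weak: bool = False) -> Optional[Dict[str, Any]]:
    """Model of how a method's return value becomes a partial config.

    Assumptions (not taken from malduck's source): falsy results are dropped,
    True becomes an empty partial config, and strong (non-weak) methods get
    the "family" key added.
    """
    if not result:
        return None  # False/None: nothing was extracted
    config = dict(result) if isinstance(result, dict) else {}
    if not weak:
        config["family"] = family
    return config


print(normalize_result(True, "citadel"))
print(normalize_result({"login_key": "abc"}, "citadel"))
print(normalize_result(None, "citadel"))
```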

New in version 4.0.0: Added @Extractor.string as extended version of @Extractor.extractor

Parameters:

strings_or_method (str, optional) – If the method name doesn’t match the string identifier, pass the Yara string identifier as the decorator argument. Multiple strings are accepted.

@extractor

Simplified variant of @Extractor.string.

Doesn’t accept multiple strings and passes only string offset to the extractor method.

from malduck import Extractor

class Citadel(Extractor):
    family = "citadel"
    yara_rules = ("citadel",)
    overrides = ("zeus",)

    @Extractor.extractor("briankerbs")
    def citadel_found(self, p, addr):
        # Called for each $briankerbs hit
        ...

    @Extractor.extractor
    def cit_login(self, p, addr):
        # Called for each $cit_login1, $cit_login2 hit
        ...

@rule

Decorator for rule-based extractor methods, called once per rule match after the string-based extraction methods.

The method is called each time a rule with the same identifier as the method name matches.

You can optionally provide the actual rule identifier as an argument if you don’t want to name your method after the rule identifier.

The rule identifier must appear in the yara_rules tuple.

Signature of decorated method:

@Extractor.rule
def rule_identifier(self, p: ProcessMemory, matches: YaraMatch) -> Config:
    # p: ProcessMemory object that contains matched file/dump representation
    # matches: YaraMatch object with offsets of all matched strings related to the rule
    # Called for matched rule named "rule_identifier".
    ...

New in version 4.0.0: Added @Extractor.rule decorator

from malduck import Extractor

class Evil(Extractor):
    yara_rules = ("evil", "weird")
    family = "evil"

    ...

    @Extractor.rule
    def evil(self, p, matches):
        # This will be called each time the "evil" rule matches.
        # `matches` is a YaraMatch object that contains information about
        # all string matches related to this rule.
        ...

Parameters:

string_or_method (str, optional) – If the method name doesn’t match the rule identifier, pass the Yara rule identifier as the decorator argument

@final

Decorator for final extractor methods, called once for each single rule match after other extraction methods.

Behaves similarly to the @rule-decorated methods but is called for each rule match regardless of the rule identifier.

Signature of decorated method:

@Extractor.final
def get_config(self, p: ProcessMemory) -> Config:
    # p: ProcessMemory object that contains matched file/dump representation
    # Called for each matched rule in self.yara_rules
    ...

from malduck import Extractor

class Evil(Extractor):
    yara_rules = ("evil", "weird")
    family = "evil"

    ...

    @Extractor.needs_pe
    @Extractor.final
    def get_config(self, p):
        # This will be called each time the "evil" or "weird" rule matches
        cfg = {"urls": self.get_cncs_from_rsrc(p)}
        if "role" not in self.collected_config:
            cfg["role"] = "loader"
        return cfg

@weak

Use this decorator for extractors when successful extraction is not sufficient to mark family as matched.

All “weak configs” will be flushed when a “strong config” appears.
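One way to read the weak/strong distinction is sketched below as a standalone model. `ContextModel` is a hypothetical stand-in, not malduck's extraction context, and it assumes that a partial config carrying a "family" key counts as strong while data from weak methods is held back until then.

```python
from typing import Any, Dict, Optional


class ContextModel:
    """Toy stand-in for an extraction context; all names are hypothetical."""

    def __init__(self) -> None:
        self.collected: Dict[str, Any] = {}
        self.family: Optional[str] = None

    def push(self, partial: Dict[str, Any]) -> None:
        # A partial config carrying "family" is treated as strong in this model.
        partial = dict(partial)
        family = partial.pop("family", None)
        if family is not None:
            self.family = family
        self.collected.update(partial)

    @property
    def config(self) -> Dict[str, Any]:
        # Weak data stays hidden until some strong method has matched the family.
        return dict(self.collected) if self.family else {}


ctx = ContextModel()
ctx.push({"dga_seed": 1337})   # weak: no family yet
print(ctx.config)              # {}
ctx.push({"family": "evil"})   # strong match arrives
print(ctx.config)              # {'dga_seed': 1337}
```

The weak value is never lost in this model; it only becomes visible once a strong method confirms the family.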

Changed in version 4.0.0: Method must be decorated first with @extractor, @rule or @final decorator

from malduck import Extractor

class Evil(Extractor):
    yara_rules = ("evil", "weird")
    family = "evil"

    ...

    @Extractor.weak
    @Extractor.extractor
    def dga_seed(self, p, hit):
        # Even if we're able to get the DGA seed, the extractor won't produce a config
        # until is_it_really_evil matches as well
        dga_config = p.readv(hit, 128)
        seed = self._get_dga_seed(dga_config)
        if seed is not None:
            return {"dga_seed": seed}

    @Extractor.final
    def is_it_really_evil(self, p):
        # If p starts with 'evil', we can produce config
        return p.readv(p.imgbase, 4) == b'evil'

@needs_pe

Use this decorator for extractors that need a PE instance (p is guaranteed to be malduck.procmem.ProcessMemoryPE).

Changed in version 4.0.0: Method must be decorated first with @extractor, @rule or @final decorator

@needs_elf

Use this decorator for extractors that need an ELF instance (p is guaranteed to be malduck.procmem.ProcessMemoryELF).

Changed in version 4.0.0: Method must be decorated first with @extractor, @rule or @final decorator.

property collected_config

Shows collected config so far (useful in “final” extractors)

Return type:

dict

family = None

Extracted malware family, automatically added to “family” key for strong extraction methods

property globals

Container for global variables associated with analysis

Return type:

dict

handle_match(p, match)

Override this if you don’t want to use decorators and want to customize the ripping process (e.g. Yara-independent, brute-force techniques).

Called for each rule hit listed in Extractor.yara_rules.

Overriding this method means that all Yara hits must be processed within this method. Ripped configurations must be reported using the push_config() method.

Parameters:
  • p (malduck.procmem.ProcessMemory) – ProcessMemory object

  • match (malduck.yara.YaraRuleMatch) – Found yara matches for currently matched rule

property log

Logger instance for Extractor methods

Returns:

logging.Logger

property matched

Returns True if family has been matched so far

Return type:

bool

on_error(exc, method_name)

Handler for all exceptions raised by extractor methods.

Parameters:
  • exc (Exception) – Exception object

  • method_name (str) – Name of method which raised the exception

overrides = []

Families whose match is overridden by a match of this family, e.g. citadel overrides zeus.

push_config(config)

Push partial config (used by Extractor.handle_match())

Parameters:

config (dict) – Partial config element

push_procmem(procmem: ProcessMemory, **info)

Push extracted procmem object for further analysis

Parameters:
  • procmem (malduck.procmem.ProcessMemory) – ProcessMemory object

  • info – Additional info about object

yara_rules = ()

Names of Yara rules for which handle_match is called

class malduck.extractor.ExtractManager(modules: ExtractorModules)

Multi-dump extraction context. Handles merging configs from different dumps, additional dropped families etc.

Parameters:

modules (ExtractorModules) – Object with loaded extractor modules

carve_procmem(p: ProcessMemory) -> List[ProcessMemoryBinary]

Carves binaries from ProcessMemory to try configuration extraction using every possible address mapping.

property config: List[Dict[str, Any]]

Extracted configuration (list of configs for each extracted family)

property extractors: List[Type[Extractor]]

Bound extractor modules

Return type:

List[Type[malduck.extractor.Extractor]]

match_procmem(p: ProcessMemory) -> YaraRulesetMatch

Performs Yara matching on ProcessMemory using modules bound with current ExtractManager.

on_error(exc: Exception, extractor: Extractor) -> None

Handler for all exceptions raised by Extractor.handle_yara().

Deprecated since version 2.1.0: Look at ExtractManager.on_extractor_error() instead.

Parameters:
  • exc (Exception) – Exception object

  • extractor (extractor.Extractor) – Extractor instance

on_extractor_error(exc: Exception, extractor: Extractor, method_name: str) -> None

Handler for all exceptions raised by extractor methods (including Extractor.handle_yara()).

Override this method if you want to set your own error handler.

Parameters:
  • exc (Exception) – Exception object

  • extractor (extractor.Extractor) – Extractor instance

  • method_name (str) – Name of method which raised the exception

push_file(filepath: str, base: int = 0) -> str | None

Pushes file for extraction. Config extractor entrypoint.

Parameters:
  • filepath (str) – Path to extracted file

  • base (int) – Memory dump base address

Returns:

Detected family if configuration looks better than already stored one
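A minimal usage sketch of this entrypoint follows, using only the classes, methods and properties documented in this section. `extract_from_dump` is a hypothetical helper, the paths in the commented call are placeholders, and the malduck import is deferred into the function so that defining the helper does not require the library.

```python
def extract_from_dump(modules_path, dump_path, base=0):
    """Run every loaded extractor module against a single dump file.

    Requires malduck to be installed; the import is deferred so that merely
    defining this helper has no side effects.
    """
    from malduck.extractor import ExtractManager, ExtractorModules

    modules = ExtractorModules(modules_path)  # loads Extractor classes and Yara files
    manager = ExtractManager(modules)
    family = manager.push_file(dump_path, base=base)
    # manager.config is a list with one config per extracted family
    return family, manager.config


# Example call (placeholder paths):
# family, configs = extract_from_dump("~/.malduck", "dump_400000.bin", base=0x400000)
```

Because ExtractManager is a multi-dump context, the same manager can receive several push_file() calls when configs from multiple dumps should be merged.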

push_procmem(p: ProcessMemory, rip_binaries: bool = False) -> str | None

Pushes ProcessMemory object for extraction

Parameters:
  • p (malduck.procmem.ProcessMemory) – ProcessMemory object

  • rip_binaries (bool (default: False)) – Look for binaries (PE, ELF) in provided ProcessMemory and try to perform extraction using specialized variants (ProcessMemoryPE, ProcessMemoryELF)

Returns:

Detected family if configuration looks better than already stored one

property rules: Yara

Bound Yara rules

Return type:

malduck.yara.Yara

class malduck.extractor.ExtractorModules(modules_path: str | None = None)

Configuration object with loaded Extractor modules for ExtractManager

Parameters:

modules_path (str) – Path with module files (Extractor classes and Yara files, default ‘~/.malduck’)

compare_family_overrides(first: str, second: str) -> int

Checks which family supersedes which. Relations can be transitive, so ExtractorModules builds all possible paths and checks the order. If there is no such relationship between families, function returns None.
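The transitive comparison can be sketched as a graph search. This is a standalone illustration rather than the library's code: `overrides_transitively` and `compare_overrides` are hypothetical, the sign convention (1 if the first family wins, -1 if the second wins) is an assumption, and the "zbot_legacy" and "unrelated" family names are invented for the example.

```python
from typing import Dict, List, Optional


def overrides_transitively(graph: Dict[str, List[str]], first: str, second: str) -> bool:
    """Return True if `first` overrides `second` directly or through a chain."""
    stack, seen = list(graph.get(first, [])), set()
    while stack:
        family = stack.pop()
        if family == second:
            return True
        if family not in seen:  # guard against cyclic override declarations
            seen.add(family)
            stack.extend(graph.get(family, []))
    return False


def compare_overrides(graph: Dict[str, List[str]], first: str, second: str) -> Optional[int]:
    # Sign convention is an assumption: 1 if first wins, -1 if second wins.
    if overrides_transitively(graph, first, second):
        return 1
    if overrides_transitively(graph, second, first):
        return -1
    return None  # no relationship between the two families


# Hypothetical override table: family -> families it overrides.
graph = {"citadel": ["zeus"], "zeus": ["zbot_legacy"]}
print(compare_overrides(graph, "citadel", "zbot_legacy"))  # 1
print(compare_overrides(graph, "zeus", "citadel"))         # -1
print(compare_overrides(graph, "citadel", "unrelated"))    # None
```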

on_error(exc: Exception, module_name: str) -> None

Handler for all exceptions raised during module load

Override this method if you want to set your own error handler.

Parameters:
  • exc (Exception) – Exception object

  • module_name (str) – Name of module which raised the exception

Internally used classes and routines

class malduck.extractor.extract_manager.ExtractionContext(parent: ExtractManager)

Single-dump extraction context (single family)

collected_config: Dict[str, Any]

Collected configuration so far (especially useful for “final” extractors)

property config: Dict[str, Any]

Returns the collected config, or an empty dict if the family has not been matched. The family is not included in the config itself; see ExtractionContext.family.

property family: str | None

Matched family

on_extractor_error(exc: Exception, extractor: Extractor, method_name: str) -> None

Handler for all exceptions raised by extractor methods.

Parameters:
  • exc (Exception) – Exception object

  • extractor (extractor.Extractor) – Extractor instance

  • method_name (str) – Name of method which raised the exception

parent

Bound ExtractManager instance

push_config(config: Dict[str, Any], extractor: Extractor) -> None

Pushes new partial config

If a strong config provides a different family than the one stored so far, and that family overrides the stored family, the stored family is replaced. Example: citadel overrides zeus.

Parameters:
  • config (dict) – Partial config element

  • extractor (extractor.Extractor) – Extractor instance

push_procmem(p: ProcessMemory, _matches: YaraRulesetMatch | None = None) -> None

Pushes ProcessMemory object for extraction

Parameters:
  • p (malduck.procmem.ProcessMemory) – ProcessMemory object

  • _matches (malduck.yara.YaraRulesetMatch) – YaraRulesetMatch object (used internally)