Memory model objects (procmem)

ProcessMemory (procmem)

malduck.procmem

alias of malduck.procmem.procmem.ProcessMemory

class malduck.procmem.procmem.ProcessMemory(buf, base=0, regions=None, **_)[source]

Basic virtual memory representation

Short name: procmem

Parameters
  • buf (bytes, mmap, memoryview or bytearray object) – Object with memory contents

  • base (int, optional (default: 0)) – Virtual address of the region of interest (or beginning of buf when no regions provided)

  • regions (List[Region]) – Regions mapping. If set to None (default), buf is mapped into single-region with VA specified in base argument

Let’s assume that notepad.exe_400000.bin contains a raw memory dump starting at base address 0x400000. We can load that file into a ProcessMemory object using the from_file() method:

from malduck import procmem

with procmem.from_file("notepad.exe_400000.bin", base=0x400000) as p:
    mem = p.readv(...)
    ...

If your data is already loaded into a buffer, you can use the procmem constructor directly:

from malduck import procmem

with open("notepad.exe_400000.bin", "rb") as f:
    payload = f.read()

p = procmem(payload, base=0x400000)

Then you can work with the PE image contained in the dump by creating a ProcessMemoryPE object with its from_memory() constructor method:

from malduck import procmem, procmempe

with open("notepad.exe_400000.bin", "rb") as f:
    payload = f.read()

p = procmem(payload, base=0x400000)
ppe = procmempe.from_memory(p)
ppe.pe.resource("NPENCODINGDIALOG")

If you want to load a PE file directly and work with it in the same way as with a memory-mapped image, set the image parameter to True. This also works with ProcessMemoryPE.from_memory() for embedded binaries. The file will be loaded and relocated similarly to how the Windows loader does it.

from malduck import procmempe

with procmempe.from_file("notepad.exe", image=True) as p:
    p.pe.resource("NPENCODINGDIALOG")

addr_region(addr)[source]

Returns Region object mapping specified virtual address

Parameters

addr – Virtual address

Return type

Region

asciiz(addr)[source]

Read a null-terminated ASCII string at address.

close(copy=False)[source]

Closes opened files referenced by ProcessMemory object

If copy is False (default): invalidates the object.

Parameters

copy (bool) – Copy data into string before closing the mmap object (default: False)
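The effect of the copy flag can be illustrated with the standard mmap module alone. This is a minimal sketch, not malduck's implementation: load_and_close is a hypothetical helper, and it only shows why data copied before closing stays usable while a closed mapping does not.

```python
import mmap
import os
import tempfile


def load_and_close(path, copy):
    """Hypothetical helper: map a file, optionally copy its contents, then close the mapping."""
    with open(path, "rb") as f:
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Slicing an open mmap returns an independent bytes object
    data = mapped[:] if copy else mapped
    mapped.close()
    return data


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"MZ\x90\x00")
    path = tmp.name

copied = load_and_close(path, copy=True)
closed = load_and_close(path, copy=False)

try:
    closed[:2]
    still_usable = True
except ValueError:
    # A closed mmap object rejects any further access
    still_usable = False

os.remove(path)
```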

disasmv(addr, size=None, x64=False, count=None)[source]

Disassembles code under specified address

Changed in version 4.0.0: Returns iterator instead of list of instructions

Parameters
  • addr (int) – Virtual address

  • size (int (optional)) – Size of disassembled buffer

  • count (int (optional)) – Number of instructions to disassemble

  • x64 (bool (optional)) – Assembly is 64bit

Returns

Iterator[Instruction]

extract(modules=None, extract_manager=None)[source]

Tries to extract config from ProcessMemory object

Parameters
Returns

Static configuration(s) (malduck.extractor.ExtractManager.config) or None if not extracted

Return type

List[dict] or None

findbytesp(query, offset=None, length=None)[source]

Search for byte sequences (e.g., 4? AA BB ?? DD). Uses yarap() internally

If offset is None, looks for match from the beginning of memory

New in version 1.4.0: Query is passed to yarap as single hexadecimal string rule. Use Yara-compatible strings only

Parameters
  • query (str or bytes) – Sequence of wildcarded hexadecimal bytes, separated by spaces

  • offset (int (optional)) – Buffer offset where searching will be started

  • length (int (optional)) – Length of searched area

Returns

Iterator returning next offsets

Return type

Iterator[int]
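The wildcard syntax accepted by the findbytes methods can be modelled with a regular expression. The sketch below is only an illustration of the matching semantics: wildcard_hex_to_regex is a hypothetical helper, and malduck itself passes the query to Yara rather than to the re module.

```python
import re


def wildcard_hex_to_regex(query):
    """Hypothetical helper: compile a query like '4? AA BB ?? DD' into a bytes regex."""
    parts = []
    for token in query.split():
        if len(token) != 2:
            raise ValueError("expected two hex digits or '?' per byte: %r" % token)
        high, low = token[0], token[1]
        if high == "?" and low == "?":
            # Any byte
            parts.append(b".")
        elif low == "?":
            # Fixed high nibble: a contiguous range such as \x40-\x4f
            base = int(high, 16) << 4
            parts.append(b"[\\x%02x-\\x%02x]" % (base, base | 0x0F))
        elif high == "?":
            # Fixed low nibble: sixteen separate byte values
            values = b"".join(
                b"\\x%02x" % ((h << 4) | int(low, 16)) for h in range(16)
            )
            parts.append(b"[" + values + b"]")
        else:
            parts.append(b"\\x%02x" % int(token, 16))
    # DOTALL so that '.' also matches the 0x0A byte
    return re.compile(b"".join(parts), re.DOTALL)


data = b"\x00\x4a\xaa\xbb\xcc\xdd\x00\x41\xaa\xbb\x0a\xdd"
pattern = wildcard_hex_to_regex("4? AA BB ?? DD")
offsets = [m.start() for m in pattern.finditer(data)]  # [1, 7]
```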

findbytesv(query, addr=None, length=None)[source]

Search for byte sequences (e.g., 4? AA BB ?? DD). Uses yarav() internally

If addr is None, looks for match from the beginning of memory

New in version 1.4.0: Query is passed to yarav as single hexadecimal string rule. Use Yara-compatible strings only

Parameters
  • query (str or bytes) – Sequence of wildcarded hexadecimal bytes, separated by spaces

  • addr (int (optional)) – Virtual address where searching will be started

  • length (int (optional)) – Length of searched area

Returns

Iterator returning found virtual addresses

Return type

Iterator[int]

Usage example:

from malduck import hex

findings = []

for va in mem.findbytesv("4? AA BB ?? DD"):
    if hex(mem.readv(va, 5)) == "4aaabbccdd":
        findings.append(va)

findmz(addr)[source]

Tries to locate MZ header based on address inside PE image

Parameters

addr (int) – Virtual address inside image

Returns

Virtual address of found MZ header or None
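One way to locate an MZ header from an address inside an image is to walk backwards page by page. The sketch below assumes page-aligned images and a hypothetical read_page callback that returns an empty value for unmapped pages. Whether malduck scans in exactly this way is an assumption.

```python
PAGE_SIZE = 0x1000


def find_mz_sketch(read_page, addr, lowest_addr=0):
    """Hypothetical helper: scan page-aligned addresses downwards for an 'MZ' magic."""
    page = addr & ~(PAGE_SIZE - 1)
    while page >= lowest_addr:
        data = read_page(page)
        if not data:
            # Hit an unmapped page before finding a header
            return None
        if data[:2] == b"MZ":
            return page
        page -= PAGE_SIZE
    return None


# Toy memory: three mapped pages, header in the first one
pages = {
    0x400000: b"MZ" + b"\x00" * (PAGE_SIZE - 2),
    0x401000: b"\x00" * PAGE_SIZE,
    0x402000: b"\x00" * PAGE_SIZE,
}


def read_page(va):
    return pages.get(va, b"")


found = find_mz_sketch(read_page, 0x402345)
missing = find_mz_sketch(read_page, 0x500123)
```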

findp(query, offset=None, length=None)[source]

Find raw bytes in memory (non-region-wise).

If offset is None, looks for substring from the beginning of memory

Parameters
  • query (bytes) – Substring to find

  • offset (int (optional)) – Offset in buffer where searching starts

  • length (int (optional)) – Length of searched area

Returns

Generates offsets where bytes were found

Return type

Iterator[int]

findv(query, addr=None, length=None)[source]

Find raw bytes in memory (region-wise)

If addr is None, looks for substring from the beginning of memory

Parameters
  • query (bytes) – Substring to find

  • addr (int (optional)) – Virtual address of region where searching starts

  • length (int (optional)) – Length of searched area

Returns

Generates virtual addresses where bytes were found

Return type

Iterator[int]

classmethod from_file(filename, **kwargs)[source]

Opens file and loads its contents into ProcessMemory object

Parameters

filename – File name to load

Return type

ProcessMemory

It’s highly recommended to use context manager when operating on files:

from malduck import procmem

with procmem.from_file("binary.dmp") as p:
    mem = p.readv(...)
    ...

classmethod from_memory(memory, base=None, **kwargs)[source]

Makes new instance based on another ProcessMemory object.

Useful for specialized derived classes like CuckooProcessMemory

Parameters
  • memory (ProcessMemory) – ProcessMemory object to be copied

  • base (int (optional, default is provided by specialized class)) – Virtual address of region of interest (imgbase)

Return type

ProcessMemory

int16p(offset, fixed=False)[source]

Read signed 16-bit value at offset.

int16v(addr, fixed=False)[source]

Read signed 16-bit value at address.

int32p(offset, fixed=False)[source]

Read signed 32-bit value at offset.

int32v(addr, fixed=False)[source]

Read signed 32-bit value at address.

int64p(offset, fixed=False)[source]

Read signed 64-bit value at offset.

int64v(addr, fixed=False)[source]

Read signed 64-bit value at address.

int8p(offset, fixed=False)[source]

Read signed 8-bit value at offset.

int8v(addr, fixed=False)[source]

Read signed 8-bit value at address.

is_addr(addr)[source]

Checks whether provided parameter is correct virtual address

Parameters

addr – Virtual address candidate

Returns

True if it is mapped by ProcessMemory object

iter_regions(addr=None, offset=None, length=None, contiguous=False, trim=False)[source]

Iterates over Region objects starting at provided virtual address or offset

This method is used internally to enumerate regions using provided strategy.

Warning

If starting point is not provided, iteration will start from the first mapped region. This could be counter-intuitive when length is set. It literally means “get <length> of mapped bytes”. If you want to look for regions from address 0, you need to explicitly provide this address as an argument.

New in version 3.0.0.

Parameters
  • addr (int (default: None)) – Virtual address of starting point

  • offset (int (default: None)) – Offset of starting point, which will be translated to virtual address

  • length (int (default: None, unlimited)) – Length of queried range in VM mapping context

  • contiguous (bool (default: False)) – If True, break after first gap. Starting point must be inside mapped region.

  • trim (bool (default: False)) – Trim Region objects to range boundaries (addr, addr+length)

Return type

Iterator[Region]

property length

Returns length of raw memory contents

Return type

int

p2v(off, length=None)[source]

Buffer (physical) offset to virtual address translation

Changed in version 3.0.0: Added optional mapping length check

Parameters
  • off – Buffer offset

  • length – Expected minimal length of mapping (optional)

Returns

Virtual address or None if offset is not mapped

patchp(offset, buf)[source]

Patch bytes under specified offset

Warning

Family of *p methods doesn’t care about contiguity of regions.

Use p2v() and patchv() if you want to operate on contiguous regions only

Parameters
  • offset (int) – Buffer offset

  • buf (bytes) – Buffer with patch to apply

Usage example:

from malduck import procmem, procmempe, aplib

with procmempe.from_file("mal1.exe.dmp") as ppe:
    # Decompress payload
    payload = aplib.decompress(
        ppe.readv(ppe.imgbase + 0x8400, ppe.imgend)
    )
    embed_pe = procmem(payload, base=0)
    # Fix headers
    embed_pe.patchp(0, b"MZ")
    embed_pe.patchp(embed_pe.uint32p(0x3C), b"PE")
    # Load patched image into procmempe
    embed_pe = procmempe.from_memory(embed_pe, image=True)
    assert embed_pe.asciiz(0x1000a410) == b"StrToIntExA"

patchv(addr, buf)[source]

Patch bytes under specified virtual address

Patched address range must be within single region, ValueError is raised otherwise.

Parameters
  • addr (int) – Virtual address

  • buf (bytes) – Buffer with patch to apply
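The single-region constraint of patchv can be pictured with a small stand-alone model. This is a sketch under assumptions: regions are plain (virtual address, size, buffer offset) tuples and patch_within_region is a hypothetical helper, not the library's code.

```python
def patch_within_region(regions, buf, addr, patch):
    """Hypothetical helper: apply patch at virtual address addr if it fits in one region."""
    for va, size, offset in regions:
        if va <= addr and addr + len(patch) <= va + size:
            start = offset + (addr - va)
            buf[start:start + len(patch)] = patch
            return
    raise ValueError("patched range must fit within a single region")


# Two 4-byte regions backed by one 8-byte buffer
regions = [(0x1000, 4, 0), (0x3000, 4, 4)]
buf = bytearray(8)

patch_within_region(regions, buf, 0x3001, b"\xaa\xbb")

try:
    # Starts in the first region but runs past its end
    patch_within_region(regions, buf, 0x1003, b"\xcc\xdd")
    crossed = False
except ValueError:
    crossed = True
```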

readp(offset, length=None)[source]

Read a chunk of memory from the specified buffer offset.

Warning

Family of *p methods doesn’t care about contiguity of regions.

Use p2v() and readv() if you want to operate on contiguous regions only

Parameters
  • offset – Buffer offset

  • length – Length of chunk (optional)

Returns

Chunk from specified location

Return type

bytes

readv(addr, length=None)[source]

Read a chunk of memory from the specified virtual address

Parameters
  • addr (int) – Virtual address

  • length (int) – Length of chunk (optional)

Returns

Chunk from specified location

Return type

bytes

readv_regions(addr=None, length=None, contiguous=True)[source]

Generate chunks of memory from next contiguous regions, starting from the specified virtual address, until specified length of read data is reached.

Used internally.

Changed in version 3.0.0: Contents of contiguous regions are merged into single string

Parameters
  • addr – Virtual address

  • length – Size of memory to read (optional)

  • contiguous – If True, readv_regions breaks after first gap

Return type

Iterator[Tuple[int, bytes]]
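The merging of contiguous regions and the effect of the contiguous flag can be sketched as follows. The input format (a sorted list of (virtual address, bytes) pairs) and the read_chunks name are assumptions made for illustration only.

```python
def read_chunks(regions, contiguous=True):
    """Hypothetical helper: merge adjacent (va, bytes) pairs into single chunks."""
    current_va, current = None, b""
    for va, data in regions:
        if current_va is None:
            current_va, current = va, data
        elif va == current_va + len(current):
            # Region starts exactly where the previous chunk ends: merge
            current += data
        else:
            # Gap between regions: emit what was collected so far
            yield current_va, current
            if contiguous:
                return
            current_va, current = va, data
    if current_va is not None:
        yield current_va, current


regions = [(0x1000, b"ab"), (0x1002, b"cd"), (0x2000, b"ef")]
first_only = list(read_chunks(regions, contiguous=True))
all_chunks = list(read_chunks(regions, contiguous=False))
```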

readv_until(addr, s)[source]

Read a chunk of memory until the stop marker

Parameters
  • addr (int) – Virtual address

  • s (bytes) – Stop marker

Return type

bytes
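Reading until a stop marker, which is also how a null-terminated string read such as asciiz() can be understood, may be sketched on a plain bytes object. read_until is a hypothetical helper. Returning all remaining bytes when the marker is absent is an assumption of this sketch, not documented behaviour.

```python
def read_until(data, offset, marker):
    """Hypothetical helper: return bytes from offset up to (not including) marker."""
    end = data.find(marker, offset)
    if end == -1:
        # Assumption: without a marker, everything up to the end is returned
        return data[offset:]
    return data[offset:end]


data = b"junkStrToIntExA\x00rest"
name = read_until(data, 4, b"\x00")  # b"StrToIntExA"
```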

regexp(query, offset=None, length=None)[source]

Performs regex on the memory contents (non-region-wise)

If offset is None, looks for match from the beginning of memory

Parameters
  • query (bytes) – Regular expression to find

  • offset (int (optional)) – Offset in buffer where searching starts

  • length (int (optional)) – Length of searched area

Returns

Generates offsets where regex was matched

Return type

Iterator[int]

regexv(query, addr=None, length=None)[source]

Performs regex on the memory contents (region-wise)

If addr is None, looks for match from the beginning of memory

Parameters
  • query (bytes) – Regular expression to find

  • addr (int (optional)) – Virtual address of region where searching starts

  • length (int (optional)) – Length of searched area

Returns

Generates virtual addresses where regex was matched

Return type

Iterator[int]

Warning

Method doesn’t match bytes overlapping the border between regions
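The border limitation can be reproduced with the standard re module. In this sketch regions are modelled as (virtual address, bytes) pairs and regex_per_region is a hypothetical stand-in for a region-wise search: a match that straddles two adjacent regions is found only when the data is searched as one buffer.

```python
import re


def regex_per_region(regions, pattern):
    """Hypothetical helper: yield virtual addresses of matches found inside each region."""
    compiled = re.compile(pattern, re.DOTALL)
    for va, data in regions:
        for match in compiled.finditer(data):
            yield va + match.start()


# Two adjacent regions: the second "http" is split across the border
regions = [(0x1000, b"http..ht"), (0x1008, b"tp://x")]

per_region = list(regex_per_region(regions, rb"http"))

joined = b"".join(data for _, data in regions)
whole_buffer = [0x1000 + m.start() for m in re.finditer(rb"http", joined)]
```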

uint16p(offset, fixed=False)[source]

Read unsigned 16-bit value at offset.

uint16v(addr, fixed=False)[source]

Read unsigned 16-bit value at address.

uint32p(offset, fixed=False)[source]

Read unsigned 32-bit value at offset.

uint32v(addr, fixed=False)[source]

Read unsigned 32-bit value at address.

uint64p(offset, fixed=False)[source]

Read unsigned 64-bit value at offset.

uint64v(addr, fixed=False)[source]

Read unsigned 64-bit value at address.

uint8p(offset, fixed=False)[source]

Read unsigned 8-bit value at offset.

uint8v(addr, fixed=False)[source]

Read unsigned 8-bit value at address.
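The difference between the signed and unsigned readers can be shown with the standard struct module. This sketch assumes little-endian byte order and does not cover the fixed parameter.

```python
import struct

data = bytes([0xFF, 0xFF, 0x34, 0x12])

# The same two bytes interpreted as signed and unsigned 16-bit values
signed16 = struct.unpack_from("<h", data, 0)[0]    # -1
unsigned16 = struct.unpack_from("<H", data, 0)[0]  # 65535

# Unsigned 16-bit value at offset 2
value16 = struct.unpack_from("<H", data, 2)[0]     # 0x1234

# All four bytes as one unsigned 32-bit value
value32 = struct.unpack_from("<I", data, 0)[0]     # 0x1234FFFF
```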

utf16z(addr)[source]

Read a null-terminated UTF-16 ASCII string at address.

Parameters

addr – Virtual address of string

Return type

bytes

v2p(addr, length=None)[source]

Virtual address to buffer (physical) offset translation

Changed in version 3.0.0: Added optional mapping length check

Parameters
  • addr – Virtual address

  • length – Expected minimal length of mapping (optional)

Returns

Buffer offset or None if virtual address is not mapped
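The two translations, v2p() and p2v(), can be sketched over a simple list of mappings. MappedRange and both functions below are hypothetical names for illustration. Checking the length against a single mapping only is a simplification of this sketch.

```python
from typing import List, NamedTuple, Optional


class MappedRange(NamedTuple):
    """Hypothetical stand-in for a region: virtual address, size, buffer offset."""
    addr: int
    size: int
    offset: int


def v2p(regions: List[MappedRange], addr: int, length: Optional[int] = None) -> Optional[int]:
    for r in regions:
        if r.addr <= addr < r.addr + r.size:
            if length is not None and addr + length > r.addr + r.size:
                return None  # mapping is shorter than requested
            return r.offset + (addr - r.addr)
    return None  # address is not mapped


def p2v(regions: List[MappedRange], off: int, length: Optional[int] = None) -> Optional[int]:
    for r in regions:
        if r.offset <= off < r.offset + r.size:
            if length is not None and off + length > r.offset + r.size:
                return None
            return r.addr + (off - r.offset)
    return None  # offset is not mapped


regions = [
    MappedRange(addr=0x400000, size=0x1000, offset=0x0),
    MappedRange(addr=0x402000, size=0x1000, offset=0x1000),
]

offset = v2p(regions, 0x402010)   # 0x1010
address = p2v(regions, 0x1010)    # 0x402010
unmapped = v2p(regions, 0x401000) # None, falls into the gap
```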

yarap(ruleset, offset=None, length=None, extended=False)[source]

Perform yara matching (non-region-wise)

If offset is None, looks for match from the beginning of memory

Changed in version 4.0.0: Added extended option which allows to get extended information about matched strings and rules. Default is False for backwards compatibility.

Parameters
  • ruleset (malduck.yara.Yara) – Yara object with loaded yara rules

  • offset (int (optional)) – Offset in buffer where searching starts

  • length (int (optional)) – Length of searched area

  • extended (bool (optional, default False)) – Returns extended information about matched strings and rules

Return type

malduck.yara.YaraMatches

yarav(ruleset, addr=None, length=None, extended=False)[source]

Perform yara matching (region-wise)

If addr is None, looks for match from the beginning of memory

Changed in version 4.0.0: Added extended option which allows to get extended information about matched strings and rules. Default is False for backwards compatibility.

Parameters
  • ruleset (malduck.yara.Yara) – Yara object with loaded yara rules

  • addr (int (optional)) – Virtual address of region where searching starts

  • length (int (optional)) – Length of searched area

  • extended (bool (optional, default False)) – Returns extended information about matched strings and rules

Return type

malduck.yara.YaraRulesetOffsets or malduck.yara.YaraRulesetMatches if extended is set to True

class malduck.procmem.procmem.Region(addr: int, size: int, state: int, type_: int, protect: int, offset: int)[source]

Represents single mapped region in ProcessMemory

contains_addr(addr: int) → bool[source]

Checks whether region contains provided virtual address

contains_offset(offset: int) → bool[source]

Checks whether region contains provided physical offset

property end

Virtual address of region end (first unmapped byte)

property end_offset

Offset of region end (first unmapped byte)

intersects_range(addr: int, length: int) → bool[source]

Checks whether region mapping intersects with provided range

property last

Virtual address of last region byte

property last_offset

Offset of last region byte

p2v(off: int) → int[source]

Physical offset to virtual address translation. Assumes that offset is valid within Region.

Parameters

off – Physical offset

Returns

Virtual address

to_json() → Dict[str, Union[int, str, None]][source]

Returns JSON-like dict representation

trim_range(addr: int, length: Optional[int] = None) → Optional[malduck.procmem.region.Region][source]

Returns region intersection with provided range

Parameters
  • addr – Virtual address of starting point

  • length – Length of range (optional)

Return type

Region

v2p(addr: int) → int[source]

Virtual address to physical offset translation. Assumes that address is valid within Region.

Parameters

addr – Virtual address

Returns

Physical offset
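The relationship between end, last, intersects_range() and trim_range() can be summarized in a small model. SketchRegion is a hypothetical simplified class that keeps only addr, size and offset. It is not the library's Region and omits the state, type and protection fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchRegion:
    addr: int
    size: int
    offset: int

    @property
    def end(self) -> int:
        # First virtual address past the region
        return self.addr + self.size

    @property
    def last(self) -> int:
        # Virtual address of the last byte inside the region
        return self.addr + self.size - 1

    def intersects_range(self, addr: int, length: int) -> bool:
        return self.addr < addr + length and addr < self.end

    def trim_range(self, addr: int, length: Optional[int] = None) -> Optional["SketchRegion"]:
        new_start = max(self.addr, addr)
        new_end = self.end if length is None else min(self.end, addr + length)
        if new_end <= new_start:
            return None  # no overlap with the requested range
        return SketchRegion(
            addr=new_start,
            size=new_end - new_start,
            offset=self.offset + (new_start - self.addr),
        )


region = SketchRegion(addr=0x1000, size=0x100, offset=0x40)
trimmed = region.trim_range(0x1080, 0x200)
```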

ProcessMemoryPE (procmempe)

malduck.procmempe

alias of malduck.procmem.procmempe.ProcessMemoryPE

class malduck.procmem.procmempe.ProcessMemoryPE(buf: Union[bytes, bytearray, mmap.mmap], base: int = 0, regions: Optional[List[malduck.procmem.region.Region]] = None, image: bool = False, detect_image: bool = False)[source]

Representation of memory-mapped PE file

Short name: procmempe

PE files can be read directly using inherited ProcessMemory.from_file() with image argument set (look at from_memory() method).

property imgend

Address where PE image ends

is_image_loaded_as_memdump() → bool[source]

Checks whether memory region contains image incorrectly loaded as memory-mapped PE dump (image=False).

embed_pe = procmempe.from_memory(mem)
if embed_pe.is_image_loaded_as_memdump():
    # Memory contains plain PE file - need to load it first
    embed_pe = procmempe.from_memory(mem, image=True)

is_valid() → bool[source]

Checks whether imgbase is pointing at valid binary header

property pe

Related PE object

store() → bytes[source]

Store ProcessMemoryPE contents as PE file data.

Return type

bytes

ProcessMemoryELF (procmemelf)

malduck.procmemelf

alias of malduck.procmem.procmemelf.ProcessMemoryELF

class malduck.procmem.procmemelf.ProcessMemoryELF(buf: Union[bytes, bytearray, mmap.mmap], base: int = 0, regions: Optional[List[malduck.procmem.region.Region]] = None, image: bool = False, detect_image: bool = False)[source]

Representation of memory-mapped ELF file

Short name: procmemelf

ELF files can be read directly using inherited ProcessMemory.from_file() with image argument set (look at from_memory() method).

property elf

Related ELFFile object

property imgend

Address where ELF image ends

is_image_loaded_as_memdump()[source]

Uses some heuristics to deduce whether contents can be loaded with image=True. Used by detect_image

is_valid() → bool[source]

Checks whether imgbase is pointing at valid binary header

CuckooProcessMemory (cuckoomem)

malduck.cuckoomem

alias of malduck.procmem.cuckoomem.CuckooProcessMemory

class malduck.procmem.cuckoomem.CuckooProcessMemory(buf: Union[bytes, bytearray, mmap.mmap], base: Optional[int] = None, **_)[source]

Wrapper object to operate on process memory dumps in Cuckoo 2.x format.

IDAProcessMemory (idamem)

malduck.idamem

alias of malduck.procmem.idamem.IDAProcessMemory

class malduck.procmem.idamem.IDAProcessMemory[source]

ProcessMemory representation operating in IDAPython context [BETA]