Memory model objects (procmem)

ProcessMemory (procmem)

malduck.procmem

alias of ProcessMemory

class malduck.procmem.procmem.ProcessMemory(buf, base=0, regions=None, **_)[source]

Basic virtual memory representation

Short name: procmem

Parameters:
  • buf (bytes, mmap, memoryview, bytearray or MemoryBuffer object) – Object with memory contents

  • base (int, optional (default: 0)) – Virtual address of the region of interest (or beginning of buf when no regions provided)

  • regions (List[Region]) – Regions mapping. If set to None (default), buf is mapped into single-region with VA specified in base argument

Let’s assume that notepad.exe_400000.bin contains a raw memory dump starting at base address 0x400000. We can load that file into a ProcessMemory object using the from_file() method:

from malduck import procmem

with procmem.from_file("notepad.exe_400000.bin", base=0x400000) as p:
    mem = p.readv(...)
    ...

If your data is already loaded into a buffer, you can use the procmem constructor directly:

from malduck import procmem

with open("notepad.exe_400000.bin", "rb") as f:
    payload = f.read()

p = procmem(payload, base=0x400000)

Then you can work with the PE image contained in the dump by creating a ProcessMemoryPE object using its from_memory() constructor method:

from malduck import procmem, procmempe

with open("notepad.exe_400000.bin", "rb") as f:
    payload = f.read()

p = procmem(payload, base=0x400000)
ppe = procmempe.from_memory(p)
ppe.pe.resource("NPENCODINGDIALOG")

If you want to load a PE file directly and work with it in the same way as with a memory-mapped image, use the image parameter. It also works with ProcessMemoryPE.from_memory() for embedded binaries. The file will be loaded and relocated in a similar way to how the Windows loader does it.

from malduck import procmempe

with procmempe.from_file("notepad.exe", image=True) as p:
    p.pe.resource("NPENCODINGDIALOG")
addr_region(addr)[source]

Returns Region object mapping specified virtual address

Parameters:

addr – Virtual address

Return type:

Region

asciiz(addr)[source]

Read a null-terminated ASCII string at address.

close(copy=False)[source]

Closes the opened files referenced and owned by this ProcessMemory object.

If copy is False (default): invalidates the object.

Parameters:

copy (bool) – Copy data into string before closing the mmap object (default: False)

disasmv(addr, size=None, x64=False, count=None)[source]

Disassembles code under specified address

Changed in version 4.0.0: Returns iterator instead of list of instructions

Parameters:
  • addr (int) – Virtual address

  • size (int (optional)) – Size of disassembled buffer

  • count (int (optional)) – Number of instructions to disassemble

  • x64 (bool (optional)) – Assembly is 64bit

Returns:

Iterator[Instruction]

extract(modules=None, extract_manager=None)[source]

Tries to extract config from ProcessMemory object

Returns:

Static configuration(s) (malduck.extractor.ExtractManager.config) or None if not extracted

Return type:

List[dict] or None

findbytesp(query, offset=None, length=None)[source]

Search for byte sequences (e.g., 4? AA BB ?? DD). Uses yarap() internally

If offset is None, looks for match from the beginning of memory

New in version 1.4.0: Query is passed to yarap as single hexadecimal string rule. Use Yara-compatible strings only

Parameters:
  • query (str or bytes) – Sequence of wildcarded hexadecimal bytes, separated by spaces

  • offset (int (optional)) – Buffer offset where searching will be started

  • length (int (optional)) – Length of searched area

Returns:

Iterator returning next offsets

Return type:

Iterator[int]

findbytesv(query, addr=None, length=None)[source]

Search for byte sequences (e.g., 4? AA BB ?? DD). Uses yarav() internally

If addr is None, looks for match from the beginning of memory

New in version 1.4.0: Query is passed to yarav as single hexadecimal string rule. Use Yara-compatible strings only

Parameters:
  • query (str or bytes) – Sequence of wildcarded hexadecimal bytes, separated by spaces

  • addr (int (optional)) – Virtual address where searching will be started

  • length (int (optional)) – Length of searched area

Returns:

Iterator returning found virtual addresses

Return type:

Iterator[int]

Usage example:

from malduck import hex

findings = []

for va in mem.findbytesv("4? AA BB ?? DD"):
    if hex(mem.readv(va, 5)) == "4aaabbccdd":
        findings.append(va)
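
The wildcard syntax accepted by findbytesp() and findbytesv() can be illustrated without Yara. The following is a minimal pure-Python sketch, not malduck's implementation (which delegates to Yara): the helper names parse_hex_pattern and find_pattern are hypothetical, and the sketch assumes each token is exactly two characters, where ? matches any nibble.

```python
from typing import Iterator, List, Tuple


def parse_hex_pattern(query: str) -> List[Tuple[int, int]]:
    """Turn '4? AA BB ?? DD' into a list of (mask, value) pairs, one per byte."""
    pattern = []
    for token in query.split():
        if len(token) != 2:
            raise ValueError(f"expected two characters per byte, got {token!r}")
        mask = value = 0
        for nibble in token:
            mask <<= 4
            value <<= 4
            if nibble != "?":
                mask |= 0xF                # this nibble must match exactly
                value |= int(nibble, 16)
        pattern.append((mask, value))
    return pattern


def find_pattern(data: bytes, query: str) -> Iterator[int]:
    """Yield every buffer offset where the wildcarded pattern matches."""
    pattern = parse_hex_pattern(query)
    for start in range(len(data) - len(pattern) + 1):
        window = data[start:start + len(pattern)]
        if all((byte & mask) == value for byte, (mask, value) in zip(window, pattern)):
            yield start


sample = bytes.fromhex("00 4a aa bb cc dd 00 41 aa bb 00 dd")
matches = list(find_pattern(sample, "4? AA BB ?? DD"))  # offsets 1 and 7 match
```

The sketch returns plain buffer offsets, which corresponds to the findbytesp() view; findbytesv() reports virtual addresses instead.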
findmz(addr)[source]

Tries to locate MZ header based on address inside PE image

Parameters:

addr (int) – Virtual address inside image

Returns:

Virtual address of found MZ header or None

findp(query, offset=None, length=None)[source]

Find raw bytes in memory (non-region-wise).

If offset is None, looks for substring from the beginning of memory

Parameters:
  • query (bytes) – Substring to find

  • offset (int (optional)) – Offset in buffer where searching starts

  • length (int (optional)) – Length of searched area

Returns:

Generates offsets where bytes were found

Return type:

Iterator[int]

findv(query, addr=None, length=None)[source]

Find raw bytes in memory (region-wise)

If addr is None, looks for substring from the beginning of memory

Parameters:
  • query (bytes) – Substring to find

  • addr (int (optional)) – Virtual address of region where searching starts

  • length (int (optional)) – Length of searched area

Returns:

Generates virtual addresses where bytes were found

Return type:

Iterator[int]

classmethod from_file(filename, **kwargs)[source]

Opens file and loads its contents into ProcessMemory object

Parameters:

filename – File name to load

Return type:

ProcessMemory

It’s highly recommended to use a context manager when operating on files:

from malduck import procmem

with procmem.from_file("binary.dmp") as p:
    mem = p.readv(...)
    ...
classmethod from_memory(memory, base=None, **kwargs)[source]

Makes new instance based on another ProcessMemory object.

Useful for specialized derived classes like CuckooProcessMemory

Parameters:
  • memory (ProcessMemory) – ProcessMemory object to be copied

  • base (int (optional, default is provided by specialized class)) – Virtual address of region of interest (imgbase)

Return type:

ProcessMemory

int16p(offset, fixed=False)[source]

Read signed 16-bit value at offset.

int16v(addr, fixed=False)[source]

Read signed 16-bit value at address.

int32p(offset, fixed=False)[source]

Read signed 32-bit value at offset.

int32v(addr, fixed=False)[source]

Read signed 32-bit value at address.

int64p(offset, fixed=False)[source]

Read signed 64-bit value at offset.

int64v(addr, fixed=False)[source]

Read signed 64-bit value at address.

int8p(offset, fixed=False)[source]

Read signed 8-bit value at offset.

int8v(addr, fixed=False)[source]

Read signed 8-bit value at address.

is_addr(addr)[source]

Checks whether provided parameter is correct virtual address

Parameters:

addr – Virtual address candidate

Returns:

True if it is mapped by ProcessMemory object

iter_regions(addr=None, offset=None, length=None, contiguous=False, trim=False)[source]

Iterates over Region objects starting at provided virtual address or offset

This method is used internally to enumerate regions using provided strategy.

Warning

If starting point is not provided, iteration will start from the first mapped region. This could be counter-intuitive when length is set. It literally means “get <length> of mapped bytes”. If you want to look for regions from address 0, you need to explicitly provide this address as an argument.

New in version 3.0.0.

Parameters:
  • addr (int (default: None)) – Virtual address of starting point

  • offset (int (default: None)) – Offset of starting point, which will be translated to virtual address

  • length (int (default: None, unlimited)) – Length of queried range in VM mapping context

  • contiguous (bool (default: False)) – If True, break after first gap. Starting point must be inside mapped region.

  • trim (bool (default: False)) – Trim Region objects to range boundaries (addr, addr+length)

Return type:

Iterator[Region]
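
The contiguous option can be pictured with a small standalone sketch. It is only an illustration of the "break after first gap" rule described above, not malduck's code: Span and iter_contiguous are hypothetical names, and the sketch assumes regions are described by a start address and a size.

```python
from typing import Iterator, List, NamedTuple


class Span(NamedTuple):
    addr: int  # virtual address of the first mapped byte
    size: int  # number of mapped bytes

    @property
    def end(self) -> int:
        return self.addr + self.size  # first unmapped byte


def iter_contiguous(regions: List[Span], addr: int) -> Iterator[Span]:
    """Yield regions starting at addr and stop at the first gap."""
    expected = None
    for region in sorted(regions):
        if region.end <= addr:
            continue  # region lies entirely before the starting point
        if expected is None:
            if region.addr > addr:
                return  # starting point is not inside a mapped region
        elif region.addr != expected:
            return  # gap between the previous region and this one
        yield region
        expected = region.end


spans = [Span(0x1000, 0x1000), Span(0x2000, 0x1000), Span(0x4000, 0x1000)]
reachable = list(iter_contiguous(spans, 0x1800))  # first two spans; 0x3000 is a gap
```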

property length

Returns length of raw memory contents

Return type:

int

p2v(off, length=None)[source]

Buffer (physical) offset to virtual address translation

Changed in version 3.0.0: Added optional mapping length check

Parameters:
  • off – Buffer offset

  • length – Expected minimal length of mapping (optional)

Returns:

Virtual address or None if offset is not mapped

patchp(offset, buf)[source]

Patch bytes under specified offset

Warning

Family of *p methods doesn’t care about contiguity of regions.

Use p2v() and patchv() if you want to operate on contiguous regions only

Parameters:
  • offset (int) – Buffer offset

  • buf (bytes) – Buffer with patch to apply

Usage example:

from malduck import procmem, procmempe, aplib

with procmempe.from_file("mal1.exe.dmp") as ppe:
    # Decompress payload
    payload = aplib(
        ppe.readv(ppe.imgbase + 0x8400, ppe.imgend)
    )
    embed_pe = procmem(payload, base=0)
    # Fix headers
    embed_pe.patchp(0, b"MZ")
    embed_pe.patchp(embed_pe.uint32p(0x3C), b"PE")
    # Load patched image into procmempe
    embed_pe = procmempe.from_memory(embed_pe, image=True)
    assert embed_pe.asciiz(0x1000a410) == b"StrToIntExA"
patchv(addr, buf)[source]

Patch bytes under specified virtual address

The patched address range must lie within a single region; otherwise ValueError is raised.

Parameters:
  • addr (int) – Virtual address

  • buf (bytes) – Buffer with patch to apply

readp(offset, length=None)[source]

Read a chunk of memory from the specified buffer offset.

Warning

Family of *p methods doesn’t care about contiguity of regions.

Use p2v() and readv() if you want to operate on contiguous regions only

Parameters:
  • offset – Buffer offset

  • length – Length of chunk (optional)

Returns:

Chunk from specified location

Return type:

bytes

readv(addr, length=None)[source]

Read a chunk of memory from the specified virtual address

Parameters:
  • addr (int) – Virtual address

  • length (int) – Length of chunk (optional)

Returns:

Chunk from specified location

Return type:

bytes

readv_regions(addr=None, length=None, contiguous=True)[source]

Generate chunks of memory from next contiguous regions, starting from the specified virtual address, until specified length of read data is reached.

Used internally.

Changed in version 3.0.0: Contents of contiguous regions are merged into single string

Parameters:
  • addr – Virtual address

  • length – Size of memory to read (optional)

  • contiguous – If True, readv_regions breaks after first gap

Return type:

Iterator[Tuple[int, bytes]]

readv_until(addr, s)[source]

Read a chunk of memory until the stop marker

Parameters:
  • addr (int) – Virtual address

  • s (bytes) – Stop marker

Return type:

bytes
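
As a rough picture of what "read until the stop marker" means, here is a sketch on a plain bytes buffer. It is not the library's code: read_until is a hypothetical helper, and two behaviours are assumptions of the sketch, namely that the marker itself is excluded from the result and that the rest of the buffer is returned when the marker is absent.

```python
def read_until(data: bytes, offset: int, stop: bytes) -> bytes:
    """Return data[offset:] up to, but not including, the first stop marker."""
    end = data.find(stop, offset)
    if end == -1:
        return data[offset:]  # assumption: no marker means "read to the end"
    return data[offset:end]


dump = b"\x90\x90kernel32.dll\x00rest"
name = read_until(dump, 2, b"\x00")  # a null stop marker gives asciiz-like behaviour
```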

regexp(query, offset=None, length=None)[source]

Performs regex on the memory contents (non-region-wise)

If offset is None, looks for match from the beginning of memory

Parameters:
  • query (bytes) – Regular expression to find

  • offset (int (optional)) – Offset in buffer where searching starts

  • length (int (optional)) – Length of searched area

Returns:

Generates offsets where regex was matched

Return type:

Iterator[int]

regexv(query, addr=None, length=None)[source]

Performs regex on the memory contents (region-wise)

If addr is None, looks for match from the beginning of memory

Parameters:
  • query (bytes) – Regular expression to find

  • addr (int (optional)) – Virtual address of region where searching starts

  • length (int (optional)) – Length of searched area

Returns:

Generates virtual addresses where regex was matched

Return type:

Iterator[int]

Warning

Method doesn’t match bytes overlapping the border between regions
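
The border limitation is easy to reproduce with the standard re module. The sketch below assumes, for illustration only, that each region is searched as a separate chunk; regex_per_region is a hypothetical helper and the (address, bytes) pairs are made-up data.

```python
import re
from typing import Iterator, List, Tuple


def regex_per_region(regions: List[Tuple[int, bytes]], query: bytes) -> Iterator[int]:
    """Search every (base address, contents) chunk on its own."""
    pattern = re.compile(query, re.DOTALL)
    for base, chunk in regions:
        for match in pattern.finditer(chunk):
            yield base + match.start()


chunks = [
    (0x1000, b"xxhttp://a"),  # whole match inside one region
    (0x2000, b"yyht"),        # match would start here ...
    (0x3000, b"tp://b"),      # ... and end in the next region
]
hits = list(regex_per_region(chunks, rb"http://"))  # only the hit at 0x1002 is found
```

A pattern split across two regions is never seen as a whole by any single search, so it goes unreported.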

uint16p(offset, fixed=False)[source]

Read unsigned 16-bit value at offset.

uint16v(addr, fixed=False)[source]

Read unsigned 16-bit value at address.

uint32p(offset, fixed=False)[source]

Read unsigned 32-bit value at offset.

uint32v(addr, fixed=False)[source]

Read unsigned 32-bit value at address.

uint64p(offset, fixed=False)[source]

Read unsigned 64-bit value at offset.

uint64v(addr, fixed=False)[source]

Read unsigned 64-bit value at address.

uint8p(offset, fixed=False)[source]

Read unsigned 8-bit value at offset.

uint8v(addr, fixed=False)[source]

Read unsigned 8-bit value at address.
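
The difference between the intN and uintN readers is only in how the same bytes are interpreted. A minimal sketch with the standard struct module, assuming little-endian byte order (an assumption of this example, as is the hypothetical helper name read_int):

```python
import struct

_FORMATS = {8: "b", 16: "h", 32: "i", 64: "q"}  # signed struct codes by bit width


def read_int(data: bytes, offset: int, bits: int, signed: bool) -> int:
    """Decode a little-endian integer of the given width at a buffer offset."""
    code = _FORMATS[bits]
    if not signed:
        code = code.upper()  # uppercase struct codes are the unsigned variants
    return struct.unpack_from("<" + code, data, offset)[0]


raw = bytes.fromhex("ff ff ff ff 78 56 34 12")
as_signed = read_int(raw, 0, 16, signed=True)      # -1
as_unsigned = read_int(raw, 0, 16, signed=False)   # 65535
```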

utf16z(addr)[source]

Read a null-terminated UTF-16 ASCII string at address.

Parameters:

addr – Virtual address of string

Return type:

bytes

v2p(addr, length=None)[source]

Virtual address to buffer (physical) offset translation

Changed in version 3.0.0: Added optional mapping length check

Parameters:
  • addr – Virtual address

  • length – Expected minimal length of mapping (optional)

Returns:

Buffer offset or None if virtual address is not mapped
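
The two translation directions, v2p() and p2v(), can be sketched over a simple region list. This is an illustration under stated assumptions rather than the library's logic: MappedRegion is a hypothetical stand-in for Region, and the length check here is simplified to a single region, whereas the real check may span adjacent regions.

```python
from typing import List, NamedTuple, Optional


class MappedRegion(NamedTuple):
    addr: int    # virtual address of the first byte
    size: int    # number of mapped bytes
    offset: int  # position of the first byte in the underlying buffer


def v2p(regions: List[MappedRegion], addr: int, length: Optional[int] = None) -> Optional[int]:
    """Virtual address to buffer offset, or None if not mapped."""
    for region in regions:
        if region.addr <= addr < region.addr + region.size:
            if length is not None and addr + length > region.addr + region.size:
                return None  # requested range runs past the mapping
            return region.offset + (addr - region.addr)
    return None


def p2v(regions: List[MappedRegion], off: int, length: Optional[int] = None) -> Optional[int]:
    """Buffer offset to virtual address, or None if not mapped."""
    for region in regions:
        if region.offset <= off < region.offset + region.size:
            if length is not None and off + length > region.offset + region.size:
                return None
            return region.addr + (off - region.offset)
    return None


layout = [MappedRegion(0x400000, 0x1000, 0), MappedRegion(0x402000, 0x1000, 0x1000)]
offset = v2p(layout, 0x402010)   # 0x1010
address = p2v(layout, 0x1010)    # 0x402010
```

Note that the two regions are adjacent in the buffer but not in virtual memory, which is why offset-based (*p) and address-based (*v) methods can see different neighbours.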

yarap(ruleset, offset=None, length=None, extended=False)[source]

Perform yara matching (non-region-wise)

If offset is None, looks for match from the beginning of memory

Changed in version 4.0.0: Added extended option which allows to get extended information about matched strings and rules. Default is False for backwards compatibility.

Parameters:
  • ruleset (malduck.yara.Yara) – Yara object with loaded yara rules

  • offset (int (optional)) – Offset in buffer where searching starts

  • length (int (optional)) – Length of searched area

  • extended (bool (optional, default False)) – Returns extended information about matched strings and rules

Return type:

malduck.yara.YaraMatches

yarav(ruleset, addr=None, length=None, extended=False)[source]

Perform yara matching (region-wise)

If addr is None, looks for match from the beginning of memory

Changed in version 4.0.0: Added extended option which allows to get extended information about matched strings and rules. Default is False for backwards compatibility.

Parameters:
  • ruleset (malduck.yara.Yara) – Yara object with loaded yara rules

  • addr (int (optional)) – Virtual address of region where searching starts

  • length (int (optional)) – Length of searched area

  • extended (bool (optional, default False)) – Returns extended information about matched strings and rules

Return type:

malduck.yara.YaraRulesetOffsets or malduck.yara.YaraRulesetMatches if extended is set to True

class malduck.procmem.procmem.Region(addr: int, size: int, state: int, type_: int, protect: int, offset: int)[source]

Represents single mapped region in ProcessMemory

contains_addr(addr: int) bool[source]

Checks whether region contains provided virtual address

contains_offset(offset: int) bool[source]

Checks whether region contains provided physical offset

property end: int

Virtual address of region end (first unmapped byte)

property end_offset: int

Offset of region end (first unmapped byte)

intersects_range(addr: int, length: int) bool[source]

Checks whether region mapping intersects with provided range

property last: int

Virtual address of last region byte

property last_offset: int

Offset of last region byte

p2v(off: int) int[source]

Physical offset to virtual address translation. Assumes that offset is valid within Region.

Parameters:

off – Physical offset

Returns:

Virtual address

to_json() Dict[str, int | str | None][source]

Returns JSON-like dict representation

trim_range(addr: int, length: int | None = None) Region | None[source]

Returns region intersection with provided range

Parameters:
  • addr – Virtual address of starting point

  • length – Length of range (optional)

Return type:

Region

v2p(addr: int) int[source]

Virtual address to physical offset translation. Assumes that address is valid within Region.

Parameters:

addr – Virtual address

Returns:

Physical offset
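
The relationship between end and last, and the half-open range checks, can be summarised in a few lines. The class below is a hypothetical stand-in written for this example (SimpleRegion is not part of malduck) and it models only the address and size of a region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleRegion:
    addr: int  # virtual address of the first byte
    size: int  # number of mapped bytes

    @property
    def end(self) -> int:
        return self.addr + self.size      # first unmapped byte

    @property
    def last(self) -> int:
        return self.addr + self.size - 1  # last mapped byte

    def contains_addr(self, addr: int) -> bool:
        return self.addr <= addr < self.end

    def intersects_range(self, addr: int, length: int) -> bool:
        # Two half-open ranges overlap when each starts before the other ends
        return self.addr < addr + length and addr < self.end


region = SimpleRegion(addr=0x1000, size=0x100)
inside = region.contains_addr(region.last)   # True: last byte is mapped
outside = region.contains_addr(region.end)   # False: end is already unmapped
```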

ProcessMemoryPE (procmempe)

malduck.procmempe

alias of ProcessMemoryPE

class malduck.procmem.procmempe.ProcessMemoryPE(buf: bytes | bytearray | mmap | MemoryBuffer, base: int = 0, regions: List[Region] | None = None, image: bool = False, detect_image: bool = False)[source]

Representation of memory-mapped PE file

Short name: procmempe

Parameters:
  • buf (bytes, mmap, memoryview, bytearray or MemoryBuffer() object) – A memory object containing the PE to be loaded

  • base (int, optional (default: 0)) – Virtual address of the region of interest (or beginning of buf when no regions provided)

  • image (bool, optional (default: False)) – The memory object is a dump of memory-mapped PE

  • detect_image (bool, optional (default: False)) – Try to automatically detect if the input buffer is memory-mapped PE using some heuristics

The file memory_dump contains a 64-bit memory-aligned PE dumped from address 0x140000000. To load it into procmempe and access the pe field, all we have to do is initialize a new object with the file data:

from malduck import procmempe

with open("memory_dump", "rb") as f:
    data = f.read()

pe_dump = procmempe(buf=data, base=0x140000000, image=True)
print(pe_dump.pe.is64bit)

PE files can also be read directly using the inherited ProcessMemory.from_file() with the image argument set (see the from_memory() method).

pe_dump = procmempe.from_file("140000000_1d5bdc3dbe71a7bd", image=True)
print(pe_dump.pe.sections)
property imgend: int

Address where PE image ends

is_image_loaded_as_memdump() bool[source]

Checks whether memory region contains image incorrectly loaded as memory-mapped PE dump (image=False).

embed_pe = procmempe.from_memory(mem)
if not embed_pe.is_image_loaded_as_memdump():
    # Memory contains plain PE file - need to load it first
    embed_pe = procmempe.from_memory(mem, image=True)
is_valid() bool[source]

Checks whether imgbase is pointing at valid binary header

property pe: PE

Related PE object

store() bytes[source]

Store ProcessMemoryPE contents as PE file data.

Return type:

bytes

ProcessMemoryELF (procmemelf)

malduck.procmemelf

alias of ProcessMemoryELF

class malduck.procmem.procmemelf.ProcessMemoryELF(buf: bytes | bytearray | mmap | MemoryBuffer, base: int = 0, regions: List[Region] | None = None, image: bool = False, detect_image: bool = False)[source]

Representation of memory-mapped ELF file

Short name: procmemelf

ELF files can be read directly using the inherited ProcessMemory.from_file() with the image argument set (see the from_memory() method).

property elf: ELFFile

Related ELFFile object

property imgend: int

Address where ELF image ends

is_image_loaded_as_memdump()[source]

Uses some heuristics to deduce whether contents can be loaded with image=True. Used by detect_image

is_valid() bool[source]

Checks whether imgbase is pointing at valid binary header

CuckooProcessMemory (cuckoomem)

malduck.cuckoomem

alias of CuckooProcessMemory

class malduck.procmem.cuckoomem.CuckooProcessMemory(buf: bytes | bytearray | mmap | MemoryBuffer, base: int | None = None, **_)[source]

Wrapper object to operate on process memory dumps in Cuckoo 2.x format.

IDAProcessMemory (idamem)

malduck.idamem

alias of IDAProcessMemory

class malduck.procmem.idamem.IDAProcessMemory[source]

ProcessMemory representation operating in IDAPython context

Short name: idamem

Initialize by creating the object within IDAPython context and then use like a normal procmem object:

from malduck import idamem, xor

ida = idamem()
decrypted_data = xor(b"KEYZ", ida.readv(0x0040D320, 128))
some_wide_string = ida.utf16z(0x402010).decode("utf-8")