The EVTX file format: a complete byte-level reference

A field-by-field reference for the Windows .evtx format — file header, ELFCHNK chunk header, event record, the full BinXML token and value-type tables, and a worked decode from raw bytes to rendered XML.

By Florian Amette

Most writing about the EVTX format — including our own working tour of it — explains the format in prose. This page is the other thing: a reference you can keep open in a second tab while you read a hex dump. Every structure below is given with offsets, sizes, and the values you should actually see on disk, followed by a worked decode of a single record from raw bytes to the XML an analyst reads.

The on-disk format is not covered by [MS-EVEN6], which documents only the remoting protocol. The definitive public sources are Andreas Schuster's 2007 reverse-engineering paper and Joachim Metz's libevtx specification; the tables here follow libevtx. Everything is little-endian unless noted.

The shape of the file

┌──────────────────────────────┐  offset 0
│  File header (4096 bytes)     │
├──────────────────────────────┤  offset 4096
│  Chunk 0 (65536 bytes)        │
│   ├─ Chunk header (512 B)     │
│   ├─ String table (offsets)   │
│   ├─ Template table (offsets) │
│   └─ Event records →          │
├──────────────────────────────┤  offset 4096 + 65536
│  Chunk 1 (65536 bytes)        │
├──────────────────────────────┤
│  …                            │
└──────────────────────────────┘

A file is a 4 KB header followed by N fixed 64 KB chunks. Chunks are self-contained: each carries the string and template tables its own records reference. That independence is the single most important property of the format — it is why a chunk carved from unallocated space is parseable on its own, and why a record carved without its chunk is not.
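Because both sizes are fixed, locating a chunk is pure arithmetic. A minimal sketch, assuming the 4096-byte header and 65536-byte chunks described above (the function names are illustrative, not taken from any library):

```python
FILE_HEADER_SIZE = 4096   # size of the file header block
CHUNK_SIZE = 65536        # every chunk is exactly this long


def chunk_offset(index):
    """Byte offset of chunk `index` (0-based) from the start of the file."""
    return FILE_HEADER_SIZE + index * CHUNK_SIZE


def whole_chunks(file_size):
    """How many complete chunks fit in a file of `file_size` bytes."""
    return max(0, file_size - FILE_HEADER_SIZE) // CHUNK_SIZE
```

Comparing `whole_chunks(os.path.getsize(path))` against the header's chunk count is a cheap first sanity check on a file of unknown provenance.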

File header (ElfFile)

4096 bytes. Magic 45 6C 66 46 69 6C 65 00 ("ElfFile\0"). Only the first 128 bytes are used; the rest is zero padding.

Offset  Size  Field                   Notes
0       8     Signature               ElfFile\0
8       8     First chunk number      chunk index, not a byte offset
16      8     Last chunk number
24      8     Next record identifier  next RecordID to be written; a truncation tell
32      4     Header size             always 0x80 (128)
36      2     Minor version           1
38      2     Major version           3 → format 3.1
40      2     Header block size       always 0x1000 (4096)
42      2     Number of chunks
120     4     File flags              0x1 = dirty, 0x2 = full
124     4     Checksum                CRC-32 of bytes 0–119

Two flags carry forensic weight. Dirty means the file was not cleanly closed — set on the active channel of a live-collected host or a crashed machine. Full means the file has wrapped at least once. Note also that strict parsers enforce the header CRC and will reject files that looser parsers happily recover records from; know which behaviour yours has.
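The header table maps directly onto a single `struct` format string. The sketch below is one way to read it, assuming the checksum is the standard CRC-32 that `zlib.crc32` computes; `parse_file_header` and its dictionary keys are illustrative names. It reports a CRC mismatch instead of raising, which is the "looser parser" behaviour.

```python
import struct
import zlib

# Fields up to offset 44, 76 skipped bytes, then flags (120) and checksum (124).
_FILE_HEADER = struct.Struct("<8sQQQIHHHH76xII")


def parse_file_header(buf):
    """Decode the first 128 bytes of an .evtx file into a dict."""
    if len(buf) < _FILE_HEADER.size:
        raise ValueError("need at least 128 bytes of header")
    (signature, first_chunk, last_chunk, next_record_id, header_size,
     minor, major, block_size, chunk_count, flags, checksum) = \
        _FILE_HEADER.unpack_from(buf, 0)
    if signature != b"ElfFile\x00":
        raise ValueError("not an EVTX file header")
    return {
        "first_chunk": first_chunk,
        "last_chunk": last_chunk,
        "next_record_id": next_record_id,
        "header_size": header_size,
        "version": (major, minor),
        "block_size": block_size,
        "chunk_count": chunk_count,
        "dirty": bool(flags & 0x1),
        "full": bool(flags & 0x2),
        # The CRC covers bytes 0-119 only, so the flags are not protected by it.
        "crc_ok": zlib.crc32(buf[:120]) == checksum,
    }
```

A strict parser would raise when `crc_ok` is false; a recovery tool would log it and carry on to the chunks.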

Chunk header (ElfChnk)

Each chunk is exactly 65536 bytes. The header is 512 bytes; magic 45 6C 66 43 68 6E 6B 00 ("ElfChnk\0").

Offset  Size  Field                          Notes
0       8     Signature                      ElfChnk\0
8       8     First event record number
16      8     Last event record number
24      8     First event record identifier
32      8     Last event record identifier
40      4     Header size                    0x80 (128)
44      4     Last record data offset        relative to chunk start
48      4     Free space offset              start of unused tail
52      4     Event records CRC-32           over the records area
120     4     (unknown / flags)
124     4     Header CRC-32                  over bytes 0–119 and 128–511
128     256   String-offset table            64 × 4-byte offsets
384     128   Template-pointer table         32 × 4-byte offsets

The two tables are why you cannot render a record in isolation. Element names, attribute names, and provider strings are interned once per chunk and referenced by offset; XML skeletons are stored once per chunk as templates. A record is a template ID plus a list of values. Resolve the tables or every Provider reads Unknown.

The records-area CRC-32 is the tamper tripwire: edit a record without recomputing it and the mismatch is detectable. (Most attackers don't bother — see tampered logs and what survives.)
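Both chunk checksums can be verified in a few lines. This is a sketch under two assumptions: the CRC is the standard one computed by `zlib.crc32`, and "the records area" runs from offset 512 up to the free-space offset. `check_chunk` is an illustrative name.

```python
import struct
import zlib


def check_chunk(chunk):
    """Verify the header and records CRC-32 of one 64 KB chunk."""
    if chunk[:8] != b"ElfChnk\x00":
        raise ValueError("not an ElfChnk chunk")
    first_id, last_id = struct.unpack_from("<QQ", chunk, 24)
    free_space_offset, records_crc = struct.unpack_from("<II", chunk, 48)
    (header_crc,) = struct.unpack_from("<I", chunk, 124)
    # Header CRC: bytes 0-119 plus 128-511, skipping the flags and the CRC itself.
    header_ok = zlib.crc32(chunk[:120] + chunk[128:512]) == header_crc
    # Records CRC: assumed to cover offset 512 up to the free-space offset.
    records_ok = zlib.crc32(chunk[512:free_space_offset]) == records_crc
    return {
        "first_record_id": first_id,
        "last_record_id": last_id,
        "free_space_offset": free_space_offset,
        "header_crc_ok": header_ok,
        "records_crc_ok": records_ok,
    }
```

A chunk where the header CRC passes and the records CRC fails is the interesting case: the header is intact, so something changed in the record data after it was written.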

Event record

Records run back-to-back from offset 512 until the free-space offset.

Offset  Size    Field                    Notes
0       4       Signature                2A 2A 00 00 (**\0\0)
4       4       Size                     total length, including the trailing copy
8       8       Event record identifier  monotonic RecordID
16      8       Written time             Windows FILETIME, 100 ns ticks since 1601-01-01 UTC
24      varies  BinXML                   the event payload
end     4       Size (copy)              repeat of the size field, so readers can walk backwards

That 2A 2A 00 00 signature is what makes record-level carving feasible. The FILETIME here is the same encoding you'll meet in the registry, $MFT, and Prefetch — worth learning to read in your head.
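Walking the records area and converting the timestamp takes little code. The sketch below follows the record layout above; `iter_records` and `filetime_to_datetime` are illustrative names, and the walker simply stops at the first record that fails a sanity check instead of trying to resynchronise.

```python
import struct
from datetime import datetime, timedelta, timezone

_EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
_RECORD_MAGIC = b"\x2a\x2a\x00\x00"


def filetime_to_datetime(filetime):
    """Convert 100 ns ticks since 1601-01-01 UTC to an aware datetime."""
    return _EPOCH_1601 + timedelta(microseconds=filetime // 10)


def iter_records(chunk, free_space_offset):
    """Yield (record_id, written_time, binxml_bytes) for each record in a chunk."""
    pos = 512  # records start right after the 512-byte chunk header
    while pos + 28 <= free_space_offset:  # 24-byte record header + 4-byte size copy
        magic, size, record_id, written = struct.unpack_from("<4sIQQ", chunk, pos)
        if magic != _RECORD_MAGIC or size < 28 or pos + size > free_space_offset:
            break
        (size_copy,) = struct.unpack_from("<I", chunk, pos + size - 4)
        if size_copy != size:
            break  # leading and trailing sizes disagree: truncated or damaged record
        yield record_id, filetime_to_datetime(written), chunk[pos + 24:pos + size - 4]
        pos += size
```

The same signature and size checks, applied at every 8-byte boundary of a raw image, are a reasonable starting point for record carving.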

BinXML token table

The payload is BinXML: a token stream of XML opcodes. A token's 0x40 bit is a "more data follows" flag (for example, an element that has attributes), which is why the same logical token appears at two values.

Token                 Value(s)     Meaning
EndOfStream           0x00         end of the token stream
OpenStartElement      0x01 / 0x41  <name … (0x41 = has attributes)
CloseStartElement     0x02         >
CloseEmptyElement     0x03         />
EndElement            0x04         </name>
Value                 0x05 / 0x45  literal value, followed by a value type + bytes
Attribute             0x06 / 0x46  name="…"
CDATASection          0x07 / 0x47  <![CDATA[…]]>
CharRef               0x08 / 0x48  &#x…;
EntityRef             0x09 / 0x49  &name;
PITarget              0x0a         processing-instruction target
PIData                0x0b         processing-instruction data
TemplateInstance      0x0c         use a template + substitution array
NormalSubstitution    0x0d         insert value [id]
OptionalSubstitution  0x0e         insert value [id], or omit the element if null
FragmentHeader        0x0f         stream prologue, always 0F 01 01 00

The distinction between normal (0x0d) and optional (0x0e) substitution is a classic correctness bug: optional means omit the parent element when the value is null. Treat them the same and you emit empty <Data/> elements that the real log doesn't contain.
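Splitting a token byte into its base opcode and the 0x40 flag is a one-line mask. A minimal sketch of the token table as a lookup; `classify_token` is an illustrative name, and rejecting the flag on tokens the table lists at only one value is a choice made here, not something the format sources mandate:

```python
MORE_DATA_FLAG = 0x40

TOKEN_NAMES = {
    0x00: "EndOfStream",
    0x01: "OpenStartElement",
    0x02: "CloseStartElement",
    0x03: "CloseEmptyElement",
    0x04: "EndElement",
    0x05: "Value",
    0x06: "Attribute",
    0x07: "CDATASection",
    0x08: "CharRef",
    0x09: "EntityRef",
    0x0A: "PITarget",
    0x0B: "PIData",
    0x0C: "TemplateInstance",
    0x0D: "NormalSubstitution",
    0x0E: "OptionalSubstitution",
    0x0F: "FragmentHeader",
}

# Tokens the table lists at two values (with and without 0x40).
FLAGGABLE = {0x01, 0x05, 0x06, 0x07, 0x08, 0x09}


def classify_token(byte):
    """Return (token_name, more_data_flag) for one BinXML token byte."""
    flagged = bool(byte & MORE_DATA_FLAG)
    base = byte & 0xBF  # clear the 0x40 bit, keep everything else
    if base not in TOKEN_NAMES or (flagged and base not in FLAGGABLE):
        raise ValueError(f"unexpected BinXML token 0x{byte:02x}")
    return TOKEN_NAMES[base], flagged
```

Raising on an unknown byte is deliberate: in a token stream, a silent skip almost always means the parser lost alignment several bytes earlier.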

BinXML value types

Every literal value and every substitution slot is typed. The high bit 0x80 marks an array of the base type.

Value        Type                 Encoding
0x00         NullType             empty
0x01         StringType           UTF-16LE, no BOM, no terminator
0x02         AnsiStringType       code-page string
0x03–0x0a    Int8…UInt64          8/16/32/64-bit signed & unsigned
0x0b / 0x0c  Real32 / Real64      float / double
0x0d         BoolType             32-bit 0/1
0x0e         BinaryType           raw bytes
0x0f         GuidType             16-byte GUID
0x10         SizeTType            32- or 64-bit
0x11         FileTimeType         64-bit FILETIME
0x12         SysTimeType          128-bit SYSTEMTIME
0x13         SidType              NT security identifier
0x14 / 0x15  HexInt32 / HexInt64  rendered as 0x…
0x21         BinXmlType           a nested BinXML fragment; parsers must recurse
0x81–0x95    array variants       e.g. 0x81 = array of UTF-16LE strings

Type 0x21 is where naive parsers fall over: a substitution value can itself be a BinXML stream (this is how UserData/EventData nest), so the decoder has to call itself.
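A decoder for the simpler value types shows how the 0x80 array bit composes with the base type. This is a partial sketch: it covers strings, the fixed-width numerics, Bool, Binary, GUID and the hex types, and raises on everything else. It assumes string arrays are stored as NUL-separated UTF-16LE items, and the hex and GUID renderings are simplified compared with what Windows prints. `decode_value` is an illustrative name.

```python
import struct
import uuid

ARRAY_FLAG = 0x80

# struct format codes for the fixed-width numeric types 0x03-0x0c.
_NUMERIC = {
    0x03: "<b", 0x04: "<B", 0x05: "<h", 0x06: "<H",
    0x07: "<i", 0x08: "<I", 0x09: "<q", 0x0A: "<Q",
    0x0B: "<f", 0x0C: "<d",
}


def decode_value(vtype, data):
    """Decode the raw bytes of one typed BinXML value into a Python object."""
    if vtype & ARRAY_FLAG:
        base = vtype & 0x7F
        if base == 0x01:
            # Assumption: array items are separated by UTF-16 NUL characters.
            items = data.decode("utf-16-le").split("\x00")
            if items and items[-1] == "":
                items.pop()
            return items
        if base in _NUMERIC:
            return [v for (v,) in struct.iter_unpack(_NUMERIC[base], data)]
        raise NotImplementedError(f"array of type 0x{base:02x}")
    if vtype == 0x00:
        return None
    if vtype == 0x01:
        return data.decode("utf-16-le")
    if vtype in _NUMERIC:
        return struct.unpack(_NUMERIC[vtype], data)[0]
    if vtype == 0x0D:
        return struct.unpack("<I", data)[0] != 0
    if vtype == 0x0E:
        return bytes(data)
    if vtype == 0x0F:
        return str(uuid.UUID(bytes_le=bytes(data)))
    if vtype == 0x14:
        return hex(struct.unpack("<I", data)[0])
    if vtype == 0x15:
        return hex(struct.unpack("<Q", data)[0])
    raise NotImplementedError(f"value type 0x{vtype:02x}")
```

Type 0x21 is left out on purpose: handling it means handing `data` back to the BinXML token parser, which is the recursion described above.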

Templates and the substitution array

The event-log writer almost never emits raw element tokens for an event. It emits a template instance (0x0c): a reference to a template definition stored once per chunk, plus a typed substitution array of the per-record values.

Template definition (referenced by ID/offset within the chunk):

Offset  Size    Field
0       1       version
1       4       template id
5       4       next-template offset (0 if none)
9       16      template GUID
25      4       data size
29      varies  BinXML skeleton (fragment header + elements + 0x00)

Substitution array (carried by the instance):

Offset  Size    Field
0       4       value count n
4       4·n     descriptors: size (2) + type (1) + reserved (1)
4+4·n   varies  the values, back-to-back, in descriptor order
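Reading the substitution array is a two-pass job: descriptors first, then the value bytes they describe. A minimal sketch following the layout above; `parse_substitutions` is an illustrative name, and the values are returned as raw bytes with their type so decoding can happen separately.

```python
import struct


def parse_substitutions(buf, offset=0):
    """Return ([(value_type, raw_bytes), ...], end_offset) for one substitution array."""
    (count,) = struct.unpack_from("<I", buf, offset)
    descriptors_at = offset + 4
    data_at = descriptors_at + 4 * count  # values start after all n descriptors
    values = []
    for i in range(count):
        size, vtype, _reserved = struct.unpack_from("<HBB", buf, descriptors_at + 4 * i)
        values.append((vtype, buf[data_at:data_at + size]))
        data_at += size
    return values, data_at
```

The returned `end_offset` matters in practice: it is where the token stream resumes after the template instance.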

To render one record:

  1. Read its token stream; on 0x0c, resolve the template by ID in the chunk's template table.
  2. Walk the template skeleton. Each 0x0d/0x0e carries a substitution id (index into the array) and a type.
  3. For each placeholder, read array entry [id], type-check it against the declared type, and inline it — recursing if the type is 0x21 (BinXML) or iterating if the high 0x80 bit marks an array.
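The three steps can be sketched over a skeleton that has already been tokenised. Everything about the data model here is hypothetical: nodes are plain tuples (`("elem", name, children)`, `("text", s)`, `("sub", id, declared_type, optional)`), substitution values are already decoded, a BinXML value is represented as a `(node, substitutions)` pair, and array expansion is left out.

```python
from xml.sax.saxutils import escape

NULL_TYPE = 0x00
BINXML_TYPE = 0x21


def render(node, subs):
    """Render a tokenised skeleton to XML text. Returns None when an optional
    substitution is null, which tells the parent element to omit itself."""
    kind = node[0]
    if kind == "text":
        return escape(node[1])
    if kind == "sub":
        _, index, declared_type, optional = node
        vtype, value = subs[index]
        if vtype == NULL_TYPE:
            return None if optional else ""
        if vtype != declared_type:
            raise ValueError(f"slot {index}: got type 0x{vtype:02x}, "
                             f"template declares 0x{declared_type:02x}")
        if vtype == BINXML_TYPE:
            nested_node, nested_subs = value  # nested fragment: recurse
            return render(nested_node, nested_subs) or ""
        return escape(str(value))
    _, name, children = node
    parts = []
    for child in children:
        out = render(child, subs)
        if out is None:
            return ""  # null optional substitution: drop this whole element
        parts.append(out)
    body = "".join(parts)
    return f"<{name}>{body}</{name}>" if body else f"<{name}/>"
```

With a null value in both slots, an element holding an optional substitution disappears from the output, while one holding a normal substitution is emitted empty. That is the 0x0d/0x0e difference in executable form.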

Worked decode (schematic)

A <Computer> element (a child of <System>) with a substituted computer name decodes roughly like this. Bytes are illustrative; the grammar is exact:

0F 01 01 00              FragmentHeader (major 1, minor 1, flags 0)
0C ..(template ref)..    TemplateInstance → resolve template T in chunk
                         template T skeleton, walked token by token:
  01 .. <Computer>       OpenStartElement "Computer" (0x01: no attributes)
    02                   CloseStartElement  >
    0E 00 00 01          OptionalSubstitution id=0, type=0x01 (String)
    04                   EndElement  </Computer>
  00                     EndOfStream

substitution array:
  count = 1
  [0] size=0x10 type=0x01  → "WIN-DC01" (UTF-16LE, 8 characters × 2 bytes)

Result: <Computer>WIN-DC01</Computer>. Multiply that by ~20 placeholders and you have a 4624 event; the template (<System>, <EventData>, every element name) is stored once for the whole chunk and every 4624 record in it is just a fresh substitution array. That deduplication is what keeps EVTX small and what makes a half-finished parser wrong. The token-by-token mechanics — names, hashes, nested fragments — are their own subject: see how BinXML actually works.

Validation and recovery checklist

  • Header CRC-32 (file +124, chunk +124) and records CRC-32 (chunk +52) — mismatches mean corruption or tampering, not necessarily an unreadable file.
  • NextRecordIdentifier vs the last record's ID — a gap means records are missing (truncation, or selective deletion).
  • A dirty trailing chunk may end mid-token; robust parsers recover what precedes the break and report the chunk, rather than failing the whole file.
  • Each complete chunk stands alone: wrap a carved chunk in a synthetic 4 KB header and it parses.
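Two of these checks are easy to script. The sketch below finds RecordID gaps and builds a synthetic file header around a carved chunk. The header values (version 3.1, a single chunk, clean flags, next record ID taken from the chunk's last record identifier) are assumptions about what a parser will accept, not a documented recipe, and both function names are illustrative.

```python
import struct
import zlib


def missing_ranges(record_ids, next_record_id):
    """Return inclusive (first, last) ranges of RecordIDs that are absent."""
    gaps = []
    expected = None
    for rid in sorted(record_ids):
        if expected is not None and rid > expected:
            gaps.append((expected, rid - 1))
        expected = rid + 1
    # Anything between the last record seen and NextRecordIdentifier is also missing.
    if expected is not None and next_record_id > expected:
        gaps.append((expected, next_record_id - 1))
    return gaps


def wrap_carved_chunk(chunk):
    """Prefix one carved 64 KB chunk with a synthetic 4 KB file header."""
    if len(chunk) != 65536 or chunk[:8] != b"ElfChnk\x00":
        raise ValueError("expected one complete ElfChnk chunk")
    (last_record_id,) = struct.unpack_from("<Q", chunk, 32)
    first120 = struct.pack(
        "<8sQQQIHHHH76x",
        b"ElfFile\x00",
        0, 0,                # first and last chunk number
        last_record_id + 1,  # next record identifier
        128, 1, 3,           # header size, minor version, major version
        4096, 1,             # header block size, number of chunks
    )
    header = first120 + struct.pack("<II", 0, zlib.crc32(first120))  # flags, CRC
    return header.ljust(4096, b"\x00") + chunk
```

For example, `missing_ranges([1, 2, 5, 6], 9)` reports `[(3, 4), (7, 8)]`: one gap inside the surviving records and one between the last record and the header's next identifier.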
