LZSS Compression
LZSS Compression
The KN5000 firmware uses LZSS compression (SLIDE4K format) for storing preset and parameter data. This page documents all compressed regions and the decompression routines.
Compressed Data Regions
Compressed Preset/Parameter Data (SLIDE4K Format)
| Property | Value |
|---|---|
| ROM Component | table_data |
| Label | Compressed_Preset_Data_LZSS |
| Start Address | 0x08E0000 (ROM offset) / 0x3E0000 (CPU address) |
| End Address | 0x08E6D40 |
| Header Size | 14 bytes |
| Compressed Data Size | 27,953 bytes |
| Total Size | 27,967 bytes |
| Decompressed Size | ~32,910 bytes |
| Binary File | table_data/includes/preset_data_compressed.bin |
Header Structure:
Offset Size Content
------ ---- -------
0x00 7 "SLIDE4K" (format signature)
0x07 1 0x00 (null terminator)
0x08 1 0x00
0x09 1 0x95
0x0A 1 0x00
0x0B 2 "}Z" (0x7D, 0x5A)
0x0D 1 0xEE
Important Clarification:
This compressed region does NOT contain the Sub CPU ROM. Decompression yields ~33KB of parameter-like data, not the 192KB executable code found in
kn5000_subprogram_v142.rom.
Decompressed Data Characteristics:
- Size: 32,910 bytes (0x808E) (vs 196,608 bytes for Sub CPU ROM)
- Content: Parameter/preset data structure, not executable code
- Most bytes are in MIDI range (0-127)
- Contains repeating structural patterns (e.g.,
0x80 XXflags,00 03record markers) - Does NOT match the Sub CPU ROM byte patterns
✅ RESOLVED: Address 0x3E0000 is Custom Data Flash (Firmware Update Staging)
Resolution: The
0x3E0000address mystery has been solved by analyzing the firmware update routines.
Firmware Update System and 0x3E0000
Discovery: 0x3E0000 is a Firmware Update Destination
The address 0x3E0000 is Custom Data Flash (not an alternate ROM mapping). During firmware updates, compressed Sub CPU payload data is written here:
File Type 007 Handler (HANDLE_UPDATE_FILE_TYPE_ID_007h at 0xEF47FA):
; "Technics KN5000 Program DATA FILE PCK"
HANDLE_UPDATE_FILE_TYPE_ID_007h:
LD WA, 1 ; Select Custom Data Flash
LD XBC, 003e0000h ; Write to 0x3E0000
CALL Flash_EraseSectorWithBankSelect ; Flash write routine
LD WA, 1
LD XBC, 003f0000h ; Write to 0x3F0000
CALL Flash_EraseSectorWithBankSelect ; Flash write routine
The flash write routine (Flash_EraseSectorWithBankSelect) uses CUSTOM_DATA_FLASH__BASE_ADDR (0x300000) as the base when WA=1.
Update File Types
| ID | File Type | Destination | Format |
|---|---|---|---|
| 007 | Program DATA FILE PCK | Custom Data Flash 0x3E0000 + 0x3F0000 | LZSS compressed |
| 008 | Table DATA FILE PCK | Table Data ROM | LZSS compressed |
| 001 | Program DATA FILE 1/2 | Table Data ROM 0x800000 | Uncompressed |
| 003 | Table DATA FILE 1/2 | Table Data ROM 0x800000 | Uncompressed |
Boot Sequence Behavior
After a firmware update:
- Compressed payload exists at Custom Data Flash 0x3E0000
SubCPU_Send_Payloaddecompresses from 0x3E0000 → success- Updated Sub CPU firmware is loaded
Factory state (no update):
- Custom Data Flash at 0x3E0000 contains user data or is empty
SubCPU_Send_Payloadtries to decompress → fails (returns 0xFFFF)- Falls back to
TABLE_DATA_ROM__BASE_ADDR(0x800000)
Memory Map Clarification
| Address | Memory Region | Purpose |
|---|---|---|
| 0x3E0000 | Custom Data Flash (offset 0xE0000) | Firmware update staging area |
| 0x8E0000 | Table Data ROM (offset 0xE0000) | Factory LZSS preset data |
| 0x830000-0x87FFFF | Table Data ROM (offset 0x30000-0x7FFFF) | Factory Sub CPU payload |
The addresses 0x3E0000 and 0x8E0000 are NOT the same physical data - they are different chips:
0x3E0000= Custom Data Flash IC19 (user-writable)0x8E0000= Table Data ROM IC7/IC8 (factory programmed)
Remaining Questions (Partially Disputed)
While the 0x3E0000 address is now understood, some aspects of the preset data transfer remain unclear:
Decompressed Data Structure:
| Section | Offset | Size | Runtime Destination |
|---|---|---|---|
| Main CPU Header | 0x0000-0x00AF | 176 bytes | Main CPU only (word at 0x100 → RAM 0x0404) |
| Sub CPU Audio Params | 0x00B0-0x808D | 32,734 bytes | Sub CPU address 0xF000+ (uncertain) |
Outstanding questions:
- The bulk transfers send 64KB blocks (much larger than the ~33KB of meaningful data)
- The fallback to
TABLE_DATA_ROM__BASE_ADDR(0x800000) produces mostly 0xF7 padding bytes - The exact purpose of the preset parameters at Sub CPU 0xF000+ is not fully understood
Main CPU Header (0x00-0xAF):
- Mostly zero bytes with sparse configuration values
- Key non-zero positions: 0x18-0x1B, 0x2E, 0x45, 0x57-0x5A, 0x61, 0x94-0x9E, 0xA4-0xA5
- Contains record marker
00 03 01 00at offset 0x94 - Word at offset 0x100 is copied to Main CPU RAM 0x0404 before bulk transfer
Sub CPU Audio Parameters (0x100+) - per Claude’s interpretation:
- Destination: Sub CPU address 0xF000 (overwrites ROM defaults) - DISPUTED
- Variable-length records, often starting with
00 03marker - 24 occurrences of
00 03record markers - Flag byte
0x80indicates “value set” (actual value in next byte) - Pattern
18 XXappears to indicate parameter type codes - Common sequence:
64 03 00 7F 20 00 70 80 - Most values are MIDI-range (0-127), suggesting voice/audio parameters
Sub CPU 0xF000 Area Usage (IF Claude’s interpretation is correct):
The Sub CPU ROM (kn5000_subprogram_v142) contains default values at 0xF000+. These are audio engine configuration tables:
| Address | Purpose |
|---|---|
| 0xF000-0xF01F | System configuration, counters |
| 0xF010-0xF100 | Voice parameters (ADSR envelopes) |
| 0xF100-0xF420 | Pitch tables, envelope lookups |
| 0xF420-0xF434 | Runtime buffer initialization |
| 0xF434-0xF460 | Serial buffer structures |
| 0xF48C+ | Voice polyphony/index tables |
If the interpretation is correct, the preset data would overwrite these defaults during boot, configuring factory presets for the audio engine.
Open Questions - Preset Data Destination (Added due to AI/Human disagreement):
The following questions remain open regarding where the LZSS preset data actually ends up:
- What does 0x3E0000 actually map to? The memory bank configuration during boot needs investigation.
- Are there other ~33KB transfers? Search for data transfers matching the preset data size.
- What happens to the “extra” bytes? The 64KB transfer vs ~33KB data discrepancy is suspicious.
- Why does fallback produce 0xF7 bytes? This suggests the fallback path may never be intended to work.
Open Questions - Sub CPU Payload Transfer:
The Sub CPU executable payload (~192KB) must be transferred to the Sub CPU via the inter-CPU communication latches during boot. The source location remains under investigation:
-
Memory Map Reconfiguration: During the two-stage boot process, the Table Data ROM is initially mapped at
0xE00000-0xFFFFFF, then remapped to0x800000-0x9FFFFF. The payload source address may differ between Stage 1 and Stage 2 boot contexts. - Potential Payload Locations:
0x830000-0x87FFFF(Table Data ROM, post-remap) - Referenced bySubCPU_Send_Payload0xE30000-0xE7FFFF(same physical ROM, Stage 1 boot context)- Other regions that become visible after memory controller configuration
- The
SubCPU_Send_Payloadroutine transfers data from multiple address ranges to different Sub CPU RAM regions. The exact mapping and whether the full 192KB executable is transferred vs. loaded from a separate Sub CPU ROM chip requires further analysis.
See Boot Sequence and Inter-CPU Protocol for related documentation.
SLIDE4K Format Specification
SLIDE4K is a variant of LZSS (Lempel-Ziv-Storer-Szymanski) compression with the following parameters:
| Parameter | Value |
|---|---|
| Sliding Window Size | 4,096 bytes (4KB) |
| Window Offset Bits | 12 bits (0x000 - 0xFFF) |
| Match Length Bits | 4 bits (encoded length + 2) |
| Minimum Match Length | 2 bytes |
| Maximum Match Length | 17 bytes (15 + 2) |
| Window Pre-fill | First 4,078 bytes (0xFEE) filled with 0x00 |
Encoding Format
The compressed data consists of flag bytes followed by literal bytes or back-references:
- Flag Byte: Each bit (LSB first) indicates the type of the next 8 elements:
- Bit = 1: Literal byte follows
- Bit = 0: Back-reference follows
-
Literal Byte: Single byte copied directly to output
- Back-Reference: Two bytes encoding position and length:
Byte 1: Low 8 bits of window offset Byte 2: [High 4 bits of offset][4-bit length] offset = (byte2 & 0xF0) << 4 | byte1 length = (byte2 & 0x0F) + 2
Decompression Algorithm
def decompress_slide4k(data):
window = bytearray(4096)
window[0:0xFEE] = bytes(0xFEE) # Pre-fill with zeros
window_pos = 0xFEE
output = bytearray()
i = 0
while i < len(data):
flags = data[i]
i += 1
for bit in range(8):
if i >= len(data):
break
if flags & (1 << bit):
# Literal byte
byte = data[i]
i += 1
output.append(byte)
window[window_pos] = byte
window_pos = (window_pos + 1) & 0xFFF
else:
# Back-reference
if i + 1 >= len(data):
break
low = data[i]
high = data[i + 1]
i += 2
offset = ((high & 0xF0) << 4) | low
length = (high & 0x0F) + 2
for _ in range(length):
byte = window[offset]
output.append(byte)
window[window_pos] = byte
window_pos = (window_pos + 1) & 0xFFF
offset = (offset + 1) & 0xFFF
return bytes(output)
Decompression Routines
All LZSS decompression routines are located in the table_data ROM. They are used during boot and firmware update operations.
LZSS_Decompress (Main Decompressor)
| Property | Value |
|---|---|
| Address | 0xFFCA50 (CPU) / 0x09FCA50 (ROM) |
| Label | LZSS_Decompress |
| Purpose | Main SLIDE4K decompression routine |
Description: This is the primary decompression routine that:
- Allocates a 4KB sliding window buffer via
malloc - Pre-fills the window with zeros (positions 0x0000 to 0x0FED)
- Reads the compressed size from the header
- Processes flag bytes and handles literal/back-reference encoding
- Displays progress during decompression
- Frees the window buffer when complete
Stack Frame Layout:
XSP+0x00: (local variable)
XSP+0x04: Flag byte (with sentinel in high byte)
XSP+0x06: Copy counter for back-reference
XSP+0x08: Match length
XSP+0x0A: Window write position
XSP+0x0C: Window base address (copy of +0x10)
XSP+0x10: Window buffer pointer (from malloc)
LZSS_ReadByte
| Property | Value |
|---|---|
| Address | 0xFFC8C2 (CPU) / 0x09FC8C2 (ROM) |
| Label | LZSS_ReadByte |
| Purpose | Read next byte from compressed input stream |
| Returns | HL = byte read, or 0xFFFF if EOF |
Description: Handles sector buffering for reading compressed data. Reads 0x2400 bytes per sector from the table_data ROM and manages sector advancement for large compressed data.
LZSS_OutputByte
| Property | Value |
|---|---|
| Address | 0xFFC935 (CPU) / 0x09FC935 (ROM) |
| Label | LZSS_OutputByte |
| Purpose | Write decompressed byte to output buffer |
| Input | A = byte to output |
Description: Buffers 4 bytes and writes them as a 32-bit word to the destination for efficient memory access. Uses buffer at RAM address 0x0C0A.
LZSS_OutputByte_Alt
| Property | Value |
|---|---|
| Address | 0xFFC974 (CPU) / 0x09FC974 (ROM) |
| Label | LZSS_OutputByte_Alt |
| Purpose | Alternate output handler for different buffer |
Description: Variant of LZSS_OutputByte that uses buffer at RAM address 0x0C0E instead of 0x0C0A. Used in flash update operations.
LZSS_ParseHeader
| Property | Value |
|---|---|
| Address | 0xFFC9B3 (CPU) / 0x09FC9B3 (ROM) |
| Label | LZSS_ParseHeader |
| Purpose | Parse and validate LZSS header for flash updates |
Description: Sets up source address (0x3E0000 = Table Data ROM), validates the header against expected signature at 0xA150, and pre-reads 4 sectors for initial buffer fill.
RAM Variables
The LZSS routines use the following RAM locations:
| Address | Size | Purpose |
|---|---|---|
0x0C20 |
4 | Expected decompressed output size |
0x0C24 |
4 | Current output position |
0x0C28 |
4 | Source ROM address pointer |
0x0C2C |
4 | Sector read buffer pointer |
0x0C30 |
2 | Display X coordinate (progress) |
0x0C32 |
2 | Display Y coordinate (progress) |
0x0C34 |
2 | Sector offset for progress display |
0x0C36 |
1 | Output byte counter (0-3 for 32-bit writes) |
0x0C0A |
4 | Temporary output buffer (LZSS_OutputByte) |
0x0C0E |
4 | Temporary output buffer (LZSS_OutputByte_Alt) |
SLIDE String Markers
The firmware contains “SLIDE” string markers used to detect and validate SLIDE4K format headers:
| Location | Address | Label |
|---|---|---|
| Main CPU | 0xEA2350 |
SLIDE_STRING |
| Main CPU | 0xEA2376 |
SLIDE_STRING_2 |
| Table Data | 0x9FA150 |
(within boot data) |
These strings are used by header validation routines to confirm that compressed data uses the SLIDE4K format before attempting decompression.
Usage Context
Boot Sequence
During the boot sequence, the Main CPU’s SubCPU_Send_Payload routine:
- Transfers uncompressed payload data from 0x830000-0x87FFFF to Sub CPU RAM
- Optionally attempts to decompress data from 0x3E0000 (the SLIDE4K region)
- If decompression fails (returns 0xFFFF), uses alternate uncompressed source
- The Sub CPU ROM itself is a separate chip - not transferred from table_data
See Boot Sequence for details.
Firmware Updates
The LZSS decompressor is also used during firmware updates to handle compressed update files. The flash update handlers (at 0x9FD800-0x9FEA9C) can process compressed ROM images using both SLIDE4K and SLIDE8K formats.
Compression Tools
scripts/decompress_lzss.py
Decompresses SLIDE4K data from the table_data ROM.
# Extract and decompress from table_data ROM
python scripts/decompress_lzss.py --extract-from-rom original_ROMs/kn5000_table_data.ic11 output.bin
# Decompress standalone compressed file
python scripts/decompress_lzss.py compressed.bin output.bin
scripts/compress_lzss.py
Compresses data using SLIDE4K algorithm. Supports byte-identical output via reference file.
# Standard compression
python scripts/compress_lzss.py input.bin output.bin
# Byte-identical compression (uses original compression decisions)
python scripts/compress_lzss.py input.bin output.bin --reference original_compressed.bin
Note: The --reference option decodes compression decisions from the original file and replays them, producing byte-identical output when the input data matches. This is necessary because the original Technics compressor uses specific offset/length choices that don’t follow a simple greedy or lazy matching algorithm.
Makefile Targets
# Full rebuild: assemble preset_data.asm → compress → include in ROM
make rebuild-preset-data
# Recompress only (without reassembly, uses existing uncompressed binary)
make recompress-lzss
# Clean intermediate files
make clean_preset_data
The build uses the reference file at original_ROMs/preset_data_compressed.original.bin to ensure byte-matching output.
Source Files
table_data/preset_data.asm
Assembly source file describing the preset data structure. Currently:
- Header section (0x00-0xAF): Fully defined with explicit
dbstatements - Parameter section (0xB0+): Binary include with documentation
table_data/preset_data.asm <- Assembly source
↓ (assemble)
table_data/includes/preset_data.bin <- Uncompressed binary
↓ (LZSS compress with --reference)
table_data/includes/preset_data_compressed.bin <- Compressed, included in ROM
As reverse engineering progresses, more of the parameter section can be converted from binary include to documented data definitions.
Related Files
| File | Description |
|---|---|
table_data/preset_data.asm |
Assembly source for preset data |
table_data/includes/preset_data_uncompressed.bin |
Original decompressed data (reference) |
table_data/includes/preset_data_compressed.bin |
Compressed output for ROM inclusion |
original_ROMs/preset_data_compressed.original.bin |
Reference for byte-matching compression |
Future Work
- Reverse engineer parameter record format in detail
- Determine the exact purpose of each parameter type (
18 XXcodes) - Convert more of preset_data.asm from binary include to documented data
- Investigate if any other data in the ROMs uses SLIDE4K compression