Sprites and OAM¶
OAM format, palette indices, chaining, metasprites, budget strategies, the shadow-OAM double-buffer, entity-pool architectures observed in commercial games, and a known-bugs section with confirmed fixes.
Note: this page uses ASCII only (avoids encoding issues on Windows/PowerShell).
1. OAM Hardware Format¶
1.1 Sprite VRAM (0x8800)¶
64 sprites, 4 bytes each. Byte access.
Sprite n:
0x8800 + n*4 + 0 : tile index (bits 7-0 of 9-bit index)
0x8800 + n*4 + 1 : flags
bit7 : H flip
bit6 : V flip
bit4-3 : priority 00=hidden 01=behind 10=middle 11=front
bit2 : H chain (extend sprite 8px to the right)
bit1 : V chain (extend sprite 8px downward)
bit0 : tile index bit 8 (for tiles 256-511)
0x8800 + n*4 + 2 : X position (pixels)
0x8800 + n*4 + 3 : Y position (pixels)
Common mistake: the struct layout is tile / flags / x / y, NOT y / x / tile / attr. Always write fields in the correct order when building OAM entries manually.
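The byte order above can be captured in a small packing helper. This is a minimal host-side sketch: `OamEntry` and `oam_pack()` are hypothetical names used for illustration only, not part of any SDK API. It shows how a 9-bit tile index splits across the tile byte and flag bit 0.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Hypothetical helper type: one OAM entry in hardware byte order. */
typedef struct {
    u8 tile;   /* +0: tile index bits 7-0 */
    u8 flags;  /* +1: flip / priority / chain / tile index bit 8 */
    u8 x;      /* +2: X position in pixels */
    u8 y;      /* +3: Y position in pixels */
} OamEntry;

/* Pack a 9-bit tile index and a flags value into one entry.
   Bit 0 of 'flags' is overwritten with tile index bit 8. */
static OamEntry oam_pack(u16 tile, u8 flags, u8 x, u8 y) {
    OamEntry e;
    assert(tile < 512u);
    e.tile  = (u8)(tile & 0xFFu);
    e.flags = (u8)((flags & 0xFEu) | ((tile >> 8) & 1u));
    e.x = x;
    e.y = y;
    return e;
}
```

For example, `oam_pack(0x186, SPR_FRONT, 10, 20)` yields tile byte 0x86 and flags byte 0x19.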
1.2 Palette indices (0x8C00)¶
Palette indices are stored separately from OAM flags (unlike some other platforms).
When flushing the shadow OAM, flush both regions: OAM at 0x8800 and palette indices at 0x8C00.
1.3 Flag constants¶
/* Priority */
SPR_HIDE = (0 << 3) /* hidden */
SPR_BEHIND = (1 << 3) /* behind both scroll planes */
SPR_MIDDLE = (2 << 3) /* between the two scroll planes */
SPR_FRONT = (3 << 3) /* in front of everything */
/* Flip */
SPR_HFLIP = 0x80
SPR_VFLIP = 0x40
SPR_HVFLIP = 0xC0
/* Chain (extend sprite size) */
SPR_HCHAIN = 0x04 /* extend 8px right (use 2 consecutive tile slots) */
SPR_VCHAIN = 0x02 /* extend 8px down (use 2 consecutive tile slots) */
2. Chaining & Metasprites¶
2.1 H-chain and V-chain¶
Chain bits extend a sprite beyond 8x8 pixels by consuming consecutive hardware sprite slots:
Single 8x8: slot n (tile T)
H-chain 16x8: slot n (tile T, SPR_HCHAIN) + slot n+1 (tile T+1)
V-chain 8x16: slot n (tile T, SPR_VCHAIN) + slot n+1 (tile T+2)
H+V chain: slot n (tile T, SPR_HCHAIN|SPR_VCHAIN) + slots n+1, n+2, n+3
The hardware fills the additional pixels automatically from consecutive tile indices. The extra slot's X/Y and flags are ignored by the hardware (only the first slot controls position/flip).
2.2 Multi-tile sprites (metasprites)¶
For sprites larger than 16x16 or needing more than 3 visible colors (per 8x8 tile), use multiple independent hardware sprite slots with manually computed offsets.
/* 16x16 character = 4 x 8x8 sprites (one layer, 3 visible colors; a second layer would add 3 more) */
const NgpcMetasprite player_idle = {
4, /* part count */
16, 16, /* total width, height (for flip calculation) */
{
{ 0, 0, TILE_BASE+0, 0, SPR_FRONT }, /* top-left */
{ 8, 0, TILE_BASE+1, 0, SPR_FRONT }, /* top-right */
{ 0, 8, TILE_BASE+2, 0, SPR_FRONT }, /* bottom-left */
{ 8, 8, TILE_BASE+3, 0, SPR_FRONT }, /* bottom-right */
}
};
/* Draw (slot 0..3 used) */
ngpc_mspr_draw(0, player_x, player_y, &player_idle, SPR_FRONT);
/* Draw facing left (automatic quad swap + per-part flip toggle) */
ngpc_mspr_draw(0, player_x, player_y, &player_idle, SPR_FRONT | SPR_HFLIP);
2.3 Flip rules for metasprites¶
When applying H-flip to a multi-tile metasprite, you must:
1. Swap left/right quads: with the part order from section 2.2, part 0 <-> 1 and part 2 <-> 3 for H-flip
2. Toggle SPR_HFLIP on each individual part
Without the quad swap, the sprite appears cut in half or mirrored incorrectly.
The ngpc_mspr_draw() function handles this automatically.
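The two flip rules can be expressed per part as "mirror the offset, then toggle the flag". The sketch below is an illustration under that assumption, not the actual `ngpc_mspr_draw()` source; `MsprPart` and `mspr_flip_part()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;

#define SPR_HFLIP 0x80u
#define SPR_VFLIP 0x40u

/* Hypothetical part descriptor: offset inside the metasprite plus its own flags. */
typedef struct { u8 dx, dy, flags; } MsprPart;

/* Return the part as it should be drawn when the whole metasprite
   (w x h pixels, built from 8x8 parts) is drawn with 'draw_flags'. */
static MsprPart mspr_flip_part(MsprPart p, u8 w, u8 h, u8 draw_flags) {
    assert(p.dx + 8u <= w && p.dy + 8u <= h);
    if (draw_flags & SPR_HFLIP) {
        p.dx = (u8)(w - 8u - p.dx);   /* mirror the column: left <-> right */
        p.flags ^= SPR_HFLIP;         /* and mirror the pixels of the part */
    }
    if (draw_flags & SPR_VFLIP) {
        p.dy = (u8)(h - 8u - p.dy);   /* mirror the row: top <-> bottom */
        p.flags ^= SPR_VFLIP;
    }
    return p;
}
```

Mirroring the offset is equivalent to the quad swap: in a 16x16 metasprite the part at dx=0 lands at dx=8 and vice versa.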
6-color sprite technique: render the same character twice using two layers (two sets of sprite slots), each with a 3-color palette. The layers combine to produce up to 6 visible colors. The sprite exporter supports this via a two-layer split option.
3. Budget & Strategy¶
3.1 Hardware limit: 64 sprites¶
The NGPC OAM holds exactly 64 sprite entries. This is the hard limit.
All 64 are drawn every frame regardless; hiding uses SPR_HIDE (priority = 0).
3.2 Fixed slot strategy¶
Assign a permanent slot or slot range to each game object type. Simple and predictable, no allocation overhead.
#define SLOT_PLAYER 0 /* slots 0-3 (2x2 metasprite) */
#define SLOT_ENEMIES 4 /* slots 4-19 (up to 8 enemies, 2 slots each) */
#define SLOT_BULLETS 20 /* slots 20-35 (up to 16 bullets, 1 slot each) */
#define SLOT_HUD 36 /* slots 36-63 (HUD, score, icons) */
3.3 Pool strategy¶
For variable-count objects, maintain a per-slot occupancy table and allocate slots dynamically. A 16-bit mask cannot describe 64 slots, so use one byte per slot:
static u8 s_spr_used[64]; /* 1 = slot N is taken */
s8 spr_alloc(u8 count) {
    u8 i, j;
    if (count == 0 || count > 64) return -1;
    for (i = 0; i <= 64 - count; i++) {
        /* check 'count' consecutive free slots starting at i */
        for (j = 0; j < count && !s_spr_used[i + j]; j++) { }
        if (j == count) {
            for (j = 0; j < count; j++) s_spr_used[i + j] = 1;
            return (s8)i;
        }
    }
    return -1; /* no space */
}
3.4 Sprite multiplexing (sprmux) — ABANDONED¶
Not recommended — do not use. Hardware tests confirmed that the HBlank ISR budget (~30 cycles) is too short to flush OAM on the fly for any realistic sprite count.
Validated alternative: DMA-based mass OAM update — see DMA.md.
The original approach recycled hardware OAM slots each HBlank via Timer0 interrupt (Y-sort + reassign as sprites leave scanline visibility). The concept works in principle but fails in practice on NGPC hardware: the Timer0 HBlank window cannot accommodate the required memory operations before the next scanline begins, causing visual corruption at any non-trivial sprite count.
For >64 logical sprites, the validated solution is a single DMA copy from shadow OAM
to 0x8800 in VBlank — zero HBlank pressure, production-safe, hardware-confirmed.
4. Performance Checklist¶
- Shadow OAM: build the full OAM state in RAM during main loop, flush atomically in VBlank. Never write individual sprite registers mid-frame.
- Move-only writes: if only X/Y changed, update only 0x8800+n*4+2 and +3. Skipping the tile/flags bytes saves cycles when animating large sprite counts.
- Tail-clear: after flush, clear the remaining slots (slots used..63) once per frame. ngpc_soam_flush_partial() does this automatically.
- Palette index flush: upload the full 64-byte shadow_col[] array to 0x8C00 once per frame using LDIRW (32 words). Do not update palette indices mid-frame.
- Tile upload: only re-upload tile data when the asset changes. Tile RAM is persistent; avoid unnecessary re-uploads in the main loop.
- Tile base conflicts: ensure sprite tile base does not overlap with tilemap tile base or system font (slots 32-127). Tile VRAM is shared.
- Character Over: if HW_STATUS & 0x80 fires, too many sprites overlap on one scanline. Reduce overlap or use priority to hide lower-priority sprites.
5. Shadow OAM Double-Buffer¶
Optional module: ngpc_soam.
5.1 Principle¶
Build the full OAM state in a RAM shadow during game logic,
then push atomically to hardware (0x8800 + 0x8C00) during VBlank via LDIRW.
No screen tearing: hardware is updated in one burst.
- Shadow buffers: shadow_oam[256 bytes] + shadow_col[64 bytes] = 320 bytes RAM
- Confirmed by reverse engineering of a commercial action title: shadow OAM in RAM, flush to 0x8800 via LDIRW + tail-clear. Palette indices to 0x8C00 as a separate 64-byte copy.
- Implementation note: some commercial routines pack the palette ID into unused flag bits in the shadow, then extract it during flush (pal = (flags >> 1) & 0x0F). Keeping shadow_col[] separate is simpler.
5.2 API¶
#include "ngpc_soam/ngpc_soam.h"
void ngpc_soam_begin(void);
/* Start of frame: reset internal slot counter */
void ngpc_soam_put(u8 slot, u8 x, u8 y, u16 tile, u8 flags, u8 pal);
/* Write one sprite to shadow RAM (does NOT write hardware) */
void ngpc_soam_hide(u8 slot);
/* Hide one slot in shadow (does not advance the slot counter) */
void ngpc_soam_flush(void);
/* Push shadow -> hardware: LDIRW all 64 slots (128 u16 OAM + 32 u16 pal idx)
Call ONLY from VBlank ISR. */
void ngpc_soam_flush_partial(void);
/* Performance variant: LDIRW only slots 0..used-1, then hardware-clear the rest.
Saves time when few sprites are active. */
void ngpc_soam_hide_all(void);
/* Immediately clear all 64 hardware sprite slots (priority = SPR_HIDE) */
u8 ngpc_soam_used(void);
/* Returns the high-water-mark (number of slots used this frame) */
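To make the begin / put / hide / used contract concrete, here is a small host-side model of the shadow buffers. It is a sketch under assumptions, not the module's source: the `model_*` names and `m_*` variables are invented for illustration, and the high-water mark is assumed to be "highest slot written + 1".

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

#define SPR_MAX 64u

/* Model state; names are illustrative, not the module's real symbols. */
static u8 m_oam[SPR_MAX * 4u];  /* tile, flags, x, y per slot */
static u8 m_col[SPR_MAX];       /* palette index per slot */
static u8 m_used;               /* high-water mark this frame */

static void model_begin(void) { m_used = 0; }

static void model_put(u8 slot, u8 x, u8 y, u16 tile, u8 flags, u8 pal) {
    u8 *e;
    assert(slot < SPR_MAX);
    e = &m_oam[(u16)slot * 4u];
    e[0] = (u8)tile;                                    /* tile bits 7-0 */
    e[1] = (u8)((flags & 0xFEu) | ((tile >> 8) & 1u));  /* flags + tile bit 8 */
    e[2] = x;
    e[3] = y;
    m_col[slot] = pal;                                  /* kept in a separate buffer */
    if ((u8)(slot + 1u) > m_used) m_used = (u8)(slot + 1u);
}

static void model_hide(u8 slot) {
    assert(slot < SPR_MAX);
    m_oam[(u16)slot * 4u + 1u] &= (u8)~0x18u;  /* priority bits 4-3 = 00 (hidden) */
}

static u8 model_used(void) { return m_used; }
```

Nothing here touches 0x8800 or 0x8C00; hardware is only written by the flush step.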
5.3 VBlank integration¶
/* VBlank ISR: */
extern void ngpc_soam_flush(void);
static void __interrupt isr_vblank(void) {
HW_WATCHDOG = WATCHDOG_CLEAR;
ngpc_soam_flush(); /* push shadow OAM to hardware */
ngpc_vramq_flush(); /* flush queued VRAM writes */
g_vb_counter++;
}
/* In game loop — build shadow during logic: */
void game_render(void) {
u8 i;
ngpc_soam_begin();
for (i = 0; i < enemy_count; i++) {
ngpc_soam_put(i, enemy[i].x, enemy[i].y,
TILE_BASE + enemy[i].tile, SPR_FRONT, enemy[i].pal);
}
/* flush happens automatically in next VBlank ISR */
}
Drop-in replacement for ngpc_sprite_set():
/* Before (direct HW write, risk of tearing): */
ngpc_sprite_set(0, px, py, TILE_BASE, 0, SPR_FRONT);
/* After (buffered, tear-free): */
ngpc_soam_begin();
ngpc_soam_put(0, px, py, TILE_BASE, SPR_FRONT, 0);
/* ngpc_soam_flush() called automatically in VBlank ISR */
5.4 ASM flush implementation¶
For performance, flush() and flush_partial() are implemented in TLCS-900H assembly.
The C implementation uses byte-by-byte loops (256 + 64 iterations), which is too slow
for VBlank. The ASM port replaces these with two LDIRW instructions:
flush() : LDIRW 128 words (OAM) + LDIRW 32 words (pal idx)
flush_partial() : LDIRW N words (used slots) + hardware-clear tail
Module split:
ngpc_soam/
ngpc_soam_c.c compiled C : begin / put / hide / hide_all / used + variables
ngpc_soam_flush.asm compiled ASM: flush / flush_partial via LDIRW
Variables s_oam[], s_col[], s_used, s_used_prev are declared non-static
in the C source so the linker makes them visible to the .asm file via extern.
TLCS-900H assembly pitfalls encountered during the ASM port:
- INC r is invalid — mandatory form is INC 1, r (MAXIMUM strict mode)
- LD (HL), n does not exist — store via intermediate register
- LD (XHL), c — c is ambiguous with the Carry condition code — use e + (XHL+d)
- (XHL+d) required for indirect addressing (not (HL) directly)
- Upper bits of XHL must be zero: LD XHL, 0 before use
- LDIRW with BC=0 = 65536 iterations — always guard at runtime if BC is variable
- Warning 501 on ld hl, extern_symbol: harmless, linker resolves correctly
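The partial flush and the BC=0 guard can be modeled in portable C for host-side testing. This is a sketch only: `flush_partial_model()` is a hypothetical name, `memcpy` stands in for LDIRW, and plain arrays stand in for the hardware OAM at 0x8800.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t  u8;
typedef uint16_t u16;

#define SPR_MAX 64u

/* Copy 'used' slots from the shadow to the OAM image, then hide slots
   used..63. Returns the number of 16-bit words copied, which is the
   value BC would hold for LDIRW. */
static u16 flush_partial_model(u8 *hw, const u8 *shadow, u8 used) {
    u16 words = (u16)(used * 2u);          /* 4 bytes per slot = 2 words */
    u8 s;
    assert(used <= SPR_MAX);
    if (words != 0u) {                     /* guard: LDIRW with BC=0 would copy 65536 words */
        memcpy(hw, shadow, (size_t)words * 2u);
    }
    for (s = used; s < SPR_MAX; s++) {
        hw[(u16)s * 4u + 1u] &= (u8)~0x18u; /* tail-clear: priority = hidden */
    }
    return words;
}
```

The zero check mirrors the runtime guard the assembly version needs whenever the slot count is variable.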
5.5 Hardware validation¶
Validated on real hardware. Four test cases passed:
- ngpc_soam_put() + D-pad: 2x2-tile player sprite moves correctly
- Tail-clear: cycling enemies 8->4->0->8, extra slots hidden automatically
- ngpc_soam_hide_all(): everything hidden immediately
- ngpc_soam_used(): live slot counter displayed correctly
6. Entity Pool & Spawn Table¶
Architecture proven by reverse engineering a commercial platformer binary.
6.1 Pool layout¶
/* Main loop:
XIX = pool RAM base
XIZ = spawn table pointer (ROM, current level) */
for each entity at XIX {
u16 type = *(XIX+0x00); /* 0xFFFF = end of pool (sentinel) */
if (type == 0xFFFF) break;
if (*(XIZ+0x00) == 0) skip; /* entity inactive */
/* +0x04, +0x06 : X/Y position offsets (world base) */
/* +0x08 : u32 FAR ptr to spawn data in ROM */
/* +0x0A : flags: bit6=flipX, bit7=flipY, high bits=tile attr */
/* +0x0B : palette byte (copied to shadow_col[]) */
}
6.2 Spawn table ROM¶
Each entry in the current level ROM table (4 bytes):
+0x00 u16 type (0xFFFF = end of table)
+0x02 s8 spawn_x (relative to map)
+0x03 s8 spawn_y (relative to map)
The current level ID is stored in a fixed ROM byte (0xFF = last level / end of game).
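The 4-byte entry maps directly onto a C struct. A minimal sketch, assuming the layout listed above; `SpawnEntry`, `SPAWN_END`, and `spawn_table_count()` are illustrative names, not symbols recovered from the binary.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;
typedef int8_t   s8;

#define SPAWN_END 0xFFFFu

/* One 4-byte spawn entry (field names are illustrative). */
typedef struct {
    u16 type;     /* +0x00: entity type, 0xFFFF terminates the table */
    s8  spawn_x;  /* +0x02: X relative to map */
    s8  spawn_y;  /* +0x03: Y relative to map */
} SpawnEntry;

/* Count entries up to (not including) the 0xFFFF sentinel. */
static u16 spawn_table_count(const SpawnEntry *t) {
    u16 n = 0;
    assert(t != 0);
    while (t[n].type != SPAWN_END) n++;
    return n;
}
```

A level table is then a `const SpawnEntry[]` in ROM whose last element has `type == SPAWN_END`.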
6.3 Shadow OAM builder¶
/* Setup at start of frame: */
camera_x = *(s16*)0x801E; /* SCR1_X HW -> stored to RAM */
camera_y = -(*(u8*)0x5085 + 1); /* SCR1_Y shadow inverted -> RAM */
sprite_count = 0;
XIY = 0x4B6E; /* shadow OAM destination */
XIZ = 0x4C6E; /* shadow palette indices destination */
for each active entity {
screen_x = spawn_x + camera_x; /* world -> screen projection */
screen_y = spawn_y + camera_y;
if (screen_x > 0xA6) continue; /* clip right (166px) */
if (screen_y > 0x9E) continue; /* clip bottom (158px) */
if (sprite_count >= 64) break; /* hardware limit */
tile = 0x1BF + sprite_count; /* sequential tile slot in VRAM */
attr_hi = flags | (0x1BF >> 8); /* high bits + flags */
(XIY+) = tile_lo | (attr_hi << 8); /* OAM word0 */
(XIY+) = x | (y << 8); /* OAM word1 */
(XIZ+) = palette_byte; /* palette index */
sprite_count++;
}
/* Clear tail [sprite_count..63] in shadow, then flush in VBlank: */
/* LDIRW [0x8800] <- [0x4B6E], BC=0x80 (OAM, 256 bytes = 128 words) */
/* LDIRW [0x8C00] <- [0x4C6E], BC=0x20 (pal idx, 64 bytes = 32 words) */
Tile allocation pattern: this engine allocates a fresh sequential tile slot per entity
each frame (tile = base + sprite_count). Different from the fixed-slot approach
(which assigns a permanent slot per entity), but simpler to implement.
6.4 CPU Budget Throttle¶
This engine caps entity updates at 30 per frame to prevent VBlank overruns during entity-heavy scenes.
Frame reset (called once per frame before pool loop):
ld (0x4cef), 0x1E ; reset budget counter to 30
; also resets: pool list head, OAM slot counter, sprite_count, etc.
Per-entity check (called at start of each entity's update function):
cp (0x4cef), 0x0 ; budget exhausted?
jr Z, skip_update ; yes — skip this entity (no movement, no draw)
dec (0x4cef) ; no — consume one update slot
; entity update code follows
Result: when more than 30 entities are active, entities beyond the cap hold their last position until budget is available next frame. At 60fps this is generally invisible.
C equivalent:
/* main_loop_frame_start(): */
entity_budget = 30;
/* update_entity() entry: */
if (entity_budget == 0) return;
entity_budget--;
/* ... update logic ... */
Set the cap to match the game's expected entity count under normal load; adjust up or down based on profiling.
7. Entity Struct & State Machine¶
Architecture proven by reverse engineering a commercial run-and-gun action binary.
7.1 Pool layout¶
Base address : 0x4000
Pool size : 0xB88 words = 5,904 bytes (41 entries of 144 bytes)
Entry size : ~0x90 bytes (144 bytes per entity)
Active list head : address 0x5A00
Sentinel tail : address 0x5950
entity_count : address 0x65BD (decremented in entity_free)
Linked list pointers per entity:
- +0x04 / +0x06 : next/prev in active list
- +0x20 / +0x22 : next/prev in free list
7.2 Entity struct (key offsets)¶
+0x00 ptr32 State function pointer (state machine, called every frame)
+0x04 u16 Next ptr (active list)
+0x06 u16 Prev ptr (active list)
+0x10 u16 Sprite flags (bit6 = "live")
+0x11 byte Direction (bit0 = flip X)
+0x13 byte Control flags (bit6 = OAM active, bit7 = culled)
+0x16 s16 Screen X position (camera-relative)
+0x1E u16 Timer (decremented per frame)
+0x20 ptr32 Next free (free list)
+0x26 u16 World X position (absolute)
+0x2A u16 World Y position
+0x30 s16 Velocity X (fixpoint)
+0x36 s16 Velocity Y
+0x3C byte State flags (bits 2/3/6)
+0x3E u16 Death/flash timer
+0x44 s16 Next state address (dispatcher)
+0x46 u16 Animation counter
+0x48 byte HP / life points
+0x54 ptr32 Parent entity pointer (e.g., boss segment -> boss)
+0x58 s16 Screen X (camera-projected)
+0x5A s16 Screen Y (camera-projected)
+0x60 byte Sprite width in tiles
+0x62 byte Sprite height in tiles
+0x66 u16 Angle / heading (8:8 fixpoint)
+0x6A byte Current mode (0xFF = just spawned)
+0x74 u16 Frame delay / animation timer
+0x76 u16 Tile base index
+0x7A ptr16 Metasprite pointer (near, low ROM)
+0x7E u16 Metasprite flags
7.3 Function-pointer state machine¶
/* Change state: store new function pointer at +0x00 */
entity->state = &state_walk;
/* Dispatcher, called every frame: */
(*(void(*)())entity->state)(); /* call current state handler */
/* Global scheduler: */
next_state_fn = ptr_next_function; /* next global state */
action_flags |= 8; /* flag "new action" */
wait_frames = N; /* frames to wait */
bios_return_hook = 0x82C0; /* BIOS return hook */
Benefit: no dispatch table, no switch/case. Cost: 4 bytes per entity for the state pointer. This is the universal NGPC game-object pattern observed across multiple commercial titles.
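A self-contained version of the pattern, runnable on a host, might look like the sketch below. It is an illustration, not code from any title: the `Entity` fields and the two states are made up, and the entity pointer is passed explicitly, whereas the commercial code keeps it in a register.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;

struct Entity;
typedef void (*StateFn)(struct Entity *);

/* Illustrative entity: only the fields the sketch needs. */
typedef struct Entity {
    StateFn state;   /* current state handler (the +0x00 slot) */
    u16     timer;   /* frames left in the current state */
    u16     x;
} Entity;

static void state_walk(Entity *e);

static void state_idle(Entity *e) {
    if (e->timer == 0) {          /* idle time elapsed: transition */
        e->state = state_walk;
        e->timer = 3;
    } else {
        e->timer--;
    }
}

static void state_walk(Entity *e) {
    e->x++;
    if (--e->timer == 0) e->state = state_idle;
}

/* Per-frame dispatcher: one indirect call, no switch. */
static void entity_update(Entity *e) {
    assert(e->state != 0);
    e->state(e);
}
```

A state change is just a pointer store, so adding a new state never touches the dispatcher.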
8. Object Struct & Script Engine¶
This architecture pushes the "state machine via function pointer" pattern to its maximum,
as observed in a commercial NGPC title. Register XIZ serves as this (equivalent to the
C++ this pointer). Nearly every game object follows this schema.
8.1 XIZ object struct (~104 bytes)¶
Offset | Size | Role
--------+-------+--------------------------------------------------
+0x00 | ptr32 | State function pointer (state machine)
+0x09 | byte | Active / enable flag
+0x1C | byte | Mode flags (bits 0..3)
+0x1D | byte | Object-local frame counter
+0x20 | ptr32 | Current data pointer (XIY ptr)
+0x22 | byte | Animation frame counter
+0x24 | ptr32 | Secondary animation data pointer
+0x28 | byte | Current animation frame ID
+0x2A | byte | Sprite flags (H/V flip, etc.)
+0x2B | byte | Render mode (0x30 = 2-layer sprite)
+0x2C | ptr32 | Current position in script sequence
+0x2F | byte | Palette index override
+0x30 | byte | Current frame data
+0x31 | byte | Repeat count
+0x32 | byte | Speed scale (1=100%)
+0x33 | byte | Direction flags (bit7 = mirror X)
+0x34 | byte | X position (camera-relative)
+0x35 | byte | Y position (camera-relative)
+0x38 | ptr32 | Main animation data pointer
+0x3A | s8 | Velocity X
+0x3B | s8 | Velocity Y
+0x3C | s8 | Acceleration X
+0x3D | s8 | Acceleration Y
+0x3E | byte | Hitbox / camera width
+0x3F | byte | Hitbox / camera height
+0x40 | byte | Timer reset value
+0x42 | byte | World X position (clamped, for camera)
+0x44 | byte | World Y position (clamped)
+0x48 | byte | Screen X position (post-clamp)
+0x4A | byte | Screen Y position (post-clamp)
+0x4C | byte | Current animation frame index
+0x4D | byte | Current animation ID
+0x4E | byte | General-purpose countdown timer
+0x4F | byte | End-of-sequence flag
+0x50 | byte | Current state ID
+0x52 | byte | Next state ID
+0x64 | ptr32 | State machine pointer (read-only backup)
8.2 Function-pointer state machine (XIZ+0x00)¶
Identical to the entity-pool pattern, but used for ALL objects (more systematic):
; Change state: write new handler into XIZ+0x0
ld (XIZ+0x0), XIX ; XIX = address of new state function
; Dispatcher (called every frame):
call (XIZ+0x0) ; indirect call through state pointer
/* C equivalent: */
typedef void (*StateFunc)(void);
((StateFunc)(obj->state_ptr))(); /* execute current state */
obj->state_ptr = (u32)&new_state_func; /* transition to new state */
The pointer IS the state. No dispatch table, no switch/case. Cost: 4 bytes per object.
8.3 Script engine bytecode¶
This title implements a mini-interpreter for animation and behavior sequences.
Bytecode is stored in ROM, pointed to by XIZ+0x2C.
Format of one script element (1 word + 1 dword):
Opcodes (value in B):
B > 0 : frame duration (show this frame for B ticks)
B == 0 : end of sequence
B == -1 : loop back to start
B == -2 : jump offset -2
B == -3 : special event
B == -5 : callback
Main interpreter routine:
1. Read BC from (XWA) — XWA = current sequence pointer
2. If B > 0: load XIY = (XWA+2) (sprite data), increment frame counter
3. If B <= 0: dispatch via a 12-entry jump table
Compact format: 6 bytes per frame (1 word header + 1 dword data pointer). Used universally for animations, behaviors, and cutscenes.
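The duration / end / loop opcodes can be sketched as a tiny interpreter. This is a simplified model under stated assumptions: `ScriptElem` uses a one-byte frame id in place of the 32-bit ROM data pointer, all names are hypothetical, and the jump, event, and callback opcodes are not modeled (the sketch simply stops on them).

```c
#include <assert.h>
#include <stdint.h>

typedef int8_t  s8;
typedef uint8_t u8;

#define OP_END   0
#define OP_LOOP (-1)

/* Illustrative script element: opcode/duration plus a frame id. */
typedef struct { s8 op; u8 frame; } ScriptElem;

typedef struct {
    const ScriptElem *start;  /* first element of the sequence */
    const ScriptElem *pc;     /* next element to execute */
    u8 ticks_left;            /* ticks remaining on the current frame */
    u8 frame;                 /* frame currently displayed */
    u8 done;                  /* set when the sequence has ended */
} ScriptState;

static void script_start(ScriptState *s, const ScriptElem *seq) {
    s->start = s->pc = seq;
    s->ticks_left = 0;
    s->frame = 0;
    s->done = 0;
}

/* Advance by one tick. */
static void script_tick(ScriptState *s) {
    assert(s->pc != 0);
    if (s->done) return;
    if (s->ticks_left > 0) { s->ticks_left--; return; }  /* frame still showing */
    if (s->pc->op == OP_LOOP) s->pc = s->start;          /* loop back to start */
    if (s->pc->op <= 0) { s->done = 1; return; }         /* end, or opcode not modeled */
    s->frame = s->pc->frame;
    s->ticks_left = (u8)(s->pc->op - 1);                 /* this tick is the first */
    s->pc++;
}
```

With the sequence `{2, A}, {1, B}, {OP_LOOP}` the displayed frame per tick is A, A, B, A, A, B, and so on.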
8.4 Sprite format conversion¶
A converter routine maps the internal sprite format to NGPC OAM hardware:
Source (1 word, internal):
bits 11..8 : palette index (4 bits)
bits 7..0 : tile index
Destination (NGPC OAM format):
QD = tile index low byte
QE = palette index (shifted right 4)
9. Graphics Pipeline: PNG to Screen¶
9.1 Method A: helper functions¶
#include "ngpc_gfx.h"
#include "../GraphX/my_tileset.h"
#define TILE_BASE 128u /* avoid overwriting sysfont (slots 32-127) */
static void scene_init(void) {
u16 i;
ngpc_gfx_clear(GFX_SCR1);
ngpc_gfx_set_bg_color(RGB(0, 0, 0));
/* Load tiles (NGP_FAR handled internally by the helper) */
ngpc_gfx_load_tiles_at(my_tileset_tiles,
my_tileset_tiles_count,
TILE_BASE);
/* Load palettes */
for (i = 0; i < (u16)my_tileset_palette_count; ++i) {
u16 off = (u16)i * 4u;
ngpc_gfx_set_palette(GFX_SCR1, (u8)i,
my_tileset_palettes[off + 0],
my_tileset_palettes[off + 1],
my_tileset_palettes[off + 2],
my_tileset_palettes[off + 3]);
}
/* Write tilemap */
for (i = 0; i < my_tileset_map_len; ++i) {
u8 x = (u8)(i % my_tileset_map_w);
u8 y = (u8)(i / my_tileset_map_w);
u16 tile = (u16)(TILE_BASE + my_tileset_map_tiles[i]);
u8 pal = (u8)(my_tileset_map_pals[i] & 0x0Fu);
ngpc_gfx_put_tile(GFX_SCR1, x, y, tile, pal);
}
}
9.2 Method B: direct VRAM blit macro¶
Use when debugging near/far pointer issues or when Method A produces corrupted output:
#include "ngpc_tilemap_blit.h"
#include "../GraphX/my_tileset.h"
#define TILE_BASE 128u
static void scene_init(void) {
ngpc_gfx_clear(GFX_SCR1);
ngpc_gfx_set_bg_color(RGB(0, 0, 0));
NGP_TILEMAP_BLIT_SCR1(my_tileset, TILE_BASE);
}
This macro writes tiles directly to Character RAM (0xA000), the tilemap directly
to HW_SCR1_MAP (0x9000), and loads palettes via ngpc_gfx_set_palette().
No pointer argument passing — avoids near/far issues completely.
Diagnostic rule: if Method B renders correctly but Method A does not, the issue is a near/far pointer problem in the helper call. If both fail, the asset itself is corrupted or video init is wrong.
9.3 Tilemap constraints¶
| Constraint | Value |
|---|---|
| Scroll plane map size | 32x32 tiles |
| Visible screen area | 20x19 tiles (160x152 px) |
| Free tile slots | 128-511 (0-31 reserved, 32-127 = BIOS sysfont) |
| Palettes per plane | 16 palettes x 4 colors, format 0x0BGR |
| Palette 0, color 0 | Always transparent on scroll planes |
| Total tile VRAM | 512 tiles (Character RAM = 8 KB) |
| Max colors per tile | 3 visible + 1 transparent |
| Max palettes (budget) | 16 per plane |
tiles_count from the tilemap export tool = number of u16 words (= nb_tiles * 8),
not the number of tiles.
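Because the exporter counts u16 words, a small conversion helper avoids off-by-8x mistakes. The sketch below assumes the figures from the table above (16 bytes per tile, Character RAM at 0xA000); `tiles_from_words()` and `tile_vram_addr()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;

#define TILE_WORDS    8u        /* 8x8 pixels at 2bpp = 16 bytes = 8 u16 words */
#define CHAR_RAM_BASE 0xA000u   /* Character RAM base */

/* Convert the exporter's word count to a tile count. */
static u16 tiles_from_words(u16 words) {
    assert(words % TILE_WORDS == 0u);   /* a partial tile means a bad export */
    return (u16)(words / TILE_WORDS);
}

/* Address of a tile slot in Character RAM. */
static u16 tile_vram_addr(u16 tile) {
    assert(tile < 512u);
    return (u16)(CHAR_RAM_BASE + tile * 16u);
}
```

For instance, an export with tiles_count = 64 holds 8 tiles, and loading them at tile base 128 starts at address 0xA800.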
9.4 Debug checklist¶
- Palettes: loaded on the correct plane? SCR1 and SCR2 have separate palette RAM.
- Tile base: not overwriting sysfont? Use tile_base >= 128.
- NGP_FAR: all pointers to ROM (0x200000+) declared with NGP_FAR?
- Method B test: does NGP_TILEMAP_BLIT_SCR1 render correctly?
  - Yes -> asset is fine, problem is near/far in helper
  - No -> check asset generation or video init
- Raw data check: verify generated tiles[], map_tiles[], palettes[] byte-by-byte before blaming the C pipeline
10. Known Bugs & Solutions¶
[ngpc_soam] Blank screen at startup / watchdog reset loop (hardware)
- Symptom: ROM boots, white screen, resets after ~100ms, repeats.
- Root cause: for (u8 i = 0; i < SPR_MAX * 4; i++) with SPR_MAX=64
gives limit 256. u8 overflows 255->0 -> infinite loop in isr_vblank() ->
watchdog never fed -> hardware reset.
- Fix: use u16 b for the OAM loop counter (256 iterations minimum).
- Rule: never use u8 for a loop counter if iteration count can reach 256.
Same family as u16 y * SCR_MAP_W overflow.
/* WRONG: infinite loop */
u8 i;
for (i = 0; i < SPR_MAX * 4; i++) hw_oam[i] = s_oam[i];
/* CORRECT: u16 counter */
u16 b;
for (b = 0u; b < (u16)SPR_MAX * 4u; b++) hw_oam[b] = s_oam[b];
[ngpc_soam] Sysfont text invisible on dark background
- Symptom: sprites visible and working, zero text on screen.
- Root cause: palette 0 of GFX_SCR1 never initialized -> color 1 (sysfont foreground)
= 0x000 (black) -> black text on dark background.
- Fix: always call ngpc_gfx_set_palette(GFX_SCR1, 0u, ...) in scene init,
after ngpc_gfx_fill(). Mandatory for any project using text rendering.
/* After ngpc_gfx_fill(GFX_SCR1, ' ', 0u): */
ngpc_gfx_set_palette(GFX_SCR1, 0u,
RGB(0, 0, 0), /* color 0: transparent */
RGB(15, 15, 15), /* color 1: white (sysfont foreground) */
RGB(8, 8, 10), /* color 2: light gray */
RGB(3, 5, 15)); /* color 3: blue accent */
Sprites appear as tilemap tiles / tile base conflict
- Symptom: sprite slots show terrain tiles or garbage from another asset.
- Root cause: sprite tile base overlaps with tilemap or background tile base. When backgrounds are re-uploaded (state transition, level load), they overwrite sprite tiles.
- Fix: plan tile VRAM layout explicitly. Assign separate, non-overlapping ranges: tile 0-31 = reserved, 32-127 = sysfont, 128-N = backgrounds, N+1..511 = sprites. Regenerate all assets if the layout changes.
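A tile layout plan can be checked mechanically before any asset is uploaded. The following is a minimal sketch with hypothetical names (`TileRange`, `tile_ranges_overlap()`); the example ranges in the usage note are illustrative, not a recommended split.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;

/* Inclusive tile range [first, last]. */
typedef struct { u16 first, last; } TileRange;

/* Nonzero when two inclusive ranges share at least one tile slot. */
static int tile_ranges_overlap(TileRange a, TileRange b) {
    assert(a.first <= a.last && b.first <= b.last);
    return a.first <= b.last && b.first <= a.last;
}
```

For example, backgrounds at 128-319 and sprites at 320-511 do not overlap, while backgrounds at 128-330 would collide with the same sprite range.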
Sprite appears "cut in half" when H-flipped
- Symptom: multi-tile sprite appears mirrored on one side only, or two halves don't align.
- Root cause: H-flip toggled on each part but left/right quad positions not swapped.
- Fix: for H-flip on a multi-tile sprite, swap the left and right columns of parts
AND toggle SPR_HFLIP on each individual part. Same logic applies for V-flip
(swap top/bottom rows). The ngpc_mspr_draw() function handles this correctly.
Stale sprites from previous state visible on screen
- Symptom: sprites from title screen, menu, or previous game state remain visible
after a state transition.
- Root cause: sprite slots not explicitly cleared on state exit.
- Fix: call ngpc_soam_hide_all() or ngpc_sprite_hide_all() in every _init()
before setting up the new state's sprites.
"Bullet time" / frame lag with many sprites and projectiles
- Symptom: game slows down noticeably when several sprites and bullets are active.
- Root cause: full OAM write (tile + flags + X + Y) for every slot every frame,
even when only position changed.
- Fix: use move-only writes when only X/Y changed. Use shadow OAM with
flush_partial() (only active slots). Avoid unnecessary tile re-uploads.
Corrupted graphics / "garbage" tiles
- Symptom: tile or sprite data looks like random noise or wrong asset data.
- Root cause: near/far pointer mismatch. ROM assets at 0x200000+ accessed without
NGP_FAR -> the address is truncated to 16 bits -> reads from wrong location.
- Fix: always declare ROM asset pointers with NGP_FAR.
Use Method B (direct VRAM blit macro) to isolate the issue.
11. OAM Watermark & Dynamic Tile Upload¶
This section documents an OAM watermark and dynamic-tile-upload technique observed in a commercial pseudo-3D simulation title. All addresses, byte values, and patterns are confirmed from reverse engineering.
11.1 Context¶
The title's sprite engine manages up to 64 OAM slots for a variable number of objects each frame (objects appear and disappear as the view moves forward). It uses a dynamic watermark instead of fixed slot ranges, which is a cleaner allocation model for scenes with variable sprite counts.
11.2 OAM Pool Init¶
oam_pool_init:
ld XWA, 0x8800 ; HW OAM base
ld (0x44A0), XWA ; watermark = start of OAM
ld WA, 0x1186 ; OAM word0: tile low byte 0x86, flags 0x11 (tile 0x186 = 390, middle priority)
ld BC, 0xF8F8 ; X=0xF8, Y=0xF8 (off-screen, hidden)
ld XIX, 0x8800
ld E, 0x40 ; 64 sprites
loc_:
ldw (XIX+), WA ; write tile + flags
ldw (XIX+), BC ; write X=0xF8, Y=0xF8
inc 1, XWA ; next tile
djnz E, loc_
ret
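All 64 slots are initialized to consecutive tile indices starting at tile 0x186 (the word 0x1186 stores tile bits 7-0 in its low byte and flags 0x11, middle priority plus tile bit 8, in its high byte), parked off-screen at (0xF8, 0xF8). The watermark (0x44A0) tracks where the used portion ends.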
11.3 OAM Watermark — Dynamic Cursor¶
Instead of assigning fixed slot ranges per object type, this engine uses a single advancing cursor:
- RAM 0x44A0: current OAM write pointer (byte address inside 0x8800..0x88FF). Incremented by 4 each time a sprite is submitted. Reset to 0x8800 each frame.
- RAM 0x44AA: number of sprites used this frame (slot count, not byte count). Used to size the LDIRW flush in VBlank.
- No per-type slot budgets. All 64 slots are available to any object type.
C equivalent:
static u8 oam_cursor = 0; /* slot index, reset to 0 each frame */
u8 oam_submit(u8 x, u8 y, u16 tile, u8 flags) {
volatile u8 *p;
if (oam_cursor >= 64) return 0xFF; /* overflow guard (threshold 0x41) */
p = (volatile u8 *)(0x8800u + (u16)oam_cursor * 4u);
p[0] = (u8)tile;
p[1] = flags | (u8)((tile >> 8) & 1u);
p[2] = x;
p[3] = y;
return oam_cursor++;
}
11.4 Tail-Clear — XY Only¶
After flushing the active sprites for this frame, slots that were used last frame but are no longer needed are cleared by writing only the XY bytes (offset +2 in the 4-byte slot). The tile_id and flags are left intact.
; Tail-clear: from old watermark down to new watermark
ld WA, 0xF8F8 ; X=0xF8, Y=0xF8
ld XIZ, (0x44A0) ; old watermark (frame N-1)
ld (0x44A0), XIX ; update watermark to frame N
loop:
cp XIX, XIZ
jr PL, done
ldw (XIX+2), WA ; write ONLY bytes +2,+3 (X and Y)
inc 4, XIX ; advance by full slot size
jr loop
Why XY only: writing 2 bytes instead of 4 halves the memory traffic for the
tail-clear pass. The tile_id retained in the slot also lets the hardware recycle
the slot without re-initializing it. Confirmed: ldw (XIX+2), WA targets byte
offset +2 within the 4-byte OAM entry (the X byte), which is consistent with the
documented layout tile / flags / x / y.
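In C, the same XY-only tail-clear could be written as below. This is a host-side sketch, not a transcription of the routine: it works on slot counts rather than byte pointers, `tail_clear_xy()` is a hypothetical name, and a plain array stands in for the OAM at 0x8800.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;

#define SPR_MAX   64u
#define OFFSCREEN 0xF8u

/* Park slots new_count..old_count-1 off-screen by writing only X and Y.
   'oam' is a 256-byte OAM image; returns the new watermark to store. */
static u8 tail_clear_xy(u8 *oam, u8 new_count, u8 old_count) {
    u8 s;
    assert(new_count <= SPR_MAX && old_count <= SPR_MAX);
    for (s = new_count; s < old_count; s++) {
        oam[s * 4u + 2u] = OFFSCREEN;  /* X */
        oam[s * 4u + 3u] = OFFSCREEN;  /* Y */
    }
    return new_count;
}
```

When this frame uses more slots than the last one, the loop body never runs and only the watermark changes.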
11.5 Pre-Baked Scaling — Frame List Format¶
No hardware scaling on NGPC. The depth-scaling illusion is achieved by selecting from a set of pre-drawn sprite sizes depending on the object's computed zoom level.
Each frame_list entry is 5 bytes:
+0 u16 tile_index ; index into ROM tile bank (0 = end-of-list sentinel)
+2 u8 x_off ; X offset from entity screen origin
+3 u8 y_off ; Y offset
+4 u8 flags ; SPR_HFLIP, SPR_VFLIP, priority bits
Consumer loop:
loc_:
ld IY, (XIY+) ; IY = tile_index (u16), XIY advances +2
and XIY, XIY
jrl Z, done ; tile_index=0 → end of list
inc 1, E ; E = slot counter (max 0x40 = 64)
ld BC, (XIY+) ; C = x_off, B = y_off
ld A, (XIY+) ; A = flags
add C, W ; C += screen base X (W register)
add B, W ; B += screen base Y
ld (XIX+), D ; write attr (D = palette+priority, preloaded)
ldw (XIZ+), BC ; write X,Y to shadow palette region
jr loc_
The D register carries the palette+priority byte for all parts of one object
(preloaded before calling the renderer). Its bits encode priority (bits 4-3) and
palette (bits 2-0) in the same format as the OAM flags byte.
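The 5-byte frame-list format can be walked from C as follows. This is a simplified sketch under assumptions: it reads the tile index as little-endian, takes the flags from the list entry only (the preloaded attribute byte in D is not modeled), and writes into a plain array standing in for OAM. `frame_list_emit()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

#define SPR_MAX 64u

/* Walk a frame list (5-byte entries, tile index 0 terminates) and append
   one OAM entry per part. Returns the updated slot count. */
static u8 frame_list_emit(const u8 *list, u8 base_x, u8 base_y,
                          u8 *oam, u8 slot) {
    assert(slot <= SPR_MAX);
    for (;;) {
        u16 tile = (u16)(list[0] | ((u16)list[1] << 8));
        if (tile == 0u || slot >= SPR_MAX) break;   /* sentinel or OAM full */
        oam[slot * 4u + 0u] = (u8)tile;
        oam[slot * 4u + 1u] = (u8)((list[4] & 0xFEu) | ((tile >> 8) & 1u));
        oam[slot * 4u + 2u] = (u8)(base_x + list[2]);
        oam[slot * 4u + 3u] = (u8)(base_y + list[3]);
        slot++;
        list += 5;                                  /* next 5-byte entry */
    }
    return slot;
}
```

Selecting a different pre-drawn size for another zoom level then amounts to passing a different `list` pointer.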
11.6 Dynamic Tile Upload Accumulator¶
This engine does not keep static tile data in VRAM. Instead, it uploads only the
tiles that will be visible this frame, into a fixed VRAM window. Each entity's
renderer appends its tile data to a RAM accumulator buffer during game logic,
then a single ldirw flushes the whole batch before VBlank:
tile_upload_flush:
ld BC, (0x5630) ; BC = number of tiles queued this frame
sll 3, XBC ; * 8 words per tile (8 u16 = 16 bytes = 1 tile 8x8 2bpp)
lda XIX, (0xB010) ; XIX = VRAM tile write ptr (stored in RAM)
lda XIY, (0x4D70) ; XIY = RAM accumulator buffer base
ldw (0x5630), 0x0000 ; reset tile counter to 0
ldirw (XIX+),(XIY+) ; flush: copy accumulator -> VRAM in one pass
ret
Key addresses:
- 0x4D70 — RAM tile upload buffer (accumulator)
- 0x5630 — tile count queued this frame (reset to 0 after flush)
- 0xB010 — VRAM tile write pointer (stored in RAM, advances each frame)
When to use this pattern: This is appropriate when total unique tiles across all visible objects exceeds the available VRAM tile window, and objects change each frame. For static scenes or games with a fixed sprite set, a one-time VRAM upload at level load is simpler and faster.
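The accumulate-then-flush idea can be sketched in C as shown below. It is an illustration only: `tile_queue()`, `tile_flush()`, and the 32-tile window size are assumptions made for the example, and `memcpy` into an array stands in for the LDIRW copy into the VRAM window.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t  u8;
typedef uint16_t u16;

#define TILE_BYTES   16u   /* one 8x8 tile at 2bpp */
#define QUEUE_TILES  32u   /* illustrative window size, not from the title */

static u8  q_buf[QUEUE_TILES * TILE_BYTES];  /* RAM accumulator */
static u16 q_count;                          /* tiles queued this frame */

/* Append one tile; returns its index in the window, or -1 when full. */
static int tile_queue(const u8 *tile_data) {
    if (q_count >= QUEUE_TILES) return -1;
    memcpy(&q_buf[q_count * TILE_BYTES], tile_data, TILE_BYTES);
    return (int)q_count++;
}

/* Copy the whole batch to the VRAM window in one pass and reset the
   counter. Returns the number of bytes copied. */
static u16 tile_flush(u8 *vram_window) {
    u16 bytes = (u16)(q_count * TILE_BYTES);
    assert(vram_window != 0);
    if (bytes != 0u) memcpy(vram_window, q_buf, bytes);  /* skip empty copies */
    q_count = 0;
    return bytes;
}
```

The index returned by `tile_queue()` is what a renderer would add to the window's base tile number when it builds the OAM entry.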
Quick Reference¶
| Item | Address / Value | Notes |
|---|---|---|
| OAM base | 0x8800 | 64 sprites x 4 bytes |
| Palette indices | 0x8C00 | 64 bytes, 1 per sprite |
| OAM entry layout | tile / flags / x / y | byte order within 4-byte entry |
| Priority hidden | SPR_HIDE = 0x00 | bits 4-3 = 00 |
| Priority behind | SPR_BEHIND = 0x08 | bits 4-3 = 01 |
| Priority middle | SPR_MIDDLE = 0x10 | bits 4-3 = 10 |
| Priority front | SPR_FRONT = 0x18 | bits 4-3 = 11 |
| H-flip | SPR_HFLIP = 0x80 | bit 7 |
| V-flip | SPR_VFLIP = 0x40 | bit 6 |
| H-chain | SPR_HCHAIN = 0x04 | bit 2 |
| V-chain | SPR_VCHAIN = 0x02 | bit 1 |
| Tile index bit 8 | flag bit 0 | for tiles 256-511 |
| Shadow OAM RAM | 320 bytes | s_oam[256] + s_col[64] |
| OAM flush | LDIRW 128 words | in VBlank ISR only |
| Pal idx flush | LDIRW 32 words | with OAM flush |
| Free tile slots | 128-511 | 0-31 reserved, 32-127 sysfont |
| Max sprites HW | 64 | hard limit |
| Max visible colors / tile | 3 + transparent | 2bpp, index 0 = transparent |
| Character Over | HW_STATUS & 0x80 | too many sprites on one scanline |
See Also¶
- Hardware-Registers.md — OAM and palette index register addresses
- Game-Loop.md — VBlank ISR, OAM flush timing
- DMA.md — MicroDMA for sprite updates
- Tilemaps-and-Scrolling.md — Tile VRAM layout, tilemap pipeline
- Colors-and-Palettes.md — Palette RAM, shadow palette, palette effects
- Asset-Pipeline.md — PNG export, sprite bundling, tile tools
- Assembly.md — TLCS-900H assembly, LDIRW patterns