
Game Loop

VBlank sync, ISR design, watchdog, frame budget, state machine, and three reference pipeline architectures for NGPC games.

Note: All files use ASCII only (avoids encoding issues on Windows/PowerShell).


1. Frame Sync

The NGPC runs at ~60 Hz driven by the VBlank interrupt. All game logic must be frame-locked to this signal. Two patterns exist in production games.

1.1 halt-based sync

The CPU executes halt, which pauses execution until the next interrupt fires. The VBlank ISR sets a flag and returns; the main loop resumes at the instruction after the halt.

/* In the template (ngpc_vsync): */
void ngpc_vsync(void) {
    __asm("halt");  /* sleep until VBlank ISR runs */
    /* check shutdown after waking */
    if (*(volatile u8 *)0x6F85) ngpc_shutdown();
}

1.2 Flag-based sync

The VBlank ISR writes 1 to a RAM flag (0x4000) at the end of its work. The main loop polls this flag instead of using halt.

/* Flag-based style (allows non-VBL IRQs to fire during wait) */
void wait_vbl(void) {
    volatile u8 *sync = (volatile u8 *)0x4000;
    while (*sync == 0) {}   /* spin until the VBlank ISR sets the flag */
    *sync = 0;              /* reset for next frame */
}

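Advantage over halt: the wait ends only when the VBlank ISR sets the flag, so non-VBlank IRQs (Timer0, DMA completion) can be serviced during the wait without releasing the main loop early. The cost is that the loop spins instead of sleeping, so the CPU stays busy until the flag is set.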

1.3 Sync counters vs frame counter

Keep two distinct VBlank-counter roles:

  • A monotonic frame counter that is never reset — used for absolute timestamps and cooldowns (e.g. (u8)(now - event_time) >= cooldown).
  • Separate resettable sync counters for frame-pacing waits.

Do not reuse one counter for both purposes — resetting the timestamp base corrupts any cooldown that referenced it.
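The wrap-safe cooldown test from the first bullet can be sketched as below. The helper name cooldown_elapsed is illustrative and not part of the template, and the typedef is a host-side stand-in for the template's u8.

```c
#include <stdint.h>

typedef uint8_t u8;   /* stand-in for the template's u8 */

/* Returns 1 once at least `cooldown` frames have passed since `event_time`.
 * The u8 cast makes the subtraction wrap modulo 256, so the test stays
 * correct when the frame counter rolls over from 255 to 0.
 * Limitation: only intervals shorter than 256 frames can be measured. */
static u8 cooldown_elapsed(u8 now, u8 event_time, u8 cooldown)
{
    return (u8)((u8)(now - event_time) >= cooldown);
}
```

For example, an event stamped at frame 250 with a 10-frame cooldown is still cooling down at frame 3 (9 frames elapsed across the wrap) and has expired at frame 5 (11 frames elapsed).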

N-frame sync via double-sync: to resume cleanly at the start of a fresh frame, wait N VBlanks, reset the sync counter, then wait one more VBlank edge. The trailing single-edge wait guarantees execution resumes at the very start of a frame rather than partway through the one in which the Nth edge fired.

void wait_n_frames_aligned(u8 n) {
    u8 start = g_vb_counter;
    while ((u8)(g_vb_counter - start) < n) { HW_WATCHDOG = WATCHDOG_CLEAR; }
    /* re-align: wait one more fresh VBlank edge */
    u8 edge = g_vb_counter;
    while (g_vb_counter == edge) { HW_WATCHDOG = WATCHDOG_CLEAR; }
}

2. Minimal VBlank ISR

2.1 Mandatory operations

The VBlank ISR must never be disabled. Minimal mandatory content:

  1. Feed the watchdog: HW_WATCHDOG = 0x4E
  2. Check HW_USR_SHUTDOWN and request power-off if set

Template note: shutdown is handled in main loop context (ngpc_vsync()) to avoid calling BIOS from inside an ISR on certain hardware configurations.

2.2 Safe pattern

static volatile u8 g_vb_counter = 0;

void __interrupt isr_vblank(void) {
    /* 1. Feed watchdog immediately */
    *(volatile u8 *)0x006F = 0x4E;

    /* 2. Flush OAM shadow -> hardware (256 bytes) */
    /* (LDIRW: 128 u16 words, see Sprites and OAM) */

    /* 3. Push scroll register shadows */
    *(volatile u8 *)0x8032 = scr1_x_shadow;
    *(volatile u8 *)0x8033 = scr1_y_shadow;

    /* 4. Audio tick */
    audio_tick();

    /* 5. Frame counter */
    g_vb_counter++;
}

/* Minimal version if no OAM/scroll update needed: */
void __interrupt isr_vblank_minimal(void) {
    *(volatile u8 *)0x006F = 0x4E;
    g_vb_counter++;
}

2.3 Startup sequence

void main(void) {
    ngpc_init();            /* installs VBL ISR, inits hardware */
    ngpc_load_sysfont();    /* load ASCII glyphs to tile slots 32-127 */

    /* ... load assets, init game state ... */

    while (1) {
        ngpc_vsync();           /* halt until VBlank */
        ngpc_input_update();    /* snapshot joypad once per frame */
        game_update();
        render();
    }
}

Interrupt setup:

void init_interrupts(void) {
    __asm("di");
    *(u32 *)0x6FCC = (u32)isr_vblank;   /* VBL vector */
    *(u32 *)0x6FD4 = (u32)isr_hblank;  /* Timer0/HBlank (optional) */
    __asm("ei");
}

A common pattern: bulk-copy all 18 ISR pointers in one LDIRW from a ROM table to 0x6FB8..0x6FFC — simpler than 18 individual assignments.
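A minimal C sketch of that bulk copy, assuming the ROM table holds all 18 entries in vector order. The function name install_vectors and the ISR_VECTOR_COUNT macro are hypothetical; a commercial game would typically do this with a single LDIRW in assembly rather than a C loop.

```c
#include <stdint.h>

typedef uint8_t  u8;    /* stand-ins for the template's integer types */
typedef uint32_t u32;

#define ISR_VECTOR_COUNT 18   /* 0x6FB8..0x6FFC, 4 bytes per vector */

/* Copy a complete ROM vector table into the RAM vector area.
 * On hardware, `vectors` would be (volatile u32 *)0x6FB8 and the call
 * would sit between "di" and "ei" so no IRQ sees a half-written table. */
static void install_vectors(volatile u32 *vectors, const u32 *rom_table)
{
    u8 i;
    for (i = 0; i < ISR_VECTOR_COUNT; i++) {
        vectors[i] = rom_table[i];
    }
}
```

Unused vectors in the ROM table should still point at a valid handler (for example a stub that only returns), because every slot is overwritten.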


3. Watchdog & Blocking Loops

The NGPC hardware resets the CPU if the watchdog register is not written within ~100ms. The VBlank ISR (60 Hz) normally feeds it. In long blocking operations, feed it manually.

#define HW_WATCHDOG    (*(volatile u8 *)0x006F)
#define WATCHDOG_CLEAR 0x4E

/* Before any long operation */
HW_WATCHDOG = WATCHDOG_CLEAR;

Example — pool init loop (2952 iterations):

for (int i = 0; i < 0xB88; i++) {
    pool[i] = 0;
    *(volatile u8 *)0x006B = 0x4E;  /* watchdog kick every iteration */
}

Some commercial games use address 0x006B (byte 1 of the function at 0x006A). The official SDK uses 0x006F. Both addresses work on hardware. Use 0x006F in new code (SDK standard).

Sleep/wait-for-key loops also need watchdog:

void wait_frames(u8 n) {
    u8 start = g_vb_counter;
    while ((u8)(g_vb_counter - start) < n) {
        HW_WATCHDOG = WATCHDOG_CLEAR;
    }
}


4. Frame Budgets

4.1 CPU budget per frame

Item                       Value
CPU frequency              6.144 MHz
Cycles per frame (60 Hz)   ~102,400 cycles
VBlank window duration     ~3.94 ms = ~24,200 cycles
HBlank duration            ~5 us = ~30 cycles

Typical distribution:

VBlank ISR (watchdog + OAM flush + scroll + audio)  : ~2,000 cycles
Game logic (update + AI + physics)                  : ~50,000 cycles
VRAM updates (outside VBlank, via VRAMQ)            : ~20,000 cycles
Margin                                              : ~30,000 cycles

IRQ entry is not free. Each IRQ costs ≈ 80 cycles (prologue + branches + I/O + RETI) before the handler does any useful work. A per-scanline raster split (152 IRQs/frame) therefore burns ≈ 12,000 cycles/frame (~12% of the budget) on entry and exit alone, enough to drop a real cart to < 1 fps even though some emulators hide the cost. Prefer one split per frame for a HUD freeze; see Effects and Raster §6.6. Also arm a Timer0 split right after ngpc_vsync() (scanline ~0), before the variable-length game update; otherwise the scanline on which it fires jitters.
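The overhead arithmetic can be checked with a small helper. This is a sketch for budgeting on paper; the function names are hypothetical, and the ~80-cycle entry cost is the estimate quoted in this section, not a measured constant.

```c
#include <stdint.h>

typedef uint32_t u32;   /* stand-in for the template's u32 */

#define CYCLES_PER_FRAME 102400UL  /* 6.144 MHz / 60 Hz */

/* Cycles spent per frame on IRQ entry/exit alone (handler body excluded). */
static u32 irq_overhead_cycles(u32 irqs_per_frame, u32 cycles_per_irq)
{
    return irqs_per_frame * cycles_per_irq;
}

/* Same figure as a whole-number percentage of the frame budget (rounded down). */
static u32 irq_overhead_percent(u32 irqs_per_frame, u32 cycles_per_irq)
{
    return (irq_overhead_cycles(irqs_per_frame, cycles_per_irq) * 100UL)
           / CYCLES_PER_FRAME;
}
```

With 152 IRQs at 80 cycles each, irq_overhead_cycles returns 12,160, about 11.9% of the frame, which irq_overhead_percent rounds down to 11. A single split per frame costs 80 cycles.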

4.2 RAM budget

Zone                      Size
Total work RAM            12 KB (0x004000-0x006FFF)
Stack                     ~1 KB
Template globals          ~200 bytes
Audio driver state        ~500 bytes
Sprite/metasprite state   ~300 bytes
Available for game        ~9-10 KB

4.3 VBlank window budget

The VBlank window is ~24,200 cycles. Operations safe inside VBlank:

  • OAM flush (shadow -> 0x8800): 256 bytes = 128 LDIRW words = ~1,000 cycles
  • Scroll register push: 4 byte writes = ~20 cycles
  • Palette index flush (64 bytes): ~500 cycles
  • VRAMQ flush: variable, depends on pending commands

NEVER inside VBlank:

  • DMA operations (hardware watchdog power-off, confirmed by commercial game analysis)
  • Long computation loops without watchdog feeds

If raster DMA is active (channels streaming scroll registers per-scanline), stop all DMA channels before any VBlank work, then re-arm after. See DMA for the full stop/re-arm pattern.


5. State Machine Pattern

5.1 Basic enum state machine

typedef enum { STATE_TITLE, STATE_GAME, STATE_GAMEOVER } GameState;
static GameState s_state = STATE_TITLE;

void main(void) {
    GameState prev = STATE_GAMEOVER;  /* force init on first frame */

    ngpc_init();
    ngpc_load_sysfont();

    while (1) {
        ngpc_vsync();
        ngpc_input_update();

        /* Call _init() once on state entry */
        if (s_state != prev) {
            prev = s_state;
            switch (s_state) {
            case STATE_TITLE:    title_init();    break;
            case STATE_GAME:     game_init();     break;
            case STATE_GAMEOVER: gameover_init(); break;
            }
        }

        /* Call _update() every frame */
        switch (s_state) {
        case STATE_TITLE:    title_update();    break;
        case STATE_GAME:     game_update();     break;
        case STATE_GAMEOVER: gameover_update(); break;
        }
    }
}

5.2 State transitions and cleanup

Each _init() should:

  • Clear sprites: ngpc_sprite_hide_all()
  • Clear tilemaps: ngpc_gfx_clear(GFX_SCR1) and/or ngpc_gfx_clear(GFX_SCR2)
  • Reset scroll: ngpc_gfx_scroll(GFX_SCR1, 0, 0)
  • Load new assets, set new palettes

5.3 Object pool pattern

Fixed-size pools are recommended for bullets, particles, and enemies. No malloc, no fragmentation, deterministic timing on a 12 KB machine.

#define MAX_BULLETS 16
static Bullet s_bullets[MAX_BULLETS];
static u16    s_active_mask;   /* bit N = slot N is used */

/* Spawn: find first free slot */
s8 bullet_spawn(s16 x, s16 y) {
    u8 i;
    for (i = 0; i < MAX_BULLETS; i++) {
        if (!(s_active_mask & (1u << i))) {   /* 1u: keeps bit 15 out of the sign bit if int is 16 bits */
            s_active_mask |= (1u << i);
            s_bullets[i].x = x;
            s_bullets[i].y = y;
            return (s8)i;
        }
    }
    return -1;  /* pool full */
}

/* Update: iterate only active slots */
void bullets_update(void) {
    u8 i;
    for (i = 0; i < MAX_BULLETS; i++) {
        if (!(s_active_mask & (1u << i))) continue;   /* unsigned shift, safe at i = 15 */
        s_bullets[i].x += s_bullets[i].vx;
        if (s_bullets[i].x > 200) {
            s_active_mask &= ~(1u << i);  /* free slot */
        }
    }
}

6. Pipeline A — Map Streaming + Main Loop

A scrolling action-game architecture proven by reverse engineering a commercial side-scroller. Use this when the game uses large tilemap streaming.

6.1 VBlank ISR (short)

VBlank ISR (~2,000 cycles — stays short):
  1. HW_WATCHDOG = 0x4E                      feed watchdog
  2. LDIRW shadow_oam -> 0x8800              flush OAM (256 bytes = 128 words)
  3. LDIRW shadow_pal_idx -> 0x8C00          flush palette indices (64 bytes)
  4. write scroll regs from shadow values    0x8032 / 0x8033 / 0x8034 / 0x8035

6.2 Main loop

Main loop (game logic + streaming — uses remaining ~100,000 cycles):
  1. update input         snapshot joypad
  2. update camera_x/y   player physics / scroll logic
  3. clamp camera         between (min_x/y, max_x/y) map bounds
  4. MAP STREAMING X      if cam_x changed: load edge columns
  5. MAP STREAMING Y      if cam_y changed: load edge rows
  6. update scroll shadow cam_x & 0xFF, cam_y & 0xFF
  7. game logic           enemies, collision, SFX, score, etc.
  8. build shadow OAM     world -> screen projection for all entities
  9. halt                 wait for next VBlank

6.3 Critical rules

  • Map streaming is synchronous in main loop — this architecture never streams during VBlank.
  • Scroll registers are pushed from shadow values written in step 6.
  • OAM is built after streaming (step 8) because entities need the updated camera.
  • DMA is not used. LDIRW handles OAM and tilemap transfers.
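Steps 3 and 6 of the main loop (camera clamp, then scroll shadow update) can be sketched as two small helpers. The names camera_clamp and scroll_shadow are illustrative and not taken from the analysed game or the template.

```c
#include <stdint.h>

typedef uint8_t u8;    /* stand-ins for the template's integer types */
typedef int16_t s16;

/* Step 3: keep a camera coordinate inside the map bounds. */
static s16 camera_clamp(s16 v, s16 min_v, s16 max_v)
{
    if (v < min_v) return min_v;
    if (v > max_v) return max_v;
    return v;
}

/* Step 6: the scroll registers are 8-bit, so only the low byte of the
 * camera position is written to the shadow. */
static u8 scroll_shadow(s16 cam)
{
    return (u8)(cam & 0xFF);
}
```

A camera at x = 300 produces a scroll shadow of 44 (300 - 256). The full 16-bit camera value is still needed by the streaming steps to decide which map column or row to load.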

6.4 Representative Exact Call Order

A representative main loop with a watchdog panic guard after it (infinite jp T = safe hang on overrun):

main_loop:
          call vblank_wait    ; VBlank wait — polls frame counter > 0x1E
          call input_read     ; input — SWI call to BIOS joypad read
          call pool_reset     ; entity pool reset + budget counter = 0x1E (30)
          call window_reset   ; window register reset (160x152)
          call oam_build      ; shadow OAM builder — world->screen projection + clip
          call game_logic     ; (game logic — enemies, camera, collision)
          call scroll_update  ; scroll camera update — sub-pixel integration + clamp (SCR1+SCR2)
          call vblank_flush   ; VBlank flush — optional palette scatter, then LDIRW OAM+palette
          call game_state     ; (game state / scoring)
          call scroll_write   ; scroll register write — SCR1_X/Y -> 0x8032/0x8033, SCR2 -> 0x8034/0x8035
          call window_flush   ; window shadow flush — LDIRW shadow -> 0x8002-0x8005 (4 bytes)
          jp   main_loop      ; loop

Key notes from this order:

  • VBlank wait first (vblank_wait): halts until the frame counter increments.
  • Input immediately after VBlank (input_read): fresh joypad state before any logic.
  • Pool reset before OAM build (pool_reset): clears the slot counter and sets the entity budget.
  • OAM builder before game logic (oam_build before game_logic): the shadow is built from last frame's positions, then game logic mutates positions for the next frame. Sprite positions are one frame behind logic updates by design (double-buffer pattern).
  • Scroll camera update (scroll_update): sub-pixel integration + clamp for SCR1 and SCR2. Uses a mutex flag to prevent re-entry.
  • VBlank flush (vblank_flush): writes shadow OAM -> 0x8800 (128 words) and, conditionally, palette indices -> 0x8C00. Also writes PO.V (0x8021) = negated camera Y.
  • Scroll regs written late (scroll_write): after all position math is done. Writes SCR1 (0x8032/0x8033) and SCR2 (0x8034/0x8035) from the scroll context values.
  • Window flush last (window_flush): a 9-byte routine, LDIRW shadow -> 0x8002 with BC=2. Writes window right/bottom bounds and control flags from the RAM shadow.


7. Pipeline B — All Logic in VBlank

A different architecture: all game logic runs inside the VBlank ISR. The main loop is just an infinite halt loop.

7.1 VBlank ISR (all logic here)

VBL ISR — 8 steps, all logic here:
  1. BIOS WAIT_VBLANK call                    sync to K2GE signal
  2. joypad read + edge detection             just_pressed / just_released
  3. music timer update                       note_timer--, advance sequence if 0
  4. joypad snapshot                          prev_pad = cur_pad
  5. entity update loop                       ALL game logic: AI, physics, spawning
  6. OAM build                                shadow -> screen projection
  7. post-update / effects                    screen shake, palette flash, etc.
  8. OAM flush -> 0x8800                      LDIRW shadow OAM to hardware

Main loop:
  infinite loop: halt (wait for VBL)

Pipeline B vs Pipeline A:

  • Pipeline A: game logic in the main loop; the VBL ISR only flushes OAM + scroll registers
  • Pipeline B: game logic in the VBL ISR; the main loop is an infinite halt

Both approaches are valid. Template 2026 follows the Pipeline A model (main loop logic), which is safer for VBlank budget management.

7.2 Watchdog in long loops

Pipeline B implementations feed the watchdog every iteration of their pool init loop (2952 iterations) to avoid a reset during startup. See §3 for the full pattern.

7.3 Joypad Edge Detection

The VBL ISR reads the joypad and computes edge-triggered signals using three RAM bytes:

cur_pad     new raw joypad byte (updated each VBL)
prev_pad    copy of previous frame's raw byte
edge_mask   buttons pressed this frame only (rising edge)

ASM:

; A = new joypad read (from BIOS or hardware register)
; W = previous joypad value loaded before the update
ld  (prev_pad),W        ; prev_pad = old cur_pad
ld  (cur_pad),A         ; cur_pad  = new read
and W,A                 ; W = prev_pad & cur_pad  (bits held in both frames)
xor W,0xFF              ; W = ~(prev & cur)
and A,W                 ; rising = new & ~(prev & cur)
                        ;       = new & (new ^ prev) — bits that just went 1
ld  (edge_mask),A       ; store rising-edge mask

C equivalent:

u8 prev = cur_pad;                           /* last frame's raw byte */
u8 held, rising;
cur_pad = read_joypad();                     /* hardware or BIOS read */
held    = prev & cur_pad;                    /* bits set in both frames */
rising  = cur_pad & (u8)~held;               /* equivalent: cur & (cur ^ prev) */

rising has a 1 only for buttons that transitioned 0->1 this frame. Use cur_pad for held/continuous checks and rising for single-press actions.

7.4 ROM Dispatch Table Pattern

An animation state machine can call entity-type-specific functions via a ROM function pointer table. Each entry is 8 bytes wide and starts with a 4-byte far function pointer; the dispatch code below reads only that pointer.

; BC = state index (0, 1, 2, ...)
; BC *= 8 to get byte offset into table
sll    0x3, BC            ; BC <<= 3  (multiply by 8)
lda    XIY, dispatch_table ; XIY = base of ROM dispatch table
ld     XBC, (XIY+BC)      ; load far function pointer
call   T, XBC             ; call it unconditionally

C equivalent:

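typedef void (*EntityHandler)(void);
typedef struct { EntityHandler fn; u8 rest[4]; } DispatchEntry;  /* 8-byte stride: 4-byte pointer + 4 bytes not read by the dispatch */
extern const DispatchEntry dispatch_table[];   /* in ROM */
dispatch_table[state_index].fn();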

Key properties of this pattern:

  • Stride = 8 bytes per entry. A far pointer on TLCS-900H is 4 bytes (the ASM loads it into the 32-bit XBC), so the pointer fills only the first half of each entry.
  • No bounds check: the caller is responsible for keeping state_index in range
  • Table lives in ROM (const): no RAM overhead per entry
  • Compatible with animation states, screen states, enemy-type handlers, etc.
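Because the pattern has no bounds check, a debug build may want a guarded wrapper. The sketch below is one possible shape, not something found in the analysed game; dispatch_checked and the demo handlers are hypothetical, and a plain pointer array is used instead of the 8-byte-stride table for brevity.

```c
#include <stdint.h>

typedef uint8_t u8;   /* stand-in for the template's u8 */
typedef void (*EntityHandler)(void);

static u8 g_last_state;   /* demo only: records which handler ran */

static void state_idle(void) { g_last_state = 0; }
static void state_walk(void) { g_last_state = 1; }

/* Demo table; on hardware this would live in ROM. */
static const EntityHandler demo_table[] = { state_idle, state_walk };

/* Call table[index] only when index is in range.
 * Returns 1 if a handler ran, 0 if the index was rejected. */
static u8 dispatch_checked(const EntityHandler *table, u8 count, u8 index)
{
    if (index >= count) {
        return 0;   /* out of range: skip the call instead of jumping to garbage */
    }
    table[index]();
    return 1;
}
```

The check costs one compare and branch per dispatch, so it can be compiled out of release builds once the state indices are known to be valid.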

7.5 Conditional Return Guards

TLCS-900H supports conditional ret in 2 bytes. This pattern is used extensively as an early-exit guard at function entry:

; Sprite budget guard:
ld   E,(sprite_count)   ; E = sprite count
cp   E,0x40             ; compare with limit (64)
ret  GE                 ; return if E >= 64  (2 bytes, ~2 cycles)

; Direction guard:
cp   (XIX+0x11),0x0
ret  NZ                 ; return if direction != 0

; Range guard:
cp   WA,(XIX+0x14)
ret  Z                  ; return if already at max position

All common condition codes work: ret GE, ret GT, ret LE, ret LT, ret Z, ret NZ.

Rule: always use ret <cond> as a guard rather than a branch over the entire function body — it is 2 bytes vs 3-6 bytes for a jump, and reads as a precondition check at the call site.


8. Pipeline C — Flag Sync + DMA Stop/Re-arm

A third architectural variant (virtual-pet / menu engine style): the VBlank handler runs as a standard ISR but uses a RAM sync flag (0x4000) that the main loop waits on. Game logic stays in the main loop (like Pipeline A) while guaranteeing VBlank atomicity.

8.1 VBlank ISR (12 steps)

VBL ISR (0x0A62) — 12 steps:
  1.  ld (0x006F),0x4E               watchdog kick (BEFORE raster wait)
  2.  loop: cp (0x8009),0x98         wait for RAS_V >= 0x98 (VBlank start)
  3.  push bank3 registers
  4.  if DMA active (0x6F85 != 0):
        ld (0x007C..0x007F),0        stop DMA0V-DMA3V (shadow enables)
        ldc DMAC_0..DMAC_3, 0       clear all DMAC registers
        call re_arm_dma             re-arm DMA for next frame
  5.  swi 0xFFFF04                   BIOS WAIT_VBLANK (K2GE sync signal)
  6.  callback 1 (ptr at 0x44B0)     if flag set
  7.  callback 2 (ptr at vbl_callback) DMA dispatcher
  8.  callback 3 (ptr at 0x5056)
  9.  call main_game_logic            entity update, physics, etc.
  10. inc (0x456D)                    global frame counter
  11. ld (0x4000),1                   SET VBL sync flag (main loop waits for this)
  12. pop registers + reti

8.2 Main loop (flag-based sync)

/* Main loop waits for ISR to signal completion */
while (1) {
    /* Wait for VBL ISR to set sync flag */
    volatile u8 *sync = (volatile u8 *)0x4000;
    while (*sync == 0) {}
    *sync = 0;

    /* Post-VBL frame logic: input, per-frame updates */
    input_update();
    game_frame_update();
}

Advantage over halt: non-VBlank IRQs continue to fire during the wait, allowing Timer0/DMA events to be processed without interruption.

8.3 Stop-all-DMA sequence (CRITICAL)

This architecture stops all DMA channels before doing any VBlank work:

ld (0x007C),0    ; stop DMA0V (shadow enable -> DMA channel 0)
ld (0x007D),0    ; stop DMA1V
ld (0x007E),0    ; stop DMA2V
ld (0x007F),0    ; stop DMA3V
ldc DMAC_0,WA   ; clear DMAC0 (WA=0)
ldc DMAC_1,WA   ; clear DMAC1
ldc DMAC_2,WA   ; clear DMAC2
ldc DMAC_3,WA   ; clear DMAC3
call re_arm_dma ; re-arm for next frame

This confirms the NGPC hardware rule: active DMA during VBlank = watchdog power-off. The correct pattern is: stop -> VBL work -> re-arm (as implemented in Template 2026).

8.4 VBL Callback Pointer

Full ASM detail: see DMA §4.6.

This engine installs an indirect VBL callback at a fixed RAM address. The VBL ISR does not call any game function directly — it calls whatever function pointer is currently stored at that address. This allows the game to hot-swap the VBL action without modifying the ISR.

VBL ISR dispatch (inside ISR):

; VBL ISR calls the current callback indirectly
ld  xhl, (vbl_callback) ; load current callback pointer (far address)
call T, xhl             ; call it
reti

Install DMA re-arm callback:

; Store address of DMA re-arm routine as the VBL callback
ld  xhl, dma_rearm_fn
ld  (vbl_callback), xhl

Disable VBL callback (replace with stub return):

; Replace callback with a stub that just returns immediately
ld  xhl, vbl_stub_ret   ; address of: push/pop nothing + ret
ld  (vbl_callback), xhl

C equivalent:

typedef void (*VblCallback)(void);

/* Pointer to the fixed RAM slot that holds the current VBL callback */
extern VblCallback *vbl_fn_ptr;

/* Install */
*vbl_fn_ptr = dma_rearm_fn;

/* Disable */
*vbl_fn_ptr = vbl_nop;   /* void vbl_nop(void) {} */

Why this pattern:

  • The ISR code never changes; only the pointer changes.
  • DMA can be enabled mid-game by installing the re-arm callback, and disabled during level transitions or cutscenes by swapping to the stub, without touching ISR vectors.
  • Safe when the swap happens outside the VBL window (during game logic, not inside the ISR).

This is a simpler alternative to modifying interrupt vectors. Template 2026 achieves the same effect via the stop/re-arm wrapper in isr_vblank().


9. Pipeline Comparison

                    Pipeline A   Pipeline B   Pipeline C    Template 2026
Game logic        : main loop   VBL ISR      main loop     main loop
Frame sync        : halt        halt         flag 0x4000   halt (ngpc_vsync)
Stop DMA in VBL   : no (no DMA) no (no DMA)  YES (all ch)  YES (stop/re-arm)
Re-arm DMA        : -           -            YES            YES
Watchdog kick     : VBL ISR     VBL ISR      VBL ISR        VBL ISR
OAM flush         : LDIRW VBL   LDIRW in ISR LDIRW callback LDIRW VBL
DMA used          : no          no           YES (raster)   optional

Template 2026 follows the "main loop logic + halt" model, closest to Pipeline A / C, with stop/re-arm DMA conforming to the Pipeline C pattern.


10. Core Module API

10.1 ngpc_sys — System Init

void ngpc_init(void);           /* Call first. Installs VBL ISR, inits viewport. */
u8   ngpc_is_color(void);       /* 1 = NGPC Color, 0 = monochrome NGP */
void ngpc_shutdown(void);       /* Power off (BIOS call) */
void ngpc_load_sysfont(void);   /* Load BIOS font into tile RAM (slots 32-127) */
void ngpc_memcpy(dst, src, n);  /* Byte copy */
void ngpc_memset(dst, val, n);  /* Byte fill */

extern volatile u8 g_vb_counter; /* Frame counter, incremented at 60 Hz */

ngpc_init() installs the VBL ISR automatically. The ISR clears the watchdog, increments g_vb_counter, and flushes the VRAMQ. Shutdown requests are checked in main-loop context by ngpc_vsync() (see §2.1).

10.2 ngpc_vramq — Queued VRAM Updates

Queue tilemap/palette writes during gameplay, then VBlank flushes them safely.

void ngpc_vramq_init(void);
u8   ngpc_vramq_copy(dst, src, len_words);   /* Queue u16 block copy */
u8   ngpc_vramq_fill(dst, value, len_words); /* Queue u16 fill */
void ngpc_vramq_flush(void);                 /* Flush all pending commands */
void ngpc_vramq_clear(void);                 /* Drop pending commands */
u8   ngpc_vramq_pending(void);               /* Number of pending commands */
u8   ngpc_vramq_dropped(void);               /* Commands rejected (queue full) */

Notes:

  • len_words is in u16 units (not bytes).
  • Destination must be inside VRAM (0x8000-0xBFFF).
  • Queue size is fixed at VRAMQ_MAX_CMDS (default 16 commands).
  • ngpc_sys calls ngpc_vramq_flush() automatically each VBlank.
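To illustrate the idea behind a fixed-size command queue with a drop counter, here is a minimal host-runnable sketch. It is not the template's implementation: every demo_ name, the struct layout, and the meaning of the return value (1 = accepted, 0 = rejected) are assumptions made for the example.

```c
#include <stdint.h>

typedef uint8_t  u8;    /* stand-ins for the template's integer types */
typedef uint16_t u16;

#define DEMO_MAX_CMDS 16   /* mirrors the default VRAMQ_MAX_CMDS */

typedef struct {
    u16       *dst;        /* destination (VRAM on hardware) */
    const u16 *src;        /* source words */
    u16        len_words;  /* length in u16 units, not bytes */
} DemoCmd;

static DemoCmd s_cmds[DEMO_MAX_CMDS];
static u8      s_pending;
static u8      s_dropped;

/* Queue a copy; returns 1 if accepted, 0 if the queue was full. */
static u8 demo_queue_copy(u16 *dst, const u16 *src, u16 len_words)
{
    if (s_pending >= DEMO_MAX_CMDS) {
        s_dropped++;       /* count the rejection so it shows up in debug */
        return 0;
    }
    s_cmds[s_pending].dst = dst;
    s_cmds[s_pending].src = src;
    s_cmds[s_pending].len_words = len_words;
    s_pending++;
    return 1;
}

/* Run every pending copy in order, then empty the queue.
 * In a real engine this would be called once per VBlank. */
static void demo_queue_flush(void)
{
    u8  i;
    u16 w;
    for (i = 0; i < s_pending; i++) {
        for (w = 0; w < s_cmds[i].len_words; w++) {
            s_cmds[i].dst[w] = s_cmds[i].src[w];
        }
    }
    s_pending = 0;
}
```

The queue stores pointers rather than data, so source buffers must stay valid until the flush runs. A non-zero dropped count during development is a sign that the queue is too small for the frame's worst case.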

10.3 ngpc_timing — Timing

void ngpc_vsync(void);          /* Wait for next VBlank (~60 Hz) */
u8   ngpc_in_vblank(void);      /* Check if currently in VBlank */
void ngpc_sleep(u8 frames);     /* Pause N frames (feeds watchdog) */
void ngpc_cpu_speed(u8 div);    /* 0=6MHz, 1=3MHz, 2=1.5MHz, 3=768kHz, 4=384kHz */

10.4 ngpc_input — Joypad

void ngpc_input_update(void);         /* Call once per frame (after vsync) */

extern u8 ngpc_pad_held;              /* Buttons currently down */
extern u8 ngpc_pad_pressed;           /* Buttons just pressed this frame */
extern u8 ngpc_pad_released;          /* Buttons just released this frame */
extern u8 ngpc_pad_repeat;            /* Auto-repeat (for menu navigation) */

void ngpc_input_set_repeat(u8 delay, u8 rate); /* delay/rate in frames */

Button masks: PAD_UP, PAD_DOWN, PAD_LEFT, PAD_RIGHT, PAD_A, PAD_B, PAD_OPTION, PAD_POWER

/* React to a new button press only (not held) */
if (ngpc_pad_pressed & PAD_A) { /* fire! */ }

/* Menu navigation with auto-repeat (15f initial delay, then every 4f) */
ngpc_input_set_repeat(15, 4);
if ((ngpc_pad_pressed | ngpc_pad_repeat) & PAD_DOWN) { /* next item */ }
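How an auto-repeat with an initial delay and a repeat rate might work internally is sketched below for a single button. This is an assumption about the general technique, not the template's actual code; RepeatState and repeat_tick are hypothetical names, and both delay and rate are assumed to be at least 1.

```c
#include <stdint.h>

typedef uint8_t u8;   /* stand-in for the template's u8 */

typedef struct {
    u8 delay;   /* frames before the first repeat pulse (>= 1) */
    u8 rate;    /* frames between later pulses (>= 1) */
    u8 timer;   /* countdown to the next pulse */
    u8 armed;   /* 1 while the button has been held since the initial press */
} RepeatState;

/* Call once per frame with held != 0 while the button is down.
 * Returns 1 on frames where a repeat pulse should fire. The initial
 * press itself is not reported here; it comes from the pressed mask. */
static u8 repeat_tick(RepeatState *s, u8 held)
{
    if (!held) {
        s->armed = 0;           /* release cancels the repeat */
        return 0;
    }
    if (!s->armed) {            /* first held frame: start the initial delay */
        s->armed = 1;
        s->timer = s->delay;
        return 0;
    }
    if (--s->timer == 0) {      /* countdown expired: pulse and reload */
        s->timer = s->rate;
        return 1;
    }
    return 0;
}
```

With delay = 15 and rate = 4, a button pressed on frame 0 and held produces pulses on frames 15, 19, 23, and so on.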

Edge detection formula (confirmed from commercial game reverse engineering):

pressed  = (prev ^ cur) & cur;   /* rising edges */
released = (prev ^ cur) & prev;  /* falling edges */
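The two formulas can be wrapped as helpers and checked on sample values. The function names are illustrative; the masks are assumed to be active-high (1 = button down), as in the examples above.

```c
#include <stdint.h>

typedef uint8_t u8;   /* stand-in for the template's u8 */

/* Bits that are 1 now and were 0 last frame. */
static u8 edges_pressed(u8 prev, u8 cur)
{
    return (u8)((prev ^ cur) & cur);
}

/* Bits that were 1 last frame and are 0 now. */
static u8 edges_released(u8 prev, u8 cur)
{
    return (u8)((prev ^ cur) & prev);
}
```

With prev = 0x05 and cur = 0x06, pressed is 0x02 and released is 0x01; bit 2, which is down in both frames, appears in neither mask. The pressed result is also identical to the cur & ~(prev & cur) form shown in §7.3.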

10.5 ngpc_math — Math

s8   ngpc_sin(u8 angle);         /* Angle 0-255, returns -127..127 */
s8   ngpc_cos(u8 angle);         /* Same range */
void ngpc_rng_seed(void);        /* Seed RNG from frame counter */
u16  ngpc_random(u16 max);       /* LCG, returns 0..max (max <= 32767) */
s32  ngpc_mul32(s32 a, s32 b);   /* 32-bit signed multiply */

void ngpc_qrandom_init(void);    /* Shuffle table (call after rng_seed) */
u8   ngpc_qrandom(void);         /* Ultra-fast: table read + index increment */

Angles: 0=0 deg, 64=90 deg, 128=180 deg, 192=270 deg, 256 wraps to 0.

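ngpc_random limit: the generator extracts bits 16-30 of the LCG state, so the underlying draw is 15 bits wide and the result never exceeds 32767, whatever max is. To build a larger value from independent draws: ngpc_random(255) | (ngpc_random(127) << 8) gives 15 bits; use ngpc_random(255) for the high byte as well to get a full 16 bits.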
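The "bits 16-30" extraction can be illustrated with a generic LCG. This sketch is not the template's generator: the multiplier and increment are the well-known ANSI C sample values, chosen only for illustration, and demo_lcg_next is a hypothetical name. How the template scales the draw to max is not shown here.

```c
#include <stdint.h>

typedef uint16_t u16;   /* stand-ins for the template's integer types */
typedef uint32_t u32;

static u32 s_lcg_state = 1;   /* demo seed */

/* Advance the generator and return bits 16-30 of the new state (0..32767).
 * The low bits of a power-of-two LCG have short periods, which is why the
 * upper bits are used. Constants are illustrative, not the template's. */
static u16 demo_lcg_next(void)
{
    s_lcg_state = s_lcg_state * 1103515245UL + 12345UL;
    return (u16)((s_lcg_state >> 16) & 0x7FFF);
}
```

Because bit 31 is masked off, every draw fits in 15 bits, which matches the 0..32767 ceiling described above.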

For proper game logic (AI, procedural gen) use ngpc_random(). For non-critical randomness (particles, screen shake) use ngpc_qrandom(), which costs only a table read and an index increment.

10.6 ngpc_debug — CPU Profiler

void ngpc_debug_begin(void);                              /* Mark logic start */
void ngpc_debug_end(void);                                /* Mark logic end */
void ngpc_debug_draw_bar(plane, pal_ok, pal_warn, pal_over); /* Visual bar */
void ngpc_debug_print_pct(plane, pal, x, y);              /* Print "XX%" */
void ngpc_debug_print_fps(plane, pal, x, y);              /* Print "XXFPS" */
u8   ngpc_debug_get_lines(void);                          /* Raw scanlines used */
u8   ngpc_debug_get_pct(void);                            /* CPU usage 0-100+ */

Measures game logic time via the hardware raster position register HW_RAS_V. Bar turns: green (< 80%), yellow (80-100%), red (> 100% = frame overflow).

/* Typical game loop with profiler */
ngpc_vsync();
ngpc_input_update();
ngpc_debug_begin();
    game_update();
ngpc_debug_end();
ngpc_debug_draw_bar(GFX_SCR2, PAL_GREEN, PAL_YELLOW, PAL_RED);

Disable for release: #define NGPC_DEBUG 0 (all calls become no-ops).

10.7 ngpc_log — Ring Buffer Debug Log

void ngpc_log_init(void);
void ngpc_log_clear(void);
void ngpc_log_hex(const char *label, u16 value);
void ngpc_log_str(const char *label, const char *str);
void ngpc_log_dump(plane, pal, x, y);
u8   ngpc_log_count(void);

/* Convenience macros */
NGPC_LOG_HEX("PAD", ngpc_pad_held);
NGPC_LOG_STR("ST",  "RUN ");

Stores short entries in a fixed-size ring buffer (~288 bytes RAM). Useful on hardware where no serial/stdout is available.

10.8 ngpc_assert — Runtime Assert

NGPC_ASSERT(pointer != 0);

In debug builds: assertion failure shows an on-screen fault page and blinks the background in a loop. In release profile: compiled out entirely.

Toggle: #define NGP_PROFILE_RELEASE 1 before including headers.


11. Safe Rules

VRAM writes     : during VBlank window OR via ngpc_vramq. NOT during active render.
                  Active render VRAM access -> Character Over (graphics glitch).
OAM flush       : LDIRW shadow -> 0x8800 in VBlank ISR. Never write OAM mid-frame.
                  Flushing OAM in the main loop AFTER the VBlank sync (rather than
                  inside the ISR) keeps frame timing deterministic; either is valid.
Video registers : shadow ALL video registers in RAM, flush once per VBlank. Never
                  write a video register during game logic. Skip the BG_CTL write on
                  mono hardware; skip scroll writes when a DMA-driven camera owns them.
Scroll regs     : update shadow in main loop; push in VBL ISR. 8-bit, wraps at 256px.
DMA in VBlank   : FORBIDDEN. Active DMA + VBlank = watchdog power-off (hardware).
                  Pattern: stop all DMA -> VBL work -> re-arm DMA.
ISR must be     : short. All heavy work in main loop (Pipeline A / Template model).
                  Long ISR + audio tick = audio glitches and frame budget overflow.
Input snapshot  : call ngpc_input_update() ONCE per frame, after vsync.
                  Multiple calls per frame = inconsistent button state.
Watchdog        : feed at VBL ISR start. Also in any blocking loop > ~5ms.
Timer0 owner    : one user only. raster_chain, raster, and sprmux all compete.
                  ngpc_dma (MicroDMA) + sprmux can coexist: DMA on Timer0,
                  sprmux on Timer1 (ngpc_sprmux_flush_timer1()).
u16 counter     : NEVER use u8 for loop counter if iterations >= 256.
                  (OAM flush = 256 bytes -> u8 counter wraps to 0 = infinite loop.)
State cleanup   : on state transition, hide all sprites + clear tilemap.
                  Leftover sprites from previous state = garbage on screen.
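The u16 counter rule above can be sketched with a plain C copy loop. This is only an illustration of the counter width issue; the ISR itself uses LDIRW, and oam_copy and OAM_BYTES are hypothetical names.

```c
#include <stdint.h>

typedef uint8_t  u8;    /* stand-ins for the template's integer types */
typedef uint16_t u16;

#define OAM_BYTES 256

/* Copy the 256-byte OAM shadow with a u16 counter.
 * With `u8 i`, i++ after 255 wraps to 0, so `i < 256` would always be
 * true and the loop would never end. */
static void oam_copy(volatile u8 *dst, const u8 *src)
{
    u16 i;
    for (i = 0; i < OAM_BYTES; i++) {
        dst[i] = src[i];
    }
}
```

On hardware, dst would be (volatile u8 *)0x8800 and src the RAM shadow.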

Quick Reference

Item                    Details
VBL ISR vector          0x6FCC (32-bit ptr)
Watchdog register       0x006F, write 0x4E
Watchdog alt address    0x006B (some commercial games, not SDK)
Frame counter           g_vb_counter (volatile u8, 60 Hz)
Frame budget            ~102,400 cycles total
VBlank window           ~24,200 cycles
HBlank window           ~30 cycles
VBL sync flag pattern   Pipeline C: poll 0x4000, reset after
OAM flush               LDIRW shadow_oam -> 0x8800, 128 words
Scroll push             Write 0x8032/0x8033/0x8034/0x8035 in ISR
DMA rule                NEVER during VBlank
VRAMQ max cmds          16 (configurable)

See Also