Fixed-Point Math¶
Fixed-point arithmetic, precomputed lookup tables, and tile decompression for NGPC homebrew.
1. Fixed-Point Conventions¶
1.1 Formats Used¶
Fixed-point is the standard technique for sub-pixel positions and smooth velocities on NGPC
(no FPU, no float on TLCS-900H).
Common formats in NGPC homebrew:
| Format | Precision | Typical use |
|---|---|---|
| 8.8 (u16, 256ths) | 1/256 pixel per unit | Fine-grained position |
| 1.7 (u8, 128ths) | 1/128 pixel per unit | Compact position/velocity |
| 9.7 (u16, 128ths) | 1/128 pixel per unit | Most common: fits screen+off-screen |
Typical velocity range: 64..384 units/frame = 0.5..3 pixels/frame (at 128ths resolution).
1.2 Common Operations¶
/* Convert fixed-point to screen pixel (use the shift that matches your format) */
s16 screen_x = fixed_x >> 7; /* 128ths -> pixels */
/* s16 screen_x = fixed_x >> 8;    256ths -> pixels (alternative format) */
/* Move entity */
fixed_x += velocity_x; /* velocity in same units */
/* Collision: compare in fixed-point units, not pixels */
s16 dx = (s16)(a_x - b_x); /* both in fixed units */
if (dx < 0) dx = -dx;
if (dx < HITBOX_W << 7) { /* hit */ }
/* Integer divide (power-of-two divisor = shift) */
s16 half = x >> 1; /* x / 2 */
s16 sixth = x / 6; /* general division: cc900 calls C9H_divlu (software) */
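The operations above can be wrapped in a few helpers. The following is a minimal sketch for the 9.7 format only; the names `FX_SHIFT`, `fx_from_px`, `fx_to_px`, `fx_abs_delta`, and `fx_overlaps_1d` are hypothetical and not part of any NGPC library, and the `stdint.h` typedefs stand in for the project's own `u16`/`s16` types.

```c
#include <stdint.h>

typedef int16_t  s16;
typedef uint16_t u16;

#define FX_SHIFT 7   /* 7 fractional bits: 1 pixel = 128 units */

/* Whole pixels -> 9.7 fixed-point units (valid for px < 512). */
static u16 fx_from_px(u16 px) { return (u16)(px << FX_SHIFT); }

/* 9.7 fixed-point units -> whole pixels (fraction discarded). */
static u16 fx_to_px(u16 fx) { return (u16)(fx >> FX_SHIFT); }

/* Absolute distance between two positions, in fixed-point units. */
static u16 fx_abs_delta(u16 a, u16 b)
{
    s16 d = (s16)(a - b);
    return (u16)(d < 0 ? -d : d);
}

/* 1 if two positions are closer than hitbox_px pixels on one axis. */
static int fx_overlaps_1d(u16 a, u16 b, u16 hitbox_px)
{
    return fx_abs_delta(a, b) < fx_from_px(hitbox_px);
}
```

Keeping the comparison in fixed-point units, as `fx_overlaps_1d` does, avoids losing the sub-pixel fraction before the test.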
1.3 Overflow and Promotion¶
- u16 * u16 with the same type → hardware MUL (fast).
- u16 * u8 mixed → software C9H_mullu (slow-ish).
- If the result can exceed u16, cast to u32 before multiplying:
/* Safe: intermediate result in u32 */
u32 r = (u32)a * b;
s16 px = (s16)(r >> 7);
/* WRONG: overflow if a * b > 65535 */
s16 px = (s16)((a * b) >> 7);
Never use float: there is no FPU. Every floating-point operation would call a software emulation library that does not exist on NGPC.
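To make the promotion rule concrete, here is a small sketch of a fixed-point multiply in 128ths. `fx_mul` and `fx_mul_wrapped` are hypothetical names; the second function only emulates what a 16-bit intermediate would produce so the difference can be seen on any host compiler.

```c
#include <stdint.h>

typedef uint16_t u16;
typedef uint32_t u32;

/* Multiply two values in 128ths; the result is also in 128ths. */
static u16 fx_mul(u16 a, u16 b)
{
    u32 wide = (u32)a * b;          /* promote before multiplying */
    return (u16)(wide >> 7);
}

/* Same multiply, but with the product truncated to 16 bits first,
 * as happens when the intermediate is not promoted. */
static u16 fx_mul_wrapped(u16 a, u16 b)
{
    u16 narrow = (u16)((u32)a * b); /* keep only the low 16 bits */
    return (u16)(narrow >> 7);
}
```

For 3.0 × 1.5 (384 × 192 in 128ths) the product is 73728, which does not fit in 16 bits: `fx_mul` yields 576 (4.5 pixels), while the wrapped version yields 64 (0.5 pixels).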
2. Math API¶
2.1 Trigonometry¶
s8 ngpc_sin(u8 angle); /* sine lookup, returns -127..127 */
s8 ngpc_cos(u8 angle); /* cosine lookup, same range */
Angle encoding: 8-bit circle — 0-255 maps to 0°-360° (wraps).
| Value | Degrees |
|---|---|
| 0 | 0° |
| 64 | 90° |
| 128 | 180° |
| 192 | 270° |
| 256 | wraps to 0 |
Usage:
/* Move in direction `angle` at about 1 pixel/frame, assuming positions are in 128ths (table peaks at +/-127) */
entity_x += ngpc_cos(angle);
entity_y += ngpc_sin(angle);
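For speeds other than the table amplitude, the lookup result can be scaled. The sketch below is one possible approach, not a library function: `scale_by_speed` is a hypothetical helper that treats the sample as a fraction of 128 (so it undershoots by under 1%) and works on the magnitude to avoid right-shifting a negative value. It assumes `speed` is in 128ths of a pixel per frame and stays at or below 32767.

```c
#include <stdint.h>

typedef int8_t   s8;
typedef int16_t  s16;
typedef uint16_t u16;
typedef uint32_t u32;

/* Scale a sine/cosine sample (-127..127) by a speed in 128ths/frame.
 * On target, hypothetical usage would be:
 *   entity_x += scale_by_speed(ngpc_cos(angle), speed);
 *   entity_y += scale_by_speed(ngpc_sin(angle), speed);
 */
static s16 scale_by_speed(s8 unit, u16 speed)
{
    u16 mag    = (u16)(unit < 0 ? -unit : unit);       /* 0..127 */
    u16 scaled = (u16)(((u32)mag * speed) >> 7);       /* mag/128 * speed */
    return unit < 0 ? (s16)(-(s16)scaled) : (s16)scaled;
}
```

The multiply is done in u32 so that large speeds do not overflow the intermediate.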
2.2 Random Number Generation¶
void ngpc_rng_seed(void); /* seed from VBlank counter */
u16 ngpc_random(u16 max); /* intended range 0..max (LCG); see the cc900 caveat below */
void ngpc_qrandom_init(void); /* shuffle pre-built table (call after rng_seed) */
u8 ngpc_qrandom(void); /* ultra-fast table read, returns 0..255 */
ngpc_random() is a u32 LCG. It extracts bits 16-30 of the state (a 15-bit
value, 0..32767) and is meant to reduce that to 0..max with
result % ((u32)max + 1). On cc900 this step is broken (the u32 modulo runtime
helper miscompiles), so in practice the returned value stays in 0..32767
regardless of max.
Consequence — hardware confirmed:
- (u8)ngpc_random(6) → value ≥ 5 in ~98% of calls (expected 2/7)
- (u8)ngpc_random(1) → random 0..255 instead of 0..1
- Any gameplay roll using ngpc_random(max) with small max will ignore max
Rule: never rely on ngpc_random(max) for a bounded gameplay value.
Use ngpc_qrandom() + u8 modulo instead — cc900 handles u8 arithmetic cleanly:
/* Bounded roll, correct on cc900 */
u8 roll = ngpc_qrandom() % 7u; /* 0..6 */
u8 bonus = ngpc_qrandom() % attack_pow; /* 0..attack_pow-1 */
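ngpc_qrandom() is nearly free (table index increment + read). The pre-shuffle
is done with ngpc_random internally so the permutation is biased, but the
output is still a permutation over 0..255, so every byte value appears exactly
once per full pass of the table. qrandom() % N is therefore exactly uniform
when N divides 256; for other N the residue counts differ by at most one per
pass (for example, % 7 yields each of 0..3 37 times and each of 4..6 36 times).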
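How evenly `% N` spreads over one full pass of a 0..255 permutation can be checked by counting residues. The helper below is a host-side sketch for that check only; `residue_counts` is a hypothetical name and is not part of the library.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Count how often each residue 0..n-1 occurs when every byte value
 * 0..255 is reduced modulo n. `counts` must hold n entries; n >= 1. */
static void residue_counts(u8 n, u16 *counts)
{
    u16 v;
    u8 i;
    for (i = 0; i < n; i++) {
        counts[i] = 0;
    }
    for (v = 0; v < 256; v++) {
        counts[v % n]++;
    }
}
```

For n = 8 every residue occurs 32 times; for n = 7 the counts are 37 for residues 0..3 and 36 for residues 4..6, which is usually negligible for gameplay rolls.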
For wide ranges where a single qrandom() call isn't enough, combine two:
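A minimal sketch of one way to do that, under stated assumptions: `stub_qrandom` is a host-side stand-in for `ngpc_qrandom()` (it simply steps through all byte values with an odd stride), and `qrandom16` / `qrandom_range16` are hypothetical helper names. It also assumes u16 modulo behaves correctly on cc900, which should be verified on hardware since only u8 arithmetic is confirmed clean here.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Stand-in for ngpc_qrandom(): visits every byte once per 256 calls. */
static u8 stub_state;
static u8 stub_qrandom(void)
{
    stub_state = (u8)(stub_state + 101u);
    return stub_state;
}

/* Combine two byte reads into one 16-bit value (high byte first). */
static u16 qrandom16(void)
{
    u16 hi = stub_qrandom();
    u16 lo = stub_qrandom();
    return (u16)((hi << 8) | lo);
}

/* Bounded roll 0..n-1 for 1 <= n <= 65535. */
static u16 qrandom_range16(u16 n)
{
    return (u16)(qrandom16() % n);
}
```

Consecutive table reads are not independent, so the two bytes are correlated; this is generally acceptable for gameplay rolls but not for anything that needs statistical quality.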
2.3 32-bit Multiply¶
Useful when both operands may be large and the result needs full 32-bit range.
Note: on cc900, s16 * s16 calls C9H_mulls (software). u16 * u16 (same width) uses the hardware MUL opcode. For best performance, keep operands the same width.
3. Lookup Table Math¶
u8 ngpc_lut_atan2(s8 dx, s8 dy); /* angle in 0-255 format */
u8 ngpc_lut_sqrt16(u16 n); /* integer sqrt, returns 0-255 */
u16 ngpc_lut_dist(s8 dx, s8 dy); /* approx distance, ~4% error */
u16 ngpc_lut_div(u16 n, u8 divisor); /* fast division via reciprocal multiply */
All use fixed-point or integer arithmetic. Zero FPU. Minimal CPU cost.
Usage example — aim a bullet at a target:
s8 dx = (s8)(target_x - bullet_x);
s8 dy = (s8)(target_y - bullet_y);
u8 angle = ngpc_lut_atan2(dx, dy); /* 0-255 angle toward target */
bullet_vx = ngpc_cos(angle);
bullet_vy = ngpc_sin(angle);
Use ngpc_lut_dist() for range checks instead of sqrt(). For AABB overlap, avoid distance entirely and compare rect edges directly (see Collision).
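For intuition on how a distance estimate can avoid sqrt() entirely, the sketch below uses the well-known "alpha max plus beta min" approximation with integer coefficients (246/256 and 102/256), which has roughly 4% worst-case error. Whether `ngpc_lut_dist()` works this way is not confirmed by this page; `approx_dist` and `in_range` are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef int8_t   s8;
typedef uint16_t u16;

/* Approximate sqrt(dx*dx + dy*dy) as 0.961*max + 0.398*min. */
static u16 approx_dist(s8 dx, s8 dy)
{
    u16 ax = (u16)(dx < 0 ? -dx : dx);
    u16 ay = (u16)(dy < 0 ? -dy : dy);
    u16 hi = ax > ay ? ax : ay;
    u16 lo = ax > ay ? ay : ax;
    /* 246/256 ~= 0.961, 102/256 ~= 0.398; sum stays below 65536 */
    return (u16)((hi * 246u + lo * 102u) >> 8);
}

/* Range check: 1 if the target is closer than `radius` pixels. */
static int in_range(s8 dx, s8 dy, u16 radius)
{
    return approx_dist(dx, dy) < radius;
}
```

Because the estimate can read a few percent low (an axis-aligned distance of 100 comes out as 96), leave a small margin in gameplay radii.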
4. BCD Conversion¶
Binary-coded decimal is the cheap way to render scores, timers, and counters as
digits without per-frame division (which calls the slow C9H_divlu helper).
4.1 Binary to Packed BCD (no division)¶
Convert a 32-bit binary value to 8-digit packed BCD using a power-of-10 table and
repeated subtract-and-count per digit — no division required. For each decimal
place, subtract the corresponding power of 10 while it fits, counting the
subtractions to get that digit, then shift the digit into the result with
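sla 4,XWA (shift the accumulator left one nibble). Each digit costs at most 9
subtractions, so with the loop as written the worst case for a full 8-digit
value (99,999,999) is 72 subtractions, or 63 if the ones digit is taken
directly from the remainder.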
for each power-of-10 P (10^7 down to 10^0):
    digit = 0
    while value >= P: value -= P; digit++    ; count subtractions
    result = (result << 4) | digit           ; sla 4,XWA then OR in the nibble
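The pseudocode above translates to C roughly as follows. This is a sketch, not the library's routine: `bin_to_bcd8` and `pow10_table` are hypothetical names, and the clamp to 99,999,999 is an added assumption so that a larger 32-bit input cannot produce a digit above 9.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint32_t u32;

static const u32 pow10_table[8] = {
    10000000UL, 1000000UL, 100000UL, 10000UL, 1000UL, 100UL, 10UL, 1UL
};

/* Convert a binary value to 8-digit packed BCD without division. */
static u32 bin_to_bcd8(u32 value)
{
    u32 result = 0;
    u8 i;

    if (value > 99999999UL) {
        value = 99999999UL;            /* clamp: only 8 digits fit */
    }
    for (i = 0; i < 8; i++) {
        u8 digit = 0;
        while (value >= pow10_table[i]) {
            value -= pow10_table[i];   /* subtract-and-count */
            digit++;
        }
        result = (result << 4) | digit; /* shift in the next nibble */
    }
    return result;
}
```

Each nibble of the result is one decimal digit, most significant first, so the value can be rendered by masking and shifting alone.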
4.2 Byte-Level BCD Helpers¶
For single packed-BCD bytes (two digits):
/* packed BCD byte -> binary 0..99 */
u8 bcd_to_bin(u8 bcd) { return (u8)((bcd >> 4) * 10 + (bcd & 0x0F)); }
/* binary 0..99 -> packed BCD byte */
u8 bin_to_packed(u8 v) { return (u8)(((v / 10) << 4) | (v % 10)); } /* tens in high nibble, ones in low */
5. Tile Decompression¶
5.1 Runtime Decompression API¶
/* RLE — simple, fast, ~2:1 ratio */
u16 ngpc_rle_decompress(void *dst, const void NGP_FAR *src, u16 src_len);
void ngpc_rle_to_tiles(const void NGP_FAR *src, u16 src_len, u16 tile_offset);
/* LZ77/LZSS — better ratio, ~3:1 to 4:1 */
u16 ngpc_lz_decompress(void *dst, const void NGP_FAR *src, u16 src_len);
void ngpc_lz_to_tiles(const void NGP_FAR *src, u16 src_len, u16 tile_offset);
The _to_tiles functions use a 2 KB internal buffer (~128 tiles maximum per call).
For larger tilesets, call in chunks with increasing tile_offset.
Usage example:
/* Load compressed tileset at tile slot 96 */
extern const u8 NGP_FAR level1_tiles_lz[]; /* compressed, in ROM */
extern const u16 level1_tiles_lz_len;
ngpc_lz_to_tiles(level1_tiles_lz, level1_tiles_lz_len, 96);
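For tilesets above the per-call limit, one option is to split the tileset offline into separately compressed chunks and load them in sequence. The sketch below assumes that layout; `TileChunk`, `TILES_PER_CALL`, and `load_chunks` are hypothetical, and `stub_lz_to_tiles` is a host-side stand-in for `ngpc_lz_to_tiles()` that only records its arguments (the `NGP_FAR` qualifier is omitted for the same reason).

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

#define TILES_PER_CALL 128u  /* 2 KB buffer / 16 bytes per 8x8 2bpp tile */

typedef struct {
    const u8 *data;      /* compressed chunk */
    u16 len;             /* compressed size in bytes */
    u16 tile_count;      /* tiles this chunk expands to, <= TILES_PER_CALL */
} TileChunk;

/* Stand-in for ngpc_lz_to_tiles(): records the call instead of decoding. */
static u16 stub_calls;
static u16 stub_last_offset;
static void stub_lz_to_tiles(const void *src, u16 src_len, u16 tile_offset)
{
    (void)src;
    (void)src_len;
    stub_calls++;
    stub_last_offset = tile_offset;
}

/* Load each chunk at an increasing tile slot; returns the next free slot. */
static u16 load_chunks(const TileChunk *chunks, u8 count, u16 first_slot)
{
    u16 slot = first_slot;
    u8 i;
    for (i = 0; i < count; i++) {
        stub_lz_to_tiles(chunks[i].data, chunks[i].len, slot);
        slot = (u16)(slot + chunks[i].tile_count);
    }
    return slot;
}
```

On target, the stub would be replaced by the real `ngpc_lz_to_tiles()` call with one compressed array per chunk.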
5.2 Offline Compression Tool¶
The ngpc_compress tool compresses raw tile binary data:
# LZ77 (default) — emits <name>_lz[] + <name>_lz_len
ngpc_compress tiles.bin -o tiles_lz.c -n level1_tiles --header
# RLE — emits <name>_rle[] + <name>_rle_len
ngpc_compress tiles.bin -o tiles_rle.c -m rle -n level1_tiles --header
# Auto-pick smallest
ngpc_compress tiles.bin -o tiles_best.c -m both -n level1_tiles --header
Generated symbols:
| Symbol | Meaning |
|---|---|
| <name>_lz[] / <name>_rle[] | Compressed data array |
| <name>_lz_len / <name>_rle_len | Compressed size in bytes; pass to runtime functions |
| <name>_raw_len | Decompressed size (informational) |
5.3 Compression Constraints and Notes¶
- Input must be a raw binary file (.bin or any byte stream, not a PNG).
- The tool verifies roundtrip integrity by default (compress → decompress → compare).
- Naming convention: _lz (not _lz77), _rle.
- Use the tilemap tool's --tiles-bin flag to generate the raw binary input:
ngpc_tilemap level.png -o level.c -n level --header --tiles-bin level_tiles.bin
# then compress:
ngpc_compress level_tiles.bin -o level_tiles_lz.c -n level_tiles --header
- _to_tiles functions use a 2 KB internal buffer, so the maximum is ~128 tiles per call. For tilesets > 128 tiles, decompress in two or more calls with offset stepping.
Quick Reference¶
| Item | Value / Pattern |
|---|---|
| Fixed-point 128ths | >> 7 to get pixels |
| Fixed-point 256ths | >> 8 to get pixels |
| Typical velocity | 64..384 units/frame (128ths) = 0.5..3 px/frame |
| Safe u16×u16 | Same-width cast: (u16)a * (u16)b → HW MUL |
| Safe large multiply | (u32)a * b before shifting down |
| Angle range | 0-255 → 0-360° (64=90°, 128=180°, 192=270°) |
| ngpc_sin/cos range | Returns -127..127 |
| ngpc_random range | Broken on cc900: ignores max, returns 0..32767. Use ngpc_qrandom() % N for bounded rolls |
| ngpc_qrandom | Zero-cost: table read, 0-255. Safe to % N for any N |
| ngpc_lut_atan2 | Returns 0-255 angle toward target |
| ngpc_lut_dist | ~4% error, no sqrt |
| ngpc_lut_sqrt16 | Integer sqrt of u16 |
| LZ77 ratio | ~3:1 to 4:1 |
| RLE ratio | ~2:1 |
| _to_tiles buffer | 2 KB = ~128 tiles per call max |
| <name>_lz_len | Compressed size; pass to ngpc_lz_to_tiles() |
| No float | Zero FPU on TLCS-900H — never use float / double |
| Division by 2^n | Use right-shift (>> n) — much faster than / |
See Also¶
- Collision: AABB overlap (no sqrt needed), dist² comparison
- Asset Pipeline: Full tilemap export commands, --tiles-bin flag
- Build Toolchain: cc900 MUL/DIV codegen, C9H_mulls/divlu, overflow rules
- Game Loop: Frame budget for math-heavy code