Usage Guide

This section explains how to use GATICC for compiling ONNX models, running inference on the FPGA, and performing CPU-based simulation.

GATICC works in three simple stages:

  1. Preprocess your input → produce a NumPy tensor

  2. Run inference (simulation or FPGA)

  3. Postprocess the output → get predictions, boxes, labels, etc.

GATICC does not perform preprocessing or postprocessing automatically.
Users must implement these parts depending on their model (classification or detection).

Basic Workflow

A typical workflow looks like:

ONNX Model  →  Preprocess Input (User code)
             →  GATICC Compile (int8 only for FPGA)
             →  Flash Bitstream
             →  GATICC Load & Run
             →  Postprocess Output (User code)

Preprocessing (User Responsibility)

GATICC expects input as a dictionary that maps each input name to a NumPy tensor:

{input_name: numpy_tensor}

Where:

  • input_name comes from the ONNX model (use gati.get_model_inputs())

  • numpy_tensor must match the model’s expected shape and dtype

  • GATICC expects input in 5D tensor format: (N, 1, C, H, W)
    This extra dimension is required by the internal architecture of GATI’s dataflow.
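
The shape rule above can be sketched as a small helper. This is a minimal illustration, not part of GATICC: the name to_gati_input is hypothetical, and it assumes the extra dimension always sits at axis 1 as the (N, 1, C, H, W) layout suggests. It leaves the dtype untouched, since that must match whatever the model expects.

```python
import numpy as np

def to_gati_input(input_name, tensor):
    """Wrap a tensor in the {input_name: array} dict with shape (N, 1, C, H, W).

    Hypothetical helper (not a GATICC function). Accepts (C, H, W),
    (N, C, H, W), or an already 5D (N, 1, C, H, W) array.
    """
    arr = np.asarray(tensor)
    if arr.ndim == 3:                 # (C, H, W) -> (1, C, H, W)
        arr = arr[np.newaxis, ...]
    if arr.ndim == 4:                 # (N, C, H, W) -> (N, 1, C, H, W)
        arr = arr[:, np.newaxis, ...]
    if arr.ndim != 5 or arr.shape[1] != 1:
        raise ValueError(f"expected (N, 1, C, H, W), got shape {arr.shape}")
    # A contiguous copy avoids passing strided views to the runtime.
    return {input_name: np.ascontiguousarray(arr)}
```

The returned dictionary is in the form described above, so it could be passed on as-is once the input name has been read from the model.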

Example: Classification Preprocess

import numpy as np
import cv2

def preprocess(img_path):
    img = cv2.imread(img_path)               # BGR, HWC, uint8
    if img is None:
        raise FileNotFoundError(img_path)
    img = cv2.resize(img, (224, 224))
    img = img.astype(np.float32) / 255.0     # scale to [0, 1]
    img = np.transpose(img, (2, 0, 1))       # HWC -> CHW
    img = img.reshape(1, 1, 3, 224, 224)     # (N, 1, C, H, W)
    return img

Example: Object Detection Preprocess

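def preprocess(img):
    # img: decoded BGR image as returned by cv2.imread, shape (H, W, 3)
    img = cv2.resize(img, (300, 300))
    img = img[..., ::-1] / 255.0             # BGR -> RGB, scale to [0, 1]
    img = (img - 0.5) / 0.5                  # normalize to [-1, 1]
    img = np.transpose(img, (2, 0, 1))       # HWC -> CHW
    img = img.reshape(1, 1, 3, 300, 300)     # (N, 1, C, H, W)
    return img.astype(np.float32)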

GATICC does not impose a fixed preprocessing style; only the final tensor shape must follow (N, 1, C, H, W).
Users may normalize, resize, or format inputs however their ONNX model requires.

Compiling INT8 ONNX Model for FPGA

FPGA inference requires INT8 quantized ONNX models.

import gati

gati.set_arch(ramsize=512, sa_arch="9,4,4", vasize=32, accbuf_size=4096, fcbuf_size=32768)
gati.compile("model_int8.onnx", "model.gml")  # pass additional arguments here if needed

The compiler produces a .gml file used by the FPGA runtime.

Flash Bitstream (FPGA Only)

gati.flash("gati_944.hex")

Load Model and Run Inference

name = gati.get_model_inputs("model_int8.onnx")[0]
gati.load("model_int8.onnx", "model.gml")

inp = preprocess("image.jpg")
out = gati.run({name: inp})

The output is always a list of tuples:

[("layer_name", numpy_array), ("layer_name", numpy_array), ...]
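
Because the result is a list of (name, array) tuples, looking up a specific output by name is often more convenient through a dictionary. The sketch below is one way to do that; outputs_to_dict is a hypothetical helper, and it assumes that layer names are unique within a single result.

```python
import numpy as np

def outputs_to_dict(outputs):
    """Convert [(layer_name, array), ...] into {layer_name: array}.

    Hypothetical helper; assumes layer names are unique in one result.
    """
    result = {}
    for layer_name, array in outputs:
        if layer_name in result:
            raise ValueError(f"duplicate output name: {layer_name}")
        result[layer_name] = np.asarray(array)
    return result
```

Iterating over the original list still preserves the order in which outputs were returned, which the dictionary form does not rely on.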

Users must apply postprocessing to convert raw tensors into:

  • class IDs (classification)

  • bounding boxes (object detection)

  • scores / probabilities

  • segmentation masks

  • etc.
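
As an illustration of the classification case, the sketch below turns raw logits into class IDs and probabilities. It assumes that the first tuple in the output list holds the logits and that the batch size is 1; both are assumptions about the model, not guarantees made by GATICC. The function names are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def top_k(outputs, k=5):
    """Return [(class_id, probability), ...] for the k most likely classes.

    Assumes outputs[0] is the (layer_name, logits) tuple and batch size 1.
    """
    _, logits = outputs[0]
    probs = softmax(np.asarray(logits, dtype=np.float32).reshape(-1))
    order = np.argsort(probs)[::-1][:k]      # indices sorted by descending probability
    return [(int(i), float(probs[i])) for i in order]
```

If the model already ends in a softmax layer, the softmax step should be dropped and the values ranked directly.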

Simulation Mode (FP32/INT8)

Simulation runs on the CPU only; no FPGA is required.

Simulation supports:

  • FP32 ONNX models

  • INT8 ONNX models

out = gati.sim("model_fp32.onnx", {name: inp})

Use this for debugging, comparison, and output verification.
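
One way to do such a comparison is to diff two result lists layer by layer, for example an FP32 simulation against an INT8 simulation or an FPGA run. The sketch below is only an illustration: compare_outputs is a hypothetical helper, the tolerance is arbitrary, and it assumes both lists contain the same outputs in the same order.

```python
import numpy as np

def compare_outputs(ref, test, atol=1e-2):
    """Compare two [(layer_name, array), ...] lists element by element.

    Returns [(layer_name, max_abs_diff, cosine_similarity, within_atol), ...].
    Assumes both lists hold the same outputs in the same order.
    """
    if len(ref) != len(test):
        raise ValueError("result lists have different lengths")
    report = []
    for (ref_name, ref_arr), (_, test_arr) in zip(ref, test):
        a = np.asarray(ref_arr, dtype=np.float64).ravel()
        b = np.asarray(test_arr, dtype=np.float64).ravel()
        if a.shape != b.shape:
            raise ValueError(f"size mismatch for output {ref_name}")
        max_abs = float(np.max(np.abs(a - b))) if a.size else 0.0
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        cosine = float(a @ b / denom) if denom > 0 else float("nan")
        report.append((ref_name, max_abs, cosine, max_abs <= atol))
    return report
```

Quantized outputs rarely match FP32 exactly, so a cosine similarity close to 1 is usually a more forgiving signal than the absolute difference.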

Example: Classification Usage

import numpy as np
import gati

def post(arr):
    logits = np.stack([x[1] for x in arr])
    return np.argmax(logits, axis=-1)

name = gati.get_model_inputs("mnist_int8.onnx")[0]
gati.set_arch(ramsize=512, sa_arch="9,4,4", vasize=32, accbuf_size=4096, fcbuf_size=32768)
gati.compile("mnist_int8.onnx", "mnist.gml")
gati.flash("gati_944.hex")
gati.load("mnist_int8.onnx", "mnist.gml")

inp = np.load("sample.npy")   # must already have shape (N, 1, C, H, W)
pred = post(gati.run({name: inp}))

print("Prediction:", pred)

Example: Object Detection Usage (Simplified)

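# Assumes the model is already compiled, flashed, and loaded as in the steps above.
img = preprocess(cv2.imread("dog.jpg"))   # the detection preprocess() takes a decoded image
name = gati.get_model_inputs("ssd.onnx")[0]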

outputs = gati.run({name: img})

# User-written decode + NMS here
boxes, labels, scores = decode(outputs)

(Full examples are included in the examples/ folder.)
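
The NMS step mentioned in the comment above is left to the user. As a rough starting point, here is a minimal greedy NMS in NumPy. It is a generic sketch, not the code shipped in examples/, and it assumes boxes are already decoded to (x1, y1, x2, y2) corner format with one score per box.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes: (M, 4) array of (x1, y1, x2, y2); scores: (M,) array.
    Returns the indices of the boxes to keep, highest score first.
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32).reshape(-1)
    x1, y1, x2, y2 = boxes.T
    areas = np.maximum(x2 - x1, 0) * np.maximum(y2 - y1, 0)
    order = np.argsort(scores)[::-1]         # candidates by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the kept box with every remaining candidate.
        iw = np.maximum(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0)
        ih = np.maximum(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0)
        inter = iw * ih
        union = areas[i] + areas[rest] - inter
        iou = np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)
        order = rest[iou <= iou_threshold]   # drop candidates that overlap too much
    return keep
```

For multi-class detectors, this would typically be applied once per class after filtering out low-confidence boxes.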

Important Notes

  • GATICC does not do preprocessing or postprocessing.
    Users must handle image normalization, resizing, decoding, NMS, etc.

  • The FPGA currently supports only the following INT8 operators:

    • QLinearConv (normal / pointwise / depthwise)

    • Relu, Clip

    • QGemm

    • Flatten

    • QLinearAdd/Sub/Mul

    • MaxPool, AveragePool

    • QLinearSigmoid

    • QLinearConcat

    • QLinearLeakyRelu

    • Tanh

    • Split

  • Simulation supports FP32 and INT8.

  • Input/Output format depends entirely on the model architecture.
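
Before compiling for the FPGA, it can help to check a model's node types against the operator list in these notes. The sketch below is an assumption-laden illustration: the set is transcribed from that list (with "QLinearAdd/Sub/Mul" expanded into three names), unsupported_ops is a hypothetical helper, and whether other node types are handled by the compiler is not stated in this guide.

```python
# Operator names transcribed from the FPGA support list in this guide.
FPGA_SUPPORTED_OPS = frozenset({
    "QLinearConv", "Relu", "Clip", "QGemm", "Flatten",
    "QLinearAdd", "QLinearSub", "QLinearMul",
    "MaxPool", "AveragePool",
    "QLinearSigmoid", "QLinearConcat", "QLinearLeakyRelu",
    "Tanh", "Split",
})

def unsupported_ops(op_types):
    """Return the sorted, de-duplicated op types that are not in the list."""
    return sorted(set(op_types) - FPGA_SUPPORTED_OPS)

# With the onnx package installed, the node types of a model can be gathered as:
#   import onnx
#   op_types = [node.op_type for node in onnx.load("model_int8.onnx").graph.node]
#   print(unsupported_ops(op_types))
```

An empty result only means that every node type appears in the list; it does not guarantee that a given attribute combination is supported.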

Example Files

The repository includes ready-to-run examples:

  • examples/classification_run.py

  • examples/classification_sim.py

  • examples/detection_sim.py

  • examples/detection_run.py

Users can copy these templates and plug in their own preprocessing and postprocessing code.

Note: Make sure to update the required paths in the example files.

More Usage & Full Command Reference

This guide covers the common workflow for compiling and running models.
However, GATICC provides many more runtime options, flags, and utilities
that users may need.

To view the full function reference, run:

gaticc --help

This will show detailed documentation for:

  • all available compiler flags

  • runtime options

  • model inspection commands

  • summary/info mode

  • architecture selection

  • dispatch options

  • debug/verbose options

  • explanations for each function