Usage Guide
===========

This section explains how to use **GATICC** for compiling ONNX models, running inference on the FPGA, and performing CPU-based simulation.

GATICC works in three simple stages:

1. **Preprocess your input** → produce a NumPy tensor
2. **Run inference** (simulation or FPGA)
3. **Postprocess the output** → get predictions, boxes, labels, etc.

GATICC does *not* perform preprocessing or postprocessing automatically. Users must implement these parts depending on their model (classification or detection).

Basic Workflow
--------------

A typical workflow looks like:

.. code-block:: text

   ONNX Model
     → Preprocess Input (User code)
     → GATICC Compile (int8 only for FPGA)
     → Flash Bitstream
     → GATICC Load & Run
     → Postprocess Output (User code)

Preprocessing (User Responsibility)
-----------------------------------

GATICC expects input as a dictionary:

.. code-block:: python

   {input_name: numpy_tensor}

Where:

- ``input_name`` comes from the ONNX model (use ``gati.get_model_inputs()``)
- ``numpy_tensor`` must match the model’s expected shape and dtype
- **GATICC expects input in 5D tensor format:** ``(N, 1, C, H, W)``

This extra dimension is required by the internal architecture of GATI’s dataflow.

**Example: Classification Preprocess**

.. code-block:: python

   import numpy as np
   import cv2

   def preprocess(img_path):
       img = cv2.imread(img_path)                  # HWC, BGR, uint8
       img = cv2.resize(img, (224, 224))
       img = img.astype(np.float32) / 255.0        # scale to [0, 1]
       img = np.transpose(img, (2, 0, 1))          # CHW
       img = img[np.newaxis, np.newaxis, ...]      # (1, 1, C, H, W)
       return img

**Example: Object Detection Preprocess**

.. code-block:: python

   import numpy as np
   import cv2

   def preprocess(img):
       img = cv2.resize(img, (300, 300))
       img = img[..., ::-1] / 255.0                # BGR -> RGB, scale to [0, 1]
       img = (img - 0.5) / 0.5                     # normalize to [-1, 1]
       img = np.transpose(img, (2, 0, 1))          # CHW
       img = img.reshape(1, 1, 3, 300, 300)        # (N, 1, C, H, W)
       return img.astype(np.float32)

GATICC does **not** impose a fixed preprocessing style; only the final tensor shape must follow ``(N, 1, C, H, W)``. Users may normalize, resize, or format inputs however their ONNX model requires.

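
The shape requirement above can be checked before a tensor is passed to GATICC. The following is a minimal sketch, not part of the GATICC API: the helper name ``to_gati_layout`` is hypothetical, and the default ``float32`` dtype is an assumption that should be changed to whatever the model actually expects.

```python
import numpy as np

def to_gati_layout(x, dtype=np.float32):
    """Return x as a contiguous array shaped (N, 1, C, H, W).

    Hypothetical helper (not provided by GATICC). Accepts CHW, NCHW,
    or already-5D input; anything else raises ValueError.
    """
    x = np.asarray(x)
    if x.ndim == 3:                      # (C, H, W) -> (1, 1, C, H, W)
        x = x[np.newaxis, np.newaxis, ...]
    elif x.ndim == 4:                    # (N, C, H, W) -> (N, 1, C, H, W)
        x = x[:, np.newaxis, ...]
    elif x.ndim == 5:
        if x.shape[1] != 1:
            raise ValueError(f"axis 1 must have size 1, got shape {x.shape}")
    else:
        raise ValueError(f"expected a 3D, 4D, or 5D array, got {x.ndim}D")
    # dtype is an assumption; use the dtype the ONNX model declares.
    return np.ascontiguousarray(x, dtype=dtype)

# A single CHW image and a batch of two NCHW images.
single = to_gati_layout(np.zeros((3, 224, 224)))
batch = to_gati_layout(np.zeros((2, 3, 300, 300)))
print(single.shape, batch.shape)
```

Calling such a helper as the last step of a preprocessing function makes a layout mistake fail early with a clear message, before the tensor reaches the runtime.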

Compiling INT8 ONNX Model for FPGA
----------------------------------

FPGA inference requires **INT8 quantized ONNX models**.

.. code-block:: python

   import gati

   gati.set_arch(ramsize=512, sa_arch="9,4,4", vasize=32,
                 accbuf_size=4096, fcbuf_size=32768)

   # Further arguments can be passed to compile() if needed.
   gati.compile("model_int8.onnx", "model.gml")

The compiler produces a ``.gml`` file used by the FPGA runtime.

Flash Bitstream (FPGA Only)
---------------------------

.. code-block:: python

   gati.flash("gati_944.hex")

Load Model and Run Inference
----------------------------

.. code-block:: python

   name = gati.get_model_inputs("model_int8.onnx")[0]
   gati.load("model_int8.onnx", "model.gml")

   inp = preprocess("image.jpg")
   out = gati.run({name: inp})

The output is always a list of tuples:

.. code-block:: python

   [("layer_name", numpy_array), ("layer_name", numpy_array), ...]

Users must apply postprocessing to convert raw tensors into:

- class IDs (classification)
- bounding boxes (object detection)
- scores / probabilities
- segmentation masks
- etc.

Simulation Mode (FP32/INT8)
---------------------------

Simulation runs on **CPU only**; no FPGA is required.

Simulation supports:

- FP32 ONNX models
- INT8 ONNX models

.. code-block:: python

   out = gati.sim("model_fp32.onnx", {name: inp})

Use this for debugging, comparison, and output verification.

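
For the comparison and verification use case, one possible approach is to pair two result lists by layer name and report the largest absolute difference per layer. The sketch below is illustrative only: ``compare_outputs`` is a hypothetical helper, the arrays are stand-in data rather than real GATICC results, and it assumes both results use the ``(layer_name, numpy_array)`` list format described above with matching layer names.

```python
import numpy as np

def compare_outputs(ref, test):
    """Map each layer name in ref to its max absolute difference from test.

    Hypothetical helper (not provided by GATICC). Layers missing from
    test are reported as None.
    """
    test_by_name = dict(test)
    report = {}
    for name, ref_arr in ref:
        if name not in test_by_name:
            report[name] = None
            continue
        a = np.asarray(ref_arr, dtype=np.float64)
        b = np.asarray(test_by_name[name], dtype=np.float64)
        if a.shape != b.shape:
            raise ValueError(f"{name}: shape mismatch {a.shape} vs {b.shape}")
        report[name] = float(np.max(np.abs(a - b)))
    return report

# Stand-in data; in practice these lists would come from simulation
# and FPGA inference on the same input.
sim_out = [("scores", np.array([[0.10, 0.70, 0.20]], dtype=np.float32))]
fpga_out = [("scores", np.array([[0.12, 0.69, 0.19]], dtype=np.float32))]

report = compare_outputs(sim_out, fpga_out)
for layer, diff in report.items():
    print(layer, diff)
```

Small differences between an FP32 simulation and an INT8 run are expected because of quantization, so any tolerance applied to the report is a per-model judgment call.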

Example: Classification Usage
-----------------------------

.. code-block:: python

   import numpy as np
   import gati

   def post(arr):
       # arr is a list of (layer_name, numpy_array) tuples
       logits = np.stack([x[1] for x in arr])
       return np.argmax(logits, axis=-1)

   name = gati.get_model_inputs("mnist_int8.onnx")[0]

   gati.set_arch(ramsize=512, sa_arch="9,4,4", vasize=32,
                 accbuf_size=4096, fcbuf_size=32768)
   gati.compile("mnist_int8.onnx", "mnist.gml")
   gati.flash("gati_944.hex")
   gati.load("mnist_int8.onnx", "mnist.gml")

   inp = np.load("sample.npy")
   pred = post(gati.run({name: inp}))
   print("Prediction:", pred)

Example: Object Detection Usage (Simplified)
--------------------------------------------

.. code-block:: python

   import cv2

   # The detection preprocess() shown earlier takes an image array, not a path.
   img = preprocess(cv2.imread("dog.jpg"))
   name = gati.get_model_inputs("ssd.onnx")[0]
   outputs = gati.run({name: img})

   # User-written decode + NMS here
   boxes, labels, scores = decode(outputs)

(Full examples are included in the ``examples/`` folder.)

Important Notes
---------------

- GATICC does **not** do preprocessing or postprocessing. Users must handle image normalization, resizing, decoding, NMS, etc.
- The FPGA currently supports **only INT8** operators:

  - QLinearConv (normal / pointwise / depthwise)
  - Relu, Clip
  - QGemm
  - Flatten
  - QLinearAdd/Sub/Mul
  - MaxPool, AveragePool
  - QLinearSigmoid
  - QLinearConcat
  - QLinearLeakyRelu
  - Tanh
  - Split

- Simulation supports **FP32 and INT8**.
- Input/Output format depends entirely on the model architecture.

Example Files
-------------

The repository includes ready-to-run examples:

- ``examples/classification_run.py``
- ``examples/classification_sim.py``
- ``examples/detection_sim.py``
- ``examples/detection_run.py``

Users can copy these templates and plug in their own preprocessing and postprocessing code.

.. note::

   Make sure to update the required paths in the example files.

More Usage & Full Command Reference
-----------------------------------

This guide covers the common workflow for compiling and running models. However, **GATICC provides many more runtime options, flags, and utilities** that users may need.
To view the full function reference, run:

.. code-block:: bash

   gaticc --help

This shows detailed documentation for:

- all available compiler flags
- runtime options
- model inspection commands
- summary/info mode
- architecture selection
- dispatch options
- debug/verbose options
- explanations for each function