The Mandate
Your battlefield is not writing Python scripts backed by unlimited cloud compute. Your arena is edge hardware with severely constrained resources: cameras, IoT devices, automotive systems. Your mission is to take massive multimodal models and vision algorithms and, through extreme quantization, pruning, and low-level operator rewriting, squeeze them onto chips with only a few hundred, or even just a few tens, of TOPS, and make them run lightning fast.
Core Tech Stack
Low-Level Languages:
Proficient in C/C++, with a solid grasp of low-level memory management and concurrent programming.
Inference Frameworks:
Skilled in using edge inference acceleration frameworks such as TensorRT, NCNN, ONNX Runtime, and OpenVINO.
Model Compression:
Hands-on experience with model quantization (INT8/INT4), network pruning, and knowledge distillation.
Hardware Acceleration:
Familiar with CUDA programming and low-level instruction set optimizations like ARM NEON.
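To make the compression requirement above concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization, the simplest form of the INT8 scheme the role calls for. The names `quantize_int8`, `dequantize`, and `QuantizedTensor` are hypothetical illustrations, not APIs from TensorRT, NCNN, ONNX Runtime, or OpenVINO:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical types/functions for illustration only.
// Symmetric per-tensor scheme: scale = max|x| / 127, zero point fixed at 0.
struct QuantizedTensor {
    std::vector<int8_t> data;
    float scale;  // real_value ~= int8_value * scale
};

QuantizedTensor quantize_int8(const std::vector<float>& x) {
    float max_abs = 0.0f;
    for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;

    QuantizedTensor q{{}, scale};
    q.data.reserve(x.size());
    for (float v : x) {
        // Round to nearest integer step, then clamp to the INT8 range.
        long r = std::lround(v / scale);
        q.data.push_back(static_cast<int8_t>(
            std::clamp(r, -127L, 127L)));
    }
    return q;
}

std::vector<float> dequantize(const QuantizedTensor& q) {
    std::vector<float> out;
    out.reserve(q.data.size());
    for (int8_t v : q.data) out.push_back(v * q.scale);
    return out;
}
```

In production, frameworks refine this basic idea with per-channel scales and calibration over representative data; the round-trip error of the sketch is bounded by half a quantization step (scale / 2).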
Bonus (Strongly Preferred)
Experience with engineering deployments in autonomous driving, robotics (SLAM/ROS), or Vehicle-to-Everything (V2X).
Experience with on-device AI deployment at leading tech companies (DJI, SenseTime, Huawei, NIO, etc.).
Company Website: https://kamivision.com/en-us
Full Time