The Mandate
Your battlefield is not writing Python scripts backed by unlimited cloud compute. Your arena is edge hardware with severely constrained resources: cameras, IoT devices, automotive systems. Your mission is to take massive multimodal models and vision algorithms and, through extreme quantization, pruning, and low-level operator rewriting, squeeze them onto chips with only a few hundred, or even just a few tens, of TOPS, and make them run lightning fast.
Core Tech Stack
Low-Level Languages:
Proficient in C/C++, with a solid grasp of low-level memory management and concurrent programming.
Inference Frameworks:
Skilled in using edge inference acceleration frameworks such as TensorRT, NCNN, ONNX Runtime, and OpenVINO.
Model Compression:
Hands-on experience with model quantization (INT8/INT4), network pruning, and knowledge distillation.
Hardware Acceleration:
Familiar with CUDA programming and low-level instruction set optimizations like ARM NEON.
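To make the compression requirement above concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization, the simplest form of the INT8 scheme the role calls for. The names `quantize_int8`, `dequantize`, and `QuantizedTensor` are hypothetical illustrations, not APIs from TensorRT, NCNN, ONNX Runtime, or OpenVINO:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical types/functions for illustration only.
// Symmetric per-tensor scheme: scale = max|x| / 127, zero point fixed at 0.
struct QuantizedTensor {
    std::vector<int8_t> data;
    float scale;  // real_value ~= int8_value * scale
};

QuantizedTensor quantize_int8(const std::vector<float>& x) {
    float max_abs = 0.0f;
    for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;

    QuantizedTensor q{{}, scale};
    q.data.reserve(x.size());
    for (float v : x) {
        // Round to nearest integer step, then clamp to the INT8 range.
        long r = std::lround(v / scale);
        q.data.push_back(static_cast<int8_t>(
            std::clamp(r, -127L, 127L)));
    }
    return q;
}

std::vector<float> dequantize(const QuantizedTensor& q) {
    std::vector<float> out;
    out.reserve(q.data.size());
    for (int8_t v : q.data) out.push_back(v * q.scale);
    return out;
}
```

In production, frameworks refine this basic idea with per-channel scales and calibration over representative data; the round-trip error of the sketch is bounded by half a quantization step (scale / 2).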
Bonus (Strongly Preferred)
Experience with engineering deployments in autonomous driving, robotics (SLAM/ROS), or Vehicle-to-Everything (V2X).
Experience with on-device AI deployment at leading tech companies (DJI, SenseTime, Huawei, NIO, etc.).
Company Website: https://kamivision.com/en-us
Full Time