Our team designs the entire lifecycle of AI models to ensure LLM and Computer Vision models operate reliably across diverse NPU, GPU, and AI accelerator environments. We manage the full pipeline: Model Conversion → Graph Optimization → Vendor Compilation → Device Runtime Configuration → Deployment.
Our core mission is to bridge the gap where hardware manufacturers fall short. By independently solving challenges such as Front/Middle-End optimization, Graph Surgery, and operator modification, we aim to make any AI model run on any device.
In this role, you will be the key driver in making cutting-edge AI models run accurately and efficiently on physical devices. You will design and implement End-to-End model realization processes, converting dynamic PyTorch models into ONNX-based static graphs and optimizing them for specific hardware.
Beyond simple conversion, you will be responsible for answering, "How and why does this model perform on this specific device?" This involves everything from architecture analysis and graph-level optimization to vendor compiler adaptation and performance tuning. You will work with a talented team to solve complex problems at the intersection of hardware and software, turning the vision of "Any AI on Any Device" into reality.
1. Model Conversion & Optimization Pipeline
2. Device-based Compilation & Deployment
3. Model Quality & Performance Management
4. Problem Definition & Project Strategy
5. Documentation & Technical Leadership
(Additional assignments may be included during the process.)
We take pride in solving the "last mile" of AI deployment. Our team handles everything from the graph level to the runtime level to ensure stability and performance. If you enjoy digging into operator-level modifications and making the impossible possible on constrained hardware, you will find our mission incredibly rewarding.