Our team is responsible for designing and implementing the entire model lifecycle so that LLM and Computer Vision models can run reliably across diverse NPU, GPU, and AI accelerator environments. This includes model conversion, graph optimization, vendor compiler integration, device runtime configuration, and deployment. We independently address areas not covered by hardware vendors, such as front-end and middle-end optimization, graph surgery, and operator customization, with the core mission of enabling any model to run on any device environment.
As an Edge AI Engineer, you play a critical role in ensuring that state-of-the-art AI models run accurately and efficiently on real-world devices. You will design and implement end-to-end model deployment pipelines by converting and optimizing PyTorch-based dynamic models into ONNX-centric static graphs, and compiling and deploying them across diverse hardware environments.
Beyond simple model conversion, this role takes full ownership of the entire process—from model architecture analysis and graph-level optimization to vendor-specific compiler integration and device-level performance tuning—answering the fundamental questions of why, how, and how well a given model runs on a specific device.
1) Model Conversion & Optimization Pipeline
2) Device-Aware Compilation & Deployment
3) Model Quality & Performance Management
Senior engineers are expected to quickly grasp the full workflow from model conversion to device deployment, provide technical direction, and lead projects end to end.
1. Problem Definition and Project Strategy
2. Documentation, Proposals, and Technical Leadership
Document Screening → Screening Interview → 1st Interview (including assignment presentation) → 2nd Interview
(Additional assignments may be included during the process.)
Our team does not treat "Any AI on Any Device" as a mere slogan. We prove it in practice. We run diverse AI models across a wide range of device environments and validate them end to end. By converting and optimizing state-of-the-art Computer Vision, Transformer, and Diffusion models for different architectures including CPU, GPU, and NPU, we demonstrate feasibility through results, not theory. Our responsibility goes beyond research, covering the full process from model transformation and deployment to performance validation.
We are neither a model-only team nor a device-only team. From PyTorch-based dynamic graphs to ONNX static graphs, compiler IR, and device runtimes, we tackle problems across the entire stack encompassing models, graphs, compilers, and devices. We do not treat questions like why an operator fails on a specific device or why the same model performs differently across environments as someone else’s problem. We dive in and solve them ourselves.
Working alongside strong teammates, we take on complex challenges at the intersection of hardware and software and turn them into real technologies and products. We are looking for people who want to join us on that journey.
🔎 Helpful materials