[NetsPresso] Edge AI Engineer
Job group
R&D
Experience Level
Experience irrelevant
Job Types
Full-time
Locations
Nota Inc. (16F, Parnas Tower), 521, Teheran-ro, Gangnam-gu, Seoul, Republic of Korea

👋 About the Team

Our team is responsible for designing and implementing the entire model lifecycle so that LLM and Computer Vision models can run reliably across diverse NPU, GPU, and AI accelerator environments. This includes model conversion, graph optimization, vendor compiler integration, device runtime configuration, and deployment. We independently address areas not covered by hardware vendors, such as front- and middle-end optimization, graph surgery, and operator customization, with the core mission of enabling any model to run on any device environment.



📌 What You’ll Do in This Role

As an Edge AI Engineer, you play a critical role in ensuring that state-of-the-art AI models run accurately and efficiently on real-world devices. You will design and implement end-to-end model deployment pipelines by converting and optimizing PyTorch-based dynamic models into ONNX-centric static graphs, and compiling and deploying them across diverse hardware environments.

Beyond simple model conversion, this role takes full ownership of the entire process—from model architecture analysis and graph-level optimization to vendor-specific compiler integration and device-level performance tuning—answering the fundamental questions of why, how, and how well a given model runs on a specific device.




✅ Key Responsibilities

1) Model Conversion & Optimization Pipeline

  • Analyze Torch → ONNX conversion flows and transform dynamic graphs into static graphs
  • Analyze operator compatibility and modify or replace unsupported operations
  • Design quantization-friendly architectures and optimize memory usage and latency
  • Define and implement front-/middle-end graph rewriting strategies
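The operator-replacement work above can be sketched on a toy graph IR. This is purely illustrative (the `Node` type and the pass are hypothetical, not Nota's actual pipeline), but it shows the shape of a typical rewrite: decomposing an op a target does not support (here HardSwish) into ops it does (HardSigmoid and Mul, using the identity HardSwish(x) = x · HardSigmoid(x)):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Node:
    """Minimal stand-in for an ONNX graph node (illustrative only)."""
    op: str
    inputs: List[str]
    outputs: List[str]

def decompose_hardswish(graph: List[Node]) -> List[Node]:
    """Rewrite HardSwish(x) -> x * HardSigmoid(x) for targets lacking HardSwish."""
    out = []
    for n in graph:
        if n.op != "HardSwish":
            out.append(n)
            continue
        x, = n.inputs
        y, = n.outputs
        hsig = f"{y}_hsig"  # fresh intermediate tensor name
        out.append(Node("HardSigmoid", [x], [hsig]))
        out.append(Node("Mul", [x, hsig], [y]))
    return out

graph = [Node("Conv", ["in"], ["t0"]), Node("HardSwish", ["t0"], ["t1"])]
rewritten = decompose_hardswish(graph)
print([n.op for n in rewritten])  # ['Conv', 'HardSigmoid', 'Mul']
```

In a real pipeline the same pattern-match-and-replace logic operates on `onnx.GraphProto` nodes, and the rewrite must also preserve tensor names at the graph boundary, as the sketch does for `t1`.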

2) Device-Aware Compilation & Deployment

  • Compile models using various vendor compilers (e.g., TensorRT, QNN, SNPE, eIQ)
  • Configure and debug runtime environments on devices such as Jetson, DeepX, Telechips, and Renesas
  • Handle device-specific constraints (dtype, shape, memory) and perform performance profiling
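Handling device constraints often starts with a static feasibility check before compilation. The sketch below is a toy version of that idea (the device table and limits are made up for illustration; real constraints come from the vendor toolchain and its compilation reports):

```python
# Toy pre-deployment check: validate a tensor spec against a device's
# constraints (supported dtypes, max tensor rank, on-chip memory).
# The DEVICE table is hypothetical, not any real vendor's limits.
DEVICE = {"dtypes": {"float16", "int8"}, "max_rank": 4, "sram_bytes": 8 * 1024 * 1024}

def check_tensor(name, dtype, shape, itemsize):
    """Return a list of human-readable constraint violations (empty if OK)."""
    issues = []
    if dtype not in DEVICE["dtypes"]:
        issues.append(f"{name}: dtype {dtype} unsupported (cast or quantize)")
    if len(shape) > DEVICE["max_rank"]:
        issues.append(f"{name}: rank {len(shape)} exceeds max {DEVICE['max_rank']}")
    nbytes = itemsize
    for d in shape:
        nbytes *= d
    if nbytes > DEVICE["sram_bytes"]:
        issues.append(f"{name}: {nbytes} bytes exceeds on-chip memory (tile or spill)")
    return issues

# An fp32 activation trips the dtype constraint on this (hypothetical) NPU:
print(check_tensor("act0", "float32", (1, 3, 224, 224), 4))
```

Checks like this catch the easy failures early; the remaining constraint violations surface only at compile or run time, which is where the debugging work in this role lives.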

3) Model Quality & Performance Management

  • Build and apply a deep understanding of modern architectures such as Transformer and Diffusion models
  • Compare model quality before and after conversion and analyze degradation causes
  • Define and meet targets for latency, memory usage, and accuracy
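The before/after quality comparison above can be reduced to a small sketch. The metrics here (max absolute error and a simple SNR) are common but illustrative choices, and real pipelines compute them over full output tensors rather than short lists:

```python
import math

def max_abs_error(ref, test):
    """Worst-case elementwise deviation of the converted model's output."""
    return max(abs(a - b) for a, b in zip(ref, test))

def snr_db(ref, test):
    """Signal-to-noise ratio of the converted output vs. the reference, in dB."""
    signal = sum(a * a for a in ref)
    noise = sum((a - b) ** 2 for a, b in zip(ref, test))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)

fp32_out = [0.10, 0.72, 0.18]  # reference output (e.g. PyTorch fp32)
int8_out = [0.11, 0.70, 0.19]  # same input, after conversion/quantization
print(round(max_abs_error(fp32_out, int8_out), 3))  # 0.02
print(round(snr_db(fp32_out, int8_out), 1))         # 29.7
```

When degradation exceeds the agreed target, the numbers alone are not enough; the analysis work is tracing the loss back to a specific cause, such as a quantized layer or a rewritten operator.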



✅ Qualifications

  • Hands-on experience with ONNX-based model conversion, optimization, and debugging
  • Experience designing or operating inference pipelines across diverse hardware environments such as NPU, GPU, and ASIC
  • Strong understanding of model graph structures with the ability to perform front- and middle-end graph surgery
  • Solid understanding of Transformer and Diffusion model architectures
  • Experience building automated pipelines for model conversion, build, and deployment
  • Ability to clearly communicate complex technical issues in context and collaborate effectively with teammates



✅ Additional Qualifications (Senior Level)

Senior engineers are expected to quickly grasp the full workflow from model conversion to device deployment, provide technical direction, and lead projects end to end.

1. Problem Definition and Project Strategy

  • Ability to define problems based on customer requirements and clearly set goals and success criteria
  • Experience establishing feasible technical strategies within given timelines and constraints, while identifying and managing key risks
  • Capability to structure complex and ambiguous problems and communicate them clearly to teams and stakeholders

2. Documentation, Proposals, and Technical Leadership

  • Experience writing technical proposals for government-funded projects or enterprise customers
  • Ability to document technical decisions and outcomes through clear technical documentation and reports
  • Capability to provide technical guidance and mentorship to junior and mid-level engineers



✅ Pluses

  • Experience with multiple vendor compilers
  • Experience with MLOps or LLM serving stacks
  • Technical proposal writing and project leadership experience
  • Experience working with multiple enterprise customers
  • Publications at top-tier AI conferences



✅ Hiring Process

Document Screening → Screening Interview → 1st Interview (including assignment presentation) → 2nd Interview

(Additional assignments may be included during the process.)




🤓 A Message from the Team

Our team does not treat “Any AI on Any Device” as a slogan. We prove it in practice. We run diverse AI models across a wide range of device environments and validate them end to end. By converting and optimizing state-of-the-art Computer Vision, Transformer, and Diffusion models for different architectures including CPU, GPU, and NPU, we demonstrate feasibility through results, not theory. Our responsibility goes beyond research, covering the full process from model transformation and deployment to performance validation.

We are neither a model-only team nor a device-only team. From PyTorch-based dynamic graphs to ONNX static graphs, compiler IR, and device runtimes, we tackle problems across the entire stack encompassing models, graphs, compilers, and devices. We do not treat questions like why an operator fails on a specific device or why the same model performs differently across environments as someone else’s problem. We dive in and solve them ourselves.

Working alongside strong teammates, we take on complex challenges at the intersection of hardware and software and turn them into real technologies and products. We are looking for people who want to join us on that journey.



Please Check Before Applying! 👀

  • This job posting is open continuously, and it may close early upon completion of the hiring process.
  • Resumes that include sensitive personal information, such as salary details, may be excluded from the review process.
  • Providing false information in the submitted materials may result in the cancellation of the application.
  • Please be aware that references will be checked before finalizing the hiring decision.
  • Compensation will be discussed separately upon successful completion of the final interview.
  • There will be a probationary period after joining, with no difference in treatment during that period.
  • To support the employment of persons with disabilities, you may submit a copy of your disability registration certificate under “Additional Documents” if administrative verification is required. Submission is optional and does not affect the evaluation process.
  • Veterans and individuals with disabilities will receive preferential treatment in accordance with relevant regulations.



🔎 Helpful materials
