[NetsPresso] Senior Edge AI Engineer
Job group
R&D
Experience Level
5+ years of experience
Job Types
Full-time
Locations
Nota Inc. (16F, Parnas Tower), 521, Teheran-ro, Gangnam-gu, Seoul, Republic of Korea

👋 About the Team

Our team designs the entire lifecycle of AI models to ensure LLM and Computer Vision models operate reliably across diverse NPU, GPU, and AI accelerator environments. We manage the full pipeline: Model Conversion → Graph Optimization → Vendor Compilation → Device Runtime Configuration → Deployment.

Our core mission is to bridge the gap where hardware manufacturers fall short. By independently solving challenges such as Front/Middle-End optimization, Graph Surgery, and operator modification, we aim to make any AI model run on any device.



📌 What You’ll Do in This Role

In this role, you will be the key driver in making cutting-edge AI models run accurately and efficiently on physical devices. You will design and implement End-to-End model realization processes, converting dynamic PyTorch models into ONNX-based static graphs and optimizing them for specific hardware.

Beyond simple conversion, you will be responsible for answering, "How and why does this model perform on this specific device?" This involves everything from architecture analysis and graph-level optimization to vendor compiler adaptation and performance tuning. You will work with a talented team to solve complex problems at the intersection of hardware and software, turning the vision of "Any AI on Any Device" into reality.




✅ Key Responsibilities

1. Model Conversion & Optimization Pipeline

  • Analyze Torch → ONNX conversion structures and transform dynamic graphs into static ones.
  • Perform operator compatibility analysis and fix/replace unsupported ops.
  • Design quantization-friendly architectures and optimize for memory/latency.
  • Establish and implement Front/Middle-End Graph Rewriting strategies.
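
As a loose illustration of the graph-surgery work described above, here is a toy rewriting pass that replaces an operator a target compiler does not support with a chain of primitive ops. The dict-based node format and the op names are simplified stand-ins for the real ONNX protobuf, not production code:

```python
# Toy graph-surgery pass: replace an unsupported op (here "HardSwish")
# with an equivalent chain of primitive ops.
# HardSwish(x) = x * clip(x + 3, 0, 6) / 6.

def rewrite_hardswish(nodes):
    """Return a new node list with every HardSwish expanded into primitives."""
    out = []
    for n in nodes:
        if n["op"] != "HardSwish":
            out.append(n)
            continue
        x, y = n["inputs"][0], n["output"]
        out += [
            {"op": "Add",  "inputs": [x, "const_3"],                   "output": y + "_add"},
            {"op": "Clip", "inputs": [y + "_add", "const_0", "const_6"], "output": y + "_clip"},
            {"op": "Mul",  "inputs": [x, y + "_clip"],                 "output": y + "_mul"},
            {"op": "Div",  "inputs": [y + "_mul", "const_6"],          "output": y},
        ]
    return out

graph = [
    {"op": "Conv",      "inputs": ["input"],    "output": "conv_out"},
    {"op": "HardSwish", "inputs": ["conv_out"], "output": "act_out"},
]
rewritten = rewrite_hardswish(graph)
print([n["op"] for n in rewritten])  # ['Conv', 'Add', 'Clip', 'Mul', 'Div']
```

Note that the rewrite preserves the original output name (`act_out`), so downstream consumers of the graph are unaffected — the same invariant a real Graph Rewriting pass must maintain.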

2. Device-based Compilation & Deployment

  • Compile models using various vendor compilers (TensorRT, QNN, SNPE, eIQ, etc.).
  • Configure and debug runtime environments for devices such as Jetson, DeepX, Telechips, and Renesas.
  • Address device-specific constraints (Dtype, shape, memory) and conduct performance profiling.
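
A sketch of what a pre-compilation constraint check for such device-specific limits might look like. The device profile values here are invented for illustration and do not describe any real chip:

```python
# Sketch of a pre-compilation constraint check: validate tensor specs against
# a device profile before handing the model to a vendor compiler.
# All profile numbers below are illustrative placeholders.

DEVICE_PROFILE = {
    "dtypes": {"int8", "float16"},      # supported datatypes
    "max_rank": 4,                      # max tensor rank
    "max_mem_bytes": 8 * 1024 * 1024,   # per-tensor memory budget
}

DTYPE_BYTES = {"int8": 1, "float16": 2, "float32": 4}

def check_tensor(shape, dtype, profile=DEVICE_PROFILE):
    """Return a list of constraint violations (empty list means deployable)."""
    issues = []
    if dtype not in profile["dtypes"]:
        issues.append(f"dtype {dtype} unsupported")
    if len(shape) > profile["max_rank"]:
        issues.append(f"rank {len(shape)} exceeds max {profile['max_rank']}")
    size = DTYPE_BYTES[dtype]
    for d in shape:
        size *= d
    if size > profile["max_mem_bytes"]:
        issues.append(f"{size} bytes exceeds on-chip budget")
    return issues

print(check_tensor((1, 3, 224, 224), "float16"))  # [] -> deployable
print(check_tensor((1, 3, 224, 224), "float32"))  # dtype violation
```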

3. Model Quality & Performance Management

  • Maintain a deep understanding of the latest architectures (Transformers, Diffusion, etc.).
  • Conduct pre/post-conversion quality comparisons and root-cause analysis for quality degradation.
  • Set and meet targets for latency, memory footprint, and accuracy.
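
For a sense of what the pre/post-conversion quality comparison involves, a minimal sketch of an error-budget gate follows; the logit values and tolerance are placeholders, and real budgets are set per project and per metric:

```python
# Sketch of a pre/post-conversion quality gate: compare reference (e.g. PyTorch)
# outputs against converted-model outputs and enforce an error budget.
# Values and tolerance below are illustrative placeholders.

def max_abs_diff(ref, converted):
    """Largest element-wise absolute deviation between two output vectors."""
    return max(abs(r - c) for r, c in zip(ref, converted))

def quality_gate(ref, converted, atol=0.05):
    """Return (passed, max_error) for a single output vector."""
    err = max_abs_diff(ref, converted)
    return err <= atol, err

ref_logits  = [0.12, -1.05, 3.40, 0.07]   # reference framework output
onnx_logits = [0.12, -1.05, 3.41, 0.07]   # small drift after conversion

passed, err = quality_gate(ref_logits, onnx_logits)
print(passed, round(err, 4))  # True 0.01
```

When the gate fails, the max-error element points at where to start root-cause analysis (e.g. a quantized or rewritten op on that output's path).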

4. Problem Definition & Project Strategy

  • Define problems, set goals, and design success metrics based on client requirements.
  • Develop feasible strategies within deadlines and preemptively identify/mitigate risks.
  • Structure complex, ambiguous problems into clear technical roadmaps.

5. Documentation & Technical Leadership

  • Draft technical sections for government and corporate project proposals.
  • Author technical white papers, reports, and records of technical decision-making.
  • Provide mentorship for junior and mid-level engineers.



✅ Requirements

  • Proven experience in ONNX-based model conversion, optimization, and debugging.
  • Hands-on experience with inference pipelines across various hardware (NPU, GPU, ASIC).
  • Strong understanding of Model Graphs and proficiency in Front/Middle-End Graph Surgery.
  • Deep architectural understanding of Transformers and Diffusion models.
  • Experience building automated deployment pipelines.
  • Clear communication skills with the ability to lead technical directions and understand end-to-end flows.



✅ Pluses

  • Experience with multiple vendor compilers.
  • Familiarity with MLOps or LLM Serving stacks.
  • Experience in technical proposal writing or project leadership.
  • Experience managing multiple client accounts/requirements.
  • Published research in top-tier AI conferences (e.g., CVPR, NeurIPS, ICML).



✅ Hiring Process

  • Document Screening → 1st Interview → 2nd Interview → 3rd Interview

(Additional assignments may be included during the process.)




🤓 A Message from the Team

We take pride in solving the "last mile" of AI deployment. Our team handles everything from the graph level to the runtime level to ensure stability and performance. If you enjoy digging into operator-level modifications and making the impossible possible on constrained hardware, you will find our mission incredibly rewarding.



Please Check Before Applying! 👀

  • This posting is open on a rolling basis and may close early once the hiring process is complete.
  • Resumes that include sensitive personal information, such as salary details, may be excluded from the review process.
  • Providing false information in the submitted materials may result in the cancellation of the application.
  • Please be aware that references will be checked before finalizing the hiring decision.
  • Compensation will be discussed separately upon successful completion of the final interview.
  • A probationary period applies after joining; there is no difference in treatment or compensation during this period.
  • To support the employment of persons with disabilities, you may optionally submit a copy of your disability registration certificate under “Additional Documents” for administrative verification. Submission is optional and does not affect the evaluation.
  • Veterans and individuals with disabilities will receive preferential treatment in accordance with relevant regulations.



