Our team designs the entire lifecycle of AI models to ensure LLM and Computer Vision models operate reliably across diverse NPU, GPU, and AI accelerator environments. We manage the full pipeline: Model Conversion → Graph Optimization → Vendor Compilation → Device Runtime Configuration → Deployment.
Our core mission is to bridge the gap where hardware manufacturers fall short. By independently solving challenges such as Front/Middle-End optimization, Graph Surgery, and operator modification, we aim to make any AI model run on any device.
In this role, you will be the key driver in making cutting-edge AI models run accurately and efficiently on physical devices. You will design and implement End-to-End model realization processes, converting dynamic PyTorch models into ONNX-based static graphs and optimizing them for specific hardware.
Beyond simple conversion, you will be responsible for answering, "How and why does this model perform on this specific device?" This involves everything from architecture analysis and graph-level optimization to vendor compiler adaptation and performance tuning. You will work with a talented team to solve complex problems at the intersection of hardware and software, turning the vision of "Any AI on Any Device" into reality.
1. Model Conversion & Optimization Pipeline
2. Device-based Compilation & Deployment
3. Model Quality & Performance Management
4. Problem Definition & Project Strategy
5. Documentation & Technical Leadership
(Additional assignments may be included during the process.)
We take pride in solving the "last mile" of AI deployment. Our team handles everything from the graph level to the runtime level to ensure stability and performance. If you enjoy digging into operator-level modifications and making the impossible possible on constrained hardware, you will find our mission incredibly rewarding.