[NetsPresso] Quantization Research Engineer
Job group
R&D
Experience Level
Experienced (5+ years)
Job Types
Full-time
Locations
Nota Inc. (16F, Parnas Tower), 521 Teheran-ro, Gangnam-gu, Seoul, Republic of Korea

👋 About the Team

The NetsPresso Platform Team designs and implements the core platforms and software that transform Nota AI’s model lightweighting and optimization research into real-world products.

Our organization consists of Model Representation, Quantization, Graph Optimization, Model Engineering, and SW Engineering units. Among them, the Quantization unit researches quantization, NetsPresso’s core optimization technology, and integrates proprietary techniques into our products to accelerate deep learning inference across diverse hardware (HW) environments.

We research algorithms to minimize performance degradation caused by quantization, support optimization tailored to various HW and backend constraints, and convert models into forms that enable hardware acceleration.
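
For readers less familiar with the topic, quantization maps floating-point tensors to low-precision integers and back, trading a small amount of accuracy for faster, lighter inference. The snippet below is a minimal, framework-free sketch of asymmetric uint8 affine quantization; it is illustrative only and is not NetsPresso’s implementation (production pipelines additionally use per-channel scales, calibration data, and HW- and backend-specific constraints):

```python
# Minimal sketch of asymmetric uint8 affine quantization.
# Illustrative only -- not NetsPresso's implementation.
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Map float values to uint8 using an affine scale / zero-point."""
    scale = (x.max() - x.min()) / 255.0
    zero_point = int(np.round(-x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(1_000).astype(np.float32)
q, scale, zp = quantize_uint8(x)
x_hat = dequantize(q, scale, zp)
# The reconstruction error below is the "performance degradation" the team
# works to minimize across models and hardware backends.
print("mean abs quantization error:", float(np.abs(x - x_hat).mean()))
```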



📌 What You’ll Do in This Role

As a key contributor, you will research and productize quantization technologies that sit at the heart of NetsPresso. You will design Nota’s unique quantization methods by studying state-of-the-art (SOTA) algorithms and optimizing them for specific model architectures and HW characteristics. You will gain hands-on experience with cutting-edge models and optimization techniques for On-device AI.




✅ Key Responsibilities

  • Quantization Research & Productization
      • Research and develop next-generation Post-Training Quantization (PTQ), Quantization-Aware Training (QAT), and compression algorithms (a rough PTQ/QAT sketch follows this list).
      • Establish quantization strategies and design frameworks that account for model architecture and target HW characteristics.
      • Study optimization methodologies for Generative AI (LLM, VLM, Diffusion, etc.), Computer Vision (Classification, Detection, Segmentation, etc.), and other AI models.
      • Strengthen technical leadership through publications at top-tier conferences and patent filings.
  • On-device AI Model Optimization
      • Design and enhance quantization pipelines for deployment to various On-device HW.
      • Lead and execute AI model optimization projects.
      • Analyze the causes of quantization-induced accuracy drops and establish systematic solutions.
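
As a rough illustration of the PTQ and QAT work referenced above, the sketch below uses PyTorch’s eager-mode quantization APIs on a toy model. This is an assumed, simplified example for orientation only; it does not reflect Nota’s internal tooling, target backends, or algorithms:

```python
# Rough sketch contrasting PTQ and QAT with PyTorch eager-mode APIs.
# Toy model for illustration only; not Nota's internal pipeline.
import torch
import torch.nn as nn

model_fp32 = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-Training Quantization (dynamic): no retraining, Linear weights -> int8.
ptq_model = torch.ao.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8
)

# Quantization-Aware Training: wrap with quant/dequant stubs, insert
# fake-quant observers, fine-tune, then convert to a true int8 model.
qat_model = nn.Sequential(
    torch.ao.quantization.QuantStub(),
    nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10),
    torch.ao.quantization.DeQuantStub(),
)
qat_model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
qat_prepared = torch.ao.quantization.prepare_qat(qat_model.train())
# ... fine-tune qat_prepared on training data here ...
qat_int8 = torch.ao.quantization.convert(qat_prepared.eval())
```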



✅ Requirements

  • Degree in Computer Science, Electronic Engineering, or a related field.
  • Experience Level: Bachelor’s with 5+ years, Master’s with 3+ years, or Ph.D. (including expected graduates).
  • In-depth research/development experience in Quantization, Model Compression, or Deep Learning Optimization.
  • Deep understanding of deep learning model optimization based on PyTorch, ExecuTorch, and ONNX.
  • Experience leading and managing small-to-mid-sized projects.
  • Strong technical writing skills and ability to communicate in English.
  • No disqualifying factors for overseas travel.



✅ Preferred Qualifications

  • Experience publishing papers in top-tier conferences (NeurIPS, ICML, CVPR, ICLR, etc.) related to Quantization, Model Compression, or Kernel Optimization.
  • Experience in productizing research results or applying them to commercial services.
  • Advanced proficiency in optimization libraries such as ExecuTorch, ONNX, TensorRT, and AIMET.
  • Experience in porting low-bit models to embedded devices and performance tuning.
  • Experience contributing to or maintaining open-source projects.
  • Holding a Ph.D. degree.



✅ Hiring Process

Document Screening → Assignment → 1st Interview → 2nd Interview

(Additional assignments may be included during the process.)




🤓 A Message from the Team

We value a strong interest in new technologies and the drive to turn ideas into reality. This position goes beyond pure research: you will develop proprietary quantization technologies directly linked to NetsPresso services. Because each module is closely interconnected, we prioritize active communication and a proactive attitude. If you enjoy diving deep into complex technical problems and growing through collaboration, you will thrive on this team.



Please Check Before Applying! 👀

  • This position is open on a rolling basis and may close early once the hiring process is complete.
  • Resumes that include sensitive personal information, such as salary details, may be excluded from the review process.
  • Providing false information in the submitted materials may result in the cancellation of the application.
  • Please be aware that references will be checked before finalizing the hiring decision.
  • Compensation will be discussed separately upon successful completion of the final interview.
  • A probationary period applies after joining; there is no difference in treatment (compensation or benefits) during this period.
  • To support the employment of persons with disabilities, you may submit a copy of your disability registration certificate under “Additional Documents” if administrative verification is required; submission is optional and does not affect the evaluation process.
  • Veterans and individuals with disabilities will receive preferential treatment in accordance with relevant regulations.


