Hardware-efficient AI Architecture
I’m a first-year Ph.D. student at the Korea Advanced Institute of Science and Technology (KAIST),
advised by Professor Hoi-Jun Yoo in the Semiconductor System Lab (SSL).
Research Focus

- ML Level: Quantization and sparsity exploration for efficient processing while preserving model accuracy
- System Level: GPU kernel optimization for sparse AI workloads on current hardware platforms
- Architecture Level: System-on-chip (SoC) architectures that improve hardware efficiency
- Circuit Level: End-to-end chip implementation and post-silicon measurement for performance validation
- Current emphasis: hardware-efficient transformer models, post-transformer models, and accelerator architectures in LLM and video AI domains
Publications
💡 Each project I work on carries a unique design philosophy. 💡
🎨 Giving it a name helps me preserve that philosophy and communicate it more intuitively. 🎨
Conference Papers
🐍 A 2.67 mJ/frame Video Mamba Accelerator with Importance-aware Redundancy Elimination and SSM Computing Reformulation
Authors: Youngjin Moon, Sangwoo Ha, Soyeon Kim, Junha Ryu, Hoi-Jun Yoo, and Donghyeon Han
IEEE International Symposium on Circuits and Systems (ISCAS), 2025
Major Circuit
“When Mamba Awakened with SLYTHERIN!!”
SMoLPU: A 122.1 μJ/token Sparse MoE-based Speculative Decoding LLM Processing Unit with Adaptive-Offload NPU-CIM Core
Authors: Sangwoo Ha, Jingu Lee, Youngjin Moon, Sunjoo Whang, Wooyoung Jo, Gwangtae Park, Soyeon Um, Junha Ryu, Yurim Jo, and Hoi-Jun Yoo
IEEE International Solid-State Circuits Conference (ISSCC), 2026 (To appear)
Top-Tier Circuit
NuVPU: A 4.8~9.6 mJ/frame Progressive NTT-based Unified Video Processor for Stable Video Streaming and Processing with Neural Video Codec
Authors: Soyeon Kim, Hankyul Kwon, Jingu Lee, Youngjin Moon, Hongseok Lee, Junha Ryu, Zhamaliddin Kalzhan, Sangyeob Kim, Wooyoung Jo, and Hoi-Jun Yoo
IEEE Symposium on VLSI Circuits (S.VLSI), 2025
Top-Tier Circuit
Journal Papers
🐍 An Energy-Efficient State-Space Model Accelerator for Real-Time Video Understanding via Redundancy-Conscious Design
Authors: Youngjin Moon, Sangwoo Ha, Junha Ryu, Soyeon Kim, Donghyeon Han, and Hoi-Jun Yoo
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2026 (Under Review)
Top-Tier Circuit
“SLYTHERIN’s Tiki-Taka of Redundancy in Mamba!!”
A 227.1 TOPS/W High Energy-efficiency PNN-based 3D Object Recognition Processor with Spiking Neural Network for Edge Device
Authors: Sangmyoung Lee, Seryeong Kim, Jongjun Park, Youngjin Moon, Minsung Kim, Junha Ryu, Hoi-Jun Yoo, and Donghyeon Han
IEEE Transactions on Circuits and Systems II: Express Briefs (TCAS-II), 2025
Major Circuit
Education
Sungkyunkwan University (SKKU)
Suwon, South Korea
B.S. in M.E. & E.E.
2017/Mar. – 2024/Feb.
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea
M.S. in E.E.
Advisor: Hoi-Jun Yoo
2024/Mar. – 2026/Feb.
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea
Ph.D. in E.E.
Advisor: Hoi-Jun Yoo
2026/Mar. – Present
Work Experience
SAMSUNG ELECTRONICS
Position: Hardware Design Engineer Intern
Group: Design Solutions, System LSI, Modem Team, SoC Design Group
2022/Sep. – 2022/Dec.
