Seunghyun Park (Integrated Ph.D. Student)
Repository | Commit History

Introduction

Brief Introduction
A holistic AI accelerator designer, from low level to high level

Full Bio Sketch
Mr. Park received his B.S. degree in Electronics Engineering from Kyungpook National University, Daegu, Republic of Korea, in 2023. He is currently an integrated Ph.D. student in the School of Electronic and Electrical Engineering at Kyungpook National University, Daegu, Republic of Korea. His research interests include artificial intelligence (AI) accelerator design. He conducts research on low-power, small-area, high-speed accelerator architectures and on design/verification methodology. Currently, he is developing an architecture that lets AI operations run with high efficiency and low power on edge devices such as MCUs, using techniques such as tiling and improved off-chip communication. His previous research focused primarily on digital signal processing (DSP), in particular acoustic signals, covering areas such as active noise cancellation and 3D audio. Notably, he addressed the shortcomings of existing noise-canceling algorithms by using artificial intelligence models for real-time noise processing, and he designed accelerators for binaural reproduction, resulting in the publication of several papers.

Research Topic

CNN Acceleration for LiDAR Signal Processing Systems
In the field of artificial intelligence, an image is commonly represented as a matrix. The hardware passes the input matrix through a filter (kernel) and outputs the processed image as a matrix. Because images consist of three color channels (R, G, and B), they are 3-dimensional tensors, which convolutional neural networks (CNNs) process. Efficient image processing therefore requires a CNN acceleration processor. This research uses Verilog RTL simulation to design the convolution processor. Acceleration here means that the processor can start computing as soon as data arrives, so this research focuses on the control logic that calculates memory addresses.
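To make the address-calculation idea concrete, the following is a minimal Python sketch, not the RTL design itself. It assumes the input tensor is stored in row-major channel/row/column order in a flat memory, with no padding. The function names (`input_address`, `window_addresses`, `conv_output_pixel`) and the memory layout are illustrative assumptions, not taken from the research.

```python
def input_address(ch, row, col, height, width, base=0):
    """Linear address of element (ch, row, col) in a row-major C x H x W tensor.

    The layout is an assumption made for this sketch.
    """
    return base + (ch * height + row) * width + col


def window_addresses(out_row, out_col, channels, height, width, kernel, stride=1):
    """Addresses that must be read to produce one output pixel (no padding)."""
    addrs = []
    for ch in range(channels):
        for kr in range(kernel):
            for kc in range(kernel):
                addrs.append(
                    input_address(ch, out_row * stride + kr,
                                  out_col * stride + kc, height, width)
                )
    return addrs


def conv_output_pixel(memory, weights, out_row, out_col,
                      channels, height, width, kernel):
    """Multiply-accumulate over one window, reading inputs from flat memory."""
    acc = 0
    addrs = window_addresses(out_row, out_col, channels, height, width, kernel)
    for addr, wgt in zip(addrs, weights):
        acc += memory[addr] * wgt  # one multiply-accumulate step per address
    return acc
```

For a 3-channel 4x4 input and a 3x3 kernel, `window_addresses(0, 0, 3, 4, 4, 3)` yields 27 addresses, which is the sequence a hardware address generator would have to produce for the first output pixel under this layout.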
Starting from assembly language code for a matrix multiplier, memory address calculation can be automated. With methods such as automated memory address calculation and loop unrolling, the goal of a high-performance processor implementation can be achieved. In addition, minimizing the number of LEs (Logic Elements) enables a low-power, small-size design. One minimization method is MAC (Multiply and Accumulate). MAC is powerful because the multiplication and the addition are performed within a single instruction by a combinational logic circuit. A processor with MAC completes its calculations in about 60% less time than a processor without it.

CNN Accelerator for Noise Canceller
Convolutional neural networks (CNNs) are prevalent in image processing systems, but they have not been studied sufficiently for acoustic systems. This research focuses on acoustic systems, specifically a low-power hardware implementation of noise cancellation. Conventional adaptive noise cancellation suffers from slow convergence, and existing CNNs are bottlenecked by memory and power. In the present work, we propose an efficient acoustic noise cancellation architecture that accelerates processing and reduces power consumption. The proposed architecture combines an efficient data transfer technique based on an even-odd buffer with low-power CNN noise cancellation algorithms. Simulation results show that the proposed architecture reduces the overall processing time by 20.3% and the power consumption by 6.1% compared to the single buffer.

Tile-Connected AI Computation Optimization
Designing systems that go beyond specific applications such as ANC and can be applied to more general accelerator architectures is currently one of the most important topics in the chip design field. Because of resource constraints in artificial intelligence operations and high-performance computing, performance often becomes bound by I/O bandwidth or by computing elements.
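One common way to reason about which resource limits performance is a roofline-style bound: attainable throughput is the smaller of the compute peak and the product of memory bandwidth and arithmetic intensity. The sketch below is a generic illustration under that assumption. The function names and all numbers are hypothetical and do not describe the accelerator discussed here.

```python
def attainable_ops_per_s(peak_ops_per_s, bandwidth_bytes_per_s, ops_per_byte):
    """Roofline-style upper bound on throughput.

    ops_per_byte is the arithmetic intensity: operations performed per byte
    moved across the off-chip interface.
    """
    return min(peak_ops_per_s, bandwidth_bytes_per_s * ops_per_byte)


def limiting_resource(peak_ops_per_s, bandwidth_bytes_per_s, ops_per_byte):
    """Report whether I/O bandwidth or the computing elements set the bound."""
    if bandwidth_bytes_per_s * ops_per_byte < peak_ops_per_s:
        return "io"
    return "compute"


# Hypothetical edge device: 1 Gop/s peak compute, 100 MB/s off-chip bandwidth.
PEAK = 1_000_000_000
BANDWIDTH = 100_000_000

low_reuse = attainable_ops_per_s(PEAK, BANDWIDTH, 2)    # little on-chip data reuse
high_reuse = attainable_ops_per_s(PEAK, BANDWIDTH, 20)  # more reuse, e.g. via tiling
```

In this hypothetical setting, raising data reuse from 2 to 20 operations per byte moves the bound from the I/O limit to the compute limit without changing bandwidth or the number of processing elements.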
Therefore, structures are being proposed that can be implemented at low cost, rather than simply widening the bandwidth or increasing the number of processing elements. Research is underway to create a more efficient data control path with a structure that is low-power yet does not compromise performance or accuracy, either by using the radix-4 Booth algorithm in a bit-separable manner or through a tightly coupled software/hardware structure for off-chip communication.

Publications
Journal Publications (KCI 1, SCI 1)
Conference Publications (Intl. 5)
Participation in International Conference
Last Updated, 2024.3.23