Academic Paper
A Study on Learning Method for Korean Speech Data Using Limited Computing Resource
- Publisher: 한국인공지능학회
- Authors: JeHyung TAK, Kyuhyun CHOI, Hyunsik NA, Minsung KIM
- Publication: 인공지능연구, Vol. 13, No. 2, pp. 17-21 (5 pages)
- Subject Classification: Interdisciplinary Studies > Science and Technology Studies
- Publication Date: 2025.06.30
Abstract
In light of increasing concerns over carbon emissions and power supply issues in the field of artificial intelligence, this study conducts fine-tuning of a large language model (LLM) on Korean spoken language data using small-scale computing resources and evaluates the performance of the resulting supervised model. This research proposes an efficient method to limit computing resource usage and conducts the training on such limited infrastructure. Subsequently, Korean spoken language data was collected. The dataset was designed to enable the model to understand a wide range of questions and provide appropriate answers; it consists of general knowledge sentence generation data, book summary information, academic paper summary data, and document summarization data. Because of the phonological changes, frequent subject omission, and honorifics unique to the Korean language, it is difficult to achieve satisfactory performance using existing English-based LLM training methods alone. This study distinguishes itself from prior works by selectively leveraging a dataset that reflects the linguistic characteristics of Korean, thereby proposing a language-specialized fine-tuning data strategy. For the methodology, LLM fine-tuning was conducted using LoRA (Low-Rank Adaptation of Large Language Models) via Unsloth, based on the open-source Llama-3.1-8B-Instruct model. As a result, the model fine-tuned in this study achieved an average score of 43.33 on the Open Ko-LLM Leaderboard. Notably, it scored 61.17 on Ko-Winogrande, which assesses logical reasoning, and 58.3 on Ko-GSM8k, which evaluates mathematical problem-solving skills, demonstrating competitive performance compared to other open-source models. These results suggest a practical alternative to large-scale resource-based models in terms of both resource efficiency and linguistic suitability.
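For readers who want a concrete picture of the workflow described in the abstract, the following is a minimal sketch of LoRA fine-tuning via Unsloth on Llama-3.1-8B-Instruct under a small GPU budget. The dataset file name, hyperparameters (LoRA rank, learning rate, batch size), and target modules are illustrative assumptions, not the settings reported in the paper, and the exact SFTTrainer arguments may differ depending on the installed unsloth/trl versions.

```python
# Sketch only: LoRA fine-tuning with Unsloth on limited hardware.
# Dataset path and all hyperparameters below are illustrative assumptions,
# not the configuration used in the paper.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit quantization to fit limited GPU memory.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only small low-rank matrices are trained,
# keeping the 8B base weights frozen.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical Korean spoken-language instruction data with a "text" field.
dataset = load_dataset("json", data_files="korean_spoken_sft.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,   # effective batch size 16 on one GPU
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="llama31-8b-korean-lora",
    ),
)
trainer.train()
```

Combining 4-bit loading with LoRA adapters is the usual way to keep memory usage within a single consumer-grade GPU, which matches the limited-computing-resource setting the study targets.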
Table of Contents
1. Introduction
2. Related Work
3. Computing Resources and Data
4. Tokenizer Configuration Method
5. Conclusion
References
Papers in This Issue
- HEMA: A Hippocampus-Inspired Extended Memory Architecture for Long-Context AI Conversations
- Performance Comparisons of Bio-Inspired Optimization Algorithms for Grid Synchronization
- Adaptive Movement and Formation Coordinated Control for Flying Ad-Hoc Networks (FANETs) in Dynamic Environments
- A Study on Learning Method for Korean Speech Data Using Limited Computing Resource
- Keypoint-based Distortion Correction and Data Augmentation for High-angle License Plate Recognition
Related Papers
Interdisciplinary Studies > Science and Technology Studies: BEST
- Culinary Narratives on the Global Stage: Analyzing K-Food's Cultural Capital through Netflix's 'Black and White Chef'
- The Sociocultural Meaning of Zero-Calorie Beverage Consumption: A Qualitative Study on Health Perceptions and Beverage Choices Among Young Adults in South Korea
- Functional Food Potential of Cyclic Dipeptides from Lactobacillus plantarum: Inhibition of Breast Cancer via Cancer Stem Cell Regulation