Performance Comparison of Knowledge Source Types and Embedding Models in a RAG-Based Educational Chatbot for Radiation Therapy

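English title (as published): Performance Comparison of Knowledge Sources Types and Embedding Models in a RAG-based Educational Chatbots for Radiation Therapy
Publisher: Korean Society of Radiological Science (대한방사선과학회; formerly 대한방사선기술학회)
Authors: Jae-Hong Jung (정재홍), Kyung-Bae Lee (이경배), Daegun Kim (김대건), Youngjin Lee (이영진)
Journal: Journal of Radiological Science and Technology (방사선기술과학), Vol. 48, No. 4, pp. 405-416 (12 pages)
Subject: Medicine and Pharmacy > Radiological Science
File format: PDF
Publication date: 2025.08.31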

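Abstract (translated from Korean)

This study compared and evaluated performance differences by knowledge source type (textbook, report, summary) and embedding model (general vs. domain-specific) in a Retrieval-Augmented Generation (RAG) based education system for radiation therapy. The knowledge sources were a textbook, a report, and a summary. OpenAI's GPT-3.5-turbo was used as the language generation model, and seven embedding models were used (OpenAI, MiniLM, BGE, E5, KoSBERT, BioClinicalBERT, PubMedBERT). Performance was evaluated with seven metrics {BLEU, ROUGE-L, Sentence BERT (SBERT), BERTScore, Semantic similarity, Accuracy, Time (s)}, and the mean and standard deviation (SD) were calculated from 10 repeated measurements. Independent-samples t-tests and two-way analysis of variance (two-way ANOVA) were applied for statistical analysis. The semantic similarity for the summary knowledge source type averaged 0.898 (89.8%), which was statistically superior to the textbook and report types (p < 0.001). Overall, general embedding models outperformed domain-specific models for the summary knowledge source type, whereas for the textbook and report types significant differences were found only for some metrics. The general embedding models performed better than the domain-specific models, with a statistically significant difference. This study evaluated the effect of knowledge source composition and embedding model on system performance when developing a RAG system for radiation therapy education. It is expected to serve as useful baseline data for optimization in the future design and development of artificial intelligence chatbots for medical education.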
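The retrieval step that an embedding model feeds in a RAG pipeline can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors are toy values rather than outputs of any of the seven embedding models, the chunk labels are hypothetical, and cosine similarity is assumed as the ranking score.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, chunks, k=2):
    """Rank (text, vector) chunks by similarity to the query and keep the best k."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values, chosen only to make the ranking visible).
chunks = [
    ("chunk A", [0.9, 0.1, 0.0]),
    ("chunk B", [0.2, 0.8, 0.1]),
    ("chunk C", [0.7, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]

top = retrieve_top_k(query_vec, chunks, k=2)
for score, text in top:
    print(f"{text}: {score:.3f}")
```

In a full system the retrieved chunks would be placed in the prompt of the generation model. Swapping the embedding model changes only how the vectors are produced, which is why the knowledge source and the embedding model can be varied independently in a comparison of this kind.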

English Abstract

The purpose of this study was to evaluate the performance differences between knowledge source types (textbook, report, and summary) and embedding models (general vs. domain-specific) in a Retrieval-Augmented Generation (RAG) based radiation therapy education system. The knowledge sources used were textbooks, reports, and summaries. The language generation model was OpenAI's GPT-3.5-turbo, and seven embedding models were used (OpenAI, MiniLM, BGE, E5, KoSBERT, BioClinicalBERT, and PubMedBERT). Seven metrics {BLEU, ROUGE-L, Sentence BERT (SBERT), BERTScore, Semantic similarity, Accuracy, Time (s)} were used to evaluate performance. The mean and standard deviation (SD) were calculated from 10 repeated measurements. Independent-samples t-tests and two-way ANOVA were used for statistical analysis. The average semantic similarity of the summary knowledge source type was 0.898 (89.8%), statistically superior to the textbook and report types (p < 0.001). Overall, the general embedding models performed better than the domain-specific models for the summary knowledge source type, while only some metrics differed significantly for the textbook and report types. The general embedding models outperformed the domain-specific models. This study evaluated the impact of knowledge source configuration and embedding model on system performance when developing a RAG system for radiation therapy education. It is expected to provide a useful basis for optimization in the future design and development of artificial intelligence (AI) chatbots for medical education.
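The summary statistics mentioned in the abstract (mean and SD over 10 repeated measurements, followed by an independent-samples t-test) can be sketched as below. This is a minimal sketch under assumptions: the scores are made-up placeholders and not the paper's data, the configuration names are hypothetical, and the classic pooled-variance (Student) form of the t statistic is assumed because the abstract does not state which variant was used. The p-value lookup is omitted since it needs a t-distribution table or a statistics library.

```python
import math
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of repeated measurements."""
    return statistics.mean(scores), statistics.stdev(scores)


def independent_t_statistic(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_err == 0.0:
        raise ValueError("both samples have zero variance; t is undefined")
    return (mean_a - mean_b) / std_err, df


# Placeholder scores for two hypothetical configurations, 10 repeats each.
config_x = [0.81, 0.79, 0.80, 0.82, 0.78, 0.80, 0.81, 0.79, 0.80, 0.80]
config_y = [0.75, 0.74, 0.76, 0.73, 0.75, 0.77, 0.74, 0.75, 0.76, 0.75]

mean_x, sd_x = summarize(config_x)
mean_y, sd_y = summarize(config_y)
t_value, dof = independent_t_statistic(config_x, config_y)
print(f"config_x: mean={mean_x:.3f}, SD={sd_x:.3f}")
print(f"config_y: mean={mean_y:.3f}, SD={sd_y:.3f}")
print(f"t={t_value:.2f}, df={dof}")
```

A two-way ANOVA would extend this comparison to two factors at once (knowledge source type and embedding model) and their interaction; that step is usually delegated to a statistics package rather than written by hand.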

Keywords

Retrieval-augmented generation (RAG), Large language model (LLM), Chatbot, Education, Radiation therapy