Publications
Our peer-reviewed research papers and preprints
2025
🌟 The Impact of Negated Text on Hallucination with Large Language Models
Empirical Methods in Natural Language Processing (EMNLP), 2025
🌟 KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval
Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025
🌟 MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents
Empirical Methods in Natural Language Processing (EMNLP), 2025
🌟 Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models
Empirical Methods in Natural Language Processing (EMNLP), 2025
LimaCost: Data Valuation for Instruction Tuning of Large Language Models
Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025
🌟 K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
International Conference on Learning Representations (ICLR), 2025
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models
North American Chapter of the ACL (NAACL) Findings, 2025
CoME: An Unlearning-based Approach to Conflict-free Model Editing
North American Chapter of the ACL (NAACL), 2025
An analysis on language transfer of pre-trained language model with cross-lingual post-training
Expert Systems with Applications, 2025
2024
🌟 Post-negation Text Induce New Hallucinations in Large Language Models
Annual Conference on Human and Cognitive Language Technology (HCLT), 2024
🌟 KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024
Hyper-BTS Dataset: Scalability and Enhanced Analysis of Back TranScription (BTS) for ASR Post-Processing
European Chapter of the ACL (EACL) Findings, 2024
Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation
European Chapter of the ACL (EACL) Findings, 2024
Intelligent Predictive Maintenance RAG framework for Power Plants: Enhancing QA with StyleDFS and Domain Specific Instruction Tuning
Empirical Methods in Natural Language Processing (EMNLP) Industry Track, 2024
Length-aware Byte Pair Encoding for Mitigating Over-segmentation in Korean Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024
Leveraging Pre-existing Resources for Data-Efficient Counter-Narrative Generation in Korean
Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), 2024
Detecting Critical Errors Considering Cross-Cultural Factors in English-Korean Translation
Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), 2024
Exploiting Hanja-Based Resources in Processing Korean Historic Documents Written by Common Literati
IEEE Access, 2024
2023
🌟 CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
Empirical Methods in Natural Language Processing (EMNLP), 2023
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing
Empirical Methods in Natural Language Processing (EMNLP), 2023
CReTIHC: Designing Causal Reasoning Tasks about Temporal Interventions and Hallucinated Confoundings
Empirical Methods in Natural Language Processing (EMNLP) Findings, 2023
Doubts on the reliability of parallel corpus filtering
Expert Systems with Applications, 2023
Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection
International Joint Conference on Natural Language Processing and Conference of the Asia-Pacific Chapter of the ACL (IJCNLP-AACL), 2023
Uncovering the Risks and Drawbacks Associated with the Use of Synthetic Data for Grammatical Error Correction
IEEE Access, 2023
PEEP-Talk: A Situational Dialogue-based Chatbot for English Education
Annual Meeting of the Association for Computational Linguistics (ACL) Demo, 2023
2022
🌟 PU-GEN: Enhancing generative commonsense reasoning for language models with human-centered knowledge
Knowledge-Based Systems, 2022
🌟 Plain Template Insertion: Korean-Prompt-Based Engineering for Few-Shot Learners
IEEE Access, 2022
🌟 A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
North American Chapter of the ACL (NAACL) Findings, 2022
🌟 Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions
Mathematics, 2022
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation
International Conference on Computational Linguistics (COLING), 2022
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities
Asia-Pacific Chapter of the ACL (AACL) Demo, 2022
Priming Ancient Korean Neural Machine Translation
Language Resources and Evaluation Conference (LREC), 2022
Empirical Analysis of Noising Scheme based Synthetic Data Generation for Automatic Post-editing
Language Resources and Evaluation Conference (LREC), 2022
Word-Level Quality Estimation for Korean-English Neural Machine Translation
IEEE Access, 2022
An Automatic Post Editing with Efficient and Simple Data Generation Method
IEEE Access, 2022
Utilization Strategy of User Engagements in Korean Fake News Detection
IEEE Access, 2022
2021
🌟 KommonGen: A Dataset for Korean Generative Commonsense Reasoning Evaluation
Annual Conference on Human and Cognitive Language Technology (HCLT), 2021
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
Workshop on Asian Translation (WAT), 2021
Grounded Vocabulary for Image Retrieval Using a Modified Multi-Generator Generative Adversarial Network
IEEE Access, 2021
An Empirical Study on Automatic Post Editing for Neural Machine Translation
IEEE Access, 2021
Automatic Knowledge Augmentation for Generative Commonsense Reasoning
NeurIPS Data-Centric AI Workshop, 2021
Citing Our Work
If you use our research in your work, please cite the relevant papers.
* denotes corresponding author