ELION Lab
Language Intelligence & Representation
Guided by the vision of Elucidating Language Intelligence & RepresentatiON (ELION), our research group explores the frontiers of natural language processing and AI systems. We seek to understand how language models represent knowledge, reason, and generate meaning, working toward interpretable, controllable, and trustworthy language intelligence for real-world interaction.
Research Areas
Language is the most fundamental modality of AI, and language models are ideal systems for connecting knowledge across disciplines. Toward the vision of All Languages, One Mind, we focus on the following research topics.
Language Models & World Models
We develop and analyze language models to study how scale, structure, and representations support reasoning, generalization, controllability, and world modeling.
Thinking & Reasoning
We study the internal representations and processes that enable language models to perform multi-step thinking and reasoning, aiming to interpret how conclusions are formed and where they fail.
Hallucination Detection & Mitigation
We investigate hallucination as a representational and epistemic failure in language models, developing methods to detect, analyze, and mitigate ungrounded or misleading generations.
Model Editing
We explore model editing as a means of modifying internal knowledge representations, enabling correction, updating, and control of language model behavior without retraining.
Multilinguality & Multimodality
We study how language models represent and align meaning across languages and modalities, with the goal of building unified models that generalize beyond linguistic and modal boundaries.
AI Agents
We design and analyze language-model-based AI agents, focusing on how internal representations support planning, decision-making, and interaction in dynamic environments.
Ongoing Projects
1. Continual Representation Learning in Language Models
We study how language models continuously update and refine internal representations over time, enabling lifelong learning without catastrophic forgetting. This extends to Life2Vec-style research that predicts events and risks across human and model lifecycles using data of any type.
2. Reducing Hallucination in Multi-Agent Systems
We investigate how hallucinations emerge in multi-agent interactions and develop methods to analyze, verify, and mitigate ungrounded generation, extending beyond single-agent settings to agentic-AI-based multidimensional hallucination validation and control.
3. Anti-Scalability in Language Model Reasoning
We explore the limits of scaling for reasoning, examining when and why larger language models fail to reason better and identifying alternative mechanisms beyond scale for robust reasoning.
4. Multimodal Language Modeling and World Understanding
We enhance dependency parsing and semantic chunking in vision-language models to improve multimodal document retrieval and understanding, studying how models integrate vision and other modalities to reason about physical and social world contexts beyond text.
5. World-Interactive Data Augmentation
As online data becomes saturated, we explore world-interactive data augmentation, generating high-level reasoning and multidimensional language data through interaction with real-world environments.
We are now looking for talented M.S./Ph.D. students and research interns.
Latest News
Stay updated with our recent achievements and announcements.
- Mar 2026: Established the ELION Lab at Konkuk University
- Aug 2025: 5 papers accepted at EMNLP 2025
- May 2025: 1 paper accepted at ACL 2025
- Feb 2025: 1 paper accepted at ICLR 2025
- Feb 2025: 2 papers accepted at NAACL 2025
Featured Publications
Selected recent papers from our research group.
The Impact of Negated Text on Hallucination with Large Language Models
Empirical Methods in Natural Language Processing (EMNLP), 2025
KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval
Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents
Empirical Methods in Natural Language Processing (EMNLP), 2025
Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models
Empirical Methods in Natural Language Processing (EMNLP), 2025
K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
International Conference on Learning Representations (ICLR), 2025
Post-negation Text Induce New Hallucinations in Large Language Models
Annual Conference on Human and Cognitive Language Technology (HCLT), 2024
KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024
CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
Empirical Methods in Natural Language Processing (EMNLP), 2023
PU-GEN: Enhancing generative commonsense reasoning for language models with human-centered knowledge
Knowledge-Based Systems, 2022
Plain Template Insertion: Korean-Prompt-Based Engineering for Few-Shot Learners
IEEE Access, 2022
A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
North American Chapter of the ACL (NAACL) Findings, 2022
Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions
Mathematics, 2022
KommonGen: A Dataset for Korean Generative Commonsense Reasoning Evaluation
Annual Conference on Human and Cognitive Language Technology (HCLT), 2021