ELION Lab

Language Intelligence & Representation

Our research group explores the frontiers of natural language processing and AI systems, guided by the vision of Elucidating Language Intelligence & RepresentatiON (ELION). We study how language models represent knowledge, reason, and generate meaning, working toward interpretable, controllable, and trustworthy language intelligence for real-world interaction.

Research Areas

Language is the most fundamental modality of AI, and language models are ideal systems for connecting knowledge across disciplines. Toward the vision of All Languages, One Mind, we focus on the following research topics.

🧠

Language Models & World Models

We develop and analyze language models to study how scale, structure, and representations support reasoning, generalization, controllability, and world modeling.

πŸ’­

Thinking & Reasoning

We study the internal representations and processes that enable language models to perform multi-step thinking and reasoning, aiming to interpret how conclusions are formed and why they fail.

πŸ”

Hallucination Detection & Mitigation

We investigate hallucination as a representational and epistemic failure in language models, developing methods to detect, analyze, and mitigate ungrounded or misleading generations.

✏️

Model Editing

We explore model editing as a means of modifying internal knowledge representations, enabling correction, updating, and control of language model behavior without retraining.

🌐

Multilinguality & Multimodality

We study how language models represent and align meaning across languages and modalities, with the goal of building unified models that generalize beyond linguistic and modal boundaries.

πŸ€–

AI Agents

We design and analyze language-model-based AI agents, focusing on how internal representations support planning, decision-making, and interaction in dynamic environments.

NOW HIRING!

Ongoing Projects

1. Continual Representation Learning in Language Models

We study how language models continuously update and refine internal representations over time, enabling lifelong learning without catastrophic forgetting. This line of work extends to Life2Vec-style research that predicts events and risks across human and model lifecycles using AnyType data.

2. Reducing Hallucination in Multi-Agent Systems

We investigate how hallucinations emerge in multi-agent interactions and develop methods to analyze, verify, and mitigate ungrounded generation. This work extends beyond single-agent settings to agentic AI-based multidimensional hallucination validation and control.

3. Anti-Scalability in Language Model Reasoning

We explore the limits of scaling for reasoning, examining when and why larger language models fail to reason better and identifying alternative mechanisms beyond scale for robust reasoning.

4. Multimodal Language Modeling and World Understanding

We enhance dependency parsing and semantic chunking in vision–language models to improve multimodal document retrieval and understanding. We also study how models integrate vision and other modalities to reason about physical and social world contexts beyond text.

5. World-Interactive Data Augmentation

As online data becomes saturated, we explore world-interactive data augmentation, generating high-level reasoning and multidimensional language data through interaction with real-world environments.

We are now looking for talented M.S./Ph.D. students and research interns.

APPLY

Latest News

Stay updated with our recent achievements and announcements.

  • Mar 2026 🎉 Established the ELION Lab at Konkuk University.
  • Aug 2025 🔥 5 papers accepted at EMNLP 2025.
  • May 2025 🔥 1 paper accepted at ACL 2025.
  • Feb 2025 🔥 1 paper accepted at ICLR 2025.
  • Feb 2025 🔥 2 papers accepted at NAACL 2025.

Featured Publications

Selected recent papers from our research group.

EMNLP 2025

🌟 The Impact of Negated Text on Hallucination with Large Language Models

Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2025

EMNLP 2025 Findings

🌟 KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval

Jaehyung Seo, Dahyun Jung, Jaewook Lee, Yongchan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025

EMNLP 2025

🌟 MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents

Joong Min Shin, Chanjun Park, Jeongbae Park, Jaehyung Seo*, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2025

EMNLP 2025

🌟 Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models

Hyeonseok Moon, Seongtae Hong, Jaehyung Seo*, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2025

ICLR 2025

🌟 K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models

Jaehyung Seo, Heuiseok Lim*

International Conference on Learning Representations (ICLR), 2025

HCLT 2024 πŸ† Best Paper

🌟 Post-negation Text Induce New Hallucinations in Large Language Models

Jaehyung Seo, Aram So, Heuiseok Lim*

Annual Conference on Human and Cognitive Language Technology (HCLT), 2024

ACL 2024 Findings

🌟 KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models

Jaehyung Seo, Jaewook Lee, Chanjun Park, SeongTae Hong, Seungjun Lee, Heuiseok Lim*

Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024

EMNLP 2023

🌟 CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients

Jaehyung Seo, Hyeonseok Moon, Jaewook Lee, Sugyeong Eo, Chanjun Park, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2023

Knowledge-Based Systems

🌟 PU-GEN: Enhancing generative commonsense reasoning for language models with human-centered knowledge

Jaehyung Seo, Dongsuk Oh, Sugyeong Eo, Chanjun Park, Kisu Yang, Hyeonseok Moon, Kinam Park, Heuiseok Lim*

Knowledge-Based Systems, 2022

IEEE Access

🌟 Plain Template Insertion: Korean-Prompt-Based Engineering for Few-Shot Learners

Jaehyung Seo, Hyeonseok Moon, Chanhee Lee, Sugyeong Eo, Chanjun Park, Jihoon Kim, Changwoo Chun, Heuiseok Lim*

IEEE Access, 2022

NAACL 2022 Findings

🌟 A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation

Jaehyung Seo*, Seounghoon Lee*, Chanjun Park, Yoonna Jang, Hyeonseok Moon, Sugyeong Eo, Seonmin Koo, Heuiseok Lim*

North American Chapter of the ACL (NAACL) Findings, 2022

Mathematics

🌟 Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions

Jaehyung Seo, Taemin Lee, Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Imatitikua D Aiyanyo, Kinam Park, Aram So, Sungmin Ahn, Jeongbae Park*

Mathematics, 2022

HCLT 2021 πŸ† Outstanding Paper

🌟 KommonGen: A Dataset for Korean Generative Commonsense Reasoning Evaluation

Jaehyung Seo, Chanjun Park, Hyeonseok Moon, Sugyeong Eo, Myunghoon Kang, Seounghoon Lee, Heuiseok Lim*

Annual Conference on Human and Cognitive Language Technology (HCLT), 2021