ELION Lab

Language Intelligence & Representation

Guided by the vision of Elucidating Language Intelligence & RepresentatiON (ELION), our research group explores the frontiers of natural language processing and AI systems. We study how language models represent knowledge, reason, and generate meaning, working toward interpretable and controllable language intelligence for real-world interaction.

연ꡬ λΆ„μ•Ό (Research Areas)

Toward the vision of All Languages, One Mind, we focus on the following research topics.

🧠

Language Models & World Models

We develop and analyze language models to study how scale, structure, and representations support reasoning, generalization, controllability, and world modeling.

πŸ’­

Thinking & Reasoning

We study the internal representations and processes that enable language models to perform multi-step thinking and reasoning, aiming to interpret how conclusions are formed and where they fail.

πŸ”

Hallucination Detection & Mitigation

We investigate hallucination as a representational and epistemic failure in language models, developing methods to detect, analyze, and mitigate ungrounded or misleading generations.

✏️

Model Editing

We explore model editing as a means of modifying internal knowledge representations, enabling correction, updating, and control of language model behavior without retraining.

🌐

Multilinguality & Multimodality

We study how language models represent and align meaning across languages and modalities, with the goal of building unified models that generalize beyond linguistic and modal boundaries.

πŸ€–

AI Agents

We design and analyze language-model-based AI agents, focusing on how internal representations support planning, decision-making, and interaction in dynamic environments.

λͺ¨μ§‘ 쀑! (NOW HIRING!)

πŸš€ Ongoing Projects

ELION Lab is dedicated to pushing the boundaries of language models to build next-generation AI that interacts with the real world.

1. Continual Representation Learning 🧠

We investigate methods for language models to continuously update and refine their internal knowledge without experiencing "catastrophic forgetting." Expanding this into Life2Vec-style research, we aim to build lifelong learning systems that utilize AnyType data to predict events and risks across both human and model lifecycles.

* Status: Multiple papers currently under review at top-tier global conferences.
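As a toy illustration of the consolidation idea behind this project, the sketch below trains a single parameter on one task, then on a second task with and without an EWC-style quadratic anchor toward the first task's optimum. The losses, the Fisher proxy, and all names are illustrative assumptions, not the lab's actual method.

```python
# Minimal sketch of an EWC-style penalty against catastrophic forgetting
# (illustrative assumptions only): one parameter, two quadratic "tasks".

def grad_descent(grad, w, lr=0.1, steps=500):
    """Plain gradient descent on a scalar parameter."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Task A: loss (w - 2)^2  ->  optimum w_A = 2
w_A = grad_descent(lambda w: 2 * (w - 2), 0.0)

# Fisher proxy: curvature of task A's loss at w_A (here a constant 2)
fisher = 2.0
lam = 1.0  # strength of the consolidation penalty

# Task B alone: loss (w + 1)^2 drifts to -1, "forgetting" task A.
w_plain = grad_descent(lambda w: 2 * (w + 1), w_A)

# Task B with the penalty: loss (w + 1)^2 + lam * fisher * (w - w_A)^2
w_ewc = grad_descent(lambda w: 2 * (w + 1) + 2 * lam * fisher * (w - w_A), w_A)

# w_ewc settles between the two task optima, staying closer to w_A
# than w_plain does.
```

Real continual-learning methods apply this per-parameter with an estimated Fisher information matrix; the scalar version only shows the trade-off the penalty creates.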

2. Reducing Hallucination & RAG πŸ›‘οΈ

We explore cutting-edge methodologies to detect and mitigate the hallucination phenomenon in LLMs. Internally, we focus on measuring model confidence through calibration and uncertainty estimation. Externally, we strive to maximize the reliability and consistency of generated outputs by developing advanced grounding techniques within precise Retrieval-Augmented Generation (RAG) pipelines.

* Project: Supported by IITP (2024–2026, Research on the Reliability and Coherence of Generative AI Outcomes).
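The calibration side of this work can be illustrated with a minimal expected calibration error (ECE) computation over confidence bins. The binning scheme and the toy predictions below are illustrative assumptions, not project code.

```python
# Hedged sketch: expected calibration error (ECE), the gap between a
# model's stated confidence and its actual accuracy, averaged over bins.

def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece, n = 0.0, len(confidences)
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece

# A model that claims 90% confidence but is right 25% of the time
# is poorly calibrated; one whose confidence matches its accuracy is not.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 0, 0])
calibrated = expected_calibration_error([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

A lower ECE means the model's confidence is a more trustworthy signal for deciding when a generation may be hallucinated.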

3. Multi-Agent Systems & Orchestration πŸ€–

Moving beyond the limitations of single models, we aim to develop practical and precise multi-agent systems. Through efficient orchestration, we integrate planning and ontology exploration to achieve high performance even with lightweight models, enabling them to execute complex, multi-step tasks.

* Status: New research proposals in progress; actively expanding the agentic AI agenda.
* Collaboration: Joint research with the Korea University Agentic AI Team.
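A minimal sketch of the orchestration idea, under the assumption that specialist "agents" can be modeled as plain functions behind a router; all names here are hypothetical, not the systems under development.

```python
# Toy orchestration sketch (illustrative only): a router runs a plan of
# named steps, threading each step's output into the next agent.

def retriever(query):
    """Stand-in for a retrieval agent."""
    return f"docs for '{query}'"

def summarizer(text):
    """Stand-in for a summarization agent."""
    return text[:20] + "..."

AGENTS = {"retrieve": retriever, "summarize": summarizer}

def orchestrate(plan, payload):
    """Execute a list of agent names in order, passing the payload along."""
    for step in plan:
        payload = AGENTS[step](payload)
    return payload

result = orchestrate(["retrieve", "summarize"], "multi-agent systems")
```

In a real system the plan itself would be produced by a planning model and each agent would be an LLM call or tool; the dictionary-dispatch skeleton is the part that stays the same.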

4. Multimodal LM & World Understanding πŸ‘οΈ

We conduct research on integrating diverse modalities, such as vision, to understand physical and social contexts beyond text. By enhancing dependency parsing and semantic chunking in Vision-Language Models (VLMs), we advance multimodal document retrieval and understanding, and empower AI to reason with real-world common sense.

* Collaboration: Ongoing joint research with the Korea University Document AI & Multimodal Team.
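The semantic chunking idea can be sketched as a similarity-based splitter: start a new chunk wherever consecutive sentences stop resembling each other. The bag-of-words "embedding" and the threshold below are toy assumptions standing in for a real encoder.

```python
import math

# Hedged sketch of semantic chunking: split a document where the
# similarity between consecutive sentences drops below a threshold.

def embed(sentence):
    """Toy embedding: a bag-of-words count vector."""
    vec = {}
    for word in sentence.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, threshold=0.2):
    chunks, current = [], [sentences[0]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) < threshold:
            chunks.append(current)  # similarity dropped: close the chunk
            current = []
        current.append(cur)
    chunks.append(current)
    return chunks

doc = ["the model reads text", "the model writes text", "stock prices fell today"]
chunks = semantic_chunks(doc)  # splits before the unrelated sentence
```

With a real sentence encoder the same loop yields retrieval units that respect topic boundaries rather than fixed token windows.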

5. World-Interactive Data Augmentation 🌐

To address the era of "data exhaustion" as online data becomes saturated, we explore world-interactive data augmentation. We research innovative data engines that autonomously generate high-level reasoning and multi-dimensional language data through direct feedback and interaction with real-world environments.

* Collaboration: Upcoming international collaboration with Singapore A*STAR Research and Microsoft Research Asia (MSRA).
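A toy sketch of the environment-verified generation loop: a proposer emits candidate QA pairs, and only pairs whose answers an "environment" can independently confirm are kept. Here the environment is simple arithmetic that can be recomputed; all function names are hypothetical.

```python
import random

# Illustrative sketch (assumptions only): build training pairs whose
# answers are checked by interacting with an environment, rather than
# scraped from saturated web text.

def propose(rng):
    """Candidate generator: a question and its claimed answer."""
    a, b = rng.randint(1, 99), rng.randint(1, 99)
    return f"What is {a} + {b}?", a + b

def environment_check(question, answer):
    """The environment recomputes the ground truth from the question."""
    a, b = [int(t) for t in question.replace("?", "").split() if t.isdigit()]
    return a + b == answer

def generate_dataset(n, seed=0):
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        q, ans = propose(rng)
        if environment_check(q, ans):  # keep only verified pairs
            data.append((q, ans))
    return data

dataset = generate_dataset(3)
```

The point of the pattern is the filter: a richer environment (a simulator, a code executor, a tool API) plays the same verifying role for higher-level reasoning data.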

ELION Labμ—μ„œ 인곡지λŠ₯ 연ꡬ에 열정을 μ§€λ‹Œ 인턴, 석사, 박사과정 학생을 λͺ¨μ§‘ν•©λ‹ˆλ‹€
(We are looking for talented M.S./Ph.D. students and research interns.)

APPLY

μ΅œμ‹  λ‰΄μŠ€ (Latest News)

Stay updated with our recent achievements and announcements.

  • Mar 2026 πŸ”₯ 2 papers accepted at CVPR 2026.
  • Mar 2026 πŸŽ‰ Established the ELION Lab at Konkuk University.
  • Aug 2025 πŸ”₯ 5 papers accepted at EMNLP 2025.
  • May 2025 πŸ”₯ 1 paper accepted at ACL 2025.
  • Feb 2025 πŸ”₯ 1 paper accepted at ICLR 2025.
  • Feb 2025 πŸ”₯ 2 papers accepted at NAACL 2025.

μ£Όμš” λ…Όλ¬Έ (Featured Publications)

Selected recent papers from our research group.

CVPR 2026

🌟 Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Uncertainty Estimation

Yongchan Chun, Chanhee Park, Jeongho Yoon, Jaehyung Seo*, Heuiseok Lim*

Conference on Computer Vision and Pattern Recognition (CVPR), 2026

CVPR 2026

🌟 M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models

Joongmin Shin, Jeongbae Park, Jaehyung Seo*, Heuiseok Lim*

Conference on Computer Vision and Pattern Recognition (CVPR), 2026

EMNLP 2025

🌟 The Impact of Negated Text on Hallucination with Large Language Models

Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2025

EMNLP 2025 Findings

🌟 KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval

Jaehyung Seo, Dahyun Jung, Jaewook Lee, Yongchan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP) Findings, 2025

EMNLP 2025

🌟 MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents

Joong Min Shin, Chanjun Park, Jeongbae Park, Jaehyung Seo*, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2025

EMNLP 2025

🌟 Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models

Hyeonseok Moon, Seongtae Hong, Jaehyung Seo*, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2025

ICLR 2025

🌟 K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models

Jaehyung Seo, Heuiseok Lim*

International Conference on Learning Representations (ICLR), 2025

HCLT 2024 πŸ† Best Paper

🌟 Post-negation Text Induce New Hallucinations in Large Language Models

Jaehyung Seo, Aram So, Heuiseok Lim*

Annual Conference on Human and Cognitive Language Technology (HCLT), 2024

ACL 2024 Findings

🌟 KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models

Jaehyung Seo, Jaewook Lee, Chanjun Park, SeongTae Hong, Seungjun Lee, Heuiseok Lim*

Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2024

EMNLP 2023

🌟 CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients

Jaehyung Seo, Hyeonseok Moon, Jaewook Lee, Sugyeong Eo, Chanjun Park, Heuiseok Lim*

Empirical Methods in Natural Language Processing (EMNLP), 2023

Knowledge-Based Systems

🌟 PU-GEN: Enhancing generative commonsense reasoning for language models with human-centered knowledge

Jaehyung Seo, Dongsuk Oh, Sugyeong Eo, Chanjun Park, Kisu Yang, Hyeonseok Moon, Kinam Park, Heuiseok Lim*

Knowledge-Based Systems, 2022

IEEE Access

🌟 Plain Template Insertion: Korean-Prompt-Based Engineering for Few-Shot Learners

Jaehyung Seo, Hyeonseok Moon, Chanhee Lee, Sugyeong Eo, Chanjun Park, Jihoon Kim, Changwoo Chun, Heuiseok Lim*

IEEE Access, 2022

NAACL 2022 Findings

🌟 A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation

Jaehyung Seo*, Seounghoon Lee*, Chanjun Park, Yoonna Jang, Hyeonseok Moon, Sugyeong Eo, Seonmin Koo, Heuiseok Lim*

North American Chapter of the ACL (NAACL) Findings, 2022

Mathematics

🌟 Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions

Jaehyung Seo, Taemin Lee, Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Imatitikua D Aiyanyo, Kinam Park, Aram So, Sungmin Ahn, Jeongbae Park*

Mathematics, 2022

HCLT 2021 πŸ† Outstanding Paper

🌟 KommonGen: A Dataset for Korean Generative Commonsense Reasoning Evaluation

Jaehyung Seo, Chanjun Park, Hyeonseok Moon, Sugyeong Eo, Myunghoon Kang, Seounghoon Lee, Heuiseok Lim*

Annual Conference on Human and Cognitive Language Technology (HCLT), 2021