
A Novel Translation-Driven Approach to Enhance LLM Performance on Low-Resource Languages

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Large Language Models (LLMs) excel in high-resource languages but struggle with low-resource languages due to limited training data and insufficient representation during pre-training. This disparity creates significant barriers to deploying advanced NLP technologies across diverse linguistic communities. This paper presents TALL (Trainable Architecture for Enhancing LLM Performance in Low-Resource Languages), a novel framework that strategically integrates an LLM with two bilingual translation models to bridge the performance gap between high- and low-resource languages. TALL transforms low-resource inputs into high-resource representations through a multi-stage pipeline, leveraging the LLM's robust capabilities while preserving essential linguistic features through carefully designed dimension alignment layers and custom transformer components. The architecture addresses the challenge of integrating models with different hidden dimensions and representation spaces, enabling seamless knowledge transfer across languages. Our comprehensive experiments on Hebrew demonstrate significant improvements over several competitive baselines, including direct LLM use, naive translation approaches, fine-tuning strategies, and soft prompting techniques. Notably, TALL achieves up to 5.59% accuracy compared to 2.93% for the next best approach, representing a substantial performance gain. The architecture employs a parameter-efficient strategy, freezing large pre-trained components while training only lightweight adapter modules, effectively balancing computational efficiency with performance gains. This approach makes TALL particularly suitable for resource-constrained environments while maintaining strong cross-lingual transfer capabilities. Code is available at https://github.com/MosheOfer1/TALL
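
The abstract's parameter-efficient strategy (freeze the large pre-trained components, train only lightweight adapters that align mismatched hidden dimensions) can be illustrated with a minimal PyTorch sketch. This is not the authors' released code (see the repository above for that); the model sizes, module names, and adapter shape here are illustrative assumptions.

```python
# Minimal sketch of a dimension-alignment adapter between two frozen models,
# assuming a small translation encoder (d=512) feeding a larger LLM (d=4096).
import torch
import torch.nn as nn

class DimensionAlignmentAdapter(nn.Module):
    """Projects encoder hidden states (d_src) into the LLM's hidden space (d_tgt)."""
    def __init__(self, d_src: int, d_tgt: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(d_src, d_tgt),
            nn.GELU(),
            nn.Linear(d_tgt, d_tgt),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden_states)

def freeze(module: nn.Module) -> None:
    # Large pre-trained components stay frozen; no gradients are computed for them.
    for p in module.parameters():
        p.requires_grad = False

# Hypothetical stand-in for the low->high resource translation encoder.
translator_enc = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=2,
)
freeze(translator_enc)

adapter = DimensionAlignmentAdapter(d_src=512, d_tgt=4096)
# Only the lightweight adapter receives gradient updates.
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)

src = torch.randn(2, 16, 512)                # dummy batch of encoder inputs
aligned = adapter(translator_enc(src))       # shape (2, 16, 4096), LLM-ready
```

Training only the adapter keeps the trainable parameter count small relative to the frozen backbones, which is what makes the approach suitable for resource-constrained environments.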

Original language: English
Title of host publication: Proceedings - 2025 IEEE 37th International Conference on Tools with Artificial Intelligence, ICTAI 2025
Publisher: IEEE Computer Society
Pages: 347-354
Number of pages: 8
ISBN (Electronic): 9798331549190
DOIs
State: Published - 2025
Event: 37th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2025 - Athens, Greece
Duration: 3 Nov 2025 – 5 Nov 2025

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
ISSN (Print): 1082-3409

Conference

Conference: 37th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2025
Country/Territory: Greece
City: Athens
Period: 3/11/25 – 5/11/25

Keywords

  • cross-lingual transfer
  • Hebrew NLP
  • large language models
  • low-resource languages
  • parameter-efficient adaptation
