Browsing by Author "Manafi, Shadi, author"
Now showing 1 - 1 of 1
Item Open Access
Smart transfers: challenges and opportunities in boosting low-resource language models with high-resource language power (Colorado State University. Libraries, 2024)
Manafi, Shadi, author; Krishnaswamy, Nikhil, advisor; Ortega, Francisco R., committee member; Blanchard, Nathaniel, committee member; Chong, Edwin K. P., committee member

Large language models (LLMs) are predominantly built for high-resource languages (HRLs), leaving low-resource languages (LRLs) underrepresented. To bridge this gap, knowledge transfer from HRLs to LRLs is crucial, but it must remain sensitive to LRL-specific traits and not be biased toward the HRL, whose training data is far larger. This dissertation addresses the opportunities and challenges of cross-lingual transfer in two main streams.

The first stream explores cross-lingual zero-shot learning in Multilingual Language Models (MLLMs) such as mBERT and XLM-R for tasks including Named Entity Recognition (NER) and section-title prediction. The research constructs adversarial test sets, replacing named entities and modifying common words, to evaluate transfer accuracy. Results show that word overlap between languages is essential for both tasks, highlighting the need to account for language-specific features and biases.

The second stream develops sentence Transformers, which produce sentence embeddings by mean-pooling contextualized word embeddings; these embeddings often struggle to capture sentence similarity effectively. To address this, we fine-tuned an English sentence Transformer using a word-to-word translation approach and a triplet loss function. Despite starting from a pre-trained English BERT model and using only word-by-word translations that ignore sentence structure, the results were competitive. This suggests that mean-pooling may weaken the attention mechanism, causing the model to rely more on individual word embeddings than on sentence structure, potentially limiting its comprehension of sentence meaning.

Together, these streams reveal the complexities of cross-lingual transfer, guiding more effective and equitable use of HRLs to support LRLs in NLP applications.
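As a rough illustration of the first stream's adversarial test sets, the sketch below swaps each named entity in a BIO-tagged sentence for a random substitute of the same type. The replacement pool, tag scheme, and function names are illustrative assumptions, not the dissertation's actual pipeline.

```python
# A minimal sketch of perturbing a BIO-tagged NER example by replacing
# named entities. REPLACEMENTS is a hypothetical substitute pool; a real
# test set would draw from curated entity inventories per language.
import random

REPLACEMENTS = {"PER": ["Aylin", "Tomás"], "LOC": ["Tabriz", "Cusco"]}

def perturb(tokens, tags, rng=random.Random(0)):
    """Replace each entity span of a known type with one substitute token."""
    out_toks, out_tags = [], []
    i = 0
    while i < len(tokens):
        ent = tags[i][2:] if tags[i].startswith("B-") else None
        if ent in REPLACEMENTS:
            out_toks.append(rng.choice(REPLACEMENTS[ent]))
            out_tags.append("B-" + ent)
            i += 1
            while i < len(tokens) and tags[i] == "I-" + ent:
                i += 1  # collapse multi-token spans into the single substitute
        else:
            out_toks.append(tokens[i])
            out_tags.append(tags[i])
            i += 1
    return out_toks, out_tags

# Example: ["Ali", "visited", "Paris"] / ["B-PER", "O", "B-LOC"]
# might become ["Aylin", "visited", "Cusco"] with the same tags.
```

Holding the gold labels fixed while swapping surface forms isolates how much a zero-shot model leans on memorized entity strings rather than context.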
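For the second stream, the sketch below shows mean-pooled sentence embeddings and a triplet loss in PyTorch with Hugging Face transformers, assuming a generic BERT checkpoint and margin; the translation pair is a hypothetical word-by-word example, not the dissertation's training data.

```python
# A minimal sketch of mean-pooling and triplet-loss fine-tuning.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def mean_pool(texts):
    """Mean-pool contextualized token embeddings into one sentence vector."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)   # zero out padding positions
    return (hidden * mask).sum(1) / mask.sum(1)

# Pull the anchor toward its word-by-word translation (positive) and
# away from an unrelated sentence (negative).
anchor = mean_pool(["the cat sat on the mat"])
positive = mean_pool(["le chat assis sur le tapis"])  # hypothetical word-by-word pair
negative = mean_pool(["stock prices fell sharply today"])
loss = F.triplet_margin_loss(anchor, positive, negative, margin=1.0)
loss.backward()  # gradients flow back through the encoder for fine-tuning
```

Because mean-pooling averages over every token, word order contributes only through the contextualized embeddings themselves, which is consistent with the abstract's observation that word-by-word translations can still yield competitive sentence similarity.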