Our reading list for today covers the following ACL Best Papers:
- A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive
- Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
- Language Models Resist Alignment
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive
Paper Link: https://arxiv.org/abs/2402.11005
This paper introduces a theory proposing that Large Language Model response sampling is governed by both a descriptive component, reflecting statistical norms, and a prescriptive component, representing an implicit notion of an ideal. Empirical studies across a range of concepts demonstrate that LLM outputs consistently deviate from the statistical average toward this internal ideal, a tendency that is robust to debiasing, is exacerbated in larger, instruction-tuned models, and has consequences for critical applications such as medical decision-making.
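To make the descriptive/prescriptive split concrete, here is a toy numerical sketch (my own illustration, not the authors' code): a sampled value is pulled away from a concept's statistical average toward an assumed ideal. The mixing weight `w` and the example numbers are illustrative assumptions, not figures from the paper.

```python
# Toy illustration of the descriptive + prescriptive sampling theory.
import random

def sample_response(descriptive_mean, descriptive_std, ideal, w, n=10_000):
    """Draw samples whose centre is shifted from the statistical mean toward the ideal."""
    centre = (1 - w) * descriptive_mean + w * ideal   # prescriptive pull
    return [random.gauss(centre, descriptive_std) for _ in range(n)]

# e.g. "hours of TV watched per day": population average 4h, implicit ideal 1h
samples = sample_response(descriptive_mean=4.0, descriptive_std=1.0, ideal=1.0, w=0.3)
print(sum(samples) / len(samples))  # ~3.1, i.e. systematically below the true average
```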
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Paper Link: https://arxiv.org/pdf/2502.01926
This Stanford University-led research proposes and evaluates a new framework, “Fairness through Difference Awareness,” to assess Large Language Models’ ability to appropriately differentiate between demographic groups when contextually relevant. The study empirically demonstrates that leading LLMs often lack this capability and that traditional bias mitigation methods can inadvertently hinder models from recognizing legitimate group differences.
Language Models Resist Alignment
Paper Link: https://arxiv.org/abs/2406.06144
This paper from Peking University and BAAI investigates the fragility of Large Language Model alignment, introducing the concept of 'elasticity', whereby models resist modification and tend to revert to their pre-trained state. The study provides theoretical evidence from data-compression principles, along with empirical validation, showing that this resistance and the accompanying rebound effect intensify with increasing model size and pre-training data volume.
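As a rough illustration of how one might probe this elasticity empirically (a sketch under my own assumptions, not the paper's protocol): track language-modeling loss on held-out alignment data versus pre-training-style data across fine-tuning checkpoints. If the model rebounds, a small amount of further training quickly drives alignment loss up while pre-training loss recovers. Checkpoint and dataset names below are placeholders.

```python
# Sketch: compare held-out losses across checkpoints to look for a rebound
# toward pre-training behaviour. Not the authors' measurement code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

@torch.no_grad()
def mean_lm_loss(model, tokenizer, texts, max_length=512):
    """Average next-token prediction loss over a list of texts."""
    model.eval()
    losses = []
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
        out = model(**enc, labels=enc["input_ids"])
        losses.append(out.loss.item())
    return sum(losses) / len(losses)

def probe_elasticity(checkpoints, tokenizer_name, alignment_texts, pretrain_texts):
    """Report both losses per checkpoint; a widening gap suggests rebound."""
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    for ckpt in checkpoints:  # e.g. successive fine-tuning checkpoints (placeholders)
        model = AutoModelForCausalLM.from_pretrained(ckpt)
        a = mean_lm_loss(model, tokenizer, alignment_texts)
        p = mean_lm_loss(model, tokenizer, pretrain_texts)
        print(f"{ckpt}: alignment_loss={a:.3f}  pretrain_loss={p:.3f}")
```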
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper Link: https://arxiv.org/abs/2502.11089
DeepSeek-AI researchers developed Native Sparse Attention (NSA), a hardware-aligned and natively trainable sparse attention mechanism that enables efficient and performant long-context modeling for large language models. NSA achieves up to 11.6x decoding speedup and 9.0x training speedup at 64k context length while outperforming full attention on various benchmarks, including long-context reasoning tasks.
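Below is a heavily simplified, single-head sketch of the general block-sparse recipe (my illustration, not the NSA kernel): score key/value blocks cheaply, attend only to the top-k selected blocks plus a local sliding window, and skip everything else. The actual method also trains this sparsity natively and aligns the computation with hardware; this toy ignores batching, the compressed coarse-token branch, and kernel-level details.

```python
# Simplified single-head block-sparse attention with top-k block selection
# plus a causal sliding window. Illustrative only.
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block_size=64, top_k=4, window=128):
    # q, k, v: [seq_len, dim]
    seq_len, dim = k.shape
    n_blocks = (seq_len + block_size - 1) // block_size

    # Coarse block representations: mean-pooled keys per block.
    pad = n_blocks * block_size - seq_len
    k_pad = F.pad(k, (0, 0, 0, pad))
    block_keys = k_pad.view(n_blocks, block_size, dim).mean(dim=1)  # [n_blocks, dim]

    out = torch.zeros_like(q)
    for i in range(q.shape[0]):
        # Pick the top-k most relevant blocks for this query.
        block_scores = block_keys @ q[i]
        chosen = torch.topk(block_scores, k=min(top_k, n_blocks)).indices

        # Attend to tokens from the chosen blocks plus the local window.
        idx = set(range(max(0, i - window), i + 1))
        for b in chosen.tolist():
            idx.update(range(b * block_size, min((b + 1) * block_size, seq_len)))
        idx = torch.tensor(sorted(j for j in idx if j <= i))  # keep it causal

        attn = torch.softmax((k[idx] @ q[i]) / dim ** 0.5, dim=0)
        out[i] = attn @ v[idx]
    return out
```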