ICML 2026

FedHera

Towards Drift-Resilient Federated Fine-tuning with Heterogeneous Resources

Ke Xiao, Qiyuan Wang, Christos Anagnostopoulos, Zhuoran Tan, Wenhao Li

School of Computing Science, University of Glasgow


Technical TL;DR

Communication rank and trainable rank should not be the same knob.

FedHera decouples the downloaded inference rank from the locally trainable rank, so clients can receive a richer global LoRA basis while only optimizing the prefix that fits their memory and compute budget.

A spectrum-preserving water-filling allocator spends bandwidth on high-energy singular directions, while prefix-gated training uses the frozen tail as a forward-pass anchor to reduce truncation-induced drift.
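The allocation idea can be illustrated with a small greedy loop. This is only a sketch under stated assumptions, not the code in fed_utils/rank_allocator.py: it assumes each layer's singular values are sorted in descending order, that "energy" means the squared singular value, and that every rank column of a layer has a fixed transmission cost. The function name `allocate_ranks`, the `column_costs` argument, and the layer names are hypothetical.

```python
import heapq


def allocate_ranks(singular_values, column_costs, budget):
    """Greedy energy-per-cost allocation of rank columns across layers.

    singular_values: dict layer -> singular values, sorted descending (assumed)
    column_costs:    dict layer -> cost of sending one rank column (assumed)
    budget:          total cost the client can afford to download
    Returns a dict layer -> number of leading columns selected.
    """
    ranks = {layer: 0 for layer in singular_values}
    heap = []
    # Seed the heap with the strongest direction of every layer.
    for layer, sigmas in singular_values.items():
        if sigmas:
            ratio = (sigmas[0] ** 2) / column_costs[layer]
            heapq.heappush(heap, (-ratio, layer))

    remaining = budget
    while heap:
        _, layer = heapq.heappop(heap)
        cost = column_costs[layer]
        if cost > remaining:
            # This layer no longer fits; cheaper layers may still fit.
            continue
        ranks[layer] += 1
        remaining -= cost
        nxt = ranks[layer]
        sigmas = singular_values[layer]
        if nxt < len(sigmas):
            # Offer the layer's next direction to the cross-layer competition.
            heapq.heappush(heap, (-(sigmas[nxt] ** 2) / cost, layer))
    return ranks


# Hypothetical two-layer example.
spectra = {"attn.q": [4.0, 2.0, 1.0], "mlp.up": [3.0, 2.5, 0.5]}
costs = {"attn.q": 2, "mlp.up": 4}
print(allocate_ranks(spectra, costs, budget=10))
```

Because each layer only ever offers its next-largest singular direction, the selected columns always form a prefix of that layer's spectrum, which matches the prefix structure described on this page.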

FedHera mechanics

Three moving parts behind the dual-rank framework.

1. Decouple reception from optimization

Clients download rank r_tot under bandwidth limits, but only train rank r_train under memory and time limits.

2. Allocate by spectral energy

The server lets singular directions from all layers compete for the same budget and assigns each rank column to the direction with the highest energy per unit cost.

3. Train the prefix, anchor with the tail

Gradient masks update only the active prefix; Adaptive Tail Warm-up gates the frozen tail as the global basis becomes reliable.
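The prefix-and-tail mechanics above can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the implementation in fed_utils/adaptive_peft.py: the LoRA update is written as a sum over rank components, the first r_train components are trainable, and the remaining components are frozen and scaled by a single scalar `tail_gate`. How Adaptive Tail Warm-up schedules that scalar is not specified here, so it is passed in as a plain parameter. All function names are hypothetical.

```python
def prefix_mask(r_tot, r_train):
    """Return 1.0 for trainable prefix rank indices and 0.0 for the frozen tail."""
    if not 0 <= r_train <= r_tot:
        raise ValueError("need 0 <= r_train <= r_tot")
    return [1.0 if k < r_train else 0.0 for k in range(r_tot)]


def mask_rank_rows(grad, mask):
    """Zero the gradient rows of frozen rank components.

    grad has one row per rank component (shape r_tot x d), as for the
    LoRA A matrix. For B (d_out x r_tot), apply the mask to its columns.
    """
    return [[m * g for g in row] for row, m in zip(grad, mask)]


def lora_delta(x, A, B, r_train, tail_gate):
    """Compute sum_k gate_k * B[:, k] * (A[k, :] . x).

    A is r_tot x d_in and B is d_out x r_tot. The gate is 1.0 on the
    trainable prefix and tail_gate on the frozen tail, so the tail still
    contributes to the forward pass without receiving updates.
    """
    out = [0.0] * len(B)
    for k in range(len(A)):
        gate = 1.0 if k < r_train else tail_gate
        proj = sum(a * xi for a, xi in zip(A[k], x))
        for i in range(len(B)):
            out[i] += gate * B[i][k] * proj
    return out


# Toy example: r_tot = 2, r_train = 1, d_in = 2, d_out = 1.
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 1.0]]
x = [1.0, 2.0]
print(lora_delta(x, A, B, r_train=1, tail_gate=0.0))  # prefix only
print(lora_delta(x, A, B, r_train=1, tail_gate=1.0))  # prefix plus full tail
```

With `tail_gate=0.0` the client behaves as if it had received only the truncated rank, while larger gate values let the frozen tail act as the forward-pass anchor described above.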

Results

FedHera improves accuracy, generation quality, and drift stability under skewed resources.

+3.93 accuracy points on HellaSwag over FedHomoLoRA

+0.195 ROUGE-L gain on E2E NLG

2.27x information gain ratio over coupled ranks

98.7% VRAM utilization in the efficiency study

Drift-prevention diagnostics

Lower drift indicates that local client updates stay closer to the high-rank reference direction and remain more aligned across heterogeneous clients.

Absolute drift: 31.1% lower than FlexLoRA

Relative drift: 31.2% lower than FlexLoRA

Cross-client drift, N=100: 29.3% lower than FlexLoRA

Cross-client drift, N=500: 16.4% lower than FlexLoRA

Generative tasks

Task	ROUGE-L gain	Loss
Alpaca	+0.101	1.140
E2E NLG	+0.195	0.492

Drift on E2E NLG

Method	Abs. drift	Rel. drift
FedHera	2.060	6.641
FedHL	2.072	6.684
FlexLoRA	2.991	9.647
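As a quick arithmetic check, the percentage reductions against FlexLoRA can be recomputed from the table values. The sketch below assumes the headline drift percentages are relative reductions derived from this E2E NLG table; the page does not state that explicitly, although the numbers agree. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(ours, baseline):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# Values copied from the "Drift on E2E NLG" table.
abs_red = relative_reduction(2.060, 2.991)  # FedHera vs. FlexLoRA, absolute drift
rel_red = relative_reduction(6.641, 9.647)  # FedHera vs. FlexLoRA, relative drift

print(f"absolute drift reduction: {abs_red:.1f}%")
print(f"relative drift reduction: {rel_red:.1f}%")
```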

Code map

The repository exposes the FedHera path directly.

Component	File
Training entry	main.py
Server SVD, rank push, ATW metadata	fed_utils/model_aggregation.py
Spectrum-aware water filling	fed_utils/rank_allocator.py
Prefix masks and frozen-tail scaling	fed_utils/adaptive_peft.py
Client data partitioning	utils/preprocess_fedhera_data.py

BibTeX

Cite FedHera

@inproceedings{xiao2026fedhera,
  title={FedHera: Towards Drift-Resilient Federated Fine-tuning with Heterogeneous Resources},
  author={Xiao, Ke and Wang, Qiyuan and Anagnostopoulos, Christos and Tan, Zhuoran and Li, Wenhao},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2026}
}