pub fn xpby_f32(a: &mut [f32], b: &[f32], beta: f32)
Computes a[i] = b[i] + beta * a[i] in place for each index i, overwriting a. Used in conjugate gradient (CG) for the search-direction update p = r + beta * p.
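
The update above can be sketched as a plain scalar loop. This is a minimal illustration, not the actual implementation behind this signature: the name `xpby_f32_ref` is hypothetical, and the equal-length check is an assumption, since the original does not say how mismatched slice lengths are handled.

```rust
/// Hypothetical scalar reference for the update a[i] = b[i] + beta * a[i].
/// Assumption: `a` and `b` must have the same length; the source does not
/// specify the behavior for mismatched lengths, so this sketch panics.
pub fn xpby_f32_ref(a: &mut [f32], b: &[f32], beta: f32) {
    assert_eq!(a.len(), b.len(), "a and b must have the same length");
    // Walk both slices in lockstep, overwriting each element of `a`.
    for (ai, &bi) in a.iter_mut().zip(b.iter()) {
        *ai = bi + beta * *ai;
    }
}
```

For example, with a = [1.0, 2.0, 3.0], b = [0.5, 0.5, 0.5], and beta = 2.0, the call leaves a as [2.5, 4.5, 6.5]. In a CG iteration, a would hold the search direction p and b the new residual r.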