Co-Alignment: Rethinking Alignment as Bidirectional Human-AI Cognitive Adaptation
By: Yubo Li, Weiyi Song
Potential Business Impact:
Humans and AI learn together, improving teamwork.
Current AI alignment through RLHF follows a single-directional paradigm in which AI conforms to human preferences while human cognition is treated as fixed. We propose a shift to co-alignment through Bidirectional Cognitive Alignment (BiCA), in which humans and AI mutually adapt. BiCA uses learnable protocols, representation mapping, and KL-budget constraints for controlled co-evolution. In collaborative navigation, BiCA achieved 85.5% success versus 70.3% for the baseline, with 230% better mutual adaptation and 332% better protocol convergence. Emergent protocols outperformed handcrafted ones by 84%, and bidirectional adaptation unexpectedly improved safety (+23% out-of-distribution robustness). The 46% synergy improvement demonstrates that optimal collaboration exists at the intersection, not the union, of human and AI capabilities, validating the shift from single-directional to co-alignment paradigms.
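The abstract names KL-budget constraints but does not say how they are enforced. As a rough illustration only, the sketch below assumes that each agent's policy is a discrete distribution over actions and that an update is accepted only when its KL divergence from a reference policy stays under a fixed budget. The function names, the example distributions, and the budget value of 0.05 are all hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same actions."""
    # Terms with p_i == 0 contribute nothing; eps guards against division by zero.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def within_kl_budget(new_policy, reference_policy, budget):
    """Return True when the updated policy stays inside the allowed KL budget."""
    return kl_divergence(new_policy, reference_policy) <= budget


# Hypothetical policies over four navigation actions (assumed for illustration).
reference = [0.25, 0.25, 0.25, 0.25]
small_step = [0.30, 0.25, 0.25, 0.20]   # mild shift away from the reference
large_step = [0.85, 0.05, 0.05, 0.05]   # drastic shift away from the reference

BUDGET = 0.05  # assumed budget in nats, not a value from the paper

for name, policy in [("small_step", small_step), ("large_step", large_step)]:
    accepted = within_kl_budget(policy, reference, BUDGET)
    print(name, round(kl_divergence(policy, reference), 4), "accepted" if accepted else "rejected")
```

Under these assumptions the mild update is accepted and the drastic one is rejected, which is the sense in which a budget of this kind could keep adaptation on either side gradual.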
Similar Papers
Co-Alignment: Rethinking Alignment as Bidirectional Human-AI Cognitive Adaptation
Artificial Intelligence
Humans and AI learn together for better teamwork.
Towards Integrated Alignment
Computers and Society
Makes AI understand and follow human wishes.
Person-AI Bidirectional Fit - A Proof-of-Concept Case Study of Augmented Human-AI Symbiosis in Management Decision-Making Process
Human-Computer Interaction
Helps people and AI make better choices together.