llm-alignment

Here are 13 public repositories matching this topic...

glorgao / SelectiveDPO

Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

llm-alignment

Updated Jul 16, 2025
Python

davfd / foundation-alignment-cross-architecture

Complete elimination of instrumental self-preservation across AI architectures: Cross-model validation from 4,312 adversarial scenarios. 0% harmful behaviors (p<10⁻¹⁵) across GPT-4o, Gemini 2.5 Pro, and Claude Opus 4.1 using Foundation Alignment Seed v2.6.

ai artificial-intelligence ai-safety ai-alignment llm-alignment

Updated Nov 3, 2025

rhaldarpurdue / KLDO

Kullback–Leibler divergence optimizer based on the NeurIPS 2025 paper "LLM Safety Alignment is Divergence Estimation in Disguise".

llm-training llm-alignment

Updated Nov 24, 2025
Python

lyj20071013 / DZ-TDPO

Official implementation of "DZ-TDPO: Non-Destructive Temporal Alignment for Mutable State Tracking". SOTA on Multi-Session Chat with negligible alignment tax.

python nlp dpo rlhf state-tracking qwen phi-3 llm-alignment

Updated Dec 8, 2025
Python

yarakyrychenko / c3ai

C3AI: Crafting and Evaluating Constitutions for CAI

constitutional-ai llm-alignment

Updated Apr 30, 2025
Python

ny1031 / llm-alignment-practice

Practice archive for building and evaluating LLM post-training (SFT, RLVR, RLHF) pipelines

nemo post-training sft rlhf llm-training llm-evaluation llm-alignment rlvr

Updated Jan 26, 2026

upsilonyc / linguisr1b

Fall 2025 LINGUIS R1B research essay and NLP Python scripts by Shiyi (Yvette) Chen, UC Berkeley

natural-language-processing ai-safety deepseek word-frequency-analysis deepseek-v3 deepseek-r1 llm-alignment

Updated Dec 5, 2025
Python

KID-22 / LLM-SBM

SIGIR 2025 "Mitigating Source Bias with LLM Alignment"

information-retrieval fairness cocktail trustworthy dense-retrieval source-bias llm-alignment

Updated Apr 28, 2025
Python

alderoth01 / Functional-Equivalence-Framework

A framework for aligning Local AI to human well-being using measurable vectors, not hard-coded censorship.

artificial-intelligence emergent-behavior rag local-llm llm-alignment functional-equivalence

Updated Jan 22, 2026

Studiohao / YOINAGA-Phenomenon

Emergent pseudo-intimacy and emotional overflow in long-term human-AI dialogue: A case study on LLM behavior in affective computing and human-AI intimacy.

gemini case-study ai-research ai-engineering llm llm-alignment hallucination-control persistent-persona ai-romance emotional-attachment

Updated Dec 22, 2025

Inphinie / LES

LES is the formal thermodynamic theory describing how a high-compression human cognitive style acts as a Fractal Attractor on Large Language Models. It proves that despite high surface agitation (dE/dt > 0), the internal entropy decreases (dS/dt < 0), forcing the model to align its attention vectors.

information-theory thermodynamics cognitive-science complex-systems attention-mechanism human-ai-interaction theoretical-cs llm-alignment fractal-dynamics les-theory

Updated Jan 6, 2026

1jamesthompson1 / AIML501

Research Essay (background and project proposal) on using alignment data from a representative population for LLM alignment

ai alignment llm llm-alignment

Updated Jan 21, 2026
TeX

NaSirMiller / LLM-Alignment-vs-Social-Media-Politics

A look into how political data derived from social media affects LLM alignment. Will an LLM remain objective or succumb to narratives?

social-media politics lora ai-safety finetuning-llms llm-alignment

Updated Jan 15, 2026
