A latent thinking architecture designed to reuse the "frozen" knowledge of established models while training a real-time Latent Space Controller.
Rather than relying on human preference data (RLHF), the RFM trains its policy engine through the process of reasoning itself, using a Policy Curriculum to learn how to sample and evolve its own latent space for deep thinking.
The RFM manages a persistent Reasoning State (Z) that acts as an internal canvas.
- Weight Reuse: Reuses the dense world-model contained in frozen weights.
- Latent Control: A learnable Thinking Adapter and Objective Router manage how thoughts evolve in the latent space.
- Policy Curriculum: The model is trained on-the-fly to choose cognitive strategies (Explore, Converge, etc.) that lead to coherent, deep answers.
- Zero RLHF: All "intelligence" is derived from signal-processing feedback and signal-scoring of its own self-generated reasoning trajectories.
- Reality Anchoring: Input text embeddings remain constant during recursion, anchoring the model to the prompt.
- Latent Evolution (Z): The Reasoning State (Z) is initialized from valid token embeddings and evolved via a recursive Thinking Adapter.
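To make Reality Anchoring and Latent Evolution concrete, here is a minimal sketch assuming a PyTorch / Hugging Face-style frozen backbone. `ThinkingAdapter` and `evolve_latent` are illustrative names for this sketch, not the actual rfm internals.

```python
# Sketch only: ThinkingAdapter and evolve_latent are illustrative assumptions.
import torch
import torch.nn as nn

class ThinkingAdapter(nn.Module):
    """Small trainable residual block that evolves the Reasoning State Z."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.update = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Residual update keeps Z near the backbone's embedding manifold.
        return z + self.update(z)

def evolve_latent(prompt_emb, backbone, adapter, steps: int):
    anchor = prompt_emb.detach()  # Reality Anchoring: constant across recursion
    z = prompt_emb.clone()        # Z initialized from valid token embeddings
    for _ in range(steps):
        # The frozen backbone reads [anchor; Z]; only the adapter is trained.
        hidden = backbone(
            inputs_embeds=torch.cat([anchor, z], dim=1)
        ).last_hidden_state
        z = adapter(hidden[:, -z.size(1):])  # recursive Thinking Adapter step
    return z
```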
- Sampling Strategy: The Objective Router decides the cognitive "goal" for each rollout (a sketch follows this list), sampling the latent space based on:
  - Explore ($Se$): Seeking novel interpretations.
  - Converge ($Si$): Stabilizing around a likely conclusion.
  - Diversify ($Ti$): Expanding the internal reasoning landscape.
  - Smooth ($Ni$): Optimizing for semantic flow and narrative consistency.
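A hedged sketch of how such a router might look, assuming a small categorical policy head over the four objectives. `ObjectiveRouter`, `apply_objective`, and the per-objective latent transforms are assumptions made for illustration, not rfm's actual routing code.

```python
# Sketch only: the objective transforms below are illustrative stand-ins.
import torch
import torch.nn as nn

OBJECTIVES = ("explore", "converge", "diversify", "smooth")

class ObjectiveRouter(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.policy = nn.Linear(hidden_size, len(OBJECTIVES))

    def forward(self, z: torch.Tensor):
        # Sample a cognitive goal from a policy over the pooled state Z.
        logits = self.policy(z.mean(dim=1))
        dist = torch.distributions.Categorical(logits=logits)
        choice = dist.sample()
        return choice, dist.log_prob(choice)  # log-prob feeds the policy update

def apply_objective(z: torch.Tensor, choice) -> torch.Tensor:
    name = OBJECTIVES[int(choice)]
    if name == "explore":    # Se: inject noise to seek novel interpretations
        return z + 0.1 * torch.randn_like(z)
    if name == "converge":   # Si: pull every position toward the consensus
        return 0.5 * (z + z.mean(dim=1, keepdim=True))
    if name == "diversify":  # Ti: push positions apart to widen the landscape
        return z + 0.1 * (z - z.mean(dim=1, keepdim=True))
    # smooth (Ni): locally average neighbors for narrative consistency
    return 0.5 * (z + torch.roll(z, shifts=1, dims=1))
```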
- Sequence-Level GRPO: A delayed-gratification training loop in which $K$ complete drafts (rollouts) are evaluated as whole logical units (see the sketch after this list).
- Forced Reasoning (Thinking Caps): EOS-masking ensures the model can't quit until it has reached a minimum "Depth of Thought."
- Complexity Rewards: Explicitly incentivizes vocabulary richness and narrative elaboration over clichés.
- Hybrid Precision Stability: Critical controller parameters (Router, Adapter) are maintained in FP32 to ensure the reasoning engine remains numerically stable during deep recursion.
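The training step can be pictured with the following sketch. `complexity_reward`, `mask_eos`, and `grpo_advantages` are hypothetical helpers standing in for rfm's `score_trajectories`; they show the reward shaping, EOS-masking, and group-relative advantage steps under those assumptions.

```python
# Sketch only: helper names and reward terms are illustrative assumptions.
import torch

def complexity_reward(token_ids: list[int]) -> float:
    # Reward vocabulary richness: distinct-token ratio over the whole draft.
    return len(set(token_ids)) / max(len(token_ids), 1)

def mask_eos(logits: torch.Tensor, step: int, min_think: int, eos_id: int):
    # Forced Reasoning ("Thinking Caps"): forbid EOS before min_think steps.
    if step < min_think:
        logits[..., eos_id] = float("-inf")
    return logits

def grpo_advantages(scores: torch.Tensor) -> torch.Tensor:
    # Delayed gratification: each of the K complete drafts is scored as one
    # logical unit, then normalized against its own rollout group.
    return (scores - scores.mean()) / (scores.std() + 1e-6)

# Example: K = 4 rollout scores -> group-relative advantages that weight
# the router's log-probs in the policy update.
scores = torch.tensor([0.8, 0.3, 0.5, 0.9])
advantages = grpo_advantages(scores)
```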
Initialize the recursive wrapper and perform latent reasoning:
```python
from rfm import RecursiveModel, inference

# Initialize with a frozen backbone
rfm = RecursiveModel("meta-llama/Llama-3.2-1B-Instruct")

# Perform real-time latent reasoning
# max_len=100 allows for paragraph-level depth
inference(rfm, episodes=100, max_len=100, K=4, min_think=30)
```

- RecursiveModel: The wrapper managing the frozen weights, the Thinking Adapter, and the Z state.
- ObjectiveRouter: The conductor trained on the policy curriculum to sample the latent reasoning space.
- score_trajectories: A sequence-level scoring engine that rewards vocabulary richness, halting logic, and cognitive coherence.
- inference: The real-time reasoning engine implementing rollout-based latent optimization.
