🧭 How Much Position Information Do Mix-FFN Layers Encode in Diffusion Transformers?

This repository contains the code and experiments for analyzing how Mix-FFN layers encode positional information in diffusion transformers.
Our study investigates whether these layers carry positional cues beyond what is provided by attention mechanisms, using probing and ablation experiments.

🧪 Experiment 1 – Training Probes on Latent Activations

Clone and set up the environment:

conda env create -f probing_env.yml
conda activate sana  # or your chosen environment name

If you are using a SLURM system, please fill in the MAIL_USER and CONDA_ENV variables in py-sbatch.sh.
Run the experiment commands:
All commands used in this experiment are listed in commands_probing.txt.
Run them one by one.
If you are not using SLURM, replace each ./py-sbatch.sh with python.
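The substitution described above can be scripted rather than done by hand. The following is a minimal sketch, assuming each line in the commands file begins with `./py-sbatch.sh` followed by a script path and its arguments. The helper names `to_local_command` and `convert_file`, and the example script name `some_script.py`, are hypothetical and not part of this repository.

```python
from pathlib import Path

# The SLURM wrapper named in this README.
SLURM_WRAPPER = "./py-sbatch.sh"


def to_local_command(line: str) -> str:
    """Swap a leading ./py-sbatch.sh for python; leave other lines unchanged."""
    stripped = line.strip()
    if stripped.startswith(SLURM_WRAPPER + " "):
        # Keep everything after the wrapper (script path and arguments).
        return "python" + stripped[len(SLURM_WRAPPER):]
    return stripped


def convert_file(src: str, dst: str) -> None:
    """Write a copy of the commands file with the wrapper replaced (hypothetical helper)."""
    lines = Path(src).read_text().splitlines()
    converted = [to_local_command(line) for line in lines]
    Path(dst).write_text("\n".join(converted) + "\n")


# Example with a made-up command line (the script name is an assumption):
print(to_local_command("./py-sbatch.sh some_script.py --layer 3"))
```

For instance, `convert_file("commands_probing.txt", "commands_probing_local.txt")` would write a local copy that you can then run line by line. Check the converted file before running it, since the actual layout of the commands file may differ from the assumption above.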

You can distribute commands across jobs if they belong to the same phase (e.g., collecting activations or training probes).
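One simple way to distribute the commands of a single phase is to split them round-robin into a fixed number of job batches. The sketch below only illustrates that idea. The function `split_across_jobs` and the placeholder command strings are hypothetical, and it assumes that commands within one phase do not depend on each other.

```python
from typing import List


def split_across_jobs(commands: List[str], n_jobs: int) -> List[List[str]]:
    """Split the commands of ONE phase into n_jobs round-robin batches."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    # Batch i receives commands i, i + n_jobs, i + 2 * n_jobs, ...
    return [commands[i::n_jobs] for i in range(n_jobs)]


# Placeholder commands standing in for one phase (e.g., training probes).
phase_commands = ["cmd_a", "cmd_b", "cmd_c", "cmd_d", "cmd_e"]
for job_index, batch in enumerate(split_across_jobs(phase_commands, 2)):
    print(job_index, batch)
```

Each batch can then be submitted as its own job. Commands from different phases should not be mixed in the same split, since a later phase may rely on the outputs of an earlier one.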

🧪 Experiment 2 – Ablation Study

Follow the instructions in this GitHub issue to create the environment required to run the GenEval benchmark.
If you are using a SLURM system, fill in the MAIL_USER and CONDA_ENV variables in py-sbatch.sh.
Run the ablation experiment commands:
All commands used in this experiment are listed in commands_ablation.txt.
If you are not using SLURM, replace each ./py-sbatch.sh with python.

You can distribute all commands across jobs except those in the first part, which collects the three types of mean activations.

