This repository contains the code and experiments for analyzing how Mix-FFN layers encode positional information in diffusion transformers.
Our study investigates whether these layers carry positional cues beyond what is provided by attention mechanisms, using probing and ablation experiments.
- Clone and set up the environment:

      conda env create -f probing_env.yml
      conda activate sana  # or your chosen environment name

- If you are using a SLURM system, fill in the `MAIL_USER` and `CONDA_ENV` variables in `py-sbatch.sh`.
- Run the experiment commands:
  All commands used in this experiment are listed in `commands_probing.txt`.
  Run them one by one.
  If you are not using SLURM, replace each `./py-sbatch.sh` with `python`.
  You can distribute commands across jobs if they belong to the same phase (e.g., collecting activations or training probes).
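For running without SLURM, the substitution described above can be scripted rather than done by hand. The following is a minimal sketch, not part of the repository: it assumes each line of the commands file is a single simple command that starts with `./py-sbatch.sh` (no pipes, redirects, or environment-variable expansion), and that blank lines and `#` comments can be skipped. The helper names `to_local_command` and `run_all` are hypothetical.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List, Optional

# Wrapper name taken from the README; everything else here is an assumption.
SBATCH_WRAPPER = "./py-sbatch.sh"


def to_local_command(line: str) -> Optional[List[str]]:
    """Convert one line of the commands file into an argv list for local use.

    Returns None for blank lines and '#' comments (assumed file convention).
    """
    tokens = shlex.split(line, comments=True)
    if not tokens:
        return None
    if tokens[0] == SBATCH_WRAPPER:
        # Swap the SLURM wrapper for the interpreter of the active environment.
        tokens[0] = "python"
    return tokens


def run_all(commands_file: str, dry_run: bool = True) -> List[List[str]]:
    """Run the commands in file order, one at a time.

    With dry_run=True the commands are only printed, which is a cheap way to
    check the substitution before starting long jobs.
    """
    commands = []
    for line in Path(commands_file).read_text().splitlines():
        argv = to_local_command(line)
        if argv is None:
            continue
        commands.append(argv)
        if dry_run:
            print(shlex.join(argv))
        else:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)
    return commands
```

Calling `run_all("commands_probing.txt")` prints the rewritten commands; passing `dry_run=False` executes them sequentially in the activated conda environment. Because the commands run strictly in file order, this sketch does not attempt the per-phase distribution that SLURM allows.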
- Follow the instructions in this GitHub issue to create the environment required to run the GenEval benchmark.
- If you are using a SLURM system, fill in the `MAIL_USER` and `CONDA_ENV` variables in `py-sbatch.sh`.
- Run the ablation experiment commands:
  All commands used in this experiment are listed in `commands_ablation.txt`.
  If you are not using SLURM, replace each `./py-sbatch.sh` with `python`.
  You can distribute all commands across jobs except those in the first part, which collects the three types of mean activation.