This repository contains the source code for the SASformer model from "Multi-Task Scattering-Model Classification and Parameter Regression of Nanostructures from Small-Angle Scattering Data".
The SAS-55M-20k dataset can be downloaded from figshare.
After you have downloaded the SAS-55M-20k dataset, clone the repository and navigate to the sasformer parent directory.
git clone git@github.com:by256/sasformer.git
cd <path>/<to>/sasformer
Using conda, create and activate the virtual environment and install the package.
conda env create -f environment.yaml
conda activate sasformer
python -m pip install -e .
To train SASformer using the hyperparameters specified in the paper, set data_dir in scripts/train_sasformer.sh to the path of the SAS-55M-20k dataset and run the following script:
bash scripts/train_sasformer.sh
Note: we used a batch size of 1843 for training, but you will likely need to adjust the batch_size parameter in this script to suit your hardware. Additionally, if you do not have a GPU, change the accelerator parameter to cpu.
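As a rough aid for choosing these two values, the sketch below checks whether PyTorch can see a CUDA device and lists progressively smaller batch sizes to try if 1843 does not fit in memory. The helper names `pick_accelerator` and `candidate_batch_sizes` are hypothetical and not part of this repository, the `"gpu"` value is assumed to match what the training script expects, and the halving schedule with a floor of 32 is only an illustrative choice.

```python
def pick_accelerator():
    """Return "gpu" when PyTorch can see a CUDA device, otherwise "cpu"."""
    try:
        import torch  # assumed to be provided by the conda environment
    except ImportError:
        return "cpu"
    return "gpu" if torch.cuda.is_available() else "cpu"


def candidate_batch_sizes(start=1843, minimum=32):
    """List batch sizes obtained by repeated halving, from start down to minimum."""
    sizes = []
    size = start
    while size >= minimum:
        sizes.append(size)
        size //= 2
    return sizes


print("accelerator:", pick_accelerator())
print("batch sizes to try:", candidate_batch_sizes())
```

Copy the chosen values into the accelerator and batch_size parameters of scripts/train_sasformer.sh by hand. Training with a batch size other than 1843 may not reproduce the results reported in the paper exactly.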
To see an example of inference on I(q) generated using sasmodels, see the sasformer/example.ipynb notebook.
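For readers unfamiliar with I(q) curves, the following is a minimal sketch of the kind of one-dimensional scattering curve the notebook works with. It evaluates the textbook normalised form factor of a homogeneous sphere using only the standard library. It is a stand-in for illustration, not the sasmodels API and not the procedure used in the notebook. The radius, q range, number of points, and the helper names `sphere_form_factor` and `log_spaced` are all assumptions, and scale, contrast, background, and polydispersity are omitted.

```python
import math


def sphere_form_factor(q, radius):
    """Normalised form factor P(q) of a homogeneous sphere, with P(0) = 1."""
    x = q * radius
    if x < 1e-3:
        # Series expansion avoids cancellation error as x approaches 0.
        amplitude = 1.0 - x * x / 10.0
    else:
        amplitude = 3.0 * (math.sin(x) - x * math.cos(x)) / x**3
    return amplitude * amplitude


def log_spaced(q_min, q_max, n):
    """Return n logarithmically spaced values from q_min to q_max."""
    step = (math.log(q_max) - math.log(q_min)) / (n - 1)
    return [math.exp(math.log(q_min) + i * step) for i in range(n)]


# Illustrative values only: q in inverse angstroms, radius in angstroms.
q_values = log_spaced(1e-3, 1.0, 8)
intensities = [sphere_form_factor(q, radius=50.0) for q in q_values]

for q, p_q in zip(q_values, intensities):
    print(f"q = {q:.4f}  P(q) = {p_q:.3e}")
```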
To open the notebook, install Jupyter Notebook, launch it from the sasformer/sasformer directory, and select example.ipynb in the Jupyter file browser:
python -m pip install notebook
cd sasformer
jupyter notebook
# click on `example.ipynb`
To generate the test-set results presented in the paper, set data_dir in scripts/results.sh to the path of the SAS-55M-20k dataset and run the following script:
bash scripts/results.sh
You may have to adjust the batch_size and accelerator script parameters as mentioned in the Training section.
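Both the training and results scripts depend on data_dir pointing at the downloaded dataset, so it can help to check the path before launching a long run. The sketch below only confirms that the path is an existing, non-empty directory. It makes no assumption about the files inside SAS-55M-20k, and `check_data_dir` is a hypothetical helper, not part of this repository. A temporary directory stands in for the dataset here so that the example runs anywhere.

```python
import tempfile
from pathlib import Path


def check_data_dir(data_dir):
    """Return the resolved path if data_dir is an existing, non-empty directory."""
    path = Path(data_dir).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"data_dir does not exist or is not a directory: {path}")
    if not any(path.iterdir()):
        raise ValueError(f"data_dir is empty: {path}")
    return path.resolve()


# Demonstration with a temporary directory standing in for the dataset.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "placeholder.txt").write_text("stand-in for dataset files")
    print("data_dir looks usable:", check_data_dir(tmp))
```

In practice, pass the same path you set as data_dir in scripts/train_sasformer.sh or scripts/results.sh.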
If you use the methods outlined in this repository, please cite the following work:
@article{Yildirim2024,
author={Yildirim, Batuhan
and Doutch, James
and Cole, Jacqueline},
title={Multi-Task Scattering-Model Classification and Parameter Regression of Nanostructures from Small-Angle Scattering Data},
journal={Digital Discovery},
year={2024},
publisher={RSC},
doi={10.1039/D3DD00225J},
url={https://doi.org/10.1039/D3DD00225J}
}
This project was financially supported by the Science and Technology Facilities Council (STFC) and the Royal Academy of Engineering (RCSRF1819\7\10).
This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
This work benefited from the use of the SasView application, originally developed under NSF award DMR-0520547. SasView contains code developed with funding from the European Union’s Horizon 2020 research and innovation programme under the SINE2020 project, grant agreement No 654000.
