1 change: 1 addition & 0 deletions README.md
@@ -253,6 +253,7 @@ These are currently used to find a minimum energy conformation of a molecule.
| `OpenFF Torsion Multiplicity Optimization Training Coverage Supplement v1.0` | [2024-06-20-OpenFF-Torsion-Multiplicity-Optimization-Training-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-06-20-OpenFF-Torsion-Multiplicity-Optimization-Training-Coverage-Supplement-v1.0) | Additional optimization training data for Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | C, Cl, S, O, H, P, N, Br | |
| `OpenFF Torsion Multiplicity Optimization Benchmarking Coverage Supplement v1.0` | [2024-06-24-OpenFF-Torsion-Multiplicity-Optimization-Benchmarking-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-06-24-OpenFF-Torsion-Multiplicity-Optimization-Benchmarking-Coverage-Supplement-v1.0) | Additional optimization benchmarking data for Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | Cl, H, I, S, O, N, Br, C, P | |
|`OpenFF Iodine Fragment Opt v1.0` | [2024-09-10-OpenFF-Iodine-Fragment-Opt-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-09-10-OpenFF-Iodine-Fragment-Opt-v1.0) | B3LYP-D3BJ/DZVP optimized conformers for a variety of I-containing fragment molecules | C, O, I, S, F, Br, Cl, N, H ||
| `OpenFF Sulfur Optimization Training Coverage Supplement v1.0` | [2024-09-11-OpenFF-Sulfur-Optimization-Training-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-09-11-OpenFF-Sulfur-Optimization-Training-Coverage-Supplement-v1.0) | Additional optimization training data for Sage sulfur and phosphorus parameters | C, S, F, O, H, Cl, Br, P, N | |


# TorsionDrive Datasets
@@ -0,0 +1,62 @@
# OpenFF Sulfur Optimization Training Coverage Supplement v1.0

## Description

An optimization data set created to improve the training coverage of sulfonic
and phosphonic acid, sulfone, sulfonate, sulfinyl, sulfoximine, sulfonamide,
thioether, and 1,3-thiazole groups. The molecules in this data set were manually
selected from a subset of the smallest matching structures in the ChEMBL 34
database.

## General Information

* Date: 2024-09-11
* Class: OpenFF Optimization Dataset
* Purpose: Improve coverage in Sage
* Name: OpenFF Sulfur Optimization Training Coverage Supplement v1.0
* Number of unique molecules: 129
* Number of filtered molecules: 0
* Number of conformers: 899
* Number of conformers per molecule (min, mean, max): 1, 6.97, 10
* Mean molecular weight: 218.80
* Max molecular weight: 493.37
* Charges: [-2.0, -1.0, 0.0]
* Dataset submitter: Brent Westbrook
* Dataset generator: Brent Westbrook
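
The mean conformer count reported above can be cross-checked against the totals: 899 conformers over 129 unique molecules gives roughly 6.97 per molecule. The sketch below shows that arithmetic. The `summarize` helper is a hypothetical illustration and is not part of `generate-dataset.py`.

```python
from statistics import mean


def summarize(conformer_counts):
    """Return (min, mean, max) for a list of per-molecule conformer counts.

    Hypothetical helper for illustration; the mean is rounded to two places
    to match the formatting used in this README.
    """
    return (
        min(conformer_counts),
        round(mean(conformer_counts), 2),
        max(conformer_counts),
    )


# Totals taken from the General Information list above.
n_molecules = 129
n_conformers = 899

# Mean conformers per molecule implied by the totals.
reported_mean = round(n_conformers / n_molecules, 2)
print(reported_mean)  # 6.97
```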

## QCSubmit Generation Pipeline

* `generate-dataset.py`: This script shows how the dataset was prepared from the
input file `train.smi`.
* The label and SMILES pairs in `train.smi` were collected by searching
the ChEMBL database for all molecules matching the SMIRKS patterns
corresponding to the labels in `sulfur.dat`. The code used for all of these
steps is available in the
[curato repository](https://github.com/ntBre/curato/tree/64261e2261e5b3109223c7fbe8ef5d866937fd13).
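
As a rough illustration of working with label and SMILES pairs like those described above, the sketch below groups SMILES strings by label. It assumes each line has the form `<label> <SMILES>` separated by whitespace. The actual layout of `train.smi` is not specified here, and the labels and molecules in the example are hypothetical, so treat this as a sketch rather than the parser used by `generate-dataset.py`.

```python
import io
from collections import defaultdict


def read_label_smiles(handle):
    """Group SMILES by label from lines of the ASSUMED form '<label> <SMILES>'."""
    by_label = defaultdict(list)
    for line in handle:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        label, smiles = line.split(maxsplit=1)
        by_label[label].append(smiles)
    return dict(by_label)


# Hypothetical input: the labels and molecules are illustrative only.
example = io.StringIO(
    "label_a CS(=O)(=O)C\n"
    "label_a CS(=O)(=O)N\n"
    "label_b CSC\n"
)
groups = read_label_smiles(example)
print(groups)
```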

## QCSubmit Manifest

* `generate-dataset.py`: Script describing dataset generation and submission
* `input-environment.yaml`: Environment file used to create the Python environment for the script
* `full-environment.yaml`: Fully-resolved environment used to execute the script
* `opt.toml`: Experimental [qcaide](https://github.com/ntBre/qcaide) input file
for defining variables used throughout the QCA submission process
* `dataset.json.bz2`: Compressed dataset ready for submission
* `dataset.pdf`: Visualization of dataset molecules
* `output.smi`: SMILES strings for dataset molecules

## Metadata

* Elements: {C, S, F, O, H, Cl, Br, P, N}
* Spec: default
  * basis: DZVP
  * implicit_solvent: None
  * keywords: {}
  * maxiter: 200
  * method: B3LYP-D3BJ
  * program: psi4
  * SCF properties:
    * dipole
    * quadrupole
    * wiberg_lowdin_indices
    * mayer_indices
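
For reviewing a submission by hand, the specification listed above can be written down as a small record and compared against what a script produces. The sketch below uses only the Python standard library. The `QCSpec` dataclass and `level_of_theory` function are hypothetical names for illustration and are not the QCSubmit API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QCSpec:
    """Hypothetical record mirroring the metadata fields listed above."""

    name: str
    method: str
    basis: str
    program: str
    maxiter: int
    implicit_solvent: Optional[str] = None
    scf_properties: Tuple[str, ...] = ()


def level_of_theory(spec):
    """Format the method and basis in the usual 'method/basis' notation."""
    return f"{spec.method}/{spec.basis}"


# Values copied from the Metadata section.
default_spec = QCSpec(
    name="default",
    method="B3LYP-D3BJ",
    basis="DZVP",
    program="psi4",
    maxiter=200,
    scf_properties=(
        "dipole",
        "quadrupole",
        "wiberg_lowdin_indices",
        "mayer_indices",
    ),
)

print(level_of_theory(default_spec))  # B3LYP-D3BJ/DZVP
```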
Git LFS file not shown
Binary file not shown.
@@ -0,0 +1,304 @@
name: qcarchive-user-submit
channels:
- openeye
- conda-forge
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=2_gnu
- ambertools=23.3=py311h9fea076_6
- annotated-types=0.6.0=pyhd8ed1ab_0
- anyio=4.2.0=pyhd8ed1ab_0
- apsw=3.46.0.0=py311h3ea06b8_0
- argcomplete=3.2.2=pyhd8ed1ab_0
- argon2-cffi=23.1.0=pyhd8ed1ab_0
- argon2-cffi-bindings=21.2.0=py311h459d7ec_4
- arpack=3.8.0=nompi_h0baa96a_101
- arrow=1.3.0=pyhd8ed1ab_0
- asttokens=2.4.1=pyhd8ed1ab_0
- astunparse=1.6.3=pyhd8ed1ab_0
- async-lru=2.0.4=pyhd8ed1ab_0
- attrs=23.2.0=pyh71513ae_0
- babel=2.14.0=pyhd8ed1ab_0
- basis_set_exchange=0.9.1=pyhd8ed1ab_0
- beautifulsoup4=4.12.3=pyha770c72_0
- bleach=6.1.0=pyhd8ed1ab_0
- blosc=1.21.5=h0f2a231_0
- brotli=1.1.0=hd590300_1
- brotli-bin=1.1.0=hd590300_1
- brotli-python=1.1.0=py311hb755f60_1
- bson=0.5.9=py_0
- bzip2=1.0.8=hd590300_5
- c-ares=1.26.0=hd590300_0
- c-blosc2=2.13.1=hb4ffafa_0
- ca-certificates=2024.6.2=hbcca054_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cachetools=5.3.2=pyhd8ed1ab_0
- cairo=1.18.0=h3faef2a_0
- certifi=2024.6.2=pyhd8ed1ab_0
- cffi=1.16.0=py311hb3a22ac_0
- chardet=5.2.0=py311h38be061_1
- charset-normalizer=3.3.2=pyhd8ed1ab_0
- colorama=0.4.6=pyhd8ed1ab_0
- comm=0.2.1=pyhd8ed1ab_0
- contourpy=1.2.0=py311h9547e67_0
- cudatoolkit=11.8.0=h4ba93d1_12
- cycler=0.12.1=pyhd8ed1ab_0
- debugpy=1.8.0=py311hb755f60_1
- decorator=5.1.1=pyhd8ed1ab_0
- defusedxml=0.7.1=pyhd8ed1ab_0
- entrypoints=0.4=pyhd8ed1ab_0
- exceptiongroup=1.2.0=pyhd8ed1ab_2
- executing=2.0.1=pyhd8ed1ab_0
- expat=2.5.0=hcb278e6_1
- fftw=3.3.10=nompi_hc118613_108
- font-ttf-dejavu-sans-mono=2.37=hab24e00_0
- font-ttf-inconsolata=3.000=h77eed37_0
- font-ttf-source-code-pro=2.038=h77eed37_0
- font-ttf-ubuntu=0.83=h77eed37_1
- fontconfig=2.14.2=h14ed4e7_0
- fonts-conda-ecosystem=1=0
- fonts-conda-forge=1=0
- fonttools=4.47.2=py311h459d7ec_0
- fqdn=1.5.1=pyhd8ed1ab_0
- freetype=2.12.1=h267a509_2
- freetype-py=2.3.0=pyhd8ed1ab_0
- gettext=0.21.1=h27087fc_0
- greenlet=3.0.3=py311hb755f60_0
- hdf4=4.2.15=h2a13503_7
- hdf5=1.14.3=nompi_h4f84152_100
- icu=73.2=h59595ed_0
- idna=3.6=pyhd8ed1ab_0
- importlib-metadata=7.0.1=pyha770c72_0
- importlib_metadata=7.0.1=hd8ed1ab_0
- importlib_resources=6.1.1=pyhd8ed1ab_0
- iniconfig=2.0.0=pyhd8ed1ab_0
- ipykernel=6.29.0=pyhd33586a_0
- ipython=8.20.0=pyh707e725_0
- ipywidgets=8.1.1=pyhd8ed1ab_0
- isoduration=20.11.0=pyhd8ed1ab_0
- jedi=0.19.1=pyhd8ed1ab_0
- jinja2=3.1.3=pyhd8ed1ab_0
- joblib=1.3.2=pyhd8ed1ab_0
- json5=0.9.14=pyhd8ed1ab_0
- jsonpointer=2.4=py311h38be061_3
- jsonschema=4.21.1=pyhd8ed1ab_0
- jsonschema-specifications=2023.12.1=pyhd8ed1ab_0
- jsonschema-with-format-nongpl=4.21.1=pyhd8ed1ab_0
- jupyter-lsp=2.2.2=pyhd8ed1ab_0
- jupyter_client=8.6.0=pyhd8ed1ab_0
- jupyter_core=5.7.1=py311h38be061_0
- jupyter_events=0.9.0=pyhd8ed1ab_0
- jupyter_server=2.12.5=pyhd8ed1ab_0
- jupyter_server_terminals=0.5.2=pyhd8ed1ab_0
- jupyterlab=4.0.12=pyhd8ed1ab_0
- jupyterlab_pygments=0.3.0=pyhd8ed1ab_0
- jupyterlab_server=2.25.2=pyhd8ed1ab_0
- jupyterlab_widgets=3.0.9=pyhd8ed1ab_0
- keyutils=1.6.1=h166bdaf_0
- kiwisolver=1.4.5=py311h9547e67_1
- krb5=1.21.2=h659d440_0
- lcms2=2.16=hb7c19ff_0
- ld_impl_linux-64=2.40=h41732ed_0
- lerc=4.0.0=h27087fc_0
- libaec=1.1.2=h59595ed_1
- libblas=3.9.0=21_linux64_openblas
- libboost=1.82.0=h6fcfa73_6
- libboost-python=1.82.0=py311h92ebd52_6
- libbrotlicommon=1.1.0=hd590300_1
- libbrotlidec=1.1.0=hd590300_1
- libbrotlienc=1.1.0=hd590300_1
- libcblas=3.9.0=21_linux64_openblas
- libcurl=8.5.0=hca28451_0
- libdeflate=1.19=hd590300_0
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=hd590300_2
- libexpat=2.5.0=hcb278e6_1
- libffi=3.4.2=h7f98852_5
- libgcc-ng=13.2.0=h807b86a_4
- libgfortran-ng=13.2.0=h69a702a_4
- libgfortran5=13.2.0=ha4646dd_4
- libglib=2.78.3=h783c2da_0
- libgomp=13.2.0=h807b86a_4
- libiconv=1.17=hd590300_2
- libjpeg-turbo=3.0.0=hd590300_1
- liblapack=3.9.0=21_linux64_openblas
- libnetcdf=4.9.2=nompi_h9612171_113
- libnghttp2=1.58.0=h47da74e_1
- libnsl=2.0.1=hd590300_0
- libopenblas=0.3.26=pthreads_h413a1c8_0
- libpng=1.6.39=h753d276_0
- libsodium=1.0.18=h36c2ea0_1
- libsqlite=3.46.0=hde9e2c9_0
- libssh2=1.11.0=h0841786_0
- libstdcxx-ng=13.2.0=h7e041cc_4
- libtiff=4.6.0=ha9c0a0a_2
- libuuid=2.38.1=h0b41bf4_0
- libwebp-base=1.3.2=hd590300_0
- libxcb=1.15=h0b41bf4_0
- libxcrypt=4.4.36=hd590300_1
- libxml2=2.12.4=h232c23b_1
- libzip=1.10.1=h2629f0a_3
- libzlib=1.2.13=hd590300_5
- lz4-c=1.9.4=hcb278e6_0
- lzo=2.10=h516909a_1000
- markupsafe=2.1.4=py311h459d7ec_0
- matplotlib-base=3.8.2=py311h54ef318_0
- matplotlib-inline=0.1.6=pyhd8ed1ab_0
- mda-xdrlib=0.2.0=pyhd8ed1ab_0
- mdtraj=1.9.9=py311h90fe790_1
- mistune=3.0.2=pyhd8ed1ab_0
- msgpack-python=1.0.7=py311h9547e67_0
- munkres=1.1.4=pyh9f0ad1d_0
- nbclient=0.8.0=pyhd8ed1ab_0
- nbconvert-core=7.14.2=pyhd8ed1ab_0
- nbformat=5.9.2=pyhd8ed1ab_0
- ncurses=6.5=h59595ed_0
- nest-asyncio=1.6.0=pyhd8ed1ab_0
- netcdf-fortran=4.6.1=nompi_hacb5139_103
- networkx=3.2.1=pyhd8ed1ab_0
- nomkl=1.0=h5ca1d4c_0
- notebook=7.0.7=pyhd8ed1ab_0
- notebook-shim=0.2.3=pyhd8ed1ab_0
- numexpr=2.8.8=py311h039bad6_100
- numpy=1.26.3=py311h64a7726_0
- ocl-icd=2.3.1=h7f98852_0
- ocl-icd-system=1.0.0=1
- openeye-toolkits=2023.1.1=py311_0
- openff-amber-ff-ports=0.0.4=pyhca7485f_0
- openff-forcefields=2024.01.0=pyhca7485f_0
- openff-interchange=0.3.18=pyhd8ed1ab_0
- openff-interchange-base=0.3.18=pyhd8ed1ab_0
- openff-models=0.1.1=pyhca7485f_0
- openff-qcsubmit=0.50.2=pyhd8ed1ab_0
- openff-toolkit=0.15.1=pyhd8ed1ab_0
- openff-toolkit-base=0.15.1=pyhd8ed1ab_0
- openff-units=0.2.1=pyh1a96a4e_0
- openff-utilities=0.1.12=pyhd8ed1ab_0
- openjpeg=2.5.0=h488ebb8_3
- openmm=8.1.1=py311h9766050_0
- openssl=3.3.1=h4ab18f5_0
- overrides=7.7.0=pyhd8ed1ab_0
- packaging=23.2=pyhd8ed1ab_0
- packmol=20.010=h86c2bf4_0
- pandas=2.2.0=py311h320fe9a_0
- pandocfilters=1.5.0=pyhd8ed1ab_0
- panedr=0.8.0=pyhd8ed1ab_0
- parmed=4.2.2=py311hb755f60_1
- parso=0.8.3=pyhd8ed1ab_0
- pcre2=10.42=hcad00b1_0
- perl=5.32.1=7_hd590300_perl5
- pexpect=4.9.0=pyhd8ed1ab_0
- pickleshare=0.7.5=py_1003
- pillow=10.2.0=py311ha6c5da5_0
- pint=0.21=pyhd8ed1ab_0
- pip=23.3.2=pyhd8ed1ab_0
- pixman=0.43.2=h59595ed_0
- pkgutil-resolve-name=1.3.10=pyhd8ed1ab_1
- platformdirs=4.2.0=pyhd8ed1ab_0
- pluggy=1.4.0=pyhd8ed1ab_0
- prometheus_client=0.19.0=pyhd8ed1ab_0
- prompt-toolkit=3.0.42=pyha770c72_0
- psutil=5.9.8=py311h459d7ec_0
- pthread-stubs=0.4=h36c2ea0_1001
- ptyprocess=0.7.0=pyhd3deb0d_0
- pure_eval=0.2.2=pyhd8ed1ab_0
- py-cpuinfo=9.0.0=pyhd8ed1ab_0
- pycairo=1.25.1=py311h8feb60e_0
- pycalverter=1.6.1=py_0
- pycparser=2.21=pyhd8ed1ab_0
- pydantic=2.6.0=pyhd8ed1ab_0
- pydantic-core=2.16.1=py311h46250e7_0
- pyedr=0.8.0=pyhd8ed1ab_0
- pygments=2.17.2=pyhd8ed1ab_0
- pyjwt=2.8.0=pyhd8ed1ab_0
- pyparsing=3.1.1=pyhd8ed1ab_0
- pysocks=1.7.1=pyha2e5f31_6
- pytables=3.9.2=py311h10c7f7f_1
- pytest=8.0.0=pyhd8ed1ab_0
- python=3.11.7=hab00c5b_1_cpython
- python-constraint=1.4.0=py_0
- python-dateutil=2.8.2=pyhd8ed1ab_0
- python-fastjsonschema=2.19.1=pyhd8ed1ab_0
- python-json-logger=2.0.7=pyhd8ed1ab_0
- python-tzdata=2023.4=pyhd8ed1ab_0
- python_abi=3.11=4_cp311
- pytz=2023.4=pyhd8ed1ab_0
- pyyaml=6.0.1=py311h459d7ec_1
- pyzmq=25.1.2=py311h34ded2d_0
- qcelemental=0.27.1=pyhd8ed1ab_0
- qcportal=0.55=pyhd8ed1ab_0
- rdkit=2023.09.4=py311h4c2f14b_0
- readline=8.2=h8228510_1
- referencing=0.33.0=pyhd8ed1ab_0
- regex=2023.12.25=py311h459d7ec_0
- reportlab=4.0.9=py311h459d7ec_0
- requests=2.31.0=pyhd8ed1ab_0
- rfc3339-validator=0.1.4=pyhd8ed1ab_0
- rfc3986-validator=0.1.1=pyh9f0ad1d_0
- rlpycairo=0.2.0=pyhd8ed1ab_0
- rpds-py=0.17.1=py311h46250e7_0
- scipy=1.12.0=py311h64a7726_2
- send2trash=1.8.2=pyh41d4057_0
- setuptools=69.0.3=pyhd8ed1ab_0
- six=1.16.0=pyh6c4a22f_0
- smirnoff99frosst=1.1.0=pyh44b312d_0
- snappy=1.1.10=h9fff704_0
- sniffio=1.3.0=pyhd8ed1ab_0
- soupsieve=2.5=pyhd8ed1ab_1
- sqlalchemy=2.0.25=py311h459d7ec_0
- sqlite=3.46.0=h6d4b2fc_0
- stack_data=0.6.2=pyhd8ed1ab_0
- tabulate=0.9.0=pyhd8ed1ab_1
- terminado=0.18.0=pyh0d859eb_0
- tinycss2=1.2.1=pyhd8ed1ab_0
- tk=8.6.13=noxft_h4845f30_101
- tomli=2.0.1=pyhd8ed1ab_0
- tornado=6.3.3=py311h459d7ec_1
- tqdm=4.66.1=pyhd8ed1ab_0
- traitlets=5.14.1=pyhd8ed1ab_0
- types-python-dateutil=2.8.19.20240106=pyhd8ed1ab_0
- typing-extensions=4.9.0=hd8ed1ab_0
- typing_extensions=4.9.0=pyha770c72_0
- typing_utils=0.1.0=pyhd8ed1ab_0
- tzdata=2023d=h0c530f3_0
- unidecode=1.3.8=pyhd8ed1ab_0
- uri-template=1.3.0=pyhd8ed1ab_0
- urllib3=2.2.0=pyhd8ed1ab_0
- wcwidth=0.2.13=pyhd8ed1ab_0
- webcolors=1.13=pyhd8ed1ab_0
- webencodings=0.5.1=pyhd8ed1ab_2
- websocket-client=1.7.0=pyhd8ed1ab_0
- wheel=0.42.0=pyhd8ed1ab_0
- widgetsnbextension=4.0.9=pyhd8ed1ab_0
- xmltodict=0.13.0=pyhd8ed1ab_0
- xorg-kbproto=1.0.7=h7f98852_1002
- xorg-libice=1.1.1=hd590300_0
- xorg-libsm=1.2.4=h7391055_0
- xorg-libx11=1.8.7=h8ee46fc_0
- xorg-libxau=1.0.11=hd590300_0
- xorg-libxdmcp=1.1.3=h7f98852_0
- xorg-libxext=1.3.4=h0b41bf4_2
- xorg-libxrender=0.9.11=hd590300_0
- xorg-libxt=1.3.0=hd590300_1
- xorg-renderproto=0.11.1=h7f98852_1002
- xorg-xextproto=7.3.0=h0b41bf4_1003
- xorg-xproto=7.0.31=h7f98852_1007
- xz=5.2.6=h166bdaf_0
- yaml=0.2.5=h7f98852_2
- zeromq=4.3.5=h59595ed_0
- zipp=3.17.0=pyhd8ed1ab_0
- zlib=1.2.13=hd590300_5
- zlib-ng=2.0.7=h0b41bf4_0
- zstandard=0.22.0=py311haa97af0_0
- zstd=1.5.5=hfc55251_0
- pip:
  - amberutils==21.0
  - edgembar==0.2
  - mmpbsa-py==16.0
  - packmol-memgen==2023.2.24
  - pdb4amber==22.0
  - pymsmt==22.0
  - pytraj==2.0.6
  - sander==22.0
prefix: /home/brent/mambaforge/envs/qcarchive-user-submit
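
The fully-resolved environment above pins conda packages as `name=version=build` and pip packages as `name==version`. When comparing two such environments, it can help to split each pin into its parts. The sketch below does this with the standard library only. `Pin` and `parse_pin` are hypothetical helpers, not part of conda or of this submission.

```python
from typing import NamedTuple, Optional


class Pin(NamedTuple):
    """One pinned package: the build string is absent for pip-style pins."""

    name: str
    version: str
    build: Optional[str]


def parse_pin(spec):
    """Split a conda ('name=version=build') or pip ('name==version') pin."""
    if "==" in spec:
        # pip-style pin: no build string.
        name, version = spec.split("==", 1)
        return Pin(name, version, None)
    parts = spec.split("=")
    build = parts[2] if len(parts) > 2 else None
    return Pin(parts[0], parts[1], build)


# Two entries copied from the environment file above.
conda_pin = parse_pin("openff-qcsubmit=0.50.2=pyhd8ed1ab_0")
pip_pin = parse_pin("pytraj==2.0.6")
print(conda_pin)
print(pip_pin)
```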