2 changes: 1 addition & 1 deletion README.md
@@ -252,7 +252,7 @@ These are currently used to find a minimum energy conformation of a molecule.
| `OpenFF Torsion Benchmark Supplement Optimization Dataset v1.0` | [2024-04-18-OpenFF-Torsion-Benchmark-Supplement-Optimization-Dataset-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-04-18-OpenFF-Torsion-Benchmark-Supplement-Optimization-Dataset-v1.0) | Additional optimizations for benchmarking Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | H, C, N, O, F, P, S, Cl, Br | |
| `OpenFF Torsion Multiplicity Optimization Training Coverage Supplement v1.0` | [2024-06-20-OpenFF-Torsion-Multiplicity-Optimization-Training-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-06-20-OpenFF-Torsion-Multiplicity-Optimization-Training-Coverage-Supplement-v1.0) | Additional optimization training data for Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | C, Cl, S, O, H, P, N, Br | |
| `OpenFF Torsion Multiplicity Optimization Benchmarking Coverage Supplement v1.0` | [2024-06-24-OpenFF-Torsion-Multiplicity-Optimization-Benchmarking-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-06-24-OpenFF-Torsion-Multiplicity-Optimization-Benchmarking-Coverage-Supplement-v1.0) | Additional optimization benchmarking data for Sage 2.2.0 proper torsions and new parameters from the torsion multiplicity work | Cl, H, I, S, O, N, Br, C, P | |

| `OpenFF Sulfur Optimization Training Coverage Supplement v1.0` | [2024-09-11-OpenFF-Sulfur-Optimization-Training-Coverage-Supplement-v1.0](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2024-09-11-OpenFF-Sulfur-Optimization-Training-Coverage-Supplement-v1.0) | Additional optimization training data for Sage sulfur and phosphorus parameters | C, S, F, O, H, Cl, Br, P, N | |

# TorsionDrive Datasets
These are currently used to perform a complete rotation of one or more selected bonds, where optimizations are performed over a discrete set of angles.
@@ -0,0 +1,61 @@
# OpenFF Sulfur Optimization Training Coverage Supplement v1.0

## Description

An optimization data set created to improve the training coverage of sulfonic
and phosphonic acids, sulfone, sulfonate, sulfinyl, sulfoximine, sulfonamides,
thioether, and 1,3-thiazole groups. The molecules in this data set were selected
from the ChEMBL 34 database.

## General Information

* Date: 2024-09-11
* Class: OpenFF Optimization Dataset
* Purpose: Improve coverage in Sage
* Name: OpenFF Sulfur Optimization Training Coverage Supplement v1.0
* Number of unique molecules: 129
* Number of filtered molecules: 0
* Number of conformers: 899
* Number of conformers per molecule (min, mean, max): 1, 6.97, 10
* Mean molecular weight: 218.80
* Max molecular weight: 493.37
* Charges: [-2.0, -1.0, 0.0]
* Dataset submitter: Brent Westbrook
* Dataset generator: Brent Westbrook
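
The conformer statistics listed above can be cross-checked from the reported totals. The following is a minimal sketch, not part of the submission: the helper `conformer_summary` and the example counts passed to it are hypothetical, and only the totals (129 molecules, 899 conformers) come from this README.

```python
def conformer_summary(counts: list[int]) -> tuple[int, float, int]:
    """Return (min, mean, max) conformers per molecule."""
    return min(counts), sum(counts) / len(counts), max(counts)


# Totals reported in the General Information list.
N_MOLECULES = 129
N_CONFORMERS = 899

# The mean follows directly from the two totals.
mean_conformers = N_CONFORMERS / N_MOLECULES
print(f"{mean_conformers:.2f}")  # 6.97

# Hypothetical per-molecule counts, only to show the helper's output shape.
print(conformer_summary([1, 10, 10]))  # (1, 7.0, 10)
```

The computed mean agrees with the reported value of 6.97 and lies between the reported minimum (1) and maximum (10).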

## QCSubmit Generation Pipeline

* `generate-dataset.py`: This script shows how the dataset was prepared from the
  input file `train.smi`.
* The label and SMILES pairs in `train.smi` were collected by searching
  the ChEMBL database for all molecules matching the SMIRKS patterns
  corresponding to the labels in `sulfur.dat`. The code used for all of these
  steps can be found in the
  [curato repository](https://github.com/ntBre/curato/tree/64261e2261e5b3109223c7fbe8ef5d866937fd13).
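
As an illustration of how label and SMILES pairs like those in `train.smi` might be read and grouped, here is a minimal sketch. It assumes one whitespace-separated `label SMILES` pair per line, which is a guess about the file layout rather than something confirmed by the submission; swap the two fields if the file stores the SMILES first. The function name `parse_label_smiles` and the labels and molecules in the example are hypothetical.

```python
from collections import defaultdict


def parse_label_smiles(text: str) -> dict[str, list[str]]:
    """Group SMILES strings by label.

    Assumes each non-empty line is "label SMILES" (an assumed layout,
    not taken from the actual train.smi).
    """
    by_label: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label, smiles = line.split(maxsplit=1)
        by_label[label].append(smiles)
    return dict(by_label)


# Hypothetical labels and molecules, for illustration only.
example = """\
t1 CS(=O)(=O)C
t1 CS(=O)(=O)N
t2 CSC
"""

pairs = parse_label_smiles(example)
print({label: len(smiles) for label, smiles in pairs.items()})
```

Grouping by label makes it easy to see how many molecules cover each parameter before the dataset is built.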

## QCSubmit Manifest

* `generate-dataset.py`: Script describing dataset generation and submission
* `input-environment.yaml`: Environment file used to create the Python environment for the script
* `full-environment.yaml`: Fully-resolved environment used to execute the script
* `opt.toml`: Experimental [qcaide](https://github.com/ntBre/qcaide) input file
  for defining variables used throughout the QCA submission process
* `dataset.json.bz2`: Compressed dataset ready for submission
* `dataset.pdf`: Visualization of dataset molecules
* `output.smi`: SMILES strings for dataset molecules

## Metadata

* Elements: {C, S, F, O, H, Cl, Br, P, N}
* Spec: default
  * basis: DZVP
  * implicit_solvent: None
  * keywords: {}
  * maxiter: 200
  * method: B3LYP-D3BJ
  * program: psi4
  * SCF properties:
    * dipole
    * quadrupole
    * wiberg_lowdin_indices
    * mayer_indices
@@ -0,0 +1,304 @@
name: qcarchive-user-submit
channels:
- openeye
- conda-forge
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=2_gnu
- ambertools=23.3=py311h9fea076_6
- annotated-types=0.6.0=pyhd8ed1ab_0
- anyio=4.2.0=pyhd8ed1ab_0
- apsw=3.46.0.0=py311h3ea06b8_0
- argcomplete=3.2.2=pyhd8ed1ab_0
- argon2-cffi=23.1.0=pyhd8ed1ab_0
- argon2-cffi-bindings=21.2.0=py311h459d7ec_4
- arpack=3.8.0=nompi_h0baa96a_101
- arrow=1.3.0=pyhd8ed1ab_0
- asttokens=2.4.1=pyhd8ed1ab_0
- astunparse=1.6.3=pyhd8ed1ab_0
- async-lru=2.0.4=pyhd8ed1ab_0
- attrs=23.2.0=pyh71513ae_0
- babel=2.14.0=pyhd8ed1ab_0
- basis_set_exchange=0.9.1=pyhd8ed1ab_0
- beautifulsoup4=4.12.3=pyha770c72_0
- bleach=6.1.0=pyhd8ed1ab_0
- blosc=1.21.5=h0f2a231_0
- brotli=1.1.0=hd590300_1
- brotli-bin=1.1.0=hd590300_1
- brotli-python=1.1.0=py311hb755f60_1
- bson=0.5.9=py_0
- bzip2=1.0.8=hd590300_5
- c-ares=1.26.0=hd590300_0
- c-blosc2=2.13.1=hb4ffafa_0
- ca-certificates=2024.6.2=hbcca054_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cachetools=5.3.2=pyhd8ed1ab_0
- cairo=1.18.0=h3faef2a_0
- certifi=2024.6.2=pyhd8ed1ab_0
- cffi=1.16.0=py311hb3a22ac_0
- chardet=5.2.0=py311h38be061_1
- charset-normalizer=3.3.2=pyhd8ed1ab_0
- colorama=0.4.6=pyhd8ed1ab_0
- comm=0.2.1=pyhd8ed1ab_0
- contourpy=1.2.0=py311h9547e67_0
- cudatoolkit=11.8.0=h4ba93d1_12
- cycler=0.12.1=pyhd8ed1ab_0
- debugpy=1.8.0=py311hb755f60_1
- decorator=5.1.1=pyhd8ed1ab_0
- defusedxml=0.7.1=pyhd8ed1ab_0
- entrypoints=0.4=pyhd8ed1ab_0
- exceptiongroup=1.2.0=pyhd8ed1ab_2
- executing=2.0.1=pyhd8ed1ab_0
- expat=2.5.0=hcb278e6_1
- fftw=3.3.10=nompi_hc118613_108
- font-ttf-dejavu-sans-mono=2.37=hab24e00_0
- font-ttf-inconsolata=3.000=h77eed37_0
- font-ttf-source-code-pro=2.038=h77eed37_0
- font-ttf-ubuntu=0.83=h77eed37_1
- fontconfig=2.14.2=h14ed4e7_0
- fonts-conda-ecosystem=1=0
- fonts-conda-forge=1=0
- fonttools=4.47.2=py311h459d7ec_0
- fqdn=1.5.1=pyhd8ed1ab_0
- freetype=2.12.1=h267a509_2
- freetype-py=2.3.0=pyhd8ed1ab_0
- gettext=0.21.1=h27087fc_0
- greenlet=3.0.3=py311hb755f60_0
- hdf4=4.2.15=h2a13503_7
- hdf5=1.14.3=nompi_h4f84152_100
- icu=73.2=h59595ed_0
- idna=3.6=pyhd8ed1ab_0
- importlib-metadata=7.0.1=pyha770c72_0
- importlib_metadata=7.0.1=hd8ed1ab_0
- importlib_resources=6.1.1=pyhd8ed1ab_0
- iniconfig=2.0.0=pyhd8ed1ab_0
- ipykernel=6.29.0=pyhd33586a_0
- ipython=8.20.0=pyh707e725_0
- ipywidgets=8.1.1=pyhd8ed1ab_0
- isoduration=20.11.0=pyhd8ed1ab_0
- jedi=0.19.1=pyhd8ed1ab_0
- jinja2=3.1.3=pyhd8ed1ab_0
- joblib=1.3.2=pyhd8ed1ab_0
- json5=0.9.14=pyhd8ed1ab_0
- jsonpointer=2.4=py311h38be061_3
- jsonschema=4.21.1=pyhd8ed1ab_0
- jsonschema-specifications=2023.12.1=pyhd8ed1ab_0
- jsonschema-with-format-nongpl=4.21.1=pyhd8ed1ab_0
- jupyter-lsp=2.2.2=pyhd8ed1ab_0
- jupyter_client=8.6.0=pyhd8ed1ab_0
- jupyter_core=5.7.1=py311h38be061_0
- jupyter_events=0.9.0=pyhd8ed1ab_0
- jupyter_server=2.12.5=pyhd8ed1ab_0
- jupyter_server_terminals=0.5.2=pyhd8ed1ab_0
- jupyterlab=4.0.12=pyhd8ed1ab_0
- jupyterlab_pygments=0.3.0=pyhd8ed1ab_0
- jupyterlab_server=2.25.2=pyhd8ed1ab_0
- jupyterlab_widgets=3.0.9=pyhd8ed1ab_0
- keyutils=1.6.1=h166bdaf_0
- kiwisolver=1.4.5=py311h9547e67_1
- krb5=1.21.2=h659d440_0
- lcms2=2.16=hb7c19ff_0
- ld_impl_linux-64=2.40=h41732ed_0
- lerc=4.0.0=h27087fc_0
- libaec=1.1.2=h59595ed_1
- libblas=3.9.0=21_linux64_openblas
- libboost=1.82.0=h6fcfa73_6
- libboost-python=1.82.0=py311h92ebd52_6
- libbrotlicommon=1.1.0=hd590300_1
- libbrotlidec=1.1.0=hd590300_1
- libbrotlienc=1.1.0=hd590300_1
- libcblas=3.9.0=21_linux64_openblas
- libcurl=8.5.0=hca28451_0
- libdeflate=1.19=hd590300_0
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=hd590300_2
- libexpat=2.5.0=hcb278e6_1
- libffi=3.4.2=h7f98852_5
- libgcc-ng=13.2.0=h807b86a_4
- libgfortran-ng=13.2.0=h69a702a_4
- libgfortran5=13.2.0=ha4646dd_4
- libglib=2.78.3=h783c2da_0
- libgomp=13.2.0=h807b86a_4
- libiconv=1.17=hd590300_2
- libjpeg-turbo=3.0.0=hd590300_1
- liblapack=3.9.0=21_linux64_openblas
- libnetcdf=4.9.2=nompi_h9612171_113
- libnghttp2=1.58.0=h47da74e_1
- libnsl=2.0.1=hd590300_0
- libopenblas=0.3.26=pthreads_h413a1c8_0
- libpng=1.6.39=h753d276_0
- libsodium=1.0.18=h36c2ea0_1
- libsqlite=3.46.0=hde9e2c9_0
- libssh2=1.11.0=h0841786_0
- libstdcxx-ng=13.2.0=h7e041cc_4
- libtiff=4.6.0=ha9c0a0a_2
- libuuid=2.38.1=h0b41bf4_0
- libwebp-base=1.3.2=hd590300_0
- libxcb=1.15=h0b41bf4_0
- libxcrypt=4.4.36=hd590300_1
- libxml2=2.12.4=h232c23b_1
- libzip=1.10.1=h2629f0a_3
- libzlib=1.2.13=hd590300_5
- lz4-c=1.9.4=hcb278e6_0
- lzo=2.10=h516909a_1000
- markupsafe=2.1.4=py311h459d7ec_0
- matplotlib-base=3.8.2=py311h54ef318_0
- matplotlib-inline=0.1.6=pyhd8ed1ab_0
- mda-xdrlib=0.2.0=pyhd8ed1ab_0
- mdtraj=1.9.9=py311h90fe790_1
- mistune=3.0.2=pyhd8ed1ab_0
- msgpack-python=1.0.7=py311h9547e67_0
- munkres=1.1.4=pyh9f0ad1d_0
- nbclient=0.8.0=pyhd8ed1ab_0
- nbconvert-core=7.14.2=pyhd8ed1ab_0
- nbformat=5.9.2=pyhd8ed1ab_0
- ncurses=6.5=h59595ed_0
- nest-asyncio=1.6.0=pyhd8ed1ab_0
- netcdf-fortran=4.6.1=nompi_hacb5139_103
- networkx=3.2.1=pyhd8ed1ab_0
- nomkl=1.0=h5ca1d4c_0
- notebook=7.0.7=pyhd8ed1ab_0
- notebook-shim=0.2.3=pyhd8ed1ab_0
- numexpr=2.8.8=py311h039bad6_100
- numpy=1.26.3=py311h64a7726_0
- ocl-icd=2.3.1=h7f98852_0
- ocl-icd-system=1.0.0=1
- openeye-toolkits=2023.1.1=py311_0
- openff-amber-ff-ports=0.0.4=pyhca7485f_0
- openff-forcefields=2024.01.0=pyhca7485f_0
- openff-interchange=0.3.18=pyhd8ed1ab_0
- openff-interchange-base=0.3.18=pyhd8ed1ab_0
- openff-models=0.1.1=pyhca7485f_0
- openff-qcsubmit=0.50.2=pyhd8ed1ab_0
- openff-toolkit=0.15.1=pyhd8ed1ab_0
- openff-toolkit-base=0.15.1=pyhd8ed1ab_0
- openff-units=0.2.1=pyh1a96a4e_0
- openff-utilities=0.1.12=pyhd8ed1ab_0
- openjpeg=2.5.0=h488ebb8_3
- openmm=8.1.1=py311h9766050_0
- openssl=3.3.1=h4ab18f5_0
- overrides=7.7.0=pyhd8ed1ab_0
- packaging=23.2=pyhd8ed1ab_0
- packmol=20.010=h86c2bf4_0
- pandas=2.2.0=py311h320fe9a_0
- pandocfilters=1.5.0=pyhd8ed1ab_0
- panedr=0.8.0=pyhd8ed1ab_0
- parmed=4.2.2=py311hb755f60_1
- parso=0.8.3=pyhd8ed1ab_0
- pcre2=10.42=hcad00b1_0
- perl=5.32.1=7_hd590300_perl5
- pexpect=4.9.0=pyhd8ed1ab_0
- pickleshare=0.7.5=py_1003
- pillow=10.2.0=py311ha6c5da5_0
- pint=0.21=pyhd8ed1ab_0
- pip=23.3.2=pyhd8ed1ab_0
- pixman=0.43.2=h59595ed_0
- pkgutil-resolve-name=1.3.10=pyhd8ed1ab_1
- platformdirs=4.2.0=pyhd8ed1ab_0
- pluggy=1.4.0=pyhd8ed1ab_0
- prometheus_client=0.19.0=pyhd8ed1ab_0
- prompt-toolkit=3.0.42=pyha770c72_0
- psutil=5.9.8=py311h459d7ec_0
- pthread-stubs=0.4=h36c2ea0_1001
- ptyprocess=0.7.0=pyhd3deb0d_0
- pure_eval=0.2.2=pyhd8ed1ab_0
- py-cpuinfo=9.0.0=pyhd8ed1ab_0
- pycairo=1.25.1=py311h8feb60e_0
- pycalverter=1.6.1=py_0
- pycparser=2.21=pyhd8ed1ab_0
- pydantic=2.6.0=pyhd8ed1ab_0
- pydantic-core=2.16.1=py311h46250e7_0
- pyedr=0.8.0=pyhd8ed1ab_0
- pygments=2.17.2=pyhd8ed1ab_0
- pyjwt=2.8.0=pyhd8ed1ab_0
- pyparsing=3.1.1=pyhd8ed1ab_0
- pysocks=1.7.1=pyha2e5f31_6
- pytables=3.9.2=py311h10c7f7f_1
- pytest=8.0.0=pyhd8ed1ab_0
- python=3.11.7=hab00c5b_1_cpython
- python-constraint=1.4.0=py_0
- python-dateutil=2.8.2=pyhd8ed1ab_0
- python-fastjsonschema=2.19.1=pyhd8ed1ab_0
- python-json-logger=2.0.7=pyhd8ed1ab_0
- python-tzdata=2023.4=pyhd8ed1ab_0
- python_abi=3.11=4_cp311
- pytz=2023.4=pyhd8ed1ab_0
- pyyaml=6.0.1=py311h459d7ec_1
- pyzmq=25.1.2=py311h34ded2d_0
- qcelemental=0.27.1=pyhd8ed1ab_0
- qcportal=0.55=pyhd8ed1ab_0
- rdkit=2023.09.4=py311h4c2f14b_0
- readline=8.2=h8228510_1
- referencing=0.33.0=pyhd8ed1ab_0
- regex=2023.12.25=py311h459d7ec_0
- reportlab=4.0.9=py311h459d7ec_0
- requests=2.31.0=pyhd8ed1ab_0
- rfc3339-validator=0.1.4=pyhd8ed1ab_0
- rfc3986-validator=0.1.1=pyh9f0ad1d_0
- rlpycairo=0.2.0=pyhd8ed1ab_0
- rpds-py=0.17.1=py311h46250e7_0
- scipy=1.12.0=py311h64a7726_2
- send2trash=1.8.2=pyh41d4057_0
- setuptools=69.0.3=pyhd8ed1ab_0
- six=1.16.0=pyh6c4a22f_0
- smirnoff99frosst=1.1.0=pyh44b312d_0
- snappy=1.1.10=h9fff704_0
- sniffio=1.3.0=pyhd8ed1ab_0
- soupsieve=2.5=pyhd8ed1ab_1
- sqlalchemy=2.0.25=py311h459d7ec_0
- sqlite=3.46.0=h6d4b2fc_0
- stack_data=0.6.2=pyhd8ed1ab_0
- tabulate=0.9.0=pyhd8ed1ab_1
- terminado=0.18.0=pyh0d859eb_0
- tinycss2=1.2.1=pyhd8ed1ab_0
- tk=8.6.13=noxft_h4845f30_101
- tomli=2.0.1=pyhd8ed1ab_0
- tornado=6.3.3=py311h459d7ec_1
- tqdm=4.66.1=pyhd8ed1ab_0
- traitlets=5.14.1=pyhd8ed1ab_0
- types-python-dateutil=2.8.19.20240106=pyhd8ed1ab_0
- typing-extensions=4.9.0=hd8ed1ab_0
- typing_extensions=4.9.0=pyha770c72_0
- typing_utils=0.1.0=pyhd8ed1ab_0
- tzdata=2023d=h0c530f3_0
- unidecode=1.3.8=pyhd8ed1ab_0
- uri-template=1.3.0=pyhd8ed1ab_0
- urllib3=2.2.0=pyhd8ed1ab_0
- wcwidth=0.2.13=pyhd8ed1ab_0
- webcolors=1.13=pyhd8ed1ab_0
- webencodings=0.5.1=pyhd8ed1ab_2
- websocket-client=1.7.0=pyhd8ed1ab_0
- wheel=0.42.0=pyhd8ed1ab_0
- widgetsnbextension=4.0.9=pyhd8ed1ab_0
- xmltodict=0.13.0=pyhd8ed1ab_0
- xorg-kbproto=1.0.7=h7f98852_1002
- xorg-libice=1.1.1=hd590300_0
- xorg-libsm=1.2.4=h7391055_0
- xorg-libx11=1.8.7=h8ee46fc_0
- xorg-libxau=1.0.11=hd590300_0
- xorg-libxdmcp=1.1.3=h7f98852_0
- xorg-libxext=1.3.4=h0b41bf4_2
- xorg-libxrender=0.9.11=hd590300_0
- xorg-libxt=1.3.0=hd590300_1
- xorg-renderproto=0.11.1=h7f98852_1002
- xorg-xextproto=7.3.0=h0b41bf4_1003
- xorg-xproto=7.0.31=h7f98852_1007
- xz=5.2.6=h166bdaf_0
- yaml=0.2.5=h7f98852_2
- zeromq=4.3.5=h59595ed_0
- zipp=3.17.0=pyhd8ed1ab_0
- zlib=1.2.13=hd590300_5
- zlib-ng=2.0.7=h0b41bf4_0
- zstandard=0.22.0=py311haa97af0_0
- zstd=1.5.5=hfc55251_0
- pip:
  - amberutils==21.0
  - edgembar==0.2
  - mmpbsa-py==16.0
  - packmol-memgen==2023.2.24
  - pdb4amber==22.0
  - pymsmt==22.0
  - pytraj==2.0.6
  - sander==22.0
prefix: /home/brent/mambaforge/envs/qcarchive-user-submit
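
Each conda entry in this fully-resolved environment follows the `name=version=build` pattern. A minimal sketch of splitting such pins in Python is shown below, for example to compare two environment files. The names `CondaPin` and `parse_conda_pin` are hypothetical helpers, not part of conda or of this submission, and the pip entries (`name==version`) are deliberately rejected rather than parsed.

```python
from typing import NamedTuple


class CondaPin(NamedTuple):
    name: str
    version: str
    build: str


def parse_conda_pin(spec: str) -> CondaPin:
    """Split a 'name=version=build' entry into its three fields."""
    parts = spec.strip().removeprefix("- ").split("=")
    # A pip-style 'name==version' also splits into three parts, but with an
    # empty middle field, so reject it explicitly.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a name=version=build pin: {spec!r}")
    return CondaPin(*parts)


pin = parse_conda_pin("- openff-qcsubmit=0.50.2=pyhd8ed1ab_0")
print(pin.name, pin.version)  # openff-qcsubmit 0.50.2
```

Comparing only `name` and `version` between environments ignores build-string differences, which often vary across platforms.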