-
Notifications
You must be signed in to change notification settings - Fork 24
CREST Adapter #807
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
CREST Adapter #807
Conversation
d6a83b4 to
c231a43
Compare
3e88c36 to
7674f5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
9c25f3d to
51368be
Compare
9be935e to
eab7647
Compare
eab7647 to
6f36b41
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 25 out of 26 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
arc/settings/settings.py
Outdated
| def parse_version(folder_name): | ||
| """ | ||
| Parses the version from the folder name and returns a tuple for comparison. | ||
| Supports versions like: 3.0.2, v212, 2.1, 2 | ||
| """ | ||
| version_regex = re.compile(r"(?:v?(\d+)(?:\.(\d+))?(?:\.(\d+))?)", re.IGNORECASE) | ||
| match = version_regex.search(folder_name) | ||
| if not match: |
Copilot
AI
Feb 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parse_version() uses the re module, but arc/settings/settings.py doesn't import re. Since find_crest_executable() is executed at import time, this will raise a NameError during ARC startup. Add import re near the other imports (or avoid using re at import time).
| if cluster_soft in ["condor", "htcondor"]: | ||
| # HTCondor branch (kept for completeness – you can delete if you don't use it) | ||
| sub_job = submit_scripts["local"]["crest"] | ||
| format_params = { | ||
| "name": f"crest_{xyz_crest_int}", | ||
| "cpus": cpus, | ||
| "memory": int(SERVERS["local"].get("memory", 32.0) * 1024), | ||
| } | ||
| sub_job = sub_job.format(**format_params) | ||
|
|
||
| with open( | ||
| os.path.join(path, settings["submit_filenames"]["HTCondor"]), "w" | ||
| ) as f: | ||
| f.write(sub_job) | ||
|
|
||
| crest_job = submit_scripts["local"]["crest_job"] | ||
| crest_job = crest_job.format( | ||
| path=path, |
Copilot
AI
Feb 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crest_ts_conformer_search() expects submit_scripts["local"]["crest"] (and crest_job) templates to exist, but arc/settings/submit.py in this repo doesn't define any CREST entries. This will raise a KeyError the first time CREST is used unless every user provides a custom ~/.arc/submit.py. Add default CREST submit templates to arc/settings/submit.py (for PBS/HTCondor) or change the adapter to generate a minimal submit script internally without relying on missing submit_scripts keys.
|
|
||
| if len(self.reactions) < 5: | ||
| successes = len([tsg for tsg in rxn.ts_species.ts_guesses if tsg.success and 'heuristics' in tsg.method]) | ||
| successes = [tsg for tsg in rxn.ts_species.ts_guesses if tsg.success] |
Copilot
AI
Feb 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The success summary counts all successful TS guesses on rxn.ts_species.ts_guesses, not just the ones produced/updated by the heuristics adapter. This can misreport results once other TS adapters also add guesses to the same ts_guesses list. Filter by heuristics provenance (e.g., "heuristics" in tsg.method_sources or tsg.method == "heuristics") when computing successes for the heuristics log message.
| successes = [tsg for tsg in rxn.ts_species.ts_guesses if tsg.success] | |
| successes = [ | |
| tsg | |
| for tsg in rxn.ts_species.ts_guesses | |
| if tsg.success | |
| and ( | |
| ( | |
| getattr(tsg, "method_sources", None) is not None | |
| and any( | |
| isinstance(src, str) and src.lower() == "heuristics" | |
| for src in tsg.method_sources | |
| ) | |
| ) | |
| or ( | |
| isinstance(getattr(tsg, "method", None), str) | |
| and tsg.method.lower() == "heuristics" | |
| ) | |
| ) | |
| ] |
| """ | ||
| Reorder an XYZ string between ``ATOM X Y Z`` and ``X Y Z ATOM`` with optional unit conversion. | ||
|
|
||
| Args: | ||
| xyz_str (str): The string xyz format to be converted. | ||
| reverse_atoms (bool, optional): Whether to reverse the atoms and coordinates. | ||
| units (str, optional): Units of the input coordinates ('angstrom' or 'bohr'). | ||
| convert_to (str, optional): The units to convert to (either 'angstrom' or 'bohr'). | ||
| project_directory (str, optional): The path to the project directory. | ||
|
|
||
| Raises: | ||
| ConverterError: If xyz_str is not a string or does not have four space-separated entries per non-empty line. | ||
|
|
||
| Returns: str | ||
| The converted string xyz format. | ||
| """ | ||
| if isinstance(xyz_str, tuple): | ||
| xyz_str = '\n'.join(xyz_str) | ||
| if isinstance(xyz_str, list): | ||
| xyz_str = '\n'.join(xyz_str) | ||
| if not isinstance(xyz_str, str): | ||
| raise ConverterError(f'Expected a string input, got {type(xyz_str)}') | ||
| if project_directory is not None: | ||
| file_path = os.path.join(project_directory, xyz_str) | ||
| if os.path.isfile(file_path): | ||
| with open(file_path, 'r') as f: | ||
| xyz_str = f.read() | ||
|
|
||
|
|
||
| if units.lower() == 'angstrom' and convert_to.lower() == 'angstrom': | ||
| conversion_factor = 1 | ||
| elif units.lower() == 'bohr' and convert_to.lower() == 'bohr': | ||
| conversion_factor = 1 | ||
| elif units.lower() == 'angstrom' and convert_to.lower() == 'bohr': | ||
| conversion_factor = constants.angstrom_to_bohr | ||
| elif units.lower() == 'bohr' and convert_to.lower() == 'angstrom': | ||
| conversion_factor = constants.bohr_to_angstrom | ||
| else: | ||
| raise ConverterError("Invalid target unit. Choose 'angstrom' or 'bohr'.") | ||
|
|
||
| processed_lines = list() | ||
| # Split the string into lines | ||
| lxyz = xyz_str.strip().splitlines() | ||
| # Determine whether the atom label appears first or last in each line | ||
| first_line_tokens = lxyz[0].strip().split() | ||
| atom_first = not is_str_float(first_line_tokens[0]) | ||
|
|
||
| for item in lxyz: | ||
| parts = item.strip().split() | ||
|
|
||
| if len(parts) != 4: | ||
| raise ConverterError(f'xyz_str has an incorrect format, expected 4 elements in each line, ' | ||
| f'got "{item}" in:\n{xyz_str}') |
Copilot
AI
Feb 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reorder_xyz_string() claims to validate “four space-separated entries per non-empty line”, but the implementation iterates over all splitlines() results and will raise on blank lines (since len(parts) == 0). Either filter out empty/whitespace-only lines before parsing, or update the docstring/error message to match the stricter behavior. Also, the raised message “Invalid target unit” is misleading when units is invalid (not just convert_to).
alongd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @calvinp0 for this awesome contribution! I think it strengthens our heuristics adapter significantly.
Please see my comments, and some comments from copilot (like unused imports)
Important: Please add a description of the H-Abstraction heuristics (briefly) and of the CREST adapter (briefly) to ARC's Docs in this file.
arc/job/adapters/ts/crest.py
Outdated
|
|
||
| MAX_CHECK_INTERVAL_SECONDS = 100 | ||
|
|
||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you turn these into single lines such as: CREST_PATH = settings.get("CREST_PATH", None)?
arc/job/adapters/ts/crest.py
Outdated
| continue | ||
| if not crest_available(): | ||
| logger.warning('CREST is not available. Skipping CREST TS search.') | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace with break
arc/job/adapters/ts/crest.py
Outdated
| from arc.plotter import save_geo | ||
| from arc.species.converter import reorder_xyz_string, str_to_xyz, xyz_to_dmat, xyz_to_str | ||
| from arc.species.species import ARCSpecies, TSGuess | ||
| from arc.job.adapters.ts.heuristics import h_abstraction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you place this import a bot higher, next to the other job imports?
arc/job/adapters/ts/crest.py
Outdated
| tsg.tic() | ||
|
|
||
| crest_job_dirs = [] | ||
| xyz_guesses = h_abstraction(reaction=rxn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the CREST adapter is a wrapper of h_abstraction only (as also defined by ts_adapters_by_rmg_family).
I wonder whether it should wrap other TS search methods, and whether users can elegantly ask ARC to apply CREST to their chosen method/adapter (or even a user guess for the TS?).
At least, if it "complements the heuristics adapter", then let's make it call a switch/hub function in heuristics.py that maps to the specific heuristic function by the reaction family (h_abstraction or hydrolysis, more to come in the future). Then CREST can wrap that hub function. Users will specify 'crest' as the TS search adapter, it it will automatically do 'appropriate-heuristics-method' + CREST.
I think we'll just need a function in heuristics that returns pairs of atoms for CREST to constrain.
If you think there's a more elegant way to request a CREST run for any TS search adapter in ARC, then that would be welcomed. Let's discuss this.
arc/job/adapters/ts/crest.py
Outdated
| if reactions is None: | ||
| raise ValueError('Cannot execute TS CREST without ARCReaction object(s).') | ||
|
|
||
| dihedral_increment = dihedral_increment or 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it a to level constant, like in heuristics: DIHEDRAL_INCREMENT = 30, or even better, just import DIHEDRAL_INCREMENT from heuristics for consistency
| install-xtb: | ||
| bash $(DEVTOOLS_DIR)/install_xtb.sh | ||
|
|
||
| install-crest: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we add install-crest into install-all as well?
arc/species/species.py
Outdated
| for tsg in self.ts_guesses: | ||
| for cluster_tsg in cluster_tsgs: | ||
| if cluster_tsg.almost_equal_tsgs(tsg): | ||
| logger.info(f"Similar TSGuesses found: {tsg.index} is similar to {cluster_tsg.index}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this or make it a debug comment
|
|
||
| # TS methods to try when appropriate for a reaction (other than user guesses which are always allowed): | ||
| ts_adapters = ['heuristics', 'AutoTST', 'GCN', 'xtb_gsm'] | ||
| ts_adapters = ['heuristics', 'AutoTST', 'GCN', 'xtb_gsm', 'crest'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should wither use CREST or use H_Abstraction (this is an example-like settings, no?)
arc/settings/settings.py
Outdated
|
|
||
|
|
||
|
|
||
| def parse_version(folder_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should house these funcs somewhere else, not in settings.py
maybe try in imports.py?
Are these funcs testable?
arc/constants.py
Outdated
| epsilon_0 = 8.8541878128 | ||
|
|
||
| bohr_to_angstrom = 0.529177 | ||
| angstrom_to_bohr = 1.8897259886 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need it? maybe in the code we can just do 1 / bohr_to_angstrom instead of 1* angstrom_to_bohr
If you feel that you need it for semantics, than maybe consider defining angstrom_to_bohr = 1/ bohr_to_angstrom here, for consistency?
6f36b41 to
0c5db35
Compare
Adds the angstrom to bohr conversion factor to the constants module. This facilitates easier conversions between these units within the codebase, enhancing usability and reducing potential errors.
0c5db35 to
cf0ff11
Compare
| import os | ||
| import string | ||
| import sys | ||
| import shutil |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 19 hours ago
In general, an unused import should be removed from the file to avoid unnecessary dependencies and keep the code clean. Since we must not change existing functionality, we only remove the import when there is no evidence of it being used.
For arc/settings/settings.py, the best fix is to delete the import shutil line (line 12 in the snippet) while leaving all other imports intact. No additional methods, imports, or definitions are needed. The only change is to edit arc/settings/settings.py and remove the single unused import statement.
| @@ -9,7 +9,6 @@ | ||
| import os | ||
| import string | ||
| import sys | ||
| import shutil | ||
| from arc.settings.crest import ( | ||
| find_crest_executable, | ||
| find_highest_version_in_directory, |
4ce45bc to
9a560dc
Compare
Adds CREST settings and installation to ARC for transition state search. This commit introduces necessary files and updates to enable CREST within the ARC framework, enhancing its capabilities for exploring reaction pathways. It includes: - A script to install CREST via conda. - A module for locating the CREST executable and setting up the environment. - Integration of CREST into the settings to allow its use as a TS adapter.
Adds a CREST adapter for transition state (TS) conformer searches, leveraging heuristics-generated guesses to find suitable TS structures. This facilitates more comprehensive TS exploration, particularly for reaction families supported by heuristics but potentially refined through CREST's conformer searching capabilities. Also introduces a TS seed hub, which centralizes requests to base TS-search adapters, and provides wrapper adapters (e.g., CREST) family-specific constraints for a seed.
Adds documentation for the CREST adapter, including a guide for extending CREST-based TS workflows and minimal usage examples. The documentation covers current family support, external references, and extension instructions for adding new families to CREST or enabling CREST to wrap a new TS seed adapter. Also includes seed schema contract.
Introduces a `method_sources` attribute to the TSGuess class. This attribute stores all methods that produced an equivalent xyz guess. Normalizes the method sources to a unique, ordered, lowercase list, ensuring consistency and avoiding duplicates. This allows for better tracking of the origins of TS guesses and simplifies the clustering logic.
Adds 'crest' as a valid option for transition state search methods. Adds a new job to the test suite for coverage of the wall time exceeded functionality. Updates testing path for wall_exceeded fixture Updates the path used to locate the `wall_exceeded.txt` fixture in the `TestJobAdapter` test class. This ensures that the test can correctly access the fixture file, regardless of the execution environment.
Includes CREST as a valid TS adapter option for H_Abstraction reactions. This allows users to utilize CREST for transition state searches, expanding the available methods.
Adds a function to reorder and convert XYZ strings between ``ATOM X Y Z`` and ``X Y Z ATOM`` formats, with optional unit conversion. Also adds a backwards-compatible wrapper with a deprecation warning.
Modifies restart tests to generate unique project names when running in parallel using pytest-xdist. This avoids collisions during cleanup of project directories. Updates restart tests to use ARC_TESTING_PATH Modifies restart tests to utilize the ARC_TESTING_PATH constant for specifying test directories. This change ensures consistency and simplifies path management within the testing framework.
Includes setuptools as a dependency in the environment file. This ensures that the package installation process has access to required tools and resources.
Addition of CREST Adapter that complements the heuristic adapter.
This pull request adds support for the CREST conformer and transition state (TS) search method to the ARC project, along with several related improvements and code cleanups. The most important changes include integrating CREST as a TS search adapter, updating configuration and constants, and enhancing the heuristics TS search logic for better provenance tracking and code clarity.
CREST Integration:
JobEnum(arc/job/adapter.py), included CREST in the list of adapters and RMG family mapping, and registered it as a default incore adapter (arc/job/adapters/common.py,arc/job/adapters/ts/__init__.py). [1] [2] [3] [4]arc/job/adapters/ts/crest_test.py).Makefile). [1] [2]Constants and Configuration:
angstrom_to_bohrconversion constant to both Cython and Python constants modules (arc/constants.pxd,arc/constants.py). [1] [2] [3]Heuristics TS Search Enhancements and Refactoring:
arc/job/adapters/ts/heuristics.py). [1] [2] [3] [4]arc/job/adapters/ts/heuristics.py). [1] [2] [3] [4] [5] [6] [7].