synplan.route_quality package
Route quality assessment module for synthetic route analysis.
Provides functional group detection, reaction classification, and incompatibility scoring to identify synthesis steps that may require protecting group strategies.
This module is inspired by the work of Westerlund et al.:
Westerlund, A. M.; Sigmund, L. M.; Kannas, C.; Genheden, S.; Kabeshov, M. “Toward lab-ready AI synthesis plans with protection strategies and route scoring.” ChemRxiv, 2025. https://doi.org/10.26434/chemrxiv-2025-gdrr8
The competing-sites score S(T) and the functional-group incompatibility framework follow the methodology described in that paper.
- class synplan.route_quality.CompetingInteraction(*, step_id: int, fg_name: str, fg_atoms: tuple[int, ...], reacting_fg: str | None, severity: str)#
Bases: BaseModel
A competing functional group interaction at a synthesis step.
- Parameters:
step_id – Index of the reaction step in the route.
fg_name – Name of the competing functional group.
fg_atoms – Atom indices of the matched functional group.
reacting_fg – Name of the FG at the reaction center (or None).
severity – Interaction severity: “incompatible”, “competing”, or “compatible”.
- model_config = {'frozen': True}#
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class synplan.route_quality.CompetingSitesScore(scanner: RouteScanner)#
Bases: object
Score routes by their competing functional group burden.
Uses a RouteScanner to detect interactions and then computes a worst-per-step score. For each step s, w_s is the maximum severity penalty among the interactions at that step, and
S(T) = max(1 - (sum_s(w_s) + H) / max(N, 1), 0)
where H is the halogen competing-site count and N is the number of steps. Each step contributes at most 1.0 (incompatible) or 0.5 (competing) to the penalty, preventing highly functionalized molecules from overwhelming the score.
- Parameters:
scanner – A RouteScanner instance configured with a FunctionalGroupDetector and IncompatibilityMatrix.
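The S(T) formula above can be checked with plain numbers. The following is a minimal sketch, not SynPlanner's implementation: the function name `competing_sites_score`, the `PENALTY` table, and the input layout (one list of severity labels per step) are assumptions made for illustration, as is the 0.0 penalty for "compatible".

```python
# Hypothetical severity-to-penalty table. The 1.0 and 0.5 values come from the
# class description; 0.0 for "compatible" is an assumption.
PENALTY = {"incompatible": 1.0, "competing": 0.5, "compatible": 0.0}


def competing_sites_score(step_severities, halogen_count=0):
    """Compute S(T) = max(1 - (sum(w_s) + H) / max(N, 1), 0).

    step_severities: one list of severity labels per route step.
    halogen_count: the H term (same-family competing halogen sites).
    """
    n_steps = len(step_severities)
    # w_s is the largest penalty within a step; a step with no interactions gets 0.0.
    worst_per_step = [
        max((PENALTY[label] for label in labels), default=0.0)
        for labels in step_severities
    ]
    raw = 1.0 - (sum(worst_per_step) + halogen_count) / max(n_steps, 1)
    return max(raw, 0.0)


# Four steps: the worst penalties are 0.5, 0.0, 1.0 and 0.0, plus H = 1.
route = [["competing", "competing"], [], ["incompatible", "compatible"], []]
score = competing_sites_score(route, halogen_count=1)
print(score)
```

Two competing groups in the first step count once (0.5), which is the "worst-per-step" behaviour the description refers to.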
- rank_routes(routes: dict[int, dict[int, ReactionContainer]], existing_scores: dict[int, float] | None = None, weight: float = 0.5) list[tuple[int, float, float, float]]#
Rank routes by a combined score mixing original and protection scores.
combined = (1 - weight) * original_score_normalized + weight * S(T)
If no existing scores are provided, the original score component is treated as 0.0 for all routes and only the protection score is used.
- Parameters:
routes – Dict mapping route_id -> {step_id: ReactionContainer}.
existing_scores – Optional dict mapping route_id -> original route score (e.g. from Tree.route_score()).
weight – Weight of the protection score in [0, 1].
- Returns:
List of (route_id, combined_score, protection_score, original_score) tuples, sorted descending by combined_score.
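The blending formula can be sketched without any SynPlanner objects. This is an illustration only: the source does not say how `original_score_normalized` is obtained, so dividing by the largest original score is an assumption here, and `rank_by_combined` is a hypothetical name that takes precomputed S(T) values instead of routes.

```python
def rank_by_combined(protection, original=None, weight=0.5):
    """Return (route_id, combined, protection, original) tuples, best first.

    protection: dict route_id -> S(T), assumed to be precomputed.
    original: optional dict route_id -> original route score.
    """
    original = original or {}
    # Assumption: normalize by the largest original score (not confirmed by the source).
    top = max(original.values(), default=0.0)
    rows = []
    for route_id, s_t in protection.items():
        raw = original.get(route_id, 0.0)
        normalized = raw / top if top > 0 else 0.0
        combined = (1 - weight) * normalized + weight * s_t
        rows.append((route_id, combined, s_t, raw))
    # Sort descending by the combined score.
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_combined({1: 0.25, 2: 1.0}, original={1: 0.8, 2: 0.4}, weight=0.5)
for row in ranked:
    print(row)
```

With these inputs, route 2 overtakes route 1 despite its lower original score, because its S(T) is much higher.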
- score_route(route: dict[int, ReactionContainer]) tuple[float, list[CompetingInteraction]]#
Compute the S(T) score for a single route.
- Parameters:
route – A dict mapping step_id -> ReactionContainer.
- Returns:
Tuple of (score, interactions) where score is in [0, 1] and interactions is the list of CompetingInteraction objects.
- class synplan.route_quality.FunctionalGroupDetector(config_path: str)#
Bases: object
SMARTS-based functional group detector.
Loads a YAML config of SMARTS patterns organized by category and uses chython substructure matching to detect functional groups in molecules.
Results are cached by canonical SMILES (with hydrogens) so that repeated queries for the same molecule are fast.
- Parameters:
config_path – Path to a YAML file with SMARTS definitions, organized by category (nucleophile/electrophile/unsaturated).
- detect_all(molecule: MoleculeContainer) list[FunctionalGroupMatch]#
Detect all functional group matches in a molecule.
Applies every loaded SMARTS pattern and returns deduplicated matches (unique by name + sorted atom indices). Results are cached by canonical SMILES so that the same molecule is not re-scanned.
- Parameters:
molecule – A chython MoleculeContainer to search.
- Returns:
List of FunctionalGroupMatch objects.
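The deduplication and caching described for detect_all() can be sketched as follows. This is a toy model under stated assumptions: `Match`, `CachedDetector`, and `fake_finder` are hypothetical names, the `finder` callable stands in for SMARTS matching, and a plain string plays the role of the canonical SMILES cache key.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Hypothetical stand-in for FunctionalGroupMatch."""
    name: str
    category: str
    atom_indices: tuple


class CachedDetector:
    """Toy detector: `finder` plays the role of SMARTS substructure matching."""

    def __init__(self, finder):
        self._finder = finder   # callable: smiles -> iterable of (name, category, atoms)
        self._cache = {}        # smiles -> list of Match
        self.scans = 0          # counts real scans so cache hits are visible

    def detect_all(self, smiles):
        if smiles in self._cache:
            return self._cache[smiles]
        self.scans += 1
        seen, matches = set(), []
        for name, category, atoms in self._finder(smiles):
            key = (name, tuple(sorted(atoms)))  # unique by name + sorted atom indices
            if key not in seen:
                seen.add(key)
                matches.append(Match(name, category, key[1]))
        self._cache[smiles] = matches
        return matches


def fake_finder(smiles):
    # A symmetric pattern can report the same atoms in two different orders.
    return [("hydroxyl", "nucleophile", (3, 2)), ("hydroxyl", "nucleophile", (2, 3))]


detector = CachedDetector(fake_finder)
first = detector.detect_all("CCO")
second = detector.detect_all("CCO")  # served from the cache, no second scan
print(first, detector.scans)
```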
- detect_competing(molecule: MoleculeContainer, reaction_center_atoms: set[int]) list[FunctionalGroupMatch]#
Detect functional groups NOT overlapping with the reaction center.
These are “competing” sites that may interfere with the intended reaction at the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of FunctionalGroupMatch objects for competing FGs.
- detect_reacting(molecule: MoleculeContainer, reaction_center_atoms: set[int]) FunctionalGroupMatch | None#
Detect the functional group at the reaction center.
Returns the first FG whose atoms overlap with the reaction center, or None if no known FG is found there.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
The FunctionalGroupMatch at the reaction center, or None.
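The split between the reacting FG and competing FGs comes down to a set-overlap test on atom indices. A minimal sketch, assuming matches are given as plain `(name, atom_indices)` pairs; `split_by_reaction_center` and the functional group names are hypothetical:

```python
def split_by_reaction_center(matches, reaction_center_atoms):
    """Return (reacting, competing) for matches given as (name, atom_indices).

    reacting is the first match that overlaps the reaction center (or None);
    competing holds every match with no atom in the reaction center.
    """
    reacting = None
    competing = []
    for name, atom_indices in matches:
        if reaction_center_atoms & set(atom_indices):
            if reacting is None:  # the first overlapping FG wins
                reacting = (name, atom_indices)
        else:
            competing.append((name, atom_indices))
    return reacting, competing


matches = [
    ("carboxylic_acid", (1, 2, 3)),
    ("hydroxyl", (7,)),
    ("primary_amine", (9,)),
]
reacting, competing = split_by_reaction_center(matches, {3, 4})
print(reacting, competing)
```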
- class synplan.route_quality.FunctionalGroupMatch(*, name: str, category: str, atom_indices: tuple[int, ...])#
Bases: BaseModel
A single functional group match in a molecule.
- Parameters:
name – Human-readable name of the functional group (e.g. “hydroxyl”).
category – Reactivity category (e.g. “nucleophile”, “electrophile”).
atom_indices – Tuple of matched atom indices in the molecule, sorted for deduplication.
- model_config = {'frozen': True}#
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class synplan.route_quality.HalogenDetector(config_path: str)#
Bases: object
SMARTS-based halogen group detector.
Loads a YAML config of halogen SMARTS patterns and detects halogens in molecules. Used to count same-family competing halogens for the H term in the S(T) score.
- Parameters:
config_path – Path to a YAML file with halogen SMARTS definitions.
- count_same_family_competing(molecule: MoleculeContainer, reaction_center_atoms: set[int]) int#
Count competing halogens in the same family as reaction center halogens.
Per the paper, only halogens at competing sites that share the same halogen family as a halogen at the reaction center count toward the H term in S(T).
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
Number of same-family competing halogen sites.
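The same-family rule can be illustrated with plain tuples. This is a sketch of the counting logic as described above, not the library code; `count_same_family` and the `(family, atom_indices)` input layout are assumptions.

```python
def count_same_family(halogens, reaction_center_atoms):
    """Count competing halogens whose family also occurs at the reaction center.

    halogens: list of (family, atom_indices) pairs.
    """
    # Families of the halogens that overlap the reaction center.
    families_at_center = {
        family
        for family, atoms in halogens
        if reaction_center_atoms & set(atoms)
    }
    # Competing halogens do not overlap the center but share one of those families.
    return sum(
        1
        for family, atoms in halogens
        if not (reaction_center_atoms & set(atoms)) and family in families_at_center
    )


halogens = [
    ("bromide", (1, 2)),     # overlaps the reaction center
    ("bromide", (8, 9)),     # competing, same family: counted
    ("chloride", (12, 13)),  # competing, different family: ignored
]
h_term = count_same_family(halogens, {2, 3})
print(h_term)
```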
- detect_all(molecule: MoleculeContainer) list[HalogenMatch]#
Detect all halogen matches in a molecule.
- Parameters:
molecule – A chython MoleculeContainer to search.
- Returns:
List of HalogenMatch objects.
- detect_competing_halogens(molecule: MoleculeContainer, reaction_center_atoms: set[int]) list[HalogenMatch]#
Detect halogen groups NOT overlapping with the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of HalogenMatch objects for competing halogens.
- detect_reaction_center_halogens(molecule: MoleculeContainer, reaction_center_atoms: set[int]) list[HalogenMatch]#
Detect halogen groups overlapping with the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of HalogenMatch objects at the reaction center.
- class synplan.route_quality.HalogenMatch(*, name: str, family: str, atom_indices: tuple[int, ...])#
Bases: BaseModel
A single halogen group match in a molecule.
- Parameters:
name – Name of the halogen pattern (e.g. “aryl_bromide”).
family – Halogen family (e.g. “bromide”, “chloride”).
atom_indices – Tuple of matched atom indices in the molecule.
- model_config = {'frozen': True}#
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class synplan.route_quality.IncompatibilityMatrix(config_path: str)#
Bases: object
Lookup table for functional group vs functional group incompatibility.
Loads a TSV matrix where the first row contains column FG names (with an empty first cell) and subsequent rows have a row FG name followed by integer severity levels (0=compatible, 1=competing, 2=incompatible).
- Parameters:
config_path – Path to the incompatibility matrix TSV file.
- lookup(competing_fg: str, reacting_fg: str) str#
Look up the severity of a (competing_fg, reacting_fg) pair.
- Parameters:
competing_fg – Competing functional group name (row key).
reacting_fg – Reacting functional group name (column key).
- Returns:
Severity label: “incompatible”, “competing”, or “compatible”.
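The TSV layout and the lookup can be sketched with the standard library. The matrix contents below are invented for illustration, `load_matrix` and `lookup` are hypothetical helpers, and the handling of unknown names (a `KeyError` here) is an assumption, since the source does not describe it.

```python
import csv
import io

# Integer levels and their labels, as listed in the class description.
LABELS = {0: "compatible", 1: "competing", 2: "incompatible"}


def load_matrix(tsv_text):
    """Parse a TSV matrix into {row_fg: {column_fg: level}}."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    columns = rows[0][1:]  # the first header cell is empty
    return {
        row[0]: {col: int(val) for col, val in zip(columns, row[1:])}
        for row in rows[1:]
        if row
    }


def lookup(matrix, competing_fg, reacting_fg):
    """Rows are competing FGs, columns are reacting FGs."""
    return LABELS[matrix[competing_fg][reacting_fg]]


# Invented example data: two reacting FGs (columns), two competing FGs (rows).
TSV = "\tacyl_chloride\talkyl_halide\nprimary_amine\t2\t1\nether\t0\t0\n"
matrix = load_matrix(TSV)
print(lookup(matrix, "primary_amine", "acyl_chloride"))
```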
- class synplan.route_quality.ProtectionConfig(*, competing_groups_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/competing_groups.yaml', incompatibility_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/incompatibility_matrix.tsv', halogen_groups_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/halogen_groups.yaml', score_weight: Annotated[float, Ge(ge=0.0), Le(le=1.0)] = 0.5, enable_reranking: bool = True)#
Bases: BaseConfigModel
Configuration for protection-group analysis.
- Parameters:
competing_groups_path – Path to YAML file with SMARTS patterns for reactive functional groups, organized by category.
incompatibility_path – Path to TSV file with the FG x FG incompatibility matrix.
halogen_groups_path – Path to YAML file with halogen SMARTS patterns grouped by halogen family.
score_weight – Weight of the protection score S(T) when combining with the original route score. Must be in [0, 1].
enable_reranking – If True, re-rank candidate routes using the combined score that includes the protection penalty.
- model_config = {'extra': 'forbid'}#
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class synplan.route_quality.ProtectionRouteScorer(scorer: CompetingSitesScore, weight: float = 1.0)#
Bases: RouteScorer
Route scorer based on competing functional-group incompatibility.
Wraps a CompetingSitesScore and applies the paper’s re-ranking formula:
rescored = original * ((1 - w) + w * S(T))
With the default weight=1.0 this reduces to original * S(T).
- Parameters:
scorer – A configured CompetingSitesScore instance.
weight – Strength of the protection penalty in [0, 1]. 1.0 matches the paper exactly; lower values soften the penalty.
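A short numeric sketch of the re-ranking formula shows how the weight interpolates between no penalty and the full multiplicative penalty. The standalone function `rescore_sketch` is hypothetical and takes a precomputed S(T) value rather than a route.

```python
def rescore_sketch(original, s_t, weight=1.0):
    """rescored = original * ((1 - w) + w * S(T))."""
    return original * ((1 - weight) + weight * s_t)


original, s_t = 0.8, 0.5
full = rescore_sketch(original, s_t)               # weight=1.0: original * S(T)
half = rescore_sketch(original, s_t, weight=0.5)   # softened penalty
none = rescore_sketch(original, s_t, weight=0.0)   # original score unchanged
print(full, half, none)
```

Unlike the additive blend in CompetingSitesScore.rank_routes(), this form is multiplicative, so a route with S(T) = 0 is rescored to 0 at weight=1.0 regardless of its original score.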
- classmethod from_config(config=None, weight: float = 1.0) ProtectionRouteScorer#
Build a scorer from a ProtectionConfig.
- Parameters:
config – A ProtectionConfig instance. If None, uses default paths bundled with SynPlanner.
weight – Protection penalty weight.
- Returns:
Configured ProtectionRouteScorer.
- class synplan.route_quality.RouteScanner(fg_detector: FunctionalGroupDetector, incompatibility: IncompatibilityMatrix, halogen_detector: HalogenDetector | None = None)#
Bases: object
Scan a synthesis route for competing functional group interactions.
For each step in the route, detects functional groups on the product molecule that do not overlap with the reaction center, identifies the FG at the reaction center (“reacting FG”), and classifies their interaction severity using the FG x FG incompatibility matrix. Also counts same-family competing halogens for the H term.
- Parameters:
fg_detector – A FunctionalGroupDetector instance.
incompatibility – An IncompatibilityMatrix instance.
halogen_detector – An optional HalogenDetector instance.
- static classify_interactions(interactions: list[CompetingInteraction], halogen_count: int = 0) tuple[int, int, int]#
Count interactions by severity category.
- Parameters:
interactions – List of CompetingInteraction objects.
halogen_count – Number of same-family competing halogen sites.
- Returns:
Tuple of (I, C, H) where:
- I = number of incompatible interactions
- C = number of competing interactions
- H = number of same-family competing halogen sites
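The (I, C, H) tally is a simple count by severity label. A minimal sketch, assuming severities are passed as plain strings instead of CompetingInteraction objects; `classify_sketch` is a hypothetical name:

```python
from collections import Counter


def classify_sketch(severities, halogen_count=0):
    """Return (I, C, H): incompatible count, competing count, halogen count."""
    counts = Counter(severities)
    # "compatible" interactions are not part of the returned tuple.
    return counts["incompatible"], counts["competing"], halogen_count


result = classify_sketch(
    ["incompatible", "competing", "competing", "compatible"],
    halogen_count=2,
)
print(result)
```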
- scan_route(route: dict[int, ReactionContainer]) tuple[list[CompetingInteraction], int]#
Walk a route step-by-step and collect competing interactions.
For each step the scanner:
1. Identifies the reacting FG: the FG on the reactant side that is consumed by the reaction (present in the reactant and overlapping the reaction center). This matches the paper’s approach of looking at the FG being transformed.
2. Identifies competing FGs on the product side that do not overlap the reaction center.
3. Looks up the severity of each competing FG against the reacting FG in the incompatibility matrix.
- Parameters:
route – A dict mapping step_id -> ReactionContainer, as returned by extract_reactions() in synplan.chem.reaction_routes.route_cgr.
- Returns:
Tuple of (interactions, halogen_count) where interactions is a list of CompetingInteraction objects and halogen_count is the total number of same-family competing halogen sites.
- class synplan.route_quality.RouteScorer#
Bases: ABC
Abstract base for post-search route re-ranking.
Subclasses implement score() to evaluate a synthesis route and optionally override rescore() to customise how the quality score is blended with the original tree search score.
- rescore(original_score: float, route: tuple[ReactionContainer, ...]) float#
Combine the original tree score with this scorer’s assessment.
Default: original * score(route) (multiplicative weighting as in Westerlund et al., 2025). Override for custom blending.
- Parameters:
original_score – Raw score from the tree search.
route – Ordered tuple of reactions.
- Returns:
Adjusted score.
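The subclassing pattern can be sketched with a stand-in base class. Everything here is illustrative: `RouteScorerSketch` mirrors the documented behaviour but is not the SynPlanner class, the signature of score() is assumed (the source only shows rescore()), and both subclasses are hypothetical.

```python
from abc import ABC, abstractmethod


class RouteScorerSketch(ABC):
    """Stand-in for RouteScorer; the score() signature is an assumption."""

    @abstractmethod
    def score(self, route):
        """Return a quality score for the route."""

    def rescore(self, original_score, route):
        # Default blending: multiplicative weighting.
        return original_score * self.score(route)


class ShortRouteBonus(RouteScorerSketch):
    """Hypothetical scorer that favours shorter routes."""

    def score(self, route):
        return 1.0 / max(len(route), 1)


class AveragedBlend(ShortRouteBonus):
    """Overrides rescore() to average the two scores instead of multiplying."""

    def rescore(self, original_score, route):
        return 0.5 * original_score + 0.5 * self.score(route)


route = ("step_1", "step_2")  # placeholders for ReactionContainer objects
multiplied = ShortRouteBonus().rescore(0.8, route)
averaged = AveragedBlend().rescore(0.8, route)
print(multiplied, averaged)
```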
- synplan.route_quality.classify_reaction_type(reaction: ReactionContainer, cgr=None) str#
Classify a reaction into a broad type based on CGR bond analysis.
This is the default classifier used throughout the protection module. It delegates to classify_reaction_type_broad().
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the broad reaction type.
- synplan.route_quality.classify_reaction_type_broad(reaction: ReactionContainer, cgr=None) str#
Classify a reaction into a broad type based on CGR bond analysis.
This is the original 4-category classifier.
- Possible return values:
'bond_formation': only new bonds are formed
'bond_breaking': only existing bonds are broken
'substitution': bonds are both formed and broken
'other': no bond changes detected
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the broad reaction type.
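The four-way decision can be written as a small function over bond counts. This sketch assumes the numbers of formed and broken bonds have already been extracted from the CGR; how the real classifier derives them, and how it treats bond-order changes, is not specified in the source. `classify_broad_sketch` is a hypothetical name.

```python
def classify_broad_sketch(n_formed, n_broken):
    """Map counts of formed and broken bonds to one of the four broad labels."""
    if n_formed and n_broken:
        return "substitution"
    if n_formed:
        return "bond_formation"
    if n_broken:
        return "bond_breaking"
    return "other"


labels = [
    classify_broad_sketch(1, 0),
    classify_broad_sketch(0, 2),
    classify_broad_sketch(1, 1),
    classify_broad_sketch(0, 0),
]
print(labels)
```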
- synplan.route_quality.classify_reaction_type_detailed(reaction: ReactionContainer, cgr=None) str#
Classify a reaction into a fine-grained named type using CGR analysis.
Examines which atoms and bond-order changes are involved at the reaction center to return a more specific label.
- Possible return values:
'acylation' – C=O at center + new C-N or C-O bond
'alkylation' – new C-N, C-O, or C-S bond without C=O at center
'reduction' – net bond order decrease
'oxidation' – net bond order increase
'cross_coupling' – new C-C bond formed
'amide_formation' – new C-N bond + C=O at center
'ester_formation' – new C-O bond + C=O at center
'halogenation' – new C-halogen bond
'dehalogenation' – C-halogen bond broken
'ring_closure' – intramolecular bond formation creating a ring
'ring_opening' – ring bond broken
'other' – fallback
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the detailed reaction type.
- synplan.route_quality.get_reaction_center_atoms(reaction: ReactionContainer, cgr: CGRContainer | None = None) set[int]#
Extract atoms involved in bond or charge changes using CGR.
Uses reaction.compose() to obtain the CGR, then returns atoms where bond orders or charges change. cgr.center_atoms already includes both bond-change and charge-change atoms.
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A set of atom indices that participate in bond or charge changes.
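The idea of a reaction center (atoms whose bonds or charges differ between reactants and products) can be shown with plain dictionaries. This is a conceptual sketch only and does not reproduce chython's CGR machinery; `center_atoms_sketch` and the dictionary-based inputs are assumptions, and atom-to-atom mapping is taken as given.

```python
def center_atoms_sketch(reactant_bonds, product_bonds,
                        reactant_charges=None, product_charges=None):
    """Return atoms whose bond orders or charges change.

    Bonds are dicts {(atom_a, atom_b): order} with atom_a < atom_b;
    charges are dicts {atom: formal_charge}, where a missing atom means 0.
    """
    reactant_charges = reactant_charges or {}
    product_charges = product_charges or {}
    center = set()
    # A bond that is formed, broken, or changes order marks both of its atoms.
    for pair in reactant_bonds.keys() | product_bonds.keys():
        if reactant_bonds.get(pair) != product_bonds.get(pair):
            center.update(pair)
    # A charge change marks the atom itself.
    for atom in reactant_charges.keys() | product_charges.keys():
        if reactant_charges.get(atom, 0) != product_charges.get(atom, 0):
            center.add(atom)
    return center


# Acyl chloride (C1=O2, C1-Cl3) plus amine N4 gives an amide (C1=O2, C1-N4).
reactants = {(1, 2): 2, (1, 3): 1}
products = {(1, 2): 2, (1, 4): 1}
center = center_atoms_sketch(reactants, products)
print(sorted(center))
```

The carbonyl oxygen (atom 2) stays outside the center because its bond to atom 1 is unchanged.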
synplan.route_quality.protection package
Protection strategy module for synthetic route analysis.
Provides functional group detection, reaction classification, and incompatibility scoring to identify synthesis steps that may require protecting group strategies.
This module is inspired by the work of Westerlund et al.:
Westerlund, A. M.; Sigmund, L. M.; Kannas, C.; Genheden, S.; Kabeshov, M. “Toward lab-ready AI synthesis plans with protection strategies and route scoring.” ChemRxiv, 2025. https://doi.org/10.26434/chemrxiv-2025-gdrr8
The competing-sites score S(T) and the functional-group incompatibility framework follow the methodology described in that paper.
- class synplan.route_quality.protection.CompetingInteraction(*, step_id: int, fg_name: str, fg_atoms: tuple[int, ...], reacting_fg: str | None, severity: str)
Bases: BaseModel
A competing functional group interaction at a synthesis step.
- Parameters:
step_id – Index of the reaction step in the route.
fg_name – Name of the competing functional group.
fg_atoms – Atom indices of the matched functional group.
reacting_fg – Name of the FG at the reaction center (or None).
severity – Interaction severity: “incompatible”, “competing”, or “compatible”.
- fg_name: str
- model_config = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- severity: str
- step_id: int
- class synplan.route_quality.protection.CompetingSitesScore(scanner: RouteScanner)
Bases: object
Score routes by their competing functional group burden.
Uses a RouteScanner to detect interactions and then computes a worst-per-step score. For each step s, w_s is the maximum severity penalty among the interactions at that step, and
S(T) = max(1 - (sum_s(w_s) + H) / max(N, 1), 0)
where H is the halogen competing-site count and N is the number of steps. Each step contributes at most 1.0 (incompatible) or 0.5 (competing) to the penalty, preventing highly functionalized molecules from overwhelming the score.
- Parameters:
scanner – A RouteScanner instance configured with a FunctionalGroupDetector and IncompatibilityMatrix.
- rank_routes(routes: dict[int, dict[int, ReactionContainer]], existing_scores: dict[int, float] | None = None, weight: float = 0.5) list[tuple[int, float, float, float]]
Rank routes by a combined score mixing original and protection scores.
combined = (1 - weight) * original_score_normalized + weight * S(T)
If no existing scores are provided, the original score component is treated as 0.0 for all routes and only the protection score is used.
- Parameters:
routes – Dict mapping route_id -> {step_id: ReactionContainer}.
existing_scores – Optional dict mapping route_id -> original route score (e.g. from Tree.route_score()).
weight – Weight of the protection score in [0, 1].
- Returns:
List of (route_id, combined_score, protection_score, original_score) tuples, sorted descending by combined_score.
- score_route(route: dict[int, ReactionContainer]) tuple[float, list[CompetingInteraction]]
Compute the S(T) score for a single route.
- Parameters:
route – A dict mapping step_id -> ReactionContainer.
- Returns:
Tuple of (score, interactions) where score is in [0, 1] and interactions is the list of CompetingInteraction objects.
- class synplan.route_quality.protection.FunctionalGroupDetector(config_path: str)
Bases: object
SMARTS-based functional group detector.
Loads a YAML config of SMARTS patterns organized by category and uses chython substructure matching to detect functional groups in molecules.
Results are cached by canonical SMILES (with hydrogens) so that repeated queries for the same molecule are fast.
- Parameters:
config_path – Path to a YAML file with SMARTS definitions, organized by category (nucleophile/electrophile/unsaturated).
- clear_cache() None
Clear the internal results cache.
- detect_all(molecule: MoleculeContainer) list[FunctionalGroupMatch]
Detect all functional group matches in a molecule.
Applies every loaded SMARTS pattern and returns deduplicated matches (unique by name + sorted atom indices). Results are cached by canonical SMILES so that the same molecule is not re-scanned.
- Parameters:
molecule – A chython MoleculeContainer to search.
- Returns:
List of FunctionalGroupMatch objects.
- detect_competing(molecule: MoleculeContainer, reaction_center_atoms: set[int]) list[FunctionalGroupMatch]
Detect functional groups NOT overlapping with the reaction center.
These are “competing” sites that may interfere with the intended reaction at the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of FunctionalGroupMatch objects for competing FGs.
- detect_reacting(molecule: MoleculeContainer, reaction_center_atoms: set[int]) FunctionalGroupMatch | None
Detect the functional group at the reaction center.
Returns the first FG whose atoms overlap with the reaction center, or None if no known FG is found there.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
The FunctionalGroupMatch at the reaction center, or None.
- class synplan.route_quality.protection.FunctionalGroupMatch(*, name: str, category: str, atom_indices: tuple[int, ...])
Bases: BaseModel
A single functional group match in a molecule.
- Parameters:
name – Human-readable name of the functional group (e.g. “hydroxyl”).
category – Reactivity category (e.g. “nucleophile”, “electrophile”).
atom_indices – Tuple of matched atom indices in the molecule, sorted for deduplication.
- category: str
- model_config = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str
- class synplan.route_quality.protection.HalogenDetector(config_path: str)
Bases: object
SMARTS-based halogen group detector.
Loads a YAML config of halogen SMARTS patterns and detects halogens in molecules. Used to count same-family competing halogens for the H term in the S(T) score.
- Parameters:
config_path – Path to a YAML file with halogen SMARTS definitions.
- count_same_family_competing(molecule: MoleculeContainer, reaction_center_atoms: set[int]) int
Count competing halogens in the same family as reaction center halogens.
Per the paper, only halogens at competing sites that share the same halogen family as a halogen at the reaction center count toward the H term in S(T).
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
Number of same-family competing halogen sites.
- detect_all(molecule: MoleculeContainer) list[HalogenMatch]
Detect all halogen matches in a molecule.
- Parameters:
molecule – A chython MoleculeContainer to search.
- Returns:
List of HalogenMatch objects.
- detect_competing_halogens(molecule: MoleculeContainer, reaction_center_atoms: set[int]) list[HalogenMatch]
Detect halogen groups NOT overlapping with the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of HalogenMatch objects for competing halogens.
- detect_reaction_center_halogens(molecule: MoleculeContainer, reaction_center_atoms: set[int]) list[HalogenMatch]
Detect halogen groups overlapping with the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of HalogenMatch objects at the reaction center.
- class synplan.route_quality.protection.HalogenMatch(*, name: str, family: str, atom_indices: tuple[int, ...])
Bases: BaseModel
A single halogen group match in a molecule.
- Parameters:
name – Name of the halogen pattern (e.g. “aryl_bromide”).
family – Halogen family (e.g. “bromide”, “chloride”).
atom_indices – Tuple of matched atom indices in the molecule.
- family: str
- model_config = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str
- class synplan.route_quality.protection.IncompatibilityMatrix(config_path: str)
Bases: object
Lookup table for functional group vs functional group incompatibility.
Loads a TSV matrix where the first row contains column FG names (with an empty first cell) and subsequent rows have a row FG name followed by integer severity levels (0=compatible, 1=competing, 2=incompatible).
- Parameters:
config_path – Path to the incompatibility matrix TSV file.
- lookup(competing_fg: str, reacting_fg: str) str
Look up the severity of a (competing_fg, reacting_fg) pair.
- Parameters:
competing_fg – Competing functional group name (row key).
reacting_fg – Reacting functional group name (column key).
- Returns:
Severity label: “incompatible”, “competing”, or “compatible”.
- class synplan.route_quality.protection.ProtectionConfig(*, competing_groups_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/competing_groups.yaml', incompatibility_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/incompatibility_matrix.tsv', halogen_groups_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/halogen_groups.yaml', score_weight: Annotated[float, Ge(ge=0.0), Le(le=1.0)] = 0.5, enable_reranking: bool = True)
Bases: BaseConfigModel
Configuration for protection-group analysis.
- Parameters:
competing_groups_path – Path to YAML file with SMARTS patterns for reactive functional groups, organized by category.
incompatibility_path – Path to TSV file with the FG x FG incompatibility matrix.
halogen_groups_path – Path to YAML file with halogen SMARTS patterns grouped by halogen family.
score_weight – Weight of the protection score S(T) when combining with the original route score. Must be in [0, 1].
enable_reranking – If True, re-rank candidate routes using the combined score that includes the protection penalty.
- competing_groups_path: str
- enable_reranking: bool
- halogen_groups_path: str
- incompatibility_path: str
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- score_weight: float
- class synplan.route_quality.protection.RouteScanner(fg_detector: FunctionalGroupDetector, incompatibility: IncompatibilityMatrix, halogen_detector: HalogenDetector | None = None)
Bases: object
Scan a synthesis route for competing functional group interactions.
For each step in the route, detects functional groups on the product molecule that do not overlap with the reaction center, identifies the FG at the reaction center (“reacting FG”), and classifies their interaction severity using the FG x FG incompatibility matrix. Also counts same-family competing halogens for the H term.
- Parameters:
fg_detector – A FunctionalGroupDetector instance.
incompatibility – An IncompatibilityMatrix instance.
halogen_detector – An optional HalogenDetector instance.
- static classify_interactions(interactions: list[CompetingInteraction], halogen_count: int = 0) tuple[int, int, int]
Count interactions by severity category.
- Parameters:
interactions – List of CompetingInteraction objects.
halogen_count – Number of same-family competing halogen sites.
- Returns:
Tuple of (I, C, H) where:
- I = number of incompatible interactions
- C = number of competing interactions
- H = number of same-family competing halogen sites
- scan_route(route: dict[int, ReactionContainer]) tuple[list[CompetingInteraction], int]
Walk a route step-by-step and collect competing interactions.
For each step the scanner:
1. Identifies the reacting FG: the FG on the reactant side that is consumed by the reaction (present in the reactant and overlapping the reaction center). This matches the paper’s approach of looking at the FG being transformed.
2. Identifies competing FGs on the product side that do not overlap the reaction center.
3. Looks up the severity of each competing FG against the reacting FG in the incompatibility matrix.
- Parameters:
route – A dict mapping step_id -> ReactionContainer, as returned by extract_reactions() in synplan.chem.reaction_routes.route_cgr.
- Returns:
Tuple of (interactions, halogen_count) where interactions is a list of CompetingInteraction objects and halogen_count is the total number of same-family competing halogen sites.
- synplan.route_quality.protection.classify_reaction_type(reaction: ReactionContainer, cgr=None) str
Classify a reaction into a broad type based on CGR bond analysis.
This is the default classifier used throughout the protection module. It delegates to classify_reaction_type_broad().
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the broad reaction type.
- synplan.route_quality.protection.classify_reaction_type_broad(reaction: ReactionContainer, cgr=None) str
Classify a reaction into a broad type based on CGR bond analysis.
This is the original 4-category classifier.
- Possible return values:
'bond_formation': only new bonds are formed
'bond_breaking': only existing bonds are broken
'substitution': bonds are both formed and broken
'other': no bond changes detected
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the broad reaction type.
- synplan.route_quality.protection.classify_reaction_type_detailed(reaction: ReactionContainer, cgr=None) str
Classify a reaction into a fine-grained named type using CGR analysis.
Examines which atoms and bond-order changes are involved at the reaction center to return a more specific label.
- Possible return values:
'acylation' – C=O at center + new C-N or C-O bond
'alkylation' – new C-N, C-O, or C-S bond without C=O at center
'reduction' – net bond order decrease
'oxidation' – net bond order increase
'cross_coupling' – new C-C bond formed
'amide_formation' – new C-N bond + C=O at center
'ester_formation' – new C-O bond + C=O at center
'halogenation' – new C-halogen bond
'dehalogenation' – C-halogen bond broken
'ring_closure' – intramolecular bond formation creating a ring
'ring_opening' – ring bond broken
'other' – fallback
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the detailed reaction type.
- synplan.route_quality.protection.get_reaction_center_atoms(reaction: ReactionContainer, cgr: CGRContainer | None = None) set[int]
Extract atoms involved in bond or charge changes using CGR.
Uses reaction.compose() to obtain the CGR, then returns atoms where bond orders or charges change. cgr.center_atoms already includes both bond-change and charge-change atoms.
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A set of atom indices that participate in bond or charge changes.
synplan.route_quality.protection.config module
Configuration for the protection strategy module.
- class synplan.route_quality.protection.config.ProtectionConfig(*, competing_groups_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/competing_groups.yaml', incompatibility_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/incompatibility_matrix.tsv', halogen_groups_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/synplanner/checkouts/stable/synplan/route_quality/protection/data/halogen_groups.yaml', score_weight: Annotated[float, Ge(ge=0.0), Le(le=1.0)] = 0.5, enable_reranking: bool = True)
Bases: BaseConfigModel
Configuration for protection-group analysis.
- Parameters:
competing_groups_path – Path to YAML file with SMARTS patterns for reactive functional groups, organized by category.
incompatibility_path – Path to TSV file with the FG x FG incompatibility matrix.
halogen_groups_path – Path to YAML file with halogen SMARTS patterns grouped by halogen family.
score_weight – Weight of the protection score S(T) when combining with the original route score. Must be in [0, 1].
enable_reranking – If True, re-rank candidate routes using the combined score that includes the protection penalty.
- model_config = {'extra': 'forbid'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
synplan.route_quality.protection.functional_groups module
Functional group detection for protection strategy analysis.
SMARTS-based detection of reactive functional groups in molecules, used to identify competing sites that may require protecting group strategies during synthesis.
- class synplan.route_quality.protection.functional_groups.FunctionalGroupDetector(config_path: str)
Bases: object
SMARTS-based functional group detector.
Loads a YAML config of SMARTS patterns organized by category and uses chython substructure matching to detect functional groups in molecules.
Results are cached by canonical SMILES (with hydrogens) so that repeated queries for the same molecule are fast.
- Parameters:
config_path – Path to a YAML file with SMARTS definitions, organized by category (nucleophile/electrophile/unsaturated).
- detect_all(molecule: MoleculeContainer) -> list[FunctionalGroupMatch]
Detect all functional group matches in a molecule.
Applies every loaded SMARTS pattern and returns deduplicated matches (unique by name + sorted atom indices). Results are cached by canonical SMILES so that the same molecule is not re-scanned.
- Parameters:
molecule – A chython MoleculeContainer to search.
- Returns:
List of FunctionalGroupMatch objects.
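The deduplication and caching described for detect_all can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the Match tuple, the deduplicate and detect_all_cached helpers, and the run_patterns callback are hypothetical stand-ins for FunctionalGroupMatch and the chython substructure search.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Hypothetical stand-in for FunctionalGroupMatch."""
    name: str
    category: str
    atom_indices: tuple[int, ...]


def deduplicate(raw_hits):
    """Keep one match per (name, sorted atom indices) key."""
    seen = set()
    unique = []
    for name, category, atoms in raw_hits:
        key = (name, tuple(sorted(atoms)))
        if key not in seen:
            seen.add(key)
            unique.append(Match(name, category, key[1]))
    return unique


# Cache keyed by a canonical SMILES string (assumed to be supplied by the caller).
_cache: dict[str, list[Match]] = {}


def detect_all_cached(smiles_key, run_patterns):
    """Scan once per SMILES key; later calls reuse the stored result."""
    if smiles_key not in _cache:
        _cache[smiles_key] = deduplicate(run_patterns())
    return _cache[smiles_key]
```

Sorting the atom indices before building the key means that two SMARTS hits covering the same atoms in a different order collapse into one match.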
- detect_competing(molecule: MoleculeContainer, reaction_center_atoms: set[int]) -> list[FunctionalGroupMatch]
Detect functional groups NOT overlapping with the reaction center.
These are “competing” sites that may interfere with the intended reaction at the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of FunctionalGroupMatch objects for competing FGs.
- detect_reacting(molecule: MoleculeContainer, reaction_center_atoms: set[int]) -> FunctionalGroupMatch | None
Detect the functional group at the reaction center.
Returns the first FG whose atoms overlap with the reaction center, or None if no known FG is found there.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
The FunctionalGroupMatch at the reaction center, or None.
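The split between detect_reacting and detect_competing comes down to a set-overlap test against the reaction center. The sketch below illustrates that test on plain (name, atom_indices) pairs; split_by_reaction_center is a hypothetical helper, and the real methods operate on chython molecules rather than precomputed matches.

```python
def split_by_reaction_center(matches, reaction_center_atoms):
    """Return (reacting, competing) for a list of (name, atom_indices) pairs.

    reacting  -- first match sharing at least one atom with the center, or None
    competing -- all matches sharing no atom with the center
    """
    reacting = None
    competing = []
    for name, atoms in matches:
        if reaction_center_atoms.intersection(atoms):
            if reacting is None:
                reacting = (name, atoms)
        else:
            competing.append((name, atoms))
    return reacting, competing


matches = [("hydroxyl", (1, 2)), ("carboxylic_acid", (4, 5, 6)), ("amine", (8,))]
reacting, competing = split_by_reaction_center(matches, {5, 9})
```

In this example the carboxylic acid overlaps the center at atom 5 and becomes the reacting FG, while the hydroxyl and amine are reported as competing.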
- class synplan.route_quality.protection.functional_groups.FunctionalGroupMatch(*, name: str, category: str, atom_indices: tuple[int, ...])
Bases: BaseModel
A single functional group match in a molecule.
- Parameters:
name – Human-readable name of the functional group (e.g. “hydroxyl”).
category – Reactivity category (e.g. “nucleophile”, “electrophile”).
atom_indices – Tuple of matched atom indices in the molecule, sorted for deduplication.
- model_config = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class synplan.route_quality.protection.functional_groups.HalogenDetector(config_path: str)
Bases: object
SMARTS-based halogen group detector.
Loads a YAML config of halogen SMARTS patterns and detects halogens in molecules. Used to count same-family competing halogens for the H term in the S(T) score.
- Parameters:
config_path – Path to a YAML file with halogen SMARTS definitions.
- count_same_family_competing(molecule: MoleculeContainer, reaction_center_atoms: set[int]) -> int
Count competing halogens in the same family as reaction center halogens.
Per the paper, only halogens at competing sites that share the same halogen family as a halogen at the reaction center count toward the H term in S(T).
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
Number of same-family competing halogen sites.
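The same-family rule can be illustrated with a short sketch over (family, atom_indices) pairs. The function below is a hypothetical stand-alone version written for illustration; the actual method takes a chython molecule and runs the halogen SMARTS patterns itself.

```python
def count_same_family_competing(halogen_matches, reaction_center_atoms):
    """Count off-center halogens whose family also appears at the center.

    halogen_matches -- list of (family, atom_indices) pairs
    """
    at_center = [m for m in halogen_matches
                 if reaction_center_atoms.intersection(m[1])]
    elsewhere = [m for m in halogen_matches
                 if not reaction_center_atoms.intersection(m[1])]
    # Families of the halogens that take part in the reaction.
    center_families = {family for family, _ in at_center}
    return sum(1 for family, _ in elsewhere if family in center_families)


matches = [("bromide", (1, 2)), ("bromide", (7, 8)), ("chloride", (10, 11))]
h_term = count_same_family_competing(matches, {2, 3})
```

Here one bromide reacts, so only the second bromide counts toward H; the chloride belongs to a different family and is ignored.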
- detect_all(molecule: MoleculeContainer) -> list[HalogenMatch]
Detect all halogen matches in a molecule.
- Parameters:
molecule – A chython MoleculeContainer to search.
- Returns:
List of HalogenMatch objects.
- detect_competing_halogens(molecule: MoleculeContainer, reaction_center_atoms: set[int]) -> list[HalogenMatch]
Detect halogen groups NOT overlapping with the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of HalogenMatch objects for competing halogens.
- detect_reaction_center_halogens(molecule: MoleculeContainer, reaction_center_atoms: set[int]) -> list[HalogenMatch]
Detect halogen groups overlapping with the reaction center.
- Parameters:
molecule – A chython MoleculeContainer to search.
reaction_center_atoms – Atom indices of the reaction center.
- Returns:
List of HalogenMatch objects at the reaction center.
- class synplan.route_quality.protection.functional_groups.HalogenMatch(*, name: str, family: str, atom_indices: tuple[int, ...])
Bases: BaseModel
A single halogen group match in a molecule.
- Parameters:
name – Name of the halogen pattern (e.g. “aryl_bromide”).
family – Halogen family (e.g. “bromide”, “chloride”).
atom_indices – Tuple of matched atom indices in the molecule.
- model_config = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
synplan.route_quality.protection.reaction_classifier module
Reaction classifier module for protection strategy analysis.
Classifies reactions into broad types based on CGR (Condensed Graph of Reaction) bond analysis using chython. Used downstream by the route scanner to assess functional group incompatibilities at each synthetic step.
- synplan.route_quality.protection.reaction_classifier.classify_reaction_type(reaction: ReactionContainer, cgr=None) -> str
Classify a reaction into a broad type based on CGR bond analysis.
This is the default classifier used throughout the protection module. It delegates to classify_reaction_type_broad().
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the broad reaction type.
- synplan.route_quality.protection.reaction_classifier.classify_reaction_type_broad(reaction: ReactionContainer, cgr=None) -> str
Classify a reaction into a broad type based on CGR bond analysis.
This is the original 4-category classifier.
- Possible return values:
'bond_formation': only new bonds are formed
'bond_breaking': only existing bonds are broken
'substitution': bonds are both formed and broken
'other': no bond changes detected
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the broad reaction type.
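The four broad labels follow from two counts: how many bonds are formed and how many are broken. The sketch below shows that decision table on plain integers. It assumes the counts have already been extracted from the CGR; classify_broad is a hypothetical helper, and how the real classifier treats bond-order changes (for example double to single) is not specified here.

```python
def classify_broad(n_formed: int, n_broken: int) -> str:
    """Map formed/broken bond counts to one of the four broad labels."""
    if n_formed and n_broken:
        return "substitution"
    if n_formed:
        return "bond_formation"
    if n_broken:
        return "bond_breaking"
    return "other"


label = classify_broad(n_formed=1, n_broken=1)
```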
- synplan.route_quality.protection.reaction_classifier.classify_reaction_type_detailed(reaction: ReactionContainer, cgr=None) -> str
Classify a reaction into a fine-grained named type using CGR analysis.
Examines which atoms and bond-order changes are involved at the reaction center to return a more specific label.
- Possible return values:
'acylation' – C=O at center + new C-N or C-O bond
'alkylation' – new C-N, C-O, or C-S bond without C=O at center
'reduction' – net bond order decrease
'oxidation' – net bond order increase
'cross_coupling' – new C-C bond formed
'amide_formation' – new C-N bond + C=O at center
'ester_formation' – new C-O bond + C=O at center
'halogenation' – new C-halogen bond
'dehalogenation' – C-halogen bond broken
'ring_closure' – intramolecular bond formation creating a ring
'ring_opening' – ring bond broken
'other' – fallback
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A string label for the detailed reaction type.
- synplan.route_quality.protection.reaction_classifier.get_reaction_center_atoms(reaction: ReactionContainer, cgr: CGRContainer | None = None) -> set[int]
Extract atoms involved in bond or charge changes using CGR.
Uses reaction.compose() to obtain the CGR, then returns atoms where bond orders or charges change. cgr.center_atoms already includes both bond-change and charge-change atoms.
- Parameters:
reaction – A chython ReactionContainer representing a chemical reaction.
cgr – Pre-composed CGR. If None, computed from reaction.
- Returns:
A set of atom indices that participate in bond or charge changes.
synplan.route_quality.protection.scanner module
Route scanner for competing functional group interactions.
Walks a synthesis route step-by-step, detecting functional groups that may compete with the intended reaction at each step, and classifies their severity using an FG x FG incompatibility matrix.
The competing-sites identification approach is inspired by the methodology of:
Westerlund et al., “Toward lab-ready AI synthesis plans with protection strategies and route scoring”, ChemRxiv, 2025. https://doi.org/10.26434/chemrxiv-2025-gdrr8
- class synplan.route_quality.protection.scanner.CompetingInteraction(*, step_id: int, fg_name: str, fg_atoms: tuple[int, ...], reacting_fg: str | None, severity: str)
Bases: BaseModel
A competing functional group interaction at a synthesis step.
- Parameters:
step_id – Index of the reaction step in the route.
fg_name – Name of the competing functional group.
fg_atoms – Atom indices of the matched functional group.
reacting_fg – Name of the FG at the reaction center (or None).
severity – Interaction severity: “incompatible”, “competing”, or “compatible”.
- model_config = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class synplan.route_quality.protection.scanner.IncompatibilityMatrix(config_path: str)
Bases: object
Lookup table for functional group vs functional group incompatibility.
Loads a TSV matrix where the first row contains column FG names (with an empty first cell) and subsequent rows have a row FG name followed by integer severity levels (0=compatible, 1=competing, 2=incompatible).
- Parameters:
config_path – Path to the incompatibility matrix TSV file.
- lookup(competing_fg: str, reacting_fg: str) -> str
Look up the severity of a (competing_fg, reacting_fg) pair.
- Parameters:
competing_fg – Competing functional group name (row key).
reacting_fg – Reacting functional group name (column key).
- Returns:
Severity label: “incompatible”, “competing”, or “compatible”.
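The TSV layout described above (empty first header cell, column FG names, then one row per FG with integer levels) can be parsed with the standard csv module. The following is a sketch under stated assumptions: the FG names in EXAMPLE_TSV are invented for illustration, load_matrix and lookup are hypothetical free functions, and treating a missing pair as compatible is an assumption rather than documented behavior.

```python
import csv
import io

SEVERITY_LABELS = {0: "compatible", 1: "competing", 2: "incompatible"}

# Illustrative matrix only; the FG names are not taken from the shipped data file.
EXAMPLE_TSV = "\tamine\tacyl_chloride\nhydroxyl\t0\t2\namine\t1\t2\n"


def load_matrix(tsv_text):
    """Parse the matrix into a {(row_fg, column_fg): level} dictionary."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    columns = rows[0][1:]  # the first header cell is empty
    table = {}
    for row in rows[1:]:
        if not row:
            continue  # skip blank lines
        row_fg, levels = row[0], row[1:]
        for col_fg, level in zip(columns, levels):
            table[(row_fg, col_fg)] = int(level)
    return table


def lookup(table, competing_fg, reacting_fg):
    """Row key is the competing FG, column key is the reacting FG."""
    # Assumption: pairs absent from the matrix are treated as compatible.
    return SEVERITY_LABELS[table.get((competing_fg, reacting_fg), 0)]


table = load_matrix(EXAMPLE_TSV)
severity = lookup(table, "hydroxyl", "acyl_chloride")
```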
- class synplan.route_quality.protection.scanner.RouteScanner(fg_detector: FunctionalGroupDetector, incompatibility: IncompatibilityMatrix, halogen_detector: HalogenDetector | None = None)
Bases: object
Scan a synthesis route for competing functional group interactions.
For each step in the route, detects functional groups on the product molecule that do not overlap with the reaction center, identifies the FG at the reaction center (“reacting FG”), and classifies their interaction severity using the FG x FG incompatibility matrix. Also counts same-family competing halogens for the H term.
- Parameters:
fg_detector – A FunctionalGroupDetector instance.
incompatibility – An IncompatibilityMatrix instance.
halogen_detector – An optional HalogenDetector instance.
- static classify_interactions(interactions: list[CompetingInteraction], halogen_count: int = 0) -> tuple[int, int, int]
Count interactions by severity category.
- Parameters:
interactions – List of CompetingInteraction objects.
halogen_count – Number of same-family competing halogen sites.
- Returns:
Tuple of (I, C, H), where:
- I = number of incompatible interactions
- C = number of competing interactions
- H = number of same-family competing halogen sites
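The (I, C, H) tally amounts to counting severity labels and passing the halogen count through. A minimal sketch, assuming the interactions have been reduced to their severity strings (the real method receives CompetingInteraction objects):

```python
from collections import Counter


def classify_interactions(severities, halogen_count=0):
    """Return (I, C, H) from a list of severity labels."""
    counts = Counter(severities)
    # "compatible" entries are counted by Counter but not reported.
    return counts["incompatible"], counts["competing"], halogen_count


icH = classify_interactions(
    ["incompatible", "competing", "competing", "compatible"], halogen_count=2
)
```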
- scan_route(route: dict[int, ReactionContainer]) -> tuple[list[CompetingInteraction], int]
Walk a route step-by-step and collect competing interactions.
For each step the scanner:
1. Identifies the reacting FG: the FG on the reactant side that is consumed by the reaction (present in the reactant and overlapping the reaction center). This matches the paper’s approach of looking at the FG being transformed.
2. Identifies competing FGs on the product side that do not overlap the reaction center.
3. Looks up the severity of each competing FG against the reacting FG in the incompatibility matrix.
- Parameters:
route – A dict mapping step_id -> ReactionContainer, as returned by extract_reactions() in synplan.chem.reaction_routes.route_cgr.
- Returns:
Tuple of (interactions, halogen_count) where interactions is a list of CompetingInteraction objects and halogen_count is the total number of same-family competing halogen sites.
synplan.route_quality.protection.scorer module
Competing sites scorer and route re-ranking.
Computes the S(T) competing sites score for synthesis routes and provides combined-score re-ranking that balances the original route search score with the protection penalty.
The scoring formula is inspired by Eq. 6 of:
Westerlund et al., “Toward lab-ready AI synthesis plans with protection strategies and route scoring”, ChemRxiv, 2025. https://doi.org/10.26434/chemrxiv-2025-gdrr8
We use a worst-per-step variant of the formula: each step contributes only the penalty of its most severe interaction (1.0 for incompatible, 0.5 for competing, 0.0 for compatible). This avoids overwhelming the score when drug-like molecules contain many functional groups that each trigger a matrix lookup.
- class synplan.route_quality.protection.scorer.CompetingSitesScore(scanner: RouteScanner)
Bases: object
Score routes by their competing functional group burden.
Uses a RouteScanner to detect interactions and then computes a worst-per-step score:
For each step s, w_s = max severity penalty among the interactions at that step.
S(T) = max[1 - (sum(w_s) + H) / max(N, 1), 0]
where H is the halogen competing-site count and N is the number of steps. Each step contributes at most 1.0 (incompatible) or 0.5 (competing) to the penalty, preventing highly functionalized molecules from overwhelming the score.
- Parameters:
scanner – A RouteScanner instance configured with a FunctionalGroupDetector and IncompatibilityMatrix.
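As a worked example of the worst-per-step formula, the sketch below computes S(T) from severity labels grouped by step. It is an illustration only: competing_sites_score is a hypothetical function, and it assumes that N counts every step in the route, including steps with no detected interactions.

```python
PENALTY = {"incompatible": 1.0, "competing": 0.5, "compatible": 0.0}


def competing_sites_score(severities_by_step, halogen_count=0):
    """severities_by_step maps step_id -> list of severity labels."""
    n_steps = len(severities_by_step)
    # w_s: only the most severe interaction of each step contributes.
    worst_sum = sum(
        max((PENALTY[label] for label in labels), default=0.0)
        for labels in severities_by_step.values()
    )
    return max(1.0 - (worst_sum + halogen_count) / max(n_steps, 1), 0.0)


route = {
    0: ["competing", "competing"],     # w_0 = 0.5
    1: [],                             # w_1 = 0.0
    2: ["incompatible", "competing"],  # w_2 = 1.0
    3: ["compatible"],                 # w_3 = 0.0
}
score = competing_sites_score(route, halogen_count=1)  # 1 - (1.5 + 1) / 4 = 0.375
```

Step 0 has two competing groups but still contributes only 0.5, which is the point of the worst-per-step variant.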
- rank_routes(routes: dict[int, dict[int, ReactionContainer]], existing_scores: dict[int, float] | None = None, weight: float = 0.5) -> list[tuple[int, float, float, float]]
Rank routes by a combined score mixing original and protection scores.
combined = (1 - weight) * original_score_normalized + weight * S(T)
If no existing scores are provided, the original score component is treated as 0.0 for all routes and only the protection score is used.
- Parameters:
routes – Dict mapping route_id -> {step_id: ReactionContainer}.
existing_scores – Optional dict mapping route_id -> original route score (e.g. from Tree.route_score()).
weight – Weight of the protection score in [0, 1].
- Returns:
List of (route_id, combined_score, protection_score, original_score) tuples, sorted descending by combined_score.
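The combined-score ranking can be sketched on precomputed numbers. This is a hedged illustration rather than the method itself: the function below takes S(T) values directly instead of routes, assumes the original scores are already normalized to [0, 1] (how the library normalizes them is not described here), and assumes a route missing from existing_scores gets 0.0.

```python
def rank_routes(protection_scores, existing_scores=None, weight=0.5):
    """Return (route_id, combined, protection, original) tuples, best first."""
    ranked = []
    for route_id, s_t in protection_scores.items():
        # Assumption: scores are pre-normalized; missing entries count as 0.0.
        original = existing_scores.get(route_id, 0.0) if existing_scores else 0.0
        combined = (1 - weight) * original + weight * s_t
        ranked.append((route_id, combined, s_t, original))
    return sorted(ranked, key=lambda row: row[1], reverse=True)


protection = {1: 0.375, 2: 1.0, 3: 0.5}
original = {1: 1.0, 2: 0.25, 3: 0.5}
ranking = rank_routes(protection, original, weight=0.5)
```

With weight=0.5, route 1 stays on top (0.6875) despite its low S(T) because of its high original score; with no existing scores the order follows S(T) alone.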
- score_route(route: dict[int, ReactionContainer]) -> tuple[float, list[CompetingInteraction]]
Compute the S(T) score for a single route.
- Parameters:
route – A dict mapping step_id -> ReactionContainer.
- Returns:
Tuple of (score, interactions) where score is in [0, 1] and interactions is the list of CompetingInteraction objects.