Value network
The value network architecture, the planning parameters, and the value network tuning parameters are all specified in a single YAML configuration file.
Download example configuration
GitHub: configs/tuning.yaml
Quickstart (CLI)
Run value network tuning using the repository configuration in configs/tuning.yaml:
synplan value_network_tuning \
    --config configs/tuning.yaml \
    --targets targets.smi \
    --reaction_rules reaction_rules.tsv \
    --policy_network policy_network.ckpt \
    --building_blocks building_blocks.smi \
    --results_dir value_network
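The command above can also be assembled from a script, which makes it easy to check that the input files exist before a long tuning run starts. The following is a minimal sketch: `build_tuning_command` and `missing_inputs` are hypothetical helpers written for this page, not part of SynPlanner, and the sketch assumes the `synplan` executable is on the `PATH`. It only uses the flags shown in the command above.

```python
from pathlib import Path


def build_tuning_command(config, targets, reaction_rules,
                         policy_network, building_blocks, results_dir):
    """Return the CLI call from this page as an argument list (hypothetical helper)."""
    return [
        "synplan", "value_network_tuning",
        "--config", str(config),
        "--targets", str(targets),
        "--reaction_rules", str(reaction_rules),
        "--policy_network", str(policy_network),
        "--building_blocks", str(building_blocks),
        "--results_dir", str(results_dir),
    ]


def missing_inputs(*paths):
    """Return the given paths that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


# File names taken from the quickstart command.
inputs = ["configs/tuning.yaml", "targets.smi", "reaction_rules.tsv",
          "policy_network.ckpt", "building_blocks.smi"]
command = build_tuning_command(*inputs, results_dir="value_network")

missing = missing_inputs(*inputs)
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("Ready to run:", " ".join(command))
    # To launch the tuning: subprocess.run(command, check=True)
    # (requires `import subprocess`).
```

Passing the command as a list avoids shell quoting issues when paths contain spaces.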
Configuration file
tree:
  max_iterations: 100
  max_tree_size: 10000
  max_time: 120
  max_depth: 9
  search_strategy: expansion_first
  ucb_type: uct
  c_ucb: 0.1
  backprop_type: muzero
  exclude_small: True
  min_mol_size: 6
  init_node_value: 0.5
  epsilon: 0
  silent: True
node_evaluation:
  evaluation_type: rollout
  evaluation_agg: max
node_expansion:
  top_rules: 50
  rule_prob_threshold: 0.0
  priority_rules_fraction: 0.5
value_network:
  vector_dim: 512
  num_conv_layers: 5
  learning_rate: 0.0005
  dropout: 0.4
  num_epoch: 100
  batch_size: 1000
reinforcement:
  batch_size: 100
  num_simulations: 1
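Because several settings accept only a fixed set of strings, it can help to sanity-check an edited configuration before starting a run. The sketch below is a hypothetical check, not SynPlanner's own validation. It assumes the file has already been parsed into a nested dictionary (for example with a YAML parser) and uses the section names from the example above together with the option values listed in the parameter table on this page.

```python
# Hypothetical helper, not part of SynPlanner: sanity-check a parsed tuning config.
EXPECTED_SECTIONS = (
    "tree", "node_evaluation", "node_expansion", "value_network", "reinforcement",
)

# Allowed string options, as listed in the parameter table on this page.
ALLOWED_OPTIONS = {
    ("tree", "ucb_type"): {"puct", "uct", "value"},
    ("tree", "backprop_type"): {"muzero", "cumulative"},
    ("tree", "search_strategy"): {"expansion_first", "evaluation_first"},
    ("node_evaluation", "evaluation_type"): {"random", "rollout", "gcn"},
    ("node_evaluation", "evaluation_agg"): {"max", "average"},
}


def check_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for section in EXPECTED_SECTIONS:
        if section not in config:
            problems.append(f"missing section: {section}")
    for (section, key), allowed in ALLOWED_OPTIONS.items():
        value = config.get(section, {}).get(key)
        if value is not None and value not in allowed:
            problems.append(f"{section}:{key}={value!r} is not one of {sorted(allowed)}")
    return problems


# A trimmed-down version of the example configuration, written as a dict.
example = {
    "tree": {"search_strategy": "expansion_first", "ucb_type": "uct",
             "backprop_type": "muzero"},
    "node_evaluation": {"evaluation_type": "rollout", "evaluation_agg": "max"},
    "node_expansion": {"top_rules": 50, "rule_prob_threshold": 0.0},
    "value_network": {"vector_dim": 512, "num_conv_layers": 5},
    "reinforcement": {"batch_size": 100, "num_simulations": 1},
}

print(check_config(example))
```

The check only covers section names and enumerated options. Numeric ranges are left out because this page does not state them.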
Configuration parameters
| Parameter | Description |
|---|---|
| tree:max_iterations | The maximum number of iterations of the tree search algorithm |
| tree:max_tree_size | The maximum number of nodes that can be created in the search tree |
| tree:max_time | The maximum time (in seconds) for the tree search execution |
| tree:max_depth | The maximum depth of the tree, controlling how far the search can go from the root node |
| tree:ucb_type | The type of Upper Confidence Bound (UCB) statistics used in the tree search. Options include “puct” (predictive UCB), “uct” (standard UCB), and “value” |
| tree:backprop_type | The backpropagation method used during the tree search. Options are “muzero” (model-based approach) and “cumulative” (cumulative value approach) |
| tree:search_strategy | The strategy for navigating the tree. Options are “expansion_first” (prioritizing the expansion of new nodes) and “evaluation_first” (prioritizing the evaluation of new nodes) |
| tree:exclude_small | If True, excludes small molecules from the tree, typically focusing on more complex molecules |
| tree:min_mol_size | The minimum size of a molecule (the number of heavy atoms) to be considered in the search. Molecules smaller than this threshold are typically considered readily available building blocks |
| tree:init_node_value | The initial value for newly created nodes in the tree (for the expansion_first search strategy) |
| tree:epsilon | The probability of choosing a random action during node selection (epsilon-greedy strategy). A higher value leads to more exploration |
| tree:silent | If True, suppresses the progress logging of the tree search |
| node_evaluation:evaluation_agg | The way the evaluation scores are aggregated. Options are “max” (using the maximum score of the child nodes) and “average” (using the average score of the child nodes) |
| node_evaluation:evaluation_type | The method used for node evaluation. Options include “random” (random number between 0 and 1), “rollout” (using rollout simulations), and “gcn” (graph convolutional value network) |
| node_expansion:top_rules | The maximum number of rules to be selected for node expansion from the list of predicted reaction rules |
| node_expansion:rule_prob_threshold | Reaction rules with a predicted probability lower than this value are discarded |
| node_expansion:priority_rules_fraction | The fraction of priority rules in comparison to the regular rules |
| value_network:vector_dim | The dimension of the hidden layers |
| value_network:num_conv_layers | The number of convolutional layers |
| value_network:dropout | The dropout rate |
| value_network:learning_rate | The learning rate |
| value_network:num_epoch | The number of training epochs |
| value_network:batch_size | The size of the batch of input molecular graphs |
| reinforcement:batch_size | The size of the batch of target molecules used for planning simulation and value network update |
| reinforcement:num_simulations | The number of planning simulations per reinforcement learning iteration |
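To make a few of the parameters above more concrete, the sketch below shows how node_expansion:top_rules and node_expansion:rule_prob_threshold could filter predicted rules, how the two node_evaluation:evaluation_agg options differ, and how a textbook UCT score uses an exploration constant such as tree:c_ucb. All function names are hypothetical, and the UCT formula is the standard one from the literature. SynPlanner's actual implementation may differ in detail.

```python
import math


def select_rules(rule_probs, top_rules=50, rule_prob_threshold=0.0):
    """Drop rules below the probability threshold, then keep the `top_rules` most probable."""
    ranked = sorted(rule_probs.items(), key=lambda item: item[1], reverse=True)
    kept = [(rule, prob) for rule, prob in ranked if prob >= rule_prob_threshold]
    return kept[:top_rules]


def aggregate(scores, evaluation_agg="max"):
    """Combine child-node scores with the "max" or "average" option."""
    if evaluation_agg == "max":
        return max(scores)
    if evaluation_agg == "average":
        return sum(scores) / len(scores)
    raise ValueError(f"unknown evaluation_agg: {evaluation_agg!r}")


def uct_score(mean_value, parent_visits, child_visits, c_ucb=0.1):
    """Textbook UCT: mean value plus an exploration bonus scaled by c_ucb.

    Assumes parent_visits >= 1 and child_visits >= 1.
    """
    return mean_value + c_ucb * math.sqrt(math.log(parent_visits) / child_visits)


# Illustrative rule probabilities (made-up values, not model output).
rule_probs = {"rule_a": 0.6, "rule_b": 0.25, "rule_c": 0.1, "rule_d": 0.05}

print(select_rules(rule_probs, top_rules=2, rule_prob_threshold=0.1))
print(aggregate([0.2, 0.8, 0.5], "max"), aggregate([0.2, 0.8, 0.5], "average"))
print(uct_score(0.5, parent_visits=10, child_visits=1),
      uct_score(0.5, parent_visits=10, child_visits=5))
```

In this sketch a rarely visited child receives a larger bonus than a frequently visited one with the same mean value, so a larger c_ucb shifts the search toward exploration.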