1.1.1.1. molpher.core package

This package contains the most essential features of the library. It provides everything that the user of the library will need. All the modules and classes below and their contents are meant to be imported by external scripts.

1.1.1.1.1. Modules

1.1.1.1.1.1. molpher.core.ExplorationData module

This module houses the ExplorationData class:

class molpher.core.ExplorationData.ExplorationData(other=None, **kwargs)[source]

Bases: molpher.swig_wrappers.core.ExplorationData

Parameters

Note

If both other and **kwargs are specified, then everything in **kwargs will be applied after the instance in other is wrapped.

This is a specialized version of the molpher.swig_wrappers.core.ExplorationData proxy class. It implements some additional functionality for ease of use from Python.

It contains all the information needed to initialize an ExplorationTree instance. Additionally, any tree can be transformed into an instance of this class by calling the asData() method.

One advantage of this class over the ExplorationTree is that it allows direct modifications of the exploration tree structure. This is especially useful when we want to create an initial tree topology before the exploration itself.

Warning

Note that the current implementation of the modification methods is experimental and may result in undefined behaviour. Therefore, it is only recommended to use this class as a means of setting morphing parameters and spawning tree instances, or of spawning new trees from existing ones without the need to create a snapshot file.

Because it inherits from molpher.swig_wrappers.core.ExplorationData, it provides the same interface as the corresponding C++ class, but exposes the morphing parameters as object attributes for ease of use. These attributes follow a slightly different naming convention than the corresponding getters and setters of the parent class. Their names are derived from the names of the parameters used in the XML template files, which are more self-explanatory and easier to remember and type. The table below gives an overview of all available parameters, their default values, short descriptions, and the respective getters and setters of the base class:

Table 1.1 Morphing parameters recognized by the current version.

| Attribute | Default Value | Brief Description | Setter | Getter |
|---|---|---|---|---|
| source | None | SMILES of the source molecule. | setSource | getSource |
| target | None | SMILES of the target molecule. | setTarget | getTarget |
| operators | a tuple of selectors [1] | A tuple of identifiers of the permitted chemical operators. | setChemicalOperators | getChemicalOperators |
| accept_max | 100 | Maximum number of candidates accepted at once (based on their position in ExplorationTree.candidates). | setCntCandidatesToKeepMax | getCntCandidatesToKeepMax |
| accept_min | 50 | Minimum number of candidates accepted during probability filtering. | setCntCandidatesToKeep | getCntCandidatesToKeep |
| close_produce | 150 | Maximum number of morphs to produce with a generateMorphs() call when close to the target molecule. | setCntMorphsInDepth | getCntMorphsInDepth |
| far_produce | 80 | Maximum number of morphs to produce with a generateMorphs() call when far from the target molecule. | setCntMorphs | getCntMorphs |
| far_close_threshold | 0.15 | Molecular distance below which the target molecule and a morph are considered to be close. | setDistToTargetDepthSwitch | getDistToTargetDepthSwitch |
| fingerprint | FP_MORGAN | Identification string of the current fingerprint strategy. | setFingerprint | getFingerprint |
| similarity | SC_TANIMOTO | Identification string of the current similarity measure. | setSimilarityCoefficient | getSimilarityCoefficient |
| max_morphs_total | 1500 | Maximum number of morphs allowed to be derived from one molecule and the allowed number of non-producing descendants before a molecule is removed from the tree. | setCntMaxMorphs | getCntMaxMorphs |
| non_producing_survive | 5 | Number of iterations before descendants of a non-producing molecule are removed from the tree. | setItThreshold | getItThreshold |
| weight_max | 100000.0 | Maximum molecular weight of one morph. | setMaxAcceptableMolecularWeight | getMaxAcceptableMolecularWeight |
| weight_min | 0.0 | Minimum molecular weight of one morph. | setMinAcceptableMolecularWeight | getMinAcceptableMolecularWeight |

[1] (OP_ADD_ATOM, OP_ADD_BOND, OP_BOND_CONTRACTION, OP_BOND_REROUTE, OP_INTERLAY_ATOM, OP_MUTATE_ATOM, OP_REMOVE_ATOM, OP_REMOVE_BOND)

exception UnknownParameterException[source]

Bases: Exception

Indicates that an unknown parameter was supplied.
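
The relationship between the attribute names and the base-class setters can be illustrated with a small stand-in. This is only a sketch: `RecordingBackend`, `apply_parameters` and `UnknownParameterError` are hypothetical names invented for the illustration, and the assumption that unknown keyword arguments are rejected with an exception is inferred from the exception's description, not confirmed by the library source. Only the attribute and setter names are taken from Table 1.1.

```python
# Attribute names and setter names copied from Table 1.1 (subset).
ATTRIBUTE_SETTERS = {
    "source": "setSource",
    "target": "setTarget",
    "accept_max": "setCntCandidatesToKeepMax",
    "accept_min": "setCntCandidatesToKeep",
    "far_produce": "setCntMorphs",
    "close_produce": "setCntMorphsInDepth",
    "far_close_threshold": "setDistToTargetDepthSwitch",
}


class UnknownParameterError(Exception):
    """Hypothetical stand-in for UnknownParameterException."""


class RecordingBackend:
    """Toy object that records setter calls instead of wrapping C++ code."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist on the instance.
        if name.startswith("set"):
            return lambda value: self.calls.append((name, value))
        raise AttributeError(name)


def apply_parameters(backend, **kwargs):
    """Translate attribute-style keyword arguments into setter calls."""
    for attribute, value in kwargs.items():
        if attribute not in ATTRIBUTE_SETTERS:
            raise UnknownParameterError(attribute)
        getattr(backend, ATTRIBUTE_SETTERS[attribute])(value)


backend = RecordingBackend()
apply_parameters(backend, accept_max=50, far_close_threshold=0.2)
print(backend.calls)
```

The point of the sketch is the naming layer: callers work with short attribute names while the wrapped object only ever sees the longer setter names.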

property accept_max

The maximum number of morphs allowed to be connected to the tree upon one call to extend().

If more than accept_max morphs with True in the appropriate position of candidates_mask are present in candidates and extend() is called, only the first accept_max morphs from candidates will be connected to the tree and the rest will be discarded.

Returns

maximum number of candidates accepted upon extend()

Return type

int
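
The selection rule described for accept_max can be sketched in plain Python. The function `select_for_extend` is a hypothetical helper written for this illustration; it models the described behaviour on lists of SMILES strings and is not the library's implementation of extend().

```python
def select_for_extend(candidates, candidates_mask, accept_max):
    """Return the candidates that would be connected under the described rule.

    Candidates with False in the mask are skipped; of the remaining ones,
    only the first accept_max (in their original order) are kept.
    """
    selected = [mol for mol, keep in zip(candidates, candidates_mask) if keep]
    return selected[:accept_max]


candidates = ["CCO", "CCN", "CCC", "CCCl", "CCF"]
mask = (True, False, True, True, True)
accepted = select_for_extend(candidates, mask, accept_max=3)
print(accepted)
```

Because the cut-off depends on position, the order of candidates matters whenever more than accept_max of them pass the mask.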

property accept_min

If FilterMorphsOper.PROBABILITY is used during filtering, this is the number of morphs accepted with 100% probability.

Returns

minimum number of candidates accepted during probability filtering

Return type

int

property close_produce

This is the maximum number of morphs generated from one leaf when the leaf of the tree currently being processed with molpher.core.ExplorationTree.ExplorationTree.generateMorphs() lies less than far_close_threshold from the target molecule.

See also

generateMorphs()

Returns

maximum number of morphs to produce with a generateMorphs() call

Return type

int

property far_close_threshold

This distance threshold controls the number of morphs generated with molpher.core.ExplorationTree.ExplorationTree.generateMorphs() for molecules closer to or further from the target molecule. Morphs whose distance from the target molecule is lower than far_close_threshold are considered to be close.

See also

far_produce and close_produce

Returns

distance threshold for far_produce and close_produce

Return type

float
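
How far_close_threshold switches between far_produce and close_produce can be sketched as follows. `morphs_to_produce` is a hypothetical function; the default values are those listed in Table 1.1. Treating a distance exactly equal to the threshold as "far" is an assumption, since the text only says that distances lower than the threshold count as close.

```python
def morphs_to_produce(dist_to_target,
                      far_close_threshold=0.15,
                      far_produce=80,
                      close_produce=150):
    """Pick the morph limit for one leaf based on its distance to the target."""
    if dist_to_target < far_close_threshold:
        # The leaf is close to the target: use the close_produce limit.
        return close_produce
    # Otherwise the leaf is considered far (boundary case is an assumption).
    return far_produce


for distance in (0.05, 0.15, 0.60):
    print(distance, morphs_to_produce(distance))
```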

property far_produce

The maximum number of morphs generated from one leaf when the leaf of the tree currently being processed with molpher.core.ExplorationTree.ExplorationTree.generateMorphs() lies more than far_close_threshold from the target molecule.

See also

generateMorphs()

Returns

maximum number of morphs to produce with a generateMorphs() call

Return type

int

property fingerprint

Returns an identifier of the currently used molecular fingerprint.

Table 1.2 Currently supported molecular fingerprints.

Identifier

Description

FP_ATOM_PAIRS

FP_MORGAN

FP_TOPOLOGICAL

FP_TOPOLOGICAL_LAYERED_1

FP_TOPOLOGICAL_LAYERED_2

FP_VECTORFP

FP_TOPOLOGICAL_TORSION

FP_EXT_ATOM_PAIRS

FP_EXT_MORGAN

FP_EXT_TOPOLOGICAL

FP_EXT_TOPOLOGICAL_LAYERED_1

FP_EXT_TOPOLOGICAL_LAYERED_2

FP_EXT_TOPOLOGICAL_TORSION

Returns

molecular fingerprint identifier

Return type

str

property is_valid

Indicates whether this instance represents valid parameters. The instance becomes invalid if there are any bad or nonsensical parameter values, if values are missing (such as undefined chemical operators), or if the tree structure is for any reason unacceptable.

Returns

True for a valid instance, False for invalid

Return type

bool

static load(snapshot)[source]

A factory method to create an instance of ExplorationData from a tree snapshot.

Parameters

snapshot (str) – path to the snapshot file

Returns

new instance representing the data loaded from the snapshot file

Return type

ExplorationData

property max_morphs_total

This value is the maximum number of morphs allowed to be generated from one molecule. If the number of generated morphs exceeds this number, all additional morphs can be filtered out using the FilterMorphsOper.MAX_DERIVATIONS filter.

It is also the maximum number of ‘bad morphs’ generated from one molecule. If a molecule has more than max_morphs_total descendants and none of them are closer to the target molecule than the molecule in question, then the molecule is permanently removed from the tree with all of its descendants when prune() is called.

Returns

maximum number of ‘bad morphs’ before pruning

Return type

int

property non_producing_survive

A molecule that has not produced any morphs closer to the target molecule than itself (a non-producing molecule) for non_producing_survive number of calls to extend() will have its descendants removed during the next prune() call.

Returns

number of calls to molpher.core.ExplorationTree.ExplorationTree.generateMorphs() before descendants of a non-producing molecule are removed from the tree

Return type

int

property operators

A set of chemical operators to use. These define how the input molecule and its descendants can be manipulated during morphing.

Can be set using an iterable of the appropriate selectors or their names as str. Any duplicates are automatically removed.

Table 1.3 Currently supported chemical operators.

| Identifier | Description |
|---|---|
| OP_ADD_ATOM | Add a random atom to the molecule. |
| OP_REMOVE_ATOM | Remove an atom from the molecule. |
| OP_ADD_BOND | Add a bond between two random atoms. |
| OP_REMOVE_BOND | Remove a bond between two random atoms. |
| OP_MUTATE_ATOM | Change a randomly selected atom to a different element. |
| OP_INTERLAY_ATOM | Insert an atom between two other atoms. |
| OP_BOND_REROUTE | Move one ending of a bond to another atom. |
| OP_BOND_CONTRACTION | Remove an atom between two atoms and connect them with a new bond. |

Returns

names of the current chemical operators

Return type

tuple of str
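
The accepted input forms (selector values or names, with duplicates removed) can be modelled with a short sketch. `normalize_operators` is a hypothetical helper, not part of the library; the integer values are those listed for molpher.core.selectors. Keeping the first-seen order in the result is an assumption, because the documentation does not state the order of the returned tuple.

```python
# Selector values as listed in the molpher.core.selectors module.
OPERATOR_SELECTORS = {
    "OP_ADD_ATOM": 0,
    "OP_REMOVE_ATOM": 1,
    "OP_ADD_BOND": 2,
    "OP_REMOVE_BOND": 3,
    "OP_MUTATE_ATOM": 4,
    "OP_INTERLAY_ATOM": 5,
    "OP_BOND_REROUTE": 6,
    "OP_BOND_CONTRACTION": 7,
}
NAME_BY_VALUE = {value: name for name, value in OPERATOR_SELECTORS.items()}


def normalize_operators(operators):
    """Convert a mix of selector values and names into a tuple of unique names."""
    names = []
    for item in operators:
        name = item if isinstance(item, str) else NAME_BY_VALUE[item]
        if name not in OPERATOR_SELECTORS:
            raise ValueError(f"unknown operator: {name}")
        if name not in names:  # drop duplicates, keep first-seen order
            names.append(name)
    return tuple(names)


normalized = normalize_operators(["OP_ADD_ATOM", 0, 3, "OP_REMOVE_BOND", 6])
print(normalized)
```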

property param_dict

Holds a dictionary of the current morphing parameter values for this instance. A new dictionary of parameters can be assigned to change them.

Returns

a dictionary of parameters

Return type

dict

property similarity

Returns an identifier of the currently used similarity measure.

Table 1.4 Currently supported similarity measures.

Identifier

Description

SC_ALL_BIT

SC_ASYMMETRIC

SC_BRAUN_BLANQUET

SC_COSINE

SC_DICE

SC_KULCZYNSKI

SC_MC_CONNAUGHEY

SC_ON_BIT

SC_RUSSEL

SC_SOKAL

SC_TANIMOTO

SC_TVERSKY_SUBSTRUCTURE

SC_TVERSKY_SUPERSTRUCTURE

Returns

similarity measure identifier

Return type

str

property source

The source molecule. All morphs in an exploration tree are derived from this molecule during morphing. This is the root of the created tree.

Can be set using a MolpherMol instance or a SMILES string of the new source molecule.

Returns

current source molecule

Return type

MolpherMol

property target

The target molecule. This is the molecule being searched for during morphing. In the original version of the algorithm the goal is to maximize similarity (minimize structural distance) of the generated morphs and this molecule.

Can be set using a MolpherMol instance or a SMILES string of the new target molecule.

Returns

current target molecule

Return type

MolpherMol

property weight_max

If the FilterMorphsOper.WEIGHT filter is used on an exploration tree, this will be the maximum weight of the candidate morphs accepted during a filtering procedure.

Returns

maximum acceptable weight during filtering

Return type

float

property weight_min

If the FilterMorphsOper.WEIGHT filter is used on an exploration tree, this will be the minimum weight of the candidate morphs accepted during a filtering procedure.

Returns

minimum acceptable weight during filtering

Return type

float

1.1.1.1.1.2. molpher.core.ExplorationTree module

class molpher.core.ExplorationTree.ExplorationTree[source]

Bases: molpher.swig_wrappers.core.ExplorationTree

This is a specialized version of the molpher.swig_wrappers.core.ExplorationTree proxy class. It implements some additional functionality for ease of use from Python.

Attention

This class has no constructor defined. Use the create() factory method to obtain instances of this class.

asData()[source]
Returns

the tree as an ExplorationData instance

Return type

ExplorationData

property candidates
Returns

the candidate morphs (morphs generated by a single call to generateMorphs())

Return type

tuple of MolpherMol instances

property candidates_mask

A tuple of bool objects that serve as means of filtering the candidate morphs. Each morph in candidates has a bool variable assigned to it in this tuple – only morphs with True at the appropriate position are added to the tree when extend() is called.

It can be changed by assigning a new tuple or a call to setCandidateMorphsMask().

Returns

currently selected candidate morphs represented as a tuple of bool objects

Return type

tuple
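
A mask of the shape described above can be derived from any per-morph criterion. The sketch below uses a hypothetical `Morph` namedtuple as a stand-in for MolpherMol (only the smiles and dist_to_target attribute names come from this documentation) and a hypothetical `build_mask` helper; the distance cut-off is an arbitrary example criterion.

```python
from collections import namedtuple

# Minimal stand-in for MolpherMol with two of its documented attributes.
Morph = namedtuple("Morph", ["smiles", "dist_to_target"])


def build_mask(candidates, max_distance):
    """Return one bool per candidate: True if the morph should be kept."""
    return tuple(mol.dist_to_target <= max_distance for mol in candidates)


candidates = (Morph("CCO", 0.42), Morph("CCN", 0.71), Morph("CCC", 0.35))
mask = build_mask(candidates, max_distance=0.5)
print(mask)
```

The resulting tuple has the same length and order as the candidates, which is what allows each position to be matched to its morph.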

static create(tree_data=None, source=None, target=None)[source]

Create an exploration tree.

Parameters

Note

When tree_data is specified, source and target are always ignored.

fetchMol(canonSMILES)[source]

Returns a molecule from the tree using a canonical SMILES string.

Raises a RuntimeError if the molecule is not found.

Parameters

canonSMILES (str) – SMILES string of the molecule to fetch

Returns

molecule from a tree

Return type

MolpherMol

fetchPathTo(smiles)[source]
generateMorphs(collectors=None)[source]
property generation_count
Returns

Number of morph generations connected to the tree so far.

Return type

int

property leaves
Returns

the current leaves of the tree

Return type

tuple of MolpherMol instances

property mol_count
property morphing_operators
property params

A dictionary representing the current exploration parameters.

It is possible to assign a new dictionary (or an instance of the molpher.swig_wrappers.core.ExplorationData class) to update the current parameters.

Note

Only parameters defined in the supplied dictionary are changed and if an instance of molpher.swig_wrappers.core.ExplorationData is supplied only the parameters are read from it (the tree structure remains the same).

Returns

current parameters

Return type

dict
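
The partial-update behaviour described in the note (only the supplied keys change) corresponds to an ordinary dictionary merge. This is a sketch on plain dictionaries; `updated_params` is a hypothetical helper and does not touch any real tree.

```python
def updated_params(current, changes):
    """Return a new parameter dictionary with only the supplied keys replaced."""
    merged = dict(current)   # copy, so the original mapping is left untouched
    merged.update(changes)   # keys missing from changes keep their old values
    return merged


current = {"accept_max": 100, "accept_min": 50, "far_produce": 80}
new_params = updated_params(current, {"accept_max": 30})
print(new_params)
```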

property path_found
Returns

True if the target molecule is present in the tree, False otherwise.

Return type

bool

property source
property target
property thread_count
Returns

maximum number of threads this instance will use

Return type

int

traverse(callback, start_mol=None)[source]

This method traverses the whole tree structure (or just a subtree) from the root towards the leaves. It takes a callback function that accepts a single required argument and calls it with each encountered morph as its parameter, starting from the root of the tree (or from the root of a specified subtree; see start_mol).

Parameters
  • callback (a callable object that takes a single argument) – the callback to call

  • start_mol (str or MolpherMol) – the root of a subtree to explore as canonical SMILES or MolpherMol instance
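
The callback pattern can be sketched on a toy tree stored as a dictionary of child lists. Everything here is a stand-in: the free function `traverse`, the `children` mapping and the use of SMILES strings as nodes are assumptions made for the illustration, and the depth-first, pre-order visiting order is also an assumption, since the documentation only states that traversal runs from the root to the leaves.

```python
def traverse(children, root, callback, start_mol=None):
    """Call callback once for every node reachable from the starting node."""
    stack = [start_mol if start_mol is not None else root]
    while stack:
        smiles = stack.pop()
        callback(smiles)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(children.get(smiles, ())))


# Toy tree: parent SMILES mapped to the SMILES of its direct descendants.
children = {
    "C": ["CC", "CO"],
    "CC": ["CCC"],
    "CO": [],
}

visited = []
traverse(children, "C", visited.append)

subtree = []
traverse(children, "C", subtree.append, start_mol="CC")
print(visited, subtree)
```

Passing a bound method such as `visited.append` works because the callback only needs to accept a single argument.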

1.1.1.1.1.3. molpher.core.MolpherMol module

class molpher.core.MolpherMol.MolpherMol(str_repr=None, other=None)[source]

Bases: molpher.swig_wrappers.core.MolpherMol

Parameters
  • str_repr (str) – SMILES of the molecule that is to be created or a path to an SDF file (only the first molecule is read)

  • other (molpher.swig_wrappers.core.MolpherMol or its derived class) – another instance, the new instance will be its copy (tree ownership is not transferred onto the copy)

This is a specialized version of the molpher.swig_wrappers.core.MolpherMol proxy class. It implements some additional functionality for ease of use from Python.

asRDMol(include_locks=True)[source]
property atoms

Atoms of this molecule represented as MolpherAtom instances.

Returns

tuple

copy()[source]

Returns a copy of this instance. If this instance has a tree assigned, the returned copy will have None instead.

Returns

a copy of this instance

Return type

MolpherMol

property descendents

Canonical SMILES strings of all molecules derived from this compound that are currently present in the tree.

Returns

Return type

str

property dist_to_target

The value of the objective function. In the original implementation, this is the structural distance to the target molecule using a similarity measure.

This value can be changed.

Returns

value of the objective function

Return type

float
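
As an illustration of a structural distance derived from a similarity measure, the sketch below computes the Tanimoto coefficient on sets of fingerprint bit indices and converts it to a distance as 1 minus the similarity. The conversion and the treatment of two empty fingerprints as identical are assumptions made for the example; the documentation does not state how the library derives the distance, and the function names are hypothetical.

```python
def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto coefficient: size of the intersection over size of the union."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(a & b) / len(union)


def distance(bits_a, bits_b):
    """Distance taken as 1 - similarity (an assumed convention)."""
    return 1.0 - tanimoto_similarity(bits_a, bits_b)


# Two fingerprints sharing 2 of 6 distinct bits.
d = distance({1, 2, 3, 4}, {3, 4, 5, 6})
print(round(d, 3))
```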

static fromMolBlock(block)[source]
property gens_without_improvement

Number of morph generations derived from this molecule that did not contain any morph with an improved value of the objective function (in the original implementation, a smaller distance to the target molecule).

This value can be changed.

Returns

number of non-producing generations

Return type

int

getAtom(idx)[source]
property historic_descendents

Canonical SMILES strings of all molecules derived from this compound.

Returns

Return type

str

property parent_operator

The name of the chemical operator selector that led to the creation of this molecule.

Returns

name of the parent chemical operator

Return type

str

property parent_smiles

Canonical SMILES string of the parent molecule in the tree.

Can be an empty str if the molecule is the root of the tree or is not associated with any tree.

Returns

canonical SMILES string of the parent molecule in the tree

Return type

str

property sascore

The synthetic feasibility score of the molecule according to Ertl.

This value can be changed.

Returns

synthetic feasibility score

Return type

float

property smiles
Returns

canonical SMILES string of this molecule

Return type

str

property tree

A reference to the tree this instance is currently in. If the molecule is not present in any tree, this value is None.

Returns

reference to the tree this instance is currently in

Return type

ExplorationTree or None

1.1.1.1.1.4. molpher.core.MolpherAtom module

class molpher.core.MolpherAtom.MolpherAtom(symbol, formal_charge=0)[source]

Bases: molpher.swig_wrappers.core.MolpherAtom

Parameters
  • symbol (str) – The symbol in the periodic table of the atom to create.

  • formal_charge (int) – its formal charge (zero by default)

This is a specialized version of the molpher.swig_wrappers.core.MolpherAtom proxy class. It implements some additional functionality for ease of use from Python.

property atomic_number
property formal_charge
property is_locked
property lock_info
property locking_mask
property mass
property symbol

1.1.1.1.1.5. molpher.core.selectors module

Contains all global selectors that are usually used when creating an exploration tree or setting any of its parameters during runtime.

There are three types of selectors:

  1. fingerprint selectors

    Their names are prefixed with ‘FP_’ and they are used either to set the fingerprint member of the ExplorationData class or as the value of the fingerprint key when calling create() with the params parameter or writing into the params member of the ExplorationTree.

    FP_MORGAN is the default option.

  2. similarity coefficient selectors

    Their names are prefixed with ‘SC_’ and they are used either to set the similarity member of the ExplorationData class or as the value of the similarity key when calling create() with the params parameter or writing into the params member of the ExplorationTree.

    SC_TANIMOTO is the default option.

  3. chemical operators

    Their names are prefixed with ‘OP_’ and an iterable of them is used either to set the operators member of the ExplorationData class or as the items of the iterable assigned to the operators key when calling create() with the params parameter or writing into the params member of the ExplorationTree.

    All of the available selectors are used by default.
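
The three selector types can be told apart by their name prefix, which the following sketch demonstrates on a handful of the constants listed in this module. `SELECTORS`, `KINDS` and `selector_kind` are hypothetical names used only for the illustration; the integer values are copied from the listing of this module.

```python
# A few selector constants with the values listed for molpher.core.selectors.
SELECTORS = {
    "FP_MORGAN": 1,
    "FP_ATOM_PAIRS": 0,
    "SC_TANIMOTO": 10,
    "SC_DICE": 4,
    "OP_ADD_ATOM": 0,
    "OP_REMOVE_BOND": 3,
}

KINDS = {"FP_": "fingerprint", "SC_": "similarity", "OP_": "operator"}


def selector_kind(name):
    """Classify a selector by its name prefix."""
    for prefix, kind in KINDS.items():
        if name.startswith(prefix):
            return kind
    raise ValueError(f"not a selector name: {name}")


grouped = {}
for name in SELECTORS:
    grouped.setdefault(selector_kind(name), []).append(name)
print(grouped)
```

Note that the integer values overlap between the types (FP_ATOM_PAIRS and OP_ADD_ATOM are both 0 in the listing), so a bare integer does not identify a selector on its own; the name or the context in which it is used does.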

molpher.core.selectors.FP_ATOM_PAIRS = 0
molpher.core.selectors.FP_EXT_ATOM_PAIRS = 7
molpher.core.selectors.FP_EXT_MORGAN = 8
molpher.core.selectors.FP_EXT_TOPOLOGICAL = 9
molpher.core.selectors.FP_EXT_TOPOLOGICAL_LAYERED_1 = 10
molpher.core.selectors.FP_EXT_TOPOLOGICAL_LAYERED_2 = 11
molpher.core.selectors.FP_EXT_TOPOLOGICAL_TORSION = 12
molpher.core.selectors.FP_MORGAN = 1

This is the default selector that is used when no other option is specified.

molpher.core.selectors.FP_TOPOLOGICAL = 2
molpher.core.selectors.FP_TOPOLOGICAL_LAYERED_1 = 3
molpher.core.selectors.FP_TOPOLOGICAL_LAYERED_2 = 4
molpher.core.selectors.FP_TOPOLOGICAL_TORSION = 6
molpher.core.selectors.FP_VECTORFP = 5
molpher.core.selectors.OP_ADD_ATOM = 0
molpher.core.selectors.OP_ADD_BOND = 2
molpher.core.selectors.OP_BOND_CONTRACTION = 7
molpher.core.selectors.OP_BOND_REROUTE = 6
molpher.core.selectors.OP_INTERLAY_ATOM = 5
molpher.core.selectors.OP_MUTATE_ATOM = 4
molpher.core.selectors.OP_REMOVE_ATOM = 1
molpher.core.selectors.OP_REMOVE_BOND = 3
molpher.core.selectors.SC_ALL_BIT = 0
molpher.core.selectors.SC_ASYMMETRIC = 1
molpher.core.selectors.SC_BRAUN_BLANQUET = 2
molpher.core.selectors.SC_COSINE = 3
molpher.core.selectors.SC_DICE = 4
molpher.core.selectors.SC_KULCZYNSKI = 5
molpher.core.selectors.SC_MC_CONNAUGHEY = 6
molpher.core.selectors.SC_ON_BIT = 7
molpher.core.selectors.SC_RUSSEL = 8
molpher.core.selectors.SC_SOKAL = 9
molpher.core.selectors.SC_TANIMOTO = 10

This is the default selector that is used when no other option is specified.

molpher.core.selectors.SC_TVERSKY_SUBSTRUCTURE = 11
molpher.core.selectors.SC_TVERSKY_SUPERSTRUCTURE = 12

1.1.1.1.2. Subpackages

1.1.1.1.3. Module contents

1.1.1.1.3.1. MolpherMol

A direct reference to molpher.core.MolpherMol.MolpherMol.

See also

MolpherMol

1.1.1.1.3.2. ExplorationTree

A direct reference to molpher.core.ExplorationTree.ExplorationTree.

See also

ExplorationTree

1.1.1.1.3.3. ExplorationData

A direct reference to molpher.core.ExplorationData.ExplorationData.

See also

ExplorationData