Reactingmoieties

addBondMappingsRXNFile(rxnfileName, rxnfileDirectory)

Add bond mappings from an MDL rxn file.

USAGE:

[bonds, bondMappings, nTotalBondTransitions] = addBondMappingsRXNFile(rxnfileName, rxnfileDirectory)

INPUT:
rxnfileName: The file name.

OPTIONAL INPUT:

rxnfileDirectory: Path to directory containing the rxnfile. Defaults to current directory.

OUTPUTS:
bondMappings: Table of bond mapping information, with s rows, one for each bond transition.

.mets - An s x 1 cell array of metabolite identifiers for bonds.

.headAtoms - An s x 1 vector containing the numbering of the first atom forming the bond within each metabolite.

.tailAtoms - An s x 1 vector containing the numbering of the second atom forming the bond within each metabolite.

.bTypes - An s x 1 vector of the bond type within each metabolite (1 for a single bond, 2 for a double bond, and 3 for a triple bond).

.headAtomTransitionNrs - An s x 1 vector of atom transition indices of the first atom forming the bond within each metabolite.

.tailAtomTransitionNrs - An s x 1 vector of atom transition indices of the second atom forming the bond within each metabolite.

.isSubstrate - An s x 1 logical array. True for substrates, false for products in the reaction for bonds.

.instances - An s x 1 vector indicating which instance of a repeated metabolite each bond belongs to.

.bondIndex - An s x 1 vector indicating a unique numeric id for each bond.

.bondTypeInstance - An s x 1 vector indicating which instance of a repeated bond with bTypes ~= 1 each row represents (e.g. a bond with bTypes = 2 has two instances, with bondTypeInstance 1 and 2).

.isReacting - An s x 1 vector indicating if a bond is broken (-1), formed (1) or conserved (0).

.bondTransitionNrs - An s x 1 vector indicating bond transition indices.
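The isReacting coding above lends itself to simple row filtering. The following is a minimal sketch on a hand-written table; the values are made up for illustration and only a subset of the documented fields is included, so it does not reproduce the actual output of addBondMappingsRXNFile.

```matlab
% Hypothetical bond mapping table with three bond transitions (s = 3).
% All values are invented for illustration only.
mets       = {'A'; 'A'; 'B'};
headAtoms  = [1; 2; 1];
tailAtoms  = [2; 3; 2];
bTypes     = [1; 2; 1];
isReacting = [0; -1; 1];   % conserved, broken, formed
bondMappings = table(mets, headAtoms, tailAtoms, bTypes, isReacting);

% Split the rows by the documented isReacting coding.
brokenBonds    = bondMappings(bondMappings.isReacting == -1, :);
formedBonds    = bondMappings(bondMappings.isReacting ==  1, :);
conservedBonds = bondMappings(bondMappings.isReacting ==  0, :);
```

Logical indexing keeps every column of the matching rows, so the filtered tables can be inspected or joined further without losing the bond type or atom numbering.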

buildAtomAndBondTransitionMultigraph(model, RXNFileDir, options)

Builds a MATLAB digraph object representing an atom transition multigraph and a bond transition multigraph corresponding to a metabolic network from reaction stoichiometry and atom mappings.

Atoms:

The multigraph nature is due to possible duplicate atom transitions, where the same pair of atoms are involved in the same atom transition in different reactions.

The directed nature is due to possible duplicate atom transitions, where the same pair of atoms are involved in atom transitions of opposite orientation, corresponding to reactions in different directions.

Note that A = incidence(dATM) returns an a x t atom transition directed multigraph incidence matrix where a is the number of atoms and t is the number of directed atom transitions. Each atom transition inherits the orientation of its corresponding reaction.

A stoichiometric matrix may be decomposed into a set of atom transitions with the following atomic decomposition:

$$N = \left(VV^{T}\right)^{-1}VAE$$

VV^{T} is a diagonal matrix, where each diagonal entry is the number of atoms in each metabolite, so V*V^{T}*N = V*A*E

With respect to the input, N is the subset of model.S corresponding to atom mapped reactions

With respect to the output, V := M2Ai, E := Ti2R and A := incidence(dATM),

so we have the atomic decomposition M2Ai*M2Ai'*N = M2Ai*A*Ti2R
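The identity can be checked by hand on a toy network. The sketch below assumes a single reaction A -> B with two atoms per metabolite; the matrices V and E are written out manually and only play the roles of M2Ai and Ti2R, they are not produced by the toolbox.

```matlab
% Toy network (assumed example): one reaction A -> B, two atoms per metabolite.
N = [-1; 1];                      % m x n stoichiometric matrix (m = 2, n = 1)

% Atoms 1,2 belong to A and atoms 3,4 belong to B.
% Atom transitions: atom 1 -> atom 3 and atom 2 -> atom 4.
dATM = digraph([1 2], [3 4]);
A = full(incidence(dATM));        % a x t incidence matrix (a = 4, t = 2)

V = [1 1 0 0; 0 0 1 1];           % m x a, plays the role of M2Ai
E = [1; 1];                       % t x n, plays the role of Ti2R

lhs = V * V' * N;                 % V*V' is diagonal with the atom counts
rhs = V * A * E;
Nrecovered = (V * V') \ rhs;      % N = (V*V')^(-1) * V * A * E
```

Here V*V' equals diag(2, 2) because each metabolite has two atoms, which is why dividing by it recovers the stoichiometric coefficients.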

Bonds:

Note that B = incidence(dBTM) returns a b x s bond transition directed multigraph incidence matrix where b is the number of bonds and s is the number of directed bond transitions. Each bond transition inherits the orientation of its corresponding reaction.

A stoichiometric matrix may be decomposed into a set of bond transitions with the following decomposition in terms of bonds:

$$N = \left(UW^{T}\right)^{-1}UBF$$

UW^{T} is a diagonal matrix, where each diagonal entry is the number of bonds in each metabolite, so U*W^{T}*N = U*B*F

With respect to the input, N is the subset of model.S corresponding to bond mapped reactions

With respect to the output, U := M2Bi, W := M2BiW, F := BTi2R and B := incidence(dBTM),

so we have the decomposition in terms of bonds M2Bi*M2BiW'*N = M2Bi*B*BTi2R

USAGE:

[dATM, metAtomMappedBool, rxnAtomMappedBool, M2Ai, Ti2R, dBTM, M2BiE, M2BiW, BTiE] = buildAtomAndBondTransitionMultigraph(model, RXNFileDir, options)

INPUTS:

model: Directed stoichiometric hypergraph, represented by a MATLAB structure with the following fields:

.S - The m x n stoichiometric matrix for the metabolic network

.mets - An m x 1 array of metabolite identifiers. Should match metabolite identifiers in rxnfiles.

.rxns - An n x 1 array of reaction identifiers. Should match rxnfile names in RXNFileDir.

.lb - An n x 1 vector of lower bounds on fluxes.

.ub - An n x 1 vector of upper bounds on fluxes.

RXNFileDir: Path to directory containing rxnfiles with atom mappings for internal reactions in S. File names should correspond to reaction identifiers in input rxns, e.g. after

git clone https://github.com/opencobra/ctf ~/fork-ctf

set RXNFileDir = ~/fork-ctf/rxns/atomMapped

options: A structure with two fields representing customisable options for the function.

.sanityChecks - A boolean variable that controls whether sanity checks are performed within the function.

.bondTransitionMultigraph - A boolean variable that specifies whether the function generates the bond transition multigraph.
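A call with the inputs described above might look like the following sketch. It assumes that a COBRA model is already loaded in a variable named model and that the ctf repository was cloned to ~/fork-ctf as in the example; the call is guarded so that nothing runs when either assumption does not hold.

```matlab
% Options structure using the two documented fields.
options = struct();
options.sanityChecks = false;              % do not request the sanity checks
options.bondTransitionMultigraph = true;   % also build the bond transition multigraph

% Path taken from the example above; adjust to the local clone (assumption).
RXNFileDir = fullfile('~', 'fork-ctf', 'rxns', 'atomMapped');

% Guarded call: skipped when the COBRA Toolbox function or the
% model variable (assumed name) is not available in this session.
haveFunction = exist('buildAtomAndBondTransitionMultigraph', 'file') == 2;
haveModel = exist('model', 'var') == 1;
if haveFunction && haveModel
    [dATM, metAtomMappedBool, rxnAtomMappedBool, M2Ai, Ti2R, dBTM] = ...
        buildAtomAndBondTransitionMultigraph(model, RXNFileDir, options);
end
```

Only the first six outputs of the documented usage line are requested here; the remaining ones can be added to the output list in the same order.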

OUTPUT:
dATM: Directed atom transition multigraph as a MATLAB digraph structure with the following tables:

.Nodes — Table of node information, with p rows, one for each atom.

.Nodes.Atom - unique index for each atom

.Nodes.Atom - unique alphanumeric id for each atom by concatenation of the metabolite, atom and element

.Nodes.AtomIndex - unique numeric id for each atom in atom transition multigraph

.Nodes.Met - metabolite containing each atom

.Nodes.AtomNumber - unique numeric id for each atom in an atom mapping

.Nodes.Element - atomic element of each atom

.EdgeTable — Table of edge information, with t rows, one for each atom transition instance.

.EdgeTable.EndNodes - two-column cell array of character vectors that defines the graph edges

.EdgeTable.Trans - unique alphanumeric id for each atom transition instance by concatenation of the reaction, head and tail atoms

.EdgeTable.TansInstIndex - unique numeric id for each atom transition instance

.EdgeTable.dirTransInstIndex - unique numeric id for each directed atom transition instance

.EdgeTable.Rxn - reaction corresponding to each atom transition

.EdgeTable.HeadAtomIndex - head Nodes.AtomIndex

.EdgeTable.TailAtomIndex - tail Nodes.AtomIndex

metRXNBool: m x 1 boolean vector indicating atom mapped metabolites
rxnRXNBool: n x 1 boolean vector indicating atom mapped reactions
M2Ai: m x a matrix mapping each metabolite to an atom in the directed atom transition multigraph
Ti2R: t x n matrix mapping each directed atom transition instance to a mapped reaction

The internal stoichiometric matrix may be decomposed into N = (M2Ai*M2Ai')^(-1)*M2Ai*Ti*Ti2R, where Ti = incidence(dATM) is the incidence matrix of the directed atom transition multigraph.

dBTM: Directed bond transition multigraph as a MATLAB digraph structure with the following tables:

.Nodes — Table of node information, with q rows, one for each bond.

.Nodes.Bond - unique alphanumeric id for each bond by concatenation of the metabolite, head bond and tail bond

.Nodes.BondIndex - unique numeric id for each bond in bond transition multigraph

.Nodes.BondHeadAtom - the alphanumeric id for the head atom forming the bond

.Nodes.BondTailAtom - the alphanumeric id for the tail atom forming the bond

.Nodes.BondHeadAtomIndex - the numeric id for the head atom forming the bond

.Nodes.BondTailAtomIndex - the numeric id for the tail atom forming the bond

.Nodes.Met - metabolite containing each bond

.Nodes.BondType - the type of each bond (1 for a single bond, 2 for a double bond, and 3 for a triple bond)

.EdgeTable — Table of edge information, with s rows, one for each bond transition instance.

.EdgeTable.EndNodes - two-column cell array of character vectors that defines the graph edges

.EdgeTable.Trans - unique alphanumeric id for each bond transition instance by concatenation of the reaction, head and tail bonds

.EdgeTable.TansInstIndex - unique numeric id for each bond transition instance

.EdgeTable.dirTransInstIndex - unique numeric id for each directed bond transition instance

.EdgeTable.Rxn - reaction corresponding to each bond transition

.EdgeTable.HeadBondIndex - head Nodes.BondIndex

.EdgeTable.TailBondIndex - tail Nodes.BondIndex

.EdgeTable.HeadBond - head Nodes.Bond

.EdgeTable.TailBond - tail Nodes.Bond

.EdgeTable.HeadMet - head Nodes.Met

.EdgeTable.TailMet - tail Nodes.Met

.EdgeTable.HeadMetBondTypes - head Nodes.BondTypes

.EdgeTable.TailMetBondTypes - tail Nodes.BondTypes

metRXNBool: m x 1 boolean vector indicating bond mapped metabolites
rxnRXNBool: n x 1 boolean vector indicating bond mapped reactions
M2Bi: m x b matrix mapping each metabolite to a bond in the directed bond transition multigraph
BTi2R: s x n matrix mapping each directed bond transition instance to a mapped reaction

The internal stoichiometric matrix may be decomposed into N = (M2Bi*M2BiW')^(-1)*M2Bi*B*BTi2R, where B = incidence(dBTM) is the incidence matrix of the directed bond transition multigraph.

checkABRXNFiles(model, RXNFileDir)

Checks whether the set of RXN files corresponding to a model have consistent stoichiometry and are elementally balanced

INPUTS:

model: Directed stoichiometric hypergraph, represented by a MATLAB structure with the following fields:

.S - The m x n stoichiometric matrix for the metabolic network

.mets - An m x 1 array of metabolite identifiers. Should match metabolite identifiers in RXNfiles.

.rxns - An n x 1 array of reaction identifiers. Should match RXNfile names in RXNFileDir.

.lb - An n x 1 vector of lower bounds on fluxes.

.ub - An n x 1 vector of upper bounds on fluxes.

RXNFileDir: Path to directory containing RXNfiles with atom mappings for internal reactions in S. File names should correspond to reaction identifiers in input model.rxns, e.g. after

git clone https://github.com/opencobra/ctf ~/fork-ctf

set RXNFileDir = ~/fork-ctf/rxns/atomMapped

OUTPUT:
metRXNBool: m x 1 vector, true if metabolite identified in at least one RXN file
RXNBool: n x 1 boolean vector, true if RXN file exists
RXNParsedBool: n x 1 boolean vector, true if RXN file could be parsed
RXNAtomsConservedBool: n x 1 boolean vector, true if atoms in RXN file are conserved
RXNStoichiometryMatchBool: n x 1 boolean vector, true if RXN stoichiometry matches model.S stoichiometry
RXNStoichiometryMatchUptoProtonsBool: n x 1 boolean vector, true if RXN stoichiometry matches model.S stoichiometry when ignoring protons
RXNSubstrateTransitionNumbersOrdered: n x 1 boolean vector, true if the RXN file has substrate transition numbers ordered 1:q
RXNProductTransitionNumbersOrdered: n x 1 boolean vector, true if the RXN file has product transition numbers ordered 1:q
RXNTransitionNumbersMatching: n x 1 boolean vector, true if the RXN file has matching numbering of atoms between substrates and products
RXNMatchingElementBool: n x 1 boolean vector, true if the RXN file has matching elements between substrates and products

nTotalBondTransitions: The total number of bond transitions in a rxn

createBIGraph(BG)

CREATEBIGRAPH Creates a multigraph (BIG) where each bond instance (e.g., in double bonds) is represented as a separate edge.

INPUT: BG - The original graph containing nodes and edges with properties.
OUTPUT: BIG - Bond instance graph with duplicate edges for each bond type, preserving all properties.
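The idea of one edge per bond instance can be sketched with MATLAB's built-in multigraph support (R2018a or later). This is an illustration under assumptions, not the toolbox implementation: the edge property name BondType and the formaldehyde example are hypothetical.

```matlab
% Toy bond graph for formaldehyde: C=O, C-H, C-H
% (nodes 1 = C, 2 = O, 3 = H, 4 = H).
BG = graph([1 1 1], [2 3 4]);
BG.Edges.BondType = [2; 1; 1];    % hypothetical edge property name

% Repeat each edge row once per unit of bond order, so a double bond
% becomes two parallel edges. All edge properties are copied with the row.
rowIdx = repelem((1:numedges(BG))', BG.Edges.BondType);
BIG = graph(BG.Edges(rowIdx, :));

nBondInstances = numedges(BIG);   % 2 + 1 + 1 bond instances
hasParallelEdges = ismultigraph(BIG);
```

Building the new graph from the repeated edge table keeps the edge properties attached to every duplicate; node properties are not carried over in this sketch and would need to be passed as a node table.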

createMoietyGraph(model, BG, arm)

CreateMoietyGraph generates a graph representation of moieties in a metabolic network.

Input:

- model – A structure containing information about the metabolic submodel.
- BG – A graph representing the metabolic network.
- arm – An atomically resolved model as a MATLAB structure from the identifyConservedMoieties function.

Output:

- moietyGraph – Graph representation of moiety cycles in a metabolic network.

extractBondSubgraphs(BIG, ATG)

EXTRACTBONDSUBGRAPHS Extracts subgraphs of bonds and their mappings.

INPUTS:

BIG - Bond Instance Graph – The full weighted bond graph containing all bonds and weights.
ATG - Atom Graph – Represents atoms as nodes and their bonds as edges.
atoms2component - Array mapping each atom to a specific component or group.

OUTPUTS:

bondSubgraphs - Cell array where each entry represents a subgraph of connected bonds.
BMG - Cell array of Bond Mapping Graphs (BMG)

findAndExtractMolecularGraphs(BIG, BMG, bondSubgraphs)

findAndExtractMolecularGraphs - Identifies conserved and reacting isomorphic groups and extracts associated molecular graphs.

Inputs:

BIG - The original graph containing all bonds and nodes.
BMG - Cell array containing bond mapping graphs (subgraphs)
bondSubgraphs - Cell array where each cell contains a subgraph representing a set of bonds mapped to each other.

Outputs:

CMTG - Conserved Molecular Transition Graph from bondSubgraphs.
RMTG - Reacting Molecular Transition Graph from bondSubgraphs.
CMG - Conserved Molecular Graph from BIG.
RMG - Reacting Molecular Graph from BIG.
conservedGroup - Indices of subgraphs in the largest isomorphic group.
reactingGroups - Indices of subgraphs not part of the largest isomorphic group.
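Given a set of isomorphism classes, the split into conservedGroup and reactingGroups described above amounts to picking the largest class. The sketch below uses made-up class indices and is not the toolbox code.

```matlab
% Hypothetical isomorphism classes over six subgraphs
% (each entry holds indices into bondSubgraphs).
isomorphismClasses = {[1 3 4 6], [2], [5]};

% The largest class is treated as conserved, everything else as reacting.
classSizes = cellfun(@numel, isomorphismClasses);
[~, largest] = max(classSizes);
conservedGroup = isomorphismClasses{largest};
reactingGroups = setdiff(1:sum(classSizes), conservedGroup);
```

When two classes tie for the largest size, max returns the first one, so a real implementation may need an explicit tie-breaking rule.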

getMetMoietySubgraphs(model, BG, arm)

GETSUBGRAPHS Extract moieties and metabolite subgraphs from a given model.

Inputs:

- model – A metabolic model structure (COBRA Toolbox model).
- dATM – An atom transitions multigraph.
- dBTM – A bond transitions multigraph.
- BG – A bipartite graph representing the metabolic network.
- arm – An atomically resolved model as a MATLAB structure from the identifyConservedMoieties function.

Outputs:

- MG – Cell array of metabolite graphs.
- moietyMG – Cell array of moieties subgraphs.
- moietyInstanceG – Cell array of moiety instance subgraphs.

Example usage: [GCMoieties, GMets, GCMoietyInstances] = getMetMoietySubgraphs(model, dATM, dBTM, BG, arm);

This function takes a metabolic model, atom and bond transition graphs, a bipartite graph representing the metabolic network, and the stoichiometric matrix. It then extracts moieties and metabolite subgraphs and returns them as cell arrays.

Authors: Hadjar Rahou, 2023.

identifyConservedReactingMoieties(model, BG, dATM, options)

This function computes:

Conserved moieties (structural moiety conservation relations) from an atom-mapped network using graph-theoretic analysis of the directed atom transition multigraph (dATM).
Reacting moieties from the bond graph (BG) by identifying reacting bonds, contracting atom-transition components, and solving a minimum set cover problem over reactions (see theory PDF, Sections on reacting moieties).

The conserved-moiety part yields a decomposition of the atom-mapped stoichiometric matrix N into moiety transitions:

N = inv(M2M*M2M') * M2M * M * M2R

where:
N = model.S(metAtomMappedBool, rxnAtomMappedBool)
M2M = mapping metabolite -> moiety instances
M = incidence(MTG) (incidence matrix of the moiety transition graph)
M2R = mapping moiety transitions -> reactions

NOTE: (M2M*M2M') is diagonal; each diagonal entry equals the number of moiety instances in the corresponding metabolite.

USAGE:

[arm, moietyFormulae, reacting] = identifyConservedReactingMoieties (model, BG, dATM, options)

INPUTS:

model – Structure with following fields:
- .S - The m x n stoichiometric matrix for the metabolic network
- .mets - An m x 1 array of metabolite identifiers. Should match metabolite identifiers in rxnfiles.
- .rxns - An n x 1 array of reaction identifiers. Should match rxnfile names in rxnFileDir.
BG – Bond graph / molecular graph input describing intra-metabolite bonds. Must be consistent with the atom set in dATM (same atoms / indices).
dATM – Directed atom transition multigraph, obtained from buildAtomTransitionMultigraph.m. A MATLAB digraph structure with the following tables and variables:
- .Nodes — Table of node information, with p rows, one for each atom.
- .Nodes.Atom - unique alphanumeric id for each atom by concatenation of the metabolite, atom and element
- .Nodes.AtomIndex - unique numeric id for each atom in atom transition multigraph
- .Nodes.mets - metabolite containing each atom
- .Nodes.AtomNumber - unique numeric id for each atom in a metabolite
- .Nodes.Element - atomic element of each atom
- .Edges — Table of edge information, with q rows, one for each atom transition instance.
- .Edges.EndNodes - two-column cell array of character vectors that defines the graph edges
- .Edges.Trans - unique alphanumeric id for each atom transition instance by concatenation of the reaction, head and tail atoms
- .Edges.TransIndex - unique numeric id for each atom transition instance
- .Edges.rxns - reaction abbreviation corresponding to each atom transition instance
- .Edges.HeadAtomIndex - head Nodes.AtomIndex
- .Edges.TailAtomIndex - tail Nodes.AtomIndex

OPTIONAL INPUTS:
options: Structure with the following fields:

.sanityChecks {(0),1} - true to perform additional sanity checks on computations, at the cost of substantially more computation time

OUTPUTS:
arm: Atomically resolved model as a MATLAB structure with the following fields:

arm.MRH: Directed metabolic reaction hypergraph, i.e. standard COBRA model, with additional fields:
arm.MRH.metAtomMappedBool: m x 1 boolean vector indicating atom mapped metabolites
arm.MRH.rxnAtomMappedBool: n x 1 boolean vector indicating atom mapped reactions

arm.dATM: Directed atom transition multigraph (dATM) obtained from buildAtomTransitionMultigraph.m

arm.M2Ai: m x a matrix mapping each mapped metabolite to one or more atoms in the directed atom transition multigraph
arm.Ti2R: t x n matrix mapping one or more directed atom transition instances to each mapped reaction
arm.Ti2I: t x i matrix to map one or more directed atom transition instances to each isomorphism class

arm.ATG: Atom transition graph, as a MATLAB graph structure with the following tables and variables:

.Nodes — Table of node information, with a rows, one for each atom.

.Nodes.Atom - unique alphanumeric id for each atom by concatenation of the metabolite, atom and element

.Nodes.AtomIndex - unique numeric id for each atom in atom transition multigraph

.Nodes.mets - metabolite containing each atom

.Nodes.AtomNumber - unique numeric id for each atom in an atom mapping

.Nodes.Element - atomic element of each atom

.Nodes.MoietyIndex - numeric id of the corresponding moiety (arm.MTG.Nodes.MoietyIndex)

.Nodes.Component - numeric id of the corresponding connected component (rows of C2A)

.Nodes.IsomorphismClass - numeric id of the corresponding isomorphism class (rows of I2C)

.Nodes.IsCanonical - boolean, true if atom is within first component of an isomorphism class

.Edges — Table of edge information, with u rows, one for each atom transition instance.

.Edges.EndNodes - numeric id of head and tail atoms that defines the graph edges

.Edges.Trans - unique alphanumeric id for each atom transition by concatenation of head and tail atoms

.Edges.HeadAtomIndex - head Nodes.AtomIndex

.Edges.TailAtomIndex - tail Nodes.AtomIndex

.Edges.HeadAtom - head Nodes.Atom

.Edges.TailAtom - tail Nodes.Atom

.Edges.TransIndex - unique numeric id for each atom transition

.Edges.Component - numeric id of the corresponding connected component (columns of T2C)

.Edges.IsomorphismClass - numeric id of the corresponding isomorphism class (columns of T2I)

.Edges.IsCanonical - boolean, true if atom transition is within first component of an isomorphism class

arm.M2A: m x a matrix mapping each metabolite to one or more atoms in the (undirected) atom transition graph
arm.A2R: u x n matrix that maps atom transitions to reactions. An atom transition can map to multiple reactions and a reaction can map to multiple atom transitions
arm.A2Ti: u x t matrix to map each atom transition (in ATG) to one or more directed atom transition instances (in dATM) with reorientation if necessary

arm.I2C: i x c matrix to map each isomorphism class (I) to one or more components (C) of the atom transition graph (ATG)
arm.C2A: c x a matrix to map each connected component (C) of the atom transition graph to one or more atoms (A)
arm.A2C: u x c matrix to map one or more atom transitions (T) to connected components (C) of the atom transition graph (ATG)

arm.I2A: i x a matrix to map each isomorphism class to one or more atoms of the atom transition graph (ATG)
arm.A2I: u x i matrix to map one or more atom transitions to each isomorphism class

arm.MTG: (Undirected) moiety transition graph

identifyConservedReactingSubgraphs(model, dATM, dBTM)

IDENTIFYCONSERVEDREACTINGSUBGRAPHS Identifies conserved and reacting bond and atom subgraphs.

Inputs:

- dBTM – Bond transition multigraph.
- dATM – Atom transition multigraph.
- model – model containing reactions of interest.

Outputs:

- CBG – Conserved bond subgraph.
- RBG – Reacting bond subgraph.
- CAG – Conserved atom subgraph.
- RAG – Reacting atom subgraph.
- brokenBondsTable – a table of the broken bonds in the model.
- formedBondsTable – a table of the formed bonds in the model.

This function identifies conserved and reacting bond subgraphs from the bond transition multigraph and conserved and reacting atom subgraphs from the atom transition multigraph based on the provided submodel.

Usage:

[brokenBondsTable, formedBondsTable, CAG, RAG, CBG, RBG] = identifyConservedReactingSubgraphs(model, dATM, dBTM)

Author: Hadjar Rahou, 2023


identifyIsomorphicClasses(CBSubgraphs, sanityChecks)

identifyIsomorphicClasses - Identifies isomorphism classes for a set of subgraphs.

Inputs:

CBSubgraphs - Cell array where each cell contains a subgraph.
sanityChecks - Boolean flag to enable additional consistency checks.

Outputs:

isomorphismClasses - Cell array where each cell contains indices of isomorphic subgraphs.
firstSubgraphIndices - Indices of the first subgraph in each isomorphism class.
subsequentSubgraphIndices - Array mapping subgraphs to their isomorphism class.

Notes

Requires MATLAB R2016b or later for the isisomorphic function with variable matching.
Node and edge properties are compared for isomorphism detection.
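Grouping subgraphs into isomorphism classes can be sketched with the built-in isisomorphic function mentioned in the notes. The loop below is an illustrative approach on three small hand-built graphs, not the toolbox implementation, and it compares structure only (no node or edge properties).

```matlab
% Three toy subgraphs: two paths on three nodes (labelled differently)
% and one triangle.
subgraphs = {graph([1 2], [2 3]), graph([1 1], [2 3]), graph([1 2 3], [2 3 1])};

% Compare each subgraph with the first member of every existing class;
% open a new class when no isomorphic representative is found.
isomorphismClasses = {};
for k = 1:numel(subgraphs)
    placed = false;
    for c = 1:numel(isomorphismClasses)
        representative = subgraphs{isomorphismClasses{c}(1)};
        if isisomorphic(subgraphs{k}, representative)
            isomorphismClasses{c}(end + 1) = k;
            placed = true;
            break
        end
    end
    if ~placed
        isomorphismClasses{end + 1} = k;
    end
end

% Index of the first subgraph in each class.
firstSubgraphIndices = cellfun(@(c) c(1), isomorphismClasses);
```

Comparing against one representative per class is enough because isomorphism is an equivalence relation. To also match node or edge properties, isisomorphic accepts the 'NodeVariables' and 'EdgeVariables' name-value arguments.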

mapAontoBOld(Akey, Bkey, Ain, Bin)

LIBkey: an array of the same size as Bkey containing true where the elements of Bkey are in Akey and false otherwise.
LOCAkey: an array containing the lowest absolute index in Akey for each element in Bkey which is a member of Akey, and 0 if there is no such index.
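The two outputs described here have the same semantics as the outputs of MATLAB's built-in ismember. The sketch below uses ismember directly on made-up keys to show what such values look like; it is an analogy, not a claim about how mapAontoBOld is implemented.

```matlab
% Hypothetical keys, e.g. metabolite identifiers.
Akey = {'atp', 'adp', 'pi'};
Bkey = {'pi', 'h2o', 'atp'};

% LIBkey is true where an element of Bkey occurs in Akey;
% LOCAkey holds the lowest matching index in Akey, or 0 when absent.
[LIBkey, LOCAkey] = ismember(Bkey, Akey);
```

With these keys, 'pi' is found at position 3 of Akey, 'h2o' is absent, and 'atp' is found at position 1.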

readABRXNFile(rxnfileName, rxnfileDirectory, options)

Read atom mappings from an MDL rxn file.

USAGE:

[atoms, bonds] = readABRXNFile(rxnfileName, rxnfileDirectory)

INPUT:

rxnfileName – The file name.

OPTIONAL INPUT:

rxnfileDirectory – Path to directory containing the rxnfile. Defaults to current directory.

OUTPUTS:

atoms –
Table of atom information, with p rows, one for each atom.
- .mets - A p x 1 cell array of metabolite identifiers for atoms.
- .elements - A p x 1 cell array of element symbols for atoms.
- .metNrs - A p x 1 vector containing the numbering of atoms within
each metabolite molfile.
- .atomTransitionNrs - A p x 1 vector of atom transition indices.
- .isSubstrate - A p x 1 logical array. True for substrates, false for
products in the reaction.
- .instances - A p x 1 vector indicating which instance of a repeated metabolite atom i belongs to.
bonds –
Table of bond information, with q rows, one for each bond.
- .mets - A q x 1 cell array of metabolite identifiers for bonds.
- .headAtoms - A q x 1 vector containing the numbering of the first atom forming the bond within each metabolite.
- .tailAtoms - A q x 1 vector containing the numbering of the second atom forming the bond within each metabolite.
- .bTypes - A q x 1 vector of the bond type within each metabolite (1 for a single bond, 2 for a double bond, and 3 for a triple bond).
- .headAtomTransitionNrs - A q x 1 vector of atom transition indices of the first atom forming the bond within each metabolite.
- .tailAtomTransitionNrs - A q x 1 vector of atom transition indices of the second atom forming the bond within each metabolite.
- .isSubstrate - A q x 1 logical array. True for substrates, false for products in the reaction for bonds.
- .instances - A q x 1 vector indicating which instance of a repeated metabolite each bond belongs to.
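The atoms table described above carries enough information to check that an atom mapping pairs atoms of the same element. The sketch below builds a small table by hand with invented values and a subset of the documented fields; it illustrates the idea and is not output of readABRXNFile.

```matlab
% Hypothetical atoms table for a reaction A -> B with two mapped atoms.
mets              = {'A'; 'A'; 'B'; 'B'};
elements          = {'C'; 'O'; 'O'; 'C'};
atomTransitionNrs = [1; 2; 2; 1];
isSubstrate       = [true; true; false; false];
atoms = table(mets, elements, atomTransitionNrs, isSubstrate);

% Order substrate and product atoms by atom transition number so that
% row k on each side refers to the same mapped atom.
substrateAtoms = sortrows(atoms(atoms.isSubstrate, :), 'atomTransitionNrs');
productAtoms   = sortrows(atoms(~atoms.isSubstrate, :), 'atomTransitionNrs');

% The mapping is consistent when the numbering and the elements agree.
numbersMatch  = isequal(substrateAtoms.atomTransitionNrs, productAtoms.atomTransitionNrs);
elementsMatch = isequal(substrateAtoms.elements, productAtoms.elements);
```

Sorting both sides by transition number first avoids depending on the order in which atoms appear in the file.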

Hadjar Rahou (readBonds)