groupContribution
-
assignThermoToModel(model, Alberty2006, Legendre, LegendreCHI, useKeqData, printToFile, GCpriorityMetList, metGroupCont, metSpeciespKa)
Assigns thermodynamic data to model at given temperature, pH, ionic strength and electrical potential.
Physicochemically, this is the most important function for setting up a thermodynamic model. It takes the standard Gibbs energies of the metabolite species and uses this data to create a standard transformed Gibbs energy for each reactant. It uses the metabolite species standard Gibbs energies of formation backcalculated from equilibrium constants, in preference to the group contribution estimates.
Usage
[model, computedSpeciesData] = assignThermoToModel(model, Alberty2006, Legendre, LegendreCHI, useKeqData, printToFile, GCpriorityMetList, metGroupCont, metSpeciespKa)
Inputs
- model –
structure with fields:
- model.T - temperature 298.15 K to 313.15 K
- model.ph(p) - real pH in compartment defined by letter p
- model.is(p) - ionic strength (0 - 0.35M) in compartment defined by letter p
- model.chi(p) - electrical potential (mV) in compartment defined by letter p
- model.cellCompartments(p) - 1 x # cell array of distinct compartment letters
- model.NaNdfG0GCMetBool(m) - m x 1 boolean vector with 1 when no group contribution data is available for a metabolite
- Alberty2006 – Alberty’s data
Optional inputs
Legendre – {(1), 0} Legendre Transformation for specific real pH?
LegendreCHI – {(1), 0} Legendre Transformation for specific electrical potential?
useKeqData – {(1), 0} Use dGf0 back calculated from Keq?
printToFile – {(0), 1} 1 = print out repetitive material to log file
metGroupCont – Structure containing output from Jankowski et al.’s 2008 implementation of the group contribution method (GCM). Contains the following fields for each metabolite:
- .abbreviation: Metabolite ID
- .formulaMarvin: Metabolite formula output by GCM
- .delta_G_formation: Estimated standard Gibbs energy of formation
- .delta_G_formation_uncertainty: Uncertainty in estimated delta_G_formation
- .chargeMarvin: Metabolite charge output by GCM
metSpeciespKa – Structure containing pKa for acid-base equilibria between metabolite species. pKa are estimated with ChemAxon’s pKa calculator plugin (see function assignpKasToSpecies).
Output
- model –
structure with fields:
- model.dfG0(m) - standard Gibbs energy of formation
- model.dfG(m) - Gibbs energy of formation
- model.dfGt0(m) - standard transformed Gibbs energy of formation
- model.dHzerot(m) - standard transformed enthalpy of formation
- model.dfGt0Source(m) - origin of data, Keq or groupContFileName.txt
- model.dfGt0Keq(m)
- model.dfGt0GroupCont(m)
- model.dfHt0Keq(m)
- model.mf(m) - mole fraction of each species within a pseudoisomer group
- model.aveZi(m) - average charge
- model.chi - electrical potential
- model.aveHbound(m) - average number of protons bound to a reactant
- modelT.gasConstant - Gas Constant (deprecated)
- model.faradayConstant - Faraday Constant
- modelT.temp - Temperature (deprecated)
- model.ph(p) - real pH in compartment defined by letter p
- model.is(p) - ionic strength (0 - 0.35M) in compartment defined by letter p
- model.chi(p) - electrical potential (mV) in compartment defined by letter p
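The step from species data to a single value per reactant can be illustrated with the pseudoisomer group expression from Alberty's formalism. The following is a minimal sketch, not the toolbox implementation; the species energies and hydrogen counts are hypothetical, and the variable names only mirror the output fields listed above.

```matlab
% Minimal sketch (not the toolbox implementation): combine the standard
% transformed Gibbs energies of the species in one pseudoisomer group into
% a single value for the reactant. All numbers are illustrative assumptions.
R = 8.314462618e-3;                          % gas constant, kJ mol^-1 K^-1
T = 298.15;                                  % temperature, K
dfGt0Species = [-2292.5; -2291.9; -2288.8];  % hypothetical species values, kJ mol^-1
nH = [12; 13; 14];                           % hypothetical hydrogens per species

% Shift by the minimum before exponentiating to avoid numerical overflow.
gMin = min(dfGt0Species);
w = exp(-(dfGt0Species - gMin) / (R * T));

dfGt0Reactant = gMin - R * T * log(sum(w));  % reactant value, kJ mol^-1
mf = w / sum(w);                             % equilibrium mole fraction of each species
aveHbound = mf' * nH;                        % average number of hydrogens bound
```

Because the most stable species contributes a weight of exactly 1, the reactant value can never lie above the lowest species value.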
-
checkFormulae(formulaA, formulaB, exceptions)
Compares the element counts of two formulae, optionally ignoring the elements listed in exceptions.
Usage
bool = checkFormulae(formulaA, formulaB, exceptions)
Inputs
- formulaA
- formulaB
Optional input
- exceptions – Cell array of element symbols to exclude from the comparison, e.g. {‘H’}
Output
- bool – Boolean where 1 means the formulae match and 0 otherwise
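The comparison described above can be sketched as follows. This is an illustrative re-implementation, not the toolbox source; the `parse` helper is hypothetical and only handles simple formulae without brackets or charges.

```matlab
% Minimal sketch, not the toolbox source: compare element counts of two
% formulae while ignoring the elements listed in exceptions.
formulaA = 'C6H12O6';
formulaB = 'C6H11O6';
exceptions = {'H'};

% Hypothetical helper: split a simple formula into {element, count} tokens.
parse = @(f) regexp(f, '([A-Z][a-z]?)(\d*)', 'tokens');

formulas = {formulaA, formulaB};
counts = cell(1, 2);
for k = 1:2
    tok = parse(formulas{k});
    c = containers.Map('KeyType', 'char', 'ValueType', 'double');
    for t = 1:numel(tok)
        el = tok{t}{1};
        n = str2double(tok{t}{2});
        if isnan(n), n = 1; end          % a missing count means one atom
        if isKey(c, el), c(el) = c(el) + n; else, c(el) = n; end
    end
    counts{k} = c;
end

% Compare every element that is not listed as an exception.
mapA = counts{1};
mapB = counts{2};
elements = setdiff(union(keys(mapA), keys(mapB)), exceptions);
bool = true;
for e = 1:numel(elements)
    el = elements{e};
    nA = 0; nB = 0;
    if isKey(mapA, el), nA = mapA(el); end
    if isKey(mapB, el), nB = mapB(el); end
    bool = bool && (nA == nB);
end
```

Here the two formulae differ only in hydrogen, so with `exceptions = {'H'}` they are treated as matching.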
-
compareMetaboliteFormulae(modelT)
Prints out a tab delimited file with the abbreviations, reconstruction metabolite formulae, and group contribution metabolite formulae.
Usage
compareMetaboliteFormulae(modelT)
Input
- modelT – output of setupThermoModel
-
createComputedSpeciesData(metSpeciespKa, metGroupCont)
Use group contribution estimated standard Gibbs energies of formation for predominant metabolite species at pH 7, and ChemAxon estimated pKa for species equilibria, to calculate standard Gibbs energies of formation for nonpredominant metabolite species.
Usage
computedSpeciesData = createComputedSpeciesData(metSpeciespKa, metGroupCont)
Inputs
- metSpeciespKa – Structure containing pKa for acid-base equilibria between metabolite species. pKa are estimated with ChemAxon’s pKa calculator plugin (see function “assignpKasToSpecies”)
- metGroupCont – Structure array with group contribution method output mapped to BiGG metabolites.
Output
computedSpeciesData – Structure with thermodynamic data for metabolite species. Contains two fields for each metabolite:
- .abbreviation: Metabolite abbreviation
- .basicData: Cell array with 4 columns; 1. dGf0 (kJ/mol), 2. dHf0, 3. charge, 4. #Hydrogens
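The relation between a pKa and the standard Gibbs energies of two neighbouring species can be sketched as below. This is a minimal illustration under stated assumptions, not the toolbox implementation: all numbers are hypothetical, the convention that the standard Gibbs energy of formation of H+ is zero is assumed, and a numeric matrix stands in for the 4-column `.basicData` layout.

```matlab
% Minimal sketch under stated assumptions, not the toolbox implementation.
% For HA <=> A- + H+ with dGf0(H+) = 0:
%   dGf0(HA) = dGf0(A-) - R*T*ln(10)*pKa
R = 8.314462618e-3;        % gas constant, kJ mol^-1 K^-1
T = 298.15;                % temperature, K
dGf0Predominant = -1000;   % hypothetical group contribution estimate, kJ mol^-1
zPredominant = -2;         % hypothetical charge of the predominant species at pH 7
nHPredominant = 10;        % hypothetical number of hydrogens in that species
pKa = 6.5;                 % hypothetical pKa of the next more protonated species

dGf0Protonated = dGf0Predominant - R * T * log(10) * pKa;

% Rows: species; columns: dGf0 (kJ/mol), dHf0, charge, #hydrogens.
% dHf0 is unknown in this sketch and left as NaN.
basicData = [dGf0Predominant, NaN, zPredominant,     nHPredominant;
             dGf0Protonated,  NaN, zPredominant + 1, nHPredominant + 1];
```

Each further protonation or deprotonation step would reuse the same relation with the corresponding pKa.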
-
createGroupContributionStruct(primaryFile, pH, secondaryFile)
Generates a MATLAB structure out of the tab delimited group contribution data.
The MATLAB structure with the group contribution data for each metabolite uses the primaryFile file in preference to the secondaryFile file, but these can be any first and second preference files as long as they are in the correct format; see webCGMtoTabDelimitedFile.m.
Usage
metGroupCont = createGroupContributionStruct(primaryFile, pH, secondaryFile)
Input
- primaryFile – tab delimited text file with group contribution data (Jankowski et al., Biophysical Journal 95:1487-1499 (2008)), i.e. output such as webCGM.txt from webCGMtoTabDelimitedFile.m
Optional inputs
- pH – pH at which the group contribution data are given, default = 7
- secondaryFile – tab delimited text file with group contribution data (Jankowski et al., Biophysical Journal 95:1487-1499 (2008)), i.e. output such as webCGM.txt from webCGMtoTabDelimitedFile.m. If the same metabolite abbreviation occurs in both files, then the data in the primary file takes precedence.
Comment on input file format: the first two text columns in both files should correspond to abbreviation and formulaMarvin; the next three columns in both files should correspond to delta_G_formation, delta_G_formation_Uncertainty and chargeMarvin.
Output
- metGroupCont –
structure with fields:
- metGroupCont(m).abbreviation - metabolite abbreviation
- metGroupCont(m).formulaMarvin - metabolite formula (Marvin)
- metGroupCont(m).delta_G_formation
- metGroupCont(m).delta_G_formation_uncertainty
- metGroupCont(m).chargeMarvin - metabolite charge (Marvin)
- metGroupCont(m).pH
- metGroupCont(m).file - file data came from
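The precedence rule between the two files can be sketched with in-memory records instead of files. This is a minimal illustration, not the toolbox source; the abbreviations, values and file names are hypothetical and only a subset of the fields is shown.

```matlab
% Minimal sketch of the stated precedence rule, not the toolbox source.
primary = struct('abbreviation', {'atp', 'adp'}, ...
                 'delta_G_formation', {-100, -200}, ...
                 'file', 'primary.txt');
secondary = struct('abbreviation', {'adp', 'amp'}, ...
                   'delta_G_formation', {-250, -300}, ...
                   'file', 'secondary.txt');

% Keep every primary record, then append only those secondary records
% whose abbreviation is not already present in the primary data.
isNew = ~ismember({secondary.abbreviation}, {primary.abbreviation});
metGroupCont = [primary, secondary(isNew)];

% Look up the record that occurs in both sources.
adp = metGroupCont(strcmp({metGroupCont.abbreviation}, 'adp'));
```

The `adp` record comes from the primary data, while `amp` is filled in from the secondary data.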
-
createGroupIncidenceMatrix_old(model, gcmOutputFile, gcmMetList, jankowskiGroupData)
Creates groupData struct to calculate reaction Gibbs energies with reduced error in von Bertalanffy (vonB).
Usage
G = createGroupIncidenceMatrix_old(model, gcmOutputFile, gcmMetList, jankowskiGroupData)
Inputs
- model
- gcmOutputFile
- gcmMetList
- jankowskiGroupData
Output
- G
-
dGfzeroGroupContToBiochemical(model, Legendre)
Transforms group contribution estimate of metabolite standard transformed Gibbs energy. Converts group contribution data to biochemical standard transformed Gibbs energy of formation, at specified pH and ionic strength.
Usage
model = dGfzeroGroupContToBiochemical(model, Legendre)
Input
- model –
structure with fields:
- model.mets{m}
- model.metCharges(m)
- model.metFormulas{m}
- model.T - temperature
- model.faradayConstant - Faraday constant
- model.gasConstant - Universal Gas Constant
- model.ph(p) - real pH in compartment defined by letter p
- model.is(p) - ionic strength (0 - 0.35M) in compartment defined by letter p
- model.chi(p) - electrical potential (mV) in compartment defined by letter p
- model.cellCompartments - 1 x # cell array of distinct compartment letters
- model.NaNdfG0GCMetBool - m x 1 boolean vector with 1 when no group contribution data is available for a metabolite (generated in old SetupThermoModel.m only)
Optional input
- Legendre – {(1), 0} Legendre Transformation for specific pH and electrical potential?
Output
- model –
structure with fields:
- model.NaNdfG0GCMetBool - m x 1 boolean vector with 1 when no group contribution data is available for a metabolite
- model.dfG0GroupCont(m) - group contribution estimate (kJ mol^-1)
- model.dfG0GroupContUncertainty(m) - error on group contribution estimate (kJ mol^-1)
- model.dfGt0GroupCont(m) - group contribution estimate +/-Legendre transform (kJ mol^-1)
- model.dfGt0GroupContUncertainty(m) - error on group contribution estimate +/- Legendre transform (kJ mol^-1)
- model.aveHbound(m) - average number of H+ bound
- model.aveZi(m) - average charge
- model.mf(m) - mole fraction of each species within a pseudoisomer group
- model.lambda(m) - activity coefficient
Note
At the moment, the metabolite charges are those for pH 7, so strictly the transformation is valid at pH 7 only.
iAF1260 Supplemental Note: “All delta_f_G_est_0 calculated for the reconstruction using the group contribution method are based upon the standard condition of aqueous solution with pH equal to 7, temperature equal to 298.15 K, zero ionic strength and 1M concentrations of all species except H+, and water. In the cases where multiple charged forms of a molecule exist at pH 7, the most abundant form is used.” Same as Jankowski et al., Biophysical Journal 95:1487-1499 (2008)
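For a single species, the adjustment to a given pH and ionic strength can be sketched with the expression from Alberty's textbook treatment at 298.15 K. This is a minimal illustration, not the toolbox implementation: the species data are hypothetical, the electrical potential term is omitted, and the constants 2.91482 and 1.6 are the values commonly quoted for 298.15 K.

```matlab
% Minimal sketch for a single species at 298.15 K, not the toolbox
% implementation. The electrical potential term is deliberately omitted.
R = 8.314462618e-3;      % gas constant, kJ mol^-1 K^-1
T = 298.15;              % temperature, K
pH = 7;                  % assumed compartment pH
ionicStrength = 0.25;    % assumed compartment ionic strength, M
dfG0 = -1000;            % hypothetical standard Gibbs energy of formation at I = 0, kJ mol^-1
z = -2;                  % hypothetical species charge
nH = 10;                 % hypothetical number of hydrogens in the species

alphaRT = 2.91482;       % Debye-Hueckel coefficient times RT at 298.15 K, kJ mol^-1 M^-1/2
B = 1.6;                 % empirical constant, M^-1/2
isTerm = alphaRT * sqrt(ionicStrength) / (1 + B * sqrt(ionicStrength));

% Legendre transform for pH plus the ionic strength correction.
dfGt0 = dfG0 + nH * R * T * log(10) * pH - (z^2 - nH) * isTerm;
```

With these inputs the pH term dominates: each hydrogen adds about 5.7 kJ mol^-1 per pH unit.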
-
deltaG0concFluxConstraintBounds(model, Legendre, LegendreCHI, gcmOutputFile, gcmMetList, jankowskiGroupData, figures, nStdDevGroupCont)
Sets reaction directionality bounds from thermodynamic data. First pass assignment of reaction directionality based on standard transformed Gibbs energy and concentration bounds.
Usage
model = deltaG0concFluxConstraintBounds(model, Legendre, LegendreCHI, gcmOutputFile, gcmMetList, jankowskiGroupData, figures, nStdDevGroupCont)
Inputs
- model –
structure with fields:
- model.S
- model.SintRxnBool - Boolean indicating internal reactions
- model.gasConstant - gas constant
- model.T - temperature
- model.boundryConc - bounds on concentration of boundary metabolites
- model.dfGt0(m) - standard transformed Gibbs energy of formation(kJ/mol)
- model.dfG0GroupContUncertainty(m) - group. cont. uncertainty in estimate of standard Gibbs energy of formation (kJ/mol)
- model.xmin(m) - lower bound on metabolite concentration
- model.xmax(m) - upper bound on metabolite concentration
- model.metCharges(m) - reconstruction metabolite charge
- model.lb - reconstruction reaction lower bounds
- model.ub - reconstruction reaction upper bounds
- model.chi(p) - electrical potential (mV) in compartment defined by letter p
- Legendre – {(1), 0} Legendre Transformation for specific real pH?
- LegendreCHI – {(1), 0} Legendre Transformation for specific electrical potential?
- gcmOutputFile – Path to output file from Jankowski et al.’s 2008 implementation of the group contribution method.
- gcmMetList – Cell array with metabolite ID for metabolites in gcmOutputFile. Metabolite order must be the same in gcmOutputFile and gcmMetList.
- jankowskiGroupData – Data on groups included in Jankowski et al.’s 2008 implementation of the group contribution method. Included with von Bertalanffy 1.1. Location: .../vonBertalanffy/setupThermoModel/experimentalData/groupContribution/jankowskiGroupData.mat.
Optional inputs
- figures – {1, (0)} 1 = create figures
- nStdDevGroupCont – {real, (1)} number of standard deviations of group contribution uncertainty, 1 means uncertainty given by group contribution method (one standard deviation)
Outputs
- model – structure with fields:
For each metabolite:
- model.xMin
- model.xMax
- model.dfGt0Min
- model.dfGt0Max
- model.dfGtMin
- model.dfGtMax
- model.NaNdfG0MetBool - metabolites without Gibbs Energy
For each reaction:
- model.dGt0Max(n) - molar standard
- model.dGt0Min(n) - molar standard
- model.dGtMax(n)
- model.dGtMin(n)
- model.dGtmMMin(n) - millimolar standard
- model.dGtmMMax(n) - millimolar standard
- model.directionalityThermo(n)
- model.lb_reconThermo - lower bounds from dGtMin/dGtMax and recon directions if thermo data missing
- model.ub_reconThermo - upper bounds from dGtMin/dGtMax and recon directions if thermo data missing
- model.NaNdG0RxnBool - reactions with NaN Gibbs Energy
- model.transportRxnBool - transport reactions
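How concentration bounds translate into a range of transformed reaction Gibbs energies, and from there into a first pass direction, can be sketched for a single reaction. This is a minimal illustration, not the toolbox implementation: it ignores group contribution uncertainty and transport terms, all numbers are hypothetical, and the direction labels are placeholders rather than the values stored in model.directionalityThermo.

```matlab
% Minimal sketch for one reaction, not the toolbox implementation.
R = 8.314462618e-3;                 % gas constant, kJ mol^-1 K^-1
T = 298.15;                         % temperature, K
s = [-1; -1; 1; 1];                 % stoichiometric column: A + B -> C + D
dfGt0 = [-100; -200; -150; -170];   % hypothetical values, kJ mol^-1
xmin = 1e-5 * ones(4, 1);           % lower concentration bounds, M
xmax = 1e-2 * ones(4, 1);           % upper concentration bounds, M

dGt0 = s' * dfGt0;                  % standard transformed reaction Gibbs energy
isSub = s < 0;
isProd = s > 0;

% Lowest dGt: substrates at xmax and products at xmin; highest: the reverse.
dGtMin = dGt0 + R * T * (s(isSub)' * log(xmax(isSub)) + s(isProd)' * log(xmin(isProd)));
dGtMax = dGt0 + R * T * (s(isSub)' * log(xmin(isSub)) + s(isProd)' * log(xmax(isProd)));

% Placeholder labels for the qualitative direction.
if dGtMax < 0
    direction = 'forward';
elseif dGtMin > 0
    direction = 'reverse';
else
    direction = 'reversible';
end
```

In this example the standard value is negative, but the concentration range is wide enough for the sign to change, so the reaction remains reversible.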
-
getGroupVectorFromInchi(inchi, silent, debug)
Usage
group_def = getGroupVectorFromInchi(inchi, silent, debug)
Inputs
- inchi
- silent
- debug – 0: No verbose output, 1: Progress information only (no warnings), 2: Progress and warnings
Output
- group_def
-
modelMetabolitesToSDF(model, InChI)
Write out an SDF, which is effectively a set of mol files concatenated in a flat file with extra data headers.
SDF format spec http://www.symyx.com/downloads/public/ctfile/ctfile.jsp
Usage
model = modelMetabolitesToSDF(model, InChI)
Input
- model – model structure
Optional input
- InChI – m x 2 cell array of InChI strings for each metabolite, InChI{i, 1} is a metabolite abbreviation (no compartment), InChI{i, 2} is a metabolite InChI string
Optional output
- model –
structure with fields:
- model.mets(m).InChI - InChI mapped to model if provided as input
- model.met(m).formulaInChI - Chemical formula as given in InChI
-
molFilesToCDFfile(model, cdfFileName)
Concatenates all the mol files in current folder into a cdf file.
Creates a cdf file, named cdfFileName, out of all the mol files in the current folder. The cdf file can then be used with the web based implementation of the group contribution method to estimate the standard Gibbs energy of formation for a batch of metabolite species. The web-based implementation of this new group contribution method is available free at http://sparta.chem-eng.northwestern.edu/cgi-bin/GCM/WebGCM.cgi. The code checks for a .mol file with the filename prefix given by the abbreviation in the model; therefore, you should name your own mol files accordingly.
Usage
metList = molFilesToCDFfile(model, cdfFileName)
Inputs
- model –
structure with fields:
- model.mets - cell array of metabolite abbreviations corresponding to the mol files
- cdfFileName – name of cdf file
Outputs
- metList
- cdfFileName.cdf – cdf file with all the mol files in order of the model metabolite abbreviations
-
plotConcVSdGft0GroupContUncertainty(modelT)
Compares the difference between minimum and maximum concentration, on a logarithmic scale, and the group contribution uncertainty for each metabolite.
Usage
[D, DGC] = plotConcVSdGft0GroupContUncertainty(modelT)
Input
- modelT –
structure with fields:
- modelT.concMax
- modelT.concMin
- modelT.dfGt0GroupContUncertainty
Outputs
- D
- DGC
- modelT –
structure with fields:
-
secondPassDirectionalityAssignment(model)
Driver to call model specific code to manually generate a physiological model (if first pass does not result in a physiological model).
The second pass directionality assignment needs careful manual curation, since the adjustments necessary to get one organism to grow will not necessarily be the same as the ones which will get another organism to grow. There’s no avoiding manual debugging at this stage.
Usage
[model, solutionThermoRecon, solutionRecon, model1] = secondPassDirectionalityAssignment(model)
Input
- model
Outputs
- model
- solutionThermoRecon
- solutionRecon
- model1
Note
This is the code used for a number of organisms in order to point out the kind of issues that arise. This is NOT supposed to work in the general case.
-
setThermoReactionDirectionalityiAF1260(model, maxFlux, hardCoupleOxPhos)
Second pass assignment of reaction directionality (E. coli specific)
Set the upper and lower bounds for each internal flux based on thermodynamic data where available. The remainder of the reactions without thermodynamic data stay as they were in the reconstruction. To apply this script to a particular stoichiometric model, one would have to modify it manually. At the moment, it is specific to E. coli. The same manual adjustment of reaction directionality made here to get the model to grow, and grow at the rate seen in vivo, may not work for other organisms. Nevertheless, this script outlines the steps needed to identify what needs to be changed to get a model to grow and then to get it to grow at the correct rate. There is currently no automatic substitution for manual curation.
Usage
[modelD, solutionThermoRecon, solutionRecon, model1] = setThermoReactionDirectionalityiAF1260(model, maxFlux, hardCoupleOxPhos)
Input
model – structure with fields:
model.NaNdG0RxnBool - reactions with NaN Gibbs Energy
model.transportRxnBool - transport reactions
model.directions: Reactions that are qualitatively assigned by thermodynamics:
- directions.fwdThermoOnlyBool
- directions.revThermoOnlyBool
- directions.reversibleThermoOnlyBool
subsets of forward qualitative -> reversible quantitative change:
- directions.ChangeForwardReversible_dGfKeq
- directions.ChangeForwardReversibleBool_dGfGC
- directions.ChangeForwardReversibleBool_dGfGC_byConcLHS
- directions.ChangeForwardReversibleBool_dGfGC_byConcRHS
- directions.ChangeForwardReversibleBool_dGfGC_bydGt0
- directions.ChangeForwardReversibleBool_dGfGC_bydGt0LHS
- directions.ChangeForwardReversibleBool_dGfGC_bydGt0Mid
- directions.ChangeForwardReversibleBool_dGfGC_bydGt0RHS
- directions.ChangeForwardReversibleBool_dGfGC_byConc_No_dGt0ErrorLHS
- directions.ChangeForwardReversibleBool_dGfGC_byConc_No_dGt0ErrorRHS
Outputs
- model –
structure with fields:
- model.lb_reconThermo - lower bound
- model.ub_reconThermo - upper bound
- solutionThermoRecon – FBA with thermodynamic directions in preference to reconstruction directions, with exceptions specific to E. coli given below
- solutionRecon – FBA with reconstruction direction
-
webCGMtoTabDelimitedFile(model, webCGMoutputFile, gcmMetList)
Parses a webCGM output file and prepares a tab delimited file with group contribution data mapped to the metabolite abbreviations in the given model.
Parses webCGM output file and creates an input file for createGroupContributionStruct.m
Usage
webCGMtoTabDelimitedFile(model, webCGMoutputFile, gcmMetList)
Inputs
- model –
structure with fields:
- model.S - m x n, stoichiometric matrix
- model.mets - m x 1, cell array of metabolite abbreviations
- model.metFormulas - m x 1, cell array of metabolite formulae
- webCGMoutputFile – filename of output from webCG server
- gcmMetList – m x 1, cell array of metabolite ID for metabolites in webCGMoutputFile. Metabolite order must be the same in gcmMetList and webCGMoutputFile.
Output
- gc_data_webCGM.txt – tab delimited text file with group contribution data for createGroupContributionStruct.m. The first two text columns should correspond to abbreviation and formulaMarvin; the next three columns should correspond to delta_G_formation, delta_G_formation_Uncertainty and chargeMarvin.
Note
By default, group contribution data for metabolites with underdefined formulae (e.g. containing an R group) are ignored, even if group contribution data are available for the metabolite.