XomicsToModel
- XomicsToModel(genericModel, specificData, param)
Given a generic model, this function generates a specific model using constraints generated from multi-omics data specific to a particular cell type or context. Specific data can be transcriptomics, proteomics, metabolomics or literature-based (“bibliomics”) data, or combinations thereof.
- USAGE
[model, modelGenerationReport] = XomicsToModel (genericModel, specificData, param)
- INPUT
genericModel – A generic COBRA model in standard format https://github.com/opencobra/cobratoolbox/blob/master/docs/source/notes/COBRAModelFields.md
* .S – m x n stoichiometric matrix
* .b – m x 1 change in concentration with time
* .csense – m x 1 character array with entries in {L,E,G}
* .mets – m x 1 cell array of metabolite identifiers
* .metNames – m x 1 cell array of metabolite names
* .rxns – n x 1 cell array of reaction identifiers
* .rxnNames – n x 1 cell array of reaction names
* .lb – n x 1 double vector of lower bounds on reaction fluxes
* .ub – n x 1 double vector of upper bounds on reaction fluxes
* .genes – g x 1 cell array of Entrez IDs
* .grRules – n x 1 cell array of Boolean gene-protein-reaction associations, one per reaction
* .rxnGeneMat – n x g matrix with rows corresponding to reactions and columns corresponding to genes
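The dimensions of the fields listed above can be checked on a toy model. The following is a minimal sketch in plain MATLAB, not taken from the COBRA Toolbox; every identifier in it (`A[c]`, `EX_A`, the gene IDs `1111` and `2222`) is invented for illustration.

```matlab
% Toy network: A[c] <=> (exchange), A[c] -> B[c], B[c] -> (exchange).
% All identifiers are placeholders, not entries of a real reconstruction.
genericModel = struct();
% Rows are metabolites (A[c], B[c]); columns are reactions (EX_A, R1, EX_B).
genericModel.S = sparse([-1 -1 0; 0 1 -1]);
[m, n] = size(genericModel.S);
genericModel.b = zeros(m, 1);             % steady state: S*v = b = 0
genericModel.csense = repmat('E', m, 1);  % every mass balance is an equality
genericModel.mets = {'A[c]'; 'B[c]'};
genericModel.metNames = {'Metabolite A'; 'Metabolite B'};
genericModel.rxns = {'EX_A'; 'R1'; 'EX_B'};
genericModel.rxnNames = {'Exchange of A'; 'A to B'; 'Exchange of B'};
genericModel.lb = [-10; 0; 0];            % negative flux on EX_A means uptake
genericModel.ub = [1000; 1000; 1000];
genericModel.genes = {'1111'; '2222'};    % placeholder Entrez IDs
genericModel.grRules = {''; '1111 or 2222'; ''};  % one rule per reaction

% rxnGeneMat(j, k) is nonzero when gene k appears in the rule of reaction j.
g = numel(genericModel.genes);
genericModel.rxnGeneMat = sparse(n, g);
for k = 1:g
    genericModel.rxnGeneMat(:, k) = contains(genericModel.grRules, genericModel.genes{k});
end
```

Matching gene IDs by substring is only safe here because the two placeholder IDs do not contain each other; a real model would need proper parsing of the rules.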
- OPTIONAL INPUTS
- genericModel – A generic COBRA model in standard format with optional fields
.metFormulas: m x 1 cell array of metabolite formulae
.c - n x 1 linear objective coefficient vector
.C - k x n Left hand side of C*v <= d
.d - k x 1 Right hand side of C*v <= d
.dsense - k x 1 character array with entries in {L,E,G}
- .beta - A scalar weight on minimisation of one-norm of internal fluxes. Default 1e-4.
Larger values increase the incentive to find a thermodynamically feasible flux vector in thermoKernel and decrease the incentive to search the steady-state solution space for a flux vector in which certain reactions are active and certain metabolites are present.
specificData: A structure containing context-specific data:
.activeGenes - cell array of Entrez IDs of genes known to be active based on bibliomic data (Default: empty).
.inactiveGenes - cell array of Entrez IDs of genes known to be inactive based on bibliomic data (Default: empty).
.activeReactions - cell array of reaction identifiers known to be active based on bibliomic data (Default: empty).
- .coupledRxns -Table containing information about the coupled reactions. This includes the coupled reaction identifier, the
list of coupled reactions, the coefficients of those reactions, the constraint, the sense or the directionality of the constraint, and the reference (Default: empty).
.essentialAA - cell array of reaction identifiers of exchange reactions denoting essential amino acids (Default: empty).
.exoMet -Table with the fluxes obtained from exometabolomics experiments.
.exoMet.mets: metabolite identifiers
.exoMet.rxns: reaction identifier
.exoMet.rxnNames: reaction name
.exoMet.mean: measured mean flux
.exoMet.SD: standard deviation of the measured flux
.exoMet.units: the flux units
.exoMet.platform: platform used to measure it
.mediaData -Table containing information on metabolomic composition of fresh media (Default: empty)
.mediaData.rxns: cell array of reaction identifiers
.mediaData.mediumMaxUptake: maximum media uptake rate (same units as model.ub, e.g. umol/gDW/h)
.mediaData.constraintDescription: description of each constraint
.presentMetabolites.mets -cell array of metabolites known to be present based on the bibliomics data (Default: empty).
.presentMetabolites.weights -Weights on metabolites known to be present based on the bibliomics data (Default: empty).
.absentMetabolites.mets -cell array of metabolites known to be absent based on the bibliomics data (Default: empty).
.absentMetabolites.weights -Weights on metabolites known to be absent based on the bibliomics data (Default: empty).
.rxns2add: table containing reactions to add to the generic model
.rxns2add.rxns: cell array of reaction identifiers
.rxns2add.rxnFormulas: cell array of reaction formulas
.rxns2add.lb: vector of reaction lower bounds
.rxns2add.ub: vector of reaction upper bounds
.rxns2add.geneRule: gene rules to which the reaction is subject
.rxns2remove.rxns - cell array of reaction identifiers known to be inactive based on bibliomic data (Default: empty).
.rxns2constrain -Table where each row corresponds to a reaction to constrain (Default: empty).
.rxns2constrain.rxns: reaction identifier
.rxns2constrain.lb: lower bound on the reaction flux
.rxns2constrain.ub: upper bound on the reaction flux
.rxns2constrain.constraintDescription: description of each constraint
.rxns2constrain.notes: notes such as references or special cases
.transcriptomicData -Table with transcriptomic data, with one row per gene (Default: empty)
.transcriptomicData.genes - Entrez ID of the gene corresponding to each transcript
.transcriptomicData.expVal - Non-negative transcriptomic expression value, on a linear scale.
.proteomicData - Table of proteomic data, with one row per protein (Default: empty)
.proteomicData.genes: Entrez ID of the gene corresponding to each protein
.proteomicData.expVal: Non-negative abundance of each protein, on a linear scale.
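The table-valued fields of specificData described above can be assembled with MATLAB's built-in `table` function. This is a minimal sketch under stated assumptions: the column names come from the field list above, while every identifier, value, unit string and platform name is invented for illustration.

```matlab
% Placeholder context-specific data; none of these values are real measurements.
specificData = struct();
specificData.activeGenes = {'1111'};        % placeholder Entrez IDs
specificData.inactiveGenes = {'3333'};
specificData.activeReactions = {'R1'};

% One row per gene, expression on a linear (non-negative) scale.
specificData.transcriptomicData = table( ...
    {'1111'; '2222'; '3333'}, [12.5; 3.0; 0.2], ...
    'VariableNames', {'genes', 'expVal'});

% One row per measured exchange flux.
specificData.exoMet = table( ...
    {'B[c]'}, {'EX_B'}, {'Exchange of B'}, 0.8, 0.1, {'umol/gDW/h'}, {'LC-MS'}, ...
    'VariableNames', {'mets', 'rxns', 'rxnNames', 'mean', 'SD', 'units', 'platform'});

% One row per manually constrained reaction.
specificData.rxns2constrain = table( ...
    {'R1'}, 0, 5, {'Cap on conversion of A to B'}, {'illustrative only'}, ...
    'VariableNames', {'rxns', 'lb', 'ub', 'constraintDescription', 'notes'});

% Basic checks before handing the structure to XomicsToModel.
assert(all(specificData.transcriptomicData.expVal >= 0), 'expVal must be non-negative');
assert(all(specificData.rxns2constrain.lb <= specificData.rxns2constrain.ub), 'lb must not exceed ub');
```

Fields that are left out are documented above as defaulting to empty, so only the data actually available needs to be supplied.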
- param: a structure containing the parameters for the function:
.printLevel - Level of verbosity of the printed output (Default: 0).
.debug -Logical, should the function save its progress for debugging (Default: false).
.addCoupledRxns -Logical, determines if the flux of reactions specified in specificData.coupledRxns should be coupled (Default: true).
.addSinksexoMet - Logical, should sink reactions be added for metabolites detected in exometabolomic data (if no exchange or sink is already present).
- .activeGenesApproach - String with the name of the active genes approach to be used
‘oneRxnPerActiveGene’ - adds at least one reaction per active gene (Default)
‘allRxnPerActiveGene’ - adds all reactions corresponding to an active gene (generates a larger model)
.TolMaxBoundary -The reaction boundary’s maximum value (Default: 1000)
.TolMinBoundary -The reaction boundary’s minimum value (Default: -1000)
- .boundPrecisionLimit - Precision of flux estimate. If the absolute value of the lower bound or the upper bound is lower
than boundPrecisionLimit but higher than 0, the value will be set to boundPrecisionLimit (Default: primal feasibility tolerance x 10).
.closeIons -Logical, it determines whether or not ion exchange reactions are closed. (Default: false).
.closeUptakes -Logical, decide whether or not all of the uptakes in the draft model will be closed (Default: false).
.uptakeSign -Sign for uptakes (Default: -1).
.diaryFilename -The location where a diary will be printed with the function output. (Default: 0).
.fluxCCmethod - String with the name of the algorithm to be used for the flux consistency check (Possible options: ‘swiftcc’, ‘fastcc’ or ‘dc’, Default: ‘fastcc’).
- .modelExtractionAlgorithm - Model extraction algorithm used to extract the context-specific model
‘thermoKernel’ (Default)
‘fastCore’
.fluxEpsilon -Minimum non-zero flux value accepted for tolerance (Default: Primal feasibility tolerance X 10).
.thermoFluxEpsilon -Flux epsilon used in ‘thermoKernel’ (Default: Primal feasibility tolerance X 10).
.findThermoConsistentFluxSubset - True to identify largest thermodynamically flux consistent set before extracting a subset with thermoKernel (Default: true)
.weightsFromOmics - True to use weights derived from transcriptomic data when biasing inclusion of reactions with thermoKernel (Default: true)
.curationOverOmics -True to use literature curated data with priority over other omics data (Default: false).
.activeOverInactive -True to use active data with priority over inactive data (Default: false).
.inactiveGenesTranscriptomics - Logical, indicate if inactive genes in the transcriptomic analysis should be added to the list of inactive genes (Default: true).
.transcriptomicThreshold - Logarithmic scale transcriptomic cutoff threshold for determining whether or not a gene is active (Default: 0).
.thresholdP -Logarithmic scale proteomic cutoff threshold for determining whether or not a gene is active (Default: 0).
.growthMediaBeforeReactionRemoval - Logical, should the growth media data be added before the model extraction (Default: true).
.metabolomicsBeforeExtraction - Logical, should the metabolomics data be added before the model extraction (Default: true).
- .boundsToRelaxExoMet - String indicating the type of bounds allowed to be relaxed when fitting metabolomic data
‘all’ - allow bounds on all reactions to be relaxed
‘both’ - allow both lower and upper bounds on reactions corresponding to specificData.exoMet.rxns to be relaxed (Default)
‘upper’ - allow only upper bounds on reactions corresponding to specificData.exoMet.rxns to be relaxed
‘lower’ - allow only lower bounds on reactions corresponding to specificData.exoMet.rxns to be relaxed
.metabolomicWeights -String indicating the type of weights to be applied for metabolomics fitting (Possible options: ‘SD’, ‘mean’ and ‘RSD’; Default: ‘SD’)
- .nonCoreSinksDemands - String indicating the type of sink or demand reaction to close
(Possible options: ‘closeReversible’, ‘closeForward’, ‘closeReverse’, ‘closeAll’ and ‘closeNone’; Default: ‘closeNone’).
.relaxOptions -A structure array with the relaxation options if the problem becomes infeasible, see relaxedFBA.m
.relaxOptions.steadyStateRelax (Default: param.relaxOptions.steadyStateRelax = 0).
.relaxOptions.printLevel (Default: set to param.printLevel)
.setObjective - Linear objective function to optimise (Default: none).
- .biomassRxn -The biomass reaction that represents the growth capacity of cells
Possible options for Recon3: ‘biomass_reaction’ (Default: empty)
- .maintenanceRxn -The biomass maintenance reaction that represents the turnover and update capacity of cells
(Possible options for Recon3: ‘biomass_maintenance’, ‘biomass_maintenance_noTrTr’) (Default: empty)
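The string-valued parameters above accept only a fixed set of options, so it can help to validate a param structure before a long model-generation run. The following is a minimal sketch in plain MATLAB; the `allowed` lookup structure and the validation loop are hypothetical helpers written for this example, not part of XomicsToModel, and the option lists are copied from the descriptions above.

```matlab
% Parameters set explicitly; anything omitted falls back to the documented defaults.
param = struct();
param.printLevel = 1;
param.activeGenesApproach = 'oneRxnPerActiveGene';
param.fluxCCmethod = 'fastcc';
param.modelExtractionAlgorithm = 'thermoKernel';
param.boundsToRelaxExoMet = 'both';
param.metabolomicWeights = 'SD';
param.nonCoreSinksDemands = 'closeNone';
param.TolMinBoundary = -1000;
param.TolMaxBoundary = 1000;
param.curationOverOmics = false;

% Hypothetical lookup of permitted strings, taken from the option lists above.
allowed = struct( ...
    'activeGenesApproach',      {{'oneRxnPerActiveGene', 'allRxnPerActiveGene'}}, ...
    'fluxCCmethod',             {{'swiftcc', 'fastcc', 'dc'}}, ...
    'modelExtractionAlgorithm', {{'thermoKernel', 'fastCore'}}, ...
    'boundsToRelaxExoMet',      {{'all', 'both', 'upper', 'lower'}}, ...
    'metabolomicWeights',       {{'SD', 'mean', 'RSD'}}, ...
    'nonCoreSinksDemands',      {{'closeReversible', 'closeForward', 'closeReverse', 'closeAll', 'closeNone'}});

% Stop early on a misspelt option instead of failing partway through extraction.
optionNames = fieldnames(allowed);
for i = 1:numel(optionNames)
    name = optionNames{i};
    if ~ismember(param.(name), allowed.(name))
        error('Unsupported value ''%s'' for param.%s', param.(name), name);
    end
end

% With the COBRA Toolbox on the path, the documented call would then be:
% [model, modelGenerationReport] = XomicsToModel(genericModel, specificData, param);
```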
- OUTPUTS
model –
- A Context-specific COBRA model with the following fields (the
content of the variables specificData and param influences the generation of new fields):
- .activeInactiveRxn - n x 1 vector indicating if a reaction is designated
as present (1), absent (-1) or added by XomicsToModel (0).
.alpha1 - thermoKernel parameter (step 20).
.beta - thermoKernel parameter (step 20).
.C - The constraint matrix containing coefficients for coupled reactions (step 12).
.coupledRxnIdxs - Vector containing the indexes of the coupled reactions (step 12).
.coupledRxns - IDs of the coupled reactions (step 12).
.ctrs - The constraint IDs for coupled reactions (step 12).
.d - The constraint right hand side values for coupled reactions (step 12).
.delta0 - thermoKernel parameter (step 20).
.delta1 - thermoKernel parameter (step 20).
- .dsense - The constraint sense for coupled reactions (‘L’: <=, ‘G’: >=, ‘E’: =), or a vector of senses
for multiple constraints (Default: ‘L’) (step 12).
- .dummyMetBool - m x 1 boolean vector indicating dummy metabolites
i.e. contains(model.mets,'dummy_Met_') (step 19).
- .dummyRxnBool - n x 1 boolean vector indicating dummy reactions
i.e. contains(model.rxns,'dummy_Rxn_') (step 19).
- .exometRelaxation - Struct array identifying the reactions where the
bounds are relaxed (step 10/22)
.exometRelaxationObj - Flux fitting used (step 10/22)
- .expressionRxns - n x 1 non-negative value for reaction expression,
corresponding to model.rxns. expressionRxns(j) is NaN when there is no expression data for the genes corresponding to reaction j (step 6).
- .fluxConsistentMetBool - m x 1 boolean vector indicating flux
consistent metabolites.
- .fluxConsistentRxnBool - n x 1 boolean vector indicating flux
consistent reactions.
- .fluxInConsistentMetBool - m x 1 boolean vector indicating flux
inconsistent metabolites.
- .fluxInConsistentRxnBool - n x 1 boolean vector indicating flux
inconsistent reactions.
- .forcedIntRxnBool - n x 1 boolean vector indicating the internal
reactions that are thermodynamically forced (step 20).
- .geneExpVal - Vector containing the corresponding expression value for each gene
(FPKM/RPKM; step 6).
.lambda0 - thermoKernel parameter (step 20).
.lambda1 - thermoKernel parameter (step 20).
- .lb_preconditioned - n x 1 vector containing the old lower bounds
prior to the media constraints (step 10/22).
- .ub_preconditioned - n x 1 vector containing the old upper bounds
prior to the media constraints (step 10/22).
- .lbpreSinkDemandOff - n x 1 vector with the original lower bounds
before closing sink and demand reactions.
- .ubpreSinkDemandOff - n x 1 vector with the original upper bounds
before closing sink and demand reactions.
.metRemoveBool - m x 1 boolean vector of metabolites removed to form stoichConsistModel.
.rxnRemoveBool - n x 1 boolean vector of reactions removed to form stoichConsistModel.
.metUnknownInconsistentRemoveBool - m x 1 boolean vector indicating removed mets
.rxnUnknownInconsistentRemoveBool - n x 1 boolean vector indicating removed rxns
.presentAbsentMet - m x 1 vector indicating if a metabolite is designated as present (1), absent (-1) or added by XomicsToModel (0).
.SInConsistentMetBool - m x 1 boolean vector indicating inconsistent mets.
.SInConsistentRxnBool - n x 1 boolean vector indicating inconsistent rxns.
.relaxationUsed - Logical value indicating if the model was relaxed during XomicsToModel.
.rxnFormulas - n x 1 cell array containing the formulas of the reactions.
.unknownSConsistencyMetBool - m x 1 boolean vector indicating metabolites of unknown stoichiometric consistency (all zeros when the algorithm converged perfectly).
.unknownSConsistencyRxnBool - n x 1 boolean vector indicating reactions of unknown stoichiometric consistency (all zeros when the algorithm converged perfectly).
.XomicsToModelParam - Parameters used to generate the model.
.XomicsToModelSpecificData - Context-specific data used to generate the model.
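The indicator fields above lend themselves to logical indexing when summarising an extracted model. Below is a minimal sketch that uses a hand-made stand-in for the output structure rather than a real XomicsToModel result; the reaction identifiers and indicator values are invented.

```matlab
% Stand-in for a model returned by XomicsToModel (values invented for illustration).
model = struct();
model.rxns = {'EX_A'; 'R1'; 'R2'; 'EX_B'};
model.activeInactiveRxn = [1; 1; 0; -1];   % 1 present, -1 absent, 0 added by XomicsToModel
model.fluxConsistentRxnBool = [true; true; true; false];

% Logical indexing pulls out each group of reactions.
designatedPresent = model.rxns(model.activeInactiveRxn == 1);
designatedAbsent  = model.rxns(model.activeInactiveRxn == -1);
addedByAlgorithm  = model.rxns(model.activeInactiveRxn == 0);
fluxInconsistent  = model.rxns(~model.fluxConsistentRxnBool);

fprintf('%d designated present, %d designated absent, %d added, %d flux inconsistent\n', ...
    numel(designatedPresent), numel(designatedAbsent), ...
    numel(addedByAlgorithm), numel(fluxInconsistent));
```

The same pattern applies to the metabolite-level fields such as .presentAbsentMet and .fluxConsistentMetBool, indexed against model.mets.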
Requires the COBRA Toolbox and a linear optimisation solver (e.g. Gurobi) to be installed.
2023 German Preciat, Agnieszka Wegrzyn, Xi Luo, Ronan Fleming
- XomicsToMultipleModels(modelGenerationConditions, param)
Generates multiple models by running the XomicsToModel function under variations of its conditions.
- USAGE
directories = XomicsToMultipleModels (modelGenerationConditions, param)
- INPUT
modelGenerationConditions – Options to vary or to save the data
- .activeGenesApproach - The different approaches to identify the active
genes (Possible options: ‘allRxnPerActiveGene’ and ‘oneRxnPerActiveGene’; default: ‘oneRxnPerActiveGene’);
- .boundsToRelaxExoMet - The type of bounds that can be relaxed: upper bounds (‘u’),
lower bounds (‘l’) or both (‘b’) (Possible options: ‘u’, ‘l’ and ‘b’; default: ‘b’);
- .closeIons - Indicate whether the ions are open or closed (Possible options:
true and false; default: false);
- .cobraSolver - Optimisation solvers supported by the function. Possible
options: ‘glpk’, ‘gurobi’, ‘ibm_cplex’, ‘matlab’; default: ‘gurobi’;
- .curationOverOmics - indicates whether curated data should take priority
over omics data ; default: false;
.genericModel - Generic COBRA model(s)
- .inactiveGenesTranscriptomics - Use inactive transcriptomic genes or not
(Possible options: true and false; default: false);
.specificData - Specific data variations (Default: empty)
.limitBounds - Boundary on the model (Default: 1000).
- .metabolomicsBeforeExtraction - Indicate whether the metabolomic
data is included before or after the extraction (Possible options: true and false; default: true);
- .tissueSpecificSolver - Extraction solver (Possible options: ‘fastCore’ and
‘thermoKernel’; default: ‘thermoKernel’)
- .outputDir - Directory where the models will be generated (Default: current
directory)
- .transcriptomicThreshold - Transcriptomic thresholds that are defined by the
user (Default: log2(2));
param: Variable with fixed parameters (Default: empty struct array)
- OUTPUTS
directories - Array with the names of the new directories
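A modelGenerationConditions structure for the options above might be assembled as in the following sketch. Two points are assumptions not stated in this page: that each varied field holds the list of values to try (a cell array of strings or a numeric or logical array), and that one model is generated per combination of values. The combination count at the end is only meaningful if both hold.

```matlab
% Assumption: each field lists the alternatives to explore for that option.
modelGenerationConditions = struct();
modelGenerationConditions.activeGenesApproach = {'oneRxnPerActiveGene', 'allRxnPerActiveGene'};
modelGenerationConditions.tissueSpecificSolver = {'thermoKernel', 'fastCore'};
modelGenerationConditions.closeIons = [true false];
modelGenerationConditions.transcriptomicThreshold = [0 1 2];
modelGenerationConditions.outputDir = pwd;   % documented default is the current directory

% Assumption: a full factorial design, one model per combination of values.
variedFields = {'activeGenesApproach', 'tissueSpecificSolver', ...
    'closeIons', 'transcriptomicThreshold'};
nCombinations = 1;
for i = 1:numel(variedFields)
    nCombinations = nCombinations * numel(modelGenerationConditions.(variedFields{i}));
end
fprintf('Up to %d models would be requested\n', nCombinations);

% With the COBRA Toolbox on the path, the documented call would then be:
% directories = XomicsToMultipleModels(modelGenerationConditions, param);
```

Estimating the number of combinations up front is a cheap way to judge run time and disk usage before starting, since each model is written to its own directory.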