Example use of functions listed in the Standard operating procedure for metabolic reconstruction.

Author(s): Ines Thiele, LCSB, University of Luxembourg.
Reviewer(s): Catherine Clancy, Stefania Magnusdottir, LCSB, University of Luxembourg.

INTRODUCTION

This tutorial has been adapted and expanded from the protocol for generating metabolic network reconstruction [1].
This tutorial will illustrate the example input for The COBRA Toolbox functions applied in the steps 38 to 96 of the protocol workflow, using the E. coli core reconstruction [2] as the model of choice.
When Matlab/Toolbox commands are following by a ‘;’ the output is not printed. Omitting the ‘;’ invokes printing of the variable content.

EQUIPMENT SETUP

Step 38. Initialize The COBRA Toolbox.

Initialize The COBRA Toolbox using the initCobraToolbox function.
initCobraToolbox(false) % false, as we don't want to update
_____ _____ _____ _____ _____ | / ___| / _ \ | _ \ | _ \ / ___ \ | COnstraint-Based Reconstruction and Analysis | | | | | | | |_| | | |_| | | |___| | | The COBRA Toolbox - 2017 | | | | | | | _ { | _ / | ___ | | | |___ | |_| | | |_| | | | \ \ | | | | | Documentation: \_____| \_____/ |_____/ |_| \_\ |_| |_| | http://opencobra.github.io/cobratoolbox | > Checking if git is installed ... Done. > Checking if the repository is tracked using git ... Done. > Checking if curl is installed ... Done. > Checking if remote can be reached ... Done. > Initializing and updating submodules ... Done. > Adding all the files of The COBRA Toolbox ... Done. > Define CB map output... set to svg. > Retrieving models ... Done. > TranslateSBML is installed and working properly. > Configuring solver environment variables ... - [---*] ILOG_CPLEX_PATH: C:\Program Files\IBM\ILOG\CPLEX_Studio1263\cplex\matlab\x64_win64 - [----] GUROBI_PATH : --> set this path manually after installing the solver ( see instructions ) - [---*] TOMLAB_PATH: C:\Program Files\tomlab\ - [----] MOSEK_PATH : --> set this path manually after installing the solver ( see instructions ) Done. > Checking available solvers and solver interfaces ... Done. > Setting default solvers ... Done. > Saving the MATLAB path ... Done. - The MATLAB path was saved in the default location. > Summary of available solvers and solver interfaces Support LP MILP QP MIQP NLP ---------------------------------------------------------------------- cplex_direct full 0 0 0 0 - dqqMinos full 0 - - - - glpk full 1 1 - - - gurobi full 1 1 1 1 - ibm_cplex full 0 0 0 - - matlab full 1 - - - 1 mosek full 0 0 0 - - pdco full 1 - 1 - - quadMinos full 0 - - - 0 tomlab_cplex full 1 1 1 1 - qpng experimental - - 1 - - tomlab_snopt experimental - - - - 1 gurobi_mex legacy 0 0 0 0 - lindo_old legacy 0 - - - - lindo_legacy legacy 0 - - - - lp_solve legacy 1 - - - - opti legacy 0 0 0 0 0 ---------------------------------------------------------------------- Total - 6 3 4 2 2 + Legend: - = not applicable, 0 = solver not compatible or not installed, 1 = solver installed. > You can solve LP problems using: 'glpk' - 'gurobi' - 'matlab' - 'pdco' - 'tomlab_cplex' - 'lp_solve' > You can solve MILP problems using: 'glpk' - 'gurobi' - 'tomlab_cplex' > You can solve QP problems using: 'gurobi' - 'pdco' - 'tomlab_cplex' - 'qpng' > You can solve MIQP problems using: 'gurobi' - 'tomlab_cplex' > You can solve NLP problems using: 'matlab' - 'tomlab_snopt' > Checking for available updates ... --> You cannot update your fork using updateCobraToolbox(). [f4a806 @ TutorialReview-SOP]. Please use the MATLAB.devTools (https://github.com/opencobra/MATLAB.devTools).

Setting the optimization solver.

This tutorial will be run with the 'glpk' package, which is a linear programming ('LP') solver. The 'glpk' package does not require additional instalation and configuration.
solverName = 'glpk';
solverType = 'LP';
changeCobraSolver(solverName, solverType);
However, for the analysis of large models, such as Recon 3, it is not recommended to use the 'glpk' package but rather an industrial strength solver, such as the 'gurobi' package.
A solver package may offer different types of optimization programmes to solve a problem. The above example used a LP optimization, other types of optimization programmes include; mixed-integer linear programming ('MILP'), quadratic programming ('QP'), and mixed-integer quadratic programming ('MIQP').
warning off MATLAB:subscripting:noSubscriptsSpecified

PROCEDURE

Step 39. Load reconstruction into Matlab.

The metabolic network reconstruction, containing a reaction and metabolite list, is contained by the file 'Ecoli_core_model.mat'.
modelFileName = 'ecoli_core_model.mat';
modelDirectory = getDistributedModelFolder(modelFileName); %Look up the folder for the distributed Models.
modelFileName= [modelDirectory filesep modelFileName]; % Get the full path. Necessary to be sure, that the right model is loaded
model = readCbModel(modelFileName);
modelEcore = model;
The reconstruction is contained in the resulting model structure.
The figure above shows the data contained in the different structure fields. We will use this model structure for all consequent computation if not noted differently.
Use the command open to view the model structure in Matlab.
if usejava('desktop') % This line of code is to avoid execution of example in non gui-environments
open modelEcore
end
The content of the structure can be assessed as follows:
modelEcore.mets{8}
ans = acald[c]
modelEcore.rxns{1}
ans = ACALD
modelEcore.S(8,1)
ans =
(1,1) -1
printRxnFormula(modelEcore, modelEcore.rxns(1))
ACALD acald[c] + coa[c] + nad[c] <=> accoa[c] + h[c] + nadh[c]
ans =
'acald[c] + coa[c] + nad[c] <=> accoa[c] + h[c] + nadh[c] '
modelEcore.lb(5) = 10
modelEcore =
rxns: {95×1 cell} mets: {72×1 cell} S: [72×95 double] rev: [95×1 double] lb: [95×1 double] ub: [95×1 double] c: [95×1 double] rules: {95×1 cell} genes: {137×1 cell} rxnGeneMat: [95×137 double] grRules: {95×1 cell} subSystems: {95×1 cell} rxnNames: {95×1 cell} metNames: {72×1 cell} metFormulas: {72×1 cell} b: [72×1 double] description: 'Ecoli_core_model'
A note:
modelEcore.newField = 'ABC – a note'
modelEcore =
rxns: {95×1 cell} mets: {72×1 cell} S: [72×95 double] rev: [95×1 double] lb: [95×1 double] ub: [95×1 double] c: [95×1 double] rules: {95×1 cell} genes: {137×1 cell} rxnGeneMat: [95×137 double] grRules: {95×1 cell} subSystems: {95×1 cell} rxnNames: {95×1 cell} metNames: {72×1 cell} metFormulas: {72×1 cell} b: [72×1 double] description: 'Ecoli_core_model' newField: 'ABC – a note'
An array B = [1 2 3]
B = [1 2 3];
modelEcore.newField = B
modelEcore =
rxns: {95×1 cell} mets: {72×1 cell} S: [72×95 double] rev: [95×1 double] lb: [95×1 double] ub: [95×1 double] c: [95×1 double] rules: {95×1 cell} genes: {137×1 cell} rxnGeneMat: [95×137 double] grRules: {95×1 cell} subSystems: {95×1 cell} rxnNames: {95×1 cell} metNames: {72×1 cell} metFormulas: {72×1 cell} b: [72×1 double] description: 'Ecoli_core_model' newField: [1 2 3]
ListStrings = {'A' 'B' 'C'}
ListStrings =
'A' 'B' 'C'
ListNumbers = [1 2 3]
ListNumbers =
1 2 3
ListTranspose = ListNumbers'
ListTranspose =
1 2 3
rxnList = 'ATPM';
rxnID = findRxnIDs(modelEcore, rxnList)
rxnID = 11

Step 40. Verify S matrix.

Remember that in the S matrix the rows and the columns correspond to the metabolites and the reactions of the model, respectively. The number of non-zero (nz) entries in the S matrix is visualized graphically below using a spy image.
spy(modelEcore.S)
To put this number into perspective, we can calculate the percentage of non-zero entries of S.
[a, b] = size(modelEcore.S);
nz = nnz(modelEcore.S);
Perc_nz = nz*100/(a*b)
Perc_nz = 5.2632
Here are two further examples of S matrices visualized using a spy image:
E. coli - AF1260 [3]
Geobacter sulfurreducans [4]
Consider to also use the tutorial on 'Numerical properties of a reconstruction' to investigate the propeties of the S matrix.

Step 41. Set objective function.

We will set the biomass reaction (Biomass_Ecoli_core_w_GAM) of the E. coli core model as objective function:
modelEcore = changeObjective(modelEcore, 'Biomass_Ecoli_core_w_GAM');
If you wish to check which reaction(s) make up the objective function and its components, use the following function:
objectiveAbbr = checkObjective(modelEcore)
summaryT =
Coefficient Metabolite metID Reaction RxnID ___________ __________ _____ ________________________ _____ -1.496 3pg[c] 3 Biomass_Ecoli_core_w_GAM 13 -3.7478 accoa[c] 10 Biomass_Ecoli_core_w_GAM 13 59.81 adp[c] 13 Biomass_Ecoli_core_w_GAM 13 4.1182 akg[c] 14 Biomass_Ecoli_core_w_GAM 13 -59.81 atp[c] 17 Biomass_Ecoli_core_w_GAM 13 3.7478 coa[c] 21 Biomass_Ecoli_core_w_GAM 13 -0.361 e4p[c] 23 Biomass_Ecoli_core_w_GAM 13 -0.0709 f6p[c] 26 Biomass_Ecoli_core_w_GAM 13 -0.129 g3p[c] 33 Biomass_Ecoli_core_w_GAM 13 -0.205 g6p[c] 34 Biomass_Ecoli_core_w_GAM 13 -0.2557 gln-L[c] 36 Biomass_Ecoli_core_w_GAM 13 -4.9414 glu-L[c] 38 Biomass_Ecoli_core_w_GAM 13 -59.81 h2o[c] 41 Biomass_Ecoli_core_w_GAM 13 59.81 h[c] 43 Biomass_Ecoli_core_w_GAM 13 -3.547 nad[c] 50 Biomass_Ecoli_core_w_GAM 13 3.547 nadh[c] 51 Biomass_Ecoli_core_w_GAM 13 13.028 nadp[c] 52 Biomass_Ecoli_core_w_GAM 13 -13.028 nadph[c] 53 Biomass_Ecoli_core_w_GAM 13 -1.7867 oaa[c] 58 Biomass_Ecoli_core_w_GAM 13 -0.5191 pep[c] 59 Biomass_Ecoli_core_w_GAM 13 59.81 pi[c] 60 Biomass_Ecoli_core_w_GAM 13 -2.8328 pyr[c] 62 Biomass_Ecoli_core_w_GAM 13 -0.8977 r5p[c] 66 Biomass_Ecoli_core_w_GAM 13
objectiveAbbr =
'Biomass_Ecoli_core_w_GAM'

Step 42. Set simulation constraints.

A first step when using and debugging a model should be to identify the constraints that have been applied.
minInf = -1000;
maxInf = 1000;
printConstraints(modelEcore, minInf, maxInf)
MinConstraints: ACONTb 10 ATPM 8.39 EX_glc(e) -10 maxConstraints:
To know which medium constraints are applied to the model, we can use the following function:
printUptakeBound(modelEcore);
EX_co2(e) -1000 EX_glc(e) -10 EX_h(e) -1000 EX_h2o(e) -1000 EX_nh4(e) -1000 EX_o2(e) -1000 EX_pi(e) -1000
Let’s assume that you would like to set the lower bound of the ATP maintenance reaction (‘ATPM’) to 8.39 mmol/gDW/h:
modelEcore = changeRxnBounds(modelEcore, 'ATPM', 8.39, 'l');
and the upper bound of the ‘ATPM’ reaction to 8.39 mmol/gDW/h:
modelEcore = changeRxnBounds(modelEcore, 'ATPM', 8.39, 'u');
To change both the lower and upper bound in the same command use:
modelEcore = changeRxnBounds(modelEcore, 'ATPM', 8.39, 'b');
Let’s assume that you would like to set the lower bound of the ‘ATPM’ reaction to 8.39 mmol/gDW/h and the ATP synthetase (‘ATPS4r’) to an upper bound of 100 mmol/gDW/h:
modelEcore = changeRxnBounds(modelEcore, 'ATPM', 8.39, 'l' );
modelEcore = changeRxnBounds(modelEcore, 'ATPS4r', 100, 'u');
The set constraints can be checked using the following function:
printConstraints(modelEcore, -1000,1000)
MinConstraints: ACONTb 10 ATPM 8.39 EX_glc(e) -10 maxConstraints: ATPM 8.39 ATPS4r 100

Steps 43 - 44. Test if network is mass- and charge balanced.

Check mass- and charge balance for the entire network of the model:
[massImbalance, imBalancedMass, imBalancedCharge, imBalancedRxnBool, Elements, missingFormulaeBool, balancedMetBool] = checkMassChargeBalance(modelEcore);

Step 45. Identify metabolic dead-ends

Detect deadend metabolites:
outputMets = detectDeadEnds(modelEcore)
outputMets =
30 32 37 49
Print the corresponding metabolite names:
DeadEnds = modelEcore.mets(outputMets)
DeadEnds =
'fru[e]' 'fum[e]' 'gln-L[e]' 'mal-L[e]'
Identify associated reactions:
[rxnList, rxnFormulaList] = findRxnsFromMets(modelEcore, DeadEnds)
rxnList =
'EX_fru(e)' 'EX_fum(e)' 'EX_gln_L(e)' 'EX_mal_L(e)' 'FRUpts2' 'FUMt2_2' 'GLNabc' 'MALt2_2'
rxnFormulaList =
'fru[e] -> ' 'fum[e] -> ' 'gln-L[e] -> ' 'mal-L[e] -> ' 'fru[e] + pep[c] -> f6p[c] + pyr[c] ' 'fum[e] + 2 h[e] -> fum[c] + 2 h[c] ' 'atp[c] + gln-L[e] + h2o[c] -> adp[c] + gln-L[c] + h[c] + pi[c] ' '2 h[e] + mal-L[e] -> 2 h[c] + mal-L[c] '
Let's have a look at the lower and upper bounds of these reactions:
modelEcore.lb(find(ismember(modelEcore.rxns, rxnList)))
ans =
0 0 0 0 0 0 0 0
modelEcore.ub(find(ismember(modelEcore.rxns, rxnList)))
ans =
1000 1000 1000 1000 1000 1000 1000 1000
modelEcore_New = modelEcore;
modelEcore_New.lb(strmatch('EX_', modelEcore_New.rxns)) = -1000;
modelEcore_New.ub(strmatch('EX_', modelEcore_New.rxns)) = 1000;
Now we repeat the identification of deadend metabolites:
outputMets = detectDeadEnds(modelEcore_New)
outputMets = 0×1 empty double column vector
DeadEnds = modelEcore_New.mets(outputMets)
DeadEnds = 0×1 empty cell array
[rxnList, rxnFormulaList] = findRxnsFromMets(modelEcore_New, DeadEnds)
rxnList = 0×1 empty cell array
rxnFormulaList = 0×0 empty cell array

Steps 46 - 49. Refer to 'gap analysis' tutorial.

Step 50. Set exchange constraints for a simulation condition.

The COBRA Toolbox function, changeRxnBounds,is used to set constraints on a reaction which is demonstrated in step 67 of this tutorial.

Steps 51 - 59. Test for stoichiometrically balanced cycles (SBCs).

SBCs or Type III pathways, are formed by internal network reactions and can carry fluxes despite closed exchange reactions (closed system). Therefore, use the following testForTypeIIIPathways function. The function is only available for windows systems and requires the X3 program from the UCSD Systems Biology Research Group homepage :
selExc = findExcRxns(modelEcore);
listExch = find(selExc);
try
testForTypeIIIPathways(modelEcore, listExch, 'test');
catch ME
getReport(ME)
end
No Type III pathways can be found in model.rm: cannot remove 'test_myT3_Sprs.txttest_myT3.txttest_myRxnMet.txt': No such file or directory /bin/bash: del: command not found

Steps 60 - 66. Test if biomass precursors can be produced in standard medium

%TODO steps 64 to 66 are not referenced in this section.
60. Obtain a list of biomass components:
BiomassComponents = modelEcore.mets(find(modelEcore.S(:, strmatch('Biomass', modelEcore.rxns))))
BiomassComponents =
'3pg[c]' 'accoa[c]' 'adp[c]' 'akg[c]' 'atp[c]' 'coa[c]' 'e4p[c]' 'f6p[c]' 'g3p[c]' 'g6p[c]' 'gln-L[c]' 'glu-L[c]' 'h2o[c]' 'h[c]' 'nad[c]' 'nadh[c]' 'nadp[c]' 'nadph[c]' 'oaa[c]' 'pep[c]' 'pi[c]' 'pyr[c]' 'r5p[c]'
61. Add a demand function for each biomass precursor:
[modelEcore_NEW, rxnNames] = addDemandReaction(modelEcore, BiomassComponents);
DM_3pg[c] 3pg[c] -> DM_accoa[c] accoa[c] -> DM_adp[c] adp[c] -> DM_akg[c] akg[c] -> DM_atp[c] atp[c] -> DM_coa[c] coa[c] -> DM_e4p[c] e4p[c] -> DM_f6p[c] f6p[c] -> DM_g3p[c] g3p[c] -> DM_g6p[c] g6p[c] -> DM_gln-L[c] gln-L[c] -> DM_glu-L[c] glu-L[c] -> DM_h2o[c] h2o[c] -> DM_h[c] h[c] -> DM_nad[c] nad[c] -> DM_nadh[c] nadh[c] -> DM_nadp[c] nadp[c] -> DM_nadph[c] nadph[c] -> DM_oaa[c] oaa[c] -> DM_pep[c] pep[c] -> DM_pi[c] pi[c] -> DM_pyr[c] pyr[c] -> DM_r5p[c] r5p[c] ->
For each biomass component i, perform the following test:
for i = 1 : length(BiomassComponents)
62. Change objective function to the demand function:
modelEcore_NEW = changeObjective(modelEcore_NEW, rxnNames(i));
63. Maximize (‘max’) for new objective function (Demand function)
FBAsolution = optimizeCbModel(modelEcore_NEW, 'max');
BiomassComponentsValue(i,1) = FBAsolution.f;
end
[BiomassComponents num2cell(BiomassComponentsValue)]
ans =
'3pg[c]' [ 10.0000] 'accoa[c]' [-1.1313e-16] 'adp[c]' [-1.7764e-15] 'akg[c]' [ 10.0000] 'atp[c]' [ 0] 'coa[c]' [ 3.4642e-15] 'e4p[c]' [ 7.5000] 'f6p[c]' [ 5.0000] 'g3p[c]' [ 10] 'g6p[c]' [ 5.0000] 'gln-L[c]' [ 10] 'glu-L[c]' [ 10.0000] 'h2o[c]' [ 1000] 'h[c]' [ 1000] 'nad[c]' [ 8.3105e-16] 'nadh[c]' [ 2.9009e-27] 'nadp[c]' [ 9.1312e-29] 'nadph[c]' [-1.1583e-16] 'oaa[c]' [ 10.0000] 'pep[c]' [ 10.0000] 'pi[c]' [ 1.0000e+03] 'pyr[c]' [ 10.0000] 'r5p[c]' [ 6.0000]
printConstraints(modelEcore_NEW, -1000,1000)
MinConstraints: ACONTb 10 ATPM 8.39 EX_glc(e) -10 maxConstraints: ATPM 8.39 ATPS4r 100
modelEcore = changeObjective(modelEcore, modelEcore.rxns(strmatch('Biomass', modelEcore.rxns)));
[objectiveAbbr] = checkObjective(modelEcore)
summaryT =
Coefficient Metabolite metID Reaction RxnID ___________ __________ _____ ________________________ _____ -1.496 3pg[c] 3 Biomass_Ecoli_core_w_GAM 13 -3.7478 accoa[c] 10 Biomass_Ecoli_core_w_GAM 13 59.81 adp[c] 13 Biomass_Ecoli_core_w_GAM 13 4.1182 akg[c] 14 Biomass_Ecoli_core_w_GAM 13 -59.81 atp[c] 17 Biomass_Ecoli_core_w_GAM 13 3.7478 coa[c] 21 Biomass_Ecoli_core_w_GAM 13 -0.361 e4p[c] 23 Biomass_Ecoli_core_w_GAM 13 -0.0709 f6p[c] 26 Biomass_Ecoli_core_w_GAM 13 -0.129 g3p[c] 33 Biomass_Ecoli_core_w_GAM 13 -0.205 g6p[c] 34 Biomass_Ecoli_core_w_GAM 13 -0.2557 gln-L[c] 36 Biomass_Ecoli_core_w_GAM 13 -4.9414 glu-L[c] 38 Biomass_Ecoli_core_w_GAM 13 -59.81 h2o[c] 41 Biomass_Ecoli_core_w_GAM 13 59.81 h[c] 43 Biomass_Ecoli_core_w_GAM 13 -3.547 nad[c] 50 Biomass_Ecoli_core_w_GAM 13 3.547 nadh[c] 51 Biomass_Ecoli_core_w_GAM 13 13.028 nadp[c] 52 Biomass_Ecoli_core_w_GAM 13 -13.028 nadph[c] 53 Biomass_Ecoli_core_w_GAM 13 -1.7867 oaa[c] 58 Biomass_Ecoli_core_w_GAM 13 -0.5191 pep[c] 59 Biomass_Ecoli_core_w_GAM 13 59.81 pi[c] 60 Biomass_Ecoli_core_w_GAM 13 -2.8328 pyr[c] 62 Biomass_Ecoli_core_w_GAM 13 -0.8977 r5p[c] 66 Biomass_Ecoli_core_w_GAM 13
objectiveAbbr =
'Biomass_Ecoli_core_w_GAM'
64. Identify reactions that are mainly responsible for synthesizing the biomass component.
BiomassComponentsPos = modelEcore.mets(find(modelEcore.S(:, strmatch('Biomass', modelEcore.rxns)) > 0))
BiomassComponentsPos =
'adp[c]' 'akg[c]' 'coa[c]' 'h[c]' 'nadh[c]' 'nadp[c]' 'pi[c]'
BiomassComponentsNeg = modelEcore.mets(find(modelEcore.S(:, strmatch('Biomass', modelEcore.rxns)) < 0))
BiomassComponentsNeg =
'3pg[c]' 'accoa[c]' 'atp[c]' 'e4p[c]' 'f6p[c]' 'g3p[c]' 'g6p[c]' 'gln-L[c]' 'glu-L[c]' 'h2o[c]' 'nad[c]' 'nadph[c]' 'oaa[c]' 'pep[c]' 'pyr[c]' 'r5p[c]'
65. For each of these metabolites, we add sink reactions for components with a positive coefficient and demand reactions for components with a negative coefficient.
[modelEcore_NEW] = addSinkReactions(modelEcore, BiomassComponentsPos);
sink_adp[c] adp[c] <=> sink_akg[c] akg[c] <=> sink_coa[c] coa[c] <=> sink_h[c] h[c] <=> sink_nadh[c] nadh[c] <=> sink_nadp[c] nadp[c] <=> sink_pi[c] pi[c] <=>
[modelEcore_NEW, rxnNames] = addDemandReaction(modelEcore_NEW, BiomassComponentsNeg);
DM_3pg[c] 3pg[c] -> DM_accoa[c] accoa[c] -> DM_atp[c] atp[c] -> DM_e4p[c] e4p[c] -> DM_f6p[c] f6p[c] -> DM_g3p[c] g3p[c] -> DM_g6p[c] g6p[c] -> DM_gln-L[c] gln-L[c] -> DM_glu-L[c] glu-L[c] -> DM_h2o[c] h2o[c] -> DM_nad[c] nad[c] -> DM_nadph[c] nadph[c] -> DM_oaa[c] oaa[c] -> DM_pep[c] pep[c] -> DM_pyr[c] pyr[c] -> DM_r5p[c] r5p[c] ->
Now lets repeat the analysis (note that we minimize the objective):
for i = 1 : length(BiomassComponentsPos)
modelEcore_NEW = changeObjective(modelEcore_NEW, strcat('sink_', BiomassComponentsPos(i)));
FBAsolution = optimizeCbModel(modelEcore_NEW, 'min');
BiomassComponentsValuePos(i, 1) = FBAsolution.f;
end
[BiomassComponentsPos num2cell(BiomassComponentsValuePos)]
ans =
'adp[c]' [ -1000] 'akg[c]' [ -1000] 'coa[c]' [-663.3333] 'h[c]' [ -1000] 'nadh[c]' [ -1000] 'nadp[c]' [ -1000] 'pi[c]' [ -1000]
Now lets repeat the analysis (note that we maximize the objective):
for i = 1 : length(BiomassComponentsNeg)
modelEcore_NEW = changeObjective(modelEcore_NEW, strcat('DM_', BiomassComponentsNeg(i)));
FBAsolution = optimizeCbModel(modelEcore_NEW, 'max');
BiomassComponentsValueNeg(i, 1) = FBAsolution.f;
end
[BiomassComponentsNeg num2cell(BiomassComponentsValueNeg)]
ans =
'3pg[c]' [ 555.8050] 'accoa[c]' [ 663.3333] 'atp[c]' [ 1000] 'e4p[c]' [ 280.4025] 'f6p[c]' [ 186.9350] 'g3p[c]' [ 370.5367] 'g6p[c]' [ 186.9350] 'gln-L[c]' [ 500] 'glu-L[c]' [1.0000e+03] 'h2o[c]' [ 1000] 'nad[c]' [1.0000e+03] 'nadph[c]' [ 1000] 'oaa[c]' [ 990.1464] 'pep[c]' [ 555.8050] 'pyr[c]' [ 992.5000] 'r5p[c]' [ 224.3220]
FBAsolution = optimizeCbModel(modelEcore, 'max');
FBAsolution.f
ans = 0.7082
FBAsolution.x contains the flux value for each reaction in the network. To view the flux values use:
printFluxVector(modelEcore, FBAsolution.x, 'true')
ACONTa 10 ACONTb 10 ADK1 9.68291 ATPM 8.39 ATPS4r 61.4747 Biomass_Ecoli_core_w_GAM 0.708235 CO2t -29.8607 CS 10 CYTBD 58.0838 ENO 17.0578 EX_co2(e) 29.8607 EX_glc(e) -10 EX_h(e) 14.2072 EX_h2o(e) 35.0198 EX_nh4(e) -3.86187 EX_o2(e) -29.0419 EX_pi(e) -2.60539 FBA 9.29552 FUM 9.23588 GAPD 18.1173 GLCpts 10 GLNS 0.181096 GLUDy -3.68077 H2Ot -35.0198 ICDHyr 0.764115 ICL 9.23588 MALS 9.23588 MDH -5.10764 ME2 23.5794 NADH16 48.8479 NADTRHD 11.4359 NH4t 3.86187 O2t 29.0419 PDH 21.8902 PFK 9.29552 PGI 9.85481 PGK -18.1173 PGM -17.0578 PIt2r 2.60539 PPC 16.373 PPS 9.68291 RPE -0.50908 RPI -0.50908 SUCDi 9.23588 TALA -0.126703 TKT1 -0.126703 TKT2 -0.382376 TPI 9.29552
FBAsolution = optimizeCbModel(modelEcore, 'max', 'zero');
FBAsolution.f
ans = 0.7082
printFluxVector(modelEcore, FBAsolution.x, 'true')
ACONTa 10 ACONTb 10 AKGDH 3.69917 ATPM 8.39 ATPS4r 31.7196 Biomass_Ecoli_core_w_GAM 0.708235 CO2t -11.6696 CS 10 CYTBD 39.8928 ENO 17.0578 EX_co2(e) 11.6696 EX_for(e) 18.191 EX_glc(e) -10 EX_h(e) 32.3982 EX_h2o(e) 16.8287 EX_nh4(e) -3.86187 EX_o2(e) -19.9464 EX_pi(e) -2.60539 FBA 9.29552 FORti 18.191 FUM 9.23588 GAPD 18.1173 GLCpts 10 GLNS 0.181096 GLUDy -3.68077 H2Ot -16.8287 ICDHyr 4.46328 ICL 5.53672 MALS 5.53672 MDH 11.2654 ME2 3.5072 NADH16 30.6569 NH4t 3.86187 O2t 19.9464 PFK 9.29552 PFL 18.191 PGI 9.85481 PGK -18.1173 PGM -17.0578 PIt2r 2.60539 PYK 6.69013 RPE -0.50908 RPI -0.50908 SUCDi 9.23588 SUCOAS -3.69917 TALA -0.126703 THD2 4.93711 TKT1 -0.126703 TKT2 -0.382376 TPI 9.29552
66. To test whether the biomass precursors can be produced in other growth media, repeat steps 60-65.

Steps 67 - 70. Test if model can produce known secretion products.

67. Collect a list of known secretion products and medium conditions.
For this example, acetate secretion ('EX_ac(e)') was chosen and let's require that secretion flux is at least 2 mmol/gDW/h (i.e. the lower bound is constrained to 2).
modelEcore_New = changeRxnBounds(modelEcore, 'EX_ac(e)', 2, 'l');
68. Set the constraints to the desired medium condition (e.g., minimal medium + carbon source). For changing the constraints use the following function:
modelEcore_New = changeRxnBounds(modelEcore, {'EX_glc(e)' 'EX_o2(e)'}, [-10 -18.5], 'l');
Note: If the model is required to grow in addition to producing the by-product, set the lower bound of the biomass reaction to the corresponding value required for growth. We determine the value based on the following FBA solution.
modelEcore_New = changeObjective(modelEcore_New, 'Biomass_Ecoli_core_w_GAM');
FBAsolution = optimizeCbModel(modelEcore_New, 'max')
FBAsolution =
full: [95×1 double] obj: 0.6411 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0260 basis: [] x: [95×1 double] f: 0.6411 y: [72×1 double] w: [95×1 double] v: [95×1 double]
modelEcore_New = changeRxnBounds(modelEcore_New, 'Biomass_Ecoli_core_w_GAM', 0.64, 'l');
69. Change the objective function to the exchange reaction of your secretion product (i.e., acetate):
modelEcore_New = changeObjective(modelEcore_New, 'EX_ac(e)');
70. Maximize (‘max’) for the new objective function (as a secretion is expected to have a positive flux value, see Figure ):
FBAsolution = optimizeCbModel (modelEcore_New, 'max')
FBAsolution =
full: [95×1 double] obj: 0.0448 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0260 basis: [] x: [95×1 double] f: 0.0448 y: [72×1 double] w: [95×1 double] v: [95×1 double]
printConstraints(modelEcore_New, -1000,1000)
MinConstraints: ACONTb 10 ATPM 8.39 Biomass_Ecoli_core_w_GAM 0.64 EX_glc(e) -10 EX_o2(e) -18.5 maxConstraints: ATPM 8.39 ATPS4r 100

Steps 71 - 75. Test if model can produce a certain ratio of two secretion products.

Acetate and Formate secretion are the two secretion products used in this example.
71. Set the constraints to the desired medium condition (e.g., minimal medium + carbon source). As shown above in step 68.
72. Let's verify that both metabolites can be secreted independently. Repeat steps 69 and 70.
modelEcoreAc = changeObjective(modelEcore, 'EX_ac(e)');
FBAsolution = optimizeCbModel(modelEcoreAc, 'max')
FBAsolution =
full: [95×1 double] obj: 10.0000 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0270 basis: [] x: [95×1 double] f: 10.0000 y: [72×1 double] w: [95×1 double] v: [95×1 double]
modelEcoreFor = changeObjective(modelEcore, 'EX_for(e)');
FBAsolution = optimizeCbModel(modelEcoreFor, 'max')
FBAsolution =
full: [95×1 double] obj: 40.0000 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0260 basis: [] x: [95×1 double] f: 40.0000 y: [72×1 double] w: [95×1 double] v: [95×1 double]
73. Add a row to the S matrix to couple the by-product secretion reactions:
modelEcore_NEW = addRatioReaction(modelEcore, {'EX_ac(e)' 'EX_for(e)'}, [1 1]);
Also, let's require that the acetate secretion flux is at least 1 mmol/gDW/h (i.e. the lower bound is constrained to 1).
modelEcore_NEW = changeRxnBounds(modelEcore_NEW, 'EX_ac(e)', 1, 'l');
Note: If the model is required to grow in addition to producing the by-product, set the lower bound of the biomass reaction to the corresponding value required for growth. We determine the value based on the following FBA solution.
modelEcore_NEW = changeObjective(modelEcore_NEW, 'Biomass_Ecoli_core_w_GAM');
FBAsolution = optimizeCbModel(modelEcore_NEW, 'max')
FBAsolution =
full: [95×1 double] obj: 0.6374 rcost: [95×1 double] dual: [73×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0260 basis: [] x: [95×1 double] f: 0.6374 y: [73×1 double] w: [95×1 double] v: [95×1 double]
modelEcore_NEW = changeRxnBounds(modelEcore_NEW, 'Biomass_Ecoli_core_w_GAM', 0.63, 'l');
What is the flux through the two secretion reactions:
FBAsolution.x(find(ismember(modelEcore_NEW.rxns, 'EX_for(e)')))
ans = 1
FBAsolution.x(find(ismember(modelEcore_NEW.rxns, 'EX_ac(e)')))
ans = 1
74. Change the objective function to the exchange reaction of one of your secretion products:
modelEcore_NEW = changeObjective(modelEcore_NEW, 'EX_ac(e)');
75. Maximize for the new objective function (as a secretion is expected to have a positive flux value):
FBAsolution = optimizeCbModel (modelEcore_NEW, 'max')
FBAsolution =
full: [95×1 double] obj: 1.1047 rcost: [95×1 double] dual: [73×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0260 basis: [] x: [95×1 double] f: 1.1047 y: [73×1 double] w: [95×1 double] v: [95×1 double]
FBAsolution.x(find(ismember(modelEcore_NEW.rxns, 'EX_for(e)')))
ans = 1.1047

Steps 76 - 77. Check for blocked reactions.

76. Change simulation conditions to rich medium or open all exchange reactions.
selExc = findExcRxns(modelEcore);
ExR = modelEcore.rxns(selExc);
modelEcore_Open = changeRxnBounds(modelEcore, ExR, -1000, 'l');
modelEcore_Open = changeRxnBounds(modelEcore_Open, ExR, 1000, 'u');
printConstraints(modelEcore_Open, -1000, 1000)
MinConstraints: ACONTb 10 ATPM 8.39 maxConstraints: ATPM 8.39 ATPS4r 100
printUptakeBound(modelEcore_Open);
EX_ac(e) -1000 EX_acald(e) -1000 EX_akg(e) -1000 EX_co2(e) -1000 EX_etoh(e) -1000 EX_for(e) -1000 EX_fru(e) -1000 EX_fum(e) -1000 EX_glc(e) -1000 EX_gln_L(e) -1000 EX_glu_L(e) -1000 EX_h(e) -1000 EX_h2o(e) -1000 EX_lac_D(e) -1000 EX_mal_L(e) -1000 EX_nh4(e) -1000 EX_o2(e) -1000 EX_pi(e) -1000 EX_pyr(e) -1000 EX_succ(e) -1000
77. Run analysis for blocked reactions. The findBlockedReaction function returns a list of blocked reactions (‘BlockedReactions’).
BlockedReactions = findBlockedReaction(modelEcore_Open)
Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
BlockedReactions =
{''}

Steps 79 - 80. Compute single gene deletion phenotypes

79. Use The Cobra Toolbox function, singleGeneDeletion, to simulate gene deletion:
[grRatio, grRateKO, grRateWT, hasEffect] = singleGeneDeletion(modelEcore);
Single gene deletion analysis in progress ...
Let's visualize hasEffect:
bar(sort(hasEffect,'descend'))
xlabel('Model genes (rank ordered)');
ylabel('Lethality (1 = lethal, 0 = no effect)')
How many genes are in my model?
length(modelEcore.genes)
ans = 137
How many genes have an effect and which ones?
length(find(hasEffect))
ans = 71
Which gene deletions had an affect?
modelEcore.genes(find(hasEffect))
ans =
'b0114' 'b0115' 'b0116' 'b0474' 'b0720' 'b0721' 'b0722' 'b0723' 'b0724' 'b0726' 'b0727' 'b0728' 'b0729' 'b0767' 'b0809' 'b0810' 'b0811' 'b1136' 'b1479' 'b1602' 'b1603' 'b1702' 'b1761' 'b1779' 'b1817' 'b1818' 'b1819' 'b1852' 'b2029' 'b2276' 'b2277' 'b2278' 'b2279' 'b2280' 'b2281' 'b2282' 'b2283' 'b2284' 'b2285' 'b2286' 'b2287' 'b2288' 'b2415' 'b2416' 'b2463' 'b2587' 'b2779' 'b2926' 'b3212' 'b3213' 'b3236' 'b3403' 'b3528' 'b3731' 'b3732' 'b3733' 'b3734' 'b3735' 'b3736' 'b3737' 'b3738' 'b3919' 'b3956' 'b4015' 'b4025' 'b4077' 'b4151' 'b4152' 'b4153' 'b4154' 's0001'
Which gene deletions are lethal?
tol = 1e-3;
LethalGenes = modelEcore.genes(find(grRateKO < tol))
LethalGenes =
'b0721' 'b0722' 'b0723' 'b0724' 'b1136' 'b3919'
length(LethalGenes)
ans = 6
Some entries in 'grRateKO' maybe NaN. This is because the model is infeasible for those knockouts due to the lower bound on the 'ATPM'. We will replace those instances by zero.
grRateKO(isnan(grRateKO)) = 0;
 
bar(sort(grRateKO, 'descend'));
xlabel('Model genes (rank ordered)');
ylabel('Growth rate (1/hr)')
80. Compare with experimental data.

Steps 81-82. Test for known incapabilities of the organism.

81. Set simulation condition for comparisons with known incapabilities.
modelIncapable = changeRxnBounds(modelEcore, 'EX_glc(e)', 0, 'l');
modelIncapable = changeRxnBounds(modelIncapable, 'EX_ac(e)', -10, 'l');
FBAsolution = optimizeCbModel(modelIncapable, 'max', false)
FBAsolution =
full: [95×1 double] obj: 5.2916e-16 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0380 basis: [] x: [95×1 double] f: 5.2916e-16 y: [72×1 double] w: [95×1 double] v: [95×1 double]
82. If the in silico model is capable of a function that the organism is incapable of in vitro, use single-reaction deletion to identify candidate reactions that enable the model’s capability despite known incapability (see step 79).

Step 83. Compare predicted physiological properties with known properties.

Use previous steps of the tutorial and compare known physiological, phenotypic, or genetic properties with the model capabilities.

Steps 84-87. Test if the model can grow fast enough.

84. Optimize for biomass reaction in different medium conditions and compare with experimental data.
85. Test if any of the medium components are growth limiting.
86. Maximize for biomass.
87. Determine reduced costs when maximizing biomass (see steps 88-89).

Steps 88 - 89. Test if the model grows too fast.

When optimization results for a biomass reaction in different medium conditions are compared with experimental data, one can evaluate if the model grows too fast. Analysis of modeling constraints, reduced cost and single gene deletion, are helpful in the evaluation of growth conditions.
88. Determine the reduced cost associated with network reactions when optimizing for an objective function:
FBAsolution = optimizeCbModel(modelEcore, 'max', false)
FBAsolution =
full: [95×1 double] obj: 0.7082 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0310 basis: [] x: [95×1 double] f: 0.7082 y: [72×1 double] w: [95×1 double] v: [95×1 double]
Print those reactions that have the smallest reduced cost associated:
[a,b] = sort(FBAsolution.w, 'descend');
modelEcore.rxns(b(1:10))
ans =
'EX_fru(e)' 'EX_glc(e)' 'EX_succ(e)' 'EX_ac(e)' 'EX_acald(e)' 'EX_etoh(e)' 'EX_fum(e)' 'EX_mal_L(e)' 'EX_akg(e)' 'EX_glu_L(e)'
a(1:10)
ans =
0.1416 0.1416 0.0708 0.0708 0.0708 0.0708 0.0708 0.0708 0.0708 0.0708
89. Beforehand we proceed, let's verify the constraints are set as intended:
printConstraints(modelEcore, -1000, 1000);
MinConstraints: ACONTb 10 ATPM 8.39 EX_glc(e) -10 maxConstraints: ATPM 8.39 ATPS4r 100
printUptakeBound(modelEcore);
EX_co2(e) -1000 EX_glc(e) -10 EX_h(e) -1000 EX_h2o(e) -1000 EX_nh4(e) -1000 EX_o2(e) -1000 EX_pi(e) -1000
Increase the flux through EX_glc(e) by one unit (keep in mind that uptake is defined as a negative flux through exchange reactions):
modelEcore_New = changeRxnBounds(modelEcore, 'EX_glc(e)', -11, 'l');
FBAsolution = optimizeCbModel(modelEcore_New, 'max', false)
FBAsolution =
full: [95×1 double] obj: 0.8499 rcost: [95×1 double] dual: [72×1 double] solver: 'glpk' algorithm: 'default' stat: 1 origStat: 5 time: 0.0330 basis: [] x: [95×1 double] f: 0.8499 y: [72×1 double] w: [95×1 double] v: [95×1 double]

93. Use single-reaction deletion to identify single reactions that may enable the model to grow too fast.

[grRatio, grRateKO, grRateWT] = singleRxnDeletion(modelEcore);
Single reaction deletion analysis in progress ... 1% [ ]2% [ ]3% [. ]4% [. ]5% [.. ]6% [.. ]7% [.. ]8% [... ]9% [... ]10% [.... ]11% [.... ]12% [.... ]13% [..... ]14% [..... ]15% [...... ]16% [...... ]17% [...... ]18% [....... ]20% [........ ]21% [........ ]22% [........ ]23% [......... ]24% [......... ]25% [.......... ]26% [.......... ]27% [.......... ]28% [........... ]29% [........... ]30% [............ ]31% [............ ]32% [............ ]33% [............. ]34% [............. ]35% [.............. ]36% [.............. ]37% [.............. ]38% [............... ]40% [................ ]41% [................ ]42% [................ ]43% [................. ]44% [................. ]45% [.................. ]46% [.................. ]47% [.................. ]48% [................... ]49% [................... ]50% [.................... ]51% [.................... ]52% [.................... ]53% [..................... ]54% [..................... ]55% [...................... ]56% [...................... ]57% [...................... ]58% [....................... ]60% [........................ ]61% [........................ ]62% [........................ ]63% [......................... ]64% [......................... ]65% [.......................... ]66% [.......................... ]67% [.......................... ]68% [........................... ]69% [........................... ]70% [............................ ]71% [............................ ]72% [............................ ]73% [............................. ]74% [............................. ]75% [.............................. ]76% [.............................. ]77% [.............................. ]78% [............................... ]80% [................................ ]81% [................................ ]82% [................................ ]83% [................................. ]84% [................................. ]85% [.................................. ]86% [.................................. ]87% [.................................. ]88% [................................... ]89% [................................... ]90% [.................................... ]91% [.................................... ]92% [.................................... ]93% [..................................... ]94% [..................................... ]95% [...................................... ]96% [...................................... ]97% [...................................... ]98% [....................................... ]100% [........................................]
plot(grRateKO)

94. Reduced cost.

The reduced cost analysis can be used to identify those reactions that can reduce the growth rate (positive cost value). Reduced cost demonstrated as as part of step 89.
95. Print Matlab model content.
if ~isfield(modelEcore, 'osense')
modelEcore.osense = -1;
end
if ~isfield(modelEcore, 'csense')
modelEcore.csense(1:length(modelEcore.mets), 1) = 'E';
end
writeCbModel(modelEcore, 'sbml', 'EColiCore.xml')
Document written
ans =
constraint: [1×0 struct] functionDefinition: [1×0 struct] event: [1×0 struct] rule: [1×0 struct] unitDefinition: [1×1 struct] initialAssignment: [1×0 struct] SBML_level: 3 SBML_version: 1 annotation: '' areaUnits: '' avogadro_symbol: '' conversionFactor: '' delay_symbol: '' extentUnits: '' fbc_activeObjective: 'obj' fbc_version: 2 id: 'COBRAModel' lengthUnits: '' metaid: 'COBRAModel' name: '' notes: '' sboTerm: -1 substanceUnits: '' timeUnits: '' time_symbol: '' typecode: 'SBML_MODEL' volumeUnits: '' species: [1×72 struct] compartment: [1×2 struct] parameter: [1×7 struct] reaction: [1×95 struct] fbc_fluxBound: [1×2 struct] fbc_geneProduct: [1×137 struct] fbc_objective: [1×1 struct] namespaces: [1×2 struct] fbc_strict: 1

TIMING

The tutorial runs as given in a few seconds to minutes. However, if you use this tutorial for debugging and generating your own model, please consider the timing of the steps, as they have been given in the original protocol [1].
The timing of the entire reconstruction process depends on the properties of the target organism (prokaryote vs. eukaryote, genome size), the quality of the genome annotation, and the availability of experimental data.
The timing listed below represents an average and can be used to plan the different stages.
Step 1 - 4 (Stage 1, draft reconstruction): days to a week.
Step 5 (Stage 1, collection of experimental data): ongoing throughout the reconstruction process
Step 6 - 23 (Stage 2, reconstruction refinement): months to a year (if debugging and gap filling is done along the way)
Step 24 - 32 (Stage 2, biomass determination): days to weeks, depending on data availability
Step 34 - 36 (Stage 2, biomass determination): days to a week.
Step 37 (Stage 2, growth requirements): days to weeks, depending on data availability
Step 38 - 42 (Stage 3, conversion): days to a week.
Step 43 - 94 (Stage 4, network evaluation/debugging): week to months.
Step 95 - 96 (Data assembly): days to weeks, depending how much and in which format data was collected.

TROUBLESHOOTING

As given in original protocol [1].
Step 38 See installation instructions of the COBRA Toolbox for details on how to install and setup Matlab, SBML and COBRA Toolbox.
Step 51 Make sure that you are working in the directory were the X3.exe script was copied to. The .expa file produced by the function must be in the same directory as X3.exe.

ANTICIPATED RESULTS

As given in original protocol [1].
This protocol will result in a reconstruction that covers most of the known metabolic information of the target organism and represents a knowledge database. This reconstruction can be used as a resource for information (query tool), high-throughput data mapping (context for content), and a starting point for mathematical models.

References

[1] Thiele I, Palsson BO: A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc 2010, 5:93-121.
[2] Orth JD, Fleming RM, Palsson BO: Reconstruction and Use of Microbial Metabolic Networks: the Core Escherichia coli Metabolic Model as an Educational Guide. EcoSal Plus 2010, 4.
[3] Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Molecular systems biology 2007, 3:121.
[4] Geobacter sulfurreducans