mgPipe

adaptVMHDietToAGORA(VMHDiet, setupUsed, AGORAPath)

Part of the Microbiome Modeling Toolbox. This function adapts a diet generated by the Diet Designer on https://www.vmh.life such that microbiome community models created from the AGORA resource can generate biomass. All metabolites required by at least one AGORA model are added. Note that the adapted diet that is the output of this function is specific to the AGORA resource. It is not guaranteed that other constraint-based models can produce biomass on this diet. Units are given in mmol/day/person.

Usage

[adaptedDietConstraints, growthOK] = adaptVMHDietToAGORA(VMHDiet, setupUsed, AGORAPath)

Inputs

  • VMHDiet – Name of the text file with VMH exchange reaction IDs and lower-bound values, generated by the Diet Designer on https://www.vmh.life (or manually).
  • setupUsed – Model setup for which the adapted diet will be used. Allowed inputs are AGORA (the single AGORA models), Pairwise (the microbe-microbe models generated by the pairwise modeling module), and Microbiota (the microbe community models generated by MgPipe).

Optional input

  • AGORAPath – Path to the AGORA model resource. If entered, growth of all single models on the adapted diet will be tested.

Output

  • adaptedDietConstraints – Cell array of exchange reaction IDs, values on lower bounds, and values on upper bounds that can serve as input for the function useDiet.

Optional output

  • growthOK – Variable indicating whether all AGORA models could grow on the adapted diet (if 1 then yes).

addMicrobeCommunityBiomass(model, microbeNames, abundances)

Adds a community biomass reaction to a model structure with multiple microbes based on their relative abundances. If no abundance values are provided, all n microbes get equal weights (1/n). Assumes a lumen compartment [u] and a fecal secretion compartment [fe]. Creates a community biomass metabolite ‘microbeBiomass’ that is secreted from [u] to [fe] and exchanged from the fecal compartment.

Usage

model = addMicrobeCommunityBiomass(model, microbeNames, abundances)

Inputs

  • model – COBRA model structure with n joined microbes with biomass metabolites ‘Microbe_biomass[c]’.
  • microbeNames – nx1 cell array of n unique strings that represent each microbe in the model.

Optional input

  • abundances – nx1 vector with the relative abundance of each microbe.

Output

  • model – COBRA model structure
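
The weighting rule described above can be illustrated with a minimal Python sketch. It is not the toolbox's MATLAB implementation: the function names and the exact reaction formula string are assumptions made for illustration. Only the 1/n default, the ‘Microbe_biomass[c]’ metabolite naming, and the ‘microbeBiomass’ metabolite come from the description.

```python
def community_biomass_weights(microbe_names, abundances=None):
    """Map each microbe to its weight; default to equal weights 1/n."""
    n = len(microbe_names)
    if n == 0:
        raise ValueError("at least one microbe is required")
    if len(set(microbe_names)) != n:
        raise ValueError("microbe names must be unique")
    if abundances is None:
        # No abundances given: every microbe gets the same weight.
        abundances = [1.0 / n] * n
    if len(abundances) != n:
        raise ValueError("need exactly one abundance per microbe")
    return dict(zip(microbe_names, abundances))


def community_biomass_formula(weights):
    """Build a readable formula string (hypothetical layout, for illustration)."""
    lhs = " + ".join(f"{w:g} {name}_biomass[c]" for name, w in weights.items())
    return f"{lhs} -> microbeBiomass[u]"
```

For example, four microbes without abundance data each receive a weight of 0.25.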

checkNomenConsist(organisms, autoFix)

This function checks the consistency of the inputs (organism names). If autoFix == 0, the function halts execution with an error message when inconsistencies are detected; otherwise, it attempts to fix the problem automatically and continues execution.

Usage

[autoStat, fixVec, organisms] = checkNomenConsist(organisms, autoFix)

Inputs

  • organisms – nx1 cell array with names of organisms in the study
  • autoFix – double indicating whether to try to fix inconsistencies automatically

Outputs

  • autoStat – double indicating if inconsistencies were found
  • fixVec – nx1 cell array with names of individuals in the study
  • organisms – nx1 cell array with unambiguous names of organisms in the study

createPanModels(agoraPath, panPath, taxonLevel)

This function creates pan-models for all unique taxa (e.g., species) included in the AGORA resource. If reconstructions of multiple strains in a given taxon are present, the reactions in these reconstructions will be combined into a pan-reconstruction. The pan-biomass reactions will be built from the average of all biomasses. Futile cycles that result from the newly combined reaction content are removed by making certain reactions irreversible. These reactions have been determined manually. NOTE: Futile cycle removal has only been tested at the species and genus level. Pan-models at higher taxonomical levels (e.g., family) may contain futile cycles and produce unrealistically high ATP flux. The pan-models can be used as input for mgPipe if taxon abundance data is available at a higher level than strain, e.g., species or genus.

Usage

createPanModels(agoraPath,panPath,taxonLevel)

Inputs

  • agoraPath – String containing the path to the AGORA reconstructions. Must end with a file separator.
  • panPath – String containing the path to an empty folder in which the created pan-models will be stored. Must end with a file separator.
  • taxonLevel – String with the desired taxonomical level of the pan-models. Allowed inputs are ‘Species’, ‘Genus’, ‘Family’, ‘Order’, ‘Class’, ‘Phylum’.
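
As a rough illustration of building a pan-biomass reaction "from the average of all biomasses", the Python sketch below averages stoichiometric coefficients across strains. How the toolbox treats metabolites that occur in only some strains is not stated here. Counting them as zero in the other strains is an assumption of this sketch, and the function name is hypothetical.

```python
def average_biomass(strain_biomasses):
    """Average biomass coefficients over strains.

    strain_biomasses: list of {metabolite_id: coefficient} dicts, one per strain.
    Assumption: a metabolite missing from a strain contributes 0 to the average.
    """
    if not strain_biomasses:
        raise ValueError("at least one strain biomass is required")
    n = len(strain_biomasses)
    metabolites = sorted({m for biomass in strain_biomasses for m in biomass})
    return {
        m: sum(biomass.get(m, 0.0) for biomass in strain_biomasses) / n
        for m in metabolites
    }
```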

createPersonalizedModel(abunFilePath, resPath, model, sampName, orglist, patNumb)

This function creates personalized models by integrating the given organism abundances into the previously built global setup. Coupling constraints are also added for each organism. All operations are parallelized, and the generated personalized models are saved directly in .mat format.

Usage

[createdModels] = createPersonalizedModel(abunFilePath, resPath, model, sampName, orglist, patNumb)

Inputs

  • abunFilePath – char with path of directory and file name from which to retrieve abundance information
  • resPath – char with path of directory where results are saved
  • model – “global setup” model in COBRA model structure format
  • sampName – cell array with names of individuals in the study
  • orglist – cell array with names of organisms in the study
  • patNumb – number (double) of individuals in the study

Output

  • createdModels – created personalized models

detectOutput(resPath, objNam)

This function checks the existence of a specific file in the results folder.

Usage

mapP = detectOutput(resPath, objNam)

Inputs

  • resPath – char with path of directory where results are saved
  • objNam – char with name of object to find in the results folder

Output

  • mapP – double indicating if object was found in the result folder

extractFullRes(resPath, ID, dietType, sampName, fvaCt, nsCt)

This function is called from the MgPipe pipeline. Its purpose is to retrieve and export, in a comprehensive way, all the results (fluxes) computed during the simulations for a specified diet. Since FVA is computed on diet and fecal exchanges, every metabolite has four values for each individual: the minimum and maximum of uptake and of secretion.

Usage

[finRes] = extractFullRes(resPath, ID, dietType, sampName, fvaCt, nsCt)

Inputs

  • resPath – char with path of directory where results are saved
  • ID – cell array with list of all unique Exchanges to diet/ fecal compartment
  • dietType – char indicating under which diet to extract results: rDiet (rich diet), sDiet (previously specified diet, the default), or pDiet (personalized diet), if available
  • sampName – nx1 cell array with names of individuals in the study
  • fvaCt – cell array containing FVA values for maximal uptake
  • nsCt – cell array containing FVA values for minimal uptake and secretion for setup lumen / diet exchanges

Output

  • finRes – cell array with min and max value of uptake and secretion for each metabolite
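
The "four values per metabolite" idea can be sketched in Python as follows. The input layout (a dict of reaction ID to a (min, max) flux pair), the function name, and the pairing of diet exchanges with uptake and fecal exchanges with secretion are assumptions for illustration. The exchange IDs follow the ‘EX_met[d]’ and ‘EX_met[fe]’ naming used by the setup models.

```python
def four_values_per_metabolite(fva_ranges, met):
    """Collect the four FVA values of one metabolite for one individual.

    fva_ranges: {reaction_id: (min_flux, max_flux)} (hypothetical layout).
    Values are returned as stored; no sign conversion is applied.
    """
    diet_min, diet_max = fva_ranges[f"EX_{met}[d]"]      # diet exchange (uptake side)
    fecal_min, fecal_max = fva_ranges[f"EX_{met}[fe]"]   # fecal exchange (secretion side)
    return {
        "diet_min": diet_min,
        "diet_max": diet_max,
        "fecal_min": fecal_min,
        "fecal_max": fecal_max,
    }
```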

fastSetupCreator(models, microbeNames, host, objre)

Creates a microbiota model (min 1 microbe) that can be coupled with a host model. Microbes and host are connected with a lumen compartment [u]; the host can secrete metabolites into body fluids [b]. Diet is simulated as uptake through the compartment [d]; transporters are unidirectional from [d] to [u]. Secretion goes through the fecal compartment [fe]; transporters are unidirectional from [u] to [fe].

Reaction types

  • Diet exchange – ‘EX_met[d]’: ‘met[d] <=>’
  • Diet transporter – ‘DUt_met’: ‘met[d] -> met[u]’
  • Fecal transporter – ‘UFEt_met’: ‘met[u] -> met[fe]’
  • Fecal exchanges – ‘EX_met[fe]’: ‘met[fe] <=>’
  • Microbe uptake/secretion – ‘Microbe_IEX_met[c]tr’: ‘Microbe_met[c] <=> met[u]’
  • Host uptake/secretion lumen – ‘Host_IEX_met[c]tr’: ‘Host_met[c] <=> met[u]’
  • Host exchange body fluids – ‘Host_EX_met(e)b’: ‘Host_met[b] <=>’

Inputs

  • models – nx1 cell array that contains n microbe models in COBRA model structure format
  • microbeNames – nx1 cell array of n unique strings that represent each microbe model. Reactions and metabolites of each microbe will get the corresponding microbeNames (e.g., ‘Ecoli’) prefix. Reactions will be named ‘Ecoli_RxnAbbr’ and metabolites ‘Ecoli_MetAbbr[c]’).
  • host – Host COBRA model structure, can be left empty if there is no host model
  • objre – char with reaction name of objective function of organisms

Output

  • model – COBRA model structure with all models combined
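
To make the naming conventions concrete, the following Python sketch generates the reaction IDs and formulas listed for fastSetupCreator for a single metabolite. The helper names are hypothetical and the sketch only reproduces the string patterns. It does not build a model.

```python
def shared_reactions(met):
    """Diet and fecal reactions created once per metabolite."""
    return {
        f"EX_{met}[d]": f"{met}[d] <=>",            # diet exchange
        f"DUt_{met}": f"{met}[d] -> {met}[u]",      # diet transporter, [d] to [u]
        f"UFEt_{met}": f"{met}[u] -> {met}[fe]",    # fecal transporter, [u] to [fe]
        f"EX_{met}[fe]": f"{met}[fe] <=>",          # fecal exchange
    }


def microbe_exchange(microbe, met):
    """Lumen uptake/secretion reaction of one microbe, as (ID, formula)."""
    return (f"{microbe}_IEX_{met}[c]tr", f"{microbe}_{met}[c] <=> {met}[u]")
```

With the microbe prefix ‘Ecoli’ and a metabolite abbreviation such as glc_D, the microbe exchange would be named ‘Ecoli_IEX_glc_D[c]tr’.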

getIndividualSizeName(abunFilePath)

This function automatically detects the organisms, and the names and number of individuals, present in the study.

Usage

[indNumb, sampName, organisms] = getIndividualSizeName(abunFilePath)

Input

  • abunFilePath – char with path and name of file from which to retrieve information

Outputs

  • indNumb – number of individuals in the study
  • sampName – nx1 cell array with names of individuals in the study
  • organisms – nx1 cell array with names of organisms in the study

getMappingInfo(models, abunFilePath, patNumb)

This function automatically extracts information from strain abundances in different individuals and combines this information into different tables.

Usage

[reac, micRea, binOrg, patOrg, reacPat, reacNumb, reacSet, reacTab, reacAbun, reacNumber] = getMappingInfo(models, abunFilePath, patNumb)

Inputs

  • models – nx1 cell array that contains n microbe models in COBRA model structure format
  • abunFilePath – char with path and name of file from which to retrieve abundance information
  • patNumb – number of individuals in the study

Outputs

  • reac – cell array with all the unique set of reactions contained in the models
  • micRea – binary matrix assessing presence of set of unique reactions for each of the microbes
  • binOrg – binary matrix assessing presence of specific strains in different individuals
  • reacPat – matrix with number of reactions per individual (organism resolved)
  • reacSet – matrix with names of reactions of each individual
  • reacTab – char with names of individuals in the study
  • reacAbun – binary matrix with presence/absence of reaction per individual: to compare different individuals
  • reacNumber – number of unique reactions of each individual
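
The kind of tables described above can be sketched in Python with plain dicts and lists. This is an illustration of the idea only. The input layouts, the function name, and the rule "an organism is present if its abundance is greater than zero" are assumptions, not the toolbox's implementation.

```python
def mapping_info(model_rxns, abundance):
    """Derive presence/absence tables from reaction content and abundances.

    model_rxns: {organism: set of reaction IDs} (hypothetical layout)
    abundance:  {individual: {organism: relative abundance}}
    """
    organisms = sorted(model_rxns)
    # Unique set of reactions across all models.
    reac = sorted(set().union(*model_rxns.values()))
    # Binary matrix: rows are organisms, columns are reactions.
    mic_rea = [[1 if r in model_rxns[o] else 0 for r in reac] for o in organisms]
    # Binary presence of each organism in each individual.
    bin_org = {
        ind: [1 if ab.get(o, 0) > 0 else 0 for o in organisms]
        for ind, ab in abundance.items()
    }
    # Number of unique reactions carried by the organisms present in each individual.
    reac_number = {}
    for ind, ab in abundance.items():
        present = set()
        for o in organisms:
            if ab.get(o, 0) > 0:
                present |= model_rxns[o]
        reac_number[ind] = len(present)
    return reac, mic_rea, bin_org, reac_number
```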

guidedSim(model, fvaType, rl)

This function is part of the MgPipe pipeline and runs FVAs on a series of selected reactions with different possible FVA functions. Solver is automatically set to ‘cplex’, objective function is maximized, and optPercentage set to 99.99.

Usage

[minFlux, maxFlux] = guidedSim(model, fvaType, rl)

Inputs

  • model – COBRA model structure with n joined microbes with biomass metabolites ‘Microbe_biomass[c]’.
  • fvaType – double indicating which FVA function to use: fvaType=1 for fastFVA, fvaType=0 for fluxVariability.
  • rl – nx1 vector with the reactions of interest.
  • solver – char with name of the solver to use.

Outputs

  • minFlux – Minimum flux for each reaction
  • maxFlux – Maximum flux for each reaction
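
For readers unfamiliar with optPercentage: in flux variability analysis it is commonly interpreted as the share of the optimal objective value that every considered solution must still reach. The Python sketch below shows that arithmetic for a maximized objective. The function name is hypothetical, and the sketch assumes a non-negative optimum.

```python
def objective_lower_bound(optimum, opt_percentage=99.99):
    """Minimum objective value allowed during FVA for a maximized objective.

    Assumes optimum >= 0; opt_percentage is given in percent.
    """
    if not 0.0 <= opt_percentage <= 100.0:
        raise ValueError("opt_percentage must be between 0 and 100")
    return optimum * opt_percentage / 100.0
```

With the 99.99 setting, an optimal community biomass flux of 2.0 would require at least about 1.9998 during the variability runs.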

Author: Federico Baldini, 2017-2018

initMgPipe(modPath, toolboxPath, resPath, dietFilePath, abunFilePath, indInfoFilePath, objre, figForm, numWorkers, autoFix, compMod, rDiet, extSolve, fvaType, autorun, printLevel)

This function is called from the MgPipe driver StartMgPipe. It saves some variables in the environment (in case the function is called without a driver), runs some checks on the inputs, and automatically launches MgPipe. In fact, if all the inputs are supplied correctly, the function can replace the driver.

Inputs

  • modPath – char with path of directory where models are stored
  • abunFilePath – char with path and name of file from which to retrieve abundance information

Optional inputs

  • toolboxPath – char with path of directory where the toolbox is saved
  • resPath – char with path of directory where results are saved
  • dietFilePath – char with path of directory where the diet is saved
  • indInfoFilePath – char indicating, if stratification criteria are available, the full path and name of the related documentation (default: no)
  • objre – char with reaction name of objective function of organisms
  • figForm – format to use for saving figures
  • numWorkers – number of cores to use for parallelization
  • autoFix – double indicating whether to try to fix inconsistencies automatically
  • compMod – boolean indicating if outputs in open format should be produced for each section (default: false)
  • rDiet – boolean indicating whether to also enable rich diet simulations (default: false)
  • extSolve – boolean indicating whether to save the constrained models to solve them externally (default: false)
  • fvaType – boolean indicating which function to use for flux variability (default: true)
  • autorun – boolean used to enable/disable autorun behavior (please set to true) (default: false)
  • printLevel – verbose level (default: 1)

Outputs

  • init – status of initialization
  • modPath – char with path of directory where models are stored
  • toolboxPath – char with path of directory where the toolbox is saved
  • resPath – char with path of directory where results are saved
  • dietFilePath – char with path of directory where the diet is saved
  • abunFilePath – char with path and name of file from which to retrieve abundance information
  • indInfoFilePath – char indicating, if stratification criteria are available, the full path and name of the related documentation (default: no)
  • objre – char with reaction name of objective function of organisms
  • figForm – format to use for saving figures
  • numWorkers – number of cores to use for parallelization
  • autoFix – double indicating whether to try to fix inconsistencies automatically
  • compMod – boolean indicating if outputs in open format should be produced for each section (1=T)
  • patStat – boolean indicating if documentation on health status is available
  • rDiet – boolean indicating whether to also enable rich diet simulations
  • extSolve – boolean indicating whether to save the constrained models to solve them externally
  • fvaType – boolean indicating which function to use for flux variability
  • autorun – boolean used to enable/disable autorun behavior (please set to 1)

loadUncModels(modPath, organisms, objre, printLevel)

This function loads and unconstrains metabolic models from a specific folder.

Usage

models = loadUncModels(modPath, organisms, objre)

Inputs

  • organisms – nx1 cell array with names of organisms in the study
  • modPath – char with path of directory where models are stored
  • objre – char with reaction name of objective function of organisms
  • printLevel – Verbose level (default: printLevel = 1)

Output

  • models – nx1 cell array with models of organisms in the study

makeDummyModel(numMets, numRxns)

Makes an empty model with numMets rows for metabolites and numRxns columns for reactions. Includes all fields that are necessary to join models.

Usage

dummy = makeDummyModel(numMets, numRxns)

Inputs

  • numMets – Number of metabolites
  • numRxns – Number of reactions

Output

  • dummy – Empty COBRA model structure

mgSimResCollect(resPath, ID, sampName, rDiet, pDiet, patNumb, indInfoFilePath, fvaCt, figForm)

This function is called from the MgPipe pipeline. Its purpose is to compute NMPCs from simulations with different diets on multiple microbiota models. Results are written to .csv files. A PCoA on the NMPCs, which groups the microbiota models of individuals by similar metabolic profile, is also computed and output.

Usage

[fSp, Y] = mgSimResCollect(resPath, ID, sampName, rDiet, pDiet, patNumb, indInfoFilePath, fvaCt, figForm)

Inputs

  • resPath – char with path of directory where results are saved
  • ID – cell array with list of all unique Exchanges to diet/ fecal compartment
  • sampName – nx1 cell array with names of individuals in the study
  • rDiet – number (double) indicating whether to simulate a rich diet
  • pDiet – number (double) indicating if a personalized diet is available and should be simulated
  • patNumb – number (double) of individuals in the study
  • indInfoFilePath – char indicating, if stratification criteria are available, the full path and name of the related documentation (default: no)
  • fvaCt – cell array containing FVA values for maximal uptake
  • figForm – char indicating the format of figures

Outputs

  • fSp – cell array with computed NMPCs
  • Y – classical multidimensional scaling

microbiotaModelSimulator(resPath, setup, sampName, dietFilePath, rDiet, pDiet, extSolve, patNumb, fvaType)

This function is called from the MgPipe pipeline. Its purpose is to apply different diets (according to the user’s input) to the microbiota models and run simulations computing FVAs on exchange reactions of the microbiota models. The output is saved in multiple .mat objects. Intermediate saving checkpoints are present.

Usage

[ID, fvaCt, nsCt, presol, inFesMat] = microbiotaModelSimulator(resPath, setup, sampName, dietFilePath, rDiet, pDiet, extSolve, patNumb, fvaType)

Inputs

  • resPath – char with path of directory where results are saved
  • setup – “global setup” model in COBRA model structure format
  • sampName – cell array with names of individuals in the study
  • dietFilePath – path to and name of the text file with dietary information
  • rDiet – number (double) indicating whether to simulate a rich diet
  • pDiet – number (double) indicating if a personalized diet is available and should be simulated
  • extSolve – number (double) indicating if simulations will be run externally rather than in MATLAB (models with imposed constraints are saved)
  • patNumb – number (double) of individuals in the study
  • fvaType – number (double) indicating which FVA function to use (fastFVA = 1)

Outputs

  • ID – cell array with list of all unique Exchanges to diet/ fecal compartment
  • fvaCt – cell array containing FVA values for maximal uptake and secretion for setup lumen / diet exchanges
  • nsCt – cell array containing FVA values for minimal uptake and secretion for setup lumen / diet exchanges
  • presol – array containing values of the microbiota models’ objective function
  • inFesMat – cell array with names of infeasible microbiota models

parsave(fname, microbiota_model)

Saves a model from a parfor loop. Might not work in R2015b.

Usage

parsave(fname, microbiota_model)

Inputs

  • fname – name of file
  • microbiota_model – name of variable

plotMappingInfo(resPath, patOrg, reacPat, reacTab, reacNumber, indInfoFilePath, figForm, sampName, organisms)

This function computes and automatically plots information derived from the mapping data, such as metabolic diversity and classical multidimensional scaling of the individuals’ reaction repertoire. If the last two arguments are specified, the MDS plots will be annotated with sample and organism names.

Usage

Y = plotMappingInfo(resPath, patOrg, reacPat, reacTab, reacNumber, indInfoFilePath, figForm, sampName, organisms)

Inputs

  • resPath – char with path of directory where results are saved
  • reac – nx1 cell array with all the unique set of reactions contained in the models
  • micRea – binary matrix assessing presence of set of unique reactions for each of the microbes
  • reacSet – matrix with names of reactions of each individual
  • reacTab – binary matrix with presence/absence of reaction per individual.
  • reacAbun – matrix with abundance of reaction per individual
  • reacNumber – number of unique reactions of each individual
  • indInfoFilePath – char indicating, if stratification criteria are available, the full path and name of the related documentation (default: no)
  • figForm – format to use for saving figures
  • sampName – nx1 cell array with names of individuals in the study
  • organisms – nx1 cell array with names of organisms in the study

Output

  • Y – classical multidimensional scaling of the individuals’ reaction repertoire