Refinement

addRefinementComments(model, summary)

Adds descriptions to the model.comments field based on the refinement performed by the DEMETER pipeline.

USAGE: model = addRefinementComments(model, summary)

INPUT
model: COBRA model structure
summary: Structure with description of performed refinement

OUTPUT
model: COBRA model structure with comments

anaerobicGrowthGapfill(model, biomassReaction, database)

Tests whether the input microbe model can grow anaerobically and gap-fills it by adding reactions that utilize anaerobic co-factors.

USAGE: [model, oxGapfillRxns, anaerGrowthOK] = anaerobicGrowthGapfill(model, biomassReaction, database)

INPUT
model: COBRA model structure
biomassReaction: Biomass reaction abbreviation
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.

OUTPUT
model: COBRA model structure

Almut Heinken and Stefania Magnusdottir, 2016-2019
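The three-column database layout named in the INPUT description can be pictured with a minimal MATLAB sketch. The variable name `reactionDatabase` and both rows are hypothetical and only show the column order; they are not taken from the real rBioNet database, which may hold more columns.

```matlab
% Hypothetical rows for illustration only; not taken from the real database.
% Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
reactionDatabase = {
    'EX_example(e)', 'Example metabolite exchange',  'example[e] <=> '
    'EXAMPLEt',      'Example metabolite transport', 'example[e] <=> example[c]'
    };

% Find the row whose abbreviation matches and read its formula from column 3.
rowIdx = find(strcmp(reactionDatabase(:, 1), 'EXAMPLEt'));
formula = reactionDatabase{rowIdx, 3};
```

Looking reactions up by the abbreviation in column 1 is one plausible way a gap-filling step could retrieve the formula it needs to add.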

carbonSourceGapfill(model, microbeID, database, inputDataFolder)

Gap-fills carbon source utilization pathways in a microbial reconstruction based on experimental evidence.

USAGE: [model, addedRxns_carbonSources, removedRxns_carbonSources] = carbonSourceGapfill(model, microbeID, database, inputDataFolder)

INPUT
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
inputDataFolder: Folder with experimental data and database files to load

OUTPUT
model: COBRA model structure refined through experimental data for carbon sources
addedRxns: List of reactions that were added during refinement
removedRxns: List of reactions that were removed during refinement

Almut Heinken and Stefania Magnusdottir, 2016-2020

connectRxnGapfilling(model, database)

Part of the DEMETER pipeline. This function adds reactions to unblock specific pathways.

USAGE: [resolveBlocked, model] = connectRxnGapfilling(model, database)

INPUT
model: COBRA model structure
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.

OUTPUT
model: COBRA model structure

Almut Heinken and Stefania Magnusdottir, 2016-2019

createPeriplasmaticSpace(model, microbeID, infoFile)

Part of the DEMETER pipeline. This function creates a periplasmatic space for refined reconstructions if it is appropriate for the organism. The periplasmatic space is created by retrieving all extracellular metabolites and adding a third compartment.

USAGE: [model] = createPeriplasmaticSpace(model, microbeID, infoFile)

INPUT
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
infoFile: Table with taxonomic and gram staining information on microbes to reconstruct

OUTPUT
model: COBRA model structure

AUTHOR: Almut Heinken, 03/2020

createSBMLFiles(refinedFolder, sbmlFolder)

Creates SBML files for the refined reconstructions. This may be time-consuming.

USAGE: createSBMLFiles(refinedFolder, sbmlFolder)

INPUTS
refinedFolder: Folder with refined COBRA models generated by the refinement pipeline
sbmlFolder: Folder where SBML files, if desired, will be saved

AUTHOR: Almut Heinken, 03/2020

curateAgainstBacDiveData(model, microbeID, database, inputDataFolder)

Gap-fills and/or removes reactions in a genome-scale reconstruction based on data from BacDive (https://bacdive.dsmz.de).

USAGE: [model, addedRxns, removedRxns] = curateAgainstBacDiveData(model, microbeID, database, inputDataFolder)

INPUT
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
inputDataFolder: Folder with experimental data and database files to load

OUTPUT
model: COBRA model structure refined through BacDive data
addedRxns: List of reactions that were added during refinement
removedRxns: List of reactions that were removed during refinement

Almut Heinken, 11/2021

curateGrowthRequirements(model, microbeID, database, inputDataFolder)

Takes the growth requirements of an organism (if known) as input and refines the reconstruction accordingly. Reactions are gap-filled and/or deleted to reconcile mismatches between experimental and in silico metabolite essentiality. These curation steps were determined manually. The first step is to print the organism's biomass components and then evaluate which ones are required or not required by the model. There are four possible cases:

1) essential in vivo and not in BOF -> add to BOF and add transporter/remove unannotated biosynthesis reactions
2) essential in vivo and in BOF -> add transporter/remove unannotated biosynthesis reactions
3) nonessential in vivo and not in BOF -> OK
4) nonessential in vivo and in BOF -> if the pathway is mostly present: gap-fill. If the pathway is not present: remove from BOF
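The four cases can be read as a small decision table. The following MATLAB sketch is only an illustration of that logic under stated assumptions: the variable names (`cases`, `actions`, `pathwayPresent`) are hypothetical, the action strings paraphrase the text above, and the code does not reproduce the function's actual implementation.

```matlab
% Each row: [essential in vivo, in BOF, pathway mostly present] (hypothetical inputs).
cases = [true  false false;   % case 1
         true  true  false;   % case 2
         false false false;   % case 3
         false true  true;    % case 4, pathway mostly present
         false true  false];  % case 4, pathway not present

actions = cell(size(cases, 1), 1);
for i = 1:size(cases, 1)
    essential      = cases(i, 1);
    inBOF          = cases(i, 2);
    pathwayPresent = cases(i, 3);
    if essential && ~inBOF
        actions{i} = 'add to BOF; add transporter/remove unannotated biosynthesis';
    elseif essential && inBOF
        actions{i} = 'add transporter/remove unannotated biosynthesis';
    elseif ~essential && ~inBOF
        actions{i} = 'OK';
    elseif pathwayPresent
        actions{i} = 'gap-fill pathway';
    else
        actions{i} = 'remove from BOF';
    end
end
```

Writing the cases as rows makes it easy to check that every combination of in vivo essentiality and BOF membership maps to exactly one action.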

deleteSeedGapfilledReactions(model, biomassReaction)

Part of the DEMETER pipeline. Deletes reactions gap-filled by the Model SEED pipeline that are no longer needed after the reconstruction was refined.

INPUT
model: COBRA model structure
biomassReaction: Biomass reaction abbreviation

OUTPUT
model: COBRA model structure
deletedSEEDRxns: Deleted gap-filled reactions

Almut Heinken and Stefania Magnusdottir, 2016-2019

doubleCheckGapfilledReactions(model, summary, biomassReaction, microbeID, database, definedMediumGrowthOK, inputDataFolder)

Part of the DEMETER pipeline. Deletes reactions gap-filled by DEMETER that are no longer needed after finishing all steps of the pipeline.

USAGE: [model, summary] = doubleCheckGapfilledReactions(model, summary, biomassReaction, microbeID, database, definedMediumGrowthOK, inputDataFolder)

INPUT
model: COBRA model structure
summary: Structure with information on refinement performed on the model
biomassReaction: Biomass reaction abbreviation
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
definedMediumGrowthOK: If 1, a defined medium is available for the organism and the model can grow on it
inputDataFolder: Folder with experimental data and database files to load

OUTPUT
model: COBRA model structure
summary: Structure with information on refinement performed on the model

fermentationPathwayGapfill(model, microbeID, database, inputDataFolder)

Gap-fills fermentation pathways in a microbial reconstruction based on experimental evidence.

USAGE: [model, addedRxns, removedRxns] = fermentationPathwayGapfill(model, microbeID, database, inputDataFolder)

INPUT
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
fermDataTable: Data table with binary data showing which microbe should have which fermentation pathway(s). Column 1: Cell array of strings with microbeIDs. Columns 2-29: Cell array of numbers, either 0 (microbe does not have pathway) or 1 (has pathway).
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.

OUTPUT
model: COBRA model structure
addedRxns: List of reactions that were added during refinement
removedRxns: List of reactions that were removed during refinement

Almut Heinken and Stefania Magnusdottir, 2016-2020
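The binary layout described for fermDataTable can be sketched in MATLAB as follows. This is an illustration under assumptions: the microbe IDs are hypothetical, only three pathway columns are shown instead of the 28 described (columns 2-29), and no header row is assumed.

```matlab
% Hypothetical binary table: column 1 holds microbeIDs, the remaining
% columns hold 0 (microbe does not have pathway) or 1 (has pathway).
fermDataTable = {
    'Microbe_A', 1, 0, 1
    'Microbe_B', 0, 0, 1
    };

microbeID = 'Microbe_A';

% Select the row belonging to the microbe of interest.
rowIdx = strcmp(fermDataTable(:, 1), microbeID);

% Convert the numeric cells of that row into a logical vector, one entry per pathway.
hasPathway = logical(cell2mat(fermDataTable(rowIdx, 2:end)));
```

A logical vector of this kind indicates, per pathway column, whether reactions for that pathway would be candidates for gap-filling in the given microbe.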

findTransportersWithoutExchanges(model)

Part of the DEMETER pipeline. Finds transporters to extracellular space that are blocked because they have no exchange reaction associated with them.

USAGE: [model, transportersWithoutExchanges] = findTransportersWithoutExchanges(model)

INPUT
model: COBRA model structure

OUTPUT
model: COBRA model structure
transportersWithoutExchanges: Removed transport reactions

Almut Heinken and Stefania Magnusdottir, 2016-2019

findUnusedExchangeReactions(model)

Part of the DEMETER pipeline. Finds exchange reactions that are no longer used and should be deleted after deleting unnecessary transport reactions gap-filled by Model SEED.

USAGE: [model, unusedExchanges] = findUnusedExchangeReactions(model)

INPUT
model: COBRA model structure

OUTPUT
model: COBRA model structure
unusedExchanges: Removed unused exchange reactions

Almut Heinken and Stefania Magnusdottir, 2016-2019

performDataDrivenRefinement(model, microbeID, biomassReaction, database, inputDataFolder, summary)

This function is part of the DEMETER pipeline and performs data-driven refinement of a genome-scale reconstruction based on available species-specific experimental data.

USAGE: [model, summary] = performDataDrivenRefinement(model, microbeID, biomassReaction, database, inputDataFolder, summary)

INPUTS
model: COBRA model structure to refine
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
inputDataFolder: Folder with input tables with experimental data and databases that inform the refinement process
summary: Structure with information on performed refinement

OUTPUT
refinedModel: Refined COBRA model structure
summary: Structure with information on performed refinement

putrefactionPathwaysGapfilling(model, microbeID, database)

This function adds exchange, transport and biosynthesis reactions for putrefaction pathways according to data collected from Ref. PMID:29163445 as part of the DEMETER pipeline.

USAGE: [model, rxnsAdded] = putrefactionPathwaysGapfilling(model, microbeID, database)

INPUTS
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.

OUTPUTS
model: COBRA model structure with added pathways, if applicable
rxnsAdded: Reactions added based on experimental data

rebuildBiomassReaction(model, microbeID, biomassReaction, database, infoFile)

Part of the DEMETER pipeline. This function rebuilds the biomass objective function (BOF) of the reconstruction based on taxonomic information for the organism. The biomass formulation is based on gram staining, taxonomy (Bacteria vs. Archaea), and phylum-specific features.

USAGE: [model, removedBioComp, addedReactionsBiomass] = rebuildBiomassReaction(model, microbeID, biomassReaction, database, infoFile)

INPUTS
model: COBRA model structure
biomassReaction: Biomass reaction abbreviation
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
infoFile: Table with taxonomic and gram staining information on microbes to reconstruct

OUTPUTS
model: COBRA model structure
removedBioComp: Removed components that should not be in the BOF
addedReactionsBiomass: Reactions that were added to enable flux through the rebuilt BOF

AUTHOR: Almut Heinken, 03/2020

rebuildModel(model, database, biomassReaction)

Rebuilds a genome-scale reconstruction with Virtual Metabolic Human (VMH) metabolite and reaction nomenclature while ensuring quality control through rBioNet.

USAGE: [rebuiltModel] = rebuildModel(model, database)

INPUT
model: COBRA model structure
database: Structure containing rBioNet reaction and metabolite database

OPTIONAL INPUT
biomassReaction: Biomass reaction abbreviation (if it needs to be specified; otherwise, it will be inferred automatically)

OUTPUT
rebuiltModel: Quality-controlled COBRA model structure

refineGenomeAnnotation(model, microbeID, database, inputDataFolder)

Part of the DEMETER pipeline. Refines a reconstruction based on comparative genomics data retrieved from PubSEED spreadsheets. Adds reactions linked to genes that were found in the respective organisms based on manual comparative genomic analyses. If the reaction is already present, the gene-protein-reaction association (GPR) is updated.

USAGE: [model, addAnnRxns, updateGPRCnt] = refineGenomeAnnotation(model, microbeID, database, inputDataFolder)

INPUTS
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
inputDataFolder: Folder with experimental data and database files to load

OUTPUTS
model: COBRA model structure
addAnnRxns: Reactions newly added based on comparative genomics data
updateGPRCnt: Reactions for which GPRs were updated based on comparative genomics data

refinementPipeline(model, microbeID, infoFilePath, inputDataFolder, translateModels)

This function runs the semi-automatic refinement pipeline on a draft reconstruction generated by the KBase pipeline or on a previously refined reconstruction.

USAGE: [refinedModel, summary] = refinementPipeline(modelIn, microbeID, infoFilePath, inputDataFolder, translateModels)

INPUTS
model: COBRA model structure to refine
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
infoFilePath: File with information on reconstructions to refine
inputDataFolder: Folder with input tables with experimental data and databases that inform the refinement process
translateModels: Boolean indicating whether to translate models if they are in KBase nomenclature (default: true)

OUTPUTS
refinedModel: COBRA model structure refined through AGORA pipeline
summary: Structure with description of performed refinement

removeUnannotatedReactions(model, microbeID, biomassReaction, growsOnDefinedMedium, inputDataFolder)

Part of the DEMETER pipeline. Refines a reconstruction based on comparative genomics data retrieved from PubSEED spreadsheets. Removes reactions that were present in the reconstruction before refinement but that are not annotated in the organism according to manually performed comparative genomic analyses.

USAGE: [model, rmUnannRxns] = removeUnannotatedReactions(model, microbeID, biomassReaction, growsOnDefinedMedium, inputDataFolder)

INPUTS
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
definedMediumGrowthOK: If 1, a defined medium is available for the organism and the model can grow on it
inputDataFolder: Folder with experimental data and database files to load

OUTPUTS
model: COBRA model structure
rmUnannRxns: Reactions removed based on comparative genomics data

runDemeter(draftFolder, varargin)

This function runs the DEMETER pipeline, which consists of three steps: 1) refining all draft reconstructions, 2) testing the refined reconstructions against the input data, 3) preparing a report detailing any additional debugging that needs to be performed.

USAGE: [refinedFolder, translatedDraftsFolder, summaryFolder, sbmlFolder] = runDemeter(draftFolder, varargin)

REQUIRED INPUTS
draftFolder: Folder with draft COBRA models generated by KBase pipeline to analyze

OPTIONAL INPUTS
translateModels: Boolean indicating whether to translate models if they are in KBase nomenclature (default: true)
refinedFolder: Folder with refined COBRA models generated by the refinement pipeline
translatedDraftsFolder: Folder with draft COBRA models with translated nomenclature and stored as mat files
infoFilePath: File with information on reconstructions to refine
inputDataFolder: Folder with experimental data and database files to load
summaryFolder: Folder with information on performed gap-filling and refinement
reconVersion: Name of the refined reconstruction resource (default: "Reconstructions")
numWorkers: Number of workers in parallel pool (default: 2)
createSBML: Defines whether refined reconstructions should be exported in SBML format (default: false)

OUTPUTS
reconVersion: Name of the refined reconstruction resource (default: "Reconstructions")
refinedFolder: Folder with refined COBRA models generated by the refinement pipeline
translatedDraftsFolder: Folder with draft COBRA models with translated nomenclature and stored as mat files
summaryFolder: Folder with information on performed gap-filling and refinement
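The optional inputs above are passed through varargin. One common MATLAB idiom for such inputs is name-value parsing with inputParser; the sketch below assumes that convention, which this page does not confirm, and uses only the defaults documented above. The variable names `args`, `parser` and `opts` are hypothetical.

```matlab
% Hypothetical name-value pairs a caller might supply (assumed calling convention).
args = {'numWorkers', 4, 'createSBML', true};

% Register the documented optional inputs together with their documented defaults.
parser = inputParser;
addParameter(parser, 'translateModels', true, @islogical);
addParameter(parser, 'numWorkers', 2, @isnumeric);
addParameter(parser, 'reconVersion', 'Reconstructions', @ischar);
addParameter(parser, 'createSBML', false, @islogical);

% Parse the supplied pairs; options that were not supplied keep their defaults.
parse(parser, args{:});
opts = parser.Results;
```

With this pattern only the options that differ from the defaults need to be given, which suits a pipeline with many optional folders and switches.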

secretionProductGapfill(model, microbeID, database, inputDataFolder)

This function adds exchange, transport and biosynthesis reactions for metabolites that were experimentally shown to be secreted, according to data collected for the DEMETER pipeline.

USAGE: [model, secretionRxnsAdded] = secretionProductGapfill(model, microbeID, database, inputDataFolder)

INPUTS
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
inputDataFolder: Folder with input tables with experimental data and databases that inform the refinement process

OUTPUTS
model: COBRA model structure with added pathways, if applicable
secretionRxnsAdded: Reactions added based on experimental data

translateDraftReconstruction(model)

This function translates reaction and metabolite IDs in KBase draft reconstructions to VMH nomenclature so the test suite can be performed on them. It does not otherwise change the reaction content.

USAGE: [translatedModel] = translateDraftReconstruction(model)

INPUT
model: COBRA model structure generated by KBase reconstruction pipeline

OUTPUT
translatedModel: COBRA model structure with metabolites and reactions translated to VMH identifiers (but not run through the refinement pipeline)

AUTHOR: Almut Heinken, 09/2020

translateKBaseModel2VMHModel(model, biomassReaction, database)

Translates reaction and metabolite identifiers from a KBase/ModelSEED reconstruction to the Virtual Metabolic Human (https://vmh.life) reaction and metabolite nomenclature. The reaction and metabolite database with VMH identifiers, as well as the translation table from KBase/ModelSEED to VMH reaction and metabolite identifiers, are retrieved from the folder cobratoolbox/papers/2018_microbiomeModelingToolbox/database. Note that there will likely be reactions and metabolites that are not yet included in the translation table and will thus be missing from the translated model.

INPUT
model: COBRA model structure derived from KBase/ModelSEED
biomassReaction: Biomass reaction abbreviation

OUTPUTS
translatedModel: Translated COBRA model structure
notInTableRxns: Reactions that are currently not in translation table
notInTableMets: Metabolites that are currently not in translation table

AUTHORS
Stefania Magnusdottir, Oct 2017
Almut Heinken, Dec 2018 - simplified inputs

uptakeMetaboliteGapfill(model, microbeID, database, inputDataFolder)

This function adds exchange, transport and biosynthesis reactions for metabolites that were experimentally shown to be consumed, according to data collected for the DEMETER pipeline.

USAGE: [model, uptakeRxnsAdded] = uptakeMetaboliteGapfill(model, microbeID, database, inputDataFolder)

INPUTS
model: COBRA model structure
microbeID: ID of the reconstructed microbe that serves as the reconstruction name and to identify it in input tables
database: rBioNet reaction database containing at least 3 columns. Column 1: reaction abbreviation, Column 2: reaction name, Column 3: reaction formula.
inputDataFolder: Folder with input tables with experimental data and databases that inform the refinement process

OUTPUTS
model: COBRA model structure with added pathways, if applicable
uptakeRxnsAdded: Reactions added based on experimental data