Utilities

SBML-utilities

buildRxnGeneMat(model)

Builds the rxnGeneMat based on the rules field of the given model.

Usage

model = buildRxnGeneMat(model)

Input

model – Model for which to build the rxnGeneMat. It must have the rules field; otherwise the rxnGeneMat is empty.

Output

model – The Model including a rxnGeneMat field.
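The relation between rules and rxnGeneMat can be illustrated in plain MATLAB. This is a minimal sketch, not the toolbox implementation. It assumes that each rule references genes as x(i), where i indexes model.genes; that syntax and the toy model are assumptions made for illustration.

```matlab
% Toy model (hypothetical data): three reactions, three genes.
model.rxns  = {'R1'; 'R2'; 'R3'};
model.genes = {'g1'; 'g2'; 'g3'};
% Assumed rule syntax: x(i) refers to model.genes{i}.
model.rules = {'x(1) & x(2)'; ''; 'x(3) | x(1)'};

nRxns  = numel(model.rxns);
nGenes = numel(model.genes);
rxnGeneMat = sparse(nRxns, nGenes);

for i = 1:nRxns
    % Collect every gene index mentioned in the rule of reaction i.
    tokens  = regexp(model.rules{i}, 'x\((\d+)\)', 'tokens');
    geneIdx = cellfun(@(t) str2double(t{1}), tokens);
    % Mark the genes associated with reaction i (no-op for an empty rule).
    rxnGeneMat(i, geneIdx) = 1;
end

disp(full(rxnGeneMat));
```

A reaction with an empty rule yields an all-zero row, which matches the statement above that the matrix stays empty when no rules are available.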

convertOldStyleModel(model, printLevel)

Converts several old fields to their replacement.

Usage

model = convertOldStyleModel(model)

model = convertOldStyleModel(model, printLevel)

Input

model – a COBRA Model (potentially with old field names)

Optional input

printLevel – indicates whether warnings and messages are given (default: 1).

Output

model – a COBRA model with old field names replaced by new ones and duplicated fields merged.

Note

There are multiple fields which were used inconsistently over the course of COBRA Toolbox development. This function provides a simple way to convert these model fields to their current names. In addition, some fields that were commonly absent from older models are now routinely checked. These fields are initialized by this function with default values that do not alter any previous behaviour.

The model fields changed are as follows:

  * ‘confidenceScores’ -> ‘rxnConfidenceScores’
  * ‘metCharge’ -> ‘metCharges’
  * ‘ecNumbers’ -> ‘rxnECNumbers’
  * ‘KEGGID’ -> ‘metKEGGID’
  * ‘metKeggID’ -> ‘metKEGGID’
  * ‘rxnKeggID’ -> ‘rxnKEGGID’
  * ‘metInchiString’ -> ‘metInChIString’
  * ‘metSmile’ -> ‘metSmiles’
  * ‘metHMDB’ -> ‘metHMDBID’

If both an old and a new field are present, data from the old field is merged into the new field, with the data of the new field taking precedence (i.e. if no data is present in the new field at a given position, the old field data replaces it; otherwise the new field data is kept).

Furthermore, fields deemed to be required for flux balance analysis are generated if not present:

  * osenseStr: Objective sense. By default this field is initialized as ‘max’. If osense is present, a -1 will be translated as ‘max’ and a 1 will be translated as ‘min’.
  * csense: Constraint sense. This field indicates the sense of each constraint in b, i.e. whether the entry of b stands for a lower-than (‘L’), greater-than (‘G’) or equality (‘E’) constraint. It is initialized as a char vector of ‘E’ with the same size as model.mets.
  * genes: A field for the genes present in the model.
  * rules: The rules field is a logical representation of the GPR rules and is used in multiple functions. If the grRules field is present, this field will be initialized according to grRules; otherwise it will be initialized as a cell array of the same size as model.rxns with an empty string in each cell.

The rev field was deprecated and is therefore removed. For reversibility determination the toolbox relies on the lower bounds of the reactions.

The following fields might be altered to adhere to the definitions in the COBRAModelFields documentation:

  * rxnConfidenceScores: This field is defined as a number between 0 and 4 indicating the confidence of a reaction. It is therefore assumed to be a double vector in COBRA functions. Some old models provide it as strings or as a numeric cell array. Those fields are converted to double vectors, with the data retained.
  * Fields with cell arrays: Some older models contain cell array fields in which individual cells are numeric (i.e. empty []). These empty cells are replaced by ‘’ for those fields which are defined in the COBRAModelFields file as cell arrays of chars.
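The merge precedence and the osense translation described in this note can be sketched in plain MATLAB. This is an illustration under assumptions, not the toolbox code: the toy identifiers are hypothetical, and removing the old field after the merge is assumed from the statement that old field names are replaced.

```matlab
% Hypothetical model carrying both the old and the new field name.
model.metKeggID = {'C00031'; 'C00002'; ''};   % old field
model.metKEGGID = {''; 'C99999'; ''};         % new field

% Positions where the new field holds no data are filled from the old field;
% positions that already hold data in the new field are kept.
emptyInNew = cellfun(@isempty, model.metKEGGID);
model.metKEGGID(emptyInNew) = model.metKeggID(emptyInNew);
model = rmfield(model, 'metKeggID');          % assumed: old field is dropped

% osense -> osenseStr: -1 is translated to 'max', 1 to 'min'.
model.osense = -1;
if model.osense == -1
    model.osenseStr = 'max';
else
    model.osenseStr = 'min';
end
```

In this toy case the second entry keeps the value from the new field, while the first entry is taken from the old field because the new field was empty there.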

createEmptyFields(model, fieldNames, fieldDefinitions)

Create the specified model field with its default values. Works only for fields defined in the toolbox if fieldDefinitions are not supplied.

Usage

model = createEmptyFields(model, fieldNames, fieldDefinitions)

Inputs

model – The model to add a field to

fieldNames – The names of the fields to add.

Optional input

fieldDefinitions – The Specifications of the field. Only necessary if a field which is not defined yet should be added.

Output

model – The original model struct with the specified field added.

Author: Thomas Pfau, Nov 2017

creategrRulesField(model, positions)

Generates the grRules optional model field from the required rules and gene fields.

Usage

modelWithField = creategrRulesField(model, positions)

Input

model – The COBRA model structure to generate the grRules field for. An existing grRules field will be overwritten.

Optional input

positions – The positions to update. Can be supplied either as logical array, double indices, or reaction names (Default: model.rxns)

Output

model – The Output model with a grRules field
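The conversion from rules to grRules can be pictured with a small plain-MATLAB sketch. It is not the toolbox implementation. It assumes that rules reference genes as x(i) and use & and | as operators, and that grRules use gene names with the words and/or; the gene names are toy values.

```matlab
% Toy data (hypothetical): two genes, two reactions.
model.genes = {'3098.3'; '80201.1'};
model.rules = {'x(1) | x(2)'; ''};    % assumed rule syntax

grRules = model.rules;
for r = 1:numel(grRules)
    rule = grRules{r};
    % Replace the logical operators by their textual form.
    rule = strrep(rule, '&', 'and');
    rule = strrep(rule, '|', 'or');
    % Replace each gene reference x(g) by the corresponding gene name.
    for g = 1:numel(model.genes)
        rule = strrep(rule, sprintf('x(%d)', g), model.genes{g});
    end
    grRules{r} = rule;
end
model.grRules = grRules;
```

Reactions without a rule keep an empty string, so the resulting field has the same size as model.rxns would have.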

generateFieldDescriptionFile(FileName)

Generates the ModelFields.md file describing the Required and Optional Fields of a COBRA model.

Usage

FileString = generateFieldDescriptionFile(FileName)

Optional input

FileName – The FileName to write to. (default [CBTDIR filesep ‘docs’ filesep ‘notes’ filesep ‘COBRAModelFields.md’])

Output

FileString – The string written to the specified filename (or the default file)

getDatabaseMappings(field, qualifiers)

getDatabaseMappings returns information on known mappings of database entries to model field names, along with additional information about the fields.

Input

field – the basic model field to extract mappings for (e.g. ‘met’, ‘gene’, ‘rxn’)

Optional input

qualifiers – the qualifiers to restrict the selection to. These have to be part of the bioql modifiers definition (e.g. is, isDescribedBy, isEncodedBy etc.). Providing ‘all’ will return all associated database mappings. Default: ‘all’

Output

returnedmappings – The mappings known for the given field. The structure is:

X{:,1} : the database ID (in identifiers.org/miriam annotation)

X{:,2} : the qualifier associated with the DB

X{:,3} : the model field associated with this DB

X{:,4} : the association field (met/rxn/gene/prot/comp)

X{:,5} : the specified regular expression the identifier has to adhere to

X{:,6} : the type of the qualifier (modelQualifier or bioQualifier)

getDefaultCompartmentSymbols()

Returns the default compartment symbol and name lists to use for model IO or compartment matching.

Usage

[defaultCompartmentSymbolList, defaultCompartmentNameList] = getDefaultCompartmentSymbols()

Outputs

defaultCompartmentSymbolList – a List of abbreviations of compartment names

defaultCompartmentNameList – a List of names of compartments where element i corresponds to the i-th abbreviation in defaultCompartmentSymbolList

getDefaultCompartments()

Returns the default compartment symbols and the names of the default compartments.

Usage

[compSymbolList, compNameList] = getDefaultCompartments()

Outputs

compSymbolList – Default symbols of compartments

compNameList – Names of the default compartments.

getDefinedFieldProperties(varargin)

Returns the fields defined in the COBRA Toolbox along with checks for their properties

Usage

[fields] = getDefinedFieldProperties(varargin)

Optional input

varargin – The following parameter/value pairs can be used:

  * Descriptions: Whether to obtain the field descriptions (default = false).
  * SpecificFields: Indication whether to only obtain definitions for a specific set of fields (default all).
  * DataBaseFields: Get the fields with specified Database relations (true, if requested).

Output

fields – All fields and their properties as requested. If fields without definitions are requested, they will not be contained in the result.

Note

The optional inputs are to be provided as parameter/value pairs. The returned cell arrays are structured as follows:

Default:

X{:,1} are the field names

X{:,2} are the associated fields for the first dimension (i.e. size(model.(X{A,1}),1) == size(model.(X{A,2}),1) has to evaluate to true)

X{:,3} are the associated fields for the second dimension (i.e. size(model.(X{A,1}),2) == size(model.(X{A,3}),1) has to evaluate to true)

X{:,4} are evaluable statements, which have to evaluate to true for the model to be valid. These mainly check the content types.

X{:,5} are default values (or evaluable strings for cell arrays)

E.g.

x = model.(X{A, 1});

eval(X{A, 4}) has to return 1

DataBaseFields:

X{:, 1} - database id

X{:, 2} - qualifier

X{:, 3} - model Field

X{:, 4} - model field reference (without s)

X{:, 5} - Patterns for ids from the respective database.
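The way a definition row is meant to be evaluated can be sketched as follows. The row below is a hypothetical example written for illustration; real rows are returned by getDefinedFieldProperties and may differ in content. The use of NaN for an unused second dimension is likewise an assumption.

```matlab
% Hypothetical definition row (not taken from the toolbox).
% Columns: name, first-dimension field, second-dimension field, check, default.
X = {'lb', 'rxns', NaN, 'isnumeric(x)', -1000};

% Toy model (hypothetical data).
model.rxns = {'R1'; 'R2'};
model.lb   = [-1000; 0];

A = 1;
x = model.(X{A, 1});          % the check statement refers to the field as x
contentOk = eval(X{A, 4});    % has to evaluate to true for a valid model

% The first dimension has to match the associated field.
sizeOk = size(model.(X{A, 1}), 1) == size(model.(X{A, 2}), 1);
```

Both flags are true for this toy model, since lb is numeric and has one row per reaction.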

getDistributedModel(modelName, description)

Loads the indicated model from the models submodule.

Usage

model = getDistributedModel(modelName)

model = getDistributedModel(modelName, description)

Input

modelName – The name of the model including the file extension

Optional input

description – The value to set the model description to, if a specific description is desired

Output

model – The loaded model from the models submodule (i.e. those distributed for the test suite)

getDistributedModelFolder(modelName)

Identifies the folder a distributed model is located in. This function only works with models which are part of the models submodule.

Usage

modelDir = getDistributedModelFolder(modelName)

Input

modelName – The name of the model including the file extension

Output

modelDir – The folder the model should be located in.

getMultiDimensionFields(fieldDefinitions)

Get those fields which have multiple dimensions depending on another field from the definitions

Usage

[fieldNames,firstDim,secondDim] = getMultiDimensionFields(fieldDefinitions)

Input

fieldDefinitions – Field definitions as obtained from getDefinedFieldProperties()

Outputs

fieldNames – The names of the multi-dimensional fields

firstDim – the referenced field in the first dimension

secondDim – the referenced field in the second dimension

initFBAFields(model, printLevel)

This function initializes all fields in a model that are required for downstream FBA analysis. It does so if and only if a Stoichiometric matrix S is provided.

Usage

model = initFBAFields(model, printLevel)

Input

model – a COBRA Model structure with at least the model.S field. All fields already present are retained and absent fields are initialized with their defaults.

Optional input

printLevel – indicates whether warnings and messages are given (default: 1).

Output

model – a COBRA model struct with the following fields:

.S - same as input

.rxns - default: a vector of strings R1 .. Rn with n = size(S,2)

.mets - default: a vector of strings M1 .. Mm with m = size(S,1)

.lb - default: -1000 * ones(size(S,2),1)

.ub - default: 1000 * ones(size(S,2),1)

.genes - default: cell(0,1)

.rules - default: repmat({‘’},size(S,2),1)

.osense - default: -1

.csense - default: a char vector of ‘E’ of size size(S,1) x 1
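The defaults listed above can be reproduced for a toy matrix in plain MATLAB. This sketch builds all fields from scratch for illustration; the documented function only initializes fields that are absent, and the matrix below is hypothetical.

```matlab
% Toy stoichiometric matrix (hypothetical): 2 metabolites x 3 reactions.
S = [1 -1  0;
     0  1 -1];
[nMets, nRxns] = size(S);

model.S      = S;
% Default identifiers R1 .. Rn and M1 .. Mm as column cell arrays.
model.rxns   = arrayfun(@(i) sprintf('R%d', i), (1:nRxns)', 'UniformOutput', false);
model.mets   = arrayfun(@(i) sprintf('M%d', i), (1:nMets)', 'UniformOutput', false);
model.lb     = -1000 * ones(nRxns, 1);
model.ub     =  1000 * ones(nRxns, 1);
model.genes  = cell(0, 1);
model.rules  = repmat({''}, nRxns, 1);
model.osense = -1;
model.csense = repmat('E', nMets, 1);   % one equality constraint per metabolite
```

Every reaction-sized field has size(S,2) rows and every metabolite-sized field has size(S,1) rows, which is the consistency that downstream FBA code relies on.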

loadIdentifiedModel(filename, directory)

Load a single cobra toolbox model saved as a filename.mat file, then rename the model structure ‘model’ while retaining the original name of the model structure in model.modelID

Usage

model = loadIdentifiedModel(filename, directory)

Inputs

filename – name of the .mat file containing cobra toolbox model structure

directory – directory where the .mat file resides.

Output

model – COBRA model structure

mergeTextDataAndData(textdata, data, headings)

Merges textdata and data imported from .xls file assuming that the first row of textdata is column headings

Usage

mergedData = mergeTextDataAndData(textdata, data, headings)

Inputs

textdata – cell array from .xls import

data – matrix with numeric data from .xls import

Optional input

headings – {(1), 0}, zero if no column headings

Output

mergedData – merged cell array with all data from .xls import

model2xls(model, fileName, compSymbols, compNames)

Writes a model to an Excel spreadsheet.

Usage

model2xls(model, fileName, compSymbols, compNames)

Inputs

model – A COBRA model struct

fileName – filename with an xls extension.

Optional inputs

compSymbols – Symbols of compartments used in metabolite ids

compNames – Names of the compartments identified by the symbols

Example

'Reaction List' tab headers (case sensitive):

  * Required:

    * 'Abbreviation':      HEX1
    * 'Reaction':          `1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]`
    * 'GPR':               (3098.3) or (80201.1) or (2645.3) or ...
  * Optional:

    * 'Description':       Hexokinase
    * 'Subsystem':         Glycolysis
    * 'Reversible':        0 (false) or 1 (true)
    * 'Lower bound':       0
    * 'Upper bound':       1000
    * 'Objective':         0/1
    * 'Confidence Score':  0,1,2,3,4
    * 'EC Number':         2.7.1.1;2.7.1.2
    * 'KEGG ID':           R000001
    * 'Notes':             Reaction also associated with EC 2.7.1.2
    * 'References':        PMID:2043117;PMID:7150652,...

'Metabolite List' tab headers (case sensitive). The tab needs to contain the complete list of metabolites,
i.e., if a metabolite appears in multiple compartments it has to be represented in multiple rows.
Abbreviations need to match those used in the Reaction List.

  * Required

    * 'Abbreviation':      glc-D or glc-D[c]
  * Optional:

    * 'Charged formula' or formula:   C6H12O6
    * 'Charge':                       0
    * 'Compartment':                  cytosol
    * 'Description':                  D-glucose
    * 'KEGG ID':                      C00031
    * 'PubChem ID':                   5793
    * 'ChEBI ID':                     4167
    * 'InChI string':                 InChI=1/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2/t2-,3-,4+,5-,6?/m1/s1
    * 'SMILES':                       OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O
    * 'HMDB ID':                      HMDB00122

Note

Optional inputs may be required for input on unix machines.

outputHypergraph(model, weights, fileName)

Outputs a metabolic reaction network hypergraph with reaction weights to a file.

Output format: Rxn metabolite_1 metabolite_2 ... metabolite_n rxnWeight

Usage

outputHypergraph(model, weights, fileName)

Inputs

model – Standard model structure

weights – Weights for each reaction

fileName – Output filename
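The output format can be made concrete with a small plain-MATLAB sketch that builds one line per reaction. It does not call the toolbox; the toy model is hypothetical, and the use of a single space as separator is an assumption based on the format shown above.

```matlab
% Toy model (hypothetical data): two reactions, three metabolites.
model.rxns = {'R1'; 'R2'};
model.mets = {'A'; 'B'; 'C'};
model.S    = [-1  0;
              -1  1;
               1 -1];
weights = [0.5; 2];

outLines = cell(numel(model.rxns), 1);
for j = 1:numel(model.rxns)
    % Metabolites taking part in reaction j (non-zero entries of column j).
    metNames = model.mets(model.S(:, j) ~= 0);
    % Line layout: Rxn metabolite_1 ... metabolite_n rxnWeight
    outLines{j} = sprintf('%s %s %g', model.rxns{j}, ...
                          strjoin(metNames', ' '), weights(j));
end
```

Each entry of outLines would correspond to one line of the output file, with the reaction weight as the last token.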

outputNetworkCytoscape(model, fileBase, rxnList, rxnData, metList, metData, metDegreeThr)

Outputs a metabolic network in Cytoscape format

Usage

notShownMets = outputNetworkCytoscape(model, fileBase, rxnList, rxnData, metList, metData, metDegreeThr)

Inputs

model – COBRA metabolic network model

fileBase – Base file name (without extensions) for Cytoscape input files that are generated by the function

Optional inputs

rxnList – List of reactions that will be included in the output (Default = all reactions)

rxnData – Vector or matrix of data or cell array of strings to output for each reaction in rxnList (Default = empty)

metList – List of metabolites that will be included in the output (Default = all metabolites)

metData – Vector or matrix of data or cell array of strings to output for each metabolite in metList (Default = empty)

metDegreeThr – Maximum degree of metabolites that will be included in the output. Allows filtering out highly connected metabolites such as ‘h2o’ or ‘atp’ (Default = no filtering)

Output

notShownMets – Metabolites that are not included in the output

Note

Outputs three to five files:

[baseName].sif - Basic network structure file containing reaction-metabolite and gene-reaction (if provided in model) associations

[baseName]_nodeType.noa - Describes the node types (gene, rxn, met) in the network

[baseName]_nodeComp.noa - Describes the compartments for metabolites

[baseName]_subSys.noa - Describes the subsystems for reactions (if provided)

[baseName]_rxnMetData.noa - Reaction and metabolite data (if provided)
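The effect of metDegreeThr can be sketched in plain MATLAB. This illustration assumes that the degree of a metabolite is the number of reactions it takes part in and that the threshold is inclusive; neither detail is stated above, and the toy model is hypothetical.

```matlab
% Toy model (hypothetical data): 'h2o' takes part in every reaction.
model.mets = {'a'; 'b'; 'h2o'};
model.S    = [-1  0  1;
               1 -1  0;
              -1 -1 -1];
metDegreeThr = 2;

% Degree of a metabolite: number of reactions with a non-zero coefficient.
metDegree = full(sum(model.S ~= 0, 2));

% Metabolites above the threshold are left out of the network.
shownMets    = model.mets(metDegree <= metDegreeThr);
notShownMets = model.mets(metDegree >  metDegreeThr);
```

Here only 'h2o' exceeds the threshold, so it would be the metabolite reported as not shown.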

outputNetworkOmix(model, rxnBool)

Outputs a text file for import into omix http://www.13cflux.net/omix/

Usage

outputNetworkOmix(model, rxnBool)

Input

model – COBRA model structure

Optional input

rxnBool – boolean vector with 1 for each reaction to be exported

parseSBMLAnnotationField(annotationField)

Parses the annotation field of an SBML file to extract metabolite information associations

Usage

[metCHEBI, metHMDB, metKEGG, metPubChem, metInChI] = parseSBMLAnnotationField(annotationField)

Input

annotationField – annotation field of an SBML file

Outputs

metCHEBI – Formula for each metabolite in the ChEBI format

metHMDB – Formula for each metabolite in the HMDB format

metKEGG – Formula for each metabolite in the KEGG format

metPubChem – PubChem ID of each metabolite

metInChI – Formula for each metabolite in the InChI string format

parseSBMLAnnotationFieldRxn(annotationField)

Parses the annotation field of an SBML file to extract reaction information associations

Usage

[rxnEC, rxnReference] = parseSBMLAnnotationFieldRxn(annotationField)

Input

annotationField – annotation field of an SBML file

Output

rxnEC,rxnReference – only one of them is not empty depending on annotationField

parseSBMLNotesField(notesField)

Parses the notes field of an SBML file to extract gene-rxn associations

Usage

[subSystem, grRule, formula, confidenceScore, citation, comment, ecNumber, charge] = parseSBMLNotesField(notesField)

Input

notesField – notes field of SBML file

Outputs

subSystem – subSystem assignment for each reaction

grRule – a string representation of the GPR rules defined in a readable format

formula – elemental formula

confidenceScore – confidence scores for reaction presence

citation – joins strings with authors

comment – comments and notes

ecNumber – E.C. number for each reaction

charge – charge of the respective metabolite

planariseModel(model, replicateMetBool)

Converts model into a form that is suitable for display as a planar hypergraph

Usage

[modelPlane, replicateMetBool, metData, rxnData] = planariseModel(model, replicateMetBool)

Inputs

model – model structure

replicateMetBool – met x 1 boolean vector of metabolites to be replicated for each reaction

Outputs

modelPlane – structure with fields:

.S - matrix

.mets - metabolites

.origMets - original metabolites

replicateMetBool – as in input

metData – data of metabolites

rxnData – data of reactions

readBooleanRegModel(metModel, fileName)

Reads Boolean regulatory network model

Usage

regModel = readBooleanRegModel(metModel, fileName)

Input

metModel – model

Optional input

fileName – file name

Output

regModel – model containing the following fields:

.mets - Metabolite rules:

name - Metabolite/pool name (internal to the reg network model)

rule - Metabolite ‘activation’ rule

type - Metabolite type (extra/intracellular/pool)

excInd - Exchange flux indices corresponding to extracellular metabolites

icmRules - Intracellular metabolite ‘activation’ rules (based on a flux vector - fluxVector)

pool - Pool components

.regs - Regulator rules

name - Regulator name

rule - Regulator rule

comp - Regulator rule components (i.e. metabolites or other regulators that affect the state of this regulator)

ruleParsed - Rule in parsed format (based on metabolite state - metState, regulator state - regState)

.tars - Target rules

name - Target name

rule - Target rule

comp - Target rule components (i.e. metabolites or other regulators that affect the state of this regulator)

ruleParsed - Rule in parsed format (based on metabolite state - metState, regulator state - regState)

readSBML(fileName, defaultBound)

Reads in an SBML format model as a COBRA MATLAB structure

Usage

model = readSBML(fileName, defaultBound)

Input

fileName – File name for file to read in

Optional input

defaultBound – Maximum bound for model (Default = 1000)

Output

model – COBRA model structure

readSimPhenyCMPD(fileName)

Reads SimPheny compound file obtained from admin console

Usage

[metInfo, mets] = readSimPhenyCMPD(fileName)

Input

fileName – SimPheny compound file name

Outputs

metInfo – Structure containing data on metabolites

mets – List of metabolites

readSimPhenyGPR(fileName)

Reads SimPheny gene-protein-reaction association file obtained from admin console

Usage

[rxnInfo, rxns, allGenes] = readSimPhenyGPR(fileName)

Input

fileName – SimPheny GPR file

Outputs

rxnInfo – Structure containing data for each reaction

rxns – List of reactions

allGenes – List of all genes

readSimPhenyGprText(file, model)

Parses SimPheny GPRAs in text format into a rxn x gene association matrix

Usage

gpraModel = readSimPhenyGprText(file, model)

Inputs

file – GPR text file

model – COBRA model structure

Output

gpraModel – COBRA model structure with reaction-gene association matrix

restrictModelsToFields(models, fieldNames)

Removes all fields not given as fieldnames from the models

Usage

restrictedModels = restrictModelsToFields(models, fieldNames)

Inputs

models – A cell array of model structs (or a single model struct) that have all fieldNames provided.

fieldNames – Names of the fields the models will be restricted to.

Output

restrictedModels – The models with the non-named fields removed, or a single struct if it is just one model.
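For a single model struct, the restriction can be sketched in plain MATLAB with setdiff and rmfield. This is an illustration of the idea rather than the toolbox code, and the toy model with its extra description field is hypothetical.

```matlab
% Toy model (hypothetical data) with one field that should be dropped.
model.S           = [1 -1];
model.rxns        = {'R1'; 'R2'};
model.mets        = {'M1'};
model.description = 'toy model';

fieldNames = {'S', 'rxns', 'mets'};

% Every field that is not listed in fieldNames is removed.
extraFields     = setdiff(fieldnames(model), fieldNames);
restrictedModel = rmfield(model, extraFields);
```

Applied to a cell array of models, the same two lines could be run once per element, which is why all models need to contain every requested field.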

write4ti2(SeFull, filename, uni)

Writes an input file for 4ti2. 4ti2 is a software package for algebraic, geometric and combinatorial problems on linear spaces - www.4ti2.de

Usage

write4ti2(SeFull, filename, uni)

Inputs

SeFull – full stoichiometric matrix

filename – name of the file

Optional input

uni – {(0),1}, uni = 1 only outputs every second reaction

writeCytoscapeEdgeAttributeTable(model, C, B, N, replicateMetBool, filename)

Writes out a set of boolean edge attributes as one of a pair of colours, ‘Red’ for ‘yes’, ‘Black’ for ‘no’

Usage

writeCytoscapeEdgeAttributeTable(model, C, B, N, replicateMetBool, filename)

Inputs

model – structure with obligatory field .S - met x reaction

C – reaction x attribute cell array

B – reaction x attribute Boolean matrix

N – reaction x attribute numeric array

replicateMetBool – boolean for replicated mets

filename – name of the file

writePajekNet(model)

Builds a metabolite-centric directed graph from a COBRA model and outputs the graph in .net format, ready to use with most graph analysis software, e.g. Pajek. It performs one FBA to set the link widths equal to the reaction fluxes.

Usage

writePajekNet(model)

Input

model – a COBRA structured model

Output

.net – file containing the graph

Ex: A + B -> C (hypergraph) with v = 0 => no output (empty line)

if v > 0 then it becomes A -> C; B -> C (graph),

if v < 0 then the order is reversed
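The conversion in this example can be sketched in plain MATLAB for a single reaction. The sketch does not write a .net file and is not the toolbox implementation; the flux value is an assumed number standing in for an FBA result.

```matlab
% Toy reaction A + B -> C (hypothetical data) with flux v.
mets = {'A'; 'B'; 'C'};
s    = [-1; -1; 1];      % stoichiometric column of the reaction
v    = 2;                % assumed flux value

substrates = mets(s < 0);
products   = mets(s > 0);
if v < 0
    % Negative flux: the direction of every edge is reversed.
    [substrates, products] = deal(products, substrates);
end

edges = cell(0, 3);      % columns: source, target, link width
if v ~= 0                % zero flux produces no edges
    for i = 1:numel(substrates)
        for k = 1:numel(products)
            edges(end + 1, :) = {substrates{i}, products{k}, abs(v)};
        end
    end
end
```

With v = 2 this yields the two edges A -> C and B -> C, each with width 2; setting v = -2 would give C -> A and C -> B instead.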

writeSBML(model, fileName, compSymbolList, compNameList)

Exports a COBRA structure into an SBML FBCv2 file. The SBML FBCv2 file is written to the current MATLAB path.

Usage

sbmlModel = writeSBML(model, fileName, compSymbolList, compNameList)

Inputs

model – COBRA model structure

fileName – File name for output file

Optional inputs

compSymbolList – List of compartment symbols

compNameList – List of compartment names corresponding to compSymbolList

Output

sbmlModel – SBML MATLAB structure

xls2model(fileName, biomassRxnEquation, defaultbound)

Reads a model from Excel spreadsheet.

Usage

model = xls2model(fileName, biomassRxnEquation, defaultbound)

Input

fileName – xls spreadsheet, with one ‘Reaction List’ and one ‘Metabolite List’ tab

Optional inputs

biomassRxnEquation – .xls may have a 255 character limit on each cell, so pass the biomass reaction separately if it hits this maximum.

defaultbound – the default bound for lower and upper bounds, if no bounds are specified in the Excel sheet

Output

model – COBRA Toolbox model

Example

'Reaction List' tab headers (case sensitive):

  * Required:

    * 'Abbreviation':      HEX1
    * 'Reaction':          `1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]`

  * Optional:

    * 'GPR':               (3098.3) or (80201.1) or (2645.3) or ...
    * 'Description':       Hexokinase
    * 'Subsystem':         Glycolysis
    * 'Reversible':        0 (false) or 1 (true)
    * 'Lower bound':       0
    * 'Upper bound':       1000
    * 'Objective':         0/1
    * 'Confidence Score':  0,1,2,3,4
    * 'EC Number':         2.7.1.1,2.7.1.2
    * 'KEGG ID':           R000001
    * 'Notes':             Reaction also associated with EC 2.7.1.2
    * 'References':        PMID:2043117,PMID:7150652,...

'Metabolite List' tab headers (case sensitive). The tab needs to contain the complete list of metabolites,
i.e., if a metabolite appears in multiple compartments it has to be represented in multiple rows.
Abbreviations need to match those used in the Reaction List.

  * Required

    * 'Abbreviation':      glc-D or glc-D[c]
  * Optional:

    * 'Charged formula' or formula:   C6H12O6
    * 'Charge':                       0
    * 'Compartment':                  cytosol
    * 'Description':                  D-glucose
    * 'KEGG ID':                      C00031
    * 'PubChem ID':                   5793
    * 'ChEBI ID':                     4167
    * 'InChI string':                 InChI=1/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2/t2-,3-,4+,5-,6?/m1/s1
    * 'SMILES':                       OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O
    * 'HMDB ID':                      HMDB00122

Note

Optional inputs may be required for input on unix machines.

Note

Find an example Excel sheet at docs/source/examples/ExcelExample.xlsx