Utilities
-
buildRxnGeneMat
(model) Builds the rxnGeneMat based on the given model's rules field.
Usage
model = buildRxnGeneMat(model)
Input
- model – Model to build the rxnGeneMat for. Must have the rules field; otherwise the rxnGeneMat is empty.
Output
- model – The Model including a rxnGeneMat field.
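The relationship between the rules field and the rxnGeneMat can be illustrated with a small Python sketch (illustrative only; the toolbox itself is MATLAB). It assumes the usual COBRA convention that each rule refers to genes by position, e.g. x(1) | x(2), which this page does not spell out. The helper name build_rxn_gene_mat is hypothetical.

```python
import re

def build_rxn_gene_mat(rules, n_genes):
    """Return a reactions x genes 0/1 matrix from position-based rules.

    Assumption: rules reference genes as x(i) with 1-based i.
    """
    mat = [[0] * n_genes for _ in rules]
    for r, rule in enumerate(rules):
        for idx in re.findall(r"x\((\d+)\)", rule):
            mat[r][int(idx) - 1] = 1   # gene idx takes part in reaction r
    return mat

rules = ["x(1) | x(2)", "", "x(3) & x(1)"]
rxn_gene_mat = build_rxn_gene_mat(rules, n_genes=3)
```

A reaction with an empty rule simply gets an all-zero row in this sketch.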
-
convertOldStyleModel
(model, printLevel) Converts several old fields to their replacements.
Usage
model = convertOldStyleModel(model)
model = convertOldStyleModel(model, printLevel)
Input
- model – a COBRA Model (potentially with old field names)
Optional input
- printLevel – indicates whether warnings and messages are given (default: 1).
Output
- model – a COBRA model with old field names replaced by new ones and duplicated fields merged.
Note
Several fields have been used inconsistently over the course of the COBRA Toolbox's development. This function provides a simple way to convert these model fields to their current names. In addition, some fields that were commonly absent from older models are now routinely checked for. These fields are initialized by this function with default values that do not alter any previous behaviour.
The model fields changed are as follows:
- ‘confidenceScores’ -> ‘rxnConfidenceScores’
- ‘metCharge’ -> ‘metCharges’
- ‘ecNumbers’ -> ‘rxnECNumbers’
- ‘KEGGID’ -> ‘metKEGGID’
- ‘metKeggID’ -> ‘metKEGGID’
- ‘rxnKeggID’ -> ‘rxnKEGGID’
- ‘metInchiString’ -> ‘metInChIString’
- ‘metSmile’ -> ‘metSmiles’
- ‘metHMDB’ -> ‘metHMDBID’
If both an old and a new field are present, data from the old field is merged into the new field, with the data of the new field taking precedence (i.e. if no data is present in the new field at a given position, the old field data replaces it; otherwise the new field data is kept).
Furthermore, fields deemed to be required for flux balance analysis are generated if not present:
- osenseStr: Objective sense. By default this field is initialized as ‘max’. If osense is present, a -1 will be translated as ‘max’ and a 1 will be translated as ‘min’.
- csense: Constraint sense. This field indicates the sense of the b vector, i.e. whether b stands for lower than (‘L’), greater than (‘G’) or equality (‘E’) constraints. It is initialized as a char vector of ‘E’ with the same size as model.mets.
- genes: A field for genes present in the model.
- rules: The rules field is a logical representation of the GPR rules and is used in multiple functions. If the grRules field is present, this field will be initialized according to grRules; otherwise it will be initialized as a cell array of the same size as model.rxns with an empty string in each cell.
- rev: This field was deprecated and is therefore removed; for reversibility determination the toolbox relies on the lower bounds of the reactions.
The following fields might be altered to adhere to the definitions in the COBRAModelFields documentation:
- rxnConfidenceScores: This field is defined as a number between 0 and 4 indicating the confidence of a reaction. It is therefore assumed to be a double vector in COBRA functions. Some old models provide it as strings or as a numeric cell array. Those fields are converted to double vectors, with the data retained.
- Fields with cell arrays: Some older models define cell array fields in which individual cells are numeric (i.e. empty []). These empty cells are replaced by ‘’ for those fields that are defined in the COBRAModelFields file as cell arrays of chars.
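The merge precedence and the osense translation described in this note can be sketched in a few lines of Python (illustrative only; the toolbox itself is MATLAB). The helper names are hypothetical, and treating None and the empty string as "no data" is an assumption made for this sketch.

```python
def merge_old_into_new(old, new):
    """Merge position-wise; the new field wins wherever it holds data."""
    empty = (None, "")   # assumption: what counts as "no data"
    return [o if n in empty else n for o, n in zip(old, new)]

def osense_to_str(osense=None):
    """Translate a numeric osense into osenseStr; the default is 'max'."""
    if osense is None:
        return "max"
    return {-1: "max", 1: "min"}[osense]

model = {
    "metKeggID": ["C00031", "C00002", ""],   # old field name
    "metKEGGID": ["", "C00002_new", ""],     # new field name
    "osense": 1,
}
# The old field is removed and its data folded into the new field.
model["metKEGGID"] = merge_old_into_new(model.pop("metKeggID"), model["metKEGGID"])
model["osenseStr"] = osense_to_str(model.get("osense"))
```

Only the first position is filled from the old field here, because the new field already holds data at the second position and neither field has data at the third.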
-
createEmptyFields
(model, fieldNames, fieldDefinitions) Creates the specified model fields with their default values. Works only for fields defined in the toolbox if fieldDefinitions are not supplied.
Usage
model = createEmptyFields(model, fieldNames, fieldDefinitions)
Inputs
- model – The model to add a field to
- fieldNames – The names of the fields to add.
Optional input
- fieldDefinitions – The Specifications of the field. Only necessary if a field which is not defined yet should be added.
Output
- model – The original model struct with the specified field added.
- Author:
- Thomas Pfau Nov 2017
-
creategrRulesField
(model, positions) Generates the optional grRules model field from the required rules and genes fields.
Usage
modelWithField = creategrRulesField(model, positions)
Input
- model – The COBRA model structure to generate the grRules field for. An existing grRules field will be overwritten.
Optional input
- positions – The positions to update. Can be supplied either as logical array, double indices, or reaction names (Default: model.rxns)
Output
- model – The Output model with a grRules field
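As a rough illustration of how a readable grRules entry relates to the rules and genes fields, the following Python sketch substitutes gene names for positional references (illustrative only; the toolbox itself is MATLAB). It assumes that rules use x(i) for the i-th gene with & and | as operators, and that grRules use "and" and "or"; neither convention is stated on this page. The helper name create_gr_rules is hypothetical.

```python
import re

def create_gr_rules(rules, genes):
    """Turn position-based rules into readable gene-name rules."""
    def to_name(match):
        return genes[int(match.group(1)) - 1]   # x(i) is 1-based

    gr_rules = []
    for rule in rules:
        text = re.sub(r"x\((\d+)\)", to_name, rule)
        text = text.replace("&", "and").replace("|", "or")
        gr_rules.append(text)
    return gr_rules

genes = ["3098.3", "80201.1", "2645.3"]
rules = ["x(1) | x(2) | x(3)", "", "(x(1) & x(2))"]
gr_rules = create_gr_rules(rules, genes)
```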
-
generateFieldDescriptionFile
(FileName) Generates the ModelFields.md file describing the required and optional fields of a COBRA model.
Usage
FileString = generateFieldDescriptionFile(FileName)
Optional input
- FileName – The FileName to write to. (default [CBTDIR filesep ‘docs’ filesep ‘notes’ filesep ‘COBRAModelFields.md’])
Output
- FileString – The string written to the specified filename (or the default file)
-
getDatabaseMappings
(field, qualifiers) Returns information on known mappings of database entries to model field names, along with additional information about the fields.
Input
- field – the basic model field to extract mappings for (e.g. ‘met’, ‘gene’, ‘rxn’)
Optional input
- qualifiers – the qualifiers to restrict the selection to. These have to be part of the bioql modifiers definition (e.g. is, isDescribedBy, isEncodedBy etc.). Providing ‘all’ will return all associated database mappings. Default: ‘all’
Output
- returnedmappings – The mappings known for the given field. The structure is:
- X{:,1}: the database ID (in identifiers.org/miriam annotation)
- X{:,2}: the qualifier associated with the database
- X{:,3}: the model field associated with this database
- X{:,4}: the association field (met/rxn/gene/prot/comp)
- X{:,5}: the regular expression the identifier has to adhere to
- X{:,6}: the type of the qualifier (modelQualifier or bioQualifier)
-
getDefaultCompartmentSymbols
() Returns the default compartment symbol and name lists to use for model IO or compartment matching.
Usage
[defaultCompartmentSymbolList, defaultCompartmentNameList] = getDefaultCompartmentSymbols()
Outputs
- defaultCompartmentSymbolList – a list of abbreviations of compartment names
- defaultCompartmentNameList – a list of names of compartments where element i corresponds to the i-th abbreviation in defaultCompartmentSymbolList
-
getDefaultCompartments
() Returns the default compartment symbols and the default compartment names.
Usage
[compSymbolList, compNameList] = getDefaultCompartments()
Outputs
- compSymbolList – Default symbols of compartments
- compNameList – Names of the default compartments.
-
getDefinedFieldProperties
(varargin) Returns the fields defined in the COBRA Toolbox along with checks for their properties.
Usage
[fields] = getDefinedFieldProperties(varargin)
Optional input
- varargin – The following parameter/value pairs can be used:
- Descriptions: Whether to obtain the field descriptions (default: false).
- SpecificFields: Whether to only obtain definitions for a specific set of fields (default: all).
- DataBaseFields: Get the fields with specified database relations (true, if requested).
Output
- fields – All fields and their properties as requested. If fields without definitions are requested, they will not be contained in the result.
Note
The optional inputs are to be provided as parameter/value pairs. The returned cell arrays are structured as follows:
Default:
- X{:,1} are the field names
- X{:,2} are the associated fields for the first dimension (i.e. size(model.(X{A,1}),1) == size(model.(X{A,2}),1) has to evaluate to true)
- X{:,3} are the associated fields for the second dimension (i.e. size(model.(X{A,1}),2) == size(model.(X{A,3}),1) has to evaluate to true)
- X{:,4} are evaluable statements which have to evaluate to true for the model to be valid; these mainly check the content types.
- X{:,5} are default values (or evaluable strings for cell arrays)
E.g.
x = model.(X{A, 1});
eval(X{A, 4}) has to return 1
DataBaseFields:
- X{:, 1} - database id
- X{:, 2} - qualifier
- X{:, 3} - model Field
- X{:, 4} - model field reference (without s)
- X{:, 5} - Patterns for ids from the respective database.
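The way such a definition table can drive a model check is sketched below in Python (illustrative only; the toolbox itself is MATLAB and stores the content checks as evaluable strings). The three definition rows, the use of callables instead of strings, and all helper names are assumptions made for this sketch.

```python
def n_rows(value):
    return len(value)

def n_cols(value):
    # A flat list is treated as a column vector (one column).
    return len(value[0]) if value and isinstance(value[0], list) else 1

def is_number(v):
    return isinstance(v, (int, float))

# (field name, first-dimension field, second-dimension field, content check)
definitions = [
    ("S", "mets", "rxns", lambda x: all(is_number(v) for row in x for v in row)),
    ("lb", "rxns", None, lambda x: all(is_number(v) for v in x)),
    ("rxns", "rxns", None, lambda x: all(isinstance(v, str) for v in x)),
]

def invalid_fields(model, definitions):
    """Return the names of defined fields that fail a size or content check."""
    bad = []
    for name, dim1, dim2, check in definitions:
        if name not in model:
            continue
        x = model[name]
        ok = n_rows(x) == n_rows(model[dim1])
        if dim2 is not None:
            ok = ok and n_cols(x) == n_rows(model[dim2])
        if not (ok and check(x)):
            bad.append(name)
    return bad

model = {
    "S": [[-1, 0], [1, -1], [0, 1]],
    "mets": ["A", "B", "C"],
    "rxns": ["R1", "R2"],
    "lb": [0, -1000, 0],   # wrong length: three bounds for two reactions
}
problems = invalid_fields(model, definitions)
```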
-
getDistributedModel
(modelName, description) Loads the indicated model from the models submodule.
Usage
model = getDistributedModel(modelName)
Input
- modelName – The name of the model including the file extension
Optional input
- description – The value to set the model description to, if it should be set to a specific value
Output
- model – The loaded model from the models submodule (i.e. those distributed for the test suite)
-
getDistributedModelFolder
(modelName) Identifies the folder a distributed model is located in. This function only works with models which are part of the models submodule.
Usage
modelDir = getDistributedModelFolder(modelName)
Input
- modelName – The name of the model including the file extension
Output
- modelDir – The folder the model should be located in.
-
getMultiDimensionFields
(fieldDefinitions) Gets those fields from the definitions which have multiple dimensions depending on another field.
Usage
[fieldNames, firstDim, secondDim] = getMultiDimensionFields(fieldDefinitions)
Input
- fieldDefinitions – Field definitions as obtained from getDefinedFieldProperties().
Outputs
- fieldNames – The names of the multi-dimensional fields
- firstDim – the referenced field in the first dimension
- secondDim – the referenced field in the second dimension
-
initFBAFields
(model, printLevel) Initializes all fields in a model that are required for downstream FBA analysis. It does so if and only if a stoichiometric matrix S is provided.
Usage
model = initFBAFields(model, printLevel)
Input
- model – a COBRA model structure with at least the model.S field. All fields already present are retained and absent fields are initialized with their defaults.
Optional input
- printLevel – indicates whether warnings and messages are given (default: 1).
Output
- model – a COBRA model struct with the following fields:
- .S (same as input)
- .rxns (default: a vector of strings R1 .. R size(S,2))
- .mets (default: a vector of strings M1 .. M size(S,1))
- .lb (default: -1000 * ones(size(S,2),1))
- .ub (default: 1000 * ones(size(S,2),1))
- .genes (default: cell(0,1))
- .rules (default: repmat({‘’},size(S,2),1))
- .osense (default: -1)
- .csense (default: a char vector of ‘E’ of size size(S,1) x 1)
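The defaulting behaviour listed above can be mimicked with a short Python sketch (illustrative only; the toolbox itself is MATLAB). A model is represented as a dictionary and S as a list of rows; both representations, and the helper name init_fba_fields, are assumptions made for this sketch.

```python
def init_fba_fields(model):
    """Fill in missing FBA fields from S; fields already present are kept."""
    if "S" not in model:
        return model   # nothing is initialized without S
    n_mets, n_rxns = len(model["S"]), len(model["S"][0])
    defaults = {
        "rxns": [f"R{i}" for i in range(1, n_rxns + 1)],
        "mets": [f"M{i}" for i in range(1, n_mets + 1)],
        "lb": [-1000.0] * n_rxns,
        "ub": [1000.0] * n_rxns,
        "genes": [],
        "rules": [""] * n_rxns,
        "osense": -1,
        "csense": "E" * n_mets,
    }
    # Entries of the input model override the defaults.
    return {**defaults, **model}

model = init_fba_fields({"S": [[-1, 0], [1, -1], [0, 1]], "ub": [10.0, 10.0]})
```

In this example the supplied ub is retained, while every other field is derived from the 3 x 2 matrix S.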
-
loadIdentifiedModel
(filename, directory) Loads a single COBRA Toolbox model saved as a filename.mat file, then renames the model structure to ‘model’ while retaining the original name of the model structure in model.modelID.
Usage
model = loadIdentifiedModel(filename, directory)
Inputs
- filename – name of the .mat file containing cobra toolbox model structure
- directory – directory where the .mat file resides.
Output
- model – COBRA model structure
-
mergeTextDataAndData
(textdata, data, headings) Merges textdata and data imported from an .xls file, assuming that the first row of textdata contains the column headings.
Usage
mergedData = mergeTextDataAndData(textdata, data, headings)
Inputs
- textdata – cell array from .xls import
- data – matrix with numeric data from .xls import
Optional input
- headings – {(1), 0}, zero if no column headings
Output
- mergedData – merged cell array with all data from .xls import
-
model2xls
(model, fileName, compSymbols, compNames) Writes a model to an Excel spreadsheet.
Usage
model2xls(model, fileName, compSymbols, compNames)
Inputs
- model – A COBRA model struct
- fileName – filename with an xls extension.
Optional inputs
- compSymbols – Symbols of compartments used in metabolite ids
- compNames – Names of the compartments identified by the symbols
Example
'Reaction List' tab headers (case sensitive):
* Required:
  * 'Abbreviation': HEX1
  * 'Reaction': `1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]`
  * 'GPR': (3098.3) or (80201.1) or (2645.3) or ...
* Optional:
  * 'Description': Hexokinase
  * 'Subsystem': Glycolysis
  * 'Reversible': 0 (false) or 1 (true)
  * 'Lower bound': 0
  * 'Upper bound': 1000
  * 'Objective': 0/1
  * 'Confidence Score': 0,1,2,3,4
  * 'EC Number': 2.7.1.1;2.7.1.2
  * 'KEGG ID': R000001
  * 'Notes': Reaction also associated with EC 2.7.1.2
  * 'References': PMID:2043117;PMID:7150652,...
'Metabolite List' tab headers (case sensitive):
(needs to be a complete list of metabolites, i.e., if a metabolite appears in multiple compartments it has to be represented in multiple rows. Abbreviations need to overlap with use in Reaction List)
* Required:
  * 'Abbreviation': glc-D or glc-D[c]
* Optional:
  * 'Charged formula' or formula: C6H12O6
  * 'Charge': 0
  * 'Compartment': cytosol
  * 'Description': D-glucose
  * 'KEGG ID': C00031
  * 'PubChem ID': 5793
  * 'ChEBI ID': 4167
  * 'InChI string': InChI=1/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2/t2-,3-,4+,5-,6?/m1/s1
  * 'SMILES': OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O
  * 'HMDB ID': HMDB00122
Note
Optional inputs may be required for input on unix machines.
-
outputHypergraph
(model, weights, fileName) Outputs a metabolic reaction network hypergraph, with reaction weights, to a file.
Output format: Rxn metabolite_1 metabolite_2 ... metabolite_n rxnWeight
Usage
outputHypergraph(model, weights, fileName)
Inputs
- model – Standard model structure
- weights – Weights for each reaction
- fileName – Output filename
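The output format can be made concrete with a small Python sketch that writes one line per reaction (illustrative only; the toolbox itself is MATLAB). The single-space separator, the list-of-rows representation of S and the helper name write_hypergraph are assumptions; the page only gives the order of the items on each line.

```python
import io

def write_hypergraph(S, rxns, mets, weights, out):
    """Write one line per reaction: name, participating metabolites, weight."""
    for j, rxn in enumerate(rxns):
        # A metabolite participates if its stoichiometric coefficient is nonzero.
        members = [mets[i] for i in range(len(mets)) if S[i][j] != 0]
        out.write(" ".join([rxn, *members, str(weights[j])]) + "\n")

S = [[-1, 0], [1, -1], [0, 1]]
buffer = io.StringIO()
write_hypergraph(S, ["R1", "R2"], ["A", "B", "C"], [0.5, 2], buffer)
text = buffer.getvalue()
```

Any writable text stream can stand in for the in-memory buffer, for example a file opened with open(fileName, "w").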
-
outputNetworkCytoscape
(model, fileBase, rxnList, rxnData, metList, metData, metDegreeThr) Outputs a metabolic network in Cytoscape format.
Usage
notShownMets = outputNetworkCytoscape(model, fileBase, rxnList, rxnData, metList, metData, metDegreeThr)
Inputs
- model – COBRA metabolic network model
- fileBase – Base file name (without extensions) for Cytoscape input files that are generated by the function
Optional inputs
- rxnList – List of reactions that will be included in the output (Default = all reactions)
- rxnData – Vector or matrix of data or cell array of strings to output for each reaction in rxnList (Default = empty)
- metList – List of metabolites that will be included in the output (Default = all metabolites)
- metData – Vector or matrix of data or cell array of strings to output for each metabolite in metList (Default = empty)
- metDegreeThr – Maximum degree of metabolites that will be included in the output. Allows filtering out highly connected metabolites such as ‘h2o’ or ‘atp’ (Default = no filtering)
Output
- notShownMets – Metabolites that are not included in the output
Note
Outputs three to five files:
- [baseName].sif - Basic network structure file containing reaction-metabolite and gene-reaction (if provided in model) associations
- [baseName]_nodeType.noa - Describes the node types (gene, rxn, met) in the network
- [baseName]_nodeComp.noa - Describes the compartments for metabolites
- [baseName]_subSys.noa - Describes the subsystems for reactions (if provided)
- [baseName]_rxnMetData.noa - Reaction and metabolite data (if provided)
-
outputNetworkOmix
(model, rxnBool) Outputs a text file for import into Omix (http://www.13cflux.net/omix/).
Usage
outputNetworkOmix(model, rxnBool)
Input
- model – COBRA model structure
Optional input
- rxnBool – boolean vector with 1 for each reaction to be exported
-
parseSBMLAnnotationField
(annotationField) Parses the annotation field of an SBML file to extract metabolite information associations.
Usage
[metCHEBI, metHMDB, metKEGG, metPubChem, metInChI] = parseSBMLAnnotationField(annotationField)
Input
- annotationField – annotation field of an SBML file
Outputs
- metCHEBI – Formula for each metabolite in the ChEBI format
- metHMDB – Formula for each metabolite in the HMDB format
- metKEGG – Formula for each metabolite in the KEGG format
- metPubChem – PubChem ID of each metabolite
- metInChI – Formula for each metabolite in the InChI string format
-
parseSBMLAnnotationFieldRxn
(annotationField) Parses the annotation field of an SBML file to extract reaction information associations.
Usage
[rxnEC, rxnReference] = parseSBMLAnnotationFieldRxn(annotationField)
Input
- annotationField – annotation field of an SBML file
Output
- rxnEC,rxnReference – only one of them is not empty depending on annotationField
-
parseSBMLNotesField
(notesField) Parses the notes field of an SBML file to extract gene-rxn associations.
Usage
[subSystem, grRule, formula, confidenceScore, citation, comment, ecNumber, charge] = parseSBMLNotesField(notesField)
Input
- notesField – notes field of SBML file
Outputs
- subSystem – subSystem assignment for each reaction
- grRule – a string representation of the GPR rules defined in a readable format
- formula – elemental formula
- confidenceScore – confidence scores for reaction presence
- citation – joins strings with authors
- comment – comments and notes
- ecNumber – E.C. number for each reaction
- charge – charge of the respective metabolite
-
planariseModel
(model, replicateMetBool) Converts a model into a form that is suitable for display as a planar hypergraph.
Usage
[modelPlane, replicateMetBool, metData, rxnData] = planariseModel(model, replicateMetBool)
Inputs
- model – model structure
- replicateMetBool – met x 1 boolean vector of metabolites to be replicated for each reaction
Outputs
- modelPlane – structure with fields:
- .S - matrix
- .mets - metabolites
- .origMets - original metabolites
- replicateMetBool – as in input
- metData – data of metabolites
- rxnData – data of reactions
-
readBooleanRegModel
(metModel, fileName) Reads a Boolean regulatory network model.
Usage
regModel = readBooleanRegModel(metModel, fileName)
Input
- metModel – model
Optional input
- fileName – file name
Output
- regModel – model containing the following fields:
- .mets - Metabolite rules:
- name - Metabolite/pool name (internal to the reg network model)
- rule - Metabolite ‘activation’ rule
- type - Metabolite type (extra/intracellular/pool)
- excInd - Exchange flux indices corresponding to extracellular metabolites
- icmRules - Intracellular metabolite ‘activation’ rules (based on a flux vector - fluxVector)
- pool - Pool components
- .regs - Regulator rules
- name - Regulator name
- rule - Regulator rule
- comp - Regulator rule components (i.e. metabolites or other regulators that affect the state of this regulator)
- ruleParsed - Rule in parsed format (based on metabolite state - metState, regulator state - regState)
- .tars - Target rules
- name - Target name
- rule - Target rule
- comp - Target rule components (i.e. metabolites or other regulators that affect the state of this regulator)
- ruleParsed - Rule in parsed format (based on metabolite state - metState, regulator state - regState)
-
readSBML
(fileName, defaultBound) Reads in an SBML format model as a COBRA MATLAB structure.
Usage
model = readSBML(fileName, defaultBound)
Input
- fileName – File name for file to read in
Optional input
- defaultBound – Maximum bound for model (Default = 1000)
Output
- model – COBRA model structure
-
readSimPhenyCMPD
(fileName) Reads a SimPheny compound file obtained from the admin console.
Usage
[metInfo, mets] = readSimPhenyCMPD(fileName)
Input
- fileName – SimPheny compound file name
Outputs
- metInfo – Structure containing data on metabolites
- mets – List of metabolites
-
readSimPhenyGPR
(fileName) Reads a SimPheny gene-protein-reaction association file obtained from the admin console.
Usage
[rxnInfo, rxns, allGenes] = readSimPhenyGPR(fileName)
Input
- fileName – SimPheny GPR file
Outputs
- rxnInfo – Structure containing data for each reaction
- rxns – List of reactions
- allGenes – List of all genes
-
readSimPhenyGprText
(file, model) Parses SimPheny GPRAs in text format into a rxn x gene association matrix.
Usage
gpraModel = readSimPhenyGprText(file, model)
Inputs
- file – GPR text file
- model – COBRA model structure
Output
- gpraModel – COBRA model structure with reaction-gene association matrix
-
restrictModelsToFields
(models, fieldNames) Removes all fields not given in fieldNames from the models.
Usage
restrictedModels = restrictModelsToFields(models, fieldNames)
Inputs
- models – A cell array of model structs (or a single model struct) that have all fieldNames provided.
- fieldNames – Names of the fields the models will be restricted to.
Output
- restrictedModels – The models with the non-named fields removed, or a single struct if only one model was provided.
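The single-model versus multi-model behaviour can be sketched in Python with dictionaries standing in for model structs (illustrative only; the toolbox itself is MATLAB). The helper name restrict_models_to_fields and the choice to raise an error on a missing field are assumptions made for this sketch.

```python
def restrict_models_to_fields(models, field_names):
    """Keep only the named fields; accept one model or a list of models."""
    single = isinstance(models, dict)
    restricted = [
        {name: m[name] for name in field_names}   # KeyError if a field is missing
        for m in ([models] if single else models)
    ]
    return restricted[0] if single else restricted

m1 = {"S": [[1]], "rxns": ["R1"], "notes": "draft"}
m2 = {"S": [[2]], "rxns": ["R2"], "subSystems": ["x"]}
both = restrict_models_to_fields([m1, m2], ["S", "rxns"])
one = restrict_models_to_fields(m1, ["rxns"])
```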
-
write4ti2
(SeFull, filename, uni) Writes an input file for 4ti2. ‘4ti2’ is a software package for algebraic, geometric and combinatorial problems on linear spaces (www.4ti2.de).
Usage
write4ti2(SeFull, filename, uni)
Inputs
- SeFull – full stoichiometric matrix
- filename – name of the file
Optional input
- uni – {(0),1}, uni = 1 only outputs every second reaction
-
writeCytoscapeEdgeAttributeTable
(model, C, B, N, replicateMetBool, filename) Writes out a set of Boolean edge attributes as one of a pair of colours: ‘Red’ for ‘yes’, ‘Black’ for ‘no’.
Usage
writeCytoscapeEdgeAttributeTable(model, C, B, N, replicateMetBool, filename)
Inputs
- model – structure with obligatory field .S - met x reaction
- C – reaction x attribute cell array
- B – reaction x attribute Boolean matrix
- N – reaction x attribute numeric array
- replicateMetBool – boolean for replicated mets
- filename – name of the file
-
writePajekNet
(model) Builds a metabolite-centric directed graph from a COBRA model and outputs the graph in .net format, ready to use with most graph analysis software (e.g. Pajek). It performs one FBA to set the link widths equal to the reaction fluxes.
Usage
writePajekNet(model)
Input
- model – a COBRA structured model
Output
- .net – file containing the graph
Ex: A + B -> C (hypergraph) with v = 0 => no output (empty line)
if v > 0 then it becomes A -> C; B -> C (graph),
if v < 0 then the order is reversed
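The conversion rule in this example can be written as a short Python sketch (illustrative only; the toolbox itself is MATLAB). The helper name reaction_to_edges is hypothetical, and attaching the absolute flux to each edge as its width is an assumption about how the link width is derived.

```python
def reaction_to_edges(substrates, products, flux):
    """Expand one hyperedge into metabolite-to-metabolite edges."""
    if flux == 0:
        return []   # inactive reaction: no output
    if flux < 0:
        substrates, products = products, substrates   # reversed direction
    return [(s, p, abs(flux)) for s in substrates for p in products]

forward = reaction_to_edges(["A", "B"], ["C"], 2.0)
backward = reaction_to_edges(["A", "B"], ["C"], -1.5)
inactive = reaction_to_edges(["A", "B"], ["C"], 0)
```

For A + B -> C with positive flux this yields the edges A -> C and B -> C; a negative flux yields C -> A and C -> B.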
-
writeSBML
(model, fileName, compSymbolList, compNameList) Exports a COBRA structure into an SBML FBCv2 file. The SBML FBCv2 file is written to the current MATLAB path.
Usage
sbmlModel = writeSBML(model, fileName, compSymbolList, compNameList)
Inputs
- model – COBRA model structure
- fileName – File name for output file
Optional inputs
- compSymbolList – List of compartment symbols
- compNameList – List of compartment names corresponding to compSymbolList
Output
- sbmlModel – SBML MATLAB structure
-
xls2model
(fileName, biomassRxnEquation, defaultbound) Reads a model from an Excel spreadsheet.
Usage
model = xls2model(fileName, biomassRxnEquation, defaultbound)
Input
- fileName – xls spreadsheet, with one ‘Reaction List’ and one ‘Metabolite List’ tab
Optional inputs
- biomassRxnEquation – .xls may have a 255 character limit on each cell, so pass the biomass reaction separately if it hits this maximum.
- defaultbound – the default bound for lower and upper bounds, if no bounds are specified in the Excel sheet
Output
- model – COBRA Toolbox model
Example
'Reaction List' tab headers (case sensitive):
* Required:
  * 'Abbreviation': HEX1
  * 'Reaction': `1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]`
* Optional:
  * 'GPR': (3098.3) or (80201.1) or (2645.3) or ...
  * 'Description': Hexokinase
  * 'Subsystem': Glycolysis
  * 'Reversible': 0 (false) or 1 (true)
  * 'Lower bound': 0
  * 'Upper bound': 1000
  * 'Objective': 0/1
  * 'Confidence Score': 0,1,2,3,4
  * 'EC Number': 2.7.1.1,2.7.1.2
  * 'KEGG ID': R000001
  * 'Notes': Reaction also associated with EC 2.7.1.2
  * 'References': PMID:2043117,PMID:7150652,...
'Metabolite List' tab headers (case sensitive):
(needs to be a complete list of metabolites, i.e., if a metabolite appears in multiple compartments it has to be represented in multiple rows. Abbreviations need to overlap with use in Reaction List)
* Required:
  * 'Abbreviation': glc-D or glc-D[c]
* Optional:
  * 'Charged formula' or formula: C6H12O6
  * 'Charge': 0
  * 'Compartment': cytosol
  * 'Description': D-glucose
  * 'KEGG ID': C00031
  * 'PubChem ID': 5793
  * 'ChEBI ID': 4167
  * 'InChI string': InChI=1/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2/t2-,3-,4+,5-,6?/m1/s1
  * 'SMILES': OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O
  * 'HMDB ID': HMDB00122
Note
Optional inputs may be required for input on unix machines.
Note
Find an example Excel sheet at docs/source/examples/ExcelExample.xlsx