Preprocessing

GPRparser(model, getCNFSets)

Maps the GPR rules of the model to a specified format that is used by the model extraction methods.

USAGE:

parsedGPR = GPRparser(model)

INPUT:

model: cobra model structure

OPTIONAL INPUT:
getCNFSets: whether to get the CNF sets (true) or DNF sets (false). DNF sets represent functional enzyme complexes, while CNF sets represent the possible subunits of a complex (default: false, i.e. DNF sets).

OUTPUT:

parsedGPR: cell matrix containing parsed GPR rule

AUTHORS: Thomas Pfau & Anne Richelle, May 2017
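The difference between DNF and CNF sets can be illustrated with a small, self-contained sketch in plain MATLAB. It does not call GPRparser, and the nested-cell layout and the variable names (dnfSets, cnfSets) are assumptions made for illustration; the actual layout of parsedGPR may differ.

```matlab
% Illustrative only: GPR rule "(g1 and g2) or g3" written as DNF sets.
% Each inner cell is one functional enzyme complex (genes joined by AND);
% the outer cell lists the alternative complexes (joined by OR).
dnfSets = {{'g1', 'g2'}, {'g3'}};   % hypothetical layout

% Derive the CNF sets by picking one gene from every complex
% (Cartesian product over the DNF sets). Each CNF set then lists
% genes that are joined by OR; the sets themselves are joined by AND.
cnfSets = {{}};
for i = 1:numel(dnfSets)
    expanded = {};
    for j = 1:numel(cnfSets)
        for k = 1:numel(dnfSets{i})
            s = unique([cnfSets{j}, dnfSets{i}(k)]);
            expanded{end+1} = s(:)'; %#ok<AGROW> keep each set as a row
        end
    end
    cnfSets = expanded;
end

% Logically, (g1 and g2) or g3 equals (g1 or g3) and (g2 or g3)
clauses = cellfun(@(s) ['(' strjoin(s, ' or ') ')'], cnfSets, ...
                  'UniformOutput', false);
disp(strjoin(clauses, ' and '));
```

This brute-force expansion grows quickly with the number of complexes, so it is only meant to show what the two representations express.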

findUsedGenesLevels(model, exprData, printLevel)

Returns vectors of gene identifiers and corresponding gene expression levels for each gene present in the model (‘model.genes’).

USAGE:

[gene_id, gene_expr] = findUsedGenesLevels(model, exprData)

[gene_id, gene_expr, gene_sig] = findUsedGenesLevels(model, exprData)

INPUTS:

model: input model (COBRA model structure)

exprData: mRNA expression data structure

.gene: cell array containing GeneIDs in the same format as model.genes

.value: vector containing corresponding expression values (FPKM)

.sig: [optional field] vector containing significance values of expression corresponding to expression values in exprData.value (e.g. p-values)

OPTIONAL INPUTS:

printLevel: print level for output (default: 0)

OUTPUTS:

gene_id: vector of gene identifiers present in the model that are associated with expression data

gene_expr: vector of expression values associated with each ‘gene_id’

OPTIONAL OUTPUTS:
gene_sig: vector of significance values associated with each ‘gene_id’
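The lookup described above, matching model genes against the expression data, can be sketched in plain MATLAB with ismember. This is a minimal illustration under stated assumptions, not the function's actual implementation: the toy data and the variable modelGenes (standing in for model.genes) are hypothetical, and the sketch does not cover how the real function treats duplicated identifiers or genes without data.

```matlab
% Hypothetical toy inputs (not taken from a real model)
modelGenes     = {'g1'; 'g2'; 'g3'};   % stands in for model.genes
exprData.gene  = {'g3'; 'g1'; 'g9'};   % g9 does not occur in the model
exprData.value = [5.0; 12.5; 40.0];    % expression values, e.g. FPKM
exprData.sig   = [0.20; 0.01; 0.05];   % optional field, e.g. p-values

% For each model gene, find its row in the expression data (0 if absent)
[hasData, loc] = ismember(modelGenes, exprData.gene);

% Keep only the model genes that have expression data
gene_id   = modelGenes(hasData);
gene_expr = exprData.value(loc(hasData));

% Significance values are only available when the optional field exists
if isfield(exprData, 'sig')
    gene_sig = exprData.sig(loc(hasData));
end
```

In this toy case g2 is dropped because it has no expression data, and g9 is ignored because it is not a model gene.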

mapExpressionToReactions(model, expressionData, minSum)

Determines the expression data associated with each reaction present in the model.

USAGE:

[expressionRxns, parsedGPR, gene_used] = mapExpressionToReactions(model, expressionData)

[expressionRxns, parsedGPR, gene_used, signifRxns] = mapExpressionToReactions(model, expressionData, minSum)

INPUTS:

model: model structure

expressionData: mRNA expression data structure

.gene: cell array containing GeneIDs in the same format as model.genes

.value: vector containing corresponding expression values (FPKM/RPKM)

.sig: [optional field] vector containing significance values of expression corresponding to expression values in expressionData.value (e.g. p-values)

OPTIONAL INPUT:
minSum: instead of using min and max, use min for AND and sum for OR (default: false, i.e. use min for AND and max for OR)

OUTPUTS:
expressionRxns: n x 1 vector of non-negative reaction expression values, corresponding to model.rxns.

expressionRxns(j) is NaN when there is no expression data for the genes corresponding to reaction j.

parsedGPR: cell matrix containing parsed GPR rule

gene_used: gene identifier, corresponding to model.rxns, from GPRs whose value (expression and/or significance) was chosen for that reaction

OPTIONAL OUTPUTS:

signifRxns: significance of reaction expression, corresponding to model.rxns.
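The effect of the minSum option on a single reaction can be sketched as follows. The example is self-contained and does not call mapExpressionToReactions; the expression values, the gene names and the nested-cell DNF layout are assumptions made for illustration only.

```matlab
% Hypothetical expression levels keyed by gene identifier
expr = containers.Map({'g1', 'g2', 'g3'}, [4, 10, 7]);

% GPR "(g1 and g2) or g3" as DNF sets (assumed layout: one inner cell
% per enzyme complex)
dnfSets = {{'g1', 'g2'}, {'g3'}};

% AND -> min: a complex is limited by its least expressed gene
complexLevel = cellfun(@(c) min(cellfun(@(g) expr(g), c)), dnfSets);

% OR -> max by default (minSum = false): take the best complex
exprMax = max(complexLevel);

% OR -> sum when minSum = true: alternative complexes add up
exprSum = sum(complexLevel);

fprintf('min/max: %g, min/sum: %g\n', exprMax, exprSum);
```

Here the complex levels are 4 and 7, so the default rule gives 7 while the min/sum rule gives 11.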

selectGeneFromGPR(model, gene_names, gene_exp, parsedGPR, minSum, gene_sig)

Map gene expression to reaction expression using the GPR rules. An AND will be replaced by MIN and an OR will be replaced by MAX.

USAGE:

[expressionCol, gene_used] = selectGeneFromGPR(model, gene_names, gene_exp, parsedGPR, minSum)

[expressionCol, gene_used, signifCol] = selectGeneFromGPR(model, gene_names, gene_exp, parsedGPR, minSum, gene_sig)

INPUTS:

model: COBRA model struct

gene_names: gene identifiers corresponding to gene_exp. Names must be in the same format as model.genes (column vector) (as returned by “findUsedGenesLevels.m”)

gene_exp: gene FPKM/expression values, corresponding to gene_names (column vector) (as returned by “findUsedGenesLevels.m”)

parsedGPR: GPR matrix as returned by “GPRparser.m”

OPTIONAL INPUTS:
minSum: instead of using min and max, use min for AND and sum for OR

gene_sig: vector of significance values associated with each ‘gene_id’ (as returned by “findUsedGenesLevels.m”)

OUTPUTS:
expressionCol: reaction expression, corresponding to model.rxns.

Reactions without gene-expression data and orphan reactions are given a value of NaN.

gene_used: gene identifier, corresponding to model.rxns, from GPRs whose value (expression and/or significance) was chosen for that reaction

OPTIONAL OUTPUTS:
signifCol: reaction significance, corresponding to model.rxns.

Reactions without gene-expression data and orphan reactions are given a value of 0.
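How a single gene ends up as gene_used for a reaction can be sketched for one GPR rule with the default min/max rule. This is a minimal illustration under stated assumptions rather than the function's actual code: the toy values and the nested-cell DNF layout are hypothetical, and skipping complexes with a missing gene is a simplification chosen for this sketch.

```matlab
% Hypothetical inputs, shaped like the documented column vectors
gene_names = {'g1'; 'g2'; 'g3'};
gene_exp   = [4; 10; 7];
gene_sig   = [0.03; 0.50; 0.01];

% GPR "(g1 and g2) or g3" as DNF sets (assumed layout)
dnfSets = {{'g1', 'g2'}, {'g3'}};

% Defaults mirror the documented outputs for reactions without data:
% NaN for expression, 0 for significance
expressionVal = NaN;
gene_used     = '';
signifVal     = 0;

for i = 1:numel(dnfSets)
    [tf, idx] = ismember(dnfSets{i}, gene_names);
    if ~all(tf)
        continue;   % assumption: ignore complexes with missing data
    end
    % AND -> min: the limiting gene of this complex
    [val, k] = min(gene_exp(idx));
    % OR -> max: keep the complex with the highest limiting value
    if isnan(expressionVal) || val > expressionVal
        expressionVal = val;
        gene_used     = gene_names{idx(k)};
        signifVal     = gene_sig(idx(k));
    end
end

fprintf('%s selected with expression %g\n', gene_used, expressionVal);
```

In this toy case the first complex is limited by g1 (value 4) and the second consists of g3 alone (value 7), so g3 is selected and its significance value is carried along.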