Sampling¶
- calcSampleDifference(sample1, sample2, nPts)¶
Selects randomly nPts flux vectors from sample1 and sample2 and calcutes the difference between the flux vectors
USAGE:
[sampleDiff, sampleRatio] = calcSampleDifference(sample1, sample2, nPts)
- INPUTS:
sample1: First flux sample sample2: Second flux sample
- OPTIONAL INPUTS:
- nPts: Number of flux difference profiles desired (default:
10% of the samples)
- OUTPUTS:
sampleDiff: Difference between the flux vectors sampleRatio: Ratio of the flux vectors
Example
example 1: [sampleDiff, sampleRatio] = calcSampleDifference(sample1, sample2) example 2: [sampleDiff, sampleRatio] = calcSampleDifference(sample1, sample2, 10)
- calcSampleStats(samples)¶
Calculate sample modes, means, standard devs, and medians of the sample
USAGE:
sampleStats = calcSampleStats(samples)
- INPUT:
samples: Samples to analyze
- OUTPUT:
sampleStats: Structure with the following fields:
mean
std
mode
median
skew
kurt
Example
example 1: sampleStats = calcSampleStats(sample) example 2: sampleStats = calcSampleStats({sample1, sample2})
- compareSampleTraj(rxnNames, samples, models, nBins)¶
Compares flux histograms for two or more samples for one or more reactions
USAGE:
compareSampleTraj(rxnNames, samples, models, nBins)
- INPUTS:
rxnNames: List of reaction names to compare samples: Samples to compare models: Cell array containing COBRA model structures
- OPTIONAL INPUTS:
nBins: Number of bins (Default = nSamples / 25)
- compareTwoSamplesStat(sample1, sample2, tests)¶
Compares statistically the difference between two samples. Does the Kolmogorov-Smirnov, rank-sum, chi-square, and T-tests.
USAGE:
[stats, pVals] = compareTwoSamplesStat(sample1, sample2, tests)
- INPUTS:
sample1, sample2: Samples to compare tests: {test1, test2,…} (Default = all tests)
‘ks’ - Kolmogorov-Smirnov test
‘rankSum’ - rank-sum test
‘chiSquare’ - chi-squre test
‘tTest’ - T-test
- OUTPUTS:
- stats: Struct array with statistics of the selected
tests.
- pVals: Struct array with p values of the selected
tests.
Example
example 1: [stats, pVals] = compareTwoSamplesStat(sample1, sample2) example 2: [stats, pVals] = compareTwoSamplesStat(sample1, sample2, {‘ks’, ‘rankSum’})
Output will be in order that tests are inputed. i.e. {‘ks’,’rankSum’}
- identifyCorrelSets(model, sample, corrThr, R)¶
Identifies correlated reaction sets from sampling data
USAGE:
[sets, setNumber, setSize] = identifyCorrelSets(model, sample, corrThr, R)
- INPUTS:
model: COBRA model structure sample: Sample to be used to identify correlated sets
- OPTIONAL INPUTS:
corrThr: Minimum correlation (\(R^2\)) threshold (Default = 1-1e-8) R: Correlation coefficient
- OUTPUTS:
sets: Sorted cell array of sets (largest first) setNumber: List of set numbers for each reaction in model (0 indicates
that there is no set)
setSize: List of set sizes
- loadSamples(filename, numFiles, pointsPerFile, numSkipped, randPts)¶
Loads a set of sampled data points
USAGE:
samples = loadSamples(filename, numFiles, pointsPerFile, numSkipped, randPts)
- INPUTS:
filename: The name of the files containing the sample points. numFiles: The number of files containing the sample points. pointsPerFile: The number of points to be taken from each file.
- OPTIONAL INPUTS:
numSkipped: Number of files skipped (default = 0) randPts: Select random points from each file (true/false, default = false)
- OUTPUT:
samples: Sample flux distributions
- plotHistConv(model, sample, rxnNames, nSubSamples)¶
Plots convergence of sample histograms
USAGE:
plotHistConv(model, sample, rxnNames, nSubSamples)
- INPUTS:
model: COBRA model structure sample: Sampled fluxes rxnNames: List of reactions to plot nSubSamples: Number of sub samples
Example
example 1: rxnNames = {‘R1’, ‘R2’} plotHistConv(model, sample, rxnNames, nSubSamples)
- plotSampleHist(rxnNames, samples, models, nBins, perScreen, modelNames, add2Plot)¶
Compares flux histograms for one or more samples for one or more reactions
USAGE:
plotSampleHist(rxnNames, samples, models, nBins, perScreen, modelNames, add2Plot)
- INPUTS:
rxnNames: Cell array of reaction abbreviations samples: Cell array containing samples models: Cell array containing model structures or common model
structure
- OPTIONAL INPUTS:
nBins: Number of bins to be used perScreen: Number of reactions to show per screen. Either a number or [nY, nX] vector.
(press ‘enter’ to advance screens)
- modelNames: Cell array containing the name of the models (used for the
plot’s legend).
- add2Plot: Struct array with additional data to show more
detaled information (real measuremets, FVA resuts, statistics results, etc).
Example
sampleStructOut1 = gpSampler(model1, 2150); sampleStructOut2 = gpSampler(model2, 2150); %Plot for model 1 plotSampleHist(model1.rxns,{samplePoints1},{model1})
%Plot reactions reactions in model 1 that also exist in model 2 using 10 %bins and plotting 12 reactions per screen. plotSampleHist(model1.rxns,{samplePoints1,samplePoints2},{model1,model2},10,12)
CONTROLS: To advance to next screen hit ‘enter/return’ or type ‘f’ and hit ‘enter/return’ To rewind to previous screen type ‘r’ or ‘b’ and hit ‘enter/return’ To quit script type ‘q’ and hit ‘enter/return’
- printSampleStats(sampledModel, commonModel, sampleNames, fileName)¶
Prints out sample statistics for multiple samples
USAGE:
printSampleStats(samples, commonModel, sampleNames, fileName)
- INPUTS:
sampledModel: Samples to plot commonModel: COBRA model structure sampleNames: Names of the models
- OPTIONAL INPUT:
- fileName: Name of tab delimited CSV file to generate
(Default = print to command window)
- sampleCbModel(model, sampleFile, samplerName, options, modelSampling)¶
Samples the solution-space of a constraint-based model
USAGE:
[modelSampling, samples] = sampleCbModel(model, sampleFile, samplerName, options, modelSampling)
- INPUTS:
- model: COBRA model structure with fields
.S - Stoichiometric matrix
.b - Right hand side vector
.lb - ‘n x 1’ vector: Lower bounds
.ub - ‘n x 1’ vector: Upper bounds
.C - ‘k x n’ matrix of additional inequality constraints
.d - ‘k x 1’ rhs of the above constraints
.dsense - ‘k x 1’ the sense of the above constraints (‘L’ or ‘G’)
.vMean - ‘n x 1’ vector: the mean for Gaussian sampling (RHMC only)
.vCov - ‘n x 1’ vector: the diagonal for the covariance for Gaussian sampling (RHMC only)
- OPTIONAL INPUTS:
sampleFile: File names for sampling output files (only implemented for ACHR) samplerName: {(‘CHRR’), ‘ACHR’, ‘RHMC’} Name of the sampler to be used to
sample the solution.
- options: Options for sampling and pre/postprocessing (default values
in parenthesis).
.nStepsPerPoint - Number of sampler steps per point saved (200)
.nPointsReturned - Number of points loaded for analysis (2000)
.nWarmupPoints - Number of warmup points (5000). ACHR only.
.nFiles - Number of output files (10). ACHR only.
.nPointsPerFile - Number of points per file (1000). ACHR only.
.nFilesSkipped - Number of output files skipped when loading points to avoid potentially biased initial samples (2) loops (true). ACHR only.
.maxTime - Maximum time limit (Default = 36000 s). ACHR only.
.toRound - Option to round the model before sampling (true). CHRR only.
.lambda - the bias vector for exponential sampling. CHRR_EXP only.
.nWorkers - Number of parallel workers. RHMC only.
- modelSampling: From a previous round of sampling the same
model. Input to avoid repeated preprocessing.
- OUTPUTS:
modelSampling: Cleaned up model used in sampling samples: n x numSamples matrix of flux vectors
Examples
%1) Sample a model called ‘superModel’ using default settings and save the % results in files with the common beginning ‘superModelSamples’
[modelSampling,samples] = sampleCbModel(superModel,’superModelSamples’);
%2) Sample a model called ‘hyperModel’ using default settings except with a total of 50 sample files % saved and with 5000 sample points returned.
options.nFiles = 50; options.nPointsReturned = 5000; [modelSampling,samples] = sampleCbModel(hyperModel,’options’,options);
- sampleScatterMatrix(rxnNames, model, sample, nPoints, fontSize, dispRFlag, rxnNames2)¶
Draws a scatterplot matrix with pairwise scatterplots for multiple reactions
USAGE:
sampleScatterMatrix(rxnNames, model, sample, nPoints, fontSize, dispRFlag, rxnNames2)
- INPUTS:
rxnNames: Cell array of reaction names to be plotted model: Model structure sample: Samples to be analyzed (nRxns x nSamples)
- OPTIONAL INPUTS:
nPoints: How many sample points to plot (Default nSamples) fontSize: Font size for labels (Default calculated based on
number of reactions)
dispRFlag: Display correlation coefficients (Default false) rxnNames2: Optional second set of reaction names
Example
%Plots the scatterplots only between the three reactions listed - %histograms for each reaction will be on the diagonal sampleScatterMatrix({‘PFK’, ‘PYK’, ‘PGL’}, model, sample); %Plots the scatterplots between each of the first set of reactions and %each of the second set of reactions. No histograms will be shown. sampleScatterMatrix({‘PFK’, ‘PYK’, ‘PGL’}, model, sample, 100, 10, true, {‘ENO’,’TPI’});