Sampling
-
calcSampleDifference(sample1, sample2, nPts)
Randomly selects nPts flux vectors from sample1 and sample2 and calculates the difference between the flux vectors.
Usage
    [sampleDiff, sampleRatio] = calcSampleDifference(sample1, sample2, nPts)
Inputs
- sample1 – First flux sample
- sample2 – Second flux sample
Optional input
- nPts – Number of flux difference profiles desired (default: 10% of the samples)
Outputs
- sampleDiff – Difference between the flux vectors
- sampleRatio – Ratio of the flux vectors
Example
    % Example 1: use the default number of points
    [sampleDiff, sampleRatio] = calcSampleDifference(sample1, sample2)
    % Example 2: request 10 difference profiles
    [sampleDiff, sampleRatio] = calcSampleDifference(sample1, sample2, 10)
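The idea of drawing random flux vectors from each sample and comparing them can be sketched in base MATLAB as follows. This is an illustration on synthetic data, not the toolbox implementation: the nRxns x nSamples layout follows the convention used elsewhere on this page, while the independent column draws, the sign of the difference, and the names diffSketch and ratioSketch are assumptions.

```matlab
% Illustration only: synthetic samples stored as nRxns x nSamples matrices.
rng(0);                                   % fix the seed so the sketch is repeatable
sample1 = rand(5, 200) + 0.1;             % 5 reactions, 200 sampled flux vectors
sample2 = rand(5, 200) + 0.6;             % second condition, shifted upward

nPts = 20;                                % number of difference profiles wanted
idx1 = randi(size(sample1, 2), 1, nPts);  % random columns of sample1
idx2 = randi(size(sample2, 2), 1, nPts);  % random columns of sample2

% Assumed sign convention: second sample minus first sample.
diffSketch  = sample2(:, idx2) - sample1(:, idx1);
% Element-wise ratio of the same randomly paired flux vectors.
ratioSketch = sample2(:, idx2) ./ sample1(:, idx1);

disp(size(diffSketch));                   % one column per requested profile
```

Each column of diffSketch is one flux difference profile, so nPts controls how many profiles are produced.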
-
calcSampleStats(samples)
Calculates the mean, standard deviation, mode, median, skewness, and kurtosis of the samples.
Usage
    sampleStats = calcSampleStats(samples)
Input
- samples – Samples to analyze
Output
- sampleStats – Structure with the following fields:
- mean
- std
- mode
- median
- skew
- kurt
Example
    % Example 1: a single sample
    sampleStats = calcSampleStats(sample)
    % Example 2: a cell array of samples
    sampleStats = calcSampleStats({sample1, sample2})
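For orientation, the per-reaction statistics named in the output fields can be reproduced with base MATLAB on a synthetic sample. This is a sketch, not the toolbox code: it assumes an nRxns x nSamples matrix, uses the population (biased) moment definitions for skewness and kurtosis, and omits the mode; all variable names are illustrative.

```matlab
% Illustration only: per-reaction statistics of an nRxns x nSamples matrix.
rng(1);
samples = randn(4, 1000);                 % 4 reactions, 1000 sampled points

mu  = mean(samples, 2);                   % mean of each reaction (row)
sd  = std(samples, 0, 2);                 % standard deviation of each row
med = median(samples, 2);                 % median of each row

% Central moments computed by hand, so no Statistics Toolbox is needed.
centered   = bsxfun(@minus, samples, mu); % subtract each row's mean
m2         = mean(centered.^2, 2);        % second central moment
skewSketch = mean(centered.^3, 2) ./ m2.^1.5;
kurtSketch = mean(centered.^4, 2) ./ m2.^2;

% Collect the results in a struct shaped like the documented output.
statsSketch = struct('mean', mu, 'std', sd, 'median', med, ...
                     'skew', skewSketch, 'kurt', kurtSketch);
```

Every field holds one value per reaction, which makes it easy to compare rows across two conditions.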
-
compareSampleTraj(rxnNames, samples, models, nBins)
Compares flux histograms for two or more samples for one or more reactions.
Usage
    compareSampleTraj(rxnNames, samples, models, nBins)
Inputs
- rxnNames – List of reaction names to compare
- samples – Samples to compare
- models – Cell array containing COBRA model structures
Optional input
- nBins – Number of bins (Default = nSamples / 25)
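To give a feel for the default bin count, the sketch below applies nSamples / 25 to a synthetic flux vector using histcounts from base MATLAB. Rounding the quotient to an integer is an assumption made here, and the variable names are illustrative.

```matlab
% Illustration only: histogram of one reaction's sampled fluxes
% using the documented default of nSamples / 25 bins.
rng(3);
fluxes = randn(1, 2000);                  % 2000 sampled values for one reaction

nBins = round(numel(fluxes) / 25);        % rounding is an assumption of this sketch
[counts, edges] = histcounts(fluxes, nBins);

% Bin centers are convenient for overlaying histograms of several samples.
centers = (edges(1:end-1) + edges(2:end)) / 2;
```

With this rule each bin holds about 25 points on average, so the histogram resolution grows with the sample size.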
-
compareTwoSamplesStat(sample1, sample2, tests)
Statistically compares two samples using the Kolmogorov-Smirnov, rank-sum, chi-square, and t-tests.
Usage
    [stats, pVals] = compareTwoSamplesStat(sample1, sample2, tests)
Inputs
- sample1, sample2 – Samples to compare
- tests – Cell array of tests to run, {test1, test2, ...} (Default = all tests):
- 'ks' - Kolmogorov-Smirnov test
- 'rankSum' - rank-sum test
- 'chiSquare' - chi-square test
- 'tTest' - t-test
Outputs
- stats – Struct array with statistics of the selected tests.
- pVals – Struct array with p values of the selected tests.
Example
    % Example 1: run all tests
    [stats, pVals] = compareTwoSamplesStat(sample1, sample2)
    % Example 2: run only the Kolmogorov-Smirnov and rank-sum tests
    [stats, pVals] = compareTwoSamplesStat(sample1, sample2, {'ks', 'rankSum'})
Outputs are returned in the order in which the tests are given, e.g. {'ks', 'rankSum'}.
-
identifyCorrelSets(model, sample, corrThr, R)
Identifies correlated reaction sets from sampling data.
Usage
    [sets, setNumber, setSize] = identifyCorrelSets(model, sample, corrThr, R)
Inputs
- model – COBRA model structure
- sample – Sample to be used to identify correlated sets
Optional inputs
- corrThr – Minimum correlation (\(R^2\)) threshold (Default = 1-1e-8)
- R – Correlation coefficient
Outputs
- sets – Sorted cell array of sets (largest first)
- setNumber – List of set numbers for each reaction in model (0 indicates that there is no set)
- setSize – List of set sizes
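The role of the \(R^2\) threshold can be illustrated with base MATLAB's corrcoef on a synthetic sample. This sketch shows only the thresholding step under the assumption of an nRxns x nSamples matrix; it does not reproduce how the toolbox groups reactions into sets, and the names adj and partnersOf1 are illustrative.

```matlab
% Illustration only: flag reaction pairs whose R^2 meets the threshold.
rng(2);
v1 = randn(1, 500);
% Reactions 1-3 are exact multiples of each other; reaction 4 is independent.
sample = [v1; 2 * v1; -v1; randn(1, 500)];

R = corrcoef(sample');                    % corrcoef treats columns as variables
corrThr = 1 - 1e-8;                       % documented default threshold
adj = (R.^2 >= corrThr);                  % true where two reactions are correlated

partnersOf1 = find(adj(1, :));            % reactions correlated with reaction 1
disp(partnersOf1);
```

Squaring R means perfectly anti-correlated reactions, such as reactions 1 and 3 here, are treated the same as perfectly correlated ones.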
-
loadSamples(filename, numFiles, pointsPerFile, numSkipped, randPts)
Loads a set of sampled data points.
Usage
    samples = loadSamples(filename, numFiles, pointsPerFile, numSkipped, randPts)
Inputs
- filename – The name of the files containing the sample points.
- numFiles – The number of files containing the sample points.
- pointsPerFile – The number of points to be taken from each file.
Optional inputs
- numSkipped – Number of files skipped (default = 0)
- randPts – Select random points from each file (true/false, default = false)
Output
- samples – Sample flux distributions
-
plotHistConv(model, sample, rxnNames, nSubSamples)
Plots convergence of sample histograms.
Usage
    plotHistConv(model, sample, rxnNames, nSubSamples)
Inputs
- model – COBRA model structure
- sample – Sampled fluxes
- rxnNames – List of reactions to plot
- nSubSamples – Number of sub samples
Example
    % Example 1
    rxnNames = {'R1', 'R2'};
    plotHistConv(model, sample, rxnNames, nSubSamples)
-
plotSampleHist(rxnNames, samples, models, nBins, perScreen, modelNames, add2Plot)
Compares flux histograms for one or more samples for one or more reactions.
Usage
    plotSampleHist(rxnNames, samples, models, nBins, perScreen, modelNames, add2Plot)
Inputs
- rxnNames – Cell array of reaction abbreviations
- samples – Cell array containing samples
- models – Cell array containing model structures or common model structure
Optional inputs
- nBins – Number of bins to be used
- perScreen – Number of reactions to show per screen. Either a number or [nY, nX] vector. (press ‘enter’ to advance screens)
- modelNames – Cell array containing the name of the models (used for the plot’s legend).
- add2Plot – Struct array with additional data to show more detailed information (real measurements, FVA results, statistics results, etc.).
Example
    sampleStructOut1 = gpSampler(model1, 2150);
    sampleStructOut2 = gpSampler(model2, 2150);
    % Extract the sampled points (assumes they are stored in the .points field)
    samplePoints1 = sampleStructOut1.points;
    samplePoints2 = sampleStructOut2.points;
    % Plot for model 1
    plotSampleHist(model1.rxns, {samplePoints1}, {model1})
    % Plot reactions in model 1 that also exist in model 2, using 10
    % bins and plotting 12 reactions per screen
    plotSampleHist(model1.rxns, {samplePoints1, samplePoints2}, {model1, model2}, 10, 12)
Controls:
- To advance to the next screen, hit 'enter/return', or type 'f' and hit 'enter/return'.
- To rewind to the previous screen, type 'r' or 'b' and hit 'enter/return'.
- To quit the script, type 'q' and hit 'enter/return'.
-
printSampleStats(sampledModel, commonModel, sampleNames, fileName)
Prints out sample statistics for multiple samples.
Usage
    printSampleStats(sampledModel, commonModel, sampleNames, fileName)
Inputs
- sampledModel – Samples to plot
- commonModel – COBRA model structure
- sampleNames – Names of the models
Optional input
- fileName – Name of tab delimited CSV file to generate (Default = print to command window)
-
sampleCbModel(model, sampleFile, samplerName, options, modelSampling)
Samples the solution space of a constraint-based model.
Usage
    [modelSampling, samples] = sampleCbModel(model, sampleFile, samplerName, options, modelSampling)
Input
- model – COBRA model structure with fields:
- .S - Stoichiometric matrix
- .b - Right hand side vector
- .lb - Lower bounds
- .ub - Upper bounds
Optional inputs
- sampleFile – File names for sampling output files (only implemented for ACHR)
- samplerName – {('CHRR'), 'ACHR'} Name of the sampler to be used to sample the solution.
- options – Options for sampling and pre/postprocessing (default values in parentheses):
- .nStepsPerPoint - Number of sampler steps per point saved (200)
- .nPointsReturned - Number of points loaded for analysis (2000)
- .nWarmupPoints - Number of warmup points (5000). ACHR only.
- .nFiles - Number of output files (10). ACHR only.
- .nPointsPerFile - Number of points per file (1000). ACHR only.
- .nFilesSkipped - Number of output files skipped when loading points to avoid potentially biased initial samples (2). ACHR only.
- .maxTime - Maximum time limit (Default = 36000 s). ACHR only.
- .toRound - Option to round the model before sampling (true). CHRR only.
- .lambda - the bias vector for exponential sampling. CHRR_EXP only.
- modelSampling – From a previous round of sampling the same model. Input to avoid repeated preprocessing.
Outputs
- modelSampling – Cleaned up model used in sampling
- samples – n x numSamples matrix of flux vectors
Example
    % 1) Sample a model called 'superModel' using default settings and save the
    %    results in files with the common beginning 'superModelSamples'
    [modelSampling, samples] = sampleCbModel(superModel, 'superModelSamples');
    % 2) Sample a model called 'hyperModel' using default settings, except with a
    %    total of 50 sample files saved and 5000 sample points returned.
    %    nFiles applies to ACHR only, so the sampler is named explicitly; the file
    %    prefix 'hyperModelSamples' is illustrative.
    options.nFiles = 50;
    options.nPointsReturned = 5000;
    [modelSampling, samples] = sampleCbModel(hyperModel, 'hyperModelSamples', 'ACHR', options);
-
sampleScatterMatrix(rxnNames, model, sample, nPoints, fontSize, dispRFlag, rxnNames2)
Draws a scatterplot matrix with pairwise scatterplots for multiple reactions.
Usage
    sampleScatterMatrix(rxnNames, model, sample, nPoints, fontSize, dispRFlag, rxnNames2)
Inputs
- rxnNames – Cell array of reaction names to be plotted
- model – Model structure
- sample – Samples to be analyzed (nRxns x nSamples)
Optional inputs
- nPoints – How many sample points to plot (Default nSamples)
- fontSize – Font size for labels (Default calculated based on number of reactions)
- dispRFlag – Display correlation coefficients (Default false)
- rxnNames2 – Optional second set of reaction names
Example
    % Plots the scatterplots only between the three reactions listed;
    % histograms for each reaction will be on the diagonal
    sampleScatterMatrix({'PFK', 'PYK', 'PGL'}, model, sample);
    % Plots the scatterplots between each of the first set of reactions and
    % each of the second set of reactions. No histograms will be shown.
    sampleScatterMatrix({'PFK', 'PYK', 'PGL'}, model, sample, 100, 10, true, {'ENO', 'TPI'});