Content
- Introduction
- Example model
- File formats
- Flux Balance Analysis
- Detection of dead ends
- Flux Variability Analysis
- Minimization of Metabolic Adjustment
- Knockout analysis
- Split-ratio Analysis
- Metabolite Flux Minimization
- SBML Support
- Other metano tools
Introduction
All methods implemented in are available both as Python functions and as stand-alone
command line
scripts. This page only describes the stand-alone programs, which take lists of reactions in plain ASCII format as
input. is designed to run in any Linux environment. A usage hint for any script in
can be
obtained by launching the program with the command line option -h
.
Example model
An E. coli core model can be downloaded in the SBML format from
http://systemsbiology.ucsd.edu/Downloads/EcoliCore.
The model can be converted to the format via the command
sbmlparser.py -f ecoli_core_model.xml -o reactions_Ec_core.txt -p scenario_Ec_core.txt -q flux_Ec_core.txt -b _b$ -n -s
This produces the reaction file reactions_Ec_core.txt
, the scenario file
scenario_Ec_core.txt
, and the flux file flux_Ec_core.txt
in ’s output
file format. These files can be used as test input for trying out the programs in .
File formats
In
Both reaction and scenario files can contain comments. A comment starts with a hash character (#) and ends at the
end
of the line on which it occurs.
Reaction file
The reaction file contains lines of the form
<reaction name> : <substrates> <arrow> <products>where
<substrates>
and <products>
are lists of compounds (metabolites) given
as
[coefficient] <compound name>entries, which are separated by plus signs (+). If the stoichiometric coefficient is omitted, the default value of 1 is assumed.
<arrow>
is <=>
, -->
, or <--
and
specifies a constraint on the direction of the reaction. -->
means that the reaction is irreversible
and can only proceed from left to right, while <--
means that it can only proceed from right to
left. <=>
means that the reaction is reversible. The flux through an irreversible reaction is
implicitly constrained to be positive (-->
) or negative (<--
), respectively.Example:
PBG-D : 4 porphobilinogen + h2o <=> hydroxymethylbilane + 4 ammoniaTransporters are defined as ordinary reactions that convert between a compound in one compartment and the same compound in another compartment. For distinguishing between metabolite pools in different compartments, compartment identifiers should be appended to the compound names. Example:
co2-trans : co2[c] --> co2[e]Boundary fluxes are defined by reactions that are not in mass balance, i.e. they convert a compound to nothing or produce it from nothing. Example:
co2-boundary : co2[e] <=>Reaction and compound names may contain any character except for whitespace characters (space, tab, newline), colon (:), hash (#), and the reserved arrow sequences (
<=>
, -->
, <--
).
Scenario file
The scenario file can contain five different types of parameter definitions:
- Objective function for FBA
- Flux bounds
- Linear equality or inequality constraints
- Solver to be used for optimization problems
- Number of starting points (used only for nonlinear optimization)
OBJ <MAX|MIN> <objective function>
<objective function>
is either a linear combination of reaction fluxes or a term of the form
PER_FLUX(<reaction name>)
. In the first case, the value of the linear function is minimized
or
maximized, respectively. In the second case, the nonlinear objective function “flux through single reaction
divided by summed absolute flux over all reactions” is used. There may be only one OBJ definition per file.
In a linear combination of flux variables, reaction fluxes are identified by the corresponding reaction names as defined in the reaction file. For instance,
R1 - R2b + 2.5 R3represents the expression “flux through reaction R1 minus flux through R2b plus 2.5 times flux through R3”. As + and − can be part of reaction names, whitespace is required between operators, coefficients, and reaction names.
Flux bounds are defined by lines of the form
LB <reaction name> <value>or
UB <reaction name> <value>respectively. UB defines an upper bound, while LB defines a lower bound.
UB GLU-1 22
means
that the flux through reaction GLU-1 is restricted to values less than 22. It is possible to restrict the flux
through a reaction to a fixed value by setting the lower and upper bounds for that reaction to the same value. Gene
deletions can be modeled by restricting to zero the fluxes through all reactions catalyzed by the enzyme to be
knocked
out.Arbitrary linear constraints can be defined as equalities or inequalities of the form
<linear combination of fluxes> <sign> <value>where
<sign>
is <, >, or =, and <linear combination of fluxes>
is a
linear combination of flux variables, as described above. For example, the constraint
R_complex-1A - R_complex-1B = 0states that the reactions
R_complex-1A
and R_complex-1B
must have the same flux value.
The syntax of linear constraints can also be used to define flux bounds. The lines
UB GLU-1 22and
GLU-1 < 22are treated as equivalent.
The solver to be used for solving optimization problems is specified by name in a line of the form
SOLVER <name>The following solvers are currently available:
glpk
(default, recommended) andlpsolve
for linear programmingcvxopt_qp
for quadratic programmingalgencan
for nonlinear programming
The number of random starting points to try in nonlinear optimization can be specified by a line of the form
NUM_ITER <value>There may be only one NUM_ITER definition per file.
Flux Balance Analysis
The script fba.py performs flux balance analysis (FBA) on a given metabolic network, i.e. it solves the following optimization problem:
Minimize/Maximize the given objective function in steady state under the given additional linear equality and inequality constraints with all flux variables constrained to the given bounds.The script is called as follows:
fba.py -r <reaction file> -p <scenario file> -o <output file> [-m]It parses the given reaction and scenario files, formulates the optimization problem, applies the given solver to the problem, and writes the output to the given file. The optional command line switch
-m
instructs the
program to return the optimal solution with the lowest total flux. As this solution is determined using binary
search
with one FBA at each step, this has a much longer running time than an ordinary FBA run.The output is in form of a table with one entry for each reaction, listing the reaction name, the predicted flux value, and the effective lower and upper flux bounds. Like all programs in ,
fba.py
supports the command line option -h
, which prints a help message regarding the usage of the script.
Detection of dead ends
deadends.py identifies dead ends in the metabolic network, i.e. compounds that cannot be produced and consumed in separate reactions. The usage is
deadends.py -r <reaction file> -p <scenario file> [-a] [-t <regex>]Owing to the steady state condition, any reaction that contains a dead end is assigned a flux of zero in the FBA. The optional command line switch
-a
instructs the program to recursively identify all reactions that
are
blocked due to dead ends and all secondary dead ends induced by blocked reactions. Dead ends can indicate errors in
the model, such as missing reactions, wrong reaction directions, and use of different names for the same compound.
The program can be instructed to temporarily leave the fluxes through transport and exchange reactions unconstrained. This allows distinguishing between dead ends resulting from the network topology and those induced by the virtual medium composition. To this end, a regular expression identifying the transport and/or exchange reactions can be passed to the program using the option
-t
.The output, which is sent to the console, consists of a list of all dead-end metabolites followed by a list of all blocked reactions.
Flux Variability Analysis
The script fva.py performs flux variability analysis (FVA) on a given metabolic network. Its usage is
fva.py -r <reaction file> -p <scenario file> -o <output file> [-t <threshold>] [-s <solution file>]It first performs a standard FBA (or reads in an FBA solution given via command line option
-s
) and
then performs variability analysis by the following algorithm:
- Introduce a new constraint: Original objective function must have a value of at least
<threshold>
(default: 0.95) times the optimal value (as determined from the original FBA). - Perform linear programming for each flux variable: Successively maximize and minimize each of the flux variables.
-t
.The output is in form of a table with one entry for each reaction, listing the reaction name, the minimum flux value, the flux value in the FBA solution, the maximum flux value, the difference between maximum and minimum value, and the effective lower and upper flux bounds.
Minimization of Metabolic Adjustment
The script moma.py performs minimization of metabolic adjustment (MOMA) on the metabolic network and inequality constraints of a virtual mutant against the given FBA solution for the corresponding wild type, i.e. it solves the following optimization problem:
Minimize the squared Euclidean distance between the flux vector and the wild-type flux distribution in steady state under the given additional linear equality and inequality constraints with all flux variables constrained to the given bounds.The script is called as follows:
moma.py -r <reaction file> -p <scenario file> -w <wild-type solution file> -o <output file> [-s <solver>] [-i <num_iter>]It parses the given reaction, scenario, and FBA solution files, formulates the quadratic optimization problem, applies the optionally given solver to the problem, and writes the output to the given file. If the command line option
-s
is omitted, solver cvxopt_qp
is used by default. If a nonlinear solver is used,
the number of random starting points to be used can be specified using the command line option -i
.
Mutant and wild type must have the same reactions. Gene deletions can be modeled by restricting to zero the fluxes through all reactions catalyzed by the enzyme to be knocked out. MOMA output has the same format as FBA output (see above).
Knockout analysis
The script knockout.py performs an automated analysis of the virtual knockout mutants given as a list of reaction groups to be knocked out together. This is done by sequentially setting the fluxes through all reactions in each group to zero and performing FBA or MOMA against the wild-type solution. The usage is
knockout.py -r <reaction file> -p <scenario file> -o <output file> [-f <solution file prefix>] [-g <knockout group file>] [-n] [-m <always|never|auto>] [-w <wild-type solution>] [-s <solver>] [-i <num_iter>]Reaction groups are defined using the command line option
-g
. <knockout group file>
contains lines of the form
group_name;reaction_nameAll lines with the same
group_name
define one reaction group. A reaction may belong to more than one
reaction group. If the command line switch -n
is present, all reactions not belonging to any group are
knocked out individually (usually not desirable). If the option -g
is omitted, all single-reaction
“knockouts” are generated.At first, the script performs an FBA to get the optimal wild-type flux distribution or reads it from file if the
-w
option is used. It then sequentially generates virtual knockout mutants for each reaction group and
computes a flux distribution for each such “mutant”. If the command line option -f
is used,
all generated flux distributions are written to files named
<solution file prefix><group_name>.txt
. The command line option -m
selects the
computational method:
-m auto
(default): Perform FBA first to determine viability, then perform MOMA only if the knockout has not been found to be lethal.-m always
: Always perform MOMA (slower).-m never
: Perform FBA only (fast, but less accurate).
If the command line option -s
is omitted, solver
cvxopt_qp
is used by default. If a nonlinear solver is used, the number of random starting points to be
used can be specified using the command line option -i
.
The output is in form of a table with one entry for each reaction group, listing the group name, the percentage of
the
FBA objective function value obtained for the mutant in comparison to the wild type, the distance as measured by the
actual MOMA objective function (Euclidean distance or squared distance, possibly multiplied by a constant factor),
the
absolute distance, and the FBA objective function value for the mutant.
Because of its long running time, this program has been parallelized using
MPI, which allows it to be
executed on computer clusters. On most MPI installations, a multi-process run of knockout.py
can be
launched by typing
mpiexec -n <number of processes> knockout.py <options>in the console, provided that the MPI daemon is running.
Alternatively, FBA can be used for knockout analysis instead of MOMA. This is selected via the command line switch
-v
. In this case, the running time is much lower, but results may be less biologically meaningful.
Split-ratio Analysis
The script splitratios.py performs split-ratio analysis (SRA) on a given FBA solution for a given metabolic network. Its usage is
splitratios.py <solution file> -r <reaction file> -o <output file> [-u] [-t <threshold>]This computes the split ratios for all metabolite nodes in the network. The command line switch
-u
excludes all metabolite nodes that have only one incoming and one outgoing flux (i.e. those lying on unbranched
paths).
The option -t
further allows the exclusion of all metabolite nodes with a total flux below the given
threshold.The script can also be used to compute only the split ratios for a single metabolite. In this case the usage is
splitratios.py <solution file> -r <reaction file> -m <metabolite name>This leads to a much clearer output than the long table produced when the command line option
-o
is
used.
Metabolite Flux Minimization
The script mfm.py performs metabolite flux minimization (MFM) on a given metabolic network, i.e. it successively minimizes the flux through each metabolite node. Its usage is
mfm.py -r <reaction file> -p <scenario file> -o <output file> [-t <threshold>]It first performs a standard FBA and then performs variability analysis by the following algorithm:
- Introduce a new constraint: Original objective function must have a value of at least
<threshold>
(default: 0.95) times the optimal value (as determined from the original FBA). - Perform linear programming for each metabolite flux: Successively minimize each metabolite flux.
-t
.
Automated plausibility checking of metabolic models
The script modelassert.py checks a number of assertions (Boolean expressions that evaluate to
True
for the expected case) about a model’s predictions and reports those that are violated. The
assertion monitor is used as follows:
modelassert.py -r <reaction file> -p <scenario file> -a <assertion file> [-t <threshold>]It reads in the given reaction file and scenario file, performs FBA, FVA, and split-ratio analysis, and checks the results against the list of assertions from the given file. All assertions that do not hold are reported.
Each line of the assertion file is one Boolean expression. Empty lines are allowed. Comments, which start with the hash sign (#) and end at the end of the line, are ignored. The following symbols can be used in assertions:
<reaction name>
– flux through reaction in FBA solutionmin(<reaction name>)
– FVA minimum of flux through reactionmax(<reaction name>)
– FVA maximum of flux through reaction<metabolite name>
– total flux going through metabolite in FBA solutioninratio(<metabolite name>, <reaction name>)
– fraction of metabolite produced via reaction in FBA solutionoutratio(<metabolite name>, <reaction name>)
– fraction of metabolite consumed via reaction in FBA solution
eval()
function is used to evaluate the assertions. It is the user’s responsibility to
use only Python functions that do not produce disruptive side effects.
SBML Parser
With sbmlparser.py,
includes a parser for models in the Systems Biology Markup Language (SBML). The parser reads an SBML file and writes a set of files in the native formats of . The usage is as follows:sbmlparser.py -f <SBML file> -o <r.filename> -p <s.filename> [-q <f.filename>] [-i <inf-level>] [-n] [-s] [-x <solver>] [-b <boundary regex>]The parser reads the given SBML file and writes the reactions to the file designated by
r.filename
. The
flux bounds are written to the scenario file s.filename
. If the command line option -q
is
present, reaction fluxes (if present in the SBML file) are written to file f.filename
. The file format
for fluxes is the output file format of fba.py
(see above). The
reaction file, scenario file, and flux file are overwritten if they exist.Option
-i
can be used to specify an infinity level above which flux bounds are considered to be
infinite
(default: 1000). The same value is used for positive and negative infinity. The command line switch -n
instructs the parser to use names in place of IDs for reactions and compounds, which can improve the readability of
the resulting files. The switch -s
suppresses the transfer of comments, which may clutter the reaction
file. Option -x
can be used to pass a solver to be written to the scenario file.“Boundary species”, which can be exchanged freely with the environment, are usually marked in SBML by having their
boundaryCondition
set to “true”
. However, in some models this
information is only encoded in the name of the compound (e.g. by the suffix _b
). The identifying tag
can
be passed to the program as a
regular expression via command line
option -b
. If boundary metabolites are not correctly identified, the model will not work in
FBA.Error, warning, and info messages are written to the console. See above for a usage example.
SBML Writer
sbmlwriter.py converts a set of one reaction file, one scenario file, and (optionally) one flux distribution (i.e. FBA or MOMA solution file) to an SBML file. Its usage is
sbmlwriter.py -r <reaction file> -p <scenario file> -o <SBML file> [-q <flux file>] [-c <compartment regex>] [-d <default compartment>] [-i <inf-level>] [-s] [-b <suffix>]The SBML writer reads a reaction file, scenario file, and (optionally) flux file and writes the output to
<SBML file>
.The command line option
-c
can be used to specify the encoding of compartment information in the names
of
the compounds in the reaction file. For instance, "\[(\w+)\]$"
means that a word in brackets at the end
of the compound name (such as [Cytosol]
or [Extracellular]
) identifies the compartment.
The
regular expression must contain exactly one group (in parentheses). The option -d
is used to define the
default compartment to which all compounds not matching the regular expression are assigned (default:
Cytosol
).If the option
-i
is present, finite values are used for all flux bounds. The value
<inf-level>
is assigned to all unbounded fluxes (with appropriate sign). The command line switch
-s
can be used to suppress the transfer of comments. If the option -b
is present, boundary
species (marked with the given suffix) are added to all reactions lacking either products or reactants.
Other metano tools
-h
,
which prints a help message regarding the usage of the script.
to_scatterplot.py
Warning: this module contains known bugs and limitations. Use at own risk.
This script allows to compare two flux distributions in a logarithmic scatter plot. The x and y files can be either
flux distributions obtained from FBA, MOMA, FVA, or MFM analysis, or plain scenario files. The script reads the
files or performs analyses depending on the given analysis method. The files that are passed to x and y must be of
the same kind. If two scenario files are passed to the script, a reaction file must be provided (-r
).
The usage is
to_scatterplot.py -x <x FILE> -y <y FILE> -m <fba|moma|fva|mfm|mva> [-r <reaction file> -c <config file> -o <output file> -b <obj fun>]
If an output file is given (-o
, the scatter plot is saved in a file. Otherwise, an interactive
scatterplot is shown.
The user can pass the name of the objective function to the -b
flag of the script. By doing this, all
differences between both scenarios that are only based on the change in objective function, are considered to be not
significant.
It is possible to pass a configuration file, that is used to alter the style of the scatter plot, to the
-c
flag. This style file may contain the following parameters:
-
label_length = 15
(Maximum label length longer labels will be abbreviated) -
ids_in_names = False
(True if BRENDA, KEGG, or MetaCyc IDs are in reaction names) -
unique_names = False
(If True, keep names unique if reaction names are abbreviated) -
min_value = 0.0
(Lower cutoff value for the plot) -
insignificant_color = #545454
(Color for insignificant data points) -
significant_color = #BE1E3C
(Color for insignificant data points) -
xlabel = x file name
(label for the x axis; if omitted, the filename is used) -
ylabel = y file name
(label for the y axis; if omitted, the filename is used) -
title =
(Title for the plot; leave empty to disable the title) -
show_errorbar = significant
(choose which errorbar to show [all, significant, none], works only for FVA)/MVA -
show_labels = significant
(choose which point labels to show [all, significant, none]) -
arrowcolor = #BE1E3C
(color for label arrows, only works with textAlign) -
labelcolor = #BE1E3C
(color for point label text; does not work with textAlign) -
significance_threshold = 0.1
(threshold for significance calculation) -
fontsize = 10
(general fontsize) -
figsize = 10 8
(the figure size, x and y seperated by one space) -
plotgrid = False
(show the grid of the plot) -
arrow_line_width = 1
(the width of the label arrows; only works with textAlign)
-x
and -y
, the metabolic variability
analysis (MVA) kann be performed by setting -m mva
. First of all, this method performs FBA and shows the
result of split ratio analysis as data points. Secondly, a MFM for each scenario is performed, with both, metabolic flux
minimization and maximization. The resulting flux range is shown as error bars in the scatterplot.
to_evaluate_objective.py
This script improves debugging of
models by evaluating the production flux of each$ It iterates through all metabolites of the objective function and maximizes their production flux. The usage isto_evaluate_objective.py -r <reaction file> -s <solution file>
The script prints its output in the console and lists all metabolites, that could not be produced by the metabo$
to_rea2m.py
This script reads in a reaction file and exports the stoichiometric matrix corresponding to the metabolic network as a .m file, which can be read by MATLAB and GNU Octave. The usage is
to_rea2m.py -r <reaction file> -o <output file>The output file should have the extension “.m”.
to_list_active_metab.py
This script reads in an FBA/MOMA solution file and the underlying reaction file and prints the names of all compounds present in reactions with a non-zero flux along with the number of such reactions in which they appear. The usage is
to_list_active_metab.py -r <reaction file> -s <solution file> [-c <cutoff>]Reactions with a flux value below the given cutoff (default: 1E-10) are considered inactive. The output, which is sent to the console, consists of a list of all active metabolites, each with the number of active reactions it is involved in, followed by a list of all active reactions.
to_diff_solutions.py
This script computes the absolute difference between two flux distributions given as FBA/MOMA solution files. The usage is
to_diff_solutions.py <file1> <file2> [-o <output file>] [-c <cutoff>]The output is in form of a table with one entry for each reaction, listing the reaction name, the absolute flux difference, and the values of the flux in file1 and file2, respectively. If no output file is given, the output is sent to the console, and the latter two columns are omitted. The output is sorted by the absolute flux differences in descending order. Only flux differences above the given cutoff value are reported. If no cutoff is specified, the default value of 1E-10 is used.
to_check_constraints.py
This script checks whether the given flux distribution (FBA/MOMA solution file) conforms to the constraints imposed by the given reaction and/or scenario file. It lists all constraints violations. There is zero tolerance, so that e.g. a value of -1E-25 violates an “LB 0” constraint. The usage is
to_check_constraints.py <solution file> [-r <reaction file>] [-p <scenario file>]The list of violated constraints is printed to the console.
to_fix_names.py
This script reads in a reaction file that contains whitespace characters in the reaction or compound names and exports a fixed reaction file, in which these characters have been replaced with underscores (_). The usage is
to_fix_names.py -i <input file> -o <output file>Whitespace characters (space, tab, newline) are normally illegal in compound names, but might still occur if the file has been automatically generated, e.g. from enzyme or pathway databases. This script fixes such files, so that the metabolic network defined therein can be analyzed with .