SQMtools 0.7.1

The SQMtools package provides functions for loading and analyzing results from the SqueezeMeta metagenomics analysis pipeline. It loads SqueezeMeta outputs into a single R object for easy analysis and filtering. Functions allow filtering results based on taxonomy, functions, contigs and more. Results can be visualized with basic plots and exported for programs like Krona and Pathview. The package combines multiple filtered SqueezeMeta results into a single object for integrated analysis.


Package ‘SQMtools’

January 26, 2022

Title Analyze results generated by the SqueezeMeta pipeline
Version 0.7.1
Description SqueezeMeta is a versatile pipeline for the automated analysis of
metagenomics/metatranscriptomics data (http://github.com/jtamames/SqueezeMeta).
This package provides functions for loading SqueezeMeta results into R, filtering
them based on different criteria, and visualizing the results using basic plots.
The SqueezeMeta project (and any subsets of it generated by the different filtering
functions) is parsed into a single object, whose different components (e.g. tables
with the taxonomic or functional composition across samples, contig/gene abundance
profiles) can be easily analyzed using other R packages such as vegan or DESeq2.
Author Fernando Puente-Sánchez, Natalia García-García
Maintainer Fernando Puente-Sánchez <[email protected]>
Depends R (>= 3.2.0)
Imports reshape2, ggplot2, pathview, data.table
Suggests vegan, DESeq2
License GPLv3
Encoding UTF-8
LazyData true
RoxygenNote 7.1.2
BugReports https://github.com/jtamames/SqueezeMeta/issues
URL https://github.com/jtamames/SqueezeMeta

R topics documented:
combineSQM
combineSQMlite
exportKrona
exportPathway
exportTable
Hadza
loadSQM
loadSQMlite
MGKOs
MGOGs
mostAbundant
plotBars
plotFunctions
plotHeatmap
plotTaxonomy
RecA
rowMaxs
rowMins
subsetBins
subsetContigs
subsetFun
subsetORFs
subsetRand
subsetTax
summary.SQM
summary.SQMlite
USiCGs

combineSQM Combine several SQM objects

Description
Combine an arbitrary number of SQM objects into a single SQM object. The input objects must
be subsets of the same original SQM object (i.e. from the same SqueezeMeta run). For combining
results from different runs please check combineSQMlite.

Usage
combineSQM(
...,
tax_source = "orfs",
trusted_functions_only = F,
ignore_unclassified_functions = F,
rescale_tpm = T,
rescale_copy_number = T
)

Arguments
... an arbitrary number of SQM objects. Alternatively, a single list containing an
arbitrary number of SQM objects.
tax_source character. Features used for calculating aggregated abundances at the different
taxonomic ranks. Either "orfs" or "contigs" (default "orfs"). If the
objects being combined contain a subset of taxa or bins, this parameter can be
set to TRUE.
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best aver-
age) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE).
ignore_unclassified_functions
logical. If FALSE, ORFs with no functional classification will be aggregated
together into an "Unclassified" category. If TRUE, they will be ignored (default
FALSE).
rescale_tpm logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated
(so that the TPMs in the subset actually add up to 1 million). Otherwise, per-
function TPMs will be calculated by aggregating the TPMs of the ORFs an-
notated with that function, and will thus keep the scaling present in the parent
object (default TRUE).
rescale_copy_number
logical. If TRUE, copy numbers will be recalculated using the RecA/RadA
coverages in the subset. Otherwise, RecA/RadA coverages will be taken from
the parent object with the highest RecA/RadA coverages. By default it is set to
TRUE, which means that the returned copy numbers will represent the average
copy number per function in the genomes of the selected bins or contigs. If any
SQM objects that are being combined contain a functional subset rather than a
contig/bins subset, this parameter should be set to FALSE.
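
The difference between keeping the parent scaling and rescaling TPMs can be illustrated with a toy matrix. The sketch below uses base R only and made-up numbers. It is not the package's implementation: a plain renormalization stands in for the recalculation described for rescale_tpm, and `parent_tpm` and `keep` are hypothetical names.

```r
# Toy per-function TPM matrix for a parent object (each column sums to 1e6).
parent_tpm <- matrix(
  c(500000, 300000, 200000,
    250000, 250000, 500000),
  nrow = 3,
  dimnames = list(c("K00001", "K00002", "K00003"), c("S1", "S2"))
)

# Hypothetical subset: keep two of the three functions.
keep <- c("K00001", "K00003")
subset_tpm <- parent_tpm[keep, , drop = FALSE]

# Analogue of rescale_tpm = FALSE: values keep the scaling of the parent,
# so the columns no longer add up to 1 million.
colSums(subset_tpm)

# Analogue of rescale_tpm = TRUE: renormalize so that each sample adds up
# to 1 million again within the subset.
rescaled_tpm <- t(t(subset_tpm) / colSums(subset_tpm)) * 1e6
colSums(rescaled_tpm)
```

With the parent scaling, values remain comparable to the original object; after rescaling, they describe composition within the subset only.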

Value
A SQM object

See Also
subsetFun, subsetTax, combineSQMlite

Examples
data(Hadza)
# Select Carbohydrate metabolism ORFs in Bacteroidetes, and Amino acid metabolism ORFs in
bact = subsetTax(Hadza, "phylum", "Bacteroidetes")
bact.carb = subsetFun(bact, "Carbohydrate metabolism")
proteo = subsetTax(Hadza, "phylum", "Proteobacteria")
proteo.amins = subsetFun(proteo, "Amino acid metabolism")
bact.carb_proteo.amins = combineSQM(bact.carb, proteo.amins, rescale_copy_number=F)

combineSQMlite Combine several SQM or SQMlite objects

Description
Combine an arbitrary number of SQM or SQMlite objects into a single SQMlite object. This func-
tion accepts objects originating from different projects (i.e. different SqueezeMeta runs).

Usage
combineSQMlite(...)

Arguments
... an arbitrary number of SQM or SQMlite objects. Alternatively, a single list
containing an arbitrary number of SQMlite objects.

Value
A SQMlite object

See Also
subsetFun, subsetTax, combineSQM

Examples
## Not run:
data(Hadza)
# Load data coming from a different run
other = loadSQMlite("/path/to/other/project/tables") # e.g. if the project was run using
# (We could also use loadSQM to load the data as long as the data comes from a SqueezeMet
combined = combineSQMlite(Hadza, other)
plotTaxonomy(combined, 'family') # Now we can plot together the samples from Hadza and th

## End(Not run)

exportKrona Export the taxonomy of a SQM object into a Krona Chart

Description
Generate a Krona chart containing the full taxonomy from a SQM object.

Usage
exportKrona(SQM, output_name = NA)

Arguments
SQM A SQM or SQMlite object.
output_name character. Name of the output file containing the Krona charts in html format
(default "<project_name>.krona.html").

Details
Original code was kindly provided by Giuseppe D’Auria ([email protected]).

See Also
plotTaxonomy for plotting the most abundant taxa of a SQM object.

Examples
data(Hadza)
exportKrona(Hadza)

exportPathway Export the functions of a SQM object into KEGG pathway maps

Description

This function is a wrapper for the pathview package (Luo et al., 2017. Nucleic acids research,
45:W501-W508). It will generate annotated KEGG pathway maps showing which reactions are
present in the different samples. It will also generate legends with the color scales for each sample
in separate png files.

Usage

exportPathway(
SQM,
pathway_id,
count = "tpm",
samples = NULL,
split_samples = F,
sample_colors = NULL,
log_scale = F,
fold_change_groups = NULL,
fold_change_colors = NULL,
max_scale_value = NULL,
color_bins = 10,
output_suffix = "pathview"
)

Arguments

SQM A SQM or SQMlite object.

pathway_id character. The five-number KEGG pathway identifier. A list of all pathway
identifiers can be found in https://www.genome.jp/kegg/pathway.html.
count character. Either "abund" for raw abundances, "percent" for percentages,
"bases" for raw base counts, "tpm" for TPM normalized values or
"copy_number" for copy numbers (default "tpm"). Note that a given count
type might not be available in this object (e.g. TPM or copy number in SQMlite
objects originating from a SQM reads project).
samples character. An optional vector with the names of the samples to export. If absent,
all samples will be exported (default NULL).
split_samples
logical. Generate a different output file for each sample (default FALSE).
sample_colors
character. An optional vector with the plotting colors for each sample (default
NULL).
log_scale logical. Use a base 10 logarithmic transformation for the color scale. Will have
no effect if fold_change_groups is provided (default FALSE).
fold_change_groups
list. An optional list containing two vectors of samples. If provided, the function
will generate a single plot displaying the log2 fold-change between the average
abundances of both groups of samples (log2(second group / first group)) (default
NULL).
fold_change_colors
character. An optional vector with the plotting colors of both groups in the
fold-change plot. Will be ignored if fold_change_groups is not provided.
max_scale_value
numeric. Maximum value to include in the color scale. By default it is the max-
imum value in the selected samples (if plotting abundances in samples) or the
maximum absolute log2 fold-change (if plotting fold changes) (default NULL).
color_bins numeric. Number of bins used to generate the gradient in the color scale (default
10).
output_suffix
character. Suffix to be added to the output files (default "pathview").
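
The fold-change described for fold_change_groups can be reproduced by hand on a toy matrix. This is a minimal base R sketch with made-up abundances and sample names. It only illustrates the arithmetic (mean per group, then log2 of second over first), does not call pathview, and does not handle zero abundances.

```r
# Toy abundance matrix: functions in rows, samples in columns.
abund <- matrix(
  c(10, 20, 40, 80,
     8,  8,  2,  2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("K00370", "K00371"), c("A1", "A2", "B1", "B2"))
)

# Same shape as fold_change_groups: a list with two vectors of sample names.
groups <- list(c("A1", "A2"), c("B1", "B2"))

# Average abundance of each function within each group.
first_mean  <- rowMeans(abund[, groups[[1]], drop = FALSE])
second_mean <- rowMeans(abund[, groups[[2]], drop = FALSE])

# log2(second group / first group): positive values mean higher in group 2.
log2_fc <- log2(second_mean / first_mean)
log2_fc
```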

See Also
plotFunctions for plotting the most abundant functions of a SQM object.

Examples
data(Hadza)
exportPathway(Hadza, "00910", count = 'copy_number', output_suffix = "nitrogen_metabolism
exportPathway(Hadza, "00250", count = 'tpm', output_suffix = "ala_asp_glu_metabolism_Fold

exportTable Export results in tabular format

Description
This function is a wrapper for R’s write.table function.

Usage
exportTable(table, output_name)

Arguments
table vector, matrix or data.frame. The table to be written.
output_name character. Name of the output file.

Examples
data(Hadza)
Hadza.iron = subsetFun(Hadza, "iron")
# Write the taxonomic distribution at the genus level of all the genes related to iron.
exportTable(Hadza.iron$taxa$genus$percent, "Hadza.ironGenes.genus.tsv")
# Now write the distribution of the different iron-related COGs (Clusters of Orthologous
exportTable(Hadza.iron$functions$COG$tpm, "Hadza.ironGenes.COG.tsv")
# Now write all the information contained in the ORF table.
exportTable(Hadza.iron$orfs$table, "Hadza.ironGenes.orftable.tsv")
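
Since exportTable wraps write.table, a similar file can be produced with base R alone. The sketch below is an assumption about reasonable options (tab-separated, unquoted, row names kept); the exact arguments that exportTable passes to write.table are not stated in this manual. The table contents are made up.

```r
# Toy table: taxa in rows, samples in columns.
tab <- matrix(c(60, 40, 25, 75), nrow = 2,
              dimnames = list(c("Bacteroides", "Prevotella"), c("S1", "S2")))

out <- tempfile(fileext = ".tsv")

# col.names = NA writes an empty header cell above the row names column,
# so the header and the data rows have the same number of fields.
write.table(tab, file = out, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back, using the first column as row names.
back <- as.matrix(read.table(out, header = TRUE, sep = "\t", row.names = 1,
                             check.names = FALSE))
back
```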

Hadza Hadza hunter-gatherer gut metagenomes

Description
Subset of 5 bins (and the associated contigs and genes) generated by running SqueezeMeta on two
gut metagenomic samples obtained from two hunter-gatherers of the Hadza ethnic group.

Usage
data(Hadza)

Format
A SQM object; see loadSQM.

Source
SRR1927149, SRR1929485.

References
Rampelli et al., 2015. Metagenome Sequencing of the Hadza Hunter-Gatherer Gut Microbiota.
Curr. Biol. 25:1682-93 (PubMed).

Examples
data(Hadza)
plotTaxonomy(Hadza, "genus", rescale=T)
plotFunctions(Hadza, "COG")

loadSQM Load a SqueezeMeta project into R

Description
This function takes the path to a project directory generated by SqueezeMeta (whose name is spec-
ified in the -p parameter of the SqueezeMeta.pl script) and parses the results into a SQM object.

Usage
loadSQM(
project_path,
tax_mode = "allfilter",
trusted_functions_only = F,
engine = "data.frame"
)

Arguments
project_path character, project directory generated by SqueezeMeta.
tax_mode character, which taxonomic classification should be loaded? SqueezeMeta ap-
plies the identity thresholds described in Luo et al., 2014. Use allfilter for
applying the minimum identity threshold to all taxa (default), prokfilter
for applying the threshold to Bacteria and Archaea, but not to Eukaryotes, and
nofilter for applying no thresholds at all.
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best aver-
age) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE). Will only have an effect if the
project_dir/results/tables is not already present.
engine character. Engine used to load the ORFs and contigs tables. Either data.frame
(default) or data.table (significantly faster if your project is large).

Value
SQM object containing the parsed project.

Prerequisites
Run SqueezeMeta! An example call for running it would be:
/path/to/SqueezeMeta/scripts/SqueezeMeta.pl
-m coassembly -f fastq_dir -s samples_file -p project_dir

The SQM object structure

The SQM object is a nested list which contains the following information:

lvl1          lvl2                lvl3           type              rows/names     columns       data

$orfs         $table                             dataframe         orfs           misc. data    misc. data
              $abund                             numeric matrix    orfs           samples       abundances
              $bases                             numeric matrix    orfs           samples       abundances
              $tpm                               numeric matrix    orfs           samples       tpm
              $seqs                              character vector  orfs           (n/a)         sequences
              $tax                               character matrix  orfs           tax. ranks    taxonomy
$contigs      $table                             dataframe         contigs        misc. data    misc. data
              $abund                             numeric matrix    contigs        samples       abundances
              $tpm                               numeric matrix    contigs        samples       tpm
              $seqs                              character vector  contigs        (n/a)         sequences
              $tax                               character matrix  contigs        tax. ranks    taxonomy
              $bins                              character matrix  contigs        bin. methods  bins
$bins         $table                             dataframe         bins           misc. data    misc. data
              $tpm                               numeric matrix    bins           samples       tpm
              $tax                               character matrix  bins           tax. ranks    taxonomy
$taxa         $superkingdom       $abund         numeric matrix    superkingdoms  samples       abundances
                                  $percent       numeric matrix    superkingdoms  samples       percentages
              $phylum             $abund         numeric matrix    phyla          samples       abundances
                                  $percent       numeric matrix    phyla          samples       percentages
              $class              $abund         numeric matrix    classes        samples       abundances
                                  $percent       numeric matrix    classes        samples       percentages
              $order              $abund         numeric matrix    orders         samples       abundances
                                  $percent       numeric matrix    orders         samples       percentages
              $family             $abund         numeric matrix    families       samples       abundances
                                  $percent       numeric matrix    families       samples       percentages
              $genus              $abund         numeric matrix    genera         samples       abundances
                                  $percent       numeric matrix    genera         samples       percentages
              $species            $abund         numeric matrix    species        samples       abundances
                                  $percent       numeric matrix    species        samples       percentages
$functions    $KEGG               $abund         numeric matrix    KEGG ids       samples       abundances
                                  $bases         numeric matrix    KEGG ids       samples       abundances
                                  $cov           numeric matrix    KEGG ids       samples       coverages
                                  $tpm           numeric matrix    KEGG ids       samples       tpm
                                  $copy_number   numeric matrix    KEGG ids       samples       avg. copies
              $COG                $abund         numeric matrix    COG ids        samples       abundances
                                  $bases         numeric matrix    COG ids        samples       abundances
                                  $cov           numeric matrix    COG ids        samples       coverages
                                  $tpm           numeric matrix    COG ids        samples       tpm
                                  $copy_number   numeric matrix    COG ids        samples       avg. copies
              $PFAM               $abund         numeric matrix    PFAM ids       samples       abundances
                                  $bases         numeric matrix    PFAM ids       samples       abundances
                                  $cov           numeric matrix    PFAM ids       samples       coverages
                                  $tpm           numeric matrix    PFAM ids       samples       tpm
                                  $copy_number   numeric matrix    PFAM ids       samples       avg. copies
$total_reads                                     numeric vector    samples        (n/a)         total reads
$misc         $project_name                      character vector  (empty)        (n/a)         project name
              $samples                           character vector  (empty)        (n/a)         samples
              $tax_names_long     $superkingdom  character vector  short names    (n/a)         full names
                                  $phylum        character vector  short names    (n/a)         full names
                                  $class         character vector  short names    (n/a)         full names
                                  $order         character vector  short names    (n/a)         full names
                                  $family        character vector  short names    (n/a)         full names
                                  $genus         character vector  short names    (n/a)         full names
                                  $species       character vector  short names    (n/a)         full names
              $tax_names_short                   character vector  full names     (n/a)         short names
              $KEGG_names                        character vector  KEGG ids       (n/a)         KEGG names
              $KEGG_paths                        character vector  KEGG ids       (n/a)         KEGG hierarchy
              $COG_names                         character vector  COG ids        (n/a)         COG names
              $COG_paths                         character vector  COG ids        (n/a)         COG hierarchy
              $ext_annot_sources                 character vector  COG ids        (n/a)         external databases

If external databases for functional classification were provided to SqueezeMeta via the -extdb
argument, the corresponding abundance (reads and bases), coverages, tpm and copy number profiles
will be present in SQM$functions (e.g. results for the CAZy database would be present in
SQM$functions$CAZy). Additionally, the extended names of the features present in the external
database will be present in SQM$misc (e.g. SQM$misc$CAZy_names).

Examples
## Not run:
R
library(SQMtools)
Hadza = loadSQM("Hadza") # Where Hadza is the path to the SqueezeMeta output directory

## End(Not run)

data(Hadza)
# Which are the ten most abundant KEGG IDs in our data?
topKEGG = sort(rowSums(Hadza$functions$KEGG$tpm), decreasing=T)[1:11]
topKEGG = topKEGG[names(topKEGG)!="Unclassified"]
# Which functions do those KEGG IDs represent?
Hadza$misc$KEGG_names[names(topKEGG)] # index by KEGG ID, not by TPM value
# What is the relative abundance of the Gammaproteobacteria class across samples?
Hadza$taxa$class$percent["Gammaproteobacteria",]
# Which information is stored in the orf, contig and bin tables?
colnames(Hadza$orfs$table)
colnames(Hadza$contigs$table)
colnames(Hadza$bins$table)
# What is the GC content distribution of my metagenome?
boxplot(Hadza$contigs$table[,"GC perc"]) # Not weighted by contig length or abundance!
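
The boxplot above treats every contig equally. A length-weighted average can be obtained with base R's weighted.mean. The sketch below runs on a made-up stand-in for the contigs table; the "Length" column name and all values are assumptions for illustration, so check colnames(Hadza$contigs$table) before adapting it.

```r
# Toy stand-in for a contigs table; column names and values are made up.
contigs <- data.frame(
  "GC perc" = c(40, 60, 50),
  "Length"  = c(1000, 3000, 1000),
  check.names = FALSE
)

# Unweighted mean: every contig counts the same.
gc_plain <- mean(contigs[, "GC perc"])

# Length-weighted mean: longer contigs contribute proportionally more.
gc_weighted <- weighted.mean(contigs[, "GC perc"], w = contigs[, "Length"])

c(unweighted = gc_plain, weighted = gc_weighted)
```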

loadSQMlite Load tables generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py into R.

Description
This function takes the path to the output directory generated by sqm2tables.py, sqmreads2tables.py
or combine-sqm-tables.py and parses it into a SQMlite object. The SQMlite object will contain
taxonomic and functional profiles, but no detailed information on ORFs, contigs or bins. In
exchange, it will have a much smaller memory footprint. A SQMlite object can be used for plotting
and exporting, but it cannot be subsetted.

Usage
loadSQMlite(tables_path, tax_mode = "allfilter")

Arguments
tables_path character, tables directory generated by sqm2tables.py, sqmreads2tables.py
or combine-sqm-tables.py.
tax_mode character, which taxonomic classification should be loaded? SqueezeMeta ap-
plies the identity thresholds described in Luo et al., 2014. Use allfilter for
applying the minimum identity threshold to all taxa (default), prokfilter
for applying the threshold to Bacteria and Archaea, but not to Eukaryotes, and
nofilter for applying no thresholds at all.

Value
SQMlite object containing the parsed tables.

The SQMlite object structure

The SQMlite object is a nested list which contains the following information:

lvl1          lvl2                lvl3           type              rows/names     columns       data

$taxa         $superkingdom       $abund         numeric matrix    superkingdoms  samples       abundances
                                  $percent       numeric matrix    superkingdoms  samples       percentages
              $phylum             $abund         numeric matrix    phyla          samples       abundances
                                  $percent       numeric matrix    phyla          samples       percentages
              $class              $abund         numeric matrix    classes        samples       abundances
                                  $percent       numeric matrix    classes        samples       percentages
              $order              $abund         numeric matrix    orders         samples       abundances
                                  $percent       numeric matrix    orders         samples       percentages
              $family             $abund         numeric matrix    families       samples       abundances
                                  $percent       numeric matrix    families       samples       percentages
              $genus              $abund         numeric matrix    genera         samples       abundances
                                  $percent       numeric matrix    genera         samples       percentages
              $species            $abund         numeric matrix    species        samples       abundances
                                  $percent       numeric matrix    species        samples       percentages
$functions    $KEGG               $abund         numeric matrix    KEGG ids       samples       abundances (reads)
                                  $bases         numeric matrix    KEGG ids       samples       abundances (bases)
                                  $tpm           numeric matrix    KEGG ids       samples       tpm
                                  $copy_number   numeric matrix    KEGG ids       samples       avg. copies
              $COG                $abund         numeric matrix    COG ids        samples       abundances (reads)
                                  $bases         numeric matrix    COG ids        samples       abundances (bases)
                                  $tpm           numeric matrix    COG ids        samples       tpm
                                  $copy_number   numeric matrix    COG ids        samples       avg. copies
              $PFAM               $abund         numeric matrix    PFAM ids       samples       abundances (reads)
                                  $bases         numeric matrix    PFAM ids       samples       abundances (bases)
                                  $tpm           numeric matrix    PFAM ids       samples       tpm
                                  $copy_number   numeric matrix    PFAM ids       samples       avg. copies
$total_reads                                     numeric vector    samples        (n/a)         total reads
$misc         $project_name                      character vector  (empty)        (n/a)         project name
              $samples                           character vector  (empty)        (n/a)         samples
              $tax_names_long     $superkingdom  character vector  short names    (n/a)         full names
                                  $phylum        character vector  short names    (n/a)         full names
                                  $class         character vector  short names    (n/a)         full names
                                  $order         character vector  short names    (n/a)         full names
                                  $family        character vector  short names    (n/a)         full names
                                  $genus         character vector  short names    (n/a)         full names
                                  $species       character vector  short names    (n/a)         full names
              $tax_names_short                   character vector  full names     (n/a)         short names
              $KEGG_names                        character vector  KEGG ids       (n/a)         KEGG names
              $KEGG_paths                        character vector  KEGG ids       (n/a)         KEGG hierarchy
              $COG_names                         character vector  COG ids        (n/a)         COG names
              $COG_paths                         character vector  COG ids        (n/a)         COG hierarchy
              $ext_annot_sources                 character vector  (empty)        (n/a)         external databases

If external databases for functional classification were provided to SqueezeMeta or SqueezeMeta_reads
via the -extdb argument, the corresponding abundance, tpm and copy number profiles will be
present in SQM$functions (e.g. results for the CAZy database would be present in SQM$functions$CAZy).
Additionally, the extended names of the features present in the external database will be present in
SQM$misc (e.g. SQM$misc$CAZy_names). Note that results generated by SqueezeMeta_reads
will contain only read abundances, but not bases, tpm or copy number estimations.

See Also
plotTaxonomy and plotFunctions will plot the most abundant taxa and functions in a SQMlite
object. exportKrona will generate Krona charts reporting the taxonomy in a SQMlite object.

Examples
## Not run:
# (outside R)
/path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
/path/to/SqueezeMeta/utils/sqm2tables.py Hadza Hadza/results/tables # Generate the tabula
# now go into R
R
library(SQMtools)
Hadza = loadSQMlite("Hadza/results/tables") # Where Hadza is the path to the SqueezeMeta
# Note that this is not the whole SQM project, just the directory containing the tables.
# It would also work with tables generated by sqmreads2tables.py, or combine-sqm-tables.p
# plotTaxonomy(Hadza)
# plotFunctions(Hadza)
# exportKrona(Hadza, 'myKronaTest.html')

## End(Not run)

MGKOs Single Copy Phylogenetic Marker Genes from Sunagawa’s group (KOs)

Description
Lists of Single Copy Phylogenetic Marker Genes. These are useful for transforming coverages
or tpms into copy numbers. This is an alternative way of normalizing data in order to be able to
compare functional profiles in samples with different sequencing depths.

Usage
data(MGKOs)

Format
Character vector with the KEGG identifiers for 10 Single Copy Phylogenetic Marker Genes.

References
Salazar, G et al. (2019). Gene Expression Changes and Community Turnover Differentially Shape
the Global Ocean Metatranscriptome. Cell 179:1068-1083. (PubMed).

See Also
MGOGs for an equivalent list using OGs instead of KOs; USiCGs for an alternative set of single
copy genes, and for examples on how to generate copy numbers.
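
The idea of turning coverages into copy numbers with single-copy marker genes can be sketched in base R. The example below uses a made-up coverage matrix and hypothetical marker names (`M1` to `M3`); taking the median marker coverage per sample is an assumption chosen for illustration, not necessarily what the package does internally.

```r
# Toy coverage matrix: functions in rows, samples in columns.
coverage <- matrix(
  c(10, 12,  8, 30,
    20, 22, 18, 11),
  nrow = 4,
  dimnames = list(c("M1", "M2", "M3", "F1"), c("S1", "S2"))
)

# Hypothetical single-copy marker genes (stand-ins for the IDs in MGKOs).
markers <- c("M1", "M2", "M3")

# The median marker coverage per sample approximates the coverage of a gene
# that is present once per genome.
marker_cov <- apply(coverage[markers, , drop = FALSE], 2, median)

# Divide each sample (column) by its own marker coverage.
copy_number <- t(t(coverage) / marker_cov)
copy_number["F1", ]
```

Because each sample is divided by its own marker coverage, the resulting values are comparable across samples with different sequencing depths.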

MGOGs Single Copy Phylogenetic Marker Genes from Sunagawa’s group (OGs)

Usage
data(MGOGs)

Format
Character vector with the COG identifiers for 10 Single Copy Phylogenetic Marker Genes.

References
Salazar, G et al. (2019). Gene Expression Changes and Community Turnover Differentially Shape
the Global Ocean Metatranscriptome. Cell 179:1068-1083. (PubMed).

See Also
MGKOs for an equivalent list using KOs instead of OGs; USiCGs for an alternative set of single
copy genes, and for examples on how to generate copy numbers.

mostAbundant Get the N most abundant rows from a numeric table

Description
Return a subset of an input matrix or data frame, containing only the N most abundant rows, sorted.
Alternatively, a custom set of rows can be returned.

Usage
mostAbundant(data, N = 10, items = NULL, others = F, rescale = F)

Arguments
data numeric matrix or data frame
N integer. Number of rows to return (default 10).
items Character vector. Custom row names to return. If provided, it will override N
(default NULL).
others logical. If TRUE, an extra row will be returned containing the aggregated abun-
dances of the elements not selected with N or items (default FALSE).
rescale logical. Scale result to percentages column-wise (default FALSE).

Value
A matrix or data frame (same as input) with the selected rows.

Examples
data(Hadza)
Hadza.carb = subsetFun(Hadza, "Carbohydrate metabolism")
# Which are the 20 most abundant KEGG functions in the ORFs related to carbohydrate metab
topCarb = mostAbundant(Hadza.carb$functions$KEGG$tpm, N=20)
# Now print them with nice names
rownames(topCarb) = paste(rownames(topCarb), Hadza.carb$misc$KEGG_names[rownames(topCarb)
topCarb
# We can pass this to any R function
heatmap(topCarb)
# But for convenience we provide wrappers for plotting ggplot2 heatmaps and barplots
plotHeatmap(topCarb, label_y="TPM")
plotBars(topCarb, label_y="TPM")
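
The behaviour described for N, others and rescale can be mimicked in a few lines of base R. The helper below, `top_rows`, is hypothetical and is not the package's code. In particular, ranking rows by their sum across samples is an assumption, since the manual does not state how "most abundant" is computed.

```r
# Hypothetical helper mimicking the documented behaviour of mostAbundant.
top_rows <- function(data, N = 10, others = FALSE, rescale = FALSE) {
  # Rank rows by total abundance across all columns (assumption).
  ord <- order(rowSums(data), decreasing = TRUE)
  sel <- ord[seq_len(min(N, nrow(data)))]
  res <- data[sel, , drop = FALSE]
  # Optionally aggregate everything that was not selected into one extra row.
  if (others && length(sel) < nrow(data)) {
    rest <- colSums(data[-sel, , drop = FALSE])
    res <- rbind(res, Other = rest)
  }
  # Optionally convert each column to percentages.
  if (rescale) res <- 100 * t(t(res) / colSums(res))
  res
}

# Toy matrix: features in rows, samples in columns.
m <- matrix(c(5, 1, 3, 1,
              4, 2, 2, 2),
            nrow = 4, dimnames = list(c("a", "b", "c", "d"), c("S1", "S2")))

top2 <- top_rows(m, N = 2, others = TRUE, rescale = TRUE)
top2
```

Here rows `a` and `c` are kept, `b` and `d` are collapsed into `Other`, and each column of the result adds up to 100.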

plotBars Plot a barplot using ggplot2

Description
Plot a ggplot2 barplot from a matrix or data frame. The data should be in tabular format (e.g.
features in rows and samples in columns).

Usage
plotBars(
data,
label_x = "Samples",
label_y = "Abundances",
label_fill = "Features",
color = NULL,
base_size = 11,
max_scale_value = NULL,
metadata_groups = NULL
)

Arguments
data Numeric matrix or data frame.
label_x character. Label for the x axis (default "Samples").
label_y character. Label for the y axis (default "Abundances").
label_fill character. Label for color categories (default "Features").
color Vector with custom colors for the different features. If empty, the default ggplot2
palette will be used (default NULL).
base_size numeric. Base font size (default 11).
max_scale_value
numeric. Maximum value to include in the y axis. By default it is handled
automatically by ggplot2 (default NULL).
metadata_groups
list. Split the plot into groups defined by the user: list('G1' = c('sample1',
'sample2'), 'G2' = c('sample3', 'sample4')) (default NULL).

Value
a ggplot2 plot object.

See Also
plotTaxonomy for plotting the most abundant taxa of a SQM object; plotHeatmap for plot-
ting a heatmap with arbitrary data; mostAbundant for selecting the most abundant rows in a
dataframe or matrix.

Examples
data(Hadza)
sk = Hadza$taxa$superkingdom$abund
plotBars(sk, label_y = "Raw reads", label_fill = "Superkingdom")

plotFunctions Heatmap of the most abundant functions in a SQM object

Description
This function selects the most abundant functions across all samples in a SQM object and represents
their abundances in a heatmap. Alternatively, a custom set of functions can be represented.

Usage
plotFunctions(
SQM,
fun_level = "KEGG",
count = "tpm",
N = 25,
fun = NULL,
samples = NULL,
ignore_unmapped = T,
ignore_unclassified = T,
gradient_col = c("ghostwhite", "dodgerblue4"),
base_size = 11,
metadata_groups = NULL
)

Arguments
SQM A SQM or SQMlite object.
fun_level character. Either "KEGG", "COG", "PFAM" or any other custom database used
for annotation (default "KEGG").
count character. Either "abund" for raw abundances, "percent" for percentages,
"bases" for raw base counts, "tpm" for TPM normalized values or
"copy_number" for copy numbers (default "tpm"). Note that a given count
type might not be available in this object (e.g. TPM or copy number in SQMlite
objects originating from a SQM reads project).
N integer. Plot the N most abundant functions (default 25).
fun character. Custom functions to plot. If provided, it will override N (default
NULL).
samples character. Character vector with the names of the samples to include in the plot.
Can also be used to plot the samples in a custom order. If not provided, all
samples will be plotted (default NULL).
ignore_unmapped
logical. Don’t include unmapped ORFs in the plot (default TRUE).
ignore_unclassified
logical. Don’t include unclassified ORFs in the plot (default TRUE).
gradient_col A vector of two colors representing the low and high ends of the color gradient
(default c("ghostwhite","dodgerblue4")).
base_size numeric. Base font size (default 11).
metadata_groups
list. Split the plot into groups defined by the user: list('G1' = c('sample1',
'sample2'), 'G2' = c('sample3', 'sample4')) (default NULL).

Value

a ggplot2 plot object.

plotHeatmap Plot a heatmap using ggplot2

Description

Plot a ggplot2 heatmap from a matrix or data frame. The data should be in tabular format (e.g.
features in rows and samples in columns).

Usage
plotHeatmap(
data,
label_x = "Samples",
label_y = "Features",
label_fill = "Abundance",
gradient_col = c("ghostwhite", "dodgerblue4"),
base_size = 11,
metadata_groups = NULL
)

Arguments
data numeric matrix or data frame.
label_x character. Label for the x axis (default "Samples").
label_y character. Label for the y axis (default "Features").
label_fill character. Label for color scale (default "Abundance").
gradient_col A vector of two colors representing the low and high ends of the color gradient
(default c("ghostwhite","dodgerblue4")).
base_size numeric. Base font size (default 11).
metadata_groups
list. Split the plot into groups defined by the user: list('G1' = c('sample1',
'sample2'), 'G2' = c('sample3', 'sample4')) (default NULL).

Value
A ggplot2 plot object.

See Also
plotFunctions for plotting the top functional categories of a SQM object; plotBars for
plotting a barplot with arbitrary data; mostAbundant for selecting the most abundant rows in a
dataframe or matrix.

Examples
data(Hadza)
topPFAM = mostAbundant(Hadza$functions$PFAM$tpm)
topPFAM = topPFAM[rownames(topPFAM) != "Unclassified",] # Take out the Unclassified ORFs.
plotHeatmap(topPFAM, label_x = "Samples", label_y = "PFAMs", label_fill = "TPM")

plotTaxonomy Barplot of the most abundant taxa in a SQM object

Description
This function selects the most abundant taxa across all samples in a SQM object and represents their
abundances in a barplot. Alternatively, a custom set of taxa can be represented.

Usage
plotTaxonomy(
SQM,
rank = "phylum",
count = "percent",
N = 15,
tax = NULL,
others = T,
samples = NULL,
nocds = "treat_separately",
ignore_unmapped = F,
ignore_unclassified = F,
no_partial_classifications = F,
rescale = F,
color = NULL,
base_size = 11,
max_scale_value = NULL,
metadata_groups = NULL
)

Arguments
SQM A SQM or a SQMlite object.
rank Taxonomic rank to plot (default phylum).
count character. Either "percent" for percentages, or "abund" for raw abundances
(default "percent").
N integer. Plot the N most abundant taxa (default 15).
tax character. Custom taxa to plot. If provided, it will override N (default NULL).
others logical. Collapse the abundances of least abundant taxa, and include the result
in the plot (default TRUE).
samples character. Character vector with the names of the samples to include in the plot.
Can also be used to plot the samples in a custom order. If not provided, all
samples will be plotted (default NULL).
nocds character. Either "treat_separately" to treat reads annotated as No CDS
separately, "treat_as_unclassified" to treat them as Unclassified or
"ignore" to ignore them in the plot (default "treat_separately").
ignore_unmapped
logical. Don’t include unmapped reads in the plot (default FALSE).
ignore_unclassified
logical. Don’t include unclassified reads in the plot (default FALSE).
no_partial_classifications
logical. Treat reads not fully classified at the requested level (e.g. "Unclassified
bacteroidetes" at the class level or below) as fully unclassified. This takes effect
before ignore_unclassified, so if both are TRUE the plot will only
contain fully classified contigs (default FALSE).
rescale logical. Re-scale results to percentages (default FALSE).
color Vector with custom colors for the different features. If empty, we will use our
own hand-picked palette if N<=15, and the default ggplot2 palette otherwise
(default NULL).
base_size numeric. Base font size (default 11).

max_scale_value
numeric. Maximum value to include in the y axis. By default it is handled
automatically by ggplot2 (default NULL).

Value
a ggplot2 plot object.

See Also
plotFunctions for plotting the most abundant functions of a SQM object; plotBars and
plotHeatmap for plotting barplots or heatmaps with arbitrary data.

Examples
data(Hadza)
Hadza.amin = subsetFun(Hadza, "Amino acid metabolism")
# Taxonomic distribution of amino acid metabolism ORFs at the family level.
plotTaxonomy(Hadza.amin, "family")
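
The combined effect of N and others (keep the N most abundant taxa and collapse the rest into a single row) can be sketched on a toy table. The values are invented, and the row label "Other" is an assumption; the label used by plotTaxonomy itself is not stated in this entry.

```r
# Toy abundance table: taxa in rows, samples in columns (invented percentages).
abund <- matrix(c(50, 40,
                  30, 35,
                  15, 20,
                   5,  5),
                nrow = 4, byrow = TRUE,
                dimnames = list(c("Firmicutes", "Bacteroidetes",
                                  "Proteobacteria", "Actinobacteria"),
                                c("S1", "S2")))
N <- 2

# Rank taxa by total abundance across samples and keep the top N.
top <- names(sort(rowSums(abund), decreasing = TRUE))[seq_len(N)]

# Sum the remaining taxa per sample into one collapsed row.
rest      <- abund[!rownames(abund) %in% top, , drop = FALSE]
collapsed <- rbind(abund[top, , drop = FALSE], Other = colSums(rest))
print(collapsed)
```

Because the remaining taxa are summed rather than dropped, the per-sample totals are unchanged.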

RecA RecA/RadA recombinase

Description
The recombination protein RecA/RadA is essential for the repair and maintenance of DNA, and has
homologs in all bacteria and archaea. By dividing the coverage of functions by the coverage of
RecA, abundances can be transformed into copy numbers, which can be used to compare functional
profiles in samples with different sequencing depths. RecA-derived copy numbers are available in
the SQM object (SQM$functions$<annotation_type>$copy_number).

Usage
data(RecA)

Format
Character vector with the COG identifier for RecA/RadA.

Source
EggNOG Database.

Examples
data(Hadza)
data(RecA)
### Let's calculate the average copy number of each function in our samples.
# We do it for COG annotations here, but we could also do it for KEGG or PFAMs.
COG.coverage = SQMtools:::aggregate.fun(Hadza, "COG", trusted_functions_only=T,
ignore_unclassified_functions=F)$cov
COG.copynumber = t(t(COG.coverage) / COG.coverage[RecA,]) # Sample-wise division by RecA
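
The double transpose in the sample-wise division is needed because R recycles a vector down the rows of a matrix. A minimal sketch on a toy coverage matrix; the identifiers and values are invented, and `recA_id` is a hypothetical stand-in for the identifier stored in RecA.

```r
# Toy coverage matrix: functions in rows, samples in columns (invented values).
cov <- matrix(c(10, 20,
                 5, 40,
                30, 10),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("COG_recA", "COG_a", "COG_b"), c("S1", "S2")))
recA_id <- "COG_recA"  # hypothetical placeholder for the RecA identifier

# t(cov) has samples in rows, so the per-sample RecA vector is recycled
# down the rows; the outer t() restores the functions-by-samples layout.
copy_number <- t(t(cov) / cov[recA_id, ])
print(copy_number)
```

After the division the RecA row equals 1 in every sample, which is a quick sanity check that the division was applied per sample rather than per function.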

rowMaxs Return a vector with the row-wise maxima of a matrix or dataframe.

Description

Return a vector with the row-wise maxima of a matrix or dataframe.

Usage

rowMaxs(table)

rowMins Return a vector with the row-wise minima of a matrix or dataframe.

Description

Return a vector with the row-wise minima of a matrix or dataframe.

Usage

rowMins(table)
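
Neither rowMaxs nor rowMins documents its arguments beyond the usage line. A base-R sketch of the equivalent computation, assuming `table` is a numeric matrix or data frame; this is not the package's implementation, and the values are invented.

```r
# Toy table: features in rows, samples in columns.
tab <- data.frame(S1 = c(1, 7, 3), S2 = c(4, 2, 9),
                  row.names = c("f1", "f2", "f3"))

# apply() over margin 1 evaluates the function once per row.
row_maxima <- apply(tab, 1, max)
row_minima <- apply(tab, 1, min)
print(row_maxima)
print(row_minima)
```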

subsetBins Create a SQM object containing only the requested bins, and the
contigs and ORFs contained in them.

Description

Create a SQM object containing only the requested bins, and the contigs and ORFs contained in
them.

Usage

subsetBins(
SQM,
bins,
trusted_functions_only = F,
ignore_unclassified_functions = F,
rescale_tpm = T,
rescale_copy_number = T
)

Arguments

SQM SQM object to be subsetted.

bins character. Vector of bins to be selected.
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best
average) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE).
ignore_unclassified_functions
logical. If FALSE, ORFs with no functional classification will be aggregated
together into an "Unclassified" category. If TRUE, they will be ignored (default
FALSE).
rescale_tpm logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated
(so that the TPMs in the subset actually add up to 1 million). Otherwise,
per-function TPMs will be calculated by aggregating the TPMs of the ORFs
annotated with that function, and will thus keep the scaling present in the parent
object. By default it is set to TRUE, which means that the returned TPMs will
be scaled per million reads of the selected bins.
rescale_copy_number
logical. If TRUE, copy numbers will be recalculated using the RecA/RadA
coverages in the subset. Otherwise, RecA/RadA coverages will be taken from
the parent object. By default it is set to TRUE, which means that the returned
copy numbers for each function will represent the average copy number of that
function per genome of the selected bins.

Value

SQM object containing only the requested bins.
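
The difference between the two rescale_tpm settings can be illustrated with plain arithmetic. This sketch only shows the scaling; the values are invented, and the simple renormalisation is an assumption rather than the package's own calculation.

```r
# Per-function TPMs in the parent object for one sample (invented values).
parent_tpm <- c(K1 = 400000, K2 = 350000, K3 = 250000)
in_subset  <- c("K1", "K3")  # functions carried by the selected bins

# rescale_tpm = FALSE: keep the parent scaling; the subset sums to less than 1e6.
kept <- parent_tpm[in_subset]

# rescale_tpm = TRUE: renormalise so the subset adds up to one million again.
rescaled <- kept / sum(kept) * 1e6

print(sum(kept))
print(sum(rescaled))
```

Keeping the parent scaling makes values comparable with the full project, whereas rescaling makes them comparable between subsets of different size.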

subsetContigs Select contigs

Description

Create a SQM object containing only the requested contigs, the ORFs contained in them and the
bins that contain them.

Usage
subsetContigs(
SQM,
contigs,
trusted_functions_only = F,
ignore_unclassified_functions = F,
rescale_tpm = F,
rescale_copy_number = F
)

Arguments
SQM SQM object to be subsetted.
contigs character. Vector of contigs to be selected.
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best
average) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE).
ignore_unclassified_functions
logical. If FALSE, ORFs with no functional classification will be aggregated
together into an "Unclassified" category. If TRUE, they will be ignored (default
FALSE).
rescale_tpm logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated
(so that the TPMs in the subset actually add up to 1 million). Otherwise,
per-function TPMs will be calculated by aggregating the TPMs of the ORFs
annotated with that function, and will thus keep the scaling present in the parent
object (default FALSE).
rescale_copy_number
logical. If TRUE, copy numbers will be recalculated using the RecA/RadA
coverages in the subset. Otherwise, RecA/RadA coverages will be taken from
the parent object. By default it is set to FALSE, which means that the returned
copy numbers for each function will represent the average copy number of that
function per genome in the parent object.

Value
SQM object containing only the selected contigs.

subsetFun Filter results by function

Description
Create a SQM object containing only the ORFs with a given function, and the contigs and bins that
contain them.

Usage
subsetFun(
SQM,
fun,
columns = NULL,
ignore_case = T,
fixed = F,
trusted_functions_only = F,
ignore_unclassified_functions = F,
rescale_tpm = F,
rescale_copy_number = F
)

Arguments
SQM SQM object to be subsetted.
fun character. Pattern to search for in the different functional classifications.
columns character. Restrict the search to the provided column names from SQM$orfs$table.
If not provided the search will be performed in all the columns containing
functional information (default NULL).
ignore_case logical. Make pattern matching case-insensitive (default TRUE).
fixed logical. If TRUE, pattern is a string to be matched as is. If FALSE the pattern is
treated as a regular expression (default FALSE).
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best
average) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE).
ignore_unclassified_functions
logical. If FALSE, ORFs with no functional classification will be aggregated
together into an "Unclassified" category. If TRUE, they will be ignored (default
FALSE).
rescale_tpm logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated
(so that the TPMs in the subset actually add up to 1 million). Otherwise,
per-function TPMs will be calculated by aggregating the TPMs of the ORFs
annotated with that function, and will thus keep the scaling present in the parent
object (default FALSE).
rescale_copy_number
logical. If TRUE, copy numbers will be recalculated using the RecA/RadA
coverages in the subset. Otherwise, RecA/RadA coverages will be taken from
the parent object. By default it is set to FALSE, which means that the returned
copy numbers for each function will represent the average copy number of that
function per genome in the parent object.

Value
SQM object containing only the requested function.

See Also
subsetTax, subsetORFs, combineSQM. The most abundant items of a particular table
contained in a SQM object can be selected with mostAbundant.

Examples
data(Hadza)
Hadza.iron = subsetFun(Hadza, "iron")
Hadza.carb = subsetFun(Hadza, "Carbohydrate metabolism")
# Search for multiple patterns using regular expressions
Hadza.twoKOs = subsetFun(Hadza, "K00812|K00813", fixed=F)
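
How ignore_case and fixed change what a pattern selects can be sketched with base R's grepl. Whether subsetFun relies on grepl internally is an assumption, and the annotation strings below are invented.

```r
# Invented annotation strings, one per ORF.
annotations <- c(orf1 = "K00812; aspartate aminotransferase",
                 orf2 = "Iron complex transport system",
                 orf3 = "K00813 aspC",
                 orf4 = "carbohydrate metabolism")

# Regular expression (fixed = FALSE): "|" means "either pattern".
hits_regex <- grepl("K00812|K00813", annotations, ignore.case = TRUE)

# Fixed string: the same pattern is searched literally, "|" included.
hits_fixed <- grepl("K00812|K00813", annotations, fixed = TRUE)

# Case-insensitive search for a plain word.
hits_iron <- grepl("iron", annotations, ignore.case = TRUE)

print(names(annotations)[hits_regex])
print(names(annotations)[hits_fixed])
print(names(annotations)[hits_iron])
```

With fixed matching the literal string "K00812|K00813" occurs in none of the annotations, so a multi-pattern search needs the regular-expression mode.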

subsetORFs Select ORFs

Description
Create a SQM object containing only the requested ORFs, and the contigs and bins that contain
them. Internally, all the other subset functions in this package end up calling subsetORFs to do
the work for them.

Usage
subsetORFs(
SQM,
orfs,
tax_source = "orfs",
trusted_functions_only = F,
ignore_unclassified_functions = F,
rescale_tpm = F,
rescale_copy_number = F,
contigs_override = NULL
)

Arguments
SQM SQM object to be subsetted.
orfs character. Vector of ORFs to be selected.
tax_source character. Features used for calculating aggregated abundances at the different
taxonomic ranks. Either "orfs" or "contigs" (default "orfs").
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best
average) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE).
ignore_unclassified_functions
logical. If FALSE, ORFs with no functional classification will be aggregated
together into an "Unclassified" category. If TRUE, they will be ignored (default
FALSE).
rescale_tpm logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated
(so that the TPMs in the subset actually add up to 1 million). Otherwise,
per-function TPMs will be calculated by aggregating the TPMs of the ORFs
annotated with that function, and will thus keep the scaling present in the parent
object (default FALSE).
rescale_copy_number
logical. If TRUE, copy numbers will be recalculated using the RecA/RadA
coverages in the subset. Otherwise, RecA/RadA coverages will be taken from
the parent object. By default it is set to FALSE, which means that the returned
copy numbers for each function will represent the average copy number of that
function per genome in the parent object.

Value
SQM object containing the requested ORFs.

A note on contig/bins subsetting

While this function selects the contigs and bins that contain the desired ORFs, it DOES NOT
recalculate contig/bin abundance and statistics based on the selected ORFs only. This means that the
abundances presented in tables such as SQM$contigs$abund or SQM$bins$tpm will still refer to the
complete contigs and bins, regardless of whether only a fraction of their ORFs are actually present
in the returned SQM object. This is also true for the statistics presented in SQM$contigs$table
and SQM$bins$table.

Examples
data(Hadza)
# Select the 100 most abundant ORFs in our dataset.
mostAbundantORFnames = names(sort(rowSums(Hadza$orfs$tpm), decreasing=T))[1:100]
mostAbundantORFs = subsetORFs(Hadza, mostAbundantORFnames)
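
The name-selection step of the example can be tried on a toy table. One detail worth noting: indexing with [1:100] returns NA entries when the table has fewer than 100 rows, so this sketch caps the index at the number of rows. The values and ORF names are invented.

```r
# Toy ORF abundance table: ORFs in rows, samples in columns.
tpm <- matrix(c( 5,  1,
                50, 60,
                 0,  2,
                20, 30),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("orfA", "orfB", "orfC", "orfD"), c("S1", "S2")))
n <- 2

# Rank ORFs by total abundance and keep at most n names.
ranked   <- names(sort(rowSums(tpm), decreasing = TRUE))
top_orfs <- ranked[seq_len(min(n, length(ranked)))]
print(top_orfs)
```

The resulting character vector has the form expected by the orfs argument.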

subsetRand Select random ORFs

Description
Create a random subset of a SQM object.

Usage
subsetRand(SQM, N)

Arguments
SQM SQM object to be subsetted.
N numeric. Number of random ORFs to select.

Value
SQM object containing a random subset of ORFs.
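
The entry does not say how the random draw is made. A plausible base-R sketch, assuming ORF names are sampled without replacement; the names are invented and this is not taken from the package source.

```r
set.seed(42)  # fix the seed so the draw is reproducible

orf_names <- paste0("orf", 1:10)  # invented ORF names
N <- 3

# Draw N distinct names; sample() does not replace by default.
random_orfs <- sample(orf_names, N)
print(random_orfs)
```

Setting a seed before a random subset is useful whenever the downstream results need to be reproduced.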

subsetTax Filter results by taxonomy

Description
Create a SQM object containing only the contigs with a given consensus taxonomy, the ORFs
contained in them and the bins that contain them.

Usage
subsetTax(
SQM,
rank,
tax,
trusted_functions_only = F,
ignore_unclassified_functions = F,
rescale_tpm = T,
rescale_copy_number = T
)

Arguments
SQM SQM object to be subsetted.
rank character. The taxonomic rank from which to select the desired taxa (superkingdom,
phylum, class, order, family, genus, species).
tax character. The taxon to select.
trusted_functions_only
logical. If TRUE, only highly trusted functional annotations (best hit + best
average) will be considered when generating aggregated function tables. If FALSE,
best hit annotations will be used (default FALSE).
ignore_unclassified_functions
logical. If FALSE, ORFs with no functional classification will be aggregated
together into an "Unclassified" category. If TRUE, they will be ignored (default
FALSE).
rescale_tpm logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated
(so that the TPMs in the subset actually add up to 1 million). Otherwise,
per-function TPMs will be calculated by aggregating the TPMs of the ORFs
annotated with that function, and will thus keep the scaling present in the parent
object. By default it is set to TRUE, which means that the returned TPMs will
be scaled per million reads of the selected taxon.
rescale_copy_number
logical. If TRUE, copy numbers will be recalculated using the RecA/RadA
coverages in the subset. Otherwise, RecA/RadA coverages will be taken from
the parent object. By default it is set to TRUE, which means that the returned
copy numbers for each function will represent the average copy number of that
function per genome of the selected taxon.

Value
SQM object containing only the requested taxon.

See Also
subsetFun, subsetContigs, combineSQM. The most abundant items of a particular table
contained in a SQM object can be selected with mostAbundant.

Examples
data(Hadza)
Hadza.Escherichia = subsetTax(Hadza, "genus", "Escherichia")
Hadza.Bacteroidetes = subsetTax(Hadza, "phylum", "Bacteroidetes")

summary.SQM summary method for class SQM

Description
Computes different statistics of the data contained in the SQM object.

Usage
## S3 method for class 'SQM'
summary(SQM)

Value
A list of summary statistics.

summary.SQMlite summary method for class SQMlite

Description
Computes different statistics of the data contained in the SQMlite object.

Usage
## S3 method for class 'SQMlite'
summary(SQM)

Value
A list of summary statistics.

USiCGs Universal Single-Copy Genes

Description
Lists of Universal Single Copy Genes for Bacteria and Archaea. These are useful for transforming
coverages or tpms into copy numbers. This is an alternative way of normalizing data in order to be
able to compare functional profiles in samples with different sequencing depths.

Usage
data(USiCGs)

Format
Character vector with the KEGG identifiers for 15 Universal Single Copy Genes.

Source
Carr et al., 2013. Table S1.

References
Carr, Shen-Orr & Borenstein (2013). Reconstructing the Genomic Content of Microbiome Taxa
through Shotgun Metagenomic Deconvolution. PLoS Comput. Biol. 9:e1003292.

Examples
data(Hadza)
data(USiCGs)
### Let's look at the Universal Single Copy Gene distribution in our samples.
KEGG.tpm = Hadza$functions$KEGG$tpm
all(USiCGs %in% rownames(KEGG.tpm)) # Are all the USiCGs present in our dataset?
# Plot a boxplot of USiCGs tpms and calculate median USiCGs tpm.
# This looks weird in the test dataset because it contains only a small subset of the met
# In a set of complete metagenomes USiCGs should have fairly similar TPM averages
# and low dispersion across samples.
boxplot(t(KEGG.tpm[USiCGs,]), names=USiCGs, ylab="TPM", col="slateblue2")

### Now let's calculate the average copy numbers of each function.
# We do it for KEGG annotations here, but we could also do it for COGs or PFAMs.
USiCGs.cov = apply(Hadza$functions$KEGG$cov[USiCGs,], 2, median)
# Sample-wise division by the median USiCG coverage.
KEGG.copynumber = t(t(Hadza$functions$KEGG$cov) / USiCGs.cov)
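
Taking the median over several marker genes, rather than relying on a single gene, keeps one atypical marker from distorting the normalisation. A toy sketch with invented identifiers and coverages:

```r
# Toy coverage matrix: three marker genes plus one function of interest.
cov <- matrix(c(10, 20,
                12, 22,
                90, 18,   # m3 is an outlier in S1
                 5, 40),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("m1", "m2", "m3", "K_x"), c("S1", "S2")))
markers <- c("m1", "m2", "m3")  # hypothetical stand-in for USiCGs

# Per-sample median coverage of the markers; the outlier does not shift it.
marker_cov <- apply(cov[markers, ], 2, median)

# Sample-wise division by the median marker coverage.
copy_number <- t(t(cov) / marker_cov)
print(marker_cov)
print(copy_number["K_x", ])
```

With a mean instead of a median, the outlier in S1 would inflate the divisor for that sample and deflate every copy number in it.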
Index

∗Topic datasets
Hadza
MGKOs
MGOGs
RecA
USiCGs

combineSQM
combineSQMlite

exportKrona
exportPathway
exportTable

Hadza

loadSQM
loadSQMlite

MGKOs
MGOGs
mostAbundant

plotBars
plotFunctions
plotHeatmap
plotTaxonomy

RecA
rowMaxs
rowMins

subsetBins
subsetContigs
subsetFun
subsetORFs
subsetRand
subsetTax
summary.SQM
summary.SQMlite

USiCGs
