nanalogue CLI Commands Reference
Note: This file is auto-generated.
Main Command
BAM/Mod BAM parsing and analysis tool with a single-molecule focus
Usage: nanalogue <COMMAND>

Commands:
  read-table-show-mods  Prints basecalled len, align len, mod count per molecule
  read-table-hide-mods  Prints basecalled len, align len per molecule
  read-stats            Calculates various summary statistics on all reads
  read-info             Prints information about reads
  find-modified-reads   Find names of modified reads through criteria specified by sub commands
  window-dens           Output windowed densities of all reads
  window-grad           Output windowed gradients of all reads
  peek                  Display BAM file contigs, contig lengths, and mod types from a "peek" at the
                        header and first 100 records
  help                  Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version
Subcommands
read-table-show-mods
Prints basecalled len, align len, mod count per molecule
Usage: nanalogue read-table-show-mods [OPTIONS] <BAM_PATH> [SEQ_SUMM_FILE]
Arguments:
<BAM_PATH> Input BAM file. Set to a local file path, or set to - to read from stdin, or set
to a URL to read from a remote file. If using stdin and piping in from `samtools
view`, always include the header with the `-h` option
[SEQ_SUMM_FILE] Input sequence summary file from Guppy/Dorado (optional) [default: ]
Options:
--min-seq-len <MIN_SEQ_LEN>
Exclude reads whose sequence length in the BAM file is below this value. Defaults to 0
[default: 0]
--min-align-len <MIN_ALIGN_LEN>
Exclude reads whose alignment length in the BAM file is below this value. Defaults to
unused
--read-id <READ_ID>
Only include this read id, defaults to unused i.e. all reads are used. NOTE: if there are
multiple alignments corresponding to this read id, all of them are used
--read-id-list <READ_ID_LIST>
Path to file containing list of read IDs (one per line). Lines starting with '#' are
treated as comments and ignored. Cannot be used together with --read-id
--threads <THREADS>
Number of threads used during some aspects of program execution [default: 2]
--include-zero-len
Include "zero-length" sequences e.g. sequences with "*" in the sequence field. By default,
these sequences are excluded to avoid processing errors. If this flag is set, these reads
are included irrespective of any minimum sequence or align length criteria the user may
have set. WARNINGS: (1) Some functions of the codebase may break or produce incorrect
results if you use this flag. (2) due to a technical reason, we need a DNA sequence in the
sequence field and cannot infer sequence length from other sources e.g. CIGAR strings
--read-filter <READ_FILTER>
Only retain reads of this type. Allowed types are `primary_forward`, `primary_reverse`,
`secondary_forward`, `secondary_reverse`, `supplementary_forward`, `supplementary_reverse`
and `unmapped`. To specify more than one type, separate them with commas, in which case
reads of any type in list are retained. Defaults to retain reads of all types
-s, --sample-fraction <SAMPLE_FRACTION>
Subsample BAM to retain only this fraction of total number of reads, defaults to 1.0. The
sampling algorithm considers every read according to the specified probability, so due to
this, you may not always get the same number of reads e.g. if you set `-s 0.05` in a file
with 1000 reads, you will get 50 +- sqrt(50) reads. NOTE: a new subsample is drawn every
time as the seed is not fixed. If you want reproducibility, consider piping the output of
`samtools view -s` to our program [default: 1]
--mapq-filter <MAPQ_FILTER>
Exclude reads whose MAPQ (Mapping quality of position) is below this value. Defaults to
zero i.e. do not exclude any read [default: 0]
--exclude-mapq-unavail
Exclude sequences with MAPQ unavailable. In the BAM format, a value of 255 in this column
means MAPQ is unavailable. These reads are allowed by default, set this flag to exclude
--region <REGION>
Only keep reads passing through this region. If a BAM index is available with a name same
as the BAM file but with the .bai suffix, the operation of selecting such reads will be
faster. If you are using standard input as your input e.g. you are piping in the output
from samtools, then you cannot use an index as a BAM filename is not available
--full-region
Only keep reads if they pass through the specified region in full. Related to the input
`--region`; has no effect if that is not set
--tag <TAG>
modified tag
--mod-strand <MOD_STRAND>
modified strand, set this to `bc` or `bc_comp`, meaning on basecalled strand or its
complement. Some technologies like `PacBio` or `ONT` duplex can call mod data on both a
strand and its complementary DNA and store it in the record corresponding to the strand,
so you can use this filter to select only for mod data on a strand or its complement.
Please note that this filter is different from selecting for forward or reverse aligned
reads using the BAM flags
--mod-prob-filter <MOD_PROB_FILTER>
Filter to reject mods before analysis. Specify as low,high where both are fractions to
reject modifications where the probabilities (p) are in this range e.g. "0.4,0.6" rejects
0.4 <= p <= 0.6. You can use this to reject 'weak' modification calls before analysis i.e.
those with probabilities close to 0.5. NOTE: (1) Whether this filtration is applied or
not, mods < 0.5 are considered unmodified and >= 0.5 are considered modified by our
program. (2) mod probabilities are stored as a number from 0-255 in the modBAM format, so
we internally convert 0.0-1.0 to 0-255. Default: reject nothing [default: ]
--trim-read-ends-mod <TRIM_READ_ENDS_MOD>
Filter this many bp at the start and end of a read before any mod operations. Please note
that the units here are bp and not units of base being queried [default: 0]
--base-qual-filter-mod <BASE_QUAL_FILTER_MOD>
Exclude bases whose base quality is below this threshold before any mod operation,
defaults to 0 i.e. unused. NOTE: (1) This step is only applied before modification
operations, and not before any other operations. (2) No offsets such as +33 are needed
here. (3) Modifications on reads where base quality information is not available are all
rejected if this is non-zero [default: 0]
--mod-region <MOD_REGION>
Only keep modification data from this region
--seq-region <SEQ_REGION>
Genomic region from which basecalled sequences are displayed (optional)
--seq-full
Displays entire basecalled sequence (optional)
--show-base-qual
Displays basecalling qualities (optional)
--show-ins-lowercase
Show insertions in lower case
--show-mod-z
Shows modified bases as Z (or z depending on other options)
-h, --help
Print help
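The `--mod-prob-filter` rule described above can be sketched in Python as follows. This is an illustration only, not nanalogue's implementation: the function names are hypothetical, and the use of rounding to map fractions onto the 0-255 scale is an assumption (the help text states only that a conversion takes place).

```python
def to_byte(p: float) -> int:
    """Map a probability in [0.0, 1.0] onto the 0-255 modBAM scale.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return round(p * 255)


def parse_filter(spec: str) -> tuple[int, int]:
    """Parse a 'low,high' string such as '0.4,0.6' into 0-255 bounds."""
    low, high = (float(x) for x in spec.split(","))
    if low > high:
        raise ValueError("low must not exceed high")
    return to_byte(low), to_byte(high)


def keep_calls(probs: list[int], spec: str = "") -> list[int]:
    """Drop calls whose 0-255 probability lies inside the rejected range.

    An empty spec mirrors the documented default: reject nothing.
    """
    if not spec:
        return list(probs)
    low, high = parse_filter(spec)
    return [q for q in probs if not (low <= q <= high)]


if __name__ == "__main__":
    calls = [0, 101, 102, 128, 153, 154, 255]
    print(keep_calls(calls, "0.4,0.6"))
```

Both bounds are inclusive here, matching the `0.4 <= p <= 0.6` wording in the option description.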
read-table-hide-mods
Prints basecalled len, align len per molecule
Usage: nanalogue read-table-hide-mods [OPTIONS] <BAM_PATH> [SEQ_SUMM_FILE]
Arguments:
<BAM_PATH> Input BAM file. Set to a local file path, or set to - to read from stdin, or set
to a URL to read from a remote file. If using stdin and piping in from `samtools
view`, always include the header with the `-h` option
[SEQ_SUMM_FILE] Input sequence summary file from Guppy/Dorado (optional) [default: ]
Options:
--min-seq-len <MIN_SEQ_LEN>
Exclude reads whose sequence length in the BAM file is below this value. Defaults to 0
[default: 0]
--min-align-len <MIN_ALIGN_LEN>
Exclude reads whose alignment length in the BAM file is below this value. Defaults to
unused
--read-id <READ_ID>
Only include this read id, defaults to unused i.e. all reads are used. NOTE: if there are
multiple alignments corresponding to this read id, all of them are used
--read-id-list <READ_ID_LIST>
Path to file containing list of read IDs (one per line). Lines starting with '#' are
treated as comments and ignored. Cannot be used together with --read-id
--threads <THREADS>
Number of threads used during some aspects of program execution [default: 2]
--include-zero-len
Include "zero-length" sequences e.g. sequences with "*" in the sequence field. By default,
these sequences are excluded to avoid processing errors. If this flag is set, these reads
are included irrespective of any minimum sequence or align length criteria the user may
have set. WARNINGS: (1) Some functions of the codebase may break or produce incorrect
results if you use this flag. (2) due to a technical reason, we need a DNA sequence in the
sequence field and cannot infer sequence length from other sources e.g. CIGAR strings
--read-filter <READ_FILTER>
Only retain reads of this type. Allowed types are `primary_forward`, `primary_reverse`,
`secondary_forward`, `secondary_reverse`, `supplementary_forward`, `supplementary_reverse`
and `unmapped`. To specify more than one type, separate them with commas, in which case
reads of any type in list are retained. Defaults to retain reads of all types
-s, --sample-fraction <SAMPLE_FRACTION>
Subsample BAM to retain only this fraction of total number of reads, defaults to 1.0. The
sampling algorithm considers every read according to the specified probability, so due to
this, you may not always get the same number of reads e.g. if you set `-s 0.05` in a file
with 1000 reads, you will get 50 +- sqrt(50) reads. NOTE: a new subsample is drawn every
time as the seed is not fixed. If you want reproducibility, consider piping the output of
`samtools view -s` to our program [default: 1]
--mapq-filter <MAPQ_FILTER>
Exclude reads whose MAPQ (Mapping quality of position) is below this value. Defaults to
zero i.e. do not exclude any read [default: 0]
--exclude-mapq-unavail
Exclude sequences with MAPQ unavailable. In the BAM format, a value of 255 in this column
means MAPQ is unavailable. These reads are allowed by default, set this flag to exclude
--region <REGION>
Only keep reads passing through this region. If a BAM index is available with a name same
as the BAM file but with the .bai suffix, the operation of selecting such reads will be
faster. If you are using standard input as your input e.g. you are piping in the output
from samtools, then you cannot use an index as a BAM filename is not available
--full-region
Only keep reads if they pass through the specified region in full. Related to the input
`--region`; has no effect if that is not set
--seq-region <SEQ_REGION>
Genomic region from which basecalled sequences are displayed (optional)
--seq-full
Displays entire basecalled sequence (optional)
--show-base-qual
Displays basecalling qualities (optional)
--show-ins-lowercase
Show insertions in lower case
-h, --help
Print help
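The read types accepted by `--read-filter` correspond to combinations of standard BAM flag bits. The Python sketch below shows one plausible mapping. The flag values come from the SAM specification; the function names are hypothetical, and the precedence applied when several bits are set (unmapped first, then secondary, then supplementary) is an assumption rather than documented nanalogue behaviour.

```python
# Flag bits defined by the SAM/BAM specification.
UNMAPPED = 0x4
REVERSE = 0x10
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def read_type(flag: int) -> str:
    """Classify a BAM flag into one of the seven --read-filter type names."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SECONDARY:
        kind = "secondary"
    elif flag & SUPPLEMENTARY:
        kind = "supplementary"
    else:
        kind = "primary"
    strand = "reverse" if flag & REVERSE else "forward"
    return f"{kind}_{strand}"


def passes(flag: int, read_filter: str = "") -> bool:
    """Return True if the read is retained under a comma-separated filter.

    An empty filter mirrors the documented default: retain all types.
    """
    if not read_filter:
        return True
    allowed = {t.strip() for t in read_filter.split(",")}
    return read_type(flag) in allowed


if __name__ == "__main__":
    for flag in (0, 16, 272, 2048, 4):
        print(flag, read_type(flag))
```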
read-stats
Calculates various summary statistics on all reads
Usage: nanalogue read-stats [OPTIONS] <BAM_PATH>
Arguments:
<BAM_PATH> Input BAM file. Set to a local file path, or set to - to read from stdin, or set to a
URL to read from a remote file. If using stdin and piping in from `samtools view`,
always include the header with the `-h` option
Options:
--min-seq-len <MIN_SEQ_LEN>
Exclude reads whose sequence length in the BAM file is below this value. Defaults to 0
[default: 0]
--min-align-len <MIN_ALIGN_LEN>
Exclude reads whose alignment length in the BAM file is below this value. Defaults to
unused
--read-id <READ_ID>
Only include this read id, defaults to unused i.e. all reads are used. NOTE: if there are
multiple alignments corresponding to this read id, all of them are used
--read-id-list <READ_ID_LIST>
Path to file containing list of read IDs (one per line). Lines starting with '#' are
treated as comments and ignored. Cannot be used together with --read-id
--threads <THREADS>
Number of threads used during some aspects of program execution [default: 2]
--include-zero-len
Include "zero-length" sequences e.g. sequences with "*" in the sequence field. By default,
these sequences are excluded to avoid processing errors. If this flag is set, these reads
are included irrespective of any minimum sequence or align length criteria the user may
have set. WARNINGS: (1) Some functions of the codebase may break or produce incorrect
results if you use this flag. (2) due to a technical reason, we need a DNA sequence in the
sequence field and cannot infer sequence length from other sources e.g. CIGAR strings
--read-filter <READ_FILTER>
Only retain reads of this type. Allowed types are `primary_forward`, `primary_reverse`,
`secondary_forward`, `secondary_reverse`, `supplementary_forward`, `supplementary_reverse`
and `unmapped`. To specify more than one type, separate them with commas, in which case
reads of any type in list are retained. Defaults to retain reads of all types
-s, --sample-fraction <SAMPLE_FRACTION>
Subsample BAM to retain only this fraction of total number of reads, defaults to 1.0. The
sampling algorithm considers every read according to the specified probability, so due to
this, you may not always get the same number of reads e.g. if you set `-s 0.05` in a file
with 1000 reads, you will get 50 +- sqrt(50) reads. NOTE: a new subsample is drawn every
time as the seed is not fixed. If you want reproducibility, consider piping the output of
`samtools view -s` to our program [default: 1]
--mapq-filter <MAPQ_FILTER>
Exclude reads whose MAPQ (Mapping quality of position) is below this value. Defaults to
zero i.e. do not exclude any read [default: 0]
--exclude-mapq-unavail
Exclude sequences with MAPQ unavailable. In the BAM format, a value of 255 in this column
means MAPQ is unavailable. These reads are allowed by default, set this flag to exclude
--region <REGION>
Only keep reads passing through this region. If a BAM index is available with a name same
as the BAM file but with the .bai suffix, the operation of selecting such reads will be
faster. If you are using standard input as your input e.g. you are piping in the output
from samtools, then you cannot use an index as a BAM filename is not available
--full-region
Only keep reads if they pass through the specified region in full. Related to the input
`--region`; has no effect if that is not set
-h, --help
Print help
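The behaviour described for `-s, --sample-fraction` (each read kept independently with the given probability, so the retained count varies between runs) can be illustrated with a short Python sketch. The function names are hypothetical and this is not nanalogue's sampling code. The exact binomial standard deviation is sqrt(n * p * (1 - p)); for small p this is close to the sqrt(50) quoted in the help text.

```python
import math
import random


def subsample(read_ids: list[str], fraction: float, rng: random.Random) -> list[str]:
    """Keep each read independently with probability `fraction`."""
    return [r for r in read_ids if rng.random() < fraction]


def expected_spread(n_reads: int, fraction: float) -> tuple[float, float]:
    """Return the mean and standard deviation of the retained read count."""
    mean = n_reads * fraction
    sd = math.sqrt(n_reads * fraction * (1.0 - fraction))
    return mean, sd


if __name__ == "__main__":
    ids = [f"read{i}" for i in range(1000)]
    mean, sd = expected_spread(len(ids), 0.05)
    print(f"expected {mean:.0f} +- {sd:.1f} reads")
    # An unseeded generator gives a different subsample on each run,
    # which is the behaviour the help text warns about.
    print(len(subsample(ids, 0.05, random.Random())))
```

Passing a seeded `random.Random` makes the sketch reproducible, which is analogous to the help text's suggestion of subsampling with `samtools view -s` before piping reads in.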
read-info
Prints information about reads
Usage: nanalogue read-info [OPTIONS] <BAM_PATH>
Arguments:
<BAM_PATH> Input BAM file. Set to a local file path, or set to - to read from stdin, or set to a
URL to read from a remote file. If using stdin and piping in from `samtools view`,
always include the header with the `-h` option
Options:
--min-seq-len <MIN_SEQ_LEN>
Exclude reads whose sequence length in the BAM file is below this value. Defaults to 0
[default: 0]
--min-align-len <MIN_ALIGN_LEN>
Exclude reads whose alignment length in the BAM file is below this value. Defaults to
unused
--read-id <READ_ID>
Only include this read id, defaults to unused i.e. all reads are used. NOTE: if there are
multiple alignments corresponding to this read id, all of them are used
--read-id-list <READ_ID_LIST>
Path to file containing list of read IDs (one per line). Lines starting with '#' are
treated as comments and ignored. Cannot be used together with --read-id
--threads <THREADS>
Number of threads used during some aspects of program execution [default: 2]
--include-zero-len
Include "zero-length" sequences e.g. sequences with "*" in the sequence field. By default,
these sequences are excluded to avoid processing errors. If this flag is set, these reads
are included irrespective of any minimum sequence or align length criteria the user may
have set. WARNINGS: (1) Some functions of the codebase may break or produce incorrect
results if you use this flag. (2) due to a technical reason, we need a DNA sequence in the
sequence field and cannot infer sequence length from other sources e.g. CIGAR strings
--read-filter <READ_FILTER>
Only retain reads of this type. Allowed types are `primary_forward`, `primary_reverse`,
`secondary_forward`, `secondary_reverse`, `supplementary_forward`, `supplementary_reverse`
and `unmapped`. To specify more than one type, separate them with commas, in which case
reads of any type in list are retained. Defaults to retain reads of all types
-s, --sample-fraction <SAMPLE_FRACTION>
Subsample BAM to retain only this fraction of total number of reads, defaults to 1.0. The
sampling algorithm considers every read according to the specified probability, so due to
this, you may not always get the same number of reads e.g. if you set `-s 0.05` in a file
with 1000 reads, you will get 50 +- sqrt(50) reads. NOTE: a new subsample is drawn every
time as the seed is not fixed. If you want reproducibility, consider piping the output of
`samtools view -s` to our program [default: 1]
--mapq-filter <MAPQ_FILTER>
Exclude reads whose MAPQ (Mapping quality of position) is below this value. Defaults to
zero i.e. do not exclude any read [default: 0]
--exclude-mapq-unavail
Exclude sequences with MAPQ unavailable. In the BAM format, a value of 255 in this column
means MAPQ is unavailable. These reads are allowed by default, set this flag to exclude
--region <REGION>
Only keep reads passing through this region. If a BAM index is available with a name same
as the BAM file but with the .bai suffix, the operation of selecting such reads will be
faster. If you are using standard input as your input e.g. you are piping in the output
from samtools, then you cannot use an index as a BAM filename is not available
--full-region
Only keep reads if they pass through the specified region in full. Related to the input
`--region`; has no effect if that is not set
--tag <TAG>
modified tag
--mod-strand <MOD_STRAND>
modified strand, set this to `bc` or `bc_comp`, meaning on basecalled strand or its
complement. Some technologies like `PacBio` or `ONT` duplex can call mod data on both a
strand and its complementary DNA and store it in the record corresponding to the strand,
so you can use this filter to select only for mod data on a strand or its complement.
Please note that this filter is different from selecting for forward or reverse aligned
reads using the BAM flags
--mod-prob-filter <MOD_PROB_FILTER>
Filter to reject mods before analysis. Specify as low,high where both are fractions to
reject modifications where the probabilities (p) are in this range e.g. "0.4,0.6" rejects
0.4 <= p <= 0.6. You can use this to reject 'weak' modification calls before analysis i.e.
those with probabilities close to 0.5. NOTE: (1) Whether this filtration is applied or
not, mods < 0.5 are considered unmodified and >= 0.5 are considered modified by our
program. (2) mod probabilities are stored as a number from 0-255 in the modBAM format, so
we internally convert 0.0-1.0 to 0-255. Default: reject nothing [default: ]
--trim-read-ends-mod <TRIM_READ_ENDS_MOD>
Filter this many bp at the start and end of a read before any mod operations. Please note
that the units here are bp and not units of base being queried [default: 0]
--base-qual-filter-mod <BASE_QUAL_FILTER_MOD>
Exclude bases whose base quality is below this threshold before any mod operation,
defaults to 0 i.e. unused. NOTE: (1) This step is only applied before modification
operations, and not before any other operations. (2) No offsets such as +33 are needed
here. (3) Modifications on reads where base quality information is not available are all
rejected if this is non-zero [default: 0]
--mod-region <MOD_REGION>
Only keep modification data from this region
--detailed
Print detailed modification data (JSON)
--detailed-pretty
Pretty-print detailed modification data (JSON)
-h, --help
Print help
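The difference between `--region` alone and `--region` combined with `--full-region` can be sketched in Python. This is a simplified illustration under stated assumptions: coordinates are treated as half-open intervals on a single contig, "in full" is read as "the alignment covers the entire region", and all names are hypothetical.

```python
from typing import Optional, Tuple

Region = Tuple[int, int]  # (start, end), half-open, single contig assumed


def overlaps(read_start: int, read_end: int, region: Region) -> bool:
    """True if the alignment shares at least one position with the region."""
    start, end = region
    return read_start < end and start < read_end


def spans_fully(read_start: int, read_end: int, region: Region) -> bool:
    """True if the alignment covers the region from start to end."""
    start, end = region
    return read_start <= start and end <= read_end


def keep_read(read_start: int, read_end: int,
              region: Optional[Region] = None,
              full_region: bool = False) -> bool:
    """Apply the region test; full_region has no effect without a region."""
    if region is None:
        return True
    if full_region:
        return spans_fully(read_start, read_end, region)
    return overlaps(read_start, read_end, region)


if __name__ == "__main__":
    region = (200, 300)
    print(keep_read(250, 500, region))
    print(keep_read(250, 500, region, full_region=True))
```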
find-modified-reads
Find names of modified reads through criteria specified by sub commands
Usage: nanalogue find-modified-reads <COMMAND>
Commands:
  all-dens-between                   Find reads with all windowed modification densities within
                                     specified limits
  any-dens-above                     Find reads with windowed modification density such that at
                                     least one window is at or above the high value
  any-dens-below                     Find reads with windowed modification density such that at
                                     least one window is at or below the low value
  any-dens-below-and-any-dens-above  Find reads with windowed modification density such that at
                                     least one window is at or below the low value and at least one
                                     window is at or above the high value. This operation may enrich
                                     for reads with spatial gradients in modification density
  dens-range-above                   Find reads with windowed modification density such that max of
                                     all densities minus min of all densities is at least the value
                                     specified. This operation may enrich for reads with spatial
                                     gradients in modification density
  any-abs-grad-above                 Find reads such that absolute value of gradient in modification
                                     density measured in windows is at least the value specified.
                                     This operation enriches for reads with spatial gradients in
                                     modification density
  help                               Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
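The density-based selection criteria listed above can be expressed as predicates over one read's list of windowed densities. The Python sketch below is illustrative only: the function names mirror the subcommand names but are hypothetical, treating the limits of `all-dens-between` as inclusive is an assumption, and reads with no windows are assumed to fail every criterion. `any-abs-grad-above` is omitted because the help text does not define how the gradient is computed.

```python
from typing import Sequence


def all_dens_between(dens: Sequence[float], low: float, high: float) -> bool:
    """Every window lies within [low, high] (inclusive bounds assumed)."""
    return bool(dens) and all(low <= d <= high for d in dens)


def any_dens_above(dens: Sequence[float], high: float) -> bool:
    """At least one window is at or above the high value."""
    return any(d >= high for d in dens)


def any_dens_below(dens: Sequence[float], low: float) -> bool:
    """At least one window is at or below the low value."""
    return any(d <= low for d in dens)


def any_dens_below_and_any_dens_above(dens: Sequence[float],
                                      low: float, high: float) -> bool:
    """One window at or below low and one window at or above high."""
    return any_dens_below(dens, low) and any_dens_above(dens, high)


def dens_range_above(dens: Sequence[float], value: float) -> bool:
    """Max density minus min density is at least the given value."""
    return bool(dens) and (max(dens) - min(dens)) >= value


if __name__ == "__main__":
    densities = [0.25, 0.5, 0.75]
    print(any_dens_below_and_any_dens_above(densities, 0.25, 0.75))
    print(dens_range_above(densities, 0.5))
```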
window-dens
Output windowed densities of all reads
Usage: nanalogue window-dens [OPTIONS] --win <WIN> --step <STEP> <BAM_PATH>
Arguments:
<BAM_PATH> Input BAM file. Set to a local file path, or set to - to read from stdin, or set to a
URL to read from a remote file. If using stdin and piping in from `samtools view`,
always include the header with the `-h` option
Options:
--min-seq-len <MIN_SEQ_LEN>
Exclude reads whose sequence length in the BAM file is below this value. Defaults to 0
[default: 0]
--min-align-len <MIN_ALIGN_LEN>
Exclude reads whose alignment length in the BAM file is below this value. Defaults to
unused
--read-id <READ_ID>
Only include this read id, defaults to unused i.e. all reads are used. NOTE: if there are
multiple alignments corresponding to this read id, all of them are used
--read-id-list <READ_ID_LIST>
Path to file containing list of read IDs (one per line). Lines starting with '#' are
treated as comments and ignored. Cannot be used together with --read-id
--threads <THREADS>
Number of threads used during some aspects of program execution [default: 2]
--include-zero-len
Include "zero-length" sequences e.g. sequences with "*" in the sequence field. By default,
these sequences are excluded to avoid processing errors. If this flag is set, these reads
are included irrespective of any minimum sequence or align length criteria the user may
have set. WARNINGS: (1) Some functions of the codebase may break or produce incorrect
results if you use this flag. (2) due to a technical reason, we need a DNA sequence in the
sequence field and cannot infer sequence length from other sources e.g. CIGAR strings
--read-filter <READ_FILTER>
Only retain reads of this type. Allowed types are `primary_forward`, `primary_reverse`,
`secondary_forward`, `secondary_reverse`, `supplementary_forward`, `supplementary_reverse`
and `unmapped`. To specify more than one type, separate them with commas, in which case
reads of any type in list are retained. Defaults to retain reads of all types
-s, --sample-fraction <SAMPLE_FRACTION>
Subsample BAM to retain only this fraction of total number of reads, defaults to 1.0. The
sampling algorithm considers every read according to the specified probability, so due to
this, you may not always get the same number of reads e.g. if you set `-s 0.05` in a file
with 1000 reads, you will get 50 +- sqrt(50) reads. NOTE: a new subsample is drawn every
time as the seed is not fixed. If you want reproducibility, consider piping the output of
`samtools view -s` to our program [default: 1]
--mapq-filter <MAPQ_FILTER>
Exclude reads whose MAPQ (Mapping quality of position) is below this value. Defaults to
zero i.e. do not exclude any read [default: 0]
--exclude-mapq-unavail
Exclude sequences with MAPQ unavailable. In the BAM format, a value of 255 in this column
means MAPQ is unavailable. These reads are allowed by default, set this flag to exclude
--region <REGION>
Only keep reads passing through this region. If a BAM index is available with a name same
as the BAM file but with the .bai suffix, the operation of selecting such reads will be
faster. If you are using standard input as your input e.g. you are piping in the output
from samtools, then you cannot use an index as a BAM filename is not available
--full-region
Only keep reads if they pass through the specified region in full. Related to the input
`--region`; has no effect if that is not set
--win <WIN>
size of window in units of base being queried i.e. if you are looking for cytosine
modifications, then a window of a value 300 means create windows each with 300 cytosines
irrespective of their modification status
--step <STEP>
step window by this size in units of base being queried
--tag <TAG>
modified tag
--mod-strand <MOD_STRAND>
modified strand, set this to `bc` or `bc_comp`, meaning on basecalled strand or its
complement. Some technologies like `PacBio` or `ONT` duplex can call mod data on both a
strand and its complementary DNA and store it in the record corresponding to the strand,
so you can use this filter to select only for mod data on a strand or its complement.
Please note that this filter is different from selecting for forward or reverse aligned
reads using the BAM flags
--mod-prob-filter <MOD_PROB_FILTER>
Filter to reject mods before analysis. Specify as low,high where both are fractions to
reject modifications where the probabilities (p) are in this range e.g. "0.4,0.6" rejects
0.4 <= p <= 0.6. You can use this to reject 'weak' modification calls before analysis i.e.
those with probabilities close to 0.5. NOTE: (1) Whether this filtration is applied or
not, mods < 0.5 are considered unmodified and >= 0.5 are considered modified by our
program. (2) mod probabilities are stored as a number from 0-255 in the modBAM format, so
we internally convert 0.0-1.0 to 0-255. Default: reject nothing [default: ]
--trim-read-ends-mod <TRIM_READ_ENDS_MOD>
Filter this many bp at the start and end of a read before any mod operations. Please note
that the units here are bp and not units of base being queried [default: 0]
--base-qual-filter-mod <BASE_QUAL_FILTER_MOD>
Exclude bases whose base quality is below this threshold before any mod operation,
defaults to 0 i.e. unused. NOTE: (1) This step is only applied before modification
operations, and not before any other operations. (2) No offsets such as +33 are needed
here. (3) Modifications on reads where base quality information is not available are all
rejected if this is non-zero [default: 0]
--mod-region <MOD_REGION>
Only keep modification data from this region
-h, --help
Print help
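The `--win` and `--step` options above count positions of the queried base rather than base pairs. The Python sketch below illustrates that idea on a list holding one modified/unmodified call per queried base (for example, one entry per cytosine). It rests on several assumptions: density is taken to be the fraction of modified calls in a window, trailing partial windows are dropped, and the function name is hypothetical. nanalogue's actual definitions may differ.

```python
from typing import List, Sequence


def windowed_density(calls: Sequence[int], win: int, step: int) -> List[float]:
    """Slide a window of `win` queried bases in steps of `step`.

    `calls` holds 1 for a modified call and 0 for an unmodified call,
    one entry per queried base, in read order.
    """
    if win <= 0 or step <= 0:
        raise ValueError("win and step must be positive")
    return [
        sum(calls[i:i + win]) / win
        for i in range(0, len(calls) - win + 1, step)
    ]


if __name__ == "__main__":
    # Eight queried bases: the first four modified, the last four not.
    calls = [1, 1, 1, 1, 0, 0, 0, 0]
    print(windowed_density(calls, win=4, step=2))
```

With `--win 300`, each window in this picture would hold 300 entries of `calls`, however many base pairs of sequence those bases are spread across.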
window-grad
Output windowed gradients of all reads
Usage: nanalogue window-grad [OPTIONS] --win <WIN> --step <STEP> <BAM_PATH>
Arguments:
<BAM_PATH> Input BAM file. Set to a local file path, or set to - to read from stdin, or set to a
URL to read from a remote file. If using stdin and piping in from `samtools view`,
always include the header with the `-h` option
Options:
--min-seq-len <MIN_SEQ_LEN>
Exclude reads whose sequence length in the BAM file is below this value. Defaults to 0
[default: 0]
--min-align-len <MIN_ALIGN_LEN>
Exclude reads whose alignment length in the BAM file is below this value. Defaults to
unused
--read-id <READ_ID>
Only include this read id, defaults to unused i.e. all reads are used. NOTE: if there are
multiple alignments corresponding to this read id, all of them are used
--read-id-list <READ_ID_LIST>
Path to file containing list of read IDs (one per line). Lines starting with '#' are
treated as comments and ignored. Cannot be used together with --read-id
--threads <THREADS>
Number of threads used during some aspects of program execution [default: 2]
--include-zero-len
Include "zero-length" sequences e.g. sequences with "*" in the sequence field. By default,
these sequences are excluded to avoid processing errors. If this flag is set, these reads
are included irrespective of any minimum sequence or align length criteria the user may
have set. WARNINGS: (1) Some functions of the codebase may break or produce incorrect
results if you use this flag. (2) due to a technical reason, we need a DNA sequence in the
sequence field and cannot infer sequence length from other sources e.g. CIGAR strings
--read-filter <READ_FILTER>
Only retain reads of this type. Allowed types are `primary_forward`, `primary_reverse`,
`secondary_forward`, `secondary_reverse`, `supplementary_forward`, `supplementary_reverse`
and `unmapped`. To specify more than one type, separate them with commas, in which case
reads of any type in list are retained. Defaults to retain reads of all types
-s, --sample-fraction <SAMPLE_FRACTION>
Subsample BAM to retain only this fraction of total number of reads, defaults to 1.0. The
sampling algorithm considers every read according to the specified probability, so due to
this, you may not always get the same number of reads e.g. if you set `-s 0.05` in a file
with 1000 reads, you will get 50 +- sqrt(50) reads. NOTE: a new subsample is drawn every
time as the seed is not fixed. If you want reproducibility, consider piping the output of
`samtools view -s` to our program [default: 1]
--mapq-filter <MAPQ_FILTER>
Exclude reads whose MAPQ (Mapping quality of position) is below this value. Defaults to
zero i.e. do not exclude any read [default: 0]
--exclude-mapq-unavail
Exclude sequences with MAPQ unavailable. In the BAM format, a value of 255 in this column
means MAPQ is unavailable. These reads are allowed by default, set this flag to exclude
--region <REGION>
Only keep reads passing through this region. If a BAM index is available with a name same
as the BAM file but with the .bai suffix, the operation of selecting such reads will be
faster. If you are using standard input as your input e.g. you are piping in the output
from samtools, then you cannot use an index as a BAM filename is not available
--full-region
Only keep reads if they pass through the specified region in full. Related to the input
`--region`; has no effect if that is not set
--win <WIN>
size of window in units of base being queried i.e. if you are looking for cytosine
modifications, then a window of a value 300 means create windows each with 300 cytosines
irrespective of their modification status
--step <STEP>
step window by this size in units of base being queried
--tag <TAG>
modified tag
--mod-strand <MOD_STRAND>
modified strand, set this to `bc` or `bc_comp`, meaning on basecalled strand or its
complement. Some technologies like `PacBio` or `ONT` duplex can call mod data on both a
strand and its complementary DNA and store it in the record corresponding to the strand,
so you can use this filter to select only for mod data on a strand or its complement.
Please note that this filter is different from selecting for forward or reverse aligned
reads using the BAM flags
--mod-prob-filter <MOD_PROB_FILTER>
Filter to reject mods before analysis. Specify as low,high where both are fractions to
reject modifications where the probabilities (p) are in this range e.g. "0.4,0.6" rejects
0.4 <= p <= 0.6. You can use this to reject 'weak' modification calls before analysis i.e.
those with probabilities close to 0.5. NOTE: (1) Whether this filtration is applied or
not, mods < 0.5 are considered unmodified and >= 0.5 are considered modified by our
program. (2) mod probabilities are stored as a number from 0-255 in the modBAM format, so
we internally convert 0.0-1.0 to 0-255. Default: reject nothing [default: ]
--trim-read-ends-mod <TRIM_READ_ENDS_MOD>
Filter this many bp at the start and end of a read before any mod operations. Please note
that the units here are bp and not units of base being queried [default: 0]
--base-qual-filter-mod <BASE_QUAL_FILTER_MOD>
Exclude bases whose base quality is below this threshold before any mod operation,
defaults to 0 i.e. unused. NOTE: (1) This step is only applied before modification
operations, and not before any other operations. (2) No offsets such as +33 are needed
here. (3) Modifications on reads where base quality information is not available are all
rejected if this is non-zero [default: 0]
--mod-region <MOD_REGION>
Only keep modification data from this region
-h, --help
Print help
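The file format accepted by `--read-id-list` (one read ID per line, with lines starting with '#' treated as comments) can be parsed as in the Python sketch below. The function name is hypothetical; stripping surrounding whitespace and skipping blank lines are assumptions that the help text does not confirm.

```python
from typing import List


def parse_read_id_list(text: str) -> List[str]:
    """Return the read IDs from the contents of a read-ID list file."""
    ids = []
    for line in text.splitlines():
        line = line.strip()
        # Skip comment lines (documented) and blank lines (assumed).
        if not line or line.startswith("#"):
            continue
        ids.append(line)
    return ids


if __name__ == "__main__":
    example = "# reads of interest\nread-0001\n\nread-0002\n# trailing comment\n"
    print(parse_read_id_list(example))
```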
peek
Display BAM file contigs, contig lengths, and mod types from a "peek" at the header and first 100
records
Usage: nanalogue peek <BAM>
Arguments:
<BAM> Input BAM file (path, URL, or '-' for stdin)
Options:
-h, --help Print help
help
Print this message or the help of the given subcommand(s)