Helloooo. I'm struggling a bit with getting a random position in a random chromosome from an indexed BAM file, could somebody help me out? Here's what I've currently got:
using my_fields = seqan3::fields<seqan3::field::ref_id,
seqan3::field::ref_offset,
seqan3::field::cigar>;
seqan3::alignment_file_input alignment_file{alignment_file_path, my_fields{}};
uint8_t num_chr = alignment_file.header().ref_ids().size();
uint8_t rand_chr = rand() % (num_chr + 1);
uint64_t rand_pos = ???
I'm just not too sure how to use ref_id_info and ref_dict with what I have above... I guess std::get<0>(ref_id_info)
with the appropriate seqan3::field::ref_id should give me the length of the chromosome, but I don't know how to get this.
Hello, so I want to pass a sam_file_input to a function and have a few questions regarding this:
I will be passing the variable via reference. This means that if I read two records from the file somewhere else, and then pass the file to another function, those two records will have already been read?
Secondly, I realized I can't simply do this: void foo(seqan3::sam_file_input & input_file)
, because the compiler wants the full template types. However, I don't want to give a specific list of fields, in case the function is used with different inputs which have different fields. Can I do the following?:
template<typename F, typename B>
void foo(seqan3::sam_file_input<seqan3::sam_file_input_default_traits<>, F, B> & input_file) {}
I noticed that it does compile and my tests pass, but I want to know what the potential issues would be with this.
For example, if I have the following code:
template<typename F, typename B>
void foo(seqan3::sam_file_input<seqan3::sam_file_input_default_traits<>, F, B> & input_file)
{
for (auto it = input_file.begin(); it != input_file.end(); ++it)
{
seqan3::debug_stream << (*it).id() << '\n';
}
}
But the function is called with a
sam_file_input<seqan3::sam_file_input_default_traits<>, seqan3::fields<seqan3::field::cigar>, seqan3::type_list<seqan3::format_sam>>
then will the compiler give an error that seqan3::field::id
is not present? Or do I have to add the line if ((*it).has_value())
before the printing.
Has anyone experience with SeqAn3/ranges and gcc 9.1? I tested with a later gcc9 version (also don't have another one installled yet) and had no issues but a reviewer encountered the following problem which seems to come from the ranges library when building
…/seqan3/submodules/range-v3/include/range/v3/range/concepts.hpp:56:5: internal compiler error: in nothrow_spec_p, at cp/except.c:1247
Just wondering whether there ever have been issues with a certain compiler version
I am trying to use rabema to evaluate my output. I ran razers3 on hg38 with 100'000 queries. This created a out_gold.sam (4.5GB). Using rabema_prepare_sam
I converted it to out_gold.prep.sam (15GB). But converting it to a bam file samtools view -Sb out_gold.prep.sam > out_gold.bam
fails. I don't really know where to start searching. The error message:
...
[W::sam_hdr_create] Duplicated sequence "" in file "out_gold.prep.sam"
[W::sam_hdr_create] Duplicated sequence "" in file "out_gold.prep.sam"
[W::sam_hdr_create] Duplicated sequence "" in file "out_gold.prep.sam"
[E::sam_hrecs_update_hashes] Duplicate entry "" in sam header
samtools view: failed to add PG line to the header
The resulting out_gold.bam file is only 28Bytes large.
I am following the instructions from here: https://github.com/seqan/seqan/tree/develop/apps/rabema
This branch is 252 commits ahead, 66 commits behind simongog:master.
Is this fork supposed to be the new primary repo?