template <typename traits_type, typename fields_type, typename format_type>
inline void filter_file(seqan3::sam_file_input<traits_type, fields_type, format_type> & input, std::streampos const & file_position)
seek_to to jump to the position. Then I want to output records for the next 100 base pairs, for example.
std::views::take_while(CONDITION) instead of or in addition to std::views::filter(). Make the condition that the position is inside your region. Then it will stop processing input as soon as you leave that region.
template <typename traits_type, typename fields_type, typename format_type>
inline auto get_overlap_records(seqan3::sam_file_input<traits_type, fields_type, format_type> & input,...)
{
    ...
    auto results_list = input
        | std::views::take_while([file_position](auto & rec) { return file_position != -1; })
        | std::views::take_while([end](auto & rec)
              { return std::make_tuple(rec.reference_id().value(), rec.reference_position().value()) < end; })
        | std::views::filter([](auto & rec) { return !unmapped(rec); })
        | seqan3::views::to<std::vector>;
    ...
    return results_list;
}
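The second take_while condition above relies on std::tuple comparing lexicographically: the reference id decides first, and the position only matters when the reference ids are equal. A small sketch of that behaviour with plain ints follows. The `before_end` helper and the example values are hypothetical, standing in for the optional-valued reference_id() and reference_position() of a real record.

```cpp
#include <tuple>

// Hypothetical helper: true while (ref_id, position) sorts strictly before `end`.
// std::tuple compares lexicographically, so ref_id decides first and
// position only matters when both ref_ids are equal.
constexpr bool before_end(int ref_id, int position, std::tuple<int, int> const & end)
{
    return std::make_tuple(ref_id, position) < end;
}

// With end = (reference 1, position 100):
static_assert(before_end(0, 500, {1, 100}));  // earlier reference, position is irrelevant
static_assert(before_end(1, 99, {1, 100}));   // same reference, earlier position
static_assert(!before_end(1, 100, {1, 100})); // the end itself is excluded
static_assert(!before_end(2, 0, {1, 100}));   // later reference
```

This only gives a sensible stopping point if the file is sorted by reference id and then by position, which is the same ordering the tuple comparison assumes.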
results_list to an output file just fine (sam_file_output out{out_file}; results_list | out;), but if I try and access individual elements within results_list, I get errors. Specifically, I tried doing debug_stream << results_list, and I tried using EXPECT_EQ(results_list1, results_list2) for two separate calls to the function.
[(CTTTGGGAGGCCAGGAGTTCAACATCAGCCTGGGCAACATGGTGAAACCACGTCTCTACCAAAAATACAAAAATTAGATGGGCATGGTGGCATGTGCCTGT,simulated_mult_chr_small-chr2-65,0,1,6,(unknown file: Failure
C++ exception with description "Access is not allowed because there is no sequence information." thrown in the test body.
"
seqan3::field::alignment, which may not be present in my SAM file but which the debug_stream/comparison functions try to access.
seqan3::sam_file_input and default fields), whereas samtools parses & indexes in ~10 minutes. If I specify a subset of fields, it takes 30 minutes to parse with seqan3. Both of these times are using 16 threads.
git clone --recurse-submodules, I will end up with duplicate seqan3 libraries, right?
chopper also exposes means to set the path where seqan3 is.
git submodule update --init. recursive shouldn't break anything, but it's unnecessary.