nixpkgs also by writing some of your own packages/config?
NGLess so that 95% of users wouldn't ever have to even know that nix is a thing that exists
Since this has gotten a bit noisy, as an experiment, I invite you to join my open office hours tomorrow at 2pm GMT.
This is Sanjiban Sengupta, a sophomore in Computer Engineering at IIIT Bhubaneswar, India. I would like to contribute to your org for GSoC'20. I have practical working knowledge of C, C++, Python and Java. For web, I am familiar with HTML, CSS, JS, Bootstrap and frameworks such as ReactJS and NodeJS. I am also acquainted with ML and AI concepts and know how to apply them to real-life problems.
As a GSoC'20 beginner, I would be very grateful if you could provide the information needed to get started with development.
We had a useful Zoom discussion today, until we ran out of time, so I am scheduling another round for next week. Open to anyone in the world.
Luis Pedro Coelho is inviting you to a scheduled Zoom meeting.
Topic: Open Office Hours
Time: Mar 18, 2020 02:00 PM Greenwich Mean Time
Join Zoom Meeting
Meeting ID: 756 695 245
Sadeep here, from Sri Lanka.
Final year student in CSE, University of Moratuwa.
For my final year project, I am working on "reducing higher dimensional genomic data (we take oligonucleotide frequencies as of now) using Auto Encoders". I am using C to pre-process FASTA and FASTQ files to feed them to the neural network implemented in Python.
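As a rough illustration of the preprocessing step described above, here is a minimal Python sketch that turns FASTA records into fixed-length oligonucleotide frequency vectors. It is not the C code from the project: the function names `read_fasta` and `kmer_frequencies` and the choice of k = 4 are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def kmer_frequencies(seq, k=4):
    """Return normalised k-mer frequencies in a fixed ACGT-ordered vector."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A/C/G/T are counted; windows with N are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


if __name__ == "__main__":
    example = [">read1", "ACGTACGT", ">read2", "AAAATTTT"]
    for name, seq in read_fasta(example):
        vector = kmer_frequencies(seq, k=2)
        print(name, len(vector), round(sum(vector), 6))
```

Each sequence maps to a vector of length 4^k, which is the kind of fixed-size input an autoencoder expects.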
I am good at C/C++. I come from a strong algorithms background, having won the Sri Lankan Olympiad and represented Sri Lanka at the International Olympiad in Informatics (https://www.ioi2014.org/ioi-2014.html).
For GSoC, I would love to work on a genomics project, since my learning curve would be low.
Is there a real chance of getting selected for an NGLess project, since it's a little late to the party?
After downloading the gut-short demo, the script "gut-demo.ngl" looks like this:
import "parallel" version "1.0"
import "mocat" version "1.0"
import "motus" version "0.1"
import "igc" version "1.0"
The igc module version 1.0 causes unexpected results because the file served at https://ngless.embl.de/resources/Demos/ is only 45 bytes in size.
As a workaround, I changed the version from 1.0 to 0.9, and the script works fine.
Since this is not an error in the source code, but simply an availability issue, a GitHub issue would not suffice.
The documentation at https://ngless.embl.de/tutorial-gut-metagenomics.html# would need minor updates, such as:
ngless "1.0" -> "1.1"
import "parallel" version "0.6" -> "1.0"
import "mocat" version "0.0" -> "1.0"
import "motus" version "0.1" (unchanged)
import "igc" version "0.0" -> "0.9"
As an idea, I suggest we build a small mechanism that links the documentation to the files on the server, so that the scripts shown in the documentation update automatically and fewer manual changes are needed.
It's an interesting prospect.
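One way such a check could look, as a sketch only: extract the `import ... version ...` lines from the script shown in the documentation and from the script on the server, then report modules whose versions differ. The helper names `module_versions` and `version_drift` are hypothetical, and fetching the two texts (from the docs source and the server) is left out.

```python
import re

# Matches lines such as: import "igc" version "1.0"
IMPORT_RE = re.compile(r'import\s+"([^"]+)"\s+version\s+"([^"]+)"')


def module_versions(text):
    """Map module name -> version for every import line in an NGLess script."""
    return dict(IMPORT_RE.findall(text))


def version_drift(doc_snippet, server_script):
    """Return {module: (doc_version, server_version)} where the two differ."""
    doc = module_versions(doc_snippet)
    server = module_versions(server_script)
    return {
        name: (doc.get(name), server.get(name))
        for name in sorted(set(doc) | set(server))
        if doc.get(name) != server.get(name)
    }


if __name__ == "__main__":
    doc_snippet = '''
    import "parallel" version "0.6"
    import "mocat" version "0.0"
    import "motus" version "0.1"
    import "igc" version "0.0"
    '''
    server_script = '''
    import "parallel" version "1.0"
    import "mocat" version "1.0"
    import "motus" version "0.1"
    import "igc" version "1.0"
    '''
    for name, (doc_v, srv_v) in version_drift(doc_snippet, server_script).items():
        print(f"{name}: docs say {doc_v}, server has {srv_v}")
```

A check like this could run whenever the documentation is built, so that mismatches are flagged instead of being found by users.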
I hope everyone is SAFE and doing well!
I have been thinking about the compute problem. My main aim is to reduce the processing time (hours) that ngless takes, even for tests like "gut-short" and "ocean genomics".
Processing on a local machine, limited by traditional infrastructure, will always take a long time. I did quite a bit of digging and I think I am onto something.
Possible alternative: using Google Cloud Life Sciences. It enables the life sciences community to process biomedical data at scale. Cost-effective and supported by a growing partner ecosystem, Cloud Life Sciences lets you focus on analyzing data and reproducing results while GCP takes care of the rest.
It's a "High Performance Computing" approach to running NGLess that could bring the analysis of all the modules down from hours to minutes or seconds. That would be a big jump and would free up a lot of time for other tasks.
Bioinformaticians can build what they want, not just what they need, using open standards. Researchers can speed up their research, ask new questions, and share data in a secure online environment. IT professionals can rest easy knowing they have the resources they need to meet computational demand, secure data, and ensure system reliability.
I will be exploring this solution further in the coming days, and if I find something useful that would benefit everyone, I will write a Medium article (step-by-step guide) and post it here to let everyone know.
Meanwhile, you can look at these resources on Google Cloud Life Sciences:
Genomic Analyses on Google Cloud Platform: https://www.youtube.com/watch?v=27tSivxnQ_E&feature=emb_logo
Case Study by "Color": https://cloud.google.com/customers/color (Color: Enabling scientists to do real-time genomic data exploration)