
not with a DL framework
like, manually. Numpy at most, if using python.
yeah, also an idea
That's the best way to learn it IMO
is there a good tutorial for it or should I just start blank canvas and look up stuff if I'm not sure?
I don't know one off the top of my head, but I am sure some exist
i think starting with a blank canvas is the best approach
it's difficult for me to follow tutorials
maybe that would be a great way to teach ml. let users just try to attempt writing a neural network on a blank file and allow them to click an "ask for 1:1 help" button if they are struggling
only problem is that there is a point where 1:1 help is not helping
like the user is just over-stimulated in some situations and further chatting will not help
like in school where pupils just fall asleep
i really think there is a big potential for this world if we start teaching people machine learning at a large scale
like teaching 1 billion people about deep learning by the end of this decade
imagine the effects this could have
Google Brain: Intro to TFLite and TFLite Micro; please RSVP here https://www.meetup.com/TFUG-Mysuru/events/270390225/
Ashish Tilak
Hi Guys... Can anyone guide me to a chat room for extreme beginners? Or is this the right place to ask?
David Cottrell
There are some more active chats on discord probably.


We are biology students at Avans Breda trying to run a machine learning script in R on a dataset of human genome sequences. We came across some errors and hope that one of you can help us with this.
The scripts we are trying to run are located at https://github.com/cancer-genomics/delfi_scripts.

Our error comes up while running the 04 script on a single BAM file from the original dataset; the script is part of the set at the GitHub repository mentioned above.
As far as we understand, this script joins the output of the previous scripts (an .rds file) with the sample_reference.csv file located on the GitHub, and then splits the data into 5 Mb bins, which are later used in a stochastic gradient boosting algorithm. The problem is in this bit of code:

df.fr <- readRDS("../.../.../ourfilespecification_frags_bin_100kb.rds")
master <- read_csv("sample_reference.csv")
df.fr2 <- inner_join(df.fr, master, by=c("sample"="WGS ID"))
hic.eigen <- (df.fr2 %>% filter(sample=="PGDX10346P1"))$hic.eigen

But while joining, it gives the following error message:

Error in UseMethod("inner_join") :
no applicable method for 'inner_join' applied to an object of class "c('GRanges', 'GenomicRanges', 'Ranges', 'GenomicRanges_OR_missing', 'GenomicRanges_OR_GenomicRangesList', 'GenomicRanges_OR_GRangesList', 'List', 'Vector', 'list_OR_List', 'Annotated', 'vector_OR_Vector')"
Calls: inner_join

We assumed this meant that the inner_join function is not compatible with the GRanges class, so we tried converting the GRanges object to a data frame first.

df.fr <- data.frame(readRDS("../.../.../ourfilespecification_frags_bin_100kb.rds"))

When we ran the script again, the error message changed to this:

Error: by can't contain join column sample which is missing from LHS

├─dplyr::inner_join(df.fr, master, by = c(sample = "WGS ID"))
└─dplyr:::inner_join.data.frame(df.fr, master, by = c(sample = "WGS ID"))
├─dplyr::inner_join(tbl_df(x), y, by = by, copy = copy, ...)
├─dplyr::common_by(by, x, y)
└─dplyr:::common_by.character(by, x, y)
└─dplyr:::common_by.list(by, x, y)
└─dplyr:::glubort(fmt_args(args), ..., .envir = .envir)
Execution halted

It seems to us that there are no identical keys to match up the two data frames. When we looked at the input file created by the previous scripts, there is no column called sample; the sample_reference.csv file does have a "WGS ID" column.
Is it possible to join these files and continue with the scripts?

Another thing that bothers us is that PGDX10346P1 in the code is the name of a BAM file. Do we have to change it to the BAM file we are using to run the scripts? But first, the inner_join problem.

Could anyone help us fix this error message? The authors of the paper told us the man who wrote the script is currently unavailable because of medical reasons, so we can't ask them.

Please keep in mind that we are biology students, not informatics or mathematics students, so we are not very good at coding.

With kind regards, School of Life Sciences, Avans University of Applied Sciences, Breda, The Netherlands.

@GinoRaaijmakers better place to ask would be https://stackexchange.com/
@GinoRaaijmakers you also kind of answered the join question yourself: since there is no column named 'sample' in the .rds file, how would it be possible to join both tables on 'sample' = 'WGS ID'?
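To make the shape of the fix concrete: convert the GRanges to a data frame, find which column actually holds the sample ID, and rename it to sample (or join on that column directly) before calling the join. The delfi scripts themselves are R/dplyr, but the same key-renaming step is easy to sketch in pandas; the column name "id" and the values below are hypothetical stand-ins, not the real delfi data.

```python
import pandas as pd

# Left table: stand-in for the data.frame(readRDS(...)) result.
# In the real data the sample ID may live under a different column
# name (hypothetically "id" here), which is why the join on "sample" fails.
df_fr = pd.DataFrame({"id": ["PGDX10346P1", "PGDX10347P1"],
                      "frag_count": [120, 95]})

# Right table: stand-in for sample_reference.csv with its "WGS ID" column.
master = pd.DataFrame({"WGS ID": ["PGDX10346P1", "PGDX10347P1"],
                       "type": ["healthy", "cancer"]})

# Rename the left key so the intended join key exists,
# then inner-join on sample == "WGS ID".
df_fr = df_fr.rename(columns={"id": "sample"})
df_fr2 = df_fr.merge(master, left_on="sample", right_on="WGS ID", how="inner")
print(df_fr2.columns.tolist())  # sample, frag_count, WGS ID, type
```

In dplyr terms the equivalent step would be renaming the ID column of the converted data frame to sample before the inner_join call.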
Hey anyone here?
MinJune Kim
Hi, I'm a newbie trying to train an NMT model for Korean-English. I have a question about the amount of data it needs. Of course it won't be set in stone, but I'm guessing you have a hunch of how much data it will need to be decent. I have a 1.6M parallel sentence set. Will this be enough, or do I have to scour the web for more?
Currently training with openNMT default setting (LSTM encoder/decoder w/ 500 hidden units) btw, probably will look into transformer models later
Hi.. is this channel alive?
Marvin Irwin

This is a pretty abstract question, but since GPT-2 is good at predicting the next word, is it possible to ask the model whether a sentence is idiomatic/well-formed/normal by asking it for the probability of each word appearing at the position it's in?

Does something like this exist? I was unable to figure out an easy way to do this using the source here
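This is essentially scoring a sentence by its per-token log-probabilities (the negated average, exponentiated, is the model's perplexity on the sentence, and lower perplexity suggests a more "normal" sentence). A minimal sketch of the idea with a toy bigram model standing in for GPT-2; the bigram probabilities and sentences below are invented for illustration, and with the real model you would take the softmax over the logits at each position instead.

```python
import math

# Toy bigram "language model": P(next word | current word).
# These probabilities are made up purely for illustration.
bigram = {
    ("<s>", "the"): 0.5, ("the", "cat"): 0.3, ("cat", "sat"): 0.4,
    ("<s>", "cat"): 0.05, ("cat", "the"): 0.01, ("the", "sat"): 0.02,
}

def sentence_score(words, floor=1e-6):
    """Average log-probability of each word given its left context.

    Higher (closer to 0) means the model finds the sentence more
    'normal'; GPT-2 would supply these probabilities per position.
    """
    context = ["<s>"] + words
    logps = [math.log(bigram.get((context[i], context[i + 1]), floor))
             for i in range(len(words))]
    return sum(logps) / len(logps)

# the well-formed order scores higher than the scrambled one
assert sentence_score(["the", "cat", "sat"]) > sentence_score(["cat", "the", "sat"])
```

With GPT-2 specifically, the usual recipe is to feed the whole sentence through the LM head and average the log-probabilities the model assigns to each actual next token.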

Hi guys, I'm stuck on a project. The requirement is to extract details from resumes. I explored this and found we can achieve it using NLP. Honestly, I don't have any background in Python programming or machine learning, and I badly need help. I got some code, but unfortunately I'm not able to figure out how to implement it. I tried the pyresparser and resume-classifier packages, but apart from gaining some knowledge I wasn't successful in achieving my goal. If anyone is willing to help, please mail me at Mohammedzuhair24@yahoo.com
Andrii Khakhariev
Hi all, I'd like to invite you to a live webinar by AWS and Provectus -- MLOps and Reproducible ML on AWS with Kubeflow and SageMaker. If you're interested, please register here: https://provectus.com/webinar-mlops-and-reproducible-ml-on-aws-with-kubeflow-and-sagemaker-october-2020/
I just joined the open source community and want to contribute to some open source project. Please suggest some.
I have hands-on experience with Python and machine learning.
hi everyone
can someone please explain what the j stands for in ∑_j w_j x_j?
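In ∑_j w_j x_j, the j is just an index that runs over the inputs: x_j is the j-th input, w_j is its weight, and the sum is their dot product. A tiny illustration with arbitrary numbers:

```python
# j runs over the inputs: the sum is w[0]*x[0] + w[1]*x[1] + w[2]*x[2]
w = [0.2, -0.5, 1.0]   # weights w_j (arbitrary example values)
x = [3.0, 2.0, 4.0]    # inputs  x_j

weighted_sum = sum(w_j * x_j for w_j, x_j in zip(w, x))
# 0.2*3 - 0.5*2 + 1.0*4 = 3.6
```

In a neuron, this weighted sum (often plus a bias) is what gets passed through the activation function.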
Siddharth Jain
Hello Guys,
Has anybody here used nltk in an AWS EMR Notebook?
I am able to install nltk packages but not able to download pickle files like punkt etc.
Please help me with the same.
Andrii Khakhariev
Hi guys, Join Provectus and AWS Nov. 18 for a live webinar: "Feature Store as a Data Foundation for ML". Learn why a scalable Feature Store is a key component of an advanced data infrastructure, enabling organizations to eliminate rework, enforce data traceability, and reduce both the cost of development and the time-to-market for ML models. Register: https://provectus.com/webinar-feature-store-as-data-foundation-for-ml-november-2020/
Hi everyone,
wiki says
"Sometimes, only when the Widrow-Hoff rule is applied to binary targets specifically is it referred to as the Delta Rule, but the terms seem to be used interchangeably. The delta rule is considered to be a special case of the back-propagation algorithm."
What's the difference between back-propagation algo and widrow-hoff learning rule? I am just confused about the terminology.
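Roughly: the Widrow-Hoff (delta) rule is the gradient-descent update for a single linear unit, Δw_j = η(t − y)x_j, while back-propagation generalizes the same chain-rule gradient to multi-layer networks by propagating the error backwards through hidden layers. A minimal sketch of the delta rule for one linear unit; the data and learning rate below are made up for illustration.

```python
# Delta rule for a single linear unit: w_j += eta * (t - y) * x_j.
# Backprop applies the same chain-rule gradient through hidden layers too.
eta = 0.1                      # learning rate (arbitrary)
w = [0.0, 0.0]                 # weights, start at zero
data = [([1.0, 0.0], 1.0),     # (inputs x, target t) - toy data
        ([0.0, 1.0], -1.0)]

for _ in range(100):           # a few epochs of updates
    for x, t in data:
        y = sum(wj * xj for wj, xj in zip(w, x))   # linear output
        for j in range(len(w)):
            w[j] += eta * (t - y) * x[j]           # delta-rule update

# w converges toward [1, -1], which reproduces the targets exactly
```

So when people say "delta rule", they usually mean exactly this single-layer case; back-propagation is what you need once there are hidden layers between input and output.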
hi, is this chatroom dummy-friendly?
Chlorine Pentoxide
Hello devs, I recently learned about the NEAT genetic algorithm and wrote my own implementation. I would love to verify the implementation and also want to test it rigorously. If anyone knows how to do so, I would be delighted (even pseudocode will serve the purpose). Thanks in advance!!
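A standard sanity check for NEAT implementations is the XOR benchmark: evolve networks against an XOR fitness function and confirm the population reliably reaches near-perfect fitness within a few hundred generations. A sketch of such a fitness function; the `net` callable is a hypothetical stand-in for however your implementation exposes a network's forward pass.

```python
# XOR truth table: inputs and targets for the classic NEAT benchmark.
XOR_CASES = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
             ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def xor_fitness(net):
    """Common NEAT fitness: 4 minus summed squared error over XOR cases.

    `net` is any callable mapping an input pair to a float output;
    a perfect network scores 4.0.
    """
    error = sum((net(x) - target) ** 2 for x, target in XOR_CASES)
    return 4.0 - error

# sanity check with a hand-written "network" that computes XOR exactly
assert xor_fitness(lambda x: float(x[0] != x[1])) == 4.0
```

Beyond XOR, it also helps to unit-test the pieces separately: innovation-number assignment, crossover of matching/disjoint/excess genes, and speciation distance.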
Andrii Khakhariev

Hey ML folks,
I'd like to invite you all to a live webinar, "MLOps and Data Quality: How to Deploy Reliable ML Models in Production," by Provectus and AWS. Time and place: online; February 24, 11 AM PT | 2 PM ET.
Register: https://provectus.com/webinar-mlops-and-data-quality-deploying-reliable-ml-models-feb-2021/

We will discuss what goes into building such fundamental components of machine learning infrastructure as:

  • Feature Store with reproducible data preparation pipelines
  • Reproducible experimentation & Model Training pipelines
  • Continuous Integration and Delivery for ML (MLOps)
  • Production monitoring and model re-training
  • Data Quality checks and Data Monitoring
We will also explore why data quality and metadata management are crucial to standardize and streamline machine learning life cycle management. Waiting for you at the webinar!
hey guys
anyone online?
Md. Younus Ahamed Shuvo
Should I split the dataset into training and test sets and then use K-fold cross-validation? Or use K-fold cross-validation on the whole dataset?
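The common practice is both: first hold out a test set, then run K-fold cross-validation only on the remaining training portion for model selection, and touch the held-out test set once at the very end. A small index-level sketch in plain Python; the dataset size and split ratio are arbitrary, and libraries like scikit-learn provide `train_test_split` and `KFold` for the same thing.

```python
import random

n = 100                                  # total samples (arbitrary)
idx = list(range(n))
random.seed(0)
random.shuffle(idx)

test_idx = idx[:20]                      # held-out test set (20%)
train_idx = idx[20:]                     # K-fold CV happens only here

k = 5
fold_size = len(train_idx) // k
for fold in range(k):
    val = train_idx[fold * fold_size:(fold + 1) * fold_size]  # validation fold
    fit = [i for i in train_idx if i not in val]              # train on the rest
    # ... fit model on `fit`, evaluate on `val` ...

# only after model selection: evaluate once on `test_idx`
```

Running CV on the whole dataset is fine if you only want an error estimate, but then you have no untouched data left for a final unbiased evaluation.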
Max Hager
Last message 2021?
Is this still working?