    tensorfoo
    @tensorfoo
    @Kanchan112 yeah, that should be totally doable with those resources. But you might have to reduce the max token size because the GPUs have less than 16 GB VRAM. Try 120k?
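    A minimal sketch of that suggestion, assuming a fairseq/Hydra-style config (such as base_10.yml) where dataset.max_tokens caps the batch size; the paths and key layout here are assumptions about your setup.
```python
# Hedged sketch: lower dataset.max_tokens in a fairseq/Hydra-style config so that
# each batch fits on GPUs with less than 16 GB of VRAM. Paths are assumptions.
from omegaconf import OmegaConf

cfg = OmegaConf.load("configs/base_10.yml")        # assumed location of the config
# Smaller batches mean lower peak GPU memory, at the cost of fewer samples per update.
cfg.dataset.max_tokens = 120_000                   # the "120k" suggested above
OmegaConf.save(cfg, "configs/base_10_lowmem.yml")  # launch training with this copy
```
    If you launch with fairseq-hydra-train, the same override can usually be passed on the command line instead of editing the file.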
    Kanchan112
    @Kanchan112
    @tensorfoo thank you, we will consider that!
    What does 1 update in the log file correspond to? Does it refer to an update after a pass through a batch?
    2 replies
    Aswin Pradeep
    @aswinpradeep
    @agupta54 Thanks for sharing the CLSRIL paper! Btw, what would be the recommended specs if we would like to experiment with fine-tuning?
    2 replies
    Aswin Pradeep
    @aswinpradeep
    Also, while going through your models repository, I can see only Kannada based on XLSR. Is there any specific reason for that?
    1 reply
    Aswin Pradeep
    @aswinpradeep
    @agupta54
    I can see two types of fine-tuning: one without Hydra and another with it. Which branch would you recommend, and can you give a short comment on the key difference between the two methods? Also, please recommend an infra setup to try out fine-tuning on <50 hr of data.
    1 reply
    Ajitesh Sharma
    @aj7tesh
    @agupta54 @soujyo How do we enable real-time transcript generation in the Speech Recognition Open API, assuming the audio comes from a live mic? Is it processed in JavaScript and then sent in smaller buffer chunks to the speech recognition server? If not, how is it done?
    1 reply
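    A minimal sketch of the buffered-streaming idea raised in the question: capture mic audio, cut it into short chunks, and push each chunk to a streaming ASR endpoint over a websocket. The endpoint URL and message protocol below are assumptions, not the actual Vakyansh API.
```python
# Hedged sketch of chunked streaming from a live mic to a hypothetical websocket
# ASR endpoint. Requires the sounddevice and websocket-client packages.
import sounddevice as sd
import websocket  # pip install websocket-client

SAMPLE_RATE = 16000          # wav2vec2 models generally expect 16 kHz mono audio
CHUNK_SECONDS = 0.5          # shorter chunks give lower latency but more messages
FRAMES = int(SAMPLE_RATE * CHUNK_SECONDS)

ws = websocket.create_connection("ws://localhost:8000/stream")  # hypothetical URL

with sd.RawInputStream(samplerate=SAMPLE_RATE, channels=1, dtype="int16") as stream:
    for _ in range(20):                           # stream ~10 seconds for the demo
        chunk, _overflowed = stream.read(FRAMES)  # raw 16-bit PCM bytes
        ws.send_binary(bytes(chunk))              # server decodes incrementally
        print(ws.recv())                          # partial transcript (assumed reply)

ws.close()
```
    In a browser, the same idea applies: capture PCM chunks in JavaScript (for example via the Web Audio API) and send them over a websocket while the server decodes incrementally.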
    Mirishkar S Ganesh
    @mirishkarganesh
    Hi @agupta54 @srajat84, I am trying to build a pretrained model and have run into the following error. Please go through it and help me solve it.
    1 reply
    SUJIT SAHOO
    @Sujit27
    Hi @agupta54. I was going through the Speaker Clustering work that you have done here. A link there points me to the Resemblyzer repo, which has the voice encoder used to create the embeddings from audio. But I could not find any link to the repo where you have actually done the clustering and the subsequent steps that you mention on the page. Do you have such a repo?
    3 replies
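    For the embeddings-then-clustering step described in the question, a generic sketch using the Resemblyzer voice encoder plus scikit-learn; this is not the team's actual pipeline, and the folder name and cluster count are assumptions.
```python
# Hedged sketch: embed each utterance with Resemblyzer's VoiceEncoder, then group
# the embeddings with a standard clustering algorithm from scikit-learn.
from pathlib import Path
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav
from sklearn.cluster import AgglomerativeClustering

wav_paths = sorted(Path("utterances").glob("*.wav"))   # assumed audio folder
encoder = VoiceEncoder()
embeddings = np.array([encoder.embed_utterance(preprocess_wav(p)) for p in wav_paths])

# Group utterances by speaker; the number of clusters here is just a placeholder.
labels = AgglomerativeClustering(n_clusters=5).fit_predict(embeddings)
for path, label in zip(wav_paths, labels):
    print(label, path.name)
```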
    Ankur Bhatia
    @ankurbhatia24
    Hi Team,
    I was setting up the Intelligent Data Pipeline (https://open-speech-ekstep.github.io/intelligent_data_pipelines). I have set up the infra for it (Composer environment) and modified the configs that I could understand. Now my question is: how do I start running the pipeline on a test audio folder that I have uploaded to my Google Cloud bucket? How does the process start?
    Also, some parts of the documentation on setting up the environment variables were unclear. It would be great if someone could help me with that.
    1 reply
    Anchal Jaiswal
    @Anchal5604218_twitter
    Hi Team,
    We are trying to fine-tune your Hindi pre-trained model for our own use case. Our requirement is to transcribe audio that is slightly code-mixed (the majority is in Hindi with some English words in between). We have already tried the APIs of the big cloud service providers, and they don't do a great job of recognizing code-mixed audio. My question is: would it be possible to fine-tune the Vakyansh model to better recognize code-mixed audio, given that we are preparing around 200 hours of data for fine-tuning?
    Thanks in advance
    MRG
    @gurjar112
    Hi @agupta54
    I have a few queries.
    1. I wanted to generate a custom model using the script provided in the Vakyansh repo, but the custom model is not getting generated; the Custom_model repo is blank.
      Please let me know where I am going wrong.
    2. How is inference done in real time?
    3. What strategies have you used to deploy a model?
      Thanks
    1 reply
    alicekile-tw
    @alicekile-tw
    Hello, is this website hosted somewhere?
    Shahzeb Ali
    @ShahzebAli42
    Hi Vakyansh, you have done fabulous work; I have been following your wav2vec2 updates.
    Right now I am fine-tuning a model. Can anyone please tell me why I am getting a CUDA out-of-memory error after some epochs, and what change I can make in the base_10.yml config to overcome this?
    Can anyone also describe the parameters in base_10h.yml? I am getting good transcription results with the 26th-epoch weights but bad results with the 43rd-epoch weights.
    1 reply
    Andrew Lauder
    @AndrewLauder
    @srajat84 We are also looking to collect more languages; please PM me.
    SUJIT SAHOO
    @Sujit27
    Hi @agupta54 What is the difference between the 'Single Model for Inference' and the 'Finetuned model'? I see that both are .pt files, and the latter is a much bigger file. I have been using the Hindi model (hindi_infer.pt, v2-hydra branch) for the last 4 months. Has there been any new release of Hindi models?
    1 reply
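    One way to see where the size difference between the two .pt files comes from is to inspect the checkpoint keys: typically a full fine-tuned checkpoint also carries optimizer/training state, while an inference-only export keeps just the model. The file names and key names below are assumptions about your downloads.
```python
# Hedged sketch: compare the top-level contents of the two checkpoint files.
import torch

full = torch.load("hindi_finetuned.pt", map_location="cpu")   # assumed file name
print(list(full.keys()))    # often includes optimizer/training state alongside 'model'

infer = torch.load("hindi_infer.pt", map_location="cpu")
print(list(infer.keys()))   # typically just the model weights plus config
```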
    aman313
    @aman313
    Hi Vakyansh team. Congrats on the excellent work you guys have done. Is there a way to get access to the data that has been used for training these models?
    1 reply
    Kanchan112
    @Kanchan112
    Hi Team,
    Can we know what Nepali-language data you used for pretraining as well as fine-tuning?
    3 replies
    Sourya Kakarla
    @ma08
    Hi! Just got to know about Vakyansh.
    Anyone around right now?
    I am going through the docs and the repos to learn more. I have a broad question: is Kaldi used in any of Vakyansh's recipes?
    2 replies
    Vasista Lodagala
    @vasistalodagala
    Hi Vakyansh Team. The download links for the Telugu labelled data (1024.93 hours) are not working. Links are available at https://github.com/Open-Speech-EkStep/ULCA-asr-dataset-corpus#telugu-labelled-total-duration-is-102593-hours. They have probably expired now. Request you to look into this issue and restore access.
    Jason Black
    @jeb-orcl
    Greetings, Vakyansh Team. Could someone address the question at Open-Speech-EkStep/ULCA-asr-dataset-corpus#4 about permissions for this data? Thank you.
    Chakshu Gautam
    @ChakshuGautam

    Hey Vakyansh team, I was trying out the demo site (https://inference.vakyansh.in/hindi), but the mic is stuck in the "starting" state. I also tried uploading a WAV file at the bottom and got the message "invalid file". Sharing the file as well: https://github.com/Code4GovTech/curtain-raiser/blob/master/recording.wav?raw=true
    Let me know how I can test this. Thanks.
    3 replies
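    If the upload is being rejected, one common cause (an assumption here, not a documented requirement of the demo) is a sample-rate, channel, or encoding mismatch; re-encoding to 16 kHz mono 16-bit PCM WAV is a quick thing to try.
```python
# Hedged sketch: re-encode a recording to 16 kHz mono 16-bit PCM WAV before upload.
import librosa
import soundfile as sf

audio, _ = librosa.load("recording.wav", sr=16000, mono=True)   # resample + downmix
sf.write("recording_16k.wav", audio, 16000, subtype="PCM_16")
```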