    Rajat Singhal
    @srajat84
    Vakyansh Documentation: https://open-speech-ekstep.github.io/
    vis27
    @vis27
    Hi Rajat
    I wanted to test some languages (like Hindi, English, etc.) on your pre-trained model.
    Could you please help me with the steps?
    Anirudh Gupta
    @agupta54
    Hi @vis27, you can follow the steps mentioned in this repository to use the models on your own audio files: https://github.com/Open-Speech-EkStep/vakyansh-wav2vec2-experimentation
    Check the steps under inference.
    vis27
    @vis27
    Thanks
    meeramohan-2020
    @meeramohan-2020
    Hi, I'm Meera. I would like to extend your model to collect speech datasets in other languages. Is there any alternative solution to Google Cloud bucket storage?
    Rajat Singhal
    @srajat84
    Hi Meera, I assume you are looking for other object storage (like S3) support for the Intelligent Data Pipelines, right?
    Rajat Singhal
    @srajat84
    If it is the data pipeline, we can extend it by adding support for other storage types here: https://github.com/Open-Speech-EkStep/audio-to-speech-pipeline/tree/master/packages/ekstep_data_pipelines/common/infra_commons/storage
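For readers following along, extending the pipeline's storage layer would roughly mean implementing a common interface per backend. A minimal sketch in plain Python; the `BaseStorage`/`InMemoryStorage` names and method signatures here are illustrative assumptions, not the actual classes in the linked repository:

```python
from abc import ABC, abstractmethod


class BaseStorage(ABC):
    """Illustrative storage interface; real class names in the repo differ."""

    @abstractmethod
    def upload(self, local_path: str, remote_path: str) -> None:
        ...

    @abstractmethod
    def download(self, remote_path: str, local_path: str) -> None:
        ...

    @abstractmethod
    def list_files(self, prefix: str) -> list:
        ...


class InMemoryStorage(BaseStorage):
    """Toy backend standing in for an S3/GCS implementation."""

    def __init__(self):
        self._blobs = {}  # remote_path -> bytes

    def upload(self, local_path, remote_path):
        with open(local_path, "rb") as f:
            self._blobs[remote_path] = f.read()

    def download(self, remote_path, local_path):
        with open(local_path, "wb") as f:
            f.write(self._blobs[remote_path])

    def list_files(self, prefix):
        return [k for k in self._blobs if k.startswith(prefix)]
```

A new backend (e.g. S3 via boto3) would subclass the same interface, so the rest of the pipeline never needs to know which object store is in use.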
    Meera Mohan
    @meera_mohan:matrix.org
    Thank you sir
    Rajat Singhal
    @srajat84
    @meera_mohan:matrix.org we are already collecting datasets in many languages for pretraining. Which dataset are you looking for exactly, and how much? Please send your dataset requirements.
    vis27
    @vis27

    Hi @agupta54, I was following the steps at the link you provided, but at the bash start_pretraining_base.sh step I am getting the error below:
    "RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx"
    I assume pretraining the model requires an NVIDIA GPU, so I switched to Google Colab to leverage a GPU, but there I got stuck at the last step of the wav2letter module, pip install -e .
    The error was:
    "ERROR: Command errored out with exit status 1: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/content/wav2letter/bindings/python/setup.py'"'"'; file='"'"'/content/wav2letter/bindings/python/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output."

    Hence I am unable to complete my testing. Could you please help me out with either of these errors?

    Anirudh Gupta
    @agupta54
    @vis27 Pretraining is a very computationally intensive process, and it wouldn't make much sense to do it on a CPU. We assumed that NVIDIA drivers are already set up.
    On Google Colab, please make sure that you execute any shell command with a ! at the beginning of the command, or use %%bash in cells that consist strictly of shell commands (in your case, pip).
    Even then there can be compatibility issues, since wav2letter requires a particular version of the C++ compiler, and it will be a little tricky to change the C++ version inside Colab.
    vis27
    @vis27
    @agupta54 Of course I was using ! at the beginning of shell commands, and I didn't face any issues when installing the fairseq and kenlm packages in Colab. It was the wav2letter package that was causing the issue; maybe it is due to a C++ compiler issue.
    Another query I have: if I am using a Docker environment to execute the above commands, should I install the NVIDIA driver inside the Docker environment as well? My laptop has an NVIDIA driver, but I guess it was not accessible from inside the Docker environment.
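On the Docker question above, some context: the NVIDIA driver is installed on the host only, and containers reach the GPU through the NVIDIA Container Toolkit rather than by installing a driver inside the image. A sketch of the usual setup; the CUDA image tag is just an example:

```shell
# On the host: install the NVIDIA driver and the NVIDIA Container Toolkit.
# Then pass the GPU through when starting the container (Docker 19.03+):
docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
# If nvidia-smi lists the host GPU here, code inside the container
# (e.g. PyTorch) will be able to use CUDA as well.
```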
    Mirishkar S Ganesh
    @mirishkarganesh
    Hi Team Vakyansh
    How can I test the models on my own audio data? Can you help me with an API, if you have one?
    Anirudh Gupta
    @agupta54
    Hi @mirishkarganesh what is the duration of audio you want to test?
    Rishi Lulla
    @rishi-lulla
    Hi, I am new to Vakyansh. I would like to use the Hindi pre-trained model on some audio files I have. Can you guide me on how to get started? Is there a step-by-step process available for using the pre-trained model?
    Mirishkar S Ganesh
    @mirishkarganesh
    @agupta54 I would like to have transcripts for 20-30 seconds of audio data
    Anirudh Gupta
    @agupta54
    @rishi-lulla The pre-trained model will not be able to produce any text; it produces a representation of the audio. You can get that representation simply by loading the model in PyTorch and doing a forward pass. If your goal is to generate text from audio, you will have to use the fine-tuned model.
    @mirishkarganesh For that you can use this repo https://github.com/Open-Speech-EkStep/vakyansh-wav2vec2-experimentation and use the single_file_inference.sh code. All installation and usage instructions are given.
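As background for the pre-trained vs. fine-tuned distinction discussed above: the fine-tuned (CTC) model predicts one label per audio frame, and a decoding step collapses those frames into text. A minimal greedy CTC collapse in plain Python; the label set and blank symbol are illustrative, not Vakyansh's actual dictionary:

```python
def ctc_greedy_decode(frame_labels, blank="<pad>"):
    """Collapse per-frame CTC predictions: merge repeated labels, drop blanks."""
    out = []
    prev = None
    for label in frame_labels:
        # Only emit when the label changes and is not the CTC blank token.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)

# Per-frame argmax labels for a short utterance (illustrative):
frames = ["<pad>", "h", "h", "<pad>", "i", "i", "<pad>"]
print(ctc_greedy_decode(frames))  # -> hi
```

In the real pipeline this collapse is done by the Viterbi or KenLM decoder over the fine-tuned model's logits; the pre-trained model alone stops before this step, which is why it yields representations rather than text.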
    Mirishkar S Ganesh
    @mirishkarganesh
    Hi @agupta54, I tried to install the repo you mentioned above. At wav2letter I got stuck at the last step of the module, pip install -e . The error was:
    ERROR: Command errored out with exit status 1: /home/asr/anaconda3/envs/ekstp/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/asr/wav2letter/bindings/python/setup.py'"'"'; file='"'"'/home/asr/wav2letter/bindings/python/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
    Gowtham.R
    @gowtham1997

    Hi Vakyansh team,

    Thanks for the amazing work on CLSRIL-23
    Do you have the statistics/distribution of how many hours of pretraining data were used for each of the 23 Indic languages?

    Surendrasingh Sucharia
    @surendrasinghs

    I (as a product manager) would love to understand Vakyansh. Can you help me understand its various building blocks?

    It would be great if you could share:

    • Are there any user-facing apps which are leveraging the microservices?
    • A list of the microservices
    • Any pluggable tools or widgets?
    Aswin Pradeep
    @aswinpradeep
    (attached screenshot: client error output)
    Hi team, I am pretty new to gRPC.
    I cloned the open API repo and ran server.py; it seems the model is loaded and the server is running fine.
    However, when I edit the IP address in main.py to make it access localhost, running the client code gives out this error:
    soujyo
    @soujyo
    @aswinpradeep It should work fine if you use '127.0.0.1:50051' in main.py
    Aswin Pradeep
    @aswinpradeep
    (attached screenshot: same error)
    It's still the same @soujyo
    soujyo
    @soujyo
    It runs fine at my end. Let's connect on this once tomorrow.
    Aswin Pradeep
    @aswinpradeep
    ok sure
    Aswin Pradeep
    @aswinpradeep
    @soujyo
    How can we connect? Shall I drop you an e-mail?
    soujyo
    @soujyo
    Sure, something around 4 pm would be good.
    Aswin Pradeep
    @aswinpradeep
    @agupta54
    I have been trying to run the open API directly using the fine-tuned models downloaded from the models repo, and they were not working.
    However, I tried generating a custom model using the script in the wav2vec experimentation repo, and things are fine. Can you help me understand why this custom_model generation is necessary and what is basically happening in this phase?
    Ajitesh Sharma
    @aj7tesh

    Hi @agupta54, this is Ajitesh from Team Anuvaad. There are a couple of problems we are facing. Soujyo helped with some of them, but a few are still there. Kindly let us know how to solve them:

    1. To run the SRT generation I have to comment out this import:
      from inverse_text_normalization.run_predict import inverse_normalize_text
      in model_service.py and set enableInverseTextNormalization=False in examples/python/main.py

      Otherwise, with ITN enabled, I get the following error:

      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/inverse_text_normalization/hi/taggers/tokenize_and_classify_final.py", line 20, in <module>
      from inverse_text_normalization.hi.graph_utils import GraphFst, delete_extra_space, delete_space
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/inverse_text_normalization/hi/graph_utils.py", line 53, in <module>
      suppletive = pynini.string_file(get_abs_path(data_path + 'suppletive.tsv'))
      File "extensions/_pynini.pyx", line 1042, in _pynini.string_file
      File "extensions/_pynini.pyx", line 1118, in _pynini.string_file
      _pywrapfst.FstIOError: Read failed

    2. I was trying to generate SRTs for audio files of various lengths. In most cases it worked when the length was <1 min with punctuation enabled.
      However, with a file of around 3 min, the SRT was printed, but in the last step I got the error below, which I think is related to punctuation, because when I set it to false it works.

      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/punctuate/punctuate_text.py", line 76, in get_tokens_and_labels_indices_from_text
      output = self.model(input_ids)
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
      result = self.forward(*input, **kwargs)
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 149, in forward
      return self.module(*inputs, **kwargs)
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
      result = self.forward(*input, **kwargs)
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/transformers/models/albert/modeling_albert.py", line 1069, in forward
      outputs = self.albert(
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
      result = self.forward(*input, **kwargs)
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/transformers/models/albert/modeling_albert.py", line 685, in forward
      embedding_output = self.embeddings(
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
      result = self.forward(*input, **kwargs)
      File "/home/ajitesh/anaconda3/envs/server-vakyansh/lib/python3.8/site-packages/transformers/models/albert/modeling_albert.py", line 239, in forward
      embeddings = inputs_embeds + position_embeddings + token_type_embeddings
      RuntimeError: The size of tensor a (848) must match the size of tensor b (512) at non-singleton dimension 1

    3. What are the ideal system requirements for hosting the open speech API service with these fine-tuned models? CPU as well as GPU?
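On the punctuation tensor-size error above: ALBERT-style models cap inputs at 512 positions, so an 848-token sequence overflows the position embeddings. A common workaround is to chunk the token sequence before punctuating and stitch the results back together. A minimal sketch in plain Python; the window size and overlap are illustrative choices, not values from the punctuate package:

```python
def chunk_tokens(tokens, max_len=512, overlap=32):
    """Split a token list into windows no longer than max_len.

    Consecutive windows share `overlap` tokens so punctuation decisions
    near a boundary still see some context on both sides.
    """
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += max_len - overlap
    return chunks

# An 848-token input (the size from the traceback) splits into legal windows:
tokens = [f"tok{i}" for i in range(848)]
print([len(c) for c in chunks] if (chunks := chunk_tokens(tokens)) else [])  # -> [512, 368]
```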
    Mirishkar S Ganesh
    @mirishkarganesh
    Traceback (most recent call last):
    File "../../utils/inference/single_file_inference.py", line 386, in <module>
    result = parse_transcription(args_local.model, args_local.dict, args_local.wav, args_local.cuda, args_local.decoder, args_local.lexicon, args_local.lm_path, args_local.half)
    File "../../utils/inference/single_file_inference.py", line 361, in parse_transcription
    model.cuda()
    AttributeError: 'dict' object has no attribute 'cuda'
    Can you help me with the above issue?
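On the 'dict' object has no attribute 'cuda' error above: torch.load on a fairseq checkpoint typically returns a plain Python dict rather than an nn.Module, so calling .cuda() on it fails; the model object must be constructed first and the state dict loaded into it. The shape of the problem, illustrated without torch (the checkpoint keys here are assumptions about a typical fairseq checkpoint, not verified against this file):

```python
# A fairseq checkpoint loads as a plain dict, roughly shaped like this:
checkpoint = {
    "model": {"encoder.weight": [0.1, 0.2]},  # real checkpoints hold tensors
    "cfg": {"task": "audio_pretraining"},
}

# Calling a Module method on the dict is what triggers the AttributeError:
assert not hasattr(checkpoint, "cuda")

# The fix is to pull out the state dict and load it into a constructed model,
# e.g. model.load_state_dict(checkpoint["model"]) after building `model`,
# and only then call model.cuda().
state_dict = checkpoint["model"]
print(sorted(state_dict))  # -> ['encoder.weight']
```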
    Akash Singh
    @singhaki

    Hi team,
    I was trying to generate a Hindi pretrained model using generate_custom_model.sh, but the generated model is giving blank output.

    Also,
    First, I was trying to generate a custom model from hindi_finetuned_4k; there was an error for w2v_path, so I passed hindi_pretrained_4k. The model got converted, but while inferencing on a single audio file it was giving blank output. Which pretrained checkpoint should be used for custom model generation?

    Second, I have also tried to convert the model to a Hugging Face Transformers model, but the converted model was giving random output. How can we convert to Hugging Face? I have tried converting a fairseq English model to Hugging Face, and it got converted.

    tensorfoo
    @tensorfoo
    Hi guys, just came across Vakyansh. Really impressed so far, well done!