    quantotto
    @quantotto
    @mfoglio I spent a significant amount of time trying to extract MVs without decoding, but it is not currently possible. decode() must be called on a packet to extract the frame side data (which in turn contains the motion vectors).
    quantotto
    @quantotto
    @mfoglio and I am pretty happy with the performance. I do that in real time from RTSP, reading data from 3 cameras on a Raspberry Pi, and it didn't max out the CPU. RAM might be more of a bottleneck depending on frame sizes, but I avoided accumulating the frames themselves in memory and it was fine even with UltraHD video.
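    (For reference, a minimal sketch of that approach; the file name is a placeholder, and the mechanism for enabling export_mvs and the side-data key name may differ between PyAV versions:)

    import av

    container = av.open("input.mp4")
    stream = container.streams.video[0]
    # ask the decoder to export motion vectors as frame side data
    stream.codec_context.options = {"flags2": "+export_mvs"}

    for frame in container.decode(stream):
        mvs = frame.side_data.get("MOTION_VECTORS")  # None if the frame carries no MVs
        if mvs is not None:
            print(frame.pts, len(mvs))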
    Rana Banerjee
    @Rana-Euclid
    Hi guys, thanks a lot for your efforts in creating this great library! I would appreciate it a lot if anyone could help me with my use case: I am consuming a live RTSP stream, processing the frames in OpenCV, and now would like to stream the processed frames individually, but in as compressed a form as possible, transmit them to a server and then play them in a browser on the client side.
    I am trying to encode an individual frame in H.264 and send the encoded packet as-is without wrapping it in any container. I plan to use these packets directly in a JavaScript player. So I need help with 2 things:
    1. How do I convert a numpy image into an H.264-encoded packet? I understand I may have to input X number of frames before I get any output, and that is fine. But hopefully every frame after that would give me a packet output that maps to the frame that was input X frames ago. Is my understanding correct? Is this the best way to get the most compression in real time, and how exactly do I do this?
    2. Once the packet is received at the other end, what player can be used to play it in a browser? Do I need to put it in a container for this? The issue is that it will be a live stream with continuous packets coming in without any end time.
    quantotto
    @quantotto
    @Rana-Euclid regarding (1), this is how I do it (frames is a list of images, i.e. numpy arrays). Maybe it is not precisely what you need, but you get the idea...
    def encode_frames(self, calculate_bitrate: bool, options: dict) -> bytes:
        """Encodes frames and returns a movie buffer.

        Args:
            calculate_bitrate (bool)
            options (dict): extra options for the av library

        Returns:
            bytes: movie buffer
        """
        # requires `import io` and `import av` at module level
        container = "mp4"
        frames = self.all_frames()
        if not frames:
            return bytes()
        mp4_bio = io.BytesIO()
        mp4_bio.name = f"out.{container}"
        new_options = options.copy()
        new_options["video_track_timescale"] = str(self._samplerate)
        mp4_container = av.open(
            mp4_bio,
            mode="w",
            format=container,
            options=new_options
        )
        # copy codec parameters (extradata, bit rate) from the original input stream
        self._bio.seek(0)
        with av.open(self._bio, mode="r") as video_in:
            in_stream = video_in.streams.video[0]
            stream = mp4_container.add_stream("h264", self.FPS)
            stream.pix_fmt = "yuvj420p"
            stream.extradata = in_stream.extradata
            stream.bit_rate = in_stream.bit_rate
        stream.width = frames[0].shape[1]
        stream.height = frames[0].shape[0]
        for frame in frames:
            frame_format = "rgb24" if frame.ndim >= 3 else "gray"
            video_frame = av.VideoFrame.from_ndarray(frame, format=frame_format)
            for packet in stream.encode(video_frame):
                mp4_container.mux(packet)
        # flush the encoder
        for packet in stream.encode(None):
            mp4_container.mux(packet)
        mp4_container.close()
        return mp4_bio.getvalue()
    Regarding (2), I had a similar requirement and used MPEG-DASH, but that is far beyond the scope of this library.
    Rana Banerjee
    @Rana-Euclid
    Thanks for the quick reply @quantotto. I see that the frames are being encoded into an mp4 container. Is that absolutely necessary? Can't I create and transmit packets without wrapping them in a container?
    quantotto
    @quantotto
    @Rana-Euclid you don't have to use "mp4"; it can be "mpegts" or others. I believe it can be anything you see in the output of the ffmpeg -formats command.
    Rana Banerjee
    @Rana-Euclid
    @quantotto Thanks a lot! Much appreciated. I will check it out further.
    Rana Banerjee
    @Rana-Euclid
    Sorry about the trouble, but the issue is that I need to reduce my network bandwidth while keeping the system real-time. So I need to transmit an encoded packet for every frame. The moment I wrap the packets in a container (flv or mp4, etc.) the size increases. Hence I want to just transmit the packets without wrapping them in any container. Is this possible in the first place with PyAV or ffmpeg?
    quantotto
    @quantotto
    @Rana-Euclid try using format="h264". It will create a raw H.264 video buffer (no container, just the bytestream).
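    (For reference, a minimal sketch of that, assuming frames is a list of HxWx3 uint8 numpy arrays at a fixed 640x480 and 30 fps; the names are placeholders:)

    import io

    import av

    raw_bio = io.BytesIO()
    # the "h264" format writes a raw Annex B bytestream, with no container around it
    out = av.open(raw_bio, mode="w", format="h264")
    stream = out.add_stream("h264", rate=30)
    stream.width, stream.height = 640, 480
    stream.pix_fmt = "yuv420p"

    for img in frames:
        video_frame = av.VideoFrame.from_ndarray(img, format="rgb24")
        for packet in stream.encode(video_frame):
            out.mux(packet)

    for packet in stream.encode(None):  # flush the encoder
        out.mux(packet)
    out.close()

    raw_h264 = raw_bio.getvalue()  # the raw bytestream to send to the client

    Since the raw "h264" muxer essentially just concatenates packets, you could also send each packet's payload (bytes(packet)) as it comes out of the encoder instead of collecting the whole buffer.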
    bruno messias
    @devmessias

    Hi. Is it possible to use a PyAV VideoFrame to stream to YouTube using the RTMP protocol?

    With ffmpeg it's possible using something like this:

    import subprocess 
    import cv2
    
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
    
    command = ['ffmpeg',
                '-f', 'rawvideo',
                '-pix_fmt', 'bgr24',
                '-s','640x480',
                '-i','-',
                '-ar', '44100',
                '-ac', '2',
                '-acodec', 'pcm_s16le',
                '-f', 's16le',
                '-ac', '2',
                '-i','/dev/zero',   
                '-acodec','aac',
                '-ab','128k',
                '-strict','experimental',
                '-vcodec','h264',
                '-pix_fmt','yuv420p',
                '-g', '50',
                '-vb','1000k',
                '-profile:v', 'baseline',
                '-preset', 'ultrafast',
                '-r', '30',
                '-f', 'flv', 
                'rtmp://a.rtmp.youtube.com/live2/[STREAMKEY]']
    
    pipe = subprocess.Popen(command, stdin=subprocess.PIPE)
    
    while True:
        _, frame = cap.read()
        pipe.stdin.write(frame.tobytes())  # tostring() is deprecated in newer numpy
    
    pipe.kill()
    cap.release()
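    (A rough PyAV sketch of the same pipeline, video only, with the same 640x480 at 30 fps webcam assumptions; YouTube normally also expects an audio track, which is omitted here:)

    import av
    import cv2

    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

    output = av.open("rtmp://a.rtmp.youtube.com/live2/[STREAMKEY]", mode="w", format="flv")
    stream = output.add_stream("h264", rate=30)
    stream.width, stream.height = 640, 480
    stream.pix_fmt = "yuv420p"
    stream.options = {"preset": "ultrafast", "profile": "baseline", "g": "50"}

    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            video_frame = av.VideoFrame.from_ndarray(frame, format="bgr24")
            for packet in stream.encode(video_frame):
                output.mux(packet)
    finally:
        for packet in stream.encode(None):  # flush
            output.mux(packet)
        output.close()
        cap.release()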
    Miguel Martin
    @miguelmartin75
    Is it possible to set the start time of an output audio or video container or do I have to generate blank audio/video frames to emulate this?
    szucsba
    @szucsba1998

    Hello!
    I would like to restream an RTMP stream to another one, and if the input stream stops I want to display a static black image. Could you please guide me on how to start with it? The best would be if I could overlay the input stream onto a static image. The restreaming works, but I do not know how to make a black image packet to mux into the output stream when the input is over. Here is my code:

    
    output_url= "rtmp://server.hu/main/test_input"
    input_url = 'rtmp://server.hu/main/192.168.1.104'
    container_in = av.open(input_url)
    container_out = av.open(output_url, format='flv', mode='w')
    
    
    
    video_stream_input = container_in.streams.video[0]
    audio_stream_input = container_in.streams.audio[0]
    
    video_stream_output = container_out.add_stream(template=video_stream_input)
    audio_stream_output = container_out.add_stream(template=audio_stream_input)
    # stream.width = 480
    # stream.height = 320
    video_stream_output.width = container_in.streams.video[0].codec_context.width
    video_stream_output.height = container_in.streams.video[0].codec_context.height
    video_stream_output.pix_fmt = container_in.streams.video[0].codec_context.pix_fmt
    # audio_stream = container.add_stream(codec_name="aac", rate=16000)
    the_canvas = np.zeros((video_stream_output.height, video_stream_output.width, 3), dtype=np.uint8)
    the_canvas[:, :] = (32, 32, 32)  # some dark gray background because why not
    my_pts = 0
    
    
    while True:
        packet = next(container_in.demux(video_stream_input, audio_stream_input),None)
    
        # packet = None
        if packet is None:
            print("stream is not running")
            #TODO static black image
        else:
            if packet.stream.type == 'video':
                packet.stream = video_stream_output
                container_out.mux(packet)
            elif packet.stream.type == 'audio':
                packet.stream = audio_stream_output
                container_out.mux(packet)

    Thanks in advance!

    Mahesh
    @maheshgawali

    I have an nginx RTMP server and am trying to save and read the stream from it for audio processing. For saving the stream to a file I am using the record all option in the nginx conf.

    For reading the stream I am using av.open(<rtmp_path>) in Python code, but the start of the file saved by nginx vs. the one read by the Python code differs by 10-15 seconds, the nginx file recording being ahead of the av one.

    Is there a way to reduce this lag?
    One option would be to read from the file being written by nginx - I tried using av.open(<nginx_file_recording>) on that file and kept getting av.error.EOFError, which means reading an active file that is still being written to won't work out of the box due to seeking issues, hence I'm wondering how to reduce the lag between both sources.
    The last resort is to not save the file from nginx itself, but to do it from the Python script. I wanted to avoid this, as the nginx way is a good fallback in case the script doesn't work or errors out for some reason in production, and I don't want to lose those 15 seconds of data either.

    My nginx conf looks like this:

    rtmp {
        server {
            listen 1935;
            application live {
                live on;
                wait_key on;
                wait_video on;
                record all;
                record_path /home/tmp/recordings;
                allow publish 127.0.0.1;
                allow play 127.0.0.1;
            }
        }
    }
    quantotto
    @quantotto

    @maheshgawali as an option, why not configure nginx RTMP to record in chunks? For example, set record_suffix -%d-%b-%y-%T.flv; and record_max_size 256K;, or maybe use chunked recording as follows:

    recorder chunked {
        record all;
        record_unique on;
        record_interval 15s;
        record_path /var/rec/chunked;
    }

    (change the values as you see fit) and then you can read the files with PyAV after they have been closed / finalized (when a new file appears, you can safely read the previous one/s)

    Mahesh
    @maheshgawali
    @quantotto : recording in chunks will also add delay, in this case a 15 second delay, and I want to capture the stream as soon as possible. I already explored this approach and tried reducing the interval down to 2 seconds.
    Time for the stream to reach nginx RTMP + recording interval + post-processing: this accumulates to more than 2 seconds, and for a real-time use case we cannot go beyond 2 seconds as the user experience deteriorates.
    I'll explore using the option record_interval 1s.
    Also, is there a way to determine from rtmp_container = av.open(<rtmp_stream_path>) whether an RTMP stream has stopped?
    sunset2017
    @sunset2017
    Hi folks, I was trying to generate some videos, but when I encode the packets, the timestamp (pts) seems to get messed up by the output video stream. The timestamp becomes 16x the original timestamp. I suspect it's due to Stream.time_base, but even after I set it explicitly, it's still not working. Has anyone had this issue before?
    quantotto
    @quantotto
    @sunset2017 I think I ran into something similar as well, and after some trial and error I found an option that you can pass to the container: video_track_timescale. It should be 1/time_base; I normally read time_base from the input stream. It's not guaranteed to solve your issue, but it's worth trying.
    samplerate = int(1 / in_stream.time_base)
    options = {}
    ...
    options["video_track_timescale"] = str(samplerate)
    mp4_container = av.open(
        mp4_bio,
        mode="w",
        format=container,
        options=options
    )
    ...
    sunset2017
    @sunset2017
    thanks @quantotto !!
    szucsba
    @szucsba1998

    Hello!

    How do I generate an empty/silent 'aac' frame?
    I've tried to do it the numpy way, but it says that converting from a numpy array with format 'aac' is not yet supported.

    Samuel Lindgren
    @samiamlabs
    Hi! I'm trying to use h264_v4l2m2m to encode on a Raspberry Pi 4 with 64-bit OS, but am getting a segfault in the codec.encode(frame) function. Not sure if I'm configuring something incorrectly or if it is simply not supported. Does anyone here know?
    szucsba
    @szucsba1998

    Hi!

    I want to skip my input stream's frames to the latest one. I mean, I process a few different video streams concurrently and I switch between them regularly, and I want to stay synchronized in time. I tried inputcontainer.seek(0) but it throws av.error.ValueError: [Errno 22] Invalid argument. So my question is: how do I use seek() properly to skip to the live part of the stream? Here is the corresponding code part:

    s_in = streams[actualStream]
    v_in = s_in.streams.video[0]
    a_in = s_in.streams.audio[0]
    
    if v_in is not None: v_in.seek(0)
    if a_in is not None: a_in.seek(0)
    
    for frame in s_in.decode(v_in,a_in):
    quantotto
    @quantotto

    @szucsba1998 are those RTSP streams? Since you are dealing with a live stream, there is no classic seek concept. Once you have established a connection, streaming starts and the receiving end should read packets in the order received. demux and decode basically return generators that yield packets or frames, respectively, as they become available. If processing lags behind, buffers start filling up. I have seen some gstreamer options coupled with opencv to maintain a buffer of a single frame, but not with ffmpeg. With that, what I normally do in this situation is create multiple threads (or processes in Python, to avoid GIL issues), read packets or frames ASAP and put them into a queue, and then if I need only the latest frame, I can access the queue's tail. Alternatively, you could limit the queue size so that only the few latest frames are there when you read from the queue, and this way not much memory is consumed.

    To make it even more CPU and memory efficient, you could queue packets rather than frames (using demux instead of decode). They are compressed, and you also don't decode frames that you end up not using. This has the caveat of making sure that you decode a keyframe packet before you decode the frames that follow it; a sketch of the simpler frame-based variant is below.
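    (A minimal sketch of the thread-plus-queue idea, decoding frames rather than packets; the RTSP URL is a placeholder:)

    import collections
    import threading

    import av

    latest = collections.deque(maxlen=1)  # keep only the newest decoded frame

    def reader(url: str) -> None:
        with av.open(url, options={"rtsp_transport": "tcp"}) as container:
            for frame in container.decode(video=0):
                latest.append(frame)

    threading.Thread(target=reader, args=("rtsp://camera/stream",), daemon=True).start()

    # consumer side: grab the newest frame whenever you switch to this stream
    # frame = latest[-1] if latest else None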

    quantotto
    @quantotto
    Also, there is a buffer_size ffmpeg option, but technically it is only for the UDP protocol. The FFmpeg docs that I found say nothing about its effect on other protocols (rtmp / rtsp). It may be worth playing with it and seeing whether it helps.
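    (If you do experiment with it, protocol and format options can be passed through the options dict of av.open; whether buffer_size is honored depends on the protocol, so treat this as a sketch with a placeholder URL:)

    import av

    container = av.open("udp://0.0.0.0:5000", options={"buffer_size": "425984"})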
    horamarques
    @horamarques
    Hi all, is there a way to identify whether an I-frame is an IDR frame or not in an H.264 video stream?
    quantotto
    @quantotto
    @horamarques it looks like you need to parse the H.264 bytestream, extract the NAL units and check their type; NALU type 0x05 denotes an IDR slice. The complexity is that, depending on the container, the H.264 might be in either Annex B format or AVCC. libav (ffmpeg) doesn't expose convenience methods / flags to quickly check whether it is an IDR, only the AV_PKT_FLAG_KEY flag, which is what is_keyframe of the packet object checks.
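    (A rough sketch of that check for the Annex B case; AVCC packets are length-prefixed, so they would need the 4-byte length headers walked instead of start codes:)

    def contains_idr(data: bytes) -> bool:
        """Return True if an Annex B H.264 buffer contains an IDR slice (NALU type 5)."""
        i = 0
        while True:
            i = data.find(b"\x00\x00\x01", i)  # locate the next start code
            if i == -1 or i + 3 >= len(data):
                return False
            nal_type = data[i + 3] & 0x1F  # low 5 bits of the NAL header byte
            if nal_type == 5:
                return True
            i += 3

    # usage (PyAV packets expose their payload via the buffer protocol):
    # if contains_idr(bytes(packet)): ...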
    NewUserHa
    @NewUserHa
    Hi, I want to receive multiple unknown UDP streams, but PyAV will block if one of the UDP streams has not started yet. How can I keep checking the status of each stream when the whole code is blocking?
    NewUserHa
    @NewUserHa
    Like, I want to do ffmpeg -i udp_input0 -i udp_input1 -filter_complex hstack=inputs=2 output. How do I make any missing udp_input display as a static black image stream, given that PyAV will block/stop if any stream is missing, just like FFmpeg does?
    Guilherme Richter
    @IamRichter
    Hey, I was trying to do something with opencv, but some people suggested I do it using ffmpeg. Is there an easy way to get a single frame from a stream like you would with ffmpeg -i RTSP_URI -vframes 1 frame.jpg? I was looking at the most basic av example, and without the for loop I just get a generator.
    Guilherme Richter
    @IamRichter
    Actually, what I wanted is to limit the number of frames per second in an RTSP stream, but this is not as easy as I'd hoped (as far as I've looked), so I was hoping to just get the "last frame" using ffmpeg, repeat it a couple of times per minute and call it a day.
    NewUserHa
    @NewUserHa
    @IamRichter I think the basic usage in the PyAV docs already shows how to achieve that
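    (For example, a minimal sketch of grabbing one frame from an RTSP source and saving it as a JPEG; the URL is a placeholder and to_image() needs Pillow installed:)

    import av

    with av.open("rtsp://RTSP_URI", options={"rtsp_transport": "tcp"}) as container:
        frame = next(container.decode(video=0))  # first decodable frame
        frame.to_image().save("frame.jpg")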
    NewUserHa
    @NewUserHa
    What is the encode argument for x264 options like profile or preset (fast/slow) in Stream.encode(frame=None) of PyAV?
    Samuel Lindgren
    @samiamlabs

    Hi all!

    I'm trying to use the hardware encoder h264_nvmpi on a Jetson Xavier to stream video with WebRTC (aiortc).

    The code that creates the encoder context looks like this:

    import fractions
    from typing import Tuple

    import av

    # MAX_FRAME_RATE is defined elsewhere in the surrounding module
    def create_encoder_context(
        codec_name: str, width: int, height: int, bitrate: int
    ) -> Tuple[av.CodecContext, bool]:
        codec = av.CodecContext.create(codec_name, "w")
        codec.width = width
        codec.height = height
        codec.bit_rate = bitrate
        codec.pix_fmt = "yuv420p"
        codec.framerate = fractions.Fraction(MAX_FRAME_RATE, 1)
        codec.time_base = fractions.Fraction(1, MAX_FRAME_RATE)
        codec.options = {
            "profile": "baseline",
            "level": "31",
            "tune": "zerolatency"  # does nothing using h264_omx,
        }
        codec.open()
        return codec, codec_name == "h264_nvmpi"
    
    self.codec, self.codec_buffering = create_encoder_context(
                        "h264_nvmpi", frame.width, frame.height, bitrate=self.target_bitrate
                    )

    The encoding itself seems to work fine, but the video I see in the browser is very delayed, and it seems to get worse as time passes.

    I get this output from the codec:

    INFO:aioice.ice:Connection(2) Check CandidatePair(('172.19.0.2', 53024) -> ('172.19.0.1', 56520)) State.IN_PROGRESS -> State.SUCCEEDED
    INFO:aioice.ice:Connection(2) ICE completed
    Connection state is connected
    INFO:aiortc.codecs.h264:=======> Creating h264 encoder!
    Opening in BLOCKING MODE
    Opening in BLOCKING MODE 
    NvMMLiteOpen : Block : BlockType = 4 
    ===== NVMEDIA: NVENC =====
    NvMMLiteBlockCreate : Block : BlockType = 4 
    875967048
    842091865
    H264: Profile = 66, Level = 31 
    NVMEDIA_ENC: bBlitMode is set to TRUE 
    Opening in BLOCKING MODE
    Opening in BLOCKING MODE 
    NvMMLiteOpen : Block : BlockType = 4 
    ===== NVMEDIA: NVENC =====
    NvMMLiteBlockCreate : Block : BlockType = 4 
    875967048
    842091865
    H264: Profile = 66, Level = 31 
    NVMEDIA_ENC: bBlitMode is set to TRUE

    I suspect I have some flag or option wrong but I'm pretty inexperienced with FFmpeg and don't really know where to start...

    If anyone has any tips or ideas they would be greatly appreciated :)

    Nitan Alexandru Marcel
    @nitanmarcel
    Hello.
    Oh, enter doesn't send a new line
    Anyway, I'm looking for a way to stream audio from an audio URL provided by the youtube-dl library. I have to convert it to PCM 16-bit, 48k, then send the data to another library for downloading. So far I've achieved this using an asyncio subprocess exec, and I just found your library and want to translate my code to use it. Are there any examples of doing this or something similar out there, or can someone point me in the correct direction?
    Nitan Alexandru Marcel
    @nitanmarcel
    What I think I'm actually looking for is to take the original bytes from the URL using something like aiohttp, then send the data through the library and receive the converted bytes
    Is this possible without revolving around subprocesses?
    NewUserHa
    @NewUserHa
    The ffmpeg CLI is a good fit for your case
    quantotto
    @quantotto
    @nitanmarcel pyav allows re-encoding / transcoding the stream in memory (if io.BytesIO is used for the output buffer). It will not create any subprocesses and will use the libav libraries. If you already use ffmpeg, you can definitely translate the command line into a program that uses pyav. It allows supplying options by passing an options dictionary to the av.open method
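    (A minimal sketch of the decode-and-resample part, assuming a placeholder URL; note that AudioResampler.resample() returns a list of frames in recent PyAV versions:)

    import av

    resampler = av.AudioResampler(format="s16", layout="stereo", rate=48000)

    with av.open("https://example.com/audio") as container:
        for frame in container.decode(audio=0):
            for out_frame in resampler.resample(frame):
                pcm = out_frame.to_ndarray().tobytes()  # interleaved 16-bit PCM at 48 kHz
                # hand `pcm` to the downstream consumer here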
    Nitan Alexandru Marcel
    @nitanmarcel
    @quantotto thanks, I'll take a look into it
    NewUserHa
    @NewUserHa
    import subprocess

    import av
    import av.filter

    container_out = av.open('udp://224.0.0.1:999?overrun_nonfatal=1', format='mpegts', mode='w')
    video_stream = container_out.add_stream('libx264', 30)
    video_stream.low_delay = True
    video_stream.width = 1280
    video_stream.height = 720
    video_stream.options = {'preset': 'veryfast', 'tune': 'film,zerolatency'}
    
    graph = av.filter.Graph()
    
    src0 = graph.add('testsrc', 's=1280x720:r=30')
    f01 = graph.add('drawtext', "text='%{n}@%{localtime}@%{pts}':y=190:fontsize=20")
    f02 = graph.add('format','pix_fmts=yuv420p')
    f03 = graph.add('scale', '640:360:flags=lanczos')
    
    src1 = graph.add('testsrc', 's=1280x720:r=30')
    f11 = graph.add('drawtext', "text='%{n}@%{localtime}@%{pts}':y=190:fontsize=20")
    f12 = graph.add('format','pix_fmts=yuv420p')
    f13 = graph.add('scale', '640:360:flags=lanczos')
    
    xstack = graph.add('xstack', 'inputs=2:layout=0_0|w0_0')
    
    src0.link_to(f01)
    f01.link_to(f02)
    f02.link_to(f03)
    f03.link_to(xstack, 0, 0)
    
    src1.link_to(f11)
    f11.link_to(f12)
    f12.link_to(f13)
    f13.link_to(xstack, 0, 1)
    
    xstack.link_to(graph.add('buffersink'))
    graph.configure()
    
    subprocess.Popen(
        """ffplay -f lavfi -i testsrc=r=30 -vf "drawtext=text='%{n}@%{localtime}@%{pts}':y=190:fontsize=20""", 
        shell=True)
    subprocess.Popen(
        """ffplay -fflags nobuffer udp://224.0.0.1:999?overrun_nonfatal=1""",
        shell=True)
    while 1:
        for packet in video_stream.encode(xstack.pull()):
            container_out.mux(packet)
    Why does the streaming latency of this code keep increasing forever? Any ideas?
    (The latency being the %{localtime} difference between ffplay -i testsrc and ffplay udp://.)
    Is it because Python is slow?
    arsserpentarium
    @arsserpentarium
    Hello. Does anybody have an example of how to record video with sound?
    quantotto
    @quantotto
    @NewUserHa it is probably less about Python and more about encoding speed. If your encoding pipeline is indeed slow, chances are it is CPU bound. Check whether one of the cores is maxed out at 100% while running your code. I have seen similar issues on weaker machines, like the Raspberry Pi.
    NewUserHa
    @NewUserHa
    @quantotto But no, overall CPU usage is ~40%, and every logical core is under 70%.
    @quantotto would you like to try that code?
    quantotto
    @quantotto
    @NewUserHa do I just replace 224.0.0.1 with localhost?