    Igor Rendulic
    @simunovic-antonio Hey Antonio. I've implemented exactly what you're working on (RTSP stream to MP4 segments). You can check the docs on how to use it here: http://developers.chryscloud.com/edge-proxy/homepage/. Specifically, you need to create a conf.yaml file under the .../data folder with the on_disk: true flag turned on. It will also remove old MP4 segments based on your definition. If you're interested in a solution for storing MP4 segments with PyAv, check this file: https://github.com/chryscloud/video-edge-ai-proxy/blob/master/python/archive.py
    Andre Vallestero
    Does anyone know how I would be able to send decoded frames as y4m format to a piped output? My current method of converting it to a ndarray and writing the bytes to the pipe output produces the following error message from my video encoder (rav1e):
    !! Could not input video. Is it a y4m file?
    Error: Message { msg: "Could not input video. Is it a y4m file?"
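    rav1e expects a YUV4MPEG2 (y4m) stream rather than bare frame bytes, so each frame needs the y4m framing around it. A minimal sketch of that framing (helper names are illustrative; assumes 8-bit 4:2:0 planar frames):

    ```python
    import numpy as np

    def y4m_header(width, height, fps_num, fps_den, pix="420"):
        # YUV4MPEG2 stream header, written once; C420 = 8-bit 4:2:0 planar
        return f"YUV4MPEG2 W{width} H{height} F{fps_num}:{fps_den} Ip A0:0 C{pix}\n".encode()

    def y4m_frame(frame_yuv420: np.ndarray) -> bytes:
        # Every frame is prefixed with a FRAME marker; payload is the raw planar YUV bytes
        return b"FRAME\n" + frame_yuv420.tobytes()
    ```

    In the piped setup above this would be something like `pipe.stdin.write(y4m_header(640, 480, 30, 1))` once, then `pipe.stdin.write(y4m_frame(frame.reformat(format='yuv420p').to_ndarray()))` per decoded frame.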
    Fredrik Lundkvist
    Hi guys! Is setting the DAR on a stream not supported at the moment? When I set a stream with a defined DAR as the template for my output stream, I get a ValueError (invalid argument) upon writing to the output stream.
    Yun-Ta Tsai

    Hi, I am trying to encode frames as H.265 packets to stream over the network. But for some reason, the packet size is always zero (it works with H.264, however). Is this expected? Thanks in advance.

        import av
        import numpy as np
        from fractions import Fraction

        codec = av.CodecContext.create('hevc', 'w')
        codec.width = 1280
        codec.height = 960
        codec.pix_fmt = 'yuvj420p'
        codec.time_base = Fraction(1, 36)
        yuv_packets = []
        for rgb_frame in rgb_frames:
            transposed_rgb_frame = np.rint(rgb_frame.transpose(1, 0, 2) * 255).astype(np.uint8)
            frame = av.VideoFrame.from_ndarray(transposed_rgb_frame, format='rgb24')
            packets = codec.encode(frame)
            for packet in packets:
                yuv_packets.append(packet)


    x265 [info]: build info [Linux][GCC 8.3.1][64 bit] 8bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x265 [info]: Main profile, Level-4 (Main tier)
    x265 [info]: Thread pool created using 20 threads
    x265 [info]: Slices                              : 1
    x265 [info]: frame threads / pool features       : 4 / wpp(15 rows)
    x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
    x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
    x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
    x265 [info]: Keyframe min / max / scenecut / bias: 25 / 250 / 40 / 5.00
    x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
    x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
    x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
    x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
    x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
    x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
    x265 [info]: tools: strong-intra-smoothing lslices=6 deblock sao
    WARNING:deprecated pixel format used, make sure you did set range correctly
    (1280, 960, 3)
    (1280, 960, 3)
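    One likely explanation for the empty packets is encoder delay rather than a failure: x265 keeps a long lookahead window (20 frames in the log above), so the first encode() calls can legitimately return nothing until the buffer fills or is drained. A minimal sketch of flushing the delayed packets (using yuv420p to avoid the deprecated-format warning; the frame content is just blank test data):

    ```python
    import av
    from fractions import Fraction

    codec = av.CodecContext.create('hevc', 'w')
    codec.width = 1280
    codec.height = 960
    codec.pix_fmt = 'yuv420p'
    codec.time_base = Fraction(1, 36)

    packets = []
    for i in range(3):
        frame = av.VideoFrame(1280, 960, 'yuv420p')
        for plane in frame.planes:
            plane.update(bytes(plane.buffer_size))  # fill planes with zeros
        frame.pts = i
        packets.extend(codec.encode(frame))
    # x265 buffers frames for lookahead, so early encode() calls may yield
    # nothing; passing None drains the delayed packets:
    packets.extend(codec.encode(None))
    ```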

    Hi guys,

    if I push a frame through a filter graph, the filtered frame loses its time_base:

    import av
    import av.filter

    input_container = av.open(format='lavfi', file='sine=frequency=1000:duration=5')
    input_audio_stream = input_container.streams.audio[0]
    agraph = av.filter.Graph()
    abuffer = agraph.add_abuffer(template=input_audio_stream)
    abuffersink = agraph.add("abuffersink")
    abuffer.link_to(abuffersink)
    agraph.configure()
    for frame in input_container.decode(input_audio_stream):
        agraph.push(frame)
        new_frame = agraph.pull()

    I took a look at the C API, and there I can get the time base via the filter context. I couldn't find any way to access it through PyAV. Any idea?
    I guess PyAV should add it to the pulled frame

    1 reply
    Patrick Snape
    I have an open issue at the moment (72 hours old) that I wanted some advice on: PyAV-Org/PyAV#778
    I've been thinking about it over the past few days, and I might just be being naive; what I want to do may in fact not currently be possible. Would love some feedback.
    Nikolay Tiunov
    Hi everyone! Does anybody know how to make PyAV encode H.264 frames using the AVCC format instead of Annex B?
    1 reply
    hi all, just trying to use pyav to add some metadata to a container. Does it have bindings for that? If so, they don't seem to be documented anywhere. I tried a search query, which turned up nothing useful. I also tried output_container.metadata["title"] = "foo" and output_stream.metadata["title"] = "foo", neither of which resulted in a file that appeared to have any such attribute associated with it when inspected in Windows Explorer.
    I realize that this functionality is built into the ffmpeg binary and trivially accessible via -metadata, but I wasn't using it in my script and I'd prefer not to start now if that can be avoided
    hey, good news! Actually, the reason it didn't appear to be working is because Windows Explorer is bad, not because your bindings lack feature parity with ffmpeg. So, thank you so much, contributors, for your time and patience; great work on this thing, sorry I didn't realize this earlier. For posterity: it isn't explicitly documented, but I was still able to find out how to do this by combing the sources and finding some tests.
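    For reference, a minimal round-trip of the container-level metadata assignment discussed above (a sketch; the temp-file path and blank frame are just for illustration, and which tags survive depends on the muxer):

    ```python
    import os
    import tempfile

    import av

    path = os.path.join(tempfile.mkdtemp(), "out.mp4")
    with av.open(path, mode="w") as out:
        out.metadata["title"] = "foo"          # container-level tag
        stream = out.add_stream("h264", rate=30)
        stream.width = 64
        stream.height = 64
        stream.pix_fmt = "yuv420p"
        frame = av.VideoFrame(64, 64, "yuv420p")
        frame.pts = 0
        for packet in stream.encode(frame):
            out.mux(packet)
        for packet in stream.encode(None):     # flush the encoder
            out.mux(packet)

    # Reopen and check that the tag round-trips
    with av.open(path) as probe:
        title = probe.metadata.get("title")
    ```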
    Good day people! I am a Python developer looking for a way to extract motion vectors from an RTSP video. Before I dive into the library, I would like to ask whether this is something I could achieve using PyAV. In particular, I need to avoid decoding the entire video to save CPU usage. Thank you
    George Sakkis

    Hi all, new to PyAV and video software in general. The example on parsing has a comment "We want an H.264 stream in the Annex B byte-stream format" but doesn't mention why. Indeed, trying the same example with the original mp4 input file, it cannot parse any packets.

    More importantly, and this is my main question, is there a general way to write packets of an arbitrary av.VideoStream in a way that they can be parsed later, ideally without an intermediate step of calling ffmpeg?

    1 reply
    Muhammad Ali
    I'm using https://pyav.org/docs/develop/cookbook/basics.html#remuxing with Python 3.9, but the script fails to run, complaining that a cache download is failing. What may I be missing?
    1 reply
    Hello! Is there a way to receive more P-frames and B-frames when streaming a video?
    I am still trying to get motion vectors without decoding the actual frame. Is it possible?
    bruno messias
    Hi. Is the Cython method copy_array_to_plane exposed to Python?
    8 replies
    @mfoglio I spent a significant amount of time trying to extract MVs without decoding, but it is not currently possible. decode() must be called on a packet to extract frame side data (which in turn contains the motion vectors).
    @mfoglio and I am pretty happy with the performance. I'm doing that in real time from RTSP, reading data from 3 cameras on a Raspberry Pi, and it didn't max out the CPU. RAM might be more of a bottleneck depending on frame sizes, but I avoided accumulating the frames themselves in memory, and it was fine even with UltraHD video.
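    A sketch of the decode-then-read-side-data approach described above. The clip generation is only there to make the snippet self-contained; with a real file or RTSP stream only the second half is needed. Assumes the flags2=+export_mvs codec option:

    ```python
    import os
    import tempfile

    import av
    import numpy as np

    path = os.path.join(tempfile.mkdtemp(), "mv.mp4")

    # Build a tiny H.264 clip with a moving bar, just so the sketch is runnable.
    with av.open(path, mode="w") as out:
        s = out.add_stream("h264", rate=30)
        s.width = 64
        s.height = 64
        s.pix_fmt = "yuv420p"
        for i in range(10):
            arr = np.zeros((64, 64, 3), dtype=np.uint8)
            arr[:, i * 4:i * 4 + 16] = 255           # moving bright bar
            frame = av.VideoFrame.from_ndarray(arr, format="rgb24")
            frame.pts = i
            for p in s.encode(frame):
                out.mux(p)
        for p in s.encode(None):                     # flush
            out.mux(p)

    # Decode with motion-vector export enabled on the codec context.
    container = av.open(path)
    stream = container.streams.video[0]
    stream.codec_context.options = {"flags2": "+export_mvs"}
    decoded = 0
    vectors = []
    for frame in container.decode(stream):
        decoded += 1
        sd = frame.side_data.get("MOTION_VECTORS")   # None on frames without MVs
        if sd is not None:
            vectors.append(sd.to_ndarray())          # structured array, one row per block
    ```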
    Rana Banerjee
    Hi guys, thanks a lot for your efforts in creating this great library! Would appreciate it a lot if anyone can help me with my use case: I am consuming a live RTSP stream, processing the frames in opencv, and now would like to stream the processed frames individually, in as compressed a form as possible, transmit them to a server, and then play it in a browser on the client side.
    I am trying to encode an individual frame in .h264 and send the encoded packet as it is without wrapping it in any container. I plan to use these packets directly in a javascript player. So needed help with 2 things:
    1. How to convert a numpy image into an h264 encoded packet? I understand I may have to input X number of frames before I get any output, and that is fine. But hopefully every frame after that would give me a packet output that maps to the frame that was input X frames ago. Is my understanding correct? Is this the best way to get the most compression at real time and how exactly do I do this?
    2. Once the packet is received at the other end what player can be used to play it in a browser. Do I need to out it in a container for this? The issue is that it will be a live stream with continuous packets coming in without any end time.
    @Rana-Euclid regarding (1), this is how I do it (frames is a list of images, numpy arrays). Maybe it is not precisely what you need, but you get an idea...)
        def encode_frames(self, calculate_bitrate: bool, options: dict) -> bytes:
            """Encodes frames and returns movie buffer.

            Args:
                calculate_bitrate (bool)
                options (dict): extra options for av library

            Returns:
                bytes: movie buffer
            """
            container = "mp4"
            frames = self.all_frames()
            if not frames:
                return bytes()
            mp4_bio = io.BytesIO()
            mp4_bio.name = f"out.{container}"
            new_options = options.copy()
            new_options["video_track_timescale"] = str(self._samplerate)
            mp4_container = av.open(mp4_bio, mode="w", options=new_options)
            with av.open(self._bio, mode="r") as video_in:
                in_stream = video_in.streams.video[0]
                stream = mp4_container.add_stream("h264", self.FPS)
                stream.pix_fmt = "yuvj420p"
                stream.extradata = in_stream.extradata
                stream.bit_rate = in_stream.bit_rate
            stream.width = frames[0].shape[1]
            stream.height = frames[0].shape[0]
            for i, frame in enumerate(frames):
                format = "rgb24" if frame.ndim >= 3 else "gray"
                video_frame = av.VideoFrame.from_ndarray(frame, format=format)
                for packet in stream.encode(video_frame):
                    mp4_container.mux(packet)
            if i >= 0:
                for p in stream.encode(None):  # flush the encoder
                    mp4_container.mux(p)
            mp4_container.close()
            return mp4_bio.getvalue()
    regarding (2), I had similar requirement and used MPEG dash, but this is far beyond the scope of this library.
    Rana Banerjee
    Thanks for the quick reply @quantotto . I see that the frames are being encoded in an mp4 container. Is that absolutely necessary? Can't I create and transmit packets without wrapping them in a container?
    @Rana-Euclid you don't have to use "mp4", it can be "mpegts" or others. I believe it can be anything that you see as output of ffmpeg -formats command
    Rana Banerjee
    @quantotto Thanks a lot! Much appreciated.. I will check it out further.
    1 reply
    Rana Banerjee
    Sorry about the trouble, but the issue is that I need to reduce my network bandwidth while keeping the system real-time, so I need to transmit encoded packets for every frame. The moment I wrap the packets in a container (flv or mp4 etc.) the file size increases. Hence I want to just transmit the packets without wrapping them in any container. Is that possible in the first place with pyav or ffmpeg?
    @Rana-Euclid try using format="h264". It will create raw H264 video buffer.
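    A minimal sketch of the format="h264" suggestion, writing raw Annex B bytes into a memory buffer with no container overhead (the frames are just placeholder black frames):

    ```python
    import io

    import av
    import numpy as np

    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="h264")   # raw Annex B byte stream, no container
    stream = out.add_stream("h264", rate=30)
    stream.width = 64
    stream.height = 64
    stream.pix_fmt = "yuv420p"
    for i in range(5):
        arr = np.zeros((64, 64, 3), dtype=np.uint8)
        frame = av.VideoFrame.from_ndarray(arr, format="rgb24")
        frame.pts = i
        for packet in stream.encode(frame):
            out.mux(packet)  # in a live setup, ship the newly appended bytes here
    for packet in stream.encode(None):            # flush
        out.mux(packet)
    out.close()
    raw = buf.getvalue()  # begins with an Annex B start code
    ```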
    bruno messias

    Hi. Is it possible to use PyAV VideoFrame to stream to YouTube using the RTMP protocol?

    With ffmpeg it's possible using this

    import subprocess
    import cv2

    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
    command = ['ffmpeg',
               # raw BGR frames from OpenCV on stdin
               '-f', 'rawvideo',
               '-pix_fmt', 'bgr24',
               '-s', '640x480',
               '-r', '30',
               '-i', '-',
               # dummy silent audio input (an audio track is required)
               '-f', 's16le',
               '-ar', '44100',
               '-ac', '2',
               '-i', '/dev/zero',
               '-acodec', 'aac',
               '-vcodec', 'libx264',
               '-g', '50',
               '-profile:v', 'baseline',
               '-preset', 'ultrafast',
               '-f', 'flv',
               output_url]  # your RTMP ingest URL with stream key
    pipe = subprocess.Popen(command, stdin=subprocess.PIPE)
    while True:
        _, frame = cap.read()
        pipe.stdin.write(frame.tobytes())
    12 replies
    Miguel Martin
    Is it possible to set the start time of an output audio or video container or do I have to generate blank audio/video frames to emulate this?
    1 reply

    I would like to restream an RTMP stream to another one, and if the input stream stops I want to display a static black image. Could you please guide me on how to start with this? The best would be if I could overlay the input stream on a static image. The restreaming works, but I do not know how to make a black-image packet to mux into the output stream when the input is over. Here is my code:

    output_url= "rtmp://server.hu/main/test_input"
    input_url = 'rtmp://server.hu/main/'
    container_in = av.open(input_url)
    container_out = av.open(output_url, format='flv', mode='w')
    video_stream_input = container_in.streams.video[0]
    audio_stream_input = container_in.streams.audio[0]
    video_stream_output = container_out.add_stream(template=video_stream_input)
    audio_stream_output = container_out.add_stream(template=audio_stream_input)
    # stream.width = 480
    # stream.height = 320
    video_stream_output.width = container_in.streams.video[0].codec_context.width
    video_stream_output.height = container_in.streams.video[0].codec_context.height
    video_stream_output.pix_fmt = container_in.streams.video[0].codec_context.pix_fmt
    # audio_stream = container.add_stream(codec_name="aac", rate=16000)
    the_canvas = np.zeros((video_stream_output.height, video_stream_output.width, 3), dtype=np.uint8)
    the_canvas[:, :] = (32, 32, 32)  # some dark gray background because why not
    my_pts = 0
    demuxer = container_in.demux(video_stream_input, audio_stream_input)
    while True:
        packet = next(demuxer, None)
        if packet is None:
            print("stream is not running")
            # TODO static black image
            continue
        if packet.stream.type == 'video':
            packet.stream = video_stream_output
        elif packet.stream.type == 'audio':
            packet.stream = audio_stream_output
        container_out.mux(packet)

    Thanks in advance!

    6 replies
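    One caveat with the code above: streams created with template= have no open encoder, so they cannot encode a synthetic black frame. A sketch of the filler-frame idea with an encoder-backed output stream instead (not tested against a live RTMP server; sizes, rates, and names are illustrative):

    ```python
    import io

    import av
    import numpy as np

    def black_frame_packets(video_out, pts):
        # Encode one black frame for an encoder-backed output stream.
        canvas = np.zeros((video_out.height, video_out.width, 3), dtype=np.uint8)
        frame = av.VideoFrame.from_ndarray(canvas, format="rgb24")
        frame.pts = pts
        return video_out.encode(frame)

    # Self-contained demo with the flv muxer, as in the question:
    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="flv")
    video_out = out.add_stream("h264", rate=25)
    video_out.width = 320
    video_out.height = 240
    video_out.pix_fmt = "yuv420p"
    packets = []
    for pts in range(5):
        packets.extend(black_frame_packets(video_out, pts))
    packets.extend(video_out.encode(None))   # flush
    for p in packets:
        out.mux(p)
    out.close()
    ```

    Note that mixing these packets into a stream that otherwise carries the input's packets requires matching codec parameters; the simplest consistent setup is to re-encode everything through the same output stream.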

    I have an nginx RTMP server and am trying to save and read the stream from it for audio processing. For saving the stream to a file I am using the record all option in the nginx conf.

    For reading the stream I am using av.open(<rtmp_path>) in Python code, but the start of the file saved by nginx vs. the Python code differs by 10-15 seconds, the nginx recording being ahead of the av one.

    Is there a way to reduce this lag?
    One option would be to read from the file being written to by nginx. I tried using av.open(<nginx_file_recording>) on that file and kept getting av.error.EOFError, which means reading an active file being written to won't work out of the box due to seeking issues; hence I'm wondering how to reduce the lag between both sources.
    The last resort is to not save the file from nginx itself but to do it from the Python script. I wanted to avoid this, as the nginx way is a good fallback in case the script doesn't work or errors out for some reason in production, and I don't want to lose that 15 seconds of data either.

    my nginx conf looks like this

    rtmp {
        server {
            listen 1935;
            application live {
                live on;
                wait_key on;
                wait_video on;
                record all;
                record_path /home/tmp/recordings;
                allow publish all;
                allow play all;
            }
        }
    }

    @maheshgawali as an option, why not configure nginx RTMP to record in chunks? For example, setting record_suffix -%d-%b-%y-%T.flv; and record_max_size 256K; or maybe chunked recording as follows:

    recorder chunked {
        record all;
        record_unique on;
        record_interval 15s;
        record_path /var/rec/chunked;
    }

    (change values as you see fit) and then you can read files with PyAV after they have been closed / finalized (if new file appears, then you can safely read previous one/s)

    @quantotto : recording in chunks will also add a delay, in this case a 15-second one, and I want to capture the stream as soon as possible. I already explored this approach and tried reducing the interval down to 2 seconds.
    Time for the stream to reach nginx RTMP + recording interval + post-processing accumulates to more than 2 seconds, and for a real-time use case we cannot go beyond 2 seconds, as the user experience deteriorates.
    I'll explore using the option record_interval 1s
    Also, is there a way to determine from rtmp_container=av.open(<rtmp_stream_path>) whether an rtmp stream has stopped?
    2 replies
    Hi folks, I was trying to generate some videos, but when I encode the packets, the timestamp (pts) seems to get messed up by the output video stream. The timestamp becomes 16x the original; I suspect it's due to Stream.time_base, but even after I set it explicitly, it's still not working. Has anyone had this issue before?
    @sunset2017 I think I ran into something similar as well, and after some trial and error I found an option you can pass to the container: video_track_timescale. It should be 1/time_base. I normally read time_base from the input stream. Not guaranteed it will solve your issue, but worth trying.
    samplerate = int(1 / in_stream.time_base)
    options = {}
    options["video_track_timescale"] = str(samplerate)
    mp4_container = av.open(output_path, mode="w", options=options)  # output_path: your target file
    thanks @quantotto !!


    How do I generate an empty/silent 'aac' frame?
    I've tried to do it the numpy way, but it says that converting from a numpy array with format 'aac' is not yet supported.

    2 replies
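    'aac' is a codec, not a frame format, which is why from_ndarray rejects it. A silent frame can be built as zeroed PCM samples and then pushed through an AAC encoder; a minimal sketch (sample rate and layout are illustrative):

    ```python
    import av
    import numpy as np

    # Silence is just zero samples: build a PCM frame in a format/layout the AAC
    # encoder accepts (planar float, 1024 samples per frame), then encode it.
    codec = av.CodecContext.create("aac", "w")
    codec.sample_rate = 44100
    codec.layout = "stereo"
    codec.format = av.AudioFormat("fltp")

    silence = av.AudioFrame.from_ndarray(
        np.zeros((2, 1024), dtype=np.float32),   # (channels, samples) for planar fltp
        format="fltp",
        layout="stereo",
    )
    silence.sample_rate = 44100
    silence.pts = 0

    packets = list(codec.encode(silence)) + list(codec.encode(None))  # encode + flush
    ```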
    Samuel Lindgren
    Hi! I'm trying to use h264_v4l2m2m to encode on a Raspberry Pi 4 with 64-bit OS, but am getting a segfault in the codec.encode(frame) function. Not sure if I'm configuring something incorrectly or if it is simply not supported. Does anyone here know?


    I want to skip my input stream's frames to the latest one: I process a few different video streams concurrently and switch between them regularly, and I want to stay synchronized in time. I tried inputcontainer.seek(0), but it throws av.error.ValueError: [Errno 22] Invalid argument. So my question is: how do I use seek() properly to skip to the live part of the stream? Here is the corresponding code part:

    s_in = streams[actualStream]
    v_in = s_in.streams.video[0]
    a_in = s_in.streams.audio[0]
    if v_in is not None: v_in.seek(0)
    if a_in is not None: a_in.seek(0)
    for frame in s_in.decode(v_in, a_in):

    @szucsba1998 are those RTSP streams? Since you deal with a live stream, there is no classic seek concept. Once you have established a connection, streaming starts, and the receiving end should read packets in the order received. demux and decode basically return generators that yield packets or frames respectively as they become available. If processing lags behind, buffers start filling up. I saw some gstreamer options coupled with opencv to maintain a buffer of a single frame, but not with ffmpeg. With that, what I normally do in this situation is create multiple threads (or processes in Python, to avoid GIL issues), read packets or frames ASAP and put them into a queue, and then, if I need only the latest frame, I can access the queue's tail. Alternatively, you could limit the queue size so that only the few latest frames are there when you read from the queue, and this way not much memory is consumed.

    To make it even more CPU and memory efficient, you could queue packets and not frames (using demux instead of decode). They are compressed and you also don't decode frames that you end up not using. This has a caveat of making sure that you decode a keyframe packet before you decode frames that follow it.

    Also, there is a buffer_size ffmpeg option, but technically it is only for the UDP protocol. The FFmpeg docs that I found say nothing about its effect on other protocols (rtmp / rtsp). Maybe worth playing with it and seeing whether it helps.
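    The read-ASAP-into-a-bounded-queue idea above can be sketched without any PyAV specifics; a deque with maxlen does the frame-dropping automatically (class and names are illustrative):

    ```python
    import threading
    from collections import deque

    class LatestFrameBuffer:
        """Keep only the newest N items pushed by a producer thread."""

        def __init__(self, maxlen=1):
            self._frames = deque(maxlen=maxlen)  # old items fall off automatically
            self._lock = threading.Lock()

        def put(self, frame):
            with self._lock:
                self._frames.append(frame)

        def latest(self):
            with self._lock:
                return self._frames[-1] if self._frames else None
    ```

    The producer thread would be something like `for frame in container.decode(video=0): buffer.put(frame)`, and consumers call `buffer.latest()` whenever they switch to this stream; with a bounded maxlen the producer never blocks and stale frames are discarded for free.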
    Hi all, is there a way to identify whether an I-frame is IDR or not in an h264 video stream?
    @horamarques it looks like you need to parse the h264 bytestream, extract the NAL units, and check their type; NALU type 0x05 denotes an IDR slice. The complexity is that, depending on the container, the h264 might be in either Annex B format or AVCC. libav (ffmpeg) doesn't expose convenience methods / flags to quickly check whether it is IDR, only the AV_PKT_FLAG_KEY flag, which is checked by is_keyframe on the packet object.
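    A minimal sketch of the NALU scan described above for the Annex B case (the function name is illustrative; AVCC-packed packets would need length-prefix walking instead):

    ```python
    def contains_idr(data: bytes) -> bool:
        """Scan an Annex B H.264 buffer for an IDR slice (NAL unit type 5)."""
        i = 0
        n = len(data)
        while i + 3 < n:
            # accept both 3- and 4-byte start codes (00 00 01 / 00 00 00 01)
            if data[i:i + 3] == b"\x00\x00\x01":
                nal_type = data[i + 3] & 0x1F  # low 5 bits of the NAL header byte
                if nal_type == 5:
                    return True
                i += 3
            else:
                i += 1
        return False
    ```

    In PyAV this would be run over `bytes(packet)` for packets where `packet.is_keyframe` is already true, to distinguish IDR from non-IDR I-frames.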
    Hi, I want to receive multiple unknown UDP streams. But PyAV will block if one of the UDP streams has not started yet. How can I keep checking the status of each stream when the whole code is blocking?
    For example, I want the equivalent of ffmpeg -i udp_input0 -i udp_input1 -filter_complex hstack=inputs=2 output. How do I make any missing udp_input display as a static black-image stream, when PyAV will block/stop if any stream is missing, just like FFmpeg does?
    34 replies
    Guilherme Richter
    Hey, I was trying to do something with opencv, but some people suggested I do it using ffmpeg. Is there an easy way to get a single frame from a stream like you would with ffmpeg -i RTSP_URI -vframes 1 frame.jpg? I was looking at the most basic av example, and without the for loop I just get a generator.
    Guilherme Richter
    Actually, what I wanted was to limit the number of frames per second in an RTSP stream, but this is not as easy as I wished (as far as I looked up), so I was hoping to just get the "last frame" using ffmpeg, repeat it a couple of times per minute, and call it a day.
    @IamRichter I think the basic usage in the PyAV docs already shows you how to achieve that
    What is the encode argument for x264 settings like profile or fast/slow preset in Stream.encode(frame=None) in PyAV?
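    Stream.encode() itself takes no encoder settings; in PyAV, x264 options such as the preset or profile are normally passed via the options dict when the stream is created. A minimal sketch (values are illustrative):

    ```python
    import io

    import av

    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="mp4")
    stream = out.add_stream("h264", rate=30, options={
        "preset": "fast",        # x264 speed/size trade-off (ultrafast .. placebo)
        "profile": "baseline",   # H.264 profile
        "crf": "23",             # constant-rate-factor quality target
    })
    stream.width = 64
    stream.height = 64
    stream.pix_fmt = "yuv420p"

    frame = av.VideoFrame(64, 64, "yuv420p")
    frame.pts = 0
    packets = list(stream.encode(frame)) + list(stream.encode(None))  # encode + flush
    for p in packets:
        out.mux(p)
    out.close()
    ```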