    Igor Rendulic
    Has anyone successfully compiled PyAV in Docker for the arm64v8 platform?
    Rizwan Ishaq
    Hi, I want to use PyAV.
    Suppose I have an image from OpenCV and I want to encode it into an H.264 stream,
    so each OpenCV image becomes an H.264-encoded frame.
    I tried it,
    but how do I get the bytes? Or, without writing to the container, how can I get the H.264-encoded frame from an image?
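    A minimal sketch of one way to do this: a bare av.CodecContext encodes frames straight into packets, no container involved, and bytes(packet) gives the raw H.264 data. The sizes and black stand-in frames below are assumptions for illustration (a real cv2 image would go in their place):

    ```python
    from fractions import Fraction

    import av
    import numpy as np

    # A bare CodecContext encodes frames into packets without any container.
    codec = av.CodecContext.create("h264", "w")
    codec.width = 640
    codec.height = 480
    codec.pix_fmt = "yuv420p"
    codec.time_base = Fraction(1, 30)

    chunks = []
    for i in range(30):
        img = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a cv2 BGR image
        frame = av.VideoFrame.from_ndarray(img, format="bgr24")
        frame.pts = i
        for packet in codec.encode(frame):
            chunks.append(bytes(packet))  # the raw H.264 bytes of one packet
    for packet in codec.encode(None):  # flush the packets the encoder is holding back
        chunks.append(bytes(packet))
    ```

    Note that the encoder buffers a few frames, so the first packets only come out after several encode() calls (or on the final flush).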
    Jani Šumak

    Hi guys!

    Before I start: great work. I am working on an ffmpeg API wrapper and did my best with some other libraries, but with PyAV I got my “hello world” (wav to mp3) working in 20 min :)

    I wanted to know if there is any option to pass an opened file to the av.open method? I am using FastAPI (Starlette) and would like to leverage streaming and async functions as much as possible.

    Currently I save the incoming file to a temp folder, process it, and then create a background task to clean it up. This makes sense for some tasks, but for others it would be nice if I could just pass the file object to av.open.

    Hope this makes sense.
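    For the record, av.open does accept file-like objects on both the write and read side, so no temp file is strictly needed; a self-contained sketch with an in-memory BytesIO standing in for the uploaded file:

    ```python
    import io

    import av
    import numpy as np

    buf = io.BytesIO()  # stand-in for any seekable file-like object

    # Writing: av.open accepts file-like objects, so no temp file is needed.
    with av.open(buf, mode="w", format="mp4") as out:
        stream = out.add_stream("h264", rate=30)
        stream.width, stream.height = 320, 240
        stream.pix_fmt = "yuv420p"
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        for packet in stream.encode(frame):
            out.mux(packet)
        for packet in stream.encode():  # flush the encoder
            out.mux(packet)

    # Reading: the same applies to the input side.
    buf.seek(0)
    with av.open(buf) as inp:
        frames = list(inp.decode(video=0))
    ```

    With FastAPI, UploadFile.file is a SpooledTemporaryFile and can be passed the same way, with the caveat (an assumption worth testing per format) that some formats need the object to be seekable.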


    I need help
    Can someone help me?

    I am trying to make an av.AudioFrame.
    I want to construct it from a pydub.AudioSegment.

    I found the method av.AudioFrame.from_ndarray, but it raises an AssertionError.
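    The AssertionError from from_ndarray is usually a dtype or shape mismatch. A sketch of the shape a packed format like 's16' expects, with zeros emulating the interleaved int16 samples that pydub's AudioSegment.get_array_of_samples() would give you (pydub itself is left out so the sketch is self-contained):

    ```python
    import av
    import numpy as np

    sample_rate = 44100
    channels = 2
    n = 1024  # samples per channel

    # Emulates pydub's interleaved int16 samples.
    interleaved = np.zeros(n * channels, dtype=np.int16)

    # For a *packed* format such as 's16', from_ndarray expects dtype int16 and
    # shape (1, samples * channels); a wrong dtype or shape is what raises the
    # AssertionError. Planar formats (e.g. 's16p') expect (channels, samples).
    frame = av.AudioFrame.from_ndarray(interleaved.reshape(1, -1),
                                       format="s16", layout="stereo")
    frame.sample_rate = sample_rate
    ```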

    Antonio Šimunović

    Hello, I could use some help with a remuxing scenario. I'm trying to remux an RTSP H.264 stream from an IP camera into a fragmented MP4 stream for a live-streaming WebSocket solution; on the client side is the MediaSource object. I'm having trouble getting the bytes out of the output buffer: the getvalue() call returns an empty bytes object for all non-keyframe packets.

    Here is the code:

    This is the produced output:

    YES True
    NO False
    NO False
    NO False
    NO False
    NO False
    NO False
    YES True
    NO False

    I expected every call to getvalue() to return a non-empty bytes object, but I only get a non-empty result from BytesIO.getvalue() after muxing a keyframe packet.
    What am I missing?

    1 reply
    Igor Rendulic
    @simunovic-antonio Hey Antonio. I've implemented exactly what you're working on (RTSP stream to MP4 segments). You can check the docs on how to use it here: http://developers.chryscloud.com/edge-proxy/homepage/. Specifically, you need to create a conf.yaml file under the .../data folder with the on_disk: true flag turned on. It will also remove old MP4 segments based on your definition. If you're interested in a solution for storing MP4 segments with PyAV, then check this file: https://github.com/chryscloud/video-edge-ai-proxy/blob/master/python/archive.py
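    For what it's worth, the behaviour Antonio describes is what the fragmented-MP4 muxer is supposed to do: with movflags frag_keyframe, a fragment is only closed (and flushed to the buffer) when the next keyframe arrives, so getvalue() grows in bursts at fragment boundaries. A minimal sketch of the pattern (the option names are the standard mov/mp4 muxer flags; the black frames are stand-ins):

    ```python
    import io

    import av
    import numpy as np

    buf = io.BytesIO()
    # frag_keyframe: a fragment is flushed to the buffer only when a new keyframe
    # starts, so empty reads between keyframes are expected, not an error.
    out = av.open(buf, mode="w", format="mp4",
                  options={"movflags": "frag_keyframe+empty_moov+default_base_moof"})
    stream = out.add_stream("h264", rate=30)
    stream.width, stream.height = 320, 240
    stream.pix_fmt = "yuv420p"

    offset = 0
    for i in range(30):
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        for packet in stream.encode(frame):
            out.mux(packet)
        new_bytes = buf.getvalue()[offset:]  # only the newly completed fragments
        offset += len(new_bytes)
    for packet in stream.encode():
        out.mux(packet)
    out.close()
    ```

    To get fragments more often, shorten the GOP (e.g. pass options={"g": "30"} to add_stream) or look at the mov/mp4 muxer's frag_duration option.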
    Andre Vallestero
    Does anyone know how I could send decoded frames in y4m format to a piped output? My current method of converting each frame to an ndarray and writing the bytes to the pipe produces the following error message from my video encoder (rav1e):
    !! Could not input video. Is it a y4m file?
    Error: Message { msg: "Could not input video. Is it a y4m file?"
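    In case it helps, the y4m layout is simple enough to write by hand: one YUV4MPEG2 header line for the stream, then a FRAME marker before each picture's raw yuv420p planes. A sketch with a hypothetical write_y4m helper (only numpy assumed; a PyAV frame could supply the planes via frame.reformat(format="yuv420p").to_ndarray()):

    ```python
    import io

    import numpy as np

    def write_y4m(fp, frames, width, height, fps="30:1"):
        """Write 8-bit yuv420p pictures (arrays of shape (h * 3 // 2, w)) as Y4M."""
        # Stream header per the YUV4MPEG2 format; the C tag names the chroma layout.
        fp.write(f"YUV4MPEG2 W{width} H{height} F{fps} Ip A1:1 C420mpeg2\n".encode())
        for yuv in frames:
            fp.write(b"FRAME\n")      # every picture is preceded by a FRAME marker
            fp.write(yuv.tobytes())   # Y plane, then the subsampled U and V planes

    w, h = 64, 48
    picture = np.zeros((h * 3 // 2, w), dtype=np.uint8)  # one yuv420p picture
    buf = io.BytesIO()  # replace with e.g. the rav1e process's stdin pipe
    write_y4m(buf, [picture, picture], w, h)
    ```

    rav1e complains because raw ndarray bytes lack these headers; prepending them should satisfy its y4m check.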
    Fredrik Lundkvist
    Hi guys! Is setting the DAR on a stream not supported at the moment? When I set a stream with a defined DAR as the template for my output stream, I get a ValueError ("invalid argument") upon writing to the output stream.
    Yun-Ta Tsai

    Hi, I am trying to encode frames as H.265 packets to stream over the network. But for some reason, the packet size is always zero (it works with H.264, however). Is this expected? Thanks in advance.

        codec = av.CodecContext.create('hevc', 'w')
        codec.width = 1280
        codec.height = 960
        codec.pix_fmt = 'yuvj420p'
        codec.time_base = Fraction(1, 36)
        yuv_packets = []
        for rgb_frame in rgb_frames:
            # scale factor assumed; the original message was cut off here
            transposed_rgb_frame = np.rint(rgb_frame.transpose(1, 0, 2) * 255).astype(np.uint8)
            frame = av.VideoFrame.from_ndarray(transposed_rgb_frame)
            packets = codec.encode(frame)
            for packet in packets:
                yuv_packets.append(packet)

    x265 [info]: build info [Linux][GCC 8.3.1][64 bit] 8bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x265 [info]: Main profile, Level-4 (Main tier)
    x265 [info]: Thread pool created using 20 threads
    x265 [info]: Slices                              : 1
    x265 [info]: frame threads / pool features       : 4 / wpp(15 rows)
    x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
    x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
    x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
    x265 [info]: Keyframe min / max / scenecut / bias: 25 / 250 / 40 / 5.00
    x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
    x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
    x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
    x265 [info]: AQ: mode / str / qg-size / cu-tree  : 2 / 1.0 / 32 / 1
    x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
    x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
    x265 [info]: tools: strong-intra-smoothing lslices=6 deblock sao
    WARNING:deprecated pixel format used, make sure you did set range correctly
    (1280, 960, 3)
    (1280, 960, 3)
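    Most likely this is encoder delay rather than a bug: x265's lookahead (20 frames here) and B-frames mean encode() returns nothing for the first few dozen frames, and the buffered packets only come out when the encoder is flushed with encode(None). The same pattern, sketched with H.264 for portability (the flushing logic applies unchanged to 'hevc'):

    ```python
    from fractions import Fraction

    import av
    import numpy as np

    codec = av.CodecContext.create("h264", "w")  # identical flushing logic for 'hevc'
    codec.width, codec.height = 320, 240
    codec.pix_fmt = "yuv420p"
    codec.time_base = Fraction(1, 30)

    packets = []
    for i in range(10):
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        frame.pts = i
        packets += codec.encode(frame)  # may return [] while the lookahead queue fills
    packets += codec.encode(None)       # flush: the held-back packets arrive here
    ```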

    Hi guys,

    if I push a frame through a filter graph, the filtered frame loses its time_base:

    import av.filter
    input_container = av.open(format='lavfi', file='sine=frequency=1000:duration=5')
    input_audio_stream = input_container.streams.audio[0]
    agraph = av.filter.Graph()
    abuffer = agraph.add_abuffer(template=input_audio_stream)
    abuffersink = agraph.add("abuffersink")
    abuffer.link_to(abuffersink)
    agraph.configure()
    for frame in input_container.decode(input_audio_stream):
        agraph.push(frame)
        new_frame = agraph.pull()

    I took a look at the C API, and there I can get the time base via the filter context. I couldn't find any way to access it through PyAV. Any ideas?
    I guess PyAV should add it to the pulled frame
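    Until then, a workaround is to restore the time_base on the pulled frame yourself from whatever you fed the abuffer. A self-contained sketch with a synthetic frame instead of the lavfi sine source (whether Frame.time_base is writable depends on the PyAV version, so treat that line as an assumption):

    ```python
    from fractions import Fraction

    import av
    import av.filter
    import numpy as np

    graph = av.filter.Graph()
    abuffer = graph.add_abuffer(sample_rate=44100, format="s16",
                                layout="stereo", time_base=Fraction(1, 44100))
    abuffersink = graph.add("abuffersink")
    abuffer.link_to(abuffersink)
    graph.configure()

    frame = av.AudioFrame.from_ndarray(np.zeros((1, 2048), np.int16),
                                       format="s16", layout="stereo")
    frame.sample_rate = 44100
    frame.pts = 0
    graph.push(frame)

    out = graph.pull()
    if out.time_base is None:           # if the graph dropped it, copy it back manually
        out.time_base = Fraction(1, 44100)
    ```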

    1 reply
    Patrick Snape
    I have an open issue at the moment (72 hours old) that I wanted some advice on: PyAV-Org/PyAV#778
    I've been thinking about it over the past few days and I might just be being naive and what I want to do is in fact not currently possible - would love some feedback
    Nikolay Tiunov
    Hi everyone! Does anybody know how to make PyAV encode H.264 frames using the AVCC format instead of Annex B?
    1 reply
    hi all, just trying to use pyav to add some metadata to a container. Does it have bindings for that? If so, they don't seem to be documented anywhere. I tried this search query, which turned up nothing useful. I also tried output_container.metadata["title"] = "foo" and output_stream.metadata["title"] = "foo", neither of which resulted in a file that appeared to have any such attribute when inspected in Windows Explorer.
    I realize this functionality is built into the ffmpeg binary and is trivially accessible via -metadata, but I wasn't using the binary in my script and I'd prefer not to start now if that can be avoided.
    hey, good news! The reason it didn't appear to be working is that Windows Explorer is bad, not that your bindings lack feature parity with ffmpeg. So thank you, contributors, for your time and patience, and great work on this thing; sorry I didn't realize this earlier. For posterity: it isn't explicitly documented, but I was able to work out how to do this by combing the sources and finding some tests.
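    For anyone landing here later, the round trip can be sketched like this, verifying with PyAV itself instead of Windows Explorer (the single black frame exists only to give the muxer something to write):

    ```python
    import io

    import av
    import numpy as np

    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="mp4")
    out.metadata["title"] = "foo"  # container-level tag; written when the file is finalized
    stream = out.add_stream("h264", rate=30)
    stream.width, stream.height = 320, 240
    stream.pix_fmt = "yuv420p"
    frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
    for packet in stream.encode(frame):
        out.mux(packet)
    for packet in stream.encode():
        out.mux(packet)
    out.close()

    # Read the tag back with PyAV.
    buf.seek(0)
    title = av.open(buf).metadata.get("title")
    ```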
    Good day, people! I am a Python developer looking for a way to extract motion vectors from an RTSP video. Before I dive into the library, I would like to ask whether this is something I could achieve using PyAV. In particular, I need to avoid decoding the entire video to save CPU usage. Thank you
    George Sakkis

    Hi all, new to PyAV and video software in general. The example on parsing has a comment "We want an H.264 stream in the Annex B byte-stream format" but doesn't mention why. Indeed, when I try the same example with the original MP4 input file, no packets can be parsed.

    More importantly, and this is my main question: is there a general way to write packets of an arbitrary av.VideoStream so that they can be parsed later, ideally without an intermediate step of calling ffmpeg?
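    One general answer, sketched below: mux the packets into the raw 'h264' format, which writes the Annex B byte stream that the cookbook's parse example expects. Encoding zeros here only keeps the sketch self-contained; with an existing stream you would remux its packets instead:

    ```python
    import io

    import av
    import numpy as np

    raw = io.BytesIO()
    out = av.open(raw, mode="w", format="h264")  # raw Annex B byte-stream muxer
    stream = out.add_stream("h264", rate=30)
    stream.width, stream.height = 320, 240
    stream.pix_fmt = "yuv420p"
    for i in range(5):
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        for packet in stream.encode(frame):
            out.mux(packet)
    for packet in stream.encode():
        out.mux(packet)
    out.close()

    # The result parses with the cookbook's CodecContext approach.
    parser = av.CodecContext.create("h264", "r")
    parsed = parser.parse(raw.getvalue())
    ```

    This is also why the cookbook wants Annex B: MP4 stores AVCC-framed packets without start codes, which the stateless parser cannot split.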

    1 reply
    Muhammad Ali
    I'm using https://pyav.org/docs/develop/cookbook/basics.html#remuxing with Python 3.9, but the script fails to run, complaining that a cache download is failing. What might I be missing?
    1 reply
    Hello! Is there a way to receive more P-frames and B-frames when streaming a video?
    I am still trying to get motion vectors without decoding the actual frame. Is it possible?
    bruno messias
    Hi. Is the Cython method copy_array_to_plane exposed to Python?
    8 replies
    @mfoglio I spent a significant amount of time trying to extract MVs without decoding, but it is not currently possible. decode() must be called on a packet to extract the frame side data (which in turn contains the motion vectors).
    @mfoglio and I am pretty happy with the performance. I am doing that in real time from RTSP, reading data from 3 cameras on a Raspberry Pi, without maxing out the CPU. RAM might be more of a bottleneck depending on frame sizes, but I avoided accumulating the frames themselves in memory and it was fine even with UltraHD video.
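    The pattern looks roughly like this: ask the decoder for +export_mvs and read the motion vectors off each frame's side data. The in-memory clip below (a moving bar, so inter-frames actually carry vectors) only keeps the sketch self-contained:

    ```python
    import io

    import av
    import numpy as np

    # Build a tiny H.264 clip in memory to stand in for the RTSP source.
    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="mp4")
    stream = out.add_stream("h264", rate=30)
    stream.width, stream.height = 320, 240
    stream.pix_fmt = "yuv420p"
    for i in range(30):
        img = np.zeros((240, 320, 3), dtype=np.uint8)
        img[:, 10 * i:10 * i + 16] = 255  # moving bar => inter-frames carry MVs
        for packet in stream.encode(av.VideoFrame.from_ndarray(img, format="rgb24")):
            out.mux(packet)
    for packet in stream.encode():
        out.mux(packet)
    out.close()

    # Ask the decoder to export motion vectors as frame side data; they are
    # only produced by decoding -- there is no packet-level shortcut.
    buf.seek(0)
    container = av.open(buf)
    vstream = container.streams.video[0]
    vstream.codec_context.options = {"flags2": "+export_mvs"}
    nframes = 0
    vectors = []
    for frame in container.decode(vstream):
        nframes += 1
        sd = frame.side_data.get("MOTION_VECTORS")
        if sd is not None:
            vectors.append(sd.to_ndarray())  # structured array of per-block vectors
    ```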
    Rana Banerjee
    Hi guys, thanks a lot for your efforts in creating this great library! I would appreciate it a lot if anyone could help me with my use case: I am consuming a live RTSP stream, processing the frames in OpenCV, and now I would like to stream the processed frames individually, in as compressed a form as possible, transmit them to a server, and then play them in a browser on the client side.
    I am trying to encode an individual frame in H.264 and send the encoded packet as-is, without wrapping it in any container. I plan to use these packets directly in a JavaScript player. So I need help with 2 things:
    1. How do I convert a numpy image into an H.264-encoded packet? I understand I may have to input X frames before I get any output, and that is fine. But hopefully every frame after that would give me a packet output that maps to the frame input X frames ago. Is my understanding correct? Is this the best way to get the most compression in real time, and how exactly do I do this?
    2. Once the packet is received at the other end, what player can be used to play it in a browser? Do I need to put it in a container for this? The issue is that it will be a live stream, with continuous packets coming in and no end time.
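    Regarding (1): with default settings the encoder does buffer X frames, but libx264's zerolatency tune disables the lookahead and B-frames, so every encode() call returns a packet for the frame just fed in. A sketch (black frames stand in for the processed OpenCV images):

    ```python
    from fractions import Fraction

    import av
    import numpy as np

    codec = av.CodecContext.create("h264", "w")
    codec.width, codec.height = 320, 240
    codec.pix_fmt = "yuv420p"
    codec.time_base = Fraction(1, 30)
    # zerolatency removes the encoder delay: one packet out per frame in.
    codec.options = {"tune": "zerolatency", "preset": "ultrafast"}

    per_frame = []
    for i in range(10):
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        frame.pts = i
        per_frame.append(len(codec.encode(frame)))
    ```

    The trade-off is compression: zerolatency gives up B-frames and lookahead, so it costs some bitrate compared to the default settings.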
    @Rana-Euclid regarding (1), this is how I do it (frames is a list of images, numpy arrays). Maybe it is not precisely what you need, but you get the idea...
        def encode_frames(self, calculate_bitrate: bool, options: dict) -> bytes:
            """Encodes frames and returns a movie buffer.

            Args:
                calculate_bitrate (bool)
                options (dict): extra options for the av library

            Returns:
                bytes: movie buffer
            """
            container = "mp4"
            frames = self.all_frames()
            if not frames:
                return bytes()
            mp4_bio = io.BytesIO()
            mp4_bio.name = f"out.{container}"
            new_options = options.copy()
            new_options["video_track_timescale"] = str(self._samplerate)
            # arguments assumed here; this call was cut off in the original message
            mp4_container = av.open(mp4_bio, mode="w", format=container, options=new_options)
            with av.open(self._bio, mode="r") as video_in:
                in_stream = video_in.streams.video[0]
                stream = mp4_container.add_stream("h264", self.FPS)
                stream.pix_fmt = "yuvj420p"
                stream.extradata = in_stream.extradata
                stream.bit_rate = in_stream.bit_rate
            stream.width = frames[0].shape[1]
            stream.height = frames[0].shape[0]
            for i, frame in enumerate(frames):
                format = "rgb24" if frame.ndim >= 3 else "gray"
                video_frame = av.VideoFrame.from_ndarray(frame, format=format)
                for packet in stream.encode(video_frame):
                    mp4_container.mux(packet)
            if i >= 0:  # flush the delayed packets out of the encoder
                for p in stream.encode(None):
                    mp4_container.mux(p)
            mp4_container.close()
            return mp4_bio.getvalue()
    regarding (2), I had a similar requirement and used MPEG-DASH, but this is far beyond the scope of this library.
    Rana Banerjee
    Thanks for the quick reply @quantotto. I see that the frames are being encoded into an mp4 container. Is that absolutely necessary? Can't I create and transmit packets without wrapping them in a container?
    @Rana-Euclid you don't have to use "mp4"; it can be "mpegts" or others. I believe it can be anything you see in the output of the ffmpeg -formats command
    Rana Banerjee
    @quantotto Thanks a lot! Much appreciated.. I will check it out further.
    1 reply
    Rana Banerjee
    Sorry about the trouble, but the issue is that I need to reduce my network bandwidth while keeping the system real time. So I need to transmit an encoded packet for every frame. The moment I wrap the packets in a container (FLV, MP4, etc.), the file size increases. Hence I want to transmit just the packets, without wrapping them in any container. Is that possible in the first place with PyAV or ffmpeg?
    @Rana-Euclid try using format="h264". It will create a raw H.264 video buffer.
    bruno messias

    Hi. Is it possible to use a PyAV VideoFrame to stream to YouTube using the RTMP protocol?

    With ffmpeg it's possible using this

    import subprocess

    import cv2

    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
    command = ['ffmpeg',
                '-f', 'rawvideo',
                '-pix_fmt', 'bgr24',
                '-s', '640x480',
                '-i', '-',  # raw frames arrive on stdin
                '-ar', '44100',
                '-ac', '2',
                '-acodec', 'pcm_s16le',
                '-f', 's16le',
                '-ac', '2',
                '-g', '50',
                '-vcodec', 'libx264',
                '-profile:v', 'baseline',
                '-preset', 'ultrafast',
                '-r', '30',
                '-f', 'flv',
                'rtmp://...']  # stream URL elided in the original
    pipe = subprocess.Popen(command, stdin=subprocess.PIPE)
    while True:
        _, frame = cap.read()
        pipe.stdin.write(frame.tobytes())
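    The same pipeline can stay inside PyAV by opening the output container directly. A sketch writing to an in-memory buffer, with zero arrays standing in for cap.read() frames; for YouTube you would pass the RTMP ingest URL and stream key to av.open instead of the BytesIO (URL shown only as a placeholder):

    ```python
    import io

    import av
    import numpy as np

    # In production: av.open("rtmp://<ingest-url>/<stream-key>", mode="w", format="flv")
    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="flv")
    stream = out.add_stream("h264", rate=30,
                            options={"preset": "ultrafast", "tune": "zerolatency"})
    stream.width, stream.height = 640, 480
    stream.pix_fmt = "yuv420p"

    for i in range(30):
        bgr = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for cap.read()
        frame = av.VideoFrame.from_ndarray(bgr, format="bgr24")
        for packet in stream.encode(frame):
            out.mux(packet)
    for packet in stream.encode():
        out.mux(packet)
    out.close()
    ```

    Note that YouTube generally expects an audio track as well, which this sketch omits.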
    12 replies
    Miguel Martin
    Is it possible to set the start time of an output audio or video container or do I have to generate blank audio/video frames to emulate this?
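    I am not aware of a direct start-time knob, but one workaround that avoids blank filler frames is to start the frame pts at the desired offset. A sketch (using zerolatency so dts equals pts and the muxer does not shift the timeline, which is an assumption worth checking for your container):

    ```python
    import io
    from fractions import Fraction

    import av
    import numpy as np

    start_seconds = 5
    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="mp4")
    stream = out.add_stream("h264", rate=30, options={"tune": "zerolatency"})
    stream.width, stream.height = 320, 240
    stream.pix_fmt = "yuv420p"

    offset = start_seconds * 30  # offset in 1/30-second ticks, the encoder time base here
    for i in range(10):
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        frame.pts = offset + i
        for packet in stream.encode(frame):
            out.mux(packet)
    for packet in stream.encode():
        out.mux(packet)
    out.close()

    # Check where the first packet landed on the timeline.
    buf.seek(0)
    first = next(av.open(buf).demux(video=0))
    start = float(first.pts * first.time_base)  # should land near start_seconds
    ```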
    1 reply

    I would like to restream an RTMP stream to another one, and if the input stream stops, I want to display a static black image. Could you please guide me on how to start? The best would be if I could overlay the input stream on a static image. The restreaming works, but I do not know how to make a black-image packet and mux it into the output stream when the input is over. Here is my code:

    output_url = "rtmp://server.hu/main/test_input"
    input_url = 'rtmp://server.hu/main/'
    container_in = av.open(input_url)
    container_out = av.open(output_url, format='flv', mode='w')
    video_stream_input = container_in.streams.video[0]
    audio_stream_input = container_in.streams.audio[0]
    video_stream_output = container_out.add_stream(template=video_stream_input)
    audio_stream_output = container_out.add_stream(template=audio_stream_input)
    video_stream_output.width = container_in.streams.video[0].codec_context.width
    video_stream_output.height = container_in.streams.video[0].codec_context.height
    video_stream_output.pix_fmt = container_in.streams.video[0].codec_context.pix_fmt
    the_canvas = np.zeros((video_stream_output.height, video_stream_output.width, 3), dtype=np.uint8)
    the_canvas[:, :] = (32, 32, 32)  # some dark gray background because why not
    my_pts = 0
    demuxer = container_in.demux(video_stream_input, audio_stream_input)
    while True:
        packet = next(demuxer, None)
        if packet is None:
            print("stream is not running")
            # TODO: encode a static black image here instead of stopping
            break
        if packet.stream.type == 'video':
            packet.stream = video_stream_output
        elif packet.stream.type == 'audio':
            packet.stream = audio_stream_output
        container_out.mux(packet)

    Thanks in advance!
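    On the black-image part: a template-based output stream has no encoder, so the filler frames have to go through a real encoding stream. A self-contained sketch of producing them (writing to a BytesIO stand-in; in your setup this would be the flv output container, with pts continuing from where the input stopped):

    ```python
    import io

    import av
    import numpy as np

    buf = io.BytesIO()  # stands in for the flv/RTMP output container
    container_out = av.open(buf, mode="w", format="flv")
    video_stream_output = container_out.add_stream("h264", rate=30)
    video_stream_output.width, video_stream_output.height = 320, 240
    video_stream_output.pix_fmt = "yuv420p"

    black = np.zeros((240, 320, 3), dtype=np.uint8)  # the static black canvas
    my_pts = 0
    for _ in range(30):  # one second of filler once the input has dried up
        frame = av.VideoFrame.from_ndarray(black, format="rgb24")
        frame.pts = my_pts
        my_pts += 1
        for packet in video_stream_output.encode(frame):
            container_out.mux(packet)
    for packet in video_stream_output.encode():
        container_out.mux(packet)
    container_out.close()
    ```

    For a seamless switch, the filler stream's codec parameters and timestamps would have to match the original input's, which is the harder part of this approach.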

    6 replies

    I have an nginx RTMP server and am trying to save and read the stream from it for audio processing. For saving the stream to a file, I am using the record all option in the nginx conf.

    For reading the stream, I am using av.open(<rtmp_path>) in Python code, but the start of the file saved by nginx vs. the Python code differs by 10-15 seconds, the nginx recording being ahead of the av one.

    Is there a way to reduce this lag?
    One option would be to read from the file being written by nginx. I tried using av.open(<nginx_file_recording>) on that file and kept getting av.error.EOFError, which means that reading an active file still being written won't work out of the box due to seeking issues; hence I'm wondering how to reduce the lag between the two sources.
    The last resort is to not save the file from nginx at all, but to do it from the Python script. I wanted to avoid this, as the nginx way is a good fallback in case the script doesn't work or errors out in production, and I don't want to lose that 15 seconds of data either.

    my nginx conf looks like this

    rtmp {
        server {
            listen 1935;
            application live {
                live on;
                wait_key on;
                wait_video on;
                record all;
                record_path /home/tmp/recordings;
                allow publish;
                allow play;
            }
        }
    }

    @maheshgawali as an option, why not configure nginx RTMP to record in chunks? For example, setting record_suffix -%d-%b-%y-%T.flv; and record_max_size 256K;, or maybe chunked recording as follows:

    recorder chunked {
        record all;
        record_unique on;
        record_interval 15s;
        record_path /var/rec/chunked;
    }

    (change the values as you see fit) and then you can read the files with PyAV after they have been closed / finalized (if a new file appears, you can safely read the previous ones)

    @quantotto : recording in chunks will also add delay, in this case a 15-second delay, and I want to capture the stream as soon as possible. I already explored this approach and tried reducing the interval down to 2 seconds.
    Time for the stream to reach nginx RTMP + recording interval + post-processing accumulates to more than 2 seconds, and for a real-time use case we cannot go beyond 2 seconds before the user experience deteriorates.
    I'll explore using the option record_interval 1s
    Also, is there a way to determine from rtmp_container = av.open(<rtmp_stream_path>) whether an RTMP stream has stopped?
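    On detecting the end: the demux iterator simply stops (StopIteration), and network inputs typically raise av.error.EOFError or, with timeout= passed to av.open, a timeout error when the source hangs. A sketch with an in-memory clip standing in for the RTMP source so it is self-contained:

    ```python
    import io

    import av
    import numpy as np

    # A short clip in memory stands in for the RTMP stream.
    buf = io.BytesIO()
    out = av.open(buf, mode="w", format="mp4")
    stream = out.add_stream("h264", rate=30)
    stream.width, stream.height = 320, 240
    stream.pix_fmt = "yuv420p"
    for i in range(5):
        frame = av.VideoFrame.from_ndarray(np.zeros((240, 320, 3), np.uint8), format="rgb24")
        for packet in stream.encode(frame):
            out.mux(packet)
    for packet in stream.encode():
        out.mux(packet)
    out.close()

    buf.seek(0)
    rtmp_container = av.open(buf)  # for real use: av.open(url, timeout=10.0)
    packets = rtmp_container.demux(video=0)
    ended = False
    while True:
        try:
            next(packets)
        except (StopIteration, av.error.EOFError):
            ended = True  # the stream has stopped
            break
    ```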
    2 replies
    Hi folks, I was trying to generate some videos, but when I encode the packets, the timestamps (pts) seem to get messed up by the output video stream: the timestamp becomes 16x the original. I suspect it's due to Stream.time_base, but even after I set it explicitly, it's still not working. Has anyone run into this before?
    @sunset2017 I think I ran into something similar, and after some trial and error I found an option that you can pass to the container: video_track_timescale. It should be 1/time_base. I normally read time_base from the input stream. No guarantee it will solve your issue, but it's worth trying.
    samplerate = int(1 / in_stream.time_base)
    options = {}
    options["video_track_timescale"] = str(samplerate)
    # remaining arguments assumed; this call was cut off in the original message
    mp4_container = av.open(mp4_bio, mode="w", format="mp4", options=options)
    thanks @quantotto !!