With the growth of online video platforms, users' demand for 4K high-definition image quality keeps rising. However, many users find that even after purchasing a platform membership, 4K content does not look as expected and can even appear blurry or stutter. Multiple factors contribute to this, including video coding, network bandwidth, and video transmission.
One of the key factors behind the decline in 4K video clarity is that platforms compress video streams to conserve bandwidth, which can result in a bit rate too low to fully exploit the potential of 4K resolution.
In this context, efficiently compressing and transmitting 4K video has become a key technical challenge. This article explores how to use MYIR's MYD-CZU4EV-V2 MPSoC-based platform to capture true 4K 60fps UHD-SDI video sources, use the VCU for efficient H.265 encoding and decoding, and then leverage SGMII 10 Gigabit Ethernet for network streaming to ensure smooth transmission of high-quality 4K video.
1. Causes of Declining Video Quality and Optimization Methods
1) Bandwidth limitation: As the number of users increases, the bandwidth of servers and networks often cannot meet the needs of 4K video streaming.
2) Inadequate compression algorithms: Traditional video compression techniques do not perform well on high-resolution content, which easily leads to blurred images.
3) Optimization of video streaming: During streaming, network bandwidth and video compression efficiency directly determine the clarity and smoothness of playback. To ensure efficient transmission of 4K video over 10 Gigabit Ethernet, this design adopts the following optimization measures (a hedged code sketch follows the list):
- Reasonable bit rate control: adjust the target bit rate of H.265 encoding while preserving video clarity, avoiding both the quality loss caused by an excessively low bit rate and the bandwidth waste caused by an excessively high one. The bit rate can be adjusted dynamically based on network conditions using CBR or VBR modes.
- Low delay mode: the VCU supports a low-delay encoding mode that compresses and transmits video with the lowest possible latency, enhancing the user's viewing experience.
- Network transport protocol selection: select a transport protocol appropriate to the application scenario; UDP is recommended for real-time transmission, while TCP is recommended when data reliability is paramount.
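To make the rate-control point concrete, below is a minimal C sketch that configures the VCU encoder element through GStreamer's API. It assumes the Xilinx gst-omx plugin found in VCU-enabled PetaLinux images, where the encoder is exposed as omxh265enc with control-rate and target-bitrate properties; the property names, enum values, and units (Kbps) should be verified against your BSP's documentation.

```c
/* Hedged sketch: configure VCU H.265 rate control via GStreamer's C API.
 * Assumes the Xilinx gst-omx "omxh265enc" element from a VCU-enabled
 * PetaLinux BSP; verify property names and units against your BSP docs.
 * Build: gcc rate_ctrl.c $(pkg-config --cflags --libs gstreamer-1.0)
 */
#include <gst/gst.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    GstElement *enc = gst_element_factory_make("omxh265enc", "vcu-enc");
    if (!enc) {
        g_printerr("omxh265enc not found (is the VCU plugin installed?)\n");
        return 1;
    }

    /* "constant" (CBR) gives predictable bandwidth use on the 10G link;
     * "variable" (VBR) trades bandwidth stability for quality headroom. */
    gst_util_set_object_arg(G_OBJECT(enc), "control-rate", "constant");

    /* Target bit rate in Kbps: high enough to preserve 4K detail, low
     * enough not to waste link bandwidth. 25 Mbps is only a starting
     * point for 4K60 H.265, to be tuned per content and network. */
    g_object_set(enc, "target-bitrate", 25000, NULL);

    gst_object_unref(enc);
    return 0;
}
```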
2. The Advantages of MPSoC and VCU Architectures in 4K UHD Audio and Video Broadcasting
1) Combination of high performance and low power consumption: The Zynq UltraScale+ MPSoC is built on 16nm FinFET technology and integrates multi-core processors with programmable logic, improving performance while reducing power consumption. This is critical for audio and video broadcasting, as it cuts energy use while sustaining high-definition video transmission.
2) Real-time compression and decompression capabilities: The integrated VCU supports H.264/AVC and H.265/HEVC standards, which can achieve real-time compression and decompression of videos with up to 4K UHD resolution. This means that in broadcast applications, VCU can be utilized for efficient video encoding, reducing storage space and bandwidth requirements while maintaining video quality.
3) Multi-video streaming capability: VCU is capable of processing up to eight different video streams simultaneously, which is useful for 4K UHD broadcast applications that need to broadcast multiple video sources simultaneously. This multitasking capability makes MPSoC ideal for multimedia centers and video servers.
4) Flexibility and scalability: The MPSoC's programmable logic (PL) offers any-to-any connectivity for high-speed video/audio interfaces, bringing customized image and video processing capabilities to the multimedia pipeline. This programmability allows the system to adapt to the changing demands of audio and video broadcasting.
5) Dedicated hardware acceleration: The MPSoC provides dedicated processing engines, such as the Arm Cortex-A53-based APU and the Mali GPU. This dedicated hardware accelerates graphics and video processing tasks, improving overall system performance.
6) Support for diverse video formats: VCU supports up to 4:2:2 10-bit UHD-4K video formats for professional and high-end consumer production and post-production solutions. This extensive format support enables MPSoC to be employed in a wide range of different audio and video broadcast scenarios.
7) Integrated multimedia framework support: The MPSoC, combined with the widely used multimedia framework GStreamer, allows hardware-accelerated multimedia applications to be developed. This integrated support simplifies development and enables developers to rapidly implement complex audio and video processing tasks (see the minimal example after this list).
8) Optimized power management: Zynq UltraScale+ MPSoC positions components such as the processing engine and hardware codecs in distinct power domains with separate rails. This configuration can be utilized to design optimized power management solutions for the entire system to further reduce system power consumption.
9) High-speed interconnect peripherals: The MPSoC provides high-speed interconnect peripherals, such as the integrated DisplayPort interface module, which supports line rates up to 6 Gb/s. This helps process real-time audio and video streams from the PS or PL, further reducing system BOM cost.
10) Support for the new generation of terrestrial digital TV broadcasting technology: With the arrival of the ultra-high-definition TV era, the MPSoC and VCU architectures can support the new generation of terrestrial digital TV broadcasting technology, such as DVB-T2, ATSC 3.0 and DTMB-A, which support higher video quality and new broadcast application modes.
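As a minimal illustration of the GStreamer integration mentioned in point 7, the sketch below encodes a synthetic 4K test pattern to an H.265 file on the VCU. It assumes a VCU-enabled PetaLinux image where the hardware encoder appears as omxh265enc; the pipeline string, resolution, and output path are illustrative.

```c
/* Minimal sketch: hardware-accelerated H.265 encode via GStreamer on MPSoC.
 * Assumes a VCU-enabled PetaLinux image providing "omxh265enc".
 * Build: gcc encode_demo.c $(pkg-config --cflags --libs gstreamer-1.0)
 */
#include <gst/gst.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    /* 300 frames of 4K NV12 test pattern -> VCU encoder -> raw .h265 file */
    GError *err = NULL;
    GstElement *pipeline = gst_parse_launch(
        "videotestsrc num-buffers=300 ! "
        "video/x-raw,format=NV12,width=3840,height=2160,framerate=60/1 ! "
        "omxh265enc ! h265parse ! filesink location=/tmp/test.h265", &err);
    if (!pipeline) {
        g_printerr("Pipeline creation failed: %s\n", err->message);
        g_clear_error(&err);
        return 1;
    }

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    /* Block until the stream finishes or an error is raised. */
    GstBus *bus = gst_element_get_bus(pipeline);
    GstMessage *msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
            GST_MESSAGE_EOS | GST_MESSAGE_ERROR);

    gst_message_unref(msg);
    gst_object_unref(bus);
    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}
```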
3. System Architecture Overview
In this design, we use the Zynq UltraScale+ MPSoC platform, specifically MYIR's MYD-CZU4EV-V2 based on the XCZU4EV device, to implement H.265 compression of SDI video in the FPGA and push it to 10 Gigabit Ethernet via the SGMII interface. The system architecture mainly includes the following parts:
1) Video Input
The video input source can be an SDI camera, an SDI signal generator, or an HDMI-to-SDI conversion device that allows HDMI signals from a computer to be fed into the system. The video signal is processed by Texas Instruments' LMH1219 chip for signal equalization, converting single-ended signals to differential signals, which are then fed into the FPGA for further processing.
2) SDI Video Decoding
Inside the FPGA, the UHD-SDI GT IP core is employed to deserialize the SDI video, converting the signal into the AXI4-Stream format for further processing. The SMPTE UHD-SDI RX SUBSYSTEM IP core decodes the SDI video into an RGB format for display or additional manipulation.
3) Video Frame Buffer and Processing
After decoding, the video frames are stored in the PS-side DDR4 memory. Xilinx's Video Frame Buffer Write IP core is used to manage this process. At this stage, users can apply color conversion, scaling, and other video processing tasks.
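In a PetaLinux image, a capture pipeline built around the Video Frame Buffer Write IP is typically exposed to userspace as a V4L2 device. The sketch below probes such a device; the /dev/video0 node name is an assumption and depends on the board's device tree and media graph.

```c
/* Hedged sketch: probe the V4L2 capture device that a Video Frame Buffer
 * Write pipeline typically exposes under PetaLinux. The /dev/video0 node
 * is an assumption; check your device tree / media graph.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/videodev2.h>

int main(void)
{
    int fd = open("/dev/video0", O_RDWR);
    if (fd < 0) { perror("open /dev/video0"); return 1; }

    /* Identify the capture driver behind this node. */
    struct v4l2_capability cap;
    memset(&cap, 0, sizeof(cap));
    if (ioctl(fd, VIDIOC_QUERYCAP, &cap) == 0)
        printf("driver=%s card=%s\n", cap.driver, cap.card);

    /* Report the currently configured frame format. */
    struct v4l2_format fmt;
    memset(&fmt, 0, sizeof(fmt));
    fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if (ioctl(fd, VIDIOC_G_FMT, &fmt) == 0)
        printf("%ux%u fourcc=%.4s\n", fmt.fmt.pix.width,
               fmt.fmt.pix.height, (char *)&fmt.fmt.pix.pixelformat);

    close(fd);
    return 0;
}
```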
4) H.265 Video Compression
The video frames are then encoded and compressed using the Zynq UltraScale+ VCU IP core. The Video Codec Unit (VCU) accepts YUV 4:2:0 input, so the RGB frames are color-converted first, and it supports encoding resolutions up to 4K at 60fps, making it ideal for high-resolution, high frame-rate applications.
5) SGMII 10Gb Ethernet Transmission
Once the video stream has been compressed into H.265, it is transmitted via the SGMII interface to a 10Gb Ethernet network. Using the PetaLinux system and TCP/UDP protocols, the compressed video stream is sent to a PC or server, where it can be played back in real time using software such as VLC media player, as sketched below.
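Putting the pieces together, the sketch below captures frames from the V4L2 device, encodes them on the VCU, wraps the H.265 stream in MPEG-TS, and pushes it over UDP; VLC on the PC can then open udp://@:5004. Element names follow the Xilinx gst-omx plugin, and the device node, host address, and bit rate are placeholders to adapt.

```c
/* Hedged end-to-end sketch: V4L2 capture -> VCU H.265 -> MPEG-TS -> UDP.
 * Assumes a VCU PetaLinux image with gst-omx; /dev/video0, the host IP
 * and the bit rate are placeholders. On the PC: vlc udp://@:5004
 * Build: gcc stream_tx.c $(pkg-config --cflags --libs gstreamer-1.0)
 */
#include <gst/gst.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    GError *err = NULL;
    GstElement *pipeline = gst_parse_launch(
        "v4l2src device=/dev/video0 io-mode=dmabuf ! "
        "video/x-raw,format=NV12,width=3840,height=2160,framerate=60/1 ! "
        "omxh265enc control-rate=constant target-bitrate=25000 ! "
        "h265parse ! mpegtsmux alignment=7 ! "
        "udpsink host=192.168.1.100 port=5004", &err);
    if (!pipeline) {
        g_printerr("Pipeline creation failed: %s\n", err->message);
        g_clear_error(&err);
        return 1;
    }

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    /* Run until an error occurs (live sources do not emit EOS). */
    GstBus *bus = gst_element_get_bus(pipeline);
    GstMessage *msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
            GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

    gst_message_unref(msg);
    gst_object_unref(bus);
    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}
```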
4. Engineering Workflow
1) SDI input: The LMH1219 performs signal equalization, and the SDI signal is converted to AXI4-Stream format. Through an HDMI-to-SDI converter, 4K 60fps video is fed into the FPGA over 12G UHD-SDI; an SDI industrial camera can also be used as the source.
2) Video decoding: The UHD-SDI GT IP core completes video deserialization, and the SMPTE UHD-SDI RX SUBSYSTEM IP core decodes the video into RGB signals.
3) Video buffering: The Video Frame Buffer Write IP core writes the video to DDR4. Users can optionally add custom ISP processing at this stage, such as image scaling or stitching.
4) Video compression: Use the Zynq UltraScale+ VCU IP core to perform H.265 compression on the video.
5) Network transmission: Use the SGMII 10 Gigabit Ethernet interface to push the compressed H.265 video stream to the PC via the UDP protocol and play it using the VLC player (a receiver-side sketch follows).
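For debugging the link without VLC, a plain UDP receiver on the PC can dump the incoming stream (e.g. MPEG-TS carrying H.265) to a file for offline inspection; the port below matches the placeholder used in the sender sketch above.

```c
/* Hedged sketch: dump the incoming UDP stream to a file for offline
 * inspection when VLC is not available. Port 5004 matches the
 * placeholder in the sender sketch above.
 */
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5004);
    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    FILE *out = fopen("capture.ts", "wb");
    if (!out) { perror("fopen"); return 1; }

    /* Receive datagrams and append their payloads to the capture file. */
    char buf[65536];
    for (;;) {
        ssize_t n = recvfrom(sock, buf, sizeof(buf), 0, NULL, NULL);
        if (n <= 0) break;
        fwrite(buf, 1, (size_t)n, out);
    }

    fclose(out);
    close(sock);
    return 0;
}
```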
MYD-CZU3EG/4EV/5EV-V2 Development Board
5. Conclusion
As video content increasingly moves toward 4K resolution, the SGMII 10Gb Ethernet video compression and streaming solution based on the Zynq UltraScale+ MPSoC platform and VCU offers an efficient way to compress and transmit 4K video. This solution ensures low latency and high-quality image output, making it ideal for applications in video surveillance, medical imaging, industrial automation, and other scenarios where high-resolution video is critical.
For users to enjoy a better viewing experience on online video platforms, service providers and video platforms must optimize video encoding and network transmission to meet the demand for high-quality 4K video content.
6. Efficient Network Offloading with MYC-J7A100T
When streaming to the PC over SGMII 10 Gigabit Ethernet, the host CPU alone cannot keep up with the packet rate of a 10 Gigabit link, so network offloading is required. MYIR's MYC-J7A100T SoM can receive the SGMII 10 Gigabit Ethernet data through its SFP interface, and the PC reads the video source over PCIe, offloading the 10 Gigabit packet handling. In our upcoming series of articles, we will dive deeper into how the MYC-J7A100T's SFP acquisition and PCIe XDMA interrupt-based data retrieval can optimize performance and enhance your video streaming experience.
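As a preview, host software typically pulls the offloaded data through the character devices exposed by Xilinx's XDMA PCIe driver. The sketch below reads card-to-host data in a loop; the /dev/xdma0_c2h_0 node name follows that driver's usual convention but should be treated as an assumption here.

```c
/* Hedged preview sketch: read card-to-host DMA data on the PC through
 * the Xilinx XDMA PCIe driver. The /dev/xdma0_c2h_0 node follows that
 * driver's usual naming and is an assumption for this illustration.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/xdma0_c2h_0", O_RDONLY);
    if (fd < 0) { perror("open /dev/xdma0_c2h_0"); return 1; }

    FILE *out = fopen("video_dump.bin", "wb");
    if (!out) { perror("fopen"); return 1; }

    /* Each read() triggers a DMA transfer from the FPGA into this buffer. */
    enum { CHUNK = 1 << 20 };           /* 1 MiB per transfer */
    char *buf = malloc(CHUNK);
    if (!buf) { perror("malloc"); return 1; }

    for (int i = 0; i < 64; i++) {      /* grab 64 MiB, then stop */
        ssize_t n = read(fd, buf, CHUNK);
        if (n <= 0) break;
        fwrite(buf, 1, (size_t)n, out);
    }

    free(buf);
    fclose(out);
    close(fd);
    return 0;
}
```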
MYD-J7A100T Development Board