<?xml version="1.0" encoding="UTF-8"?>
<record>
  <title>Comparative Analysis of Entropy Modeling Strategies in Learned Image Compression: Hyperprior, Autoregressive, and Transformer-Based Approaches</title>
  <journal>Electronic Devices</journal>
  <author>Hajar Ait Lamkademe</author>
  <volume>15</volume>
  <issue>1</issue>
  <year>2026</year>
  <doi>https://doi.org/10.6025/ed/2026/15/1/33-48</doi>
  <url>https://www.dline.info/ed/fulltext/v15n1/edv15n1_3.pdf</url>
  <abstract>This paper presents a systematic comparative analysis of entropy modeling strategies in learned image
compression (LIC), evaluating hyperprior (HP), autoregressive (AR), and transformer-based (TR) approaches
under a controlled experimental framework. Entropy modeling critically determines compression efficiency
by estimating the probability distribution of latent representations, directly influencing the rate term in rate-distortion
optimization. To isolate the impact of entropy modeling, all architectures share identical encoder-decoder
backbones, latent dimensionality, and quantization schemes, with entropy modeling as the sole
variable.
Results reveal a clear hierarchy in entropy modeling accuracy, quantified by the cross-entropy gap: hyperprior
models exhibit the largest gap due to limited spatial dependency capture; autoregressive models substantially
reduce this gap by leveraging causal local context; and transformer-based models achieve the smallest gap
by exploiting long-range global dependencies, particularly benefiting high-complexity content. However,
improved accuracy entails significant computational trade-offs. Context utilization efficiency analysis shows
autoregressive models excel with small contexts but face diminishing returns with larger ones.
Crucially, decoder-centric complexity emerges as a decisive practical constraint. Hyperprior models enable
parallel decoding with minimal latency and linear scaling, making them ideal for latency-sensitive
applications. Autoregressive models suffer from strictly sequential decoding, resulting in super-linear latency
growth with resolution, rendering them impractical for real-time or high-resolution scenarios. Transformer-based
models offer superior compression gains but incur high memory demands and quadratic complexity
in global attention configurations; however, configurable attention mechanisms enable controllable
performance-complexity trade-offs.
Rate-distortion-complexity Pareto analysis confirms that no single approach dominates universally: hyperpriors
excel in low-complexity regimes, transformers lead in high-quality compression, and autoregressive models
occupy an intermediate position. The study concludes that entropy modeling selection must balance
compression efficiency against decoder feasibility, with scalable context utilization being critical for real-world deployment.</abstract>
</record>
