Real-Time Multi-Person Facial Expression Recognition Pipelines for CPU and Edge Deployment: A Systematic Review of Evidence and Performance Insights

Pathirana, HSPLM; Pradeep, RMM; De Silva, LDTT

dc.contributor.author	Pathirana, HSPLM
dc.contributor.author	Pradeep, RMM
dc.contributor.author	De Silva, LDTT
dc.date.accessioned	2026-03-06T06:18:09Z
dc.date.available	2026-03-06T06:18:09Z
dc.date.issued	2026-01
dc.identifier.uri	https://ir.kdu.ac.lk/handle/345/9036
dc.description.abstract	Real-time multi-person facial expression recognition (FER) is increasingly important for applications such as telemedicine, e-learning, workplace safety, and human–computer interaction, particularly in resource-constrained CPU and edge-device environments. Despite rapid advances, selecting FER pipelines that balance accuracy, latency, and scalability without GPU acceleration remains a significant challenge. This paper presents a PRISMA-guided systematic review of 34 peer-reviewed studies published between 2015 and 2025, sourced from IEEE Xplore and ScienceDirect, with the aim of identifying the most practical pipelines for real-time multi-person FER on consumer-grade hardware with CPUs and integrated GPUs. The review evaluates four widely adopted approaches, namely MTCNN, RetinaFace, MediaPipe Face Detection, and DeepFace, using reported metrics such as face detection accuracy, expression classification performance, latency, frames per second (FPS), multi-face robustness, and CPU feasibility. The analysis reveals that MediaPipe-based pipelines are consistently reported to achieve approximately 30–60 FPS on commodity CPUs, enabling stable multi-face tracking with low computational overhead. In contrast, RetinaFace demonstrates higher face detection accuracy, while DeepFace-based FER pipelines achieve higher expression classification accuracy when combined with robust face detection. However, both RetinaFace and DeepFace-based pipelines typically operate at approximately 5–10 FPS on CPUs without optimization, which limits their scalability in crowded or time-critical scenarios. The review also identifies inconsistent evaluation protocols and incomplete hardware reporting across studies, which hinder reproducibility and fair comparison. Overall, the findings position MediaPipe as the most practical solution for real-time multi-person FER on CPU and edge platforms and highlight the need for standardized evaluation frameworks to support future research and deployment.	en_US
dc.language.iso	en	en_US
dc.subject	facial expression recognition, multi-person detection, edge computing, MediaPipe, real-time systems	en_US
dc.title	Real-Time Multi-Person Facial Expression Recognition Pipelines for CPU and Edge Deployment: A Systematic Review of Evidence and Performance Insights	en_US
dc.type	Article Abstract	en_US
dc.identifier.faculty	FOC	en_US
dc.identifier.journal	FOCSS	en_US
dc.identifier.issue	6	en_US
dc.identifier.pgnos	5	en_US

Files in this item

Name: FOCSS 2026 5.pdf
Size: 494.5Kb
Format: PDF


This item appears in the following Collection(s)

FOC STUDENT SYMPOSIUM 2026 [52]
