Generating Personalized Summaries of Day Long Egocentric Videos

The popularity of egocentric cameras and their always-on nature has led to an abundance of day-long first-person videos. The highly redundant nature of these videos and extreme camera shake make them difficult to watch from beginning to end. These videos require efficient summarization tools for...

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 6, 15 June, pages 6832-6845
Main author: Nagar, Pravin (author)
Other authors: Rathore, Anuj; Jawahar, C V; Arora, Chetan
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't
LEADER 01000caa a22002652c 4500
001 NLM331549573
003 DE-627
005 20250302133844.0
007 cr uuu---uuuuu
008 231225s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2021.3118077  |2 doi 
028 5 2 |a pubmed25n1104.xml 
035 |a (DE-627)NLM331549573 
035 |a (NLM)34613911 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Nagar, Pravin  |e verfasserin  |4 aut 
245 1 0 |a Generating Personalized Summaries of Day Long Egocentric Videos 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 23.05.2023 
500 |a Date Revised 23.05.2023 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a The popularity of egocentric cameras and their always-on nature has led to an abundance of day-long first-person videos. The highly redundant nature of these videos and extreme camera shake make them difficult to watch from beginning to end. These videos require efficient summarization tools for consumption. However, traditional summarization techniques developed for static surveillance videos or highly curated sports videos and movies are either not suitable or simply do not scale to such hours-long videos in the wild. On the other hand, specialized summarization techniques developed for egocentric videos limit their focus to important objects and people. This paper presents a novel unsupervised reinforcement learning framework to summarize egocentric videos in terms of both length and content. The proposed framework facilitates incorporating various prior preferences such as faces, places, or scene diversity, as well as interactive user choice in terms of including or excluding a particular type of content. This approach can also be adapted to generate summaries of various lengths, making it possible to view even 1-minute summaries of one's entire day. When using the facial saliency-based reward, we show that our approach generates summaries focusing on social interactions, similar to the current state-of-the-art (SOTA). The quantitative comparisons on the benchmark Disney dataset show that our method achieves significant improvement in Relaxed F-Score (RFS) (29.60 compared to 19.21 from SOTA), BLEU score (0.68 compared to 0.67 from SOTA), Average Human Ranking (AHR), and unique events covered. Finally, we show that our technique can be applied to summarize traditional, short, hand-held videos as well, where we improve the SOTA F-score on benchmark SumMe and TVSum datasets from 41.4 to 46.40 and 57.6 to 58.3 respectively. We also provide a PyTorch implementation and a web demo at https://pravin74.github.io/Int-sum/index.html 
650 4 |a Journal Article 
650 4 |a Research Support, Non-U.S. Gov't 
700 1 |a Rathore, Anuj  |e verfasserin  |4 aut 
700 1 |a Jawahar, C V  |e verfasserin  |4 aut 
700 1 |a Arora, Chetan  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 45(2023), 6 vom: 15. Juni, Seite 6832-6845  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnas 
773 1 8 |g volume:45  |g year:2023  |g number:6  |g day:15  |g month:06  |g pages:6832-6845 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2021.3118077  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 45  |j 2023  |e 6  |b 15  |c 06  |h 6832-6845