NOISE REDUCTION FOR FACE IDENTIFICATION IN VIDEO

Computational Complexity
7 Conclusion and Future Work
7.1 Summary of Contributions
7.2 Future Work
Bibliography

List of Figures

3.1 (a) 10 sample functions estimated from the GP prior; (b) 10 sample functions estimated from the GP posterior; (c) the blue plot is the true function, the dashed red plot is the predicted function by GP (based on $\mu_*$) and the gray area marks the standard deviation away from the predicted function (based on $\sigma_*$).
3.2 Sample rectangle features. Two-rectangle features are shown in (A) and (B), (C) shows a three-rectangle feature,
and (D) a four-rectangle feature. Figure is adopted from the paper authored by Viola and Jones [2004].
3.3 Image before (left) and after (right) histogram equalization. The histograms of intensity levels of each image are illustrated in the second row.
3.4 Local Binary Pattern operator
3.5 Histogram of Oriented Gradients. Left: raw image; Right: extracted HOG features (using blocks of 16 x 16 pixels).
4.1 A sample sequence from the YouTube Celebrities dataset collected by Kim et al. [2008] (top), its EASR based on intensity features (middle), and a matched EASR from another clip (bottom).
4.2 The specialization step for k = 4 (best viewed in color).
4.3 Samples … in the generalization step when training …
4.4 Flowchart for the identification process (left); effect of minimum cut-off for EASR confidence (T) on accuracy (right), exploring the … dataset.
5.1 Sample faces extracted from the CMU MoBo dataset. Provided by Cevikalp and Triggs [2010].
5.2 Sample faces extracted from the YouTube Celebrities dataset: (a), (b), and (c) illustrate samples from 3 different clips of the same person.

Acknowledgements

First, I would like to thank my supervisor, Professor Liang Chen, for his support, encouragement, patience, and help in both my education and life. His insight and guidance were what made this thesis possible, and I am grateful to him for all the academic achievement I have had during this degree. I have learned an enormous amount from working with him, in terms of both a researcher and a supervisor. I hope to follow his example in the future.

A number of sources funded my research. In particular, I would like to thank the University of Northern British Columbia, Mitacs,
and NSERC for the financial support that they provided either directly or through grants awarded to Dr. Chen.

Thanks to all my friends in Prince George, Nahid and Mani, Mojtaba, Dhawal, Iona and Rahim, and Behrooz. You were what made living here so amazingly rewarding.

I am deeply grateful to my family. My father, Nemat Hassanpour, who emphasized the importance of education and who instilled in me the inspiration to set high goals and the confidence to achieve them. My mother, Soheyla Norouzi, who has always been a source of motivation and strength, and my role-model for hard work, persistence and personal sacrifice. And my sister, Baluu Hassanpour, who has been my emotional anchor through my life.

Finally, I would like to thank my lovely husband, Samad Kardan. It was difficult for us, since we lived across the province from one another for most of the time of this degree. Samad supported me at all times, in a loving, empathetic, optimistic, calm, and patient manner. This thesis is dedicated to him.

Chapter 1 Introduction

Face identification is considered one of the most important applications of image analysis and understanding and has received a lot of attention from the machine vision community through the past several years. The classical face identification task involves identifying a subject from a single image and with only a few training samples available. For such a configuration, the sample images must be carefully recorded in a controlled environment. However, in real-world applications, such good quality samples are not easily attainable. Fortunately, with the extensive availability of digital imaging devices, sufficient data is accessible to allow the recognition process to be based on image-set to image-set matching. An image-set could be either a collection of single-shot images featuring a person, or a sequence of frames in a video.
The wealth of information extracted from a sequence of frames in a video featuring an individual's face offers the potential to overcome the recognition errors that may occur due to the imperfect quality of the images, a privilege that is absent in the single-shot face identification task.

Image-set based face identification in general, and video based face identification in particular, is potentially more promising than using single-shot images. This type of face identification tends to be more robust since the recognizer gets to see many more possible variations in appearance of the subject. However, with image-sets, a new challenge emerges due to the uncertainty on how well each image represents the individual it is associated with. Uncertainties may be due to (i) poor quality of the image (e.g. low resolution, illumination contrast, etc.), (ii) partial existence of the face in the image's field of view (e.g. occlusion, pose, etc.), and (iii) failure of the face detector algorithm to accurately spot the face. All in all, uncertainties are imposed due to the fact that each image may not fully characterize the subject's face.

Therefore, it is important to devise algorithms that can fully and efficiently exploit the available data. This can be done through reducing the effect of noisy samples by designing a representation structure capable of relaxing the noise in each sequence, complemented by developing a recognition procedure that rejects the wrong decisions affected by noise. Devising a principled way to systematically deal with the uncertainties on how well each frame can represent the individual has not yet been directly addressed in the literature. This is the focus of this thesis.

1.1 Overview of this thesis

Quantum probability theory offers a powerful framework for representing information and making inferences in systems
with multiple sources of uncertainty. To deal with uncertainties, quantum theory proposes to consider an ensemble of all possible initial states (i.e., simultaneous events), and attempts to find the most probable outcome of the system. To achieve this, the mathematical formalism of quantum theory extends the ordinary logic by the concept of simultaneous decidability - a concept introduced by von Neumann and Beyer [1955] - which allows physicists to continue the reasoning while considering the uncertainty associated with the state of the system.

Inspired by this line of research, we propose a Quantum Probability Inspired Framework (referred to as QPIF) for face identification in videos which uses the quantum probabilities to address the underlying uncertainties associated with each image. It is suggested that the mathematical foundations of quantum probabilities would construct a sound knowledge representation system. In this representation, information extracted from the images of the known identities forms subspaces in a Hilbert space, where each subspace represents one subject of known identity. When presented with an image-set that belongs to a subject of unknown identity, an ensemble of uncertain states is generated from the image-set. Recognition is posed as an optimization problem to find the best-ranked subspace to which the ensemble of states corresponds.

Next, we will provide a dual extension of the quantum framework called Ensemble of Abstract Sequence Representatives (referred to as EASR; the nomenclature will be described in section 4.2). In the EASR approach, the representation structure of all images - either of known or unknown identity - is the same as the concept of initial state in the quantum formalism. This approach towards data representation would uniformly relax the noise of the raw data points by transferring them into a higher level representation space. EASR is
built through a process that includes sampling and superposition of the raw images in order to reduce noise. This is followed by a filtering mechanism to deal with outliers.

Similar to the majority of the image-set based face identification approaches (e.g. work by Wang and Chen [2009], Hu et al. [2012], Wang et al. [2012b], Kim et al. [2007], Yang et al. [2013]) that use a single structure to model each image-set, each EASR tries to model the variations in appearance of the subject in an image-set. Similarity of EASRs is calculated as the distance between each pair of train (of known identity) and test (of unknown identity) image-sets. Identification is performed by finding the most similar representation of a known candidate to different representations of the unknown subject, and then aggregating the identification results of all candidates via majority voting over the different representations generated for the unknown subject.

Although EASRs reduce the noise in data, they are linear representations and are not capable of capturing the underlying non-linear structure of the data. Therefore, on top of the EASR representation method we introduce an ensemble of binary Gaussian process (GP) models in a one-versus-rest setting for capturing the underlying non-linearity in the data. To reduce the amount of noise presented to the GP models during the training we use a learning scheme called specialization-generalization. The specialization step attempts to find a subset of training data samples such that the highest discriminative power is achieved (i.e., only select the most challenging samples to train the classifier, not all the training samples). The generalization step attempts
to reduce the effect of possibly noisy training samples by re-training the model on the new unseen samples from the training set that the model failed to classify correctly, thus making sure that the model generalizes well. Finally, in the face identification process, the predictions of both methods are used to identify the subject in the probe sequence. We refer to this hybrid identification system as EASR+GP.

1.2 Main contributions

The main contribution of this work is two-fold. First, we propose two representation structures for image-sets that are designed to minimize the effect of noisy frames. Inspired by the concepts in quantum probability theory, QPIF and its dual extension EASR are designed such that the impact of those frames that are not useful for the identification task (probably due to occlusion, low resolution, or failure of the face tracker algorithm) is reduced. Second, a novel learning scheme is proposed for efficient training of an ensemble of binary Gaussian process models. This learning scheme selectively samples from the training data in order to not only increase the discrimination power of the classifier, but also to build the models using the least possible computational cost and with minimum introduction of noise.

Assessment of the proposed methods on three publicly available benchmark datasets demonstrates significantly higher performance compared to the previous methods in the literature, including the state-of-the-art.

Part of the contents of this thesis has been published in the 11th IEEE International Conference on Automatic Face and Gesture Recognition (Hassanpour and Chen [2015a]) and another part is published as a technical report at the Intelligent Computational Laboratory - University of Northern British Columbia (Hassanpour and Chen [2015b]).

1.3 Organization of this thesis

The rest of this document is organized as follows:

In chapter 2 the previous works on the three main aspects of this research are discussed. This
includes Image-set based Face Identification, Quantum Theory in Information Retrieval, and Gaussian Processes in Computer Vision.

In chapter 3 the background information on which the foundation of the proposed methods is based is explained. In the first two sections of this chapter, the mathematical concepts of Quantum Theory and Gaussian Processes are discussed. In the third section the Face Detection algorithms that are used to locate faces in a video frame are briefly introduced. The fourth section touches on the Feature Extraction methods. And in the final section, a brief introduction to the Welch t-test is provided. This statistical test is employed to check whether the improvements are significant.

In chapter 4, the proposed algorithms are elaborated in light of the mathematical foundations discussed in chapter 3. The proposed methods are: (i) Quantum Probability Inspired Framework (QPIF), (ii) Ensemble of Abstract Sequence Representatives (EASR), and (iii) Ensemble of Gaussian Process Models on top of the EASR approach (EASR+GP) with a hierarchical approach on the identification process.

In chapter 5 the experimental setup for performance evaluation of the proposed methods against the previous methods in the literature is described. This includes an introduction to the evaluation datasets (namely, Honda/UCSD, CMU MoBo, and YouTube Celebrities), as well as the settings that were equally set for evaluating all methods to allow for fair comparison. These settings include: face detection algorithm, image resolution, extracted features, and partitioning the available data into train and test subsets.

In chapter 6 the identification accuracy of the three proposed approaches as well as the most successful methods in the literature are reported.
Additionally, the average computation times of all methods are noted in order to give a sense of each algorithm's computational complexity.

Finally, in chapter 7 this document is concluded by summarizing the main contribution of the proposed methods and highlighting the future directions of this research.

Chapter 2 Previous Work

In this chapter the previous works conducted in the literature are discussed. In the first section, an overview of the current image-set based face identification techniques is provided. Next, in the second section, a brief survey summarizing the applications of quantum probability theory in computer science in general, and machine learning tasks in particular, is provided. Finally, the third section gives a brief history of employing Gaussian process models to solve machine vision problems.

2.1 Image-set based Face Identification

In most of the published studies the task of image-set based face identification is addressed in two steps: (i) representation of the image-sets, (ii) finding a suitable similarity measure between them. In the following of this section, the basic concepts of the most known approaches (including the state-of-the-art) are elaborated.

2.1.1 Representation

It is essential to be able to represent every image-set (often with a varying number of images inside) in a unified manner, so that they can be compared with each other. The aim of a representation structure is to provide a well-defined method to transfer information embedded in a set of images into a unified structure and preserve as much information as possible. The representation structure of the image-sets can be either parametric or non-parametric.

Parametric methods

Parametric methods attempt to represent each image-set with a data-driven distribution function. For instance, Arandjelovic et al. [2005] fit a Gaussian mixture model to each image-set and use this distribution as the representative of the respective image-set. Parametric methods, however, suffer from the assumption that all image-sets representing the same identity are drawn from the same distribution. Empirical results have shown that this is most likely not the case. Therefore, the majority of the current works design a non-parametric representation structure.

Non-parametric methods

The non-parametric representations are divided into two categories: linear and non-linear.

Linear Representation. Most notable linear methods include:

• Mutual Subspace Method MSM, proposed by Yamaguchi et al. [1998], constructs a linear subspace for each image-set and calculates the similarity using the Euclidean angle between the two subspaces.

• Discriminant Canonical Correlation DCC, proposed by Kim et al. [2007], finds an optimal discriminant function that transforms the image-sets into another space in which the within-class canonical correlations are maximized while the between-class canonical correlations are minimized.

Non-linear Representation. Non-linear methods include:

• Constrained Mutual Subspace Method CMSM, proposed by Fukui and Yamaguchi [200~], constructs a constrained subspace that only includes the effective components of the input images for recognition (using principal component analysis) and measures the similarity between image-sets as the multiple canonical angles between the two subspaces.

• Kernel Grassmannian Distance KGD, proposed by Wang and Shi [2009], is a kernel generalization of the Grassmannian distance in order to capture the non-linear structures in the image-set.

• Manifold Discriminant Analysis MDA, proposed by Wang and Chen [2009], forms the subspaces for each image-set with locally linear models (i.e., manifolds) and attempts to learn an embedding space
where each manifold is compact but manifolds of different classes are as separated as possible.

• Manifold-Manifold Distance MMD, proposed by Wang et al. [2012b], formulates the recognition task as computation of distance between two locally linear subspaces of data (i.e. manifolds).

2.1.2 Similarity Measurement

Once every image-set is represented in a unified manner, we can start to find out which image-sets are most similar to each other, and consequently, develop a method to perform tasks such as identification. The method of similarity measurement very much depends on the nature of the representation method. If the representation is parametric, as in the work by Arandjelovic et al., where each image-set is represented as a Gaussian mixture model, then similarity between a pair of image-sets is measured by calculating the between-set distribution distance (e.g., Kullback-Leibler divergence).

Measurement of similarity in non-parametric representations can be divided into three categories:

Exemplar Based. The first category of similarity measurement methods are the ones that are based on calculating the distance between representatives of the two image-sets. Instances include:

• Affine/Convex Hull based Image-set Distance AHISD and CHISD, proposed by Cevikalp and Triggs [2010], represent each image-set by an affine/convex hull derived by spanning the subspace using the images in the set. The similarity is measured as the distance between the closest exemplars of each image-set.

Structure Based. Similarity can also be measured based on the representation structure as a whole. Instances include:

• Covariance Discriminant Learning CDL, proposed by Wang et al. [2012a], represents the image-set by its covariance matrix (i.e., second-order statistic). This formulates the problem as classification in the Riemannian manifold. The proposed function converts the covariance matrix from the Riemannian manifold to Euclidean space, where measurement of
similarity is straightforward.

Exemplar and Structure Based.

• Sparse Approximated Nearest Point SANP, proposed by Hu et al. [2012], is a similarity measurement method that utilizes both the structural information of the image-sets and their representatives. The kernel extension of this approach, KSANP, allows for modelling the complex non-linear structures that are embedded in the data.

• Regularized Nearest Point RNP, proposed by Yang et al. [2013], models each image-set as a regularized affine hull and measures the similarity between the two sets by calculating the distance between the nearest points of the two hulls representing each image-set. RNP is an improvement over SANP in terms of complexity reduction.

2.1.3 Other Methods

Some recent methods have a holistic approach towards their representation structure which is complemented by their own representation-specific approach for similarity measurement. Instances include:

• Dictionary-based Face Recognition from Video DFRV, proposed by Chen et al. [2012], captures variations in videos and records them in a dictionary while removing their redundancies. Identification is performed via majority voting.

• Mean Sequence Sparse Representation-based Classification MSSRC, proposed by Ortiz et al. [2013], performs a joint optimization over the frames in a probe to determine a linear relationship between them and all available training images.

• Joint Sparse Representation JSR, proposed by Cui et al. [2014], represents all frames of a video sequence as an ensemble to build its model, in order to suppress the effect of … for a more stable recovery.

• Image-set based Collaborative Representation and Classification ISCRC, proposed by Zhu et al. [2014], models the probe image-set as a convex or regularized hull and calculates the distance to the image-sets in the gallery considering the correlation between the two.

2.2
Quantum Theory in Information Retrieval

In the past two decades, quantum theory has found its way into theoretical computer science problems. Examples range from algorithms for database search (work by Grover [1996]) to decision theory (work by Pothos and Busemeyer [2009]), game theory (work by Piotrowski and Sladkowski [2003]), and information retrieval (work by Piwowarski et al. [2010]), to name a few. The quantum information retrieval framework, for instance, focuses on representing queries and documents in terms of quantum probability theory in order to deal with the uncertainty imposed by ambiguous queries.

… ambiguous queries might … the primitive concepts of quantum theory are: (i) "state" … each known … employs … the event is … Such representation supports learning while being effective and computationally efficient.

Figure 3.2: Sample rectangle features. Two-rectangle features are shown in (A) and (B), (C) shows a three-rectangle feature, and (D) a four-rectangle feature. Figure is adopted from the paper authored by Viola and Jones [2004].

In the second stage, a simple classifier is trained to select a few critical visual features from a very large set of potential features. The classifier is AdaBoost, proposed by Schapire et al. [1998], which in this work has been used to build a greedy feature selector. AdaBoost aggregates the predictions of a large set of (weak) learners via a weighted majority vote. The weight given to each weak learner determines the importance of that learner, and in this case, the feature.

In the third stage, the classifiers are combined in a "cascade" setting that allows for discarding the background regions of the image so that the focus remains solely on the promising face-like regions. Each layer of the cascade classifier only lets through the sub-windows in the image that it predicts to be a positive one, i.e.,
part i a ll~ · 2;) containing th fac<'. ar a fth fa m 3.3.2 he fine l output of 1he cascade classifi<'r \Yordd dct<'rminc he h tati nary1n1 ag. f r Vi u 1 Tr In r m nt 1 1 rnin h In r m nt all ar11ing [ r is na l kin r ackin g (I\ ·r) a lgorith m pr pos d by Hoss C' cl. [200 ] att mp . t d a l \ ith 11011-,' ati n ar.v data (vicl ~) wh 'f both th t a r gPt obj ct and the h ackg r und hang m · r ti1n (i .. , ·am ra motion ). 1 his a lgorithn1 rffic1rntly l a rn n 1 upd at s a low dim n ,'ional .'nh pac repref-, ·nt at ion of t hC' targr>t obj t. ThC' tar ~ct object is first mod<'ll<'cl h.' a cmn pact 1 C'pr<'sc nt at ion s1lll('( tln ' t lm1 f'acilit at h a ng bj t r cognition. Th 111 appearan ul p ac f th t r get m d 1 i. ontinuousl)· updat d to rrflC'ct t h r l)j ct Yia 8 11 fficirnt in TC'nwntal Ill<'tho d w hi h all w for tr a king. 3.4 Feature Extraction In order to C'Xp<'rimcut with different fC' at nrC' typc•s, a nd show 1h at t hC' proposed a lgorithn1 w uld work well with a ny of th m. we u s . in th lit ratur , nanwly, histogram on1n1on visual feature . us d qu a lized int n sity level , Local Bina ry Patt ern (LBP) codes. and Histogram of Ori nt d Gradi nt (HOG ) des riptor . The a knl at ion proccdur<' of of thesC' feature' s arc hric'fly di scussC'd in th<' rC'st of this s<'ction . 26 3.4.1 Hi togram qu liz ti H istognun rqualization i. a c utra st nhanc nwnt t chniqnr by adjnst ing t lw intPn- sit i s of i1na g 1 ix ls by rv nl~· clis t rilmt ing t h<'m in the inw gc hist ogrmn . his ad- justment r snlt s in a high r contra.' t for t h )S<' n ·g ion ~ pn'vionslv with n lower local contrast . I h ·togrmn equa liza tion gr a tlv C' nh c-11H '<'S the qu a lity of the inlH g <'s with pc or illmnina tion . This met hod is SJWcifica llv n~ dul for low qnnlitv \'icleos. 1gnre :3.:3 shows a u in1agr h 'for e' a nd after his togr a m Pqu a li za t ion . 
Figure 3.3: Image before (left) and after (right) histogram equalization. The histograms of intensity levels of each image are illustrated in the second row.

3.4.2 Local Binary Pattern

The Local Binary Pattern (LBP) operator, first introduced to the machine vision community by Ojala et al. [1…], is a gray-scale invariant texture descriptor. The LBP operator is known to be robust to monotonic gray-scale variation, which allows for elimination of the effect of poor illumination in images. Also, its computational simplicity makes it a perfect tool for image analysis in challenging real-time settings. This section is an overview of how to derive the LBP codes. A more comprehensive explanation of this process is given in Pietikäinen [201…].

Figure 3.4 illustrates an arbitrary 3 x 3 block cropped from a gray-level image. The LBP pattern for the pixel in the center (named as $g_c$) is derived by thresholding the values of its neighbouring pixels (named as $g_{p_i}$) with respect to the value of $g_c$:

Figure 3.4: Local Binary Pattern operator. [Panels: neighbourhood layout $g_{p_0}, \ldots, g_{p_7}$ around $g_c$; an example 3 x 3 block of gray levels; the thresholded bits.] Binary code: 0011 1110

$$\text{LBP code} = \sum_{i=0}^{P-1} \mathrm{sign}(g_{p_i} - g_c)\, 2^{i} \qquad (3.9)$$

where $P$ is the number of neighbouring pixels and the sign function is defined as:

$$\mathrm{sign}(x) = \begin{cases} 1 & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad (3.10)$$

Once the local binary patterns are calculated for every pixel throughout the whole image/patch, the LBP code for that image/patch is derived as the histogram of these patterns. There are $2^P$ different LBP patterns. However, LBP does not use $2^P$ bins in constructing the histogram. In other words, some patterns would fall in the same bin. To perform the binning, the LBP patterns are classified into two categories: uniform and non-uniform.
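The thresholding and weighting in (3.9) and (3.10) can be sketched in a few lines of plain Python. The function names, the convention that `sign(0)` equals 1, the order in which the neighbours are listed, and the gray levels used below are assumptions made for illustration; they are not taken from Figure 3.4.

```python
def sign(x):
    """Thresholding function of (3.10); assumes 1 when x >= 0, else 0."""
    return 1 if x >= 0 else 0


def lbp_code(neighbours, center):
    """LBP code of (3.9): sum over i of sign(g_pi - g_c) * 2**i."""
    return sum(sign(g - center) * 2 ** i for i, g in enumerate(neighbours))


def lbp_histogram(codes, num_neighbours=8):
    """Histogram of raw LBP codes using all 2**P bins (no uniform binning)."""
    hist = [0] * (2 ** num_neighbours)
    for code in codes:
        hist[code] += 1
    return hist


# Hypothetical gray levels, listed in the order g_p0 ... g_p7.
neighbours = [6, 5, 2, 1, 7, 8, 9, 3]
code = lbp_code(neighbours, center=5)
print(code, format(code, "08b"))
```

The histogram helper keeps one bin per raw pattern, so it corresponds to the $2^P$-bin case only; merging patterns into shared bins would be an additional step on top of it.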
The LBP pattern of a pixel is considered uniform if there are none or two transitions between 0 and 1 in the binary code of its neighbouring pixels.(3) It can be shown that there are $P(P-1)$ patterns that have two transitions,(4) and 2 patterns that have no transitions.(5) The rest of the patterns, which are non-uniform, fall into the last bin. Therefore, the final histogram will have $P(P-1)+3$ bins.

(3) For instance, the binary code stated in Figure 3.4 contains two transitions; one from the 2nd bit to the 3rd, and the other from the 7th bit to the 8th.
(4) Choose $(P-1)$ different numbers of 0s (or 1s), then put them in $(P)$ desired places.
(5) Either all bits are 0 or all are 1.

Finally, the histograms generated on each patch of the image are concatenated, which forms a vector called the LBP feature vector. Please note that in this work we use the normalized version of this LBP feature vector, meaning that all images construct a unit vector when represented by their respective LBP feature vector.

3.4.3 Histogram of Oriented Gradients

The Histogram of Oriented Gradient (HOG) descriptor was first proposed by Dalal and Triggs [2005] for the purpose of object detection in general and human detection in particular. This method is based on the idea that the local appearance of an object can be described by the distribution of intensity gradients (i.e. edge directions).

To calculate the HOG descriptor, each image is first divided into small connected blocks and a histogram of gradient directions is calculated for the pixels within each block. This is derived by filtering each block with a derivative mask such as $[-1, 0, 1]$ (horizontal edge detector), $[-1, 0, 1]^T$ (vertical edge detector), and other more complex masks such as the Sobel mask. The HOG descriptor is represented by the concatenation of these histograms (see Figure 3.5).

Figure 3.5: Histogram of Oriented Gradients.
Left: raw image; Right: extracted HOG features (using blocks of 16 x 16 pixels).

3.5 Welch t-test

For performance comparison purposes between different methods, we need a statistical tool to compare the accuracies (mean ± standard deviation) derived by different methods. Welch's t-test (or unequal variances t-test), proposed by Welch [1947], is a two-sample test that is used to test the hypothesis that two populations have equal means. Welch's t-test is performed when the assumption of equal variance between two populations is not satisfied. In this study, the accuracy of different algorithms varies at different rates (i.e., different variances); therefore, the appropriate test is the Welch t-test.

Additionally, the Welch t-test is more reliable than the more commonly used Student t-test when the two samples have unequal sample sizes. This property comes in handy if there is no implemented code for an algorithm that we could run with the exact same train/test partition of data, yet we still need to compare our results against the respective method's.

Welch's t-test defines the statistic $t$ by the following formula:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}} \qquad (3.11)$$

where $\bar{X}_1$, $s_1^2$ and $N_1$ are the first sample's mean, variance, and size respectively (and likewise $\bar{X}_2$, $s_2^2$ and $N_2$ for the second sample). The Welch-Satterthwaite equation is used to calculate the degrees of freedom $\nu$ associated with this variance estimate:

$$\nu \approx \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{s_1^4}{N_1^2\,\nu_1} + \dfrac{s_2^4}{N_2^2\,\nu_2}} \qquad (3.12)$$

where $\nu_1 = N_1 - 1$ and $\nu_2 = N_2 - 1$ are the degrees of freedom associated with the first and second variance estimates respectively. The approximate degree of freedom is rounded down to the nearest integer.

These two calculated statistics, $t$ and $\nu$, can be used with the t-distribution to test the null hypothesis that the two populations have equal means (using a two-tailed test).
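The two statistics can be computed directly from the reported means, variances, and sample sizes. The sketch below is a minimal plain-Python version of (3.11) and (3.12); the function name `welch_t_test` and the accuracy figures in the usage lines are hypothetical and are not results from this thesis.

```python
import math


def welch_t_test(mean1, var1, n1, mean2, var2, n2):
    """Return Welch's t statistic (3.11) and the Welch-Satterthwaite
    degrees of freedom (3.12), rounded down to the nearest integer."""
    se1 = var1 / n1  # s_1^2 / N_1
    se2 = var2 / n2  # s_2^2 / N_2
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    nu = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, math.floor(nu)


# Hypothetical accuracies: 0.90 +/- 0.02 over 10 runs vs 0.85 +/- 0.03 over 10 runs.
t, nu = welch_t_test(0.90, 0.02 ** 2, 10, 0.85, 0.03 ** 2, 10)
print(round(t, 3), nu)
```

The resulting $t$ and $\nu$ would then be looked up against the t-distribution (two-tailed) to obtain a p-value; that step is left out here to keep the sketch free of external dependencies.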
Alternatively, as in this st[...]

[...] easily addressed by defining a new event subspace and extending the events space to cover this new subspace. The rest of the algorithm remains intact.

4.1.2 Image-sets of unknown identities as states

Each unknown identity (i.e., images in the corresponding probe image-set) is represented as an ensemble of states. The feature vector of each image in the probe image-set is considered as an elementary object which defines a pure state $\phi$. Using ([?].1), we then construct several mixed states $\psi$ as the superposition of a few randomly selected pure states, in order to maximize the number of attributes that each state can represent, as well as to minimize the effect of uncontrolled variations in the images. Superposition leads to initial states that are more robust than the noisy single images and therefore actively improves the recognition process.

In order to represent the unknown identity, it is required to define a probability distribution $p(\psi)$ over the ensemble of participating mixed states. The distribution is defined in such a way that each $p(\psi_i)$ reflects a relative inter-similarity measure of the random pure states $\{\phi_j, j = 1{:}n\}$ that construct the $i$-th mixed state $\psi_i$:

(4.1)

where the metric $d(x, y)$ computes the Euclidean distance between points $x$ and $y$; $n$ is the number of pure states constructing each mixed state; and $N$ is the number of mixed states in the ensemble. Finally, $p(\psi_i)$ is determined by (4.1).

4.1.3 Recognition

Now that the concepts of events and ensemble of states are clarified in the context of face identification in image-sets, we describe the process of recognition of an unknown identity when the system is presented with a new unseen test image-set. In this section, we
explain how QPIF assigns the most probable identity to the test image-set represented by an ensemble of initial states via searching in the events space generated from the training image-sets.

Without loss of generality, let us assume we have generated $M$ image-sets for the known identities. Then, when presented with an image-set of an unknown identity, we generate $N$ initial states. Now, the task of recognition is carried out as follows: first, the similarity between all initial states $\{\psi_j, j = 1{:}N\}$ and events $\{S_i, i = 1{:}M\}$ is calculated following ([?]). This results in the projection matrix $P_{M \times N}$ where each element is

$$ P_{ij} = q(S_i \mid \psi_j) \tag{4.2} $$

where the $i$-th row of matrix $P$ (i.e., $\{q(S_i \mid \psi_j), j = 1{:}N\}$) represents the quantum probabilities of each initial state belonging to the known identity $i$. Then, if it is inserted into ([?]), along with the probability distribution of initial states derived by (4.1), the $\{\psi_j, j = 1{:}N\}$ will be marginalized out and the probability of the event $S_i$ being true (i.e., the probability that this probe image-set belongs to the known identity represented by $S_i$) is calculated. However, prior to this process, we pass the distribution $\{q(S_i \mid \psi_j), j = 1{:}N\}$ through a shaping function:

(4.3)

where $\exp(x)$ represents the exponential function $e^x$. This shaping function was selected empirically based on the experimental results. We employ the shaped distribution $\{q_s(S_i \mid \psi_j), j = 1{:}N\}$ to derive $q(S_i)$. The $q(S_i)$ is the result of marginalizing $\{q_s(S_i \mid \psi_j), j = 1{:}N\}$ over the set of all initial states $\{\psi_j, j = 1{:}N\}$. The $q(S_i)$ is calculated for all probabilistic events following this procedure, and finally the event with the highest probability is reported as the system's prediction of the identity of the unknown (i.e., probe) image-set.
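The recognition procedure described above (shape, marginalize, pick the most probable event) can be sketched as follows. This is only an illustration under stated assumptions: the projection matrix and the state probabilities are taken as given inputs, the shaping function is assumed to be a plain element-wise exponential because its exact form is not reproduced here, and the name `identify` is hypothetical.

```python
import math


def identify(projection, state_probs):
    """Pick the most probable known identity for a probe image-set.

    projection[i][j] plays the role of q(S_i | psi_j) and state_probs[j]
    plays the role of p(psi_j); both are assumed to be precomputed.
    """
    scores = []
    for row in projection:
        # Assumed shaping function: plain exponential of each entry.
        shaped = [math.exp(q) for q in row]
        # Marginalize the initial states out: sum_j p(psi_j) * q_s(S_i | psi_j).
        scores.append(sum(p * q for p, q in zip(state_probs, shaped)))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example with M = 2 known identities and N = 3 initial states.
projection = [[0.9, 0.8, 0.2],
              [0.1, 0.3, 0.7]]
state_probs = [0.5, 0.3, 0.2]
best, scores = identify(projection, state_probs)
print(best, scores)
```

In this toy input the first identity matches the two most probable initial states well, so it receives the larger marginal score.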
As described above, QPIF employs a simplified interpretation of the quantum formalism to aggregate the information of the known and unknown identities, and uses the ensemble of initial states for searching in the events space to assign the most probable identity to each probe image-set.

4.2 Ensemble of Abstract Sequence Representatives

Data representation employed in the EASR approach is inspired by the QPIF's representation method for unknown identities introduced earlier. We extend the structure (i.e., ensemble of states) to represent the known identities as well. As a result, both representations of the known and unknown identities can exploit the advantages of sampling and superposition, such as noise relaxation.

As mentioned earlier, in image-set based face identification, each image may not fully characterize the individual's face. This may be due to (i) poor quality of the image (e.g., low resolution, illumination, etc.), (ii) partial existence of the face in the image's field of view (e.g., occlusion, pose, etc.), and (iii) failure of the face detector algorithm to accurately crop the face. Such issues cast uncertainty on the degree to which a face identification method should rely on each individual image in an image-set (for images in both gallery and probe sets).

EASR is a vector-based representation structure for image-sets that addresses the uncertainties mentioned above as follows: it relaxes the noise in the raw data points by transferring them into a higher level representation structure using stratified sampling and superposition. Then, it calculates the similarities between each pair of train and test sequences. Afterwards, the recognition is performed by finding the most
similar known EASRs to different generated unknown EASR candidates, and then aggregating the identification results of all candidates via majority voting.

In the rest of this section, we first explain the representation structure of EASR, and then discuss the method of similarity measurement between EASRs that is necessary for either performing the identification task or ranking different candidates in terms of their similarity to the probe image-set.

4.2.1 Representation

A video sequence (top of Figure 4.1) can be represented as a set of normalized $n$-dimensional feature vectors (each referred to as $\alpha$) extracted from every frame. However, because such a primary representation is prone to noise, it is beneficial to transform the vectors into a noise-relaxed secondary representation structure. Using stratified sampling, we draw (with replacement) a set of vectors from each stratum (i.e., sequence of frames in a video), which are then grouped into several non-overlapping subsets of size $m$ (i.e., $m$ vectors per subset). Then, for each subset, a new feature vector (represented by $\beta$) is constructed using (4.4).

(4.4)

We refer to these new $n$-dimensional unit feature vectors as Abstract Sequence Representatives (ASRs). Superposition leads to constructing more robust samples by minimizing the effect of undesired variations in single noisy images, and therefore actively improves the identification accuracy. For each sequence we construct a set of ASRs of size $M$ and refer to it as an Ensemble of Abstract Sequence Representatives (EASR). The top of Figure 4.1 shows the first 27 frames of a raw sequence labelled J1. In the middle of Figure 4.1 a subset of ASRs forming J1's EASR is presented. The idea behind introducing ensembles along with the majority voting is inspired by the concepts of bagging and exploiting the knowledge of the crowd, which are known for their robust performance in highly noisy data.

Footnote: Please note that this is the same concept as generating a mixed state from a set of pure states as in Equation ([?].1).

Figure 4.1: A sample sequence from the YouTube Celebrities dataset collected by Kim et al. [2008] (top), its EASR based on intensity features (middle), and a matched EASR from another clip (bottom)

4.2.2 Similarity Measurement

In order to find the similarity between two video sequences $I$ and $J$ (denoted as $s_{IJ}$), first we find the similarity $w_{pq}$ between all possible ASR pairs of the form $(\beta_p^I, \beta_q^J)$, where $\beta_p^I$ is the $p$-th ASR of sequence $I$ and $\beta_q^J$ is the $q$-th [...]

[...] reported as the final prediction of the system.

4.3 Ensemble of Gaussian Process Models on top of EASR, a Hierarchical Approach

In the previous section, the Ensemble of Abstract Sequence Representatives (EASR) was introduced, which is a vector-based method for representing a video sequence. We mentioned that each EASR is built by sampling and superposition to reduce noise, followed by a filtering mechanism to deal with outliers. EASR models the variations of the subject in an image-set, and the similarity of EASRs can be used for identification purposes. However, the EASR representation is linear and would fail to capture the nonlinear underlying structure of the data. In order to address the non-linearity in data, we propose to use an ensemble of binary Gaussian process (GP) models in a one-versus-rest setting on top of the EASR representation method. The result is a hierarchy of two main modules: the EASR module and the GP module. The EASR module offers better resistance to noisy data, and the GP module incorporates a learning scheme called specialization-generalization for effective training of an ensemble of binary GP classifiers (enabling further noise reduction). The identification process combines both modules using a hierarchical structure to maximize the identification rate.
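The sampling-and-superposition construction of an EASR recalled above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the superposition is assumed to be an element-wise sum rescaled to unit length (the text only states that ASRs are unit vectors), a single video sequence is treated as the only stratum, and the names `superpose` and `build_easr` are hypothetical.

```python
import math
import random


def superpose(vectors):
    """Sum the vectors element-wise and rescale the result to unit length."""
    summed = [sum(column) for column in zip(*vectors)]
    norm = math.sqrt(sum(value * value for value in summed))
    return [value / norm for value in summed]


def build_easr(frame_features, subset_size, ensemble_size, rng):
    """Build an ensemble of ASRs from the frame features of one sequence."""
    # Draw with replacement from the stratum (here: a single video sequence).
    drawn = [rng.choice(frame_features) for _ in range(subset_size * ensemble_size)]
    # Split the draws into non-overlapping subsets of `subset_size` vectors each.
    subsets = [drawn[k:k + subset_size] for k in range(0, len(drawn), subset_size)]
    return [superpose(subset) for subset in subsets]


# Toy non-negative unit feature vectors standing in for three frames.
frames = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.6, 0.8]]
easr = build_easr(frames, subset_size=2, ensemble_size=4, rng=random.Random(0))
print(len(easr))
```

The sketch assumes non-negative features (as with intensity, LBP, or HOG histograms), so the summed vector is never zero and the normalization is well defined.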
The rest of this section describes the GP module in detail.

4.3.1 Model

In the current work, we use Gaussian process regression to construct a GP binary classifier (i.e., $y_i \in \{-1, +1\}$). For each subject $i$, this classifier is capable of identifying the subject $i$ versus the rest of the subjects. For the sake of implementation, let us assume subject $i$ has $m_i$ samples in total for training. We label these samples as (+1). In order to collect the (-1) labelled samples, we sub-sample from the training set an equal number of data points belonging to the rest of the subjects in the gallery, such that the total number of samples for training is as close as possible to $2m_i$. This is to make sure that the training data samples are balanced and to avoid bias in the classifier's decision making process.

For each subject $i$ in a gallery with $L$ subjects, we construct one GP model $GP_i$. This model is trained to predict whether a sequence belongs to the subject $i$ or not. To predict the identity of a probe video sequence $p$ with $m_p$ frames, the frames are presented to all $L$ models. Each $GP_i$ predicts the expected value $\mu^*$, which is a vector of length $m_p$ whose $j$-th item shows the expected value of $GP_i$'s underlying function ($f^*$) with the $j$-th frame as input. Classification of each frame is based on the sign of $\mu^*$: if it is negative, $GP_i$ rejects the possibility that this frame belongs to the subject $i$, and vice versa. In order to aggregate the outputs of all $m_p$ frames, we calculate the average of all $\mu^*_j$, $j \in [1..m_p]$, and record it as the overall output of $GP_i$ (i.e., sum-fusion). After calculating the aggregated output for every model, the identity of the subject with the highest aggregated output is reported as the predicted identity by the GP ensemble.

Figure 4.2: The specialization step for k = 4 (best viewed in color)
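The sum-fusion rule described above reduces to averaging each model's per-frame predictive means and taking the arg-max over models. A minimal sketch, assuming the predictive means have already been computed by some GP library (the function name `predict_identity` and the numbers are illustrative):

```python
def predict_identity(frame_outputs):
    """frame_outputs[i][j] is the predictive mean of model i on frame j."""
    # Sum-fusion: average the per-frame predictive means of each model.
    fused = [sum(outputs) / len(outputs) for outputs in frame_outputs]
    # The subject whose model gives the highest aggregated output wins.
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Toy predictive means for L = 3 models and m_p = 3 frames (made-up values).
outputs = [
    [-0.4, -0.2, 0.1],
    [0.6, 0.3, -0.1],
    [-0.8, -0.5, -0.7],
]
identity, fused = predict_identity(outputs)
print(identity, fused)
```

Note that a single frame with the wrong sign (such as the last frame of the second model here) does not change the decision, which is the point of aggregating over all frames.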
4.3.2 Specialization-Generalization Learning Scheme

GP binary classifiers are sensitive to the quality of training samples, thus a simple random sampling process without any provision for avoiding noisy samples reduces the identification power of the resulting model. In this section, we describe our learning scheme, which relies on EASRs for finding the most relevant sequences for training each binary GP model (i.e., the specialization step, schematically shown in Figure 4.2), complemented by a generalization step which tries to alleviate the effect of potentially noisy frames in the training samples.

Starting with $n$ subjects and $m$ sequences for each subject in the training data, we have an $SQ_{n \times m}$ which contains the sequences for each subject.

Specialization step (Figure 4.2):

1. [...]

2. Calculate the pair-wise similarity $s_{ij}$ between each two subjects $i$ and $j$, following (4.[?]).

3. For each subject $i$ find the top $k$ nearest subjects with the highest $s_{ij}$ and store them in $NS_i$.

4. Train $GP_i$ with all frames from $SQ_{i,[1..m]}$ as (+1) instances and randomly sub-sample an equal number of frames from $SQ_{j,[1..m]}$, $j \in NS_i$, as (-1) instances.

Generalization step:

5. Use $GP_i$ to label each sequence in $SQ_{j,[1..m]}$, $j \notin NS_i$; for each frame $f$, if $GP_i(f) > 0$ (i.e., mislabelled), add it to the $GenL_i$ list to be retrained to $GP_i$.

6. Update $GP_i$ with all frames $f$ in $GenL_i$ as (-1) instances.

In the specialization step, for each GP model, frames of the training sequences for the target identity are used as (+1) instances. The (-1) instances are randomly subsampled from the sequences belonging to the $k$ nearest subjects to the target identity, as determined by the EASR similarity (Figure 4.2). The goal of the specialization step is to force the GP to learn distinctive features that separate each subject from the most similar subjects to him/her in the gallery.
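The sample-selection logic of the two steps can be sketched without committing to a particular GP library. The sketch below assumes a precomputed subject-by-subject similarity matrix and an arbitrary `predict` callable standing in for the trained $GP_i$; all function names are hypothetical, and frames are represented by toy scalars for brevity.

```python
import random


def nearest_subjects(similarity, target, k):
    """Indices of the k subjects most similar to `target` (target excluded)."""
    others = [j for j in range(len(similarity)) if j != target]
    others.sort(key=lambda j: similarity[target][j], reverse=True)
    return others[:k]


def specialization_set(frames, similarity, target, k, rng):
    """Return the (+1) frames, the sub-sampled (-1) frames, and the neighbours."""
    neighbours = nearest_subjects(similarity, target, k)
    positives = list(frames[target])
    pool = [f for j in neighbours for f in frames[j]]
    # Balance the classes: draw as many negatives as there are positives.
    negatives = rng.sample(pool, min(len(positives), len(pool)))
    return positives, negatives, neighbours


def generalization_set(frames, neighbours, target, predict):
    """Frames of unseen subjects that the classifier wrongly accepts (> 0)."""
    unseen = [j for j in range(len(frames))
              if j != target and j not in neighbours]
    return [f for j in unseen for f in frames[j] if predict(f) > 0]


# Toy data: 4 subjects, a few scalar "frames" each (illustrative only).
similarity = [[1.0, 0.2, 0.9, 0.5],
              [0.2, 1.0, 0.3, 0.4],
              [0.9, 0.3, 1.0, 0.1],
              [0.5, 0.4, 0.1, 1.0]]
frames = [[0.9, 0.8], [0.1, 0.2, 0.3], [0.7, 0.6, 0.5], [0.4, 0.45]]
pos, neg, nb = specialization_set(frames, similarity, 0, 2, random.Random(1))
extra_negatives = generalization_set(frames, nb, 0, lambda f: f - 0.15)
print(nb, extra_negatives)
```

In a full implementation the classifier would be trained on `pos` and `neg`, then updated with `extra_negatives` labelled as (-1); the stand-in `predict` here only makes the control flow concrete.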
However, we have to make sure that the GP binary classifier would generalize well on the subjects whom it has not seen during its first batch of training (i.e., specialization step). In the generalization step, we randomly sub-sample some frames from videos featuring those subjects not used in the specialization step and evaluate them with the respective binary classifier. If the label is not correct (the correct label would be (-1), as we know this sample is definitely not featuring the respective subject), then the model is re-trained with this sample as a (-1) instance. In other words, the generalization step provides more (-1) instances to the model in areas of the problem space where the model cannot correctly identify such instances, thus improving the generalizability of the GP model.

Figure 4.3: Samples rejected in the generalization step when training a GP model

The generalization step also minimizes the effect of noisy frames in the (+1) instances. For example, consider the first 3 frames of sequence J1 shown at the top of Figure 4.1. These frames do not provide any useful information for identifying the subject in that video. In the initial training of the GP m[...]

[...] The value of $T$ for each dataset is selected based on cross-validation over the training set (the selected points are highlighted in the graph in Figure 4.4-right). Since the similarity of two ASRs falls between zero and one, $\max(T) = 1$. However, in our experiments, $T$ is much smaller (always less than 0.0[?]).
Chapter 5

Experimental Design

In this chapter, we briefly describe the datasets and evaluation settings for our experiments.

5.1 Datasets

Three publicly available benchmark datasets are used for evaluation of the proposed methods in this study: the Honda/UCSD dataset collected by Lee et al. [2003], the CMU-MoBo dataset collected by Gross and Shi [2001], and the more challenging YouTube Celebrities dataset collected by Kim et al. [2008].

5.1.1 Honda/UCSD

The Honda/UCSD dataset is a collection of 59 videos recorded from 20 subjects in order to form a common ground for assessment of different face identification algorithms (see Figure 5.1). Each subject has at least 2 videos (except for one subject). All the videos have an equal resolution of 640 × 480 and are recorded at a rate of 15 fps. The duration of the videos varies from 71 to 645 frames, with mean and standard deviation of 2[?].2 and 110, respectively.

Figure 5.1: Sample faces extracted from Honda/UCSD dataset

5.1.2 CMU-MoBo

The CMU-MoBo dataset was primarily collected for automatic identification of people by gait. However, it has recently been used for image-set based face identification studies as well (see Figure 5.2). This dataset contains video sequences from 6 camera views of 25 subjects performing four different walking activities on a treadmill: slow, fast, on an inclined surface, and holding a ball. Following the literature,
the subject with fewer than four walking patterns is excluded from the dataset, thus only the first 24 subjects are used. All videos are of 640 × 480 resolution and recorded at a rate of 30 fps. The duration of the videos varies from 202 to [?]7 frames, with mean and standard deviation of 495.6 and 169, respectively.

Figure 5.2: Sample faces extracted from CMU-MoBo dataset, provided by Cevikalp and Triggs [2010]

5.1.3 YouTube Celebrities

The YouTube Celebrities dataset is a collection of real-world videos from the YouTube website featuring 47 celebrities (see Figure 5.3). The videos are noisy, low resolution, and demonstrate large variations in illumination, pose, expression, and other uncontrolled conditions. For each subject there are 3 video clips, where each clip is divided into several sequences of unequal resolution and duration (between 7 and 3[?]0 frames, with mean and standard deviation of 16[?].0 and [?]1.[?] respectively). There is a total of 1910 sequences, all encoded in MPEG4 at a rate of 25 fps.

Figure 5.3: Sample faces extracted from YouTube Celebrities dataset; (a), (b), and (c) illustrate samples from 3 different clips of the same person

5.2 Evaluation Settings

In this section, we describe the procedure for preparation of the training and test data. We followed the common settings used in the literature to allow for fair comparison.

5.2.1 Face Detection

It is a common practice to first track and crop faces from each frame and only pass the subjects' faces to the recognizer. Since the objective of this study is face identification, it is necessary to apply a prior algorithm to track/detect and automatically crop the faces from each video frame. Similar to the previous works in the literature, the Viola-Jones method,
proposed by Viola and Jones [2004], is used for extracting faces in the Honda/UCSD¹ and CMU-MoBo² datasets. For the YouTube Celebrities³ dataset, the Viola-Jones algorithm fails to detect the faces in a number of sequences. Thus, following Hu et al. [2012], we use the Incremental Learning for Visual Tracking (IVT) algorithm, proposed by Ross et al. [2008]. IVT returns the face area in all frames of all 1910 sequences; however, some may not represent a correct face (see Figure 4.3).

5.2.2 Resolution

All the cropped faces are resized to an equal resolution. Images in the Honda/UCSD dataset are resized to 20 × 20 pixels, the CMU-MoBo dataset to 40 × 40 pixels, and the YouTube Celebrities dataset to 20 × 20 pixels (the 20 × 20 resolution was selected to reduce the computational cost).

5.2.3 Features

In order to experiment with different feature types, we use histogram equalized intensity levels for the Honda/UCSD dataset, Local Binary Pattern (LBP) codes, proposed by Ojala et al. [2002], for the CMU-MoBo dataset, and Histogram of Oriented Gradient (HOG) descriptors, proposed by Dalal and Triggs [2005], for the YouTube Celebrities dataset.

¹ The author would like to thank Dr. Liang Chen for providing the Honda/UCSD dataset with faces detected.
² In this work, we have directly used the pre-processed version of the CMU-MoBo dataset provided by the authors of Cevikalp and Triggs [2010]. The pre-processing procedure includes face tracking, resolution, and feature extraction.
³ The author would like to [...] version of the YouTube Celebrities dataset.

5.2.4 Train/Test arrangement

For the Honda/UCSD dataset we randomly select 20 sequences (one video per subject) for training and the rest for testing. It should be noted that there is an alternative evaluation setting for the Honda/UCSD dataset which uses a predefined set of 20 sequences (one video per subject) for training without any random permutation.
Since recent algorithms (e.g., RNP, [?], and the proposed methods) achieve 100% accuracy with this predefined setting, we use the random setting, which provides more variation, in order to have a more meaningful comparison.

For the CMU-MoBo dataset we also randomly select 24 sequences, one video per subject, for training and the rest for testing.

For the YouTube Celebrities dataset we perform 5-fold cross-validation, following the evaluation protocol used by Hu et al. [2012]. Sequences of each subject are sequentially partitioned (no prior shuffling) into 5 folds, where each fold contains exactly 9 sequences (from 3 clips) with minimal overlap between folds. In each fold, 1 clip is randomly selected for training (3 sequences) and the other 2 clips are used as test data (6 sequences). It is important to mention that there is another evaluation setting for the YouTube Celebrities dataset, first used by Wang et al. [2012b]. In this setting, for every subject in each fold 9 sequences (3 per clip) are randomly selected: 3 sequences (1 per clip) for training, and the rest for testing. Obviously, it is an easier task to identify the subject with the second setting, because there is already one video sequence from each clip available in the training set, which factors out differences in appearance of the subject in different clips. For this reason we believe that the first setting is closer to real world scenarios, thus we adopted the protocol used by Hu et al. [2012].

For all three datasets, we report accuracy results for the full length sequences, as well as truncated sequences that only contain the first 50 consecutive frames of each video sequence. All evaluations are done using 5-fold cross-validation.

⁵ The author would like to thank Dr. Liang Chen for providing his code of the HOG features.

Chapter 6

Results and Discussions

In this section,
we summarize the identification accuracies of the three proposed approaches and compare them against the most successful and recent methods in the literature (namely, MSM, MMD, AHISD/CHISD, SANP, RNP, MSSRC, JSR, and ISCRC in chronological order). Except for JSR, for all other methods we used the code provided by the authors, adjusted with their suggested parameter values. For JSR we did not have access to the code, thus we report the results provided by the authors. However, it should be noted that the evaluation settings for JSR are different from what we are using in this study: they used the Wang et al. setting for the YouTube Celebrities dataset (which leads to higher accuracy results compared to the Hu et al. [2012] setting used here), and 30 × 30 resolution for both the YouTube Celebrities and CMU-MoBo datasets.

To make the comparisons fair we used the same evaluation settings, including feature type, for training all algorithms (i.e., intensity levels for Honda/UCSD, LBP for CMU-MoBo, and HOG for YouTube Celebrities). Interestingly, this enhancement led to improved accuracy for all algorithms (including the older algorithms such as SANP) on the YouTube Celebrities dataset compared to the results reported in the original papers. Also, it must be noted that the original evaluation of RNP was done only on 29 subjects of the YouTube Celebrities dataset, and the results obtained in the respective paper (Yang et al. [2013]) are higher than the results obtained on the full dataset. Additionally, MSSRC (Ortiz et al. [2013]) comes with its own face tracking algorithm, which was disabled in our evaluations since the aim is to compare only the identification power of different algorithms; therefore, the same tracking algorithm is used for all evaluations.
Performance results on each of the three benchmark datasets are derived by exactly following the protocol described in the Evaluation Settings section in Chapter 5. This protocol is the same as that in the related works in the literature to allow for fair comparison. We perform Welch's t-test (see Welch [1947]) to check whether the improvement in performance of the proposed method(s) is statistically significant compared to the best performance of the contender methods. Outcomes of the significance test are described along with the summary of performance results.

Tables 6.1-6.3 summarize the Mean ± Standard Deviation of the identification rates for different methods in the literature and the three methods proposed in this work (namely, QPIF, EASR, and EASR+GP) on the Honda/UCSD, CMU-MoBo, and YouTube Celebrities datasets for both the truncated sequences (only the first 50 frames are available to perform the identification task) and the full length video sequences.

On the Honda/UCSD dataset, both QPIF and EASR methods performed well for the full length video sequences, and EASR+GP outperformed all other methods for both the truncated and the full length video sequences. The difference in identification rates for the EASR+GP method is statistically significantly higher than the best contending method in the literature.

Table 6.1: Identification Rate (%) (Mean ± Standard Deviation) of Different Methods on Honda/UCSD Dataset

Method    Year   50 frames            Full length
MSM       [?]    [?]7.[?] ± 6.12      90.26 ± 2.1[?]
MMD       2008   [?]7.6[?] ± 2.[?]1   96.41 ± 1.40
AHISD     2010   [?]                  [?]
CHISD     2010   [?]                  [?]
SANP      2011   [?]                  [?]
RNP       2013   [?]                  [?]
MSSRC     2013   [?]                  [?]
ISCRC     2014   [?]                  [?]
QPIF             [?]1.2[?] ± 2.29     97.44 ± 3.63
EASR             91.2[?] ± 1.40       98.46 ± 1.40°
EASR+GP          94.87 ± 4.05         99.[?]9 ± 1.15*

Note: * indicates statistically significant improvement of accuracy compared to the second best result at α = 0.05; ° indicates statistically significant improvement of accuracy compared to the second best result at α = 0.1

On the CMU-MoBo dataset (see Table 6.2), the EASR+GP method achieved a slightly lower identification rate compared to ISCRC (less than 1%). However, based on the statistical analysis, there is no significant difference between the results.

It should be noted that the Honda/UCSD and CMU-MoBo datasets are commonly used as benchmarks and are considered easier recognition tasks, since most of the algorithms in the literature have already achieved above 90% accuracy. Therefore, there is not much room for improvement. However, we believe that the results on the most challenging dataset, YouTube Celebrities, can rank different algorithms in terms of performance and efficiency.

Table 6.2: Identification Rate (%) (Mean ± Standard Deviation) of Different Methods on CMU-MoBo Dataset

Method    Year   50 frames            Full length
MSM       [?]    [?]2.50 ± 2.71       97.22 ± 1.70
MMD       2008   [?]1.17 ± [?]        95.2[?] ± 2.[?]
AHISD     2010   92.50 ± 2.71         [?]
CHISD     2010   [?]2.50 ± 2.71       [?]
SANP      2011   [?]2.50 ± [?]        [?]
RNP       2013   [?]2.50 ± 2.71       [?]
MSSRC     2013   [?]1.11 ± 3.1[?]     9[?].33 ± 1.52
ISCRC     2014   94.44 ± 2.2[?]†      99.44 ± 0.76†
QPIF             [?]2.50 ± 2.71       [?]7.50 ± [?]
EASR             91.9[?] ± [?]        9[?].[?]9 ± 1.16†
EASR+GP          93.61 ± 2.71†        9[?].[?]9 ± 1.16†

Note: † indicates no (statistically) significant difference between the best performance result (bold) and the proposed approaches

For the YouTube Celebrities dataset (see Table 6.3), QPIF and EASR slightly outperformed the contending methods on the full length video sequences, and the EASR+GP approach achieved significantly better results and improved the state-of-the-art by ≈ 4% for the full length sequences. EASR+GP also achieves the highest accuracy for the truncated sequences.
The superior results of the proposed method can be attributed to its capability of handling extremely noisy samples in the YouTube Celebrities dataset more efficiently compared to the rest of the methods in the literature.

Table 6.3: Identification Rate (%) (Mean ± Standard Deviation) of Different Methods on YouTube Celebrities Dataset

Method    Year   50 frames            Full length
MSM       [?]    [?]0.57 ± [?].33     65.[?]2 ± 4.56
MMD       2008   6[?].26 ± [?].7[?]   69.22 ± [?].90
AHISD     2010   6[?].13 ± 4.16       63.[?]3 ± 3.2[?]
CHISD     2010   67.73 ± [?].09       69.65 ± 1.5[?]
SANP      2011   [?]7.5[?] ± 5.71     73.10 ± 3.1[?]
RNP       2013   6[?].50 ± [?]        [?]7.[?]1 ± 3.65
MSSRC     2013   70.7[?] ± 3.1[?]     72.20 ± 3.52
ISCRC     2014   66.[?] ± 1.73        70.71 ± 3.11
QPIF             69.01 ± [?].36       74.82 ± 2.49
EASR             70.13 ± 4.16         74.18 ± 3.35
EASR+GP          73.12 ± 3.11         77.23 ± 3.81°

Note: ° indicates statistically significant improvement of accuracy compared to the second best result at α = 0.1

It is also worth mentioning that a simple ensemble of GP binary classifiers without employing the specialization-generalization learning strategy perform[...]; therefore, we did not include this result in Table 6.3.

While the QPIF and EASR methods performed very close to each other in all three datasets, as expected, EASR+GP performed better than both of these methods. The combination of GP and EASR enabled us to achieve a better performance, noticeably higher than the individual components of the method (EASR and GP). We attribute this increase in identification rate to the EASR's strength in dealing with noisy frames, as well as the GP's strength in capturing underlying non-linear structures in data.

6.1 Computational Complexity

We also report the average computation time of all methods in experiments on the YouTube Celebrities dataset for the truncated sequences (with 50 frames). All the timing results are reported based on running Matlab [...]
using the notion of quantum probability theory, namely, QPIF and its dual extension EASR. The proposed representation structures are designed to minimize the effect of noisy frames in a video based face identification task. These two representation structures specifically target those frames that are not useful for the identification task, mainly due to face occlusion, low resolution of the image, or failure of the face tracker algorithm. Therefore, unlike most of the methods in the literature which use sophisticated non-linear representations, these two methods keep the representation linear and simple while retaining the superior perform[...]

[...] A promising extension of this work would be to modify the outlier filtering process in EASR by utilizing a probabilistic approach that is able to assign a degree of uncertainty on how well each frame is a good representative of an individual's face. This should be accompanied by a prediction method that can exploit the extra information provided by such weighted samples. Consequently, we will have a classification approach that is aware of the quality of the samples and knows on which of them it should rely the most in order to perform the identification task effectively.

• Employ the EASR approach along with its outlier filtering process as a general purpose filtering approach to improve other methods in the literature in terms of their resilience to noisy frames.

• Use other kernels (e.g., the pyramid match kernel proposed by [...]) [...] localized feature descriptors such as the Scale-Invariant Feature Transform (SIFT) proposed by Lowe [2004].

• Go beyond the face identification task and test the proposed method with datasets for other types of video based recognition tasks (e.g., object categorization).

Bibliography

O. Arandjelovic, G. Shakhnarovich, J. Fisher, R. Cipolla, and T. Darrell. Face recognition with image sets using manifold density divergence.
In IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 581-588, 2005.

Leslie E Ballentine. The statistical interpretation of quantum mechanics. Reviews of Modern Physics, 42(4):358, 1970.

Leo Breiman. Bagging predictors. Machine Learning, 24(2):123-140, 1996.

H. Cevikalp and B. Triggs. Face recognition based on image sets. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2567-2573, 2010.

Yi-Chen Chen, Vishal M. Patel, P. Jonathon Phillips, and Rama Chellappa. Dictionary-based face recognition from video. In Andrew Fitzgibbon, Svetlana Lazebnik, Pietro Perona, Yoichi Sato, and Cordelia Schmid, editors, Computer Vision, ECCV 2012, volume 7577 of Lecture Notes in Computer Science, pages 766-779. Springer Berlin Heidelberg, 2012.

Zhen Cui, Hong Chang, Shiguang [...], 2014. ISSN 0925-2312. doi: http://dx.doi.org/10.1016/j.neucom.2013.12.004. URL http://www.sciencedirect.com/science/article/pii/S092523121301148X.

Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, volume 1, pages 886-893. IEEE, 2005.

Carl Henrik Ek, Philip HS Torr, and Neil D Lawrence. Gaussian process latent variable models for human pose estimation. In Machine learning for multimodal interaction, pages 132-143. Springer, 2008.

Kazuhiro Fukui and Osamu Yamaguchi. Face recognition using multi-viewpoint patterns for robot vision. In Robotics Research, pages 192-201. Springer, 2005.

Kristen Grauman and Trevor Darrell. The pyramid match kernel: Efficient learning with sets of features. The Journal of Machine Learning Research, 8:725-760, 2007.

Ralph Gross and Jianbo Shi. The CMU motion of body (MoBo) database. Technical Report CMU-RI-TR-01-18, Robotics Institute, Pittsburgh, PA, June 2001.

Lov K Grover.
A fast quantum mechanical algorithm for database search. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages 212-219. ACM, 1996.

Negar Hassanpour and Liang Chen. A hierarchical training and identification method using gaussian process models for face recognition in videos. In The 11th IEEE International Conference on Automatic Face and Gesture Recognition, 2015a.

Negar Hassanpour and Liang Chen. A quantum theory inspired framework for face identification in videos. Technical report, University of Northern British Columbia, Department of Computer Science, Computational Intelligence Laboratory, 2015b.

Yiqun Hu, Ajmal S Mian, and Robyn Owens. Face recognition using sparse approximated nearest points between image sets. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 34(10):1992-2004, 2012.

Ashish Kapoor, Kristen Grauman, Raquel Urtasun, and Trevor Darrell. Active learning with gaussian processes for object categorization. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on, pages 1-8. IEEE, 2007.

Kihwan Kim, Dongryeol Lee, and Irfan Essa. Gaussian process regression flow for analysis of motion trajectories. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 1164-1171. IEEE, 2011.

Minyoung Kim, S. Kumar, V. Pavlovic, and H. Rowley. Face tracking and recognition with visual constraints in real-world videos. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1-8, 2008.

Tae-Kyun Kim, Josef Kittler, and Roberto Cipolla. Discriminative learning and recognition of image set classes using canonical correlations. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 29(6):1005-1018, 2007.

Kuang-Chih Lee, J. Ho, Ming-Hsuan Yang, and D. Kriegman. Video-based face recognition using probabilistic appearance manifolds.
In Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, volume 1, pages I-313-I-320 vol.1, 2003.

David G Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.

Kevin P Murphy. Machine learning: a probabilistic perspective. MIT press, 2012.

Timo Ojala, Matti Pietikäinen, and David Harwood. A comparative study of texture [...]

[...] learning for robust visual tracking. International Journal of Computer Vision, 77(1-3):125-141, 2008.

Robert E Schapire, Yoav Freund, Peter Bartlett, and Wee Sun Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, pages 1651-1686, 1998.

Paul Viola and Michael Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137-154, 2004.

John von Neumann and Robert T Beyer. Mathematical foundations of quantum mechanics. Princeton University Press, 1955.

Ruiping Wang and Xilin Chen. Manifold discriminant analysis. In IEEE Conference on Computer Vision and Pattern Recognition, pages 429-436, 2009.

Ruiping Wang, Huimin Guo, L.S. Davis, and Qionghai Dai. Covariance discriminative learning: A natural and efficient approach to image set classification. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 2496-2503, June 2012a. doi: 10.1109/CVPR.2012.6247965.

Ruiping Wang, Shiguang Shan, Xilin Chen, Qionghai Dai, and Wen Gao. Manifold-manifold distance and its application to face recognition with image sets. IEEE Transactions on Image Processing, 21(10):4466-4479, 2012b.

Tiesheng Wang and Pengfei Shi. Kernel grassmannian distances and discriminant analysis for face recognition from image sets. Pattern Recognition Letters, 30(13):1161-1165, 2009.

Bernard L Welch.
The generalization of 'Student's' problem when several different population variances are involved. Biometrika, 34(1/2):28-35, 1947.

O. Yamaguchi, K. Fukui, and K. Maeda. Face recognition using temporal image sequence. In Proceedings of 3rd IEEE International Conference on Automatic Face and Gesture Recognition, pages 318-323, 1998.

Meng Yang, Pengfei Zhu, L. Van Gool, and Lei Zhang. Face recognition based on regularized nearest points between image sets. In 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, pages 1-7, 2013.

Pengfei Zhu, Wangmeng Zuo, Lei Zhang, S.C.-K. Shiu, and D. Zhang. Image set-based collaborative representation for face recognition. Information Forensics and Security, IEEE Transactions on, 9(7):1120-1132, 2014.