Josh Journal Club
Jet reconstruction, substructure, boosted object tagging

I. Introduction

	High energy -> decay products collimated into a single jet, opening angle \Delta\Theta ~ m_0/E_0
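
	Quick check of the scaling (a sketch, not from the talk): for a two-body decay to massless
	products that share the energy equally, m^2 = 2 E1 E2 (1 - cos theta) gives sin(theta/2) = m/E,
	so theta ~ 2m/E at high energy. The function name and the sample mass/energies are illustrative only.

```python
import math

def opening_angle(mass, energy):
    """Opening angle (rad) of a symmetric two-body decay to massless products."""
    # From m^2 = 2*E1*E2*(1 - cos(theta)) with E1 = E2 = energy/2
    return 2.0 * math.asin(mass / energy)

for energy in (200.0, 500.0, 1000.0):
    # 80.4 GeV is a W-like mass, chosen purely as an illustration
    theta = opening_angle(80.4, energy)
    print(f"E = {energy:6.0f} GeV  ->  theta = {theta:.3f} rad")
```

	The angle shrinks roughly like 1/E, which is why the decay products end up inside one jet.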

	Boosted object tagging: looking for top, Higgs, BSM

	Not to be discussed today, but important
		Different QCD structure -> different soft junk
		LHC is not just high energy, but also high luminosity
			-> ISR, multiple interactions, pileup: all contaminate the jets
				These contributions have different distributions
				Use: pruning, trimming, etc. to clean them up
				Can tell you about the color representation (octet, singlet) of a resonance
	
				
II. GOALS:

	A. Use new clustering algorithms for substructure
	B. Use substructure to find ~100 GeV particles
	C. Practical implementation using the language of A and B
	
	
III. Modern (for substructure) clustering algorithms

	Clustering: Mapping from parton/hadron 4-momenta {pi} to jets
		in our examples: hadron-level (using Pythia to go from parton -> hadron)
	
	Old way: iterative cone-based algorithms
		Difficulties:
			1. Hard to make IR and collinear safe; computationally costly
				Soft things tend to have large effects... see Gavin Salam's talks for examples
				See, e.g., SISCone
			2. Difficult to see substructure
				End up with a bunch of cones and a bunch of particles
	
	New way: sequential recombination
		Define a distance metric dij between pseudojets (parton/hadrons or calorimeter cells)
			... pseudojets will be grouped together into jets
		Also define a single-object distance (e.g. with respect to the beam) diB
		
		Algorithm:
			1. Find min{dij, diB} 
			2a. If dij is the minimal guy, then replace {pi, pj} -> {p(i+j)} in pseudojets
			2b. If diB is the minimal guy, remove i from pseudojets, add to list of final jets
			3. repeat
			
		Differs from cone algorithm in the periphery
			... the crap that goes around the hard stuff
			
		Benefit for substructure: allows you to better "go back in history" of parton shower
			to see, for example, hard splitting
			
		dij = min(kTi^(2p), kTj^(2p)) (Delta Rij)^2 / R^2
			p = choice of {-1,0,1}
			R ~ size of the jet, gives relative importance of dij vs diB
			kT = sqrt(px^2+py^2)
			Delta Rij = sqrt(Delta eta^2 + Delta phi^2)
				Delta eta: difference in pseudorapidity (used instead of the polar angle)
				Delta phi: difference in azimuthal angle (around the beam axis)
		
		diB = kTi^(2p)
		
		p = 1 is the kT algorithm: soft stuff combined first
		p = 0 is the Cambridge-Aachen algorithm: nearby stuff combined first
		p = -1 anti-kT: hard stuff combined first
		
		larger p -> more parton shower like... soft first,  hard parton at the end
		smaller (neg) p -> more cone-like, jets look more like cones
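
		The recipe above can be sketched as a naive O(N^3) toy. Assumptions (not from the talk):
		pseudojets are (px, py, pz, E) tuples, merging adds 4-momenta, rapidity stands in for
		pseudorapidity (they agree for massless inputs), and every function name is made up here.
		In practice one would use a dedicated library such as FastJet.

```python
import math

def massless(pt, eta, phi):
    """Build a massless (px, py, pz, E) tuple from pT, pseudorapidity, azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(eta), pt * math.cosh(eta))

def pt2(p):
    return p[0] ** 2 + p[1] ** 2

def rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r2(a, b):
    dy = rapidity(a) - rapidity(b)
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:              # wrap the azimuthal difference into [0, pi]
        dphi = 2.0 * math.pi - dphi
    return dy * dy + dphi * dphi

def cluster(particles, R=0.4, p=-1):
    """Sequential recombination; p = 1 (kT), 0 (Cambridge-Aachen), -1 (anti-kT)."""
    pseudojets = [tuple(x) for x in particles]
    jets = []
    while pseudojets:
        best = None                 # (distance, i, j); j is None for a beam distance
        for i, a in enumerate(pseudojets):
            d_iB = pt2(a) ** p      # kT^(2p)
            if best is None or d_iB < best[0]:
                best = (d_iB, i, None)
            for j in range(i + 1, len(pseudojets)):
                b = pseudojets[j]
                d_ij = min(pt2(a) ** p, pt2(b) ** p) * delta_r2(a, b) / R ** 2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:               # step 2b: promote pseudojet i to a final jet
            jets.append(pseudojets.pop(i))
        else:                       # step 2a: merge i and j (pop j first since j > i)
            b = pseudojets.pop(j)
            a = pseudojets.pop(i)
            pseudojets.append(tuple(x + y for x, y in zip(a, b)))
    return sorted(jets, key=pt2, reverse=True)

# Toy event: two nearby particles plus one far away in azimuth
event = [massless(100.0, 0.0, 0.0), massless(50.0, 0.1, 0.1), massless(80.0, 0.0, 3.0)]
for p in (1, 0, -1):
    print(p, [round(math.sqrt(pt2(j)), 1) for j in cluster(event, R=0.4, p=p)])
```

		For this toy event all three choices of p give the same two jets; the differences show up
		in the order of merging and in how soft stuff at the periphery gets assigned.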
		
		
IV. Substructure and boosted object tagging
		Several ideas, field is still in a bit of a mess---many new developments
		No consensus on an ideal algorithm, we'll focus on two from the theory community

			1. Johns Hopkins
					- Cluster with Cambridge-Aachen with large R (e.g. R=1)
						end up with a bunch of jets
					- Then "decluster" step by step starting at the end
						to understand the things that went into each jet
						If min(pT1, pT2) < delta(p) * pT(jet from clustering) and ONLY the softer of the two satisfies this
							(i.e. one is much softer than the other)
							then throw out the softer guy and keep declustering the harder one

						Repeat until you hit one of four cases:
						 1. both hard enough, neither satisfies the inequality
						 2. both too soft, both satisfy the inequality
						 3. too close (Delta R < delta(R))
						 4. only 1 pseudojet left
								Why? Expect particle decay to go to roughly equal pT stuff
										(on average)
									 Expect parton shower to go like 1/pT
										if one is much softer, probably from parton shower
					
						If we get 1 => keep both as subjets; if we get 2, 3, 4 => say it is irreducible and we stop
						Repeat on the subjets until up to 4 irreducible subjets (4 is a choice for tops)
							why 4? Expect 3 jets for top, but can have one extra hard parton
						
							If 3-4 subjets with total mass near mt, 2 subjets near mW and cos theta(W) << 1... then top
							i.e. want jet that separates into 3-4 subjets that reconstruct mt and mW

					
				See how this is better than cone? The sense of "history" 
				in the clustering tells us about subjets that go into the jets
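
				The declustering logic can be sketched on a toy clustering history. Assumptions
				(not from the talk): the Node class, the order in which the four cases are checked,
				and the default delta_p / delta_r values are all placeholders for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Node:
    """A pseudojet in the clustering history; parents are the two things merged into it."""
    pt: float
    eta: float
    phi: float
    parents: Optional[Tuple["Node", "Node"]] = None

def delta_r(a: Node, b: Node) -> float:
    dphi = abs(a.phi - b.phi) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.eta - b.eta, dphi)

def split(node: Node, jet_pt: float, delta_p: float, delta_r_min: float):
    """Undo merges until a hard splitting is found; return None if irreducible."""
    while True:
        if node.parents is None:               # case 4: only one pseudojet left
            return None
        a, b = node.parents
        if delta_r(a, b) < delta_r_min:        # case 3: too close
            return None
        a_soft = a.pt < delta_p * jet_pt
        b_soft = b.pt < delta_p * jet_pt
        if a_soft and b_soft:                  # case 2: both too soft
            return None
        if not a_soft and not b_soft:          # case 1: both hard, keep as two subjets
            return a, b
        node = b if a_soft else a              # drop the softer one, keep declustering

def subjets(jet: Node, delta_p: float = 0.1, delta_r_min: float = 0.2) -> List[Node]:
    """Two rounds of declustering, giving between 1 and 4 subjets."""
    first = split(jet, jet.pt, delta_p, delta_r_min)
    if first is None:
        return [jet]
    result: List[Node] = []
    for sub in first:
        second = split(sub, jet.pt, delta_p, delta_r_min)
        result.extend(second if second is not None else [sub])
    return result

# Toy history: (q1 + q2) -> W, (W + b) -> t, (t + soft junk) -> fat jet
q1, q2 = Node(200.0, 0.0, 0.0), Node(150.0, 0.5, 0.0)
b, junk = Node(250.0, 0.0, 0.8), Node(10.0, 1.0, 1.0)
w = Node(350.0, 0.2, 0.0, parents=(q1, q2))
t = Node(600.0, 0.1, 0.3, parents=(w, b))
fat = Node(610.0, 0.1, 0.3, parents=(t, junk))
print([s.pt for s in subjets(fat)])
```

				Here the soft junk is thrown out at the first step and the jet separates into three
				hard subjets; the mass and angle cuts would then be applied to those.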
				
				
			2. N-subjettiness, tn (tau_N)
					
					a) Take all constituents of a jet (clustered somehow), and recluster using exclusive kT algorithm
						exclusive kT: modified kT so that you require exactly N subjets, also throw out soft things
					
					b) tn = (1/d0) Sum_k pT,k min_i(Delta Ri,k)
						where k runs over the jet constituents and i indexes the N candidate subjets
						d0 = Sum_k pT,k R0, with R0 the jet radius used in the original clustering
						not an integer; tells you how much the jet looks like it has N subjets
					
						if tn approx 0, then radiation aligned with subjet directions (< or = N subjets)
						if tn >> 0, then radiation misaligned (> N subjets)
						
						Take ratios to determine which N is best
							e.g. for boosted W, a good variable to cut on is tau2/tau1
							e.g. for boosted top, tau3/tau2
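
						The tn formula is easy to evaluate once the subjet axes are known. A minimal
						sketch, assuming the axes are handed in directly rather than found with
						exclusive kT; the function names and the toy two-prong jet are made up here.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:              # wrap the azimuthal difference into [0, pi]
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)

def tau_n(constituents, axes, r0):
    """constituents: (pT, eta, phi) triples; axes: (eta, phi) of the N candidate subjets."""
    d0 = sum(pt for pt, _, _ in constituents) * r0
    weighted = sum(
        pt * min(delta_r(eta, phi, ax_eta, ax_phi) for ax_eta, ax_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0

# Toy two-prong jet: two equal-pT constituents separated by 0.4 in eta
jet = [(50.0, 0.0, 0.0), (50.0, 0.4, 0.0)]
tau1 = tau_n(jet, axes=[(0.2, 0.0)], r0=1.0)              # one axis in the middle
tau2 = tau_n(jet, axes=[(0.0, 0.0), (0.4, 0.0)], r0=1.0)  # one axis per prong
print(tau1, tau2, tau2 / tau1)
```

						A small tau2/tau1 says the jet is much better described by two subjets than by one.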
							
			
PRACTICUM FRI after 12:30