Today's lecture:
* Communication in HFT
* Other places where HFT technology can be applied: cISP
* (Largely technical lecture)
* (Will talk very little about economics or politics of HFT today)

Communication in HFT
* Latency is everything:
--> within a datacenter
--> across datacenters
--> within a metro
--> within a computer
--> within an FPGA

This lecture:
* Many different techniques to reduce latency
* Techniques aren't really unified except in that they all target low latency
* Think of them as a bag of clever tricks for low latency
* The tricks span different aspects of computer systems

The architecture of a trading system:
* Ample electricity
* Significant electricity consumption and backup electricity sources
* Matching engine
   --> Multiple engines, each handles a subset of financial instruments 
* Trading firm system connects over 2 cables to matching engine for redundancy
* Order gateways, recently implemented in FPGAs
* Market data publishing system: typically anonymous re: who executed what trade
* Signals from other cities (e.g., futures lead) or
  exchanges in the same metro (e.g., fragmentation)

Communication between Chicago and NYC:
* The first attempt at low-latency: the gold line
* Status quo then: ISPs laying fibre
  --> Optimized for ease of repair
  --> Laying down cable near railroad lines
  --> Employed meet-me rooms and exchange centres with switches
  --> All of this adds latency
  --> Sufficient to gain a speed advantage if you are competing vs. a human
* Getting a private connection from an ISP at that time:
  --> VPN techniques
  --> Tunneling techniques
  --> Much higher latency
      --> (can encounter queueing delays, contention from other traffic, etc.)
  --> Not optimized for HFT
* Gold line:
  --> Dark fiber
  --> Fiber dedicated for use by one participant and not generally available
  --> Stitched together by taking existing cable fragments and connecting them
  --> Cable fragments were chosen to be as close to the geodesic as possible.

Moving beyond the gold line to fiber laid out *for HFT*
* Spread Networks
  --> Featured in Flash Boys
  --> Loosely inspired the movie Hummingbird Project
* Decided to lay entirely new cable
* Built out of 250 separate (and new) chunks for secrecy
* Significant investment: $300 to $500 M
* Heavy fees to use Spread Networks' fiber: estimated at $176K per month
  (fees today for lower latency links are much lower)
* Created interesting pinch points +
  rent-seeking behavior by owners of pinch points
* Probably didn't literally "drill through rock" as much as existing accounts suggest :)
* Reduced slack in cable:
  --> complicated repairs 
  --> but lowered latency by a bit
* Typical practice: pack many communication channels into a fibre
  --> Causes interference between these channels and bit flips
  --> Normally need FEC (forward error correction) to fix this issue
  --> But FEC is expensive and adds delay
  --> So forgo FEC to lower latency
  --> Instead, reduce the amount of packing to reduce the amount of interference
* Predictability and fairness across different users of Spread Networks
  --> Want to ensure all users get the same latency
  --> If one strand has slightly lower latency, it would give its subscribers a benefit
  --> Equalized delay across all fibers within a cable,
      by lengthening strands that had lower refractive indices
* Had significant economic consequences for HFT:
  --> Firms who didn't pay for this kind of cable found themselves at a serious disadvantage
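
A minimal sketch of the delay-equalization idea above, assuming the delay of a strand is n * L / c (refractive index times length over the speed of light). The cable length and the three refractive indices are hypothetical values chosen for illustration, not figures from Spread Networks.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s

def one_way_delay_us(length_km, refractive_index):
    """Propagation delay in microseconds: delay = n * L / c."""
    return refractive_index * length_km / C_VACUUM_KM_PER_S * 1e6

def equalized_lengths_km(base_length_km, indices):
    """Lengthen each strand so that n_i * L_i matches the slowest strand.

    The strand with the highest index keeps the base length; strands with
    lower indices (faster light) get proportionally more fiber.
    """
    n_max = max(indices)
    return [base_length_km * n_max / n for n in indices]

# Hypothetical: 1,300 km of cable, three strands with slightly different indices.
indices = [1.4680, 1.4682, 1.4685]
lengths = equalized_lengths_km(1300.0, indices)
delays = [one_way_delay_us(length, n) for length, n in zip(lengths, indices)]

for n, length, delay in zip(indices, lengths, delays):
    print(f"n={n:.4f}  length={length:.3f} km  delay={delay:.3f} us")
```

The strand with the lowest index ends up longest, which matches the note that strands with lower refractive indices were lengthened.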

From fiber to microwave links: 
* Light travels much faster over the air than in fiber: c vs. 2/3*c
* This can significantly improve latency (Table 5.1, page 146 of HFT book)
* Microwave links: old idea going back to AT&T Long Lines for telephony
* But the emphasis here was on lower latency and higher bandwidth than telephony:
  --> Reduce latency in repeaters, required to re-amplify signals
* Because the competition was just fiber,
  microwave links could depart from geodesic and
  still have a latency advantage
* Communication licenses:
  --> Need to ensure that you don't
      interfere with other transmissions in the same frequency
  --> Path coordination notice is the FCC mechanism for that.
* Structural issues:
  --> Towers need to be high enough to have line of sight communication
  --> Towers need to support bulky antennas
* Used the 6 GHz band
  --> Quite widely used for other purposes
  --> So interference with incumbents is an issue
  --> Aside: licensed and unlicensed spectrum; 6 GHz was recently opened to unlicensed WiFi use
  --> But 6 GHz is the most reliable across distances:
      in general, higher frequencies => more attenuation
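
A back-of-the-envelope sketch of the c vs. 2/3*c comparison above. The endpoints are assumed to be downtown Chicago and downtown New York (not the actual datacenter locations), the Earth is treated as a sphere, and both media are idealized as running exactly along the geodesic, so the numbers are only indicative.

```python
import math

C_KM_PER_MS = 299.792458      # speed of light in vacuum, km per millisecond
FIBER_SPEED_FRACTION = 2 / 3  # light in fiber travels at roughly 2/3 c

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle (geodesic) distance on a spherical Earth, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def one_way_ms(distance_km, speed_fraction=1.0, stretch=1.0):
    """One-way propagation delay for a path `stretch` times longer than distance_km."""
    return distance_km * stretch / (C_KM_PER_MS * speed_fraction)

# Assumed endpoints: city-center coordinates, for illustration only.
geodesic_km = haversine_km(41.8781, -87.6298, 40.7128, -74.0060)
fiber_ms = one_way_ms(geodesic_km, FIBER_SPEED_FRACTION)  # ideal fiber on the geodesic
air_ms = one_way_ms(geodesic_km)                          # ideal microwave on the geodesic
break_even_stretch = 1 / FIBER_SPEED_FRACTION             # detour at which microwave ties fiber

print(f"geodesic: {geodesic_km:.0f} km")
print(f"fiber one-way: {fiber_ms:.2f} ms, microwave one-way: {air_ms:.2f} ms")
print(f"microwave path may be up to {break_even_stretch:.2f}x longer and still tie fiber")
```

The break-even stretch is why early microwave routes could depart from the geodesic and still beat fiber; real fiber routes are longer than the geodesic, so the actual margin was larger.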

Lossy, but low latency microwave links:
* Consolidation across HFT firms and communication providers
* 3 main CHI-NYC links today (as many as 15 to 17 at one point)
* Focus on latency at the expense of packet drops
  --> "Better to be first 99% of the time than second 100% of the time"
* Low-latency techniques in microwave:
  --> Increase hop length, reduces number of repeaters
  --> Hop length went from 50 km to 110 km.
  --> Use higher microwave frequencies: 11, 18, and 23 GHz
  --> Higher frequency increases attenuation in general,
      but allows you to get closer to the geodesic,
      because 6 GHz (the more reliable freq) near geodesic is already crowded
  --> Place repeater equipment at the top of the tower instead of base
  --> Build new towers if needed: reuse them for regional cellular connectivity
  --> Shrink "fiber tail": get antenna as close to the datacenter as possible
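
A rough sketch of why longer hops help, assuming each repeater adds a fixed processing delay. The 1,200 km path length and the 1 microsecond per-repeater delay are hypothetical; only the 50 km and 110 km hop lengths come from the notes.

```python
import math

C_KM_PER_US = 0.299792458  # speed of light in air (approx. vacuum), km per microsecond

def hops_needed(path_km, hop_km):
    """Minimum number of tower-to-tower hops to cover the path."""
    return math.ceil(path_km / hop_km)

def path_latency_us(path_km, hop_km, per_repeater_us):
    """Propagation delay plus the delay added by intermediate repeaters."""
    repeaters = hops_needed(path_km, hop_km) - 1  # endpoints are not repeaters
    return path_km / C_KM_PER_US + repeaters * per_repeater_us

PATH_KM = 1200.0       # hypothetical CHI-NYC microwave path length
PER_REPEATER_US = 1.0  # hypothetical per-repeater delay

short_hops = path_latency_us(PATH_KM, 50.0, PER_REPEATER_US)
long_hops = path_latency_us(PATH_KM, 110.0, PER_REPEATER_US)

print(f"50 km hops:  {hops_needed(PATH_KM, 50.0)} hops, {short_hops:.1f} us")
print(f"110 km hops: {hops_needed(PATH_KM, 110.0)} hops, {long_hops:.1f} us")
```

Propagation delay is the same in both cases; the saving comes entirely from the repeaters that are no longer on the path.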

mmWave and lasers for high bw and low latency metro communication:
* Between exchanges in the NY/NJ area, need to transmit order book information
* Tends to be much higher in volume than signals on futures prices
* microwave links max out at around 150 Mbit/s, need 1 Gbit/s
* microbuilds: connect together existing cables at strategically important locations
* move to higher frequencies: 70 to 80 GHz.
  --> In general, higher frequencies => higher bandwidth => higher capacity
  --> but these frequencies are even more fragile and have very short range
  --> again, increase range by a bit at the cost of getting some packet drops
  --> also augment with directional lasers to improve resilience to rain
  --> hybrid systems that switch between microwave and laser
  --> interesting nugget: bird droppings on laser units: need a coating to slough it off

Queueing delays during microbursts
* Short durations of time when market data is sent at higher rate than mmwave can support
* Last tens of microseconds (the net. literature calls these microbursts)
* Need to edit down the data at the source if possible
  to reduce queueing delays due to these microbursts
* Another alternative, use a higher bandwidth communication medium in metro contexts:
  LMDS: intermediate between micro and mmwave
  Uses analog techniques to approach information theoretic capacity limits
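
A simple fluid-model sketch of the queueing delay a microburst causes, assuming data arrives at a constant rate during the burst and the link drains at a constant rate. The 1 Gbit/s link rate echoes the figure in the notes; the 10 Gbit/s burst rate, 50 microsecond duration, and `keep_fraction` parameter are hypothetical.

```python
def backlog_bits(arrival_bps, link_bps, burst_s):
    """Bits queued at the end of a burst that exceeds the link rate."""
    return max(0.0, (arrival_bps - link_bps) * burst_s)

def worst_queueing_delay_us(arrival_bps, link_bps, burst_s, keep_fraction=1.0):
    """Delay seen by the last bit of the burst, in microseconds.

    keep_fraction models editing down the data at the source:
    only that fraction of the burst is actually sent over the link.
    """
    queued = backlog_bits(arrival_bps * keep_fraction, link_bps, burst_s)
    return queued / link_bps * 1e6

LINK_BPS = 1e9    # mmWave link rate
BURST_BPS = 10e9  # hypothetical burst rate of market data at the source
BURST_S = 50e-6   # hypothetical burst duration: tens of microseconds

full = worst_queueing_delay_us(BURST_BPS, LINK_BPS, BURST_S)
edited = worst_queueing_delay_us(BURST_BPS, LINK_BPS, BURST_S, keep_fraction=0.2)

print(f"unedited burst: {full:.0f} us of queueing delay")
print(f"edited to 20%:  {edited:.0f} us of queueing delay")
```

Under these assumptions, a burst lasting tens of microseconds produces hundreds of microseconds of queueing delay, which is why editing down at the source matters.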

Fills vs. market data streams:
* Fills of orders generate information on stock prices
* So does the market data stream from the exchange
* But the fills arrive earlier
* Gives trading firms incentives to send in "scout orders" to just probe the order book
* Then use the information gleaned from these probes to
  actually (profitably) execute a larger order
* Efforts by exchanges to reduce the gap between fill and market data streams
* Some exchanges send the fills later (e.g., Eurex)
* Interesting question on incentives:
  Which kind of data should come earlier?
  One view: sending fill information earlier,
  allows slower trading firms to get an edge (beginning of page 165)

Other techniques:
* Speculative triggering
  --> Start sending out order even before market data arrives
  --> If market data doesn't confirm your hypothesis, scramble checksum
  --> Can also just keep sending invalid orders all the time
  --> Make some subset of these orders based on market data
* Can lead to overload of exchange systems
  --> Exchanges trying to push back against this by discarding invalid orders early
  --> Alternatively, ask MPs (market participants) to put in a discard IP
     on these orders for early discard at exchange
* Cable coiling: Exchange should ensure
  all participants have nearly the same latency to matching engine
* Overclocking:
  better cooling solutions allow participants to run machines faster
* Choice of language and bare-metal programming
* FPGAs instead of servers
  --> Using more customized place and route processes on an FPGA
  --> Keep computations close together within an FPGA
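
A toy sketch of the speculative-triggering idea from the list above: the order bytes are committed before the market data arrives, and the trailing checksum decides whether the receiver keeps or discards the frame. The payload, the 4-byte CRC32 trailer, and the function names are assumptions for illustration; a real implementation would do this in hardware on the Ethernet frame check sequence.

```python
import struct
import zlib

def finish_frame(payload: bytes, confirmed: bool) -> bytes:
    """Append the checksum trailer to an already-transmitted payload.

    If the market data confirmed the hypothesis, append the correct CRC32.
    Otherwise scramble it (flip every bit) so the receiver drops the frame.
    """
    crc = zlib.crc32(payload)
    if not confirmed:
        crc ^= 0xFFFFFFFF  # guaranteed to differ from the correct checksum
    return payload + struct.pack("<I", crc)

def receiver_accepts(frame: bytes) -> bool:
    """A receiver keeps the frame only if the trailer matches the payload."""
    payload, trailer = frame[:-4], frame[-4:]
    return struct.unpack("<I", trailer)[0] == zlib.crc32(payload)

order = b"BUY 100 XYZ @ 10.00"  # hypothetical order payload

print(receiver_accepts(finish_frame(order, confirmed=True)))
print(receiver_accepts(finish_frame(order, confirmed=False)))
```

The latency win comes from overlapping the payload transmission with the wait for market data; only the last few bytes depend on the decision.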

************************************
The cISP paper:
* Addresses the question of how to get to a near-speed-of-light Internet
* Hints at what general-purpose use of HFT-type technologies could look like
* Indirectly answers: are there other uses for HFT-type technologies?

Why I like it:
* Interesting use of multiple techniques:
  --> Actual experiments on MW link
  --> Trading data analysis
  --> Pulling together many diverse data sources
  --> Emulation
  --> Optimization problem formulation
* Clean algorithmic problem that abstracts out
  specifics of low-latency techniques
* Ambitious take on what the Internet could look like at an infrastructural level
* Not clear whether microwave in the wide-area Internet will be available, but:
  --> Other wide-area low-latency techniques seem to be popping up
  --> CHISEL (NSDI 2024) to provision low latency optical slices that avoid packet switching delays
  --> Microsoft acquisition of Lumenisity: hollow core fiber cable

Main idea in cISP (borrowed from slides of NSDI talk at
https://www.usenix.org/system/files/nsdi22_slides_bhattacherjee.pdf):
* Designing a network topology for a speed-of-light Internet service provider in the US
* Approach:
  --> Select cities to connect (large cities and their associated suburbs in the US)
  --> Identify "feasible tower-to-tower hops":
      --> A pair of towers must be high enough to clear earth's curvature and obstacles in Fresnel zone.
      --> At most 100 km apart
  --> Pick shortest path of towers to *link* a pair of cities
      (need to run a shortest path algo., page 5 of paper)
  --> select an optimal subset of these *links*:
      --> minimize "stretch": ratio of actual fiber+microwave path distance TO geodesic distance
      --> subject to budget constraints (cost of building MW towers)
  --> input: tower data, terrain data, traffic matrix between cities
  --> output available here: https://drive.google.com/file/d/1953R0AYBysEmzoYa5HVEwUZmHj5yR0AV/view
* Even with rain, the performance of cISP is much better than fiber (Figure 4)
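
A small sketch of the "feasible hops, then shortest path, then stretch" steps above. This is not the paper's implementation: towers are placed on a flat plane with hypothetical (x, y) coordinates in km, feasibility is reduced to the 100 km distance limit (tower height, terrain, and Fresnel-zone clearance are ignored), and the budget-constrained link selection is left out.

```python
import heapq
import math

MAX_HOP_KM = 100.0  # hop length limit from the notes

def dist(a, b):
    """Straight-line distance between two (x, y) points in km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def shortest_tower_path(towers, src, dst, max_hop_km=MAX_HOP_KM):
    """Dijkstra over towers; a hop is feasible if it is at most max_hop_km."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v in towers:
            hop = dist(towers[u], towers[v])
            if v == u or hop > max_hop_km:
                continue
            if d + hop < best.get(v, math.inf):
                best[v] = d + hop
                prev[v] = u
                heapq.heappush(heap, (d + hop, v))
    if dst not in best:
        return math.inf, []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]

# Hypothetical towers: cities A and B plus three candidate towers between them.
towers = {"A": (0, 0), "T1": (90, 20), "T2": (180, -10), "T3": (95, 60), "B": (270, 0)}
length_km, path = shortest_tower_path(towers, "A", "B")
stretch = length_km / dist(towers["A"], towers["B"])

print(path, f"length={length_km:.1f} km", f"stretch={stretch:.3f}")
```

A stretch of 1.0 would mean the tower path follows the straight line exactly; the optimization in the paper then picks which of these city-to-city links to build under a budget.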

Real data from McKay Brothers link:
* Round trip of 7.7 ms, consistent with Table 5.1;
  in fact slightly better than its Einsteinian limit, which is curious
  (the HFT book seems to account for the fact that light pulses bounce around within a cable,
  and there might be slight differences in start and end points between the two sources)
* At least 120 Mbit/s bandwidth
* High packet loss rate with no FEC
  --> Consistent with the HFT book: prioritize latency over losses
  --> Median underlying bit error rate of 10**-5 can be corrected with lightweight FEC
  --> For cISP, the additional overhead of FEC is ok and STILL provides a benefit over fiber
  --> For a HFT firm, maybe not.
  --> Similar comments for cISP links not following geodesic paths:
      STILL better than fiber, but maybe not good for HFT
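
A quick calculation of how a bit error rate of 10**-5 turns into a high packet loss rate when there is no FEC. It assumes bit errors are independent and that any single flipped bit causes the whole packet to be dropped; the packet sizes are hypothetical.

```python
def packet_loss_probability(bit_error_rate, packet_bytes):
    """Probability that at least one bit in the packet is corrupted."""
    bits = packet_bytes * 8
    return 1.0 - (1.0 - bit_error_rate) ** bits

BER = 1e-5  # median underlying bit error rate from the notes

large = packet_loss_probability(BER, 1500)  # hypothetical full-size packet
small = packet_loss_probability(BER, 100)   # hypothetical small order message

print(f"1500-byte packet: {large:.1%} loss")
print(f"100-byte packet:  {small:.1%} loss")
```

Under these assumptions, large packets are lost roughly one time in nine, while small messages fare much better, which fits the HFT preference for short messages and no FEC.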

Market data analysis:
* Bad weather correlates with latency increases
* Some microwave networks appear to have been always up: their latency never exceeded 4.3 ms,
  whereas the lowest fiber latency is 6.65 ms

Questions to reflect on:
* What's the need for cISP-like technologies?
* Are there other approaches to achieving low latency that are much easier to implement?