"Sundry" lecture:
=================
* Cover a few different aspects of how computer systems interact with finance
* Mostly from the technical side
* A few financial aspects for color

3 topics in particular:
=================
* Distributed Systems: Consensus
* Programming Languages: OCaml
* Generative AI: BloombergGPT
* Not exhaustive in any way, just a particular set of examples
* Idea is to whet your appetite so that you can dig deeper as your interests take you
* All of these involve computer systems at the bleeding edge:
--> High-performance consensus, functional programming, and generative AI are all topics of mainstream research in the CS research world

Distributed Systems and Consensus:
=================
* Heart of many large-scale services today
* You may have heard of protocols like Raft and Multi-Paxos
* Typically give you reasonable performance
--> (~10K requests per second, a few ms latency)
* What if you want an order of magnitude better performance?
* One way to do it:
--> Build a highly customized network fabric
--> Use "bare-metal" technologies like RDMA
* Nezha's approach: use clock synchronization instead
* Much more in the Nezha slide deck

Programming Languages and OCaml:
================
* Paper: Caml Trading: experiences with functional programming on Wall Street
* Describes Jane Street's experience with OCaml
* Unusual for its time (and even now):
--> Most finance firms use C++ for performance in production or Python for productivity in research
* What does JS use OCaml for?
--> Live monitoring of risks and positions
--> Trading systems
--> Order management and transmission systems
--> Historical data management
--> Quantitative research

But, first, what is OCaml?
=================
* Functional programming language
* Functions are first-class citizens
* Declarative rather than imperative
* Typically associated with a few different features

Features of functional programming:
===============
* Higher-order functions (functions that take functions as arguments)
--> filter, map, fold, etc.
* Algebraic data types
* Strong static type systems
* Immutability
* Lambda functions
* Many other mainstream languages have functional features now:
--> Python, C++, Go, Java
* Functional programming is now more of a style of programming, rather than an attribute of specific programming languages

Important point that keeps coming up in the paper:
================
* The ability of a system "not to trade"
* cf. safety incidents like Knight Capital Group
* "one of the easiest ways that a trading company can put itself out of business is through faulty software"

What JS values about OCaml:
================
* Readability
* Performance
* Macros

Readability:
================
* Readability is critical for code that makes trading decisions
* Important for code reviews and catching errors before production
* Common practice in most of the software industry, but esp. important in trading
* Terseness: contrast a for loop vs.
a map function
* Immutability: arguments to functions are not mutable by default
* Pattern matching: algebraic data types: give examples of expressions and parsing
* Labeled arguments: so that you don't swap arguments
* Type systems: make illegal states unrepresentable:
--> Make the compiler work for you
* Polymorphic variants: avoid exceptions; instead, define erroneous states as part of the possible values of a data type
* Modularity

Performance:
=================
* They are mostly discussing performance _predictability_
* Easy to determine how fast a piece of code will run and how much space it is going to use
* Important for systems where responsiveness and scalability matter
* Can move garbage-collection work off the critical path
* Foreign function interface (FFI) to interact with C libraries, much like numpy does in Python
--> Need to do FFI to interact with certain native libs
--> Alternative is to use poorer FFI interfaces from other managed langs like C#
* Compiler is simple and straightforward to understand

Macros:
================
* Modify the language at the syntactic level
* camlp4 is a macro system that understands the OCaml AST and can be used to add new syntax to the system or change the meaning of existing syntax
* Allows you to modify/rewrite parts of a program into other parts using a rewrite engine
* Similar in spirit to the C pre-processor

OCaml drawbacks:
================
* Generic operations, e.g., generic printers
* Objects in OCaml can hamper productivity, esp. for programmers from other langs
* Lack of optimizations in the compiler
* Lack of parallelism
* Cathedral development model, cf. Eric Raymond's Cathedral vs. Bazaar
* Programming in the large: ecosystem, build tools, package manager, stdlibs

Some counterintuitive benefits:
================
* Hiring
* Easier for others to become productive in the language

What's unclear:
===============
* Could these benefits have accrued from other functional langs?
* Could we do this using functional programming styles in other mainstream langs?

Why it may have succeeded:
==============
* Stringent requirements for correctness
* Early success in OCaml for research made it easier to adopt as the primary language
* Small team size
* Specialized in-house software

BloombergGPT paper:
=================
* An LLM tailored to finance

What's interesting about it:
================
* A (smaller) LLM tailored to finance: only 50 billion parameters (GPT-4 is rumored to have 1 trillion+)
* Outperforms prior approaches on finance tasks
* Competitive on general-purpose language tasks

Dataset that went into training:
================
* Carefully curated
* Table 1: About half of the tokens are fintech-specific AND the remaining half are public: 363B vs. 345B

Financial datasets:
================
* WEB: crawl of high-quality websites that have financially relevant information, not a general crawl of the web
* NEWS: news sources, excluding news articles written by Bloomberg journalists
* FILINGS: financial statements made by companies and made available to the general public
* PRESS: company press releases
* BLOOMBERG: Bloomberg news, opinion, and analysis; real-time news

Public datasets:
===============
* The Pile (includes GitHub and FreeLaw)
* C4 (includes patents)
* Wikipedia

Tokenization:
==============
* Unigram tokenizer algorithm to decide how to split words into tokens
* Parallel tokenizer training

Training:
=============
* Tried to leverage recent results on Chinchilla scaling laws to decide how big the model should be (parameters) and how big the dataset should be (tokens)
* They arrive at 50B parameters and ~1,000B tokens
* They only have about 700B tokens, limited by the amount of domain-specific data
* This gives them some headroom for failures, restarts, etc.
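The Chinchilla-style sizing above can be sketched as back-of-the-envelope arithmetic. This is a toy illustration, not the paper's exact accounting: it assumes the common approximation that training compute C ≈ 6·N·D FLOPs (N = parameters, D = tokens) plus the Chinchilla rule of thumb D ≈ 20·N, so a compute budget pins down both N and D. The budget value below is chosen purely for illustration.

```python
import math

def chinchilla_sizing(flops_budget: float) -> tuple[float, float]:
    """Toy Chinchilla-style sizing (illustrative, not the paper's numbers).

    Assumes training FLOPs C ~= 6*N*D and the rule of thumb D ~= 20*N,
    so C ~= 120*N**2 and N = sqrt(C / 120).
    """
    n_params = math.sqrt(flops_budget / 120)
    n_tokens = 20 * n_params
    return n_params, n_tokens

# A hypothetical budget of ~3e23 FLOPs lands near the 50B-parameter /
# ~1,000B-token regime discussed above.
n, d = chinchilla_sizing(3e23)
print(f"params ~= {n / 1e9:.0f}B, tokens ~= {d / 1e9:.0f}B")
# -> params ~= 50B, tokens ~= 1000B
```

Note the gap this exposes: the compute-optimal token count (~1,000B) exceeds the ~700B tokens they actually have, which is one reason they stop short of the optimum and keep headroom in the budget.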
* Essentially: answers the question of how best to make use of a compute budget
* Used SageMaker for training
* Training chronicles document their various attempts along the way to getting to their final model

Evaluation:
==============
* Public fintech benchmarks
* Internal benchmarks
* General language benchmarks
* Main takeaway: competitive with much larger models on general tasks, but performs better on financial tasks

Use cases enabled by it:
===============
* Generating Bloomberg Query Language
* Suggesting news headlines given section content
* Financial question answering, e.g., who is the CEO of company X?

Openness:
=============
* Decided not to release the model
* Worries that public model weights could eventually lead to leaks of training data

Takeaways:
============
* Interesting domain-specific use of LLMs
* Not totally clear how it does both (1) better perf on fintech tasks AND (2) competitive perf on general tasks
* Likely because of all the attention that went into data cleaning
* Nicely integrates many cutting-edge AI techniques along with domain-specific data:
--> unigram tokenizer
--> Chinchilla scaling
--> public datasets
--> private datasets, ...
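To make the unigram tokenizer in the takeaways concrete: it assigns each vocabulary piece a probability and, for each word, keeps the segmentation that maximizes the product of piece probabilities (found with a Viterbi search). A minimal sketch with a hypothetical toy vocabulary — in a real tokenizer the pieces and their probabilities are learned during tokenizer training:

```python
import math

# Hypothetical toy vocabulary: piece -> probability. A real unigram
# tokenizer learns tens of thousands of pieces from the training corpus.
VOCAB = {
    "fin": 0.04, "tech": 0.05, "fintech": 0.001,
    "f": 0.01, "i": 0.01, "n": 0.01, "t": 0.01,
    "e": 0.01, "c": 0.01, "h": 0.01,
}

def unigram_tokenize(word: str) -> list[str]:
    """Viterbi search for the segmentation maximizing the sum of log-probs."""
    n = len(word)
    best = [0.0] + [-math.inf] * n   # best[i]: best log-prob for word[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last piece
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in VOCAB:
                score = best[start] + math.log(VOCAB[piece])
                if score > best[end]:
                    best[end], back[end] = score, start
    # Walk the back-pointers to recover the winning pieces.
    pieces, i = [], n
    while i > 0:
        pieces.append(word[back[i]:i])
        i = back[i]
    return list(reversed(pieces))

print(unigram_tokenize("fintech"))  # -> ['fin', 'tech']
```

Here "fin" + "tech" beats both the rare whole-word piece and the character-by-character fallback, which is the core idea: frequent domain-specific strings end up as single high-probability pieces, so financial text tokenizes into fewer, more meaningful units.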