Clojure/West 2015: Notes from Day One

Life of a Clojure Expression

  • John Hume, duelinmarkers.com (DRW trading)
  • a quick tour of clojure internals
  • giving the talk in org mode (!)
  • disclaimers: not an expert, internals can change, excerpts have been mangled for readability
  • most code will be java, not clojure
  • (defn m [v] {:foo "bar" :baz v})
  • minor differences: calculated key, constant values, more than 8 key/value pairs
  • MapReader is looked up in the reader's static array of IFns (the table of reader macros); triggered by the '{' character
  • PersistentArrayMap used for small maps (up to 8 key/value pairs); larger literals become hash maps
  • eval treats forms wrapped in (do..) as a special case
  • if form is non-def bit of code, eval will wrap it in a 0-arity function and invoke it
  • eval’s macroexpand will turn our form into (def m (fn [v] {:foo "bar" :baz v}))
  • checks for duplicate keys twice: once on read, once on analyze, since forms for keys might have been evaluated into duplicates
  • java class emitted at the end with name of our fn tacked on, like: class a_map$m
  • IntelliJ will report a lot of unused methods in the java compiler code; the methods do get invoked, but only at load time, from emitted bytecode that refers to them through asm method strings
  • no supported api for creating small maps with compile-time constant keys; array-map is slow and does a lot of work it doesn’t need to do
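
A few of the points above can be poked at from a REPL. This is a minimal sketch, assuming a stock Clojure REPL; the var names (expanded, big-map, dup-error) are illustrative and not from the talk.

```clojure
;; The talk's example function.
(defn m [v] {:foo "bar" :baz v})

;; defn is a macro; expanding it shows the (def m (fn ...)) form that eval sees.
(def expanded (macroexpand '(defn m [v] {:foo "bar" :baz v})))
(first expanded)                       ; the symbol def

;; A small map literal comes back as a PersistentArrayMap...
(class (m 1))

;; ...while a literal with more than 8 key/value pairs reads as a hash map.
(def big-map (read-string "{1 1, 2 2, 3 3, 4 4, 5 5, 6 6, 7 7, 8 8, 9 9}"))
(class big-map)

;; The reader rejects duplicate literal keys (the first of the two checks).
(def dup-error
  (try
    (read-string "{:a 1 :a 2}")
    (catch Exception e (.getMessage e))))

;; The compiled fn is a java class named after the namespace and the fn.
(.getName (class m))                   ; ends in "$m", e.g. "user$m" in the user namespace
```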

Clojure Parallelism: Beyond Futures

  • Leon Barrett, the climate corporation
  • climate corp: model weather and plants, give advice to farmers
  • wrote Claypoole, a parallelism library
  • map/reduce to compute average: might use future to shove computation of the average divisor (inverse of # of items) off at the beginning, then do the map work, then deref the future at the end
  • future -> future-call: sends the fn-wrapped body to Agent/soloExecutor
  • concurrency vs parallelism: concurrency means things could be re-ordered arbitrarily, parallelism means multiple things happen at once
  • thread pool: recycle a set number of threads to avoid constantly incurring the overhead of creating a new thread
  • agent thread pool: used for agents and futures; program will not exit while its threads are alive; idle threads live for 60 sec
  • future limitations
    • tasks too small for the overhead
    • exceptions get wrapped in ExecutionException, so your try/catches won’t work normally anymore
  • pmap: just a parallel map; lazy; runs N-cpu + 3 tasks in futures
    • generates threads as needed; could have problems if you’re creating multiple pmaps at once
    • slow task can stall it, since it waits for the first task in the sequence to complete for each trip through
    • also wraps exceptions just like future
  • laziness and parallelism: don’t mix
  • core.async
    • channels and coroutines
    • reads like go
    • fixed-size thread pool
    • handy when you’ve got a lot of callbacks in your code
    • mostly for concurrency, not parallelism
    • can use pipeline for some parallelism; it’s like a pmap across a channel
    • exceptions can kill coroutines
  • claypoole
    • pmap that uses a fixed-size thread pool
    • with-shutdown! will clean up thread pool when done
    • eager by default
    • output is an eagerly streaming sequence
    • also get pfor (parallel for)
    • lazy versions are available; can be better for chaining (fast pmap into slow pmap would have speed mismatch with eagerness)
    • exceptions are re-thrown properly
    • no chunking worries
    • can have priorities on your tasks
  • reducers
    • uses fork/join pool
    • good for cpu-bound tasks
    • gives you a parallel reduce
  • tesser
    • distributable on hadoop
    • designed to max out cpu
    • gives parallel reduce as well (fold)
  • tools for working with parallelism:
    • promises: block tasks at a known point so you can freeze the state of the world and check things
    • YourKit for jvm profiling
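
The future, pmap, and reducers notes above can be sketched with nothing but core Clojure. This is a rough illustration, not code from the talk; average, future-failure-cause, and slow-inc are made-up names, and Claypoole itself is left out because it needs an extra dependency.

```clojure
(require '[clojure.core.reducers :as r])

;; Kick off the divisor in a future, do the main work, deref at the end.
(defn average [xs]
  (let [inverse-count (future (/ 1 (count xs)))
        total         (reduce + xs)]
    (* total @inverse-count)))

(average [1 2 3 4])                          ; => 5/2

;; deref rethrows a future's failure wrapped in ExecutionException,
;; so the original exception has to be dug out with .getCause.
(defn future-failure-cause []
  (try
    @(future (throw (IllegalStateException. "boom")))
    (catch java.util.concurrent.ExecutionException e
      (class (.getCause e)))))

(future-failure-cause)                       ; => java.lang.IllegalStateException

;; pmap runs tasks in futures but returns results in input order.
(defn slow-inc [x]
  (Thread/sleep 10)
  (inc x))

(doall (pmap slow-inc (range 8)))            ; => (1 2 3 4 5 6 7 8)

;; reducers: a parallel reduce (fold) over a vector, backed by fork/join.
(r/fold + (r/map inc (vec (range 1000))))    ; => 500500
```

Because futures run on the agent thread pool, a script like this lingers for up to a minute after finishing unless it ends with (shutdown-agents).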

Boot Can Build It

  • Alan Dipert and Micha Niskin, adzerk
  • why a new build tool?
    • build tooling hasn’t kept up with the complexity of deploys
    • especially for web applications
    • builds are processes, not specifications
    • most tools (maven, ant) are oriented around configuration instead of programming
  • boot
    • many independent parts that do one thing well
    • composition left to the user
    • maven for dependency resolution
    • builds clojure and clojurescript
    • sample boot project has main method (they used java project for demo)
    • uses '--' for piping tasks together on the command line (instead of the real |)
    • filesets are generated and passed to a task, then output of task is gathered up and sent to the next task in the chain (like ring middleware)
  • boot has a repl
    • can do most boot tasks from the repl as well
    • can define new build tasks via deftask macro
    • (deftask build …)
    • (boot (watch) (build))
  • make build script: (build.boot)
    • #!/usr/bin/env boot
    • write in the clojure code defining and using your boot tasks
    • if it’s in build.boot, boot will find it on command line for help and automatically write the main fn for you
  • FileSet: immutable snapshot of the current files; passed to task, new one created and returned by that task to be given to the next one; task must call commit! to commit changes to it (a la git)
  • dealing with dependency hell (conflicting dependencies)
    • pods
    • isolated runtimes, with own dependencies
    • some things can’t be passed between pods (such as the things clojure runtime creates for itself when it starts up)
    • example: define pod with env that uses clojure 1.5.1 as a dependency, can then run code inside that pod and it’ll only see clojure 1.5.1
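
The fileset-as-middleware idea can be modeled in a few lines of plain Clojure. This is a toy model only, not Boot's API: the "fileset" is just a map of path to contents, and add-file, upcase-all, and run-tasks are hypothetical names invented for the sketch.

```clojure
(require '[clojure.string :as str])

;; A "fileset" here is an immutable map of path -> contents.
;; A "task" is middleware: it takes the next handler and returns a handler
;; that transforms the fileset and passes the new one along.
(defn add-file [path contents]
  (fn [next-handler]
    (fn [fileset]
      (next-handler (assoc fileset path contents)))))

(defn upcase-all []
  (fn [next-handler]
    (fn [fileset]
      (next-handler
        (into {} (for [[path contents] fileset]
                   [path (str/upper-case contents)]))))))

;; Compose the tasks like ring middleware; with comp, the leftmost task
;; sees the fileset first.
(defn run-tasks [fileset & tasks]
  (((apply comp tasks) identity) fileset))

(def initial {"README" "hi"})
(def result (run-tasks initial (add-file "a.txt" "hello") (upcase-all)))

result    ; => {"README" "HI", "a.txt" "HELLO"}
initial   ; => {"README" "hi"}, the original snapshot is untouched
```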

One Binder to Rule Them All: Introduction to Trapperkeeper

  • Ruth Linehan and Nathaniel Smith; puppetlabs
  • back-end service engineers at puppetlabs
  • service framework for long-running applications
  • basis for all back-end services at puppetlabs
  • service framework:
    • code generalization
    • component reuse
    • state management
    • lifecycle
    • dependencies
  • why trapperkeeper?
    • influenced by clojure reloaded pattern
    • similar to component and jig
    • puppetlabs ships on-prem software
    • need something for users to configure, may not have any clojure experience
    • needs to be lightweight: don’t want to ship jboss everywhere
  • features
    • turn on and off services via config
    • multiple web apps on a single web server
    • unified logging and config
    • simple config
  • existing services that can be used
    • config service: for parsing config files
    • web server service: easily add ring handler
    • nrepl service: for debugging
    • rpc server service: written by nathaniel
  • demo app: github -> trapperkeeper-demo
  • anatomy of service
    • protocol: specifies the api contract that the service will have
    • can have any number of implementations of the contract
    • can choose between implementations at runtime
  • defservice: like defining a protocol implementation, one big series of defs of fns: (init [this context] (let …))
    • handle dependencies in defservice by vector after service name: [[:ConfigService get-in-config] [:MeowService meow]]
    • lifecycle of the service: what happens when initialized, started, stopped
    • don’t have to implement every part of the lifecycle
  • config for the service: pulled from file
    • supports .json, .edn, .conf, .ini, .properties, .yaml
    • can specify single file or an entire directory on startup
    • they prefer .conf (HOCON)
    • have to use the config service to get the config values
    • bootstrap.cfg: the config file that controls which services get picked up and loaded into app
    • order is irrelevant: will be decided based on parsing of the dependencies
  • context: way for service to store and access state locally rather than globally
  • testing
    • should write code as plain clojure
    • pass in context/config as plain maps
    • trapperkeeper provides helper utilities for starting and stopping services via code
    • with-app-with-config macro: offers symbol to bind the app to, plus define config as a map, code will be executed with that app binding and that config
  • there’s a lein template for trapperkeeper that stubs out working application with web server + test suite + repl
  • repl utils:
    • start, stop, inspect TK apps from the repl: (go); (stop)
    • don’t need to restart whole jvm to see changes: (reset)
    • can print out the context: (:MeowService (context))
  • trapperkeeper-rpc
    • macro for generating RPC versions of existing trapperkeeper protocols
    • supports https
    • defremoteservice
    • with web server on one jvm and core logic on a different one, can scale them independently; can keep web server up even while swapping out or starting/stopping the core logic server
    • future: rpc over ssl websockets (using transit over message-pack for data transmission); metrics; function retrying; load balancing
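
The "protocol plus interchangeable implementations, configured by plain maps" shape can be shown without Trapperkeeper at all. The sketch below is plain Clojure and does not use defservice or any Trapperkeeper function; the config keys (:meow, :impl, :sound) and the function names are assumptions made up for illustration.

```clojure
(require '[clojure.string :as str])

;; The api contract, as a plain protocol.
(defprotocol MeowService
  (meow [this]))

;; Two implementations of the same contract; each takes config as a plain map.
(defn loud-meow-service [config]
  (let [sound (get-in config [:meow :sound] "meow")]
    (reify MeowService
      (meow [_] (str (str/upper-case sound) "!")))))

(defn quiet-meow-service [config]
  (let [sound (get-in config [:meow :sound] "meow")]
    (reify MeowService
      (meow [_] sound))))

;; Pick the implementation at runtime, based on config.
(def implementations
  {:loud  loud-meow-service
   :quiet quiet-meow-service})

(defn start-meow-service [config]
  (let [make-service (get implementations (get-in config [:meow :impl]))]
    (make-service config)))

(meow (start-meow-service {:meow {:impl :loud :sound "mrow"}}))   ; => "MROW!"
(meow (start-meow-service {:meow {:impl :quiet}}))                ; => "meow"
```

Writing the logic this way also matches the testing advice: the functions take plain maps, so tests need no running framework.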

Domain-Specific Type Systems

  • Nathan Sorenson, sparkfund
  • you can type-check your dsls
  • libraries are often examples of dsls: not necessarily macros involved, but have opinionated way of working within a domain
  • many examples pulled from “How to Design Programs”
  • domain represented as data, interpreted as information
  • type structure: syntactic means of enforcing abstraction
  • abstraction is a map to help a user navigate a domain
    • audience is important: would give different map to pedestrian than to bus driver
  • can also think of abstraction as specification: it dictates what should be built, or how many things should be built alike
  • showing inception to programmers is like showing jaws to a shark
  • fable: parent trap over complex analysis
  • moral: types are not data structures
  • static vs dynamic specs
    • static: types; things as they are at compile time; definitions and derivations
    • dynamic: things as they are at runtime; unit tests and integration tests; expressed as falsifiable conjectures
  • types not always about enforcing correctness, so much as describing abstractions
  • simon peyton jones: types are the UML of functional programming
  • valuable habit: think of the types involved when designing functions
  • spec-tacular: more structure for datomic schemas
    • from sparkfund
    • the type system they wanted for datomic
    • open source but not quite ready for public consumption just yet
    • datomic too flexible: attributes can be attached to any entity, relationships can happen between any two entities, no constraints
    • use specs to articulate the constraints
    • (defspec Lease [lessee :is-a Corp] [clauses :is-many String] [status :is-a Status])
    • (defenum Status …)
    • wrote query language that’s aware of the defined types
    • uses bi-directional type checking: github.com/takeoutweight/bidirectional
    • can write sensical error messages: Lease has no field ‘lesee’
    • can pull type info from their type checker and feed it into core.typed and let core.typed check use of that data in other code (enforce types)
    • does handle recursive types
    • no polymorphism
  • resources
    • practical foundations for programming languages: robert harper
    • types and programming languages: benjamin c pierce
    • study haskell or ocaml; they’ve had a lot of time to work through the problems of types and type theory
  • they’re using spec-tacular in production now, even using it to generate type docs that are useful for non-technical folks to refer to and discuss; but don’t feel the code is at the point where other teams could pull it in and use it easily
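
To make "use specs to articulate the constraints" concrete, here is a small runtime check in plain Clojure. It is not spec-tacular's API and it is not a static type checker; lease-spec and check are hypothetical, Corp is simplified to a string, and the members of the Status enum were elided in the notes, so status is only checked to be a keyword.

```clojure
;; A spec as plain data: field name -> predicate the value must satisfy.
(def lease-spec
  {:name   "Lease"
   :fields {:lessee  string?                                    ; stands in for Corp
            :clauses #(and (sequential? %) (every? string? %))
            :status  keyword?}})                                ; enum members unknown

;; Return a vector of error messages; empty means the entity conforms.
(defn check [spec entity]
  (vec
    (for [[field value] entity
          :let  [valid? (get-in spec [:fields field])]
          :when (or (nil? valid?) (not (valid? value)))]
      (if (nil? valid?)
        (str (:name spec) " has no field '" (name field) "'")
        (str (:name spec) " field '" (name field) "' has an invalid value")))))

(check lease-spec {:lessee "Acme" :clauses ["pay rent"] :status :active})
;; => []

(check lease-spec {:lesee "Acme"})
;; => ["Lease has no field 'lesee'"]
```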

ClojureScript Update

  • David Nolen
  • ambly: cljs compiled for iOS
  • uses bonjour and webdav to target ios devices
  • creator already has app in app store that was written entirely in clojurescript
  • can connect to device and use repl to write directly on it (!)

Clojure Update

  • Alex Miller
  • clojure 1.7 is at 1.7.0-beta1 -> final release approaching
  • transducers coming
  • define a transducer as a set of operations on a sequence/stream
    • (def xf (comp (filter odd?) (map inc) (take 5)))
  • then apply transducer to different streams
    • (into [] xf (range 1000))
    • (transduce xf + 0 (range 1000))
    • (sequence xf (range 1000))
  • reader conditionals
    • portable code across clj platforms
    • new extension: .cljc
    • use to select out different expressions based on platform (clj vs cljs)
    • #?(:clj (java.util.Date.)
      :cljs (js/Date.))
    • can fall through the conditionals and emit nothing (not nil, but literally don’t emit anything to be read by the reader)
  • performance has also been a big focus
    • reduced class lookups for faster compile times
    • iterator-seq is now chunked
    • multimethod default value dispatch is now cached
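
The transducer and reader-conditional notes above can be tried directly, assuming Clojure 1.7 or later. Reader conditionals are normally only read from .cljc files, so the sketch goes through read-string with the :read-cond option to show them at a plain REPL.

```clojure
;; One transducer, defined once...
(def xf (comp (filter odd?) (map inc) (take 5)))

;; ...applied in different contexts.
(into [] xf (range 1000))          ; => [2 4 6 8 10]
(transduce xf + 0 (range 1000))    ; => 30
(sequence xf (range 1000))         ; => (2 4 6 8 10)

;; Reader conditionals pick a branch per platform.
(read-string {:read-cond :allow} "#?(:clj :jvm :cljs :js)")   ; => :jvm on the JVM

;; When no branch matches, the reader emits nothing at all (not nil).
(read-string {:read-cond :allow} "[1 #?(:cljs 2) 3]")         ; => [1 3] on the JVM
```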