Notes from Clojure/Conj 2017

It was a good Conj. Saw several friends and former co-workers of mine, heard some great talks about machine learning, and got some good ideas to take back to my current gig.

There were some dud talks, too, and others that promised one (awesome) thing and delivered another (boring) thing, but overall it’s inspiring to see how far Clojure has come in just ten years.

My notes from the conference:

DAY ONE

KEYNOTE FROM Rich Hickey: Effective Programs, or: 10 years of Clojure

  • clojure released 10 years ago
  • never thought more than 100 people would use it
  • clojure is opinionated
    • few idioms, strongly supported
    • born out of the pain of experience
  • had been programming for 18 years when wrote clojure, mostly in c++, java, and common lisp
  • almost every project used a database
  • was struck by two language designers that talked disparagingly of databases, said they’d never used them
  • situated programs
    • run for long time
    • data-driven
    • have memory that usually resides in db
    • have to handle the weirdness of reality (ex: “two-fer tuesday” for radio station scheduling)
    • interact with other systems and humans
    • leverage code written by others (libraries)
  • effective: producing the intended result
    • prefers it to the word “correctness”; none of his programs ever cared about correctness
  • but: what is computing about?
    • making computers effective in the world
    • computers are effective in the same way people are:
      • generate predictions from experience
      • enable good decisions
    • experience => information => facts
  • programming is NOT about itself, or just algorithms
  • programs are dominated by information processing
  • but that’s not all: when we start talking to the database or other libraries, we need different protocols to talk to them
  • but there’s more! everything continues to mutate over time (db changes, requirements change, libraries change, etc)
  • we aspire to write general purpose languages, but will get very different results depending on your target (phone switches, device drivers, etc)
  • clojure written for information-driven situated programs
  • clojure design objectives
    • create programs out of simpler stuff
    • want a low cognitive load for the language
    • a lisp you can use instead of java/c# (his common lisp programs were never allowed to run in production)
  • says classes and algebraic types are terrible for the information programming problem, claims there are no first-class names, and nothing is composable
  • in contrast to java’s multitude of classes and haskell’s multitude of types, clojure says “just use maps”
  • says pattern matching doesn’t scale, flowing type information between programs is a major source of coupling and brittleness
  • positional semantics (arg-lists) don’t scale, eventually you get a function with 17 args, and no one wants to use it
  • sees passing maps as args as a way to attach names to things, thinks it’s superior to positional args or typed args
  • “types are an anti-pattern for program maintenance”
  • using maps means you can deal with them on a need-to-know basis
  • things he left out deliberately:
    • parochialism: data types
    • “rdf got it right”, allows merging data from different sources, without regard for how the schemas differ
    • “more elaborate your type system, more parochial the types”
    • in clojure, namespace-qualified keys allow data merging without worrying about colliding schemas (should use the reverse-domain scheme, same as java, but not all clojure libraries do)
    • another point: when data goes out over the wire, it’s simple: strings, vectors, maps. clojure aims to have you program the same inside as outside
  • smalltalk and common lisp: both languages that were designed by people for working programmers, and it shows
    • surprisingly, the jvm has a similar sensibility (though java itself doesn’t)
  • also wanted to nail concurrency
    • functional gets you 90% of the way there
  • pulled out the good parts of lisp
  • fixed the bad parts: not everything is a list, packages are bad, cons cell is mutable, lists were kind of functional, but not really
  • edn data model: values only, the heart of clojure, compatible with other languages, too
  • static types: basically disagrees with everything from the Joy of Types talk
  • spec: clojure is dynamic for good reasons, it’s not arbitrary, but if you want checking, it should be your choice, both to use it at all and where to use it
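A minimal sketch of the “just use maps” and namespace-qualified-key points above, written in Python rather than Clojure and not taken from the talk. The key names (`com.acme.billing/plan` and friends) and the `merge` and `describe` helpers are invented for illustration.

```python
# Two sources describe the same customer using reverse-domain-qualified keys
# (hypothetical names), so merging them cannot collide.
billing = {"com.acme.billing/id": 42, "com.acme.billing/plan": "pro"}
support = {"com.acme.support/id": "T-17", "com.acme.support/open-tickets": 3}


def merge(*maps):
    """Merge plain maps left to right (a shallow merge)."""
    out = {}
    for m in maps:
        out.update(m)
    return out


def describe(customer):
    """Reads only the one key it needs; every other key is ignored."""
    return "plan=" + customer["com.acme.billing/plan"]


customer = merge(billing, support)
print(describe(customer))  # plan=pro
print(len(customer))       # 4
```

`describe` keeps working if either source adds more keys later, which is the need-to-know point: the function names what it uses and nothing else.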

Learning Clojure and Clojurescript by Playing a Game

  • inspired by the gin rummy card game example in dr scheme for the scheme programming language
  • found the java.awt.Robot class, which can take screenshots and move the mouse, click things
  • decided to combine the two, build a robot that could play gin rummy
  • robot takes a screenshot, finds the cards, their edges, and which ones they are, then plays the game
  • lessons learned:
    • java interop was great
  • when clojurescript came out, decided to rebuild it, but in the browser
  • robot still functions independently, but now takes screenshot of the browser-based game
  • built a third version with datomic as a db to store state, allowing two clients to play against each other
  • absolutely loves the “time travel” aspects of datomic
  • also loves pedestal

Bayesian Data Analysis in Clojure

  • using clojure for about two years
  • developed toolkit for doing bayesian statistics in clojure
  • why clojure?
    • not as many existing tools as julia or R
    • but: easier to develop new libraries than in julia (and most certainly R)
    • other stats languages like matlab and R don’t require much programming knowledge to get started, but harder to dev new tools in them
  • michaellindon/distributions
    • open-source clojure lib for generating and working with probability distributions in clojure
    • can also provide data and prior to get posterior distribution
    • and do posterior-predictive distributions
  • wrote a way to generate random functions over a set of points (for trying to match noisy non-linear data)
  • was easy in clojure, because of lazy evaluation (can treat the function as defined over an infinite vector, and only pull out the values we need, without blowing up)
  • …insert lots of math that i couldn’t follow…
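For the “provide data and prior to get posterior distribution” bullet, here is about the smallest worked case I can think of: a Beta prior updated with 0/1 observations. It is a Python sketch of the general idea, not the API of michaellindon/distributions; the function names and the data are made up.

```python
from fractions import Fraction


def beta_posterior(alpha, beta, observations):
    """Conjugate update: Beta(alpha, beta) prior + 0/1 data -> Beta posterior."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def predictive_success(alpha, beta):
    """Posterior-predictive probability that the next observation is a 1."""
    return Fraction(alpha, alpha + beta)


prior = (1, 1)                    # uniform prior (an assumption)
data = [1, 0, 1, 1, 0, 1, 1, 1]   # made-up observations: six 1s, two 0s
post = beta_posterior(*prior, data)
print(post)                       # (7, 3)
print(predictive_success(*post))  # 7/10
```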

Building Machine Learning Models with Clojure and Cortex

  • came from a python background for machine learning
  • thinks there’s a good intersection between functional programming and machine learning
  • will show how to build a baby classification model in clojure
  • expert systems: dominant theory for AI through 2010s
    • limitations: sometimes we don’t know the rules, and sometimes we don’t know how to teach a computer the rules (even if we can articulate them)
  • can think of the goal of machine learning as being to learn the function F that, when applied to a set of inputs, will produce the correct outputs (labels) for the data
  • power of neural nets: assume that they can make accurate approximations for a function of any dimensionality (using simpler pieces)
  • goal of neural nets is to learn the right coefficients or weights for each of the factors that affect the final label
  • deep learning: a “fat” neural net…basically a series of layers of perceptrons between the input and output
  • why clojure? we already have a lot of good tools in other languages for doing machine learning: tensorflow, caffe, theano, torch, deeplearning4j
  • functional composition: maps well to neural nets and perceptrons
  • also maps well to basically any ML pipeline: data loading, feature extraction, data shuffling, sampling, recursive feedback loop for building the model, etc
  • clojure really strong for data processing, which is a large part of each step of the ML pipeline
    • ex: lazy sequences really help when processing batches of data multiple times
    • can also do everything we need with just the built-in data structures
  • cortex: meant to be the theano of clojure
    • basically: import everything from it, let it do the heavy lifting
  • backend: compute lib executes on both cpu and gpu
  • implements as much of neural nets as possible in pure clojure
  • meant to be highly transparent and highly customizable
  • cortex represents neural nets as DAG, just like tensorflow
    • nodes, edges, buffers, streams
  • basically, a map of maps
    • can go in at any time and see exactly what all the parameters are, for everything
  • basic steps to building model:
    • load and process data (most time consuming step until you get to tuning the model)
    • define minimal description of the model
    • build full network from that description and train it on the data
  • for example: chose a credit card fraud dataset
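To make “functional composition maps well to neural nets” and “a map of maps” concrete, here is a toy forward pass in plain Python. It is a sketch under my own assumptions, not Cortex code: the `dense`, `relu`, and `compose` helpers and the hand-picked (untrained) weights are all hypothetical.

```python
from functools import reduce


def dense(weights, bias):
    """Return a layer function: one weighted sum per output unit."""
    def layer(inputs):
        return [sum(w * x for w, x in zip(row, inputs)) + b
                for row, b in zip(weights, bias)]
    return layer


def relu(inputs):
    return [max(0.0, x) for x in inputs]


def compose(*layers):
    """Left-to-right composition: each layer's output feeds the next layer."""
    return lambda inputs: reduce(lambda acc, layer: layer(acc), layers, inputs)


# The whole network is described by a map of maps, so every parameter is inspectable.
description = {
    "hidden": {"weights": [[1.0, -1.0], [-1.0, 1.0]], "bias": [0.0, 0.0]},
    "output": {"weights": [[1.0, 1.0]], "bias": [0.0]},
}

net = compose(
    dense(**description["hidden"]),
    relu,
    dense(**description["output"]),
)
print(net([3.0, 1.0]))  # [2.0]
```

With these particular weights the net computes the absolute difference of its two inputs; training would mean searching for such weights instead of writing them by hand.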

Clojure: Scaling the Event Stream

  • director, programmer of his own company
  • recommends ccm-clj for cassandra-clojure interaction
  • expertise: high-availability streaming systems (for smallish clients)
  • systems he builds deal with “inconvenient” sized data for non-apple-sized clients
  • has own company: troy west, founded three years ago
  • one client: processes billions of emails, logs 10–100 events per email, multiple systems log in different formats, 5K–50K event/s
  • 10–100 TB of data
  • originally, everything logged on disk for analysis after the fact
  • requirements: convert events into meaning, support ad-hoc querying, generate reports, do real-time analysis and alerting, and do it all without falling over at scale or losing uptime
  • early observations:
    • each stream is a seq of immutable facts
    • want to index the stream
    • want to keep the original events
    • want idempotent writes
    • just transforming data
  • originally reached for java, since that’s the language he’s used to using
  • data
    • in-flight: kafka
    • compute over the data: storm (very powerful, might move in the direction of onyx later on)
    • at-rest: cassandra (drives more business to his company than anything else)
  • kafka: partitioning really powerful tool for converting one large problem into many smaller problems
  • storm: makes it easy to spin up more workers to process individual pieces of your computation
  • cassandra: their source of truth
  • query planner, query optimizer: services written in clojure, instead of throwing elasticsearch at the problem
  • recommends: Designing Data-Intensive Applications, by Martin Kleppmann
  • thinks these applications are clojure’s killer app
  • core.async gave them fine-grained control of parallelism
  • recommends using pipeline-async as add-on tool
  • composable channels are really powerful: they let you set off several parallel ops at once, then have another process combine their results as they return and produce another channel
  • but: go easy on the hot sauce, gets very tempting to put it everywhere
  • instaparse lib critical to handling verification of email addresses
  • REPL DEMO
  • some numbers: 0 times static types would have saved the day, 27 repos, 2 team members
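Two of the early observations above (partition one big stream into many small ones, and make writes idempotent) fit in a few lines. This is a Python sketch of the ideas only; it does not use Kafka or Cassandra, and `partition_for`, `EventStore`, and the event fields are invented for illustration.

```python
import hashlib


def partition_for(key, partitions):
    """Stable partition choice: the same key always lands on the same partition."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


class EventStore:
    """Toy at-rest store keyed by event id, so replays do not duplicate rows."""

    def __init__(self):
        self.rows = {}

    def write(self, event):
        # Writing the same immutable fact twice leaves the store unchanged.
        self.rows[event["id"]] = event


events = [
    {"id": "e1", "email": "m-100", "type": "queued"},
    {"id": "e2", "email": "m-100", "type": "delivered"},
    {"id": "e1", "email": "m-100", "type": "queued"},  # redelivered duplicate
]

store = EventStore()
for event in events:
    store.write(event)

print(len(store.rows))                                         # 2
print(partition_for("m-100", 8) == partition_for("m-100", 8))  # True
```

Partitioning by the email id keeps every event for one email on one partition, so a single worker sees them in order.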

DAY TWO

The Power of Lacinia and Hystrix in Production

  • few questions:
    • anyone tried to combine lacinia and hystrix?
    • anyone played with lacinia?
    • anyone used graphql?
    • anyone used hystrix?
  • hystrix : circuit-breaker implementation
  • lacinia: walmart-labs’ graphql
  • why both?
  • simple example: ecommerce site, aldo shoes, came to his company wanting to renovate the whole website
  • likes starting his implementations by designing the model/schema
  • in this case, products have categories, and categories have parent/child categories, etc
  • uses graphviz to write up his model designs
  • initial diagram renders it all into a clojure map
  • they have a tool called umlaut that they used to write a schema in a single language, then generate via instaparse representations in graphql, or clojure schema, etc
  • lacinia resolver: takes a graphql query and returns json result
  • lacinia ships with a react application called GraphiQL, which lets you explore your db through the browser (via live queries, etc)
  • gives a lot of power to the front-end when you do this, lets them change their queries on the fly, without having to redo anything on the backend
  • problem: the images are huge, 3200×3200 px
  • need something smaller to send to users
  • add a new param to the schema: image-obj, holds width and height of the image
  • leave the old image attribute in place, so don’t break old queries
  • can then write new queries on the front-end for the new attribute, fetch only the size of image that you want
  • one thing he’s learned from marathon running (and stolen from the navy seals): “embrace the suck.” translation: the situation is bad. deal with it.
  • his suck: ran into problem where front-end engineers were sending queries that timed out against the back-end
  • root cause: front-end queries hitting backend that used 3rd-party services that took too long and broke
  • wrote a tiny latency simulator: added random extra time to round-trip against db
  • even with 100ms max, latency diagram showed ~6% of the requests (top-to-bottom) took over 500ms to finish
  • now tweak it a bit more: have two dependencies, and one of them has a severe slowdown
  • now latency could go up to MINUTES
  • initial response: reach for bumping the timeouts
  • time for hystrix: introduce a circuit breaker into the system, to protect the system as a whole when an individual piece goes down
  • hystrix has an official clojure wrapper (!)
  • provides a macro: defcommand, wrap it around functions that will call out to dependencies
  • if it detects a long timeout, in the future, it will fail immediately, rather than waiting
  • as part of the macro, can also specify a fallback-fn, to be called when the circuit breaker is tripped
  • adding that in, the latency diagram is completely different. performance stays fast under much higher load
  • fallback strategies:
    • fail fast
    • fail silently
    • send static content
    • use cached content
    • use stubbed content: infer the proper response, and send it back
    • chained fallbacks: a little more advanced, like connecting multiple circuit breakers in a row, in case one fails, the other can take over
  • hystrix dashboard: displays info on every defcommand you’ve got, tracks health, etc
  • seven takeaways
    • MUST embrace change in prod
    • MUST embrace failure: things are going to break, you might as well prepare for it
    • graphql is just part of the equation, if your resolvers get too complex, can introduce hystrix and push the complexity into other service
    • monitor at the function level (via hystrix dashboard)
    • adopt a consumer-driven mindset: the users have the money, don’t leave their money on the table by giving them a bad experience
    • force yourself to think about fallbacks
    • start thinking about the whole product: production issues LOOK to users like production features
  • question: do circuit-breakers introduce latency?
    • answer: a little bit at the upper end, once it’s been tripped
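The circuit-breaker-plus-fallback idea is easier to see in code than in bullets. Below is a toy breaker in Python, a sketch of the pattern only: it is not Hystrix's implementation or its `defcommand` API, Hystrix itself does considerably more, and the class, parameter, and function names here are all my own assumptions.

```python
import time


class CircuitBreaker:
    """Toy breaker: opens after `max_failures` errors, retries after `reset_after` seconds."""

    def __init__(self, call, fallback, max_failures=3, reset_after=30.0,
                 clock=time.monotonic):
        self.call, self.fallback = call, fallback
        self.max_failures, self.reset_after, self.clock = max_failures, reset_after, clock
        self.failures = 0
        self.opened_at = None

    def __call__(self, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return self.fallback(*args)  # open: fail fast, skip the dependency
            self.opened_at = None            # half-open: allow one trial call
        try:
            result = self.call(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return self.fallback(*args)
        self.failures = 0
        return result


calls = []


def flaky_inventory(sku):
    """Stand-in for a slow third-party dependency that always fails."""
    calls.append(sku)
    raise TimeoutError("third-party service too slow")


def stubbed_inventory(sku):
    """Fallback: stubbed content instead of an error."""
    return {"sku": sku, "in_stock": None, "source": "fallback"}


lookup = CircuitBreaker(flaky_inventory, stubbed_inventory, max_failures=2)
results = [lookup("shoe-1") for _ in range(5)]
print(len(calls))             # 2
print(results[-1]["source"])  # fallback
```

Five requests came in, but the failing dependency was only hit twice; the last three were answered immediately by the fallback, which is where the flatter latency diagram comes from.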

The Tensors Must Flow

  • works at magento, lives in philly
  • really wants to be sure our future robot masters are written in clojure, not python
  • guildsman: tensorflow library for clojure
  • tensorflow: ML lib from google, recently got a c api so other languages can call into it
  • spoiler alert: don’t get TOO excited. everything’s still a bit of a mess
  • but it DOES work, promise
  • note on architecture: the python client (from google) has access to a “cheater” api that isn’t part of the open c api. thus, there’s some things it can do that guildsman can’t because the api isn’t there
  • also: ye gods, there’s a lot of python in the python client. harder to port everything over to guildsman than he thought
  • very recently, tensorflow started shipping with a java layer built on top of a c++ lib (via jni), which itself sits on top of the tensorflow c api, some people have started building on top of that
  • but not guildsman: it sits directly on the c api
  • in guildsman: put together a plan, then build it, and execute it
  • functions like guildsman/add produce plan maps, instead of executing things themselves
  • simple example: adding two numbers: just one line in guildsman
  • another simple example: have machine learn to solve | x – 2.0 | by finding the value of x that minimizes it
  • tensorflow gives you the tools to find minima/maxima: gradient descent, etc
  • gradient gap: guildsman can use either the clojure gradients, or the c++ ones, but not both at once
    • needs help to port the c++ ones over to clojure (please!)
  • “python occupies the 9th circle of dependency hell”: been using python lightly for years, and still has problems getting dependencies resolved (took a left turn at the virtual environment, started looking for my oculus rift)
  • demo: using mnist dataset, try to learn to recognize handwritten characters
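The “functions produce plan maps instead of executing things themselves” idea, sketched in Python. This illustrates plan-then-execute in general and is not Guildsman's actual data format: `const`, `add`, `mul`, `execute`, and the map keys are hypothetical.

```python
def const(value):
    return {"op": "const", "value": value}


def add(a, b):
    # Builds a description of the addition; nothing is computed yet.
    return {"op": "add", "inputs": [a, b]}


def mul(a, b):
    return {"op": "mul", "inputs": [a, b]}


def execute(plan):
    """Walk the plan recursively and compute its value."""
    if plan["op"] == "const":
        return plan["value"]
    args = [execute(p) for p in plan["inputs"]]
    if plan["op"] == "add":
        return args[0] + args[1]
    if plan["op"] == "mul":
        return args[0] * args[1]
    raise ValueError("unknown op: " + plan["op"])


plan = add(const(1.0), mul(const(2.0), const(3.0)))
print(plan["op"])     # add
print(execute(plan))  # 7.0
```

Because the plan is plain data, it can be inspected, rewritten, or handed to a different executor before anything runs.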

The Dawn of Lisp: How to Write Eval and Apply in Clojure

  • educator, started using scheme in 1994, picked up clojure in 2009
  • origins of lisp: john mccarthy’s paper: recursive functions of symbolic expressions and their computation by machine, part i
  • implementation of the ideas of alonzo church, from his book “the calculi of lambda-conversion”
  • “you can always tell the lisp programmers, they have pockets full of punch cards with lots of closing parentheses on them”
  • steve russell (one of the creators of spacewar!) was the first to actually implement the description from mccarthy’s paper
  • 1962: lisp 1.5 programmer’s manual, included section on how to define lisp in terms of itself (section 1.6: a universal lisp function)
  • alan kay described this definition (of lisp in terms of lisp) as the maxwell equations of software
  • how eval and apply work in clojure:
    • eval: send it a quoted list (data structure, which is also lisp), eval will produce the result from evaluating that list
      • ex: (eval '(+ 2 2)) => 4
    • apply: takes a function and a quoted list, applies that function to the list, then returns the result
      • ex: (apply + '(2 2)) => 4
  • rules for converting the lisp 1.5 spec to clojure
    • convert all m-expressions to s-expressions
    • keep the definitions as close to original as possible
    • drop the use of dotted pairs
    • give all global identifiers a ‘$’ prefix (not really the way clojure says it should be used, but helps the conversion)
    • add whitespace for legibility
  • m-expressions vs s-expressions:
    • F[1;2;3] becomes (F 1 2 3)
    • [X < 0 -> -X; T -> X] becomes (COND ((< X 0) (- X)) (T X))
  • dotted pairs
    • basically (CONS (QUOTE A) (QUOTE B))
  • definitions: $T -> true, $F -> false, $NIL, $cons, $atom, $eq, $null, $car, $cdr, $caar, $cdar, $caddr, $cadar
    • note: anything that cannot be divided is an atom, no relation to clojure atoms
    • last few: various combos of car and cdr for convenience
  • elaborate definitions:
    • $cond: own version of cond to keep syntax close to the original
    • $pairlis: accepts three args: two lists, and a list of existing pairs, combines the first two lists pairwise, and combines with the existing paired list
    • $assoc: lets you pull key-value pair out of an association list (list of paired lists)
    • $evcon: takes list of paired conditions and expressions, plus a context, will return the result of the expression for the first condition that evaluates to true
    • $evlist: takes list of expressions, with a condition, and a context, and then evaluates the result of the condition + the expression in a single list
    • $apply
    • $eval
  • live code demo
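To show how little machinery the mutually recursive eval/apply pair needs, here is a cut-down version in Python. It is a sketch in the spirit of the Lisp 1.5 definition, not the speaker's Clojure port: s-expressions are Python lists, symbols are strings, dotted pairs are dropped (as in the talk), and, unlike real Lisp, the empty list is not treated as an atom.

```python
def pairlis(keys, values, alist):
    """Pair keys with values and put the new pairs in front of alist."""
    return list(zip(keys, values)) + alist


def assoc(key, alist):
    """Return the first (key, value) pair whose key matches."""
    for pair in alist:
        if pair[0] == key:
            return pair
    raise KeyError(key)


def is_atom(x):
    return not isinstance(x, list)


def lisp_apply(fn, args, env):
    if is_atom(fn):
        if fn == "CAR":
            return args[0][0]
        if fn == "CDR":
            return args[0][1:]
        if fn == "CONS":
            return [args[0]] + args[1]
        if fn == "ATOM":
            return is_atom(args[0])
        if fn == "EQ":
            return args[0] == args[1]
        return lisp_apply(lisp_eval(fn, env), args, env)  # a named function
    if fn[0] == "LAMBDA":
        _, params, body = fn
        return lisp_eval(body, pairlis(params, args, env))
    raise ValueError("cannot apply: %r" % (fn,))


def lisp_eval(expr, env):
    if is_atom(expr):
        return assoc(expr, env)[1]  # variables are looked up in the alist
    if expr[0] == "QUOTE":
        return expr[1]
    if expr[0] == "COND":
        for test, result in expr[1:]:
            if lisp_eval(test, env):
                return lisp_eval(result, env)
        return []
    args = [lisp_eval(arg, env) for arg in expr[1:]]
    return lisp_apply(expr[0], args, env)


env = [("T", True)]
# ((LAMBDA (X) (CONS (CAR X) (QUOTE (B)))) (QUOTE (A C)))
expr = [["LAMBDA", ["X"], ["CONS", ["CAR", "X"], ["QUOTE", ["B"]]]],
        ["QUOTE", ["A", "C"]]]
print(lisp_eval(expr, env))  # ['A', 'B']
```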

INVITED TALK FROM GUY STEELE: It’s time for a New Old Language

  • “the most popular programming language in computer science”
  • no compiler, but lots of cross-translations
  • would say the name of the language, but doesn’t seem to have one
  • so: CSM (computer science metanotation)
  • has built-in datatypes, expressions, etc
  • it’s beautiful, but it’s getting messed up!
  • walk-throughs of examples, how to read it (drawn from recent ACM papers)
  • “isn’t it odd that language theorists wanting to talk about types, do it in an untyped language?”
  • wrote toy compiler to turn latex expressions of CSM from emacs buffer into prolog code, proved it can run (to do type checking)
  • inference rules: Gentzen Notation (notation for “natural deduction”)
  • BNF: can trace it all the way back to 4th century BCE, with Panini’s sanskrit grammar
  • regexes: took thirty years to settle on a notation (1951–81), but basically hasn’t changed since 1981!
  • final form of BNF: not set down till 1996, though based on a paper from 1977
  • but even then, variants persist and continue to be used (especially in books)
  • variants haven’t been a problem, because the common pieces are easy enough to identify and read
  • modern BNF in current papers is very similar to classic BNF, but with 2 changes to make it more concise:
    • use single letters instead of meaningful phrases
    • use bar to indicate repetition instead of … or *
  • substitution notation: started with Church, has evolved and diversified over time
  • current favorite: e[v/x] to represent “replace x with v in e”
  • number in live use has continued to increase over time, instead of variants rising and falling (!)
  • bigger problem: some sub variants are also being used to mean function/map update, which is a completely different thing
  • theory: these changes are being driven by the page limits for computer science journals (all papers must fit within 10 pages)
  • overline notation (dots and parentheses, used for grouping): can go back to 1484, when chuquet used underline to indicate grouping
    • 1702: leibniz switched from overlines to parentheses for grouping, to help typesetters publishing his books
  • three notations duking it out for 300 years!
  • vectors: notation -> goes back to 1813, and jean-robert argand (for graphing complex numbers)
  • nested overline notation leads to confusion: how do we know how to expand the expressions that are nested?
  • one solution: use an escape from the defaults, when needed, like backtick and tilde notation in clojure
  • conclusions:
    • CSM is a completely valid language
    • should be a subject of study
    • has issues, but those can be fixed
    • would like to see a formal theory of the language, along with tooling for developing in it, checking it, etc
    • thinks there are opportunities for expressing parallelism in it
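The clash between substitution notation and function/map update is easier to see side by side. A Python sketch, with expressions as nested tuples and variables as strings; the representation and function names are my own, and there are no binders here, so variable capture is not handled.

```python
def substitute(e, v, x):
    """e[v/x]: replace every occurrence of variable x in expression e with v."""
    if e == x:
        return v
    if isinstance(e, tuple):
        return tuple(substitute(part, v, x) for part in e)
    return e


def update(f, x, v):
    """Function/map update: a new map that agrees with f except that x maps to v."""
    g = dict(f)
    g[x] = v
    return g


e = ("+", "x", ("*", "x", "y"))
print(substitute(e, ("f", "z"), "x"))     # ('+', ('f', 'z'), ('*', ('f', 'z'), 'y'))
print(update({"x": 1, "y": 2}, "x", 10))  # {'x': 10, 'y': 2}
```

The first rewrites syntax, the second changes what a lookup returns; using one bracket notation for both is the confusion the talk complained about.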

DAY THREE

Declarative Deep Learning in Clojure

  • starts with explanation of human cognition and memory
  • at-your-desk memory vs in-the-hammock memory
  • limitation of neural networks: once trained for a task, it can’t be retrained to another without losing the first
    • if you train a NN to recognize cats in photos, you can’t then ask it to analyze a time series
  • ART architecture: uses two layers, F1 and F2, the first to handle data that has been seen before, the second to “learn” on data that hasn’t been encountered before
  • LSTM-cell processing:
    • what should we forget?
    • what’s new that we care about?
    • what part of our updated state should we pass on?
  • dealing with the builder pattern in java: more declarative than sending a set of ordered args to a constructor
  • his lib allows keyword args to be passed in to the builder function, don’t have to worry about ordering or anything
  • by default, all functions produce a data structure that evaluates to a d4j object
  • live demos (but using pre-trained models, no live training, just evaluation)
  • what’s left?
    • graphs
    • front end
    • kafka support
    • reinforcement learning
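The three LSTM questions above correspond to gates in the cell. Here is a one-unit cell in plain Python to show where each question lands. It is a sketch of the textbook LSTM step, not the speaker's library or DL4J; the weights are arbitrary and the names are mine.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a one-unit LSTM cell; `w` maps gate name -> (w_x, w_h, bias)."""
    def gate(name, squash):
        w_x, w_h, b = w[name]
        return squash(w_x * x + w_h * h_prev + b)

    forget = gate("forget", sigmoid)          # what should we forget?
    keep = gate("input", sigmoid)             # how much of the new candidate to keep
    candidate = gate("candidate", math.tanh)  # what's new that we care about?
    expose = gate("output", sigmoid)          # what part of the state do we pass on?

    c = forget * c_prev + keep * candidate    # updated cell state
    h = expose * math.tanh(c)                 # output passed to the next step
    return h, c


# Arbitrary, untrained weights: every gate uses the same numbers.
weights = {name: (0.5, 0.5, 0.0)
           for name in ("forget", "input", "candidate", "output")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, weights)
print(-1.0 < h < 1.0)  # True
```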

Learning Clojure Through Logo

  • disclaimer: personal views, not views of employers
  • logo: language to control a turtle, with a pen that can be up (no lines) or down (draws lines as it moves)
  • …technical difficulties, please stand by…
  • live demo of clojure/script version in the browser
  • turns out logo is a lisp (!): function call is always in first position, give it all args, etc
  • even scratch is basically lisp-like
  • irony: we’re using lisp to teach kids how to program, but then they go off to work in the world of curly braces and semicolons
  • clojure-turtle lib: open-source implementation of the logo commands in clojure
  • more live demos
  • recommends reading seymour papert’s book: “Mindstorms: Children, Computers, and Powerful Ideas”
  • think clojure (with the power of clojurescript) is the best learning language
  • have a tutorial that introduces the turtle, logo syntax, moving the turtle, etc
  • slowly introduces more and more clojure-like syntax, function definitions, etc
  • fairly powerful environment: can add own buttons for repeatable steps, can add animations, etc
  • everything’s in the browser, so no tools to download, nothing to mess with
  • “explaining too early can hurt”: want to start with as few primitives as possible, make the intro slow
  • can create your own lessons in markdown files, can just append the url to the markdown file and it’ll load (!)
  • prefer that you send in the lessons to them, so they can put them in the lessons index for everyone to benefit
  • have even translated the commands over to multiple languages, so you don’t have to learn english before learning programming (!)
  • lib: cban, which has translations of clojure core, can be used to offer translations of your lib code into something other than english
  • clojurescript repls: Klipse (replaces all clojure code in your page with an interactive repl)
  • comments/suggestions/contributions welcome
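For anyone who has never met Logo: the whole model is a position, a heading, and a pen. A tiny Python sketch of that model follows. It is not clojure-turtle's implementation, and the class and method names are my own guesses at reasonable ones.

```python
import math


class Turtle:
    """Tiny Logo-style turtle: tracks position, heading, and drawn segments."""

    def __init__(self):
        self.x, self.y, self.heading = 0.0, 0.0, 90.0  # start facing up
        self.pen_down = True
        self.lines = []

    def forward(self, distance):
        angle = math.radians(self.heading)
        nx = self.x + distance * math.cos(angle)
        ny = self.y + distance * math.sin(angle)
        if self.pen_down:  # pen up moves without drawing
            self.lines.append(((self.x, self.y), (nx, ny)))
        self.x, self.y = nx, ny

    def right(self, degrees):
        self.heading = (self.heading - degrees) % 360

    def penup(self):
        self.pen_down = False

    def pendown(self):
        self.pen_down = True


def repeat(n, *commands):
    """Run the given commands n times, like Logo's repeat."""
    for _ in range(n):
        for command in commands:
            command()


t = Turtle()
repeat(4, lambda: t.forward(50), lambda: t.right(90))  # a square
print(len(t.lines))                                # 4
print(round(t.x, 6) == 0 and round(t.y, 6) == 0)   # True
```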

Introducing elm-present

I’m in love with Elm. No, really.

I don’t know if it’s just that I’ve been away from front-end development for a few years, but working in Elm has been a breath of fresh air.

When my boss offered to let us give Tech Talks on any subject at the last company meetup, I jumped at the chance to talk about Elm.

And, of course, if I was going to give a talk about Elm, I had to make my slides in Elm, didn’t I?

So I wrote elm-present.

It’s a (very) simple presentation app. Slides are json files that have a title, some text, a background image, and that’s it. Each slide points to the one before it, and the one after, for navigation.

elm-present handles reading in the files, parsing the json, and displaying everything (in the right order).
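To give a feel for the slide format, here is the same reading-and-ordering logic sketched in Python. The field names (`title`, `text`, `background`, `prev`, `next`) and the file names are stand-ins chosen for this example, not necessarily the exact schema elm-present uses.

```python
import json

# Stand-ins for slide files on disk; the contents are made up.
files = {
    "intro.json": '{"title": "Hello", "text": "Slide one", '
                  '"background": "bg1.jpg", "prev": null, "next": "types.json"}',
    "types.json": '{"title": "Types", "text": "Slide two", '
                  '"background": "bg2.jpg", "prev": "intro.json", "next": null}',
}


def load_slide(name):
    return json.loads(files[name])


def slides_from(name):
    """Follow the `next` pointers from a starting slide until they run out."""
    while name is not None:
        slide = load_slide(name)
        yield slide
        name = slide["next"]


titles = [slide["title"] for slide in slides_from("intro.json")]
print(titles)  # ['Hello', 'Types']
```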

And the best part? You don’t need a server to run it. Just push everything up to Dropbox, open the present.html file in your browser, and voilà!

You can see the talk I gave at the meetup here, as a demo.

Notes from LambdaConf 2015

Haskell and Power Series Brought to Life

  • not interested in convergence
  • laziness lets you handle infinite series
  • head/tail great for describing series
  • operator overloading lets you redefine things to work on a power series (list of Nums) as well as Nums
  • multiplication complication: can’t multiply power series by a scalar, since they’re not the same type
  • could define negation as: negate = map negate
    • instead of recursively: negate(x:xs) = negate x : negate xs
  • once we define the product of two power series, we get integer powers for free, since it’s defined in terms of the product
  • by using haskell’s head-tail notation, we can clear a forest of subscripts from our proofs
  • reversion, or functional inversion, can be written as one line in haskell when you take this approach:
    • revert (0:fs) = rs where rs = 0 : 1/(fs#rs)
  • can define integral and derivative in terms of zipWith over a power series
  • once we have integrals and derivatives, we can solve differential equations
  • can use to express generating functions, which lets us do things like pascal’s triangle
  • can change the default ordering of type use for constants in haskell to get rationals out of the formulas instead of floats
    • default (Integer, Rational, Double)
  • all formulas can be found on web page: ???
    • somewhere on dartmouth’s site
  • why not make a data type? why overload lists?
    • would have needed to define Input and Output for the new data type
    • but: for complex numbers, algebraic extensions, would need to define your own types to keep everything straight
    • also: looks prettier this way
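The talk's definitions lean on Haskell's lazy lists. A rough Python analogue treats a power series as a memoized function from the index n to the coefficient of x^n, so only the coefficients you ask for are ever computed. This is a sketch of the same ideas, not a translation of the talk's code, and all the names are mine.

```python
from fractions import Fraction
from functools import lru_cache


def series(coefficient):
    """A power series: a memoized function from n to the coefficient of x^n."""
    return lru_cache(maxsize=None)(coefficient)


def from_list(coefficients):
    return series(lambda n: Fraction(coefficients[n])
                  if n < len(coefficients) else Fraction(0))


def negate(f):
    return series(lambda n: -f(n))


def multiply(f, g):
    # Cauchy product: coefficient n needs only the first n+1 terms of each factor.
    return series(lambda n: sum((f(k) * g(n - k) for k in range(n + 1)),
                                Fraction(0)))


def derivative(f):
    return series(lambda n: (n + 1) * f(n + 1))


def integral(f):
    return series(lambda n: Fraction(0) if n == 0 else f(n - 1) / n)


def take(n, f):
    return [f(i) for i in range(n)]


# exp solves y' = y with y(0) = 1, that is, y = 1 + integral(y).
exp_tail = integral(lambda n: exps(n))
exps = series(lambda n: Fraction(1) if n == 0 else exp_tail(n))

one_plus_x = from_list([1, 1])
print(take(4, multiply(one_plus_x, one_plus_x)) == [1, 2, 1, 0])  # True
print(take(4, exps) == [1, 1, Fraction(1, 2), Fraction(1, 6)])    # True
print(take(3, derivative(exps)) == take(3, exps))                 # True
```

Using `Fraction` throughout plays the role of the `default (Integer, Rational, Double)` trick: the coefficients come out as exact rationals rather than floats.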

How to Learn Haskell in Less than 5 Years

  • Chris Allen (bitemyapp)
  • title derives from how long it took him
    • though, he says he’s not particularly smart
  • not steady progress; kept skimming off the surface like a stone
  • is this talk a waste of time?
    • not teaching haskell
    • not teaching how to teach haskell
    • not convincing you to learn haskell
    • WILL talk about problems encountered as a learner
  • there is a happy ending: uses haskell in production very happily
  • eventually made it
    • mostly working through exercises and working on own projects
    • spent too much time bouncing between different resources
    • DOES NOT teach haskell like he learned it
  • been teaching haskell for two years now
    • was REALLY BAD at it
    • started teaching it because knew couldn’t bring work on board unless could train up own coworkers
  • irc channel: #haskell-beginners
  • the guide: github.com/bitemyapp/learnhaskell
  • current recommendations: cis194 (spring ’13) followed by NICTA course
  • don’t start with the NICTA course; it’ll drive you to depression
  • experienced haskellers often fetishize difficult materials that they didn’t use to learn haskell
  • happy and productive user of haskell without understanding category theory
    • has no problem understanding advanced talks
    • totally not necessary to learn in order to understand haskell
    • perhaps for work on the frontiers of haskell
  • his materials are optimized around keeping people from dropping out
  • steers them away from popular materials because most of them are the worst ways to learn
  • “happy to work with any of the authors i’ve criticized to help them improve their materials”
  • people need multiple examples per concept to really get it, from multiple angles, for both good and bad ways to do things
  • doesn’t think haskell is really that difficult, but coming to it from other languages means you have to throw away most of what you already know
    • best to write haskell books for non-programmers
    • if you come to haskell from js, there’s almost nothing applicable
  • i/o and monad in haskell aren’t really related, but they’re often introduced together
  • language is still evolving; lots of the materials from 90s are good but leave out a lot of new (and useful!) things
  • how to learn: can’t just read, have to work
  • writing a book with Julie (?) @argumatronic that will teach haskell to non-programmers, should work for everyone else as well; will be very, very long (longer than Real World Haskell)
  • if onboarding new employee, would pair through tutorials for 2 weeks and then cut them loose
  • quit clojure because he and 4 other clojurians couldn’t debug a 250 line ns

Production Web App in Elm

  • app: web-based doc editor with offline capabilities: DreamWriter
  • wrote original version in GIMOJ: giant imperative mess of jquery
  • knew was in trouble when he broke paste; could no longer copy/paste text in the doc
  • in the midst of going through rewrite hell, saw the simple made easy talk by rich hickey
  • “simple is an objective notion” – rich hickey
    • measure of how intermingled the parts of a system are
  • easy is subjective, by contrast: just nearer to your current skillset
  • familiarity grows over time — but complexity is forever
  • simpler code is more maintainable
  • so how do we do this?
    • stateless functions minimize interleaving
    • dependencies are clear (so long as no side effects)
    • creates chunks of simpleness throughout the program
    • easier to keep track of what’s happening in your head
  • first rewrite: functional style in an imperative language (coffeescript)
    • fewer bugs
  • then react.js and flux came out, have a lot of the same principles, was able to use that to offload a lot of his rendering code
    • react uses virtual dom that gets passed around so you no longer touch the state of the real dom
  • got him curious: how far down the rabbit-hole could he go?
    • sometimes still got bugs due to mutated state (whether accidental on his part or from some third-party lib)
  • realized: been using discipline to do functional programming, instead of relying on invariants, which would be easier
  • over 200 languages compile to js (!)
  • how to decide?
  • deal-breakers
    • slow compiled js
    • poor interop with js libs (ex: lunr.js for notes)
    • unlikely to develop a community
  • js but less painful?
    • dart, typescript, coffeescript
    • was already using coffeescript, so not compelling
  • easily talks to js
    • elm, purescript, clojurescript
    • ruled out elm almost immediately because of rendering (!)
  • cljs
    • flourishing community
    • mutation allowed
    • trivial js interop
  • purescript
    • 100% immutability + type inference
    • js interop: just add type signature
    • functions cannot have side effects* (js interop means you can lie)
  • so, decision made: rewrite in purescript!
    • but: no react or flux equivalents in purescript (sad kitten)
  • but then: a new challenger: blazing fast html in elm (blog post)
    • react + flux style but even simpler and faster (benchmarked)
  • elm js interop: ports
    • client/server relationship, they only talk with data
    • pub/sub communication system
  • so, elm, hmm…
    • 100% immutability, type inference
    • js interop preserves immutability
    • time travelling debugger!!!
    • saves user inputs, can replay back and forth, edit the code and then replay with the same inputs, see the results
  • decision: rewrite in elm!
  • intermediate step of rewriting in functional coffeescript + react and flux was actually really helpful
    • could anticipate invariants
    • then translate those invariants over to the elm world
    • made the transition to elm easier
  • open-source: rtfeldman/dreamwriter and dreamwriter-coffee on github
  • code for sidebar looks like templating language, but is actually real elm (dsl)
  • elm programs are built of signals, which are just values that change over time
  • only functions that have access to a given signal have any chance of affecting it (or messing things up)
  • so how was it?
    • SO AWESOME
    • ridiculous performance
    • since you can depend on the function always giving you the same result for the same arguments, you can CACHE ALL THE THINGS (called lazy in Elm)
    • language usability: readable error messages from the compiler (as in, paragraphs of descriptive text)
    • refactoring is THE MOST FUN THING
    • semantic versioning is guaranteed. for every package. enforced by the compiler. yes, really.
    • diff tool for comparing public api for a lib
    • no runtime exceptions EVER
  • Elm is now his favorite language
  • Elm is also the simplest (!)
  • elm-lang.org

Clojure/West 2015: Notes from Day Three

Everything Will Flow

  • Zach Tellman, Factual
  • queues: didn’t deal with directly in clojure until core.async
  • queues are everywhere: even software threads have queues for their execution, and correspond to hardware threads that have their own buffers (queues)
  • queueing theory: a lot of math, ignore most
  • performance modeling and design of computer systems: queueing theory in action
  • closed systems: when produce something, must wait for consumer to deal with it before we can produce something else
    • ex: repl, web browser
  • open systems: requests come in without regard for how fast the consumer is using them
    • adding consumers makes the open systems we build more robust
  • but: because we’re often adding producers and consumers, our systems may respond well for a good while, but then suddenly fall over (can keep up better for longer, but when gets unstable, does so rapidly)
  • lesson: unbounded queues are fundamentally broken
  • three responses to too much incoming data:
    • drop: valid if new data overrides old data, or if don’t care
    • reject: often the only choice for an application
    • pause (backpressure): often the only choice for a closed system, or sub-system (can’t be sure that dropping or rejecting would be the right choice for the system as a whole)
    • this is why core.async has the puts buffer in front of their normal channel buffer
  • in fact, queues don’t need buffer, so much as they need the puts and takes buffers; which is the default channel you get from core.async
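
The three responses to overload can be sketched against a bounded queue. This is only a minimal sketch: it uses the JVM's java.util.concurrent.ArrayBlockingQueue instead of core.async so that it runs with no extra dependencies, and the helper names (bounded-queue, try-put!, put-dropping-oldest!, put-blocking!) are made up for illustration, not taken from the talk.

```clojure
(import '(java.util.concurrent ArrayBlockingQueue))

;; A queue that can never hold more than `capacity` items.
(defn bounded-queue [capacity]
  (ArrayBlockingQueue. (int capacity)))

;; reject: returns false right away when the queue is full,
;; so the caller has to decide what to do with x.
(defn try-put! [^ArrayBlockingQueue q x]
  (.offer q x))

;; drop: discard the oldest item until the new one fits
;; (only valid when newer data overrides older data).
(defn put-dropping-oldest! [^ArrayBlockingQueue q x]
  (loop []
    (when-not (.offer q x)
      (.poll q)
      (recur))))

;; pause (backpressure): block the producer until a consumer makes room.
(defn put-blocking! [^ArrayBlockingQueue q x]
  (.put q x))
```

In core.async the rough equivalents would be a channel built on dropping-buffer or sliding-buffer for the drop case, and a blocking put (>!!) on a fixed-size channel for the pause case.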

Clojure At Scale

  • Anthony Marcar, Walmart Labs
  • redis and cassandra plus clojure
  • 20 services, 70 lein projects, 50K lines of code
  • prefer component over global state

Clojure/West 2015: Notes from Day Two

Data Science in Clojure

  • Soren Macbeth; yieldbot
  • yieldbot: similar to how adwords works, but not google and not on search results
  • 1 billion pageviews per week: lots of data
  • end up using almost all of the big data tools out there
  • EXCEPT HADOOP: no more hadoop for them
  • lots of machine learning
  • always used clojure, never had or used anything else
  • why clojure?
    • most of the large distributed processing systems run on the jvm
    • repl great for data exploration
    • no delta between prototyping and production code
  • cascalog: was great, enabled them to write hadoop code without hating it the whole time, but still grew to hate hadoop (running hadoop) over time
  • december: finally got rid of last hadoop job, now life is great
  • replaced with: storm
  • marceline: clojure dsl (open-source) on top of the trident java library
  • writing trident in clojure much better than using the java examples
  • flambo: clojure dsl on top of spark’s java api
    • renamed, expanded version of climate corp’s clj-spark

Pattern Matching in Clojure

  • Sean Johnson; path.com
  • runs remote engineering team at Path
  • history of pattern matching
    • SNOBOL: 60s and 70s, pattern matching around strings
    • Prolog: 1972; unification at its core
    • lots of functional and pattern matching work in the 70s and 80s
    • 87: Erlang -> from prolog to telecoms; functional
    • 90s: standard ml, haskell…
    • clojure?
  • prolog: unification does spooky things
    • bound match unbound
    • unbound match bound
    • unbound match unbound
  • clojurific ways: core.logic, miniKanren, Learn Prolog Now
  • erlang: one way pattern matching: bound match unbound, unbound match bound
  • what about us? macros!
  • pattern matching all around us
    • destructuring is a mini pattern matching language
    • multimethods dispatch based on pattern matching
    • case: simple pattern matching macro
  • but: we have macros, we can use them to create the language that we want
  • core.match
  • dennis’ library defun: macros all the way down: a macro that wraps the core.match macro
    • pattern matching macro for defining functions just like erlang
    • (defun say-hi
      (["Dennis"] "Hi Dennis!")
      ([:catty] "Morning, Catty!"))
    • can also use the :guard syntax from core.match in defining your functions’ pattern matching
    • not in clojurescript yet…
  • but: how well does this work in practice?
    • falkland CMS, SEACAT -> incidental use
    • POSThere.io -> deliberate use (the sweet spot)
    • clj-json-ld, filter-map -> maximal use
  • does it hurt? ever?
  • limitations
    • guards only accept one argument, workaround with tuples
  • best practices
    • use to eliminate conditionals at the top of a function
    • use to eliminate nested conditionals
    • handle multiple function inputs (think map that might have different keys in it?)
    • recursive function pattern: one def for the start, one def for the work, one def for the finish
      • used all over erlang
      • not as explicit in idiomatic clojure
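
The "pattern matching all around us" point and the start/work/finish recursion pattern can be sketched in plain Clojure, without core.match or defun. All names here (full-name, area, greet, sum-list) are hypothetical examples, not code from the talk.

```clojure
;; Destructuring: pull pieces out of a map shape right in the argument list.
(defn full-name [{:keys [given family]}]
  (str given " " family))

;; Multimethod: dispatch on a value computed from the argument.
(defmulti area :shape)
(defmethod area :circle [{:keys [r]}] (* Math/PI r r))
(defmethod area :rect   [{:keys [w h]}] (* w h))

;; case: match against compile-time constants, with a default at the end.
(defn greet [who]
  (case who
    :dennis "Hi Dennis!"
    :catty  "Morning, Catty!"
    "Hello, stranger"))

;; Recursive function pattern: one clause to start, one branch to finish,
;; one branch to do the work.
(defn sum-list
  ([xs] (sum-list xs 0))             ; start: seed the accumulator
  ([[x & more :as xs] acc]
   (if (empty? xs)
     acc                             ; finish: nothing left, return the total
     (recur more (+ acc x)))))       ; work: consume one element

(sum-list [1 2 3]) ;=> 6
```

In Erlang or with defun the start, work, and finish cases would each be their own clause; idiomatic Clojure tends to fold the last two into a conditional, which is the "not as explicit" point above.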

Clojure/West 2015: Notes from Day One

Life of a Clojure Expression

  • John Hume, duelinmarkers.com (DRW trading)
  • a quick tour of clojure internals
  • giving the talk in org mode (!)
  • disclaimers: no expert, internals can change, excerpts have been mangled for readability
  • most code will be java, not clojure
  • (defn m [v] {:foo "bar" :baz v})
  • minor differences: calculated key, constant values, more than 8 key/value pairs
  • MapReader called from static array of IFns used to track macros; triggered by '{' character
  • PersistentArrayMap used for maps with 8 or fewer key/value pairs
  • eval treats forms wrapped in (do..) as a special case
  • if form is non-def bit of code, eval will wrap it in a 0-arity function and invoke it
  • eval’s macroexpand will turn our form into (def m (fn [v] {:foo "bar" :baz v}))
  • checks for duplicate keys twice: once on read, once on analyze, since forms for keys might have been evaluated into duplicates
  • java class emitted at the end with name of our fn tacked on, like: class a_map$m
  • IntelliJ will report a lot of unused methods in the java compiler code, but what’s happening is the methods are getting invoked, but at load time via some asm method strings
  • no supported api for creating small maps with compile-time constant keys; array-map is slow and does a lot of work it doesn’t need to do
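
A couple of these internals can be poked at from the REPL. This is a small sketch; the var names small and large are made up, and the exact expansion printed by macroexpand-1 may differ between Clojure versions.

```clojure
;; A small literal map is backed by a PersistentArrayMap...
(def small {:foo "bar" :baz 1})

;; ...while a literal with nine entries crosses the threshold
;; and is backed by a PersistentHashMap instead.
(def large {:a 1 :b 2 :c 3 :d 4 :e 5 :f 6 :g 7 :h 8 :i 9})

(class small) ;=> clojure.lang.PersistentArrayMap
(class large) ;=> clojure.lang.PersistentHashMap

;; defn is a macro: one step of expansion shows the def wrapping an fn.
(macroexpand-1 '(defn m [v] {:foo "bar" :baz v}))
```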

Clojure Parallelism: Beyond Futures

  • Leon Barrett, the climate corporation
  • climate corp: model weather and plants, give advice to farmers
  • wrote Claypoole, a parallelism library
  • map/reduce to compute average: might use future to shove computation of the average divisor (inverse of # of items) off at the beginning, then do the map work, then deref the future at the end
  • future -> future-call: sends fn-wrapped body to an Agent/soloExecutor
  • concurrency vs parallelism: concurrency means things could be re-ordered arbitrarily, parallelism means multiple things happen at once
  • thread pool: recycle a set number of threads to avoid constantly incurring the overhead of creating a new thread
  • agent thread pool: used for agents and futures; program will not exit while threads are there; lifetime of 60 sec
  • future limitations
    • tasks too small for the overhead
    • exceptions get wrapped in ExecutionException, so your try/catches won’t work normally anymore
  • pmap: just a parallel map; lazy; runs N-cpu + 3 tasks in futures
    • generates threads as needed; could have problems if you’re creating multiple pmaps at once
    • slow task can stall it, since it waits for the first task in the sequence to complete for each trip through
    • also wraps exceptions just like future
  • laziness and parallelism: don’t mix
  • core.async
    • channels and coroutines
    • reads like go
    • fixed-size thread pool
    • handy when you’ve got a lot of callbacks in your code
    • mostly for concurrency, not parallelism
    • can use pipeline for some parallelism; it’s like a pmap across a channel
    • exceptions can kill coroutines
  • claypoole
    • pmap that uses a fixed-size thread pool
    • with-shutdown! will clean up thread pool when done
    • eager by default
    • output is an eagerly streaming sequence
    • also get pfor (parallel for)
    • lazy versions are available; can be better for chaining (fast pmap into slow pmap would have speed mismatch with eagerness)
    • exceptions are re-thrown properly
    • no chunking worries
    • can have priorities on your tasks
  • reducers
    • uses fork/join pool
    • good for cpu-bound tasks
    • gives you a parallel reduce
  • tesser
    • distributable on hadoop
    • designed to max out cpu
    • gives parallel reduce as well (fold)
  • tools for working with parallelism:
    • promises to block the state of the world and check things
    • YourKit for jvm profiling
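
The future and pmap points above can be tried with nothing but clojure.core. A minimal sketch; the function names (average, failure-kind, squares) are made up for illustration.

```clojure
(import '(java.util.concurrent ExecutionException))

;; Push the divisor computation onto another thread, do the main
;; work on this one, then deref the future at the end.
(defn average [xs]
  (let [inv-n (future (/ 1.0 (count xs)))
        total (reduce + xs)]
    (* total @inv-n)))

;; An exception thrown inside a future arrives wrapped in an
;; ExecutionException, so a catch for the original type never fires.
(defn failure-kind []
  (try
    @(future (throw (IllegalStateException. "boom")))
    (catch IllegalStateException _ :caught-directly)
    (catch ExecutionException e
      [:wrapped (class (.getCause e))])))

;; pmap is lazy, so force it with doall if the work should happen now.
(defn squares [xs]
  (doall (pmap #(* % %) xs)))

(average [1 2 3 4]) ;=> 2.5
(failure-kind)      ;=> [:wrapped java.lang.IllegalStateException]
```

Because futures run on the agent thread pool, a script that uses them may hang around after its last line unless it calls (shutdown-agents).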

Boot Can Build It

  • Alan Dipert and Micha Niskin, adzerk
  • why a new build tool?
    • build tooling hasn’t kept up with the complexity of deploys
    • especially for web applications
    • builds are processes, not specifications
    • most tools: maven, ant, oriented around configuration instead of programming
  • boot
    • many independent parts that do one thing well
    • composition left to the user
    • maven for dependency resolution
    • builds clojure and clojurescript
    • sample boot project has main method (they used java project for demo)
    • uses '--' for piping tasks together (instead of the real |)
    • filesets are generated and passed to a task, then output of task is gathered up and sent to the next task in the chain (like ring middleware)
  • boot has a repl
    • can do most boot tasks from the repl as well
    • can define new build tasks via deftask macro
    • (deftask build …)
    • (boot (watch) (build))
  • make build script: (build.boot)
    • #!/usr/bin/env boot
    • write in the clojure code defining and using your boot tasks
    • if it’s in build.boot, boot will find it on command line for help and automatically write the main fn for you
  • FileSet: immutable snapshot of the current files; passed to task, new one created and returned by that task to be given to the next one; task must call commit! to commit changes to it (a la git)
  • dealing with dependency hell (conflicting dependencies)
    • pods
    • isolated runtimes, with own dependencies
    • some things can’t be passed between pods (such as the things clojure runtime creates for itself when it starts up)
    • example: define pod with env that uses clojure 1.5.1 as a dependency, can then run code inside that pod and it’ll only see clojure 1.5.1
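
The "like ring middleware" description of task chaining can be modeled in plain Clojure. This is a toy model, not boot's actual API: the fileset here is just an immutable map of path to contents, and add-file-task, upcase-task, pipeline, and run are hypothetical names.

```clojure
(require '[clojure.string :as str])

;; A task is a function that takes the next handler and returns a
;; handler over a fileset (modeled here as a map of path -> contents).
(defn add-file-task [path content]
  (fn [next-task]
    (fn [fileset]
      ;; assoc returns a new map: the incoming fileset is never mutated
      (next-task (assoc fileset path content)))))

(defn upcase-task []
  (fn [next-task]
    (fn [fileset]
      (next-task (into {} (for [[path content] fileset]
                            [path (str/upper-case content)]))))))

;; Tasks compose with comp; the fileset flows through them left to right.
(def pipeline
  (comp (add-file-task "a.txt" "hello")
        (upcase-task)))

;; Terminate the chain with identity and run it on an empty fileset.
(def run (pipeline identity))

(run {}) ;=> {"a.txt" "HELLO"}
```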

One Binder to Rule Them All: Introduction to Trapperkeeper

  • Ruth Linehan and Nathaniel Smith; puppetlabs
  • back-end service engineers at puppetlabs
  • service framework for long-running applications
  • basis for all back-end services at puppetlabs
  • service framework:
    • code generalization
    • component reuse
    • state management
    • lifecycle
    • dependencies
  • why trapperkeeper?
    • influenced by clojure reloaded pattern
    • similar to component and jig
    • puppetlabs ships on-prem software
    • need something for users to configure, may not have any clojure experience
    • needs to be lightweight: don’t want to ship jboss everywhere
  • features
    • turn on and off services via config
    • multiple web apps on a single web server
    • unified logging and config
    • simple config
  • existing services that can be used
    • config service: for parsing config files
    • web server service: easily add ring handler
    • nrepl service: for debugging
    • rpc server service: nathaniel wrote
  • demo app: github -> trapperkeeper-demo
  • anatomy of service
    • protocol: specifies the api contract that that service will have
    • can have any number of implementations of the contract
    • can choose between implementations at runtime
  • defservice: like defining a protocol implementation, one big series of defs of fns: (init [this context] (let …)))
    • handle dependencies in defservice by vector after service name: [[:ConfigService get-in-config] [:MeowService meow]]
    • lifecycle of the service: what happens when initialized, started, stopped
    • don’t have to implement every part of the lifecycle
  • config for the service: pulled from file
    • supports .json, .edn, .conf, .ini, .properties, .yaml
    • can specify single file or an entire directory on startup
    • they prefer .conf (HOCON)
    • have to use the config service to get the config values
    • bootstrap.cfg: the config file that controls which services get picked up and loaded into app
    • order is irrelevant: will be decided based on parsing of the dependencies
  • context: way for service to store and access state locally not globally
  • testing
    • should write code as plain clojure
    • pass in context/config as plain maps
    • trapperkeeper provides helper utilities for starting and stopping services via code
    • with-app-with-config macro: offers symbol to bind the app to, plus define config as a map, code will be executed with that app binding and that config
  • there’s a lein template for trapperkeeper that stubs out working application with web server + test suite + repl
  • repl utils:
    • start, stop, inspect TK apps from the repl: (go); (stop)
    • don’t need to restart whole jvm to see changes: (reset)
    • can print out the context: (:MeowService (context))
  • trapperkeeper-rpc
    • macro for generating RPC versions of existing trapperkeeper protocols
    • supports https
    • defremoteservice
    • with web server on one jvm and core logic on a different one, can scale them independently; can keep web server up even while swapping out or starting/stopping the core logic server
    • future: rpc over ssl websockets (using message-pack in transit for data transmission); metrics, function retrying; load balancing
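
The "protocol as contract, pick an implementation at runtime" idea can be sketched without trapperkeeper itself. This is plain Clojure, not the defservice API; the record names, the implementations map, the start-service function, and the [:meow :style] config path are all assumptions for illustration.

```clojure
;; The protocol is the api contract for the service.
(defprotocol MeowService
  (meow [this]))

;; Any number of implementations can satisfy the same contract.
(defrecord LoudMeow []
  MeowService
  (meow [_] "MEOW!"))

(defrecord QuietMeow []
  MeowService
  (meow [_] "meow"))

;; Constructors keyed by the value expected in config.
(def implementations
  {:loud  ->LoudMeow
   :quiet ->QuietMeow})

;; Config is a plain map, which is also what makes this easy to test.
(defn start-service [config]
  (let [style       (get-in config [:meow :style])
        constructor (get implementations style ->QuietMeow)]
    (constructor)))

(meow (start-service {:meow {:style :loud}})) ;=> "MEOW!"
```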

Domain-Specific Type Systems

  • Nathan Sorenson, sparkfund
  • you can type-check your dsls
  • libraries are often examples of dsls: not necessarily macros involved, but have opinionated way of working within a domain
  • many examples pulled from “How to Design Programs”
  • domain represented as data, interpreted as information
  • type structure: syntactic means of enforcing abstraction
  • abstraction is a map to help a user navigate a domain
    • audience is important: would give different map to pedestrian than to bus driver
  • can also think of abstraction as specification, as dictating what should be built or how many things should be built to be similar
  • showing inception to programmers is like showing jaws to a shark
  • fable: parent trap over complex analysis
  • moral: types are not data structures
  • static vs dynamic specs
    • static: types; things as they are at compile time; definitions and derivations
    • dynamic: things as they are at runtime; unit tests and integration tests; expressed as falsifiable conjectures
  • types not always about enforcing correctness, so much as describing abstractions
  • simon peyton jones: types are the UML of functional programming
  • valuable habit: think of the types involved when designing functions
  • spec-tacular: more structure for datomic schemas
    • from sparkfund
    • the type system they wanted for datomic
    • open source but not quite ready for public consumption just yet
    • datomic too flexible: attributes can be attached to any entity, relationships can happen between any two entities, no constraints
    • use specs to articulate the constraints
    • (defspec Lease [lessee :is-a Corp] [clauses :is-many String] [status :is-a Status])
    • (defenum Status …)
    • wrote query language that’s aware of the defined types
    • uses bi-directional type checking: github.com/takeoutweight/bidirectional
    • can write sensical error messages: Lease has no field ‘lesee’
    • can pull type info from their type checker and feed it into core.typed and let core.typed check use of that data in other code (enforce types)
    • does handle recursive types
    • no polymorphism
  • resources
    • practical foundations for programming languages: robert harper
    • types and programming languages: benjamin c pierce
    • study haskell or ocaml; they’ve had a lot of time to work through the problems of types and type theory
  • they’re using spec-tacular in production now, even using it to generate type docs that are useful for non-technical folks to refer to and discuss; but don’t feel the code is at the point where other teams could pull it in and use it easily
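
The idea of writing constraints down as data and getting a sensible error for a misspelled field can be sketched in a few lines. This is not spec-tacular's API, and it checks at runtime rather than statically as their checker does; lease-spec and check-entity are hypothetical names.

```clojure
;; Constraints for one entity type: field name -> predicate on its value.
(def lease-spec
  {:spec-name "Lease"
   :fields    {:lessee  string?
               :clauses #(and (vector? %) (every? string? %))
               :status  #{:active :expired}}})

;; Returns a vector of human-readable problems (empty when the entity is ok).
(defn check-entity [{:keys [spec-name fields]} entity]
  (vec
   (for [[field value] entity
         :let  [valid? (get fields field)]
         :when (or (nil? valid?) (not (valid? value)))]
     (if (nil? valid?)
       (str spec-name " has no field '" (name field) "'")
       (str spec-name " field '" (name field) "' has an invalid value")))))

(check-entity lease-spec {:lesee "Acme"})
;=> ["Lease has no field 'lesee'"]
```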

ClojureScript Update

  • David Nolen
  • ambly: cljs compiled for iOS
  • uses bonjour and webdav to target ios devices
  • creator already has app in app store that was written entirely in clojurescript
  • can connect to device and use repl to write directly on it (!)

Clojure Update

  • Alex Miller
  • clojure 1.7 is at 1.7.0-beta1 -> final release approaching
  • transducers coming
  • define a transducer as a set of operations on a sequence/stream
    • (def xf (comp (filter odd?) (map inc) (take 5)))
  • then apply transducer to different streams
    • (into [] xf (range 1000))
    • (transduce xf + 0 (range 1000))
    • (sequence xf (range 1000))
  • reader conditionals
    • portable code across clj platforms
    • new extension: .cljc
    • use to select out different expressions based on platform (clj vs cljs)
    • #?(:clj (java.util.Date.)
      :cljs (js/Date.))
    • can fall through the conditionals and emit nothing (not nil, but literally don’t emit anything to be read by the reader)
  • performance has also been a big focus
    • reduced class lookups for faster compile times
    • iterator-seq is now chunked
    • multimethod default value dispatch is now cached
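
The transducer snippets above, put together as something that can be pasted into a Clojure 1.7+ REPL. This assumes the filter step is meant to be (filter odd?).

```clojure
;; One transformation, defined once, independent of where the data comes from.
;; Note that comp applies transducer steps left to right: filter, then map, then take.
(def xf
  (comp (filter odd?)
        (map inc)
        (take 5)))

;; Pour the transformed items into a vector.
(into [] xf (range 1000))          ;=> [2 4 6 8 10]

;; Reduce the transformed items with +, starting from 0.
(transduce xf + 0 (range 1000))    ;=> 30

;; Get a lazy sequence of the transformed items.
(sequence xf (range 1000))         ;=> (2 4 6 8 10)
```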