Ron Toland
  • Notes from Clojure/Conj 2017

    It was a good Conj. Saw several friends and former co-workers of mine, heard some great talks about machine learning, and got some good ideas to take back to my current gig.

    There were some dud talks, too, and others that promised one (awesome) thing and delivered another (boring) thing, but overall it’s inspiring to see how far Clojure has come in just ten years.

    My notes from the conference:

    DAY ONE

    KEYNOTE FROM Rich Hickey: Effective Programs, or: 10 years of Clojure

    • clojure released 10 years ago
    • never thought more than 100 people would use it
    • clojure is opinionated
      • few idioms, strongly supported
      • born out of the pain of experience
    • had been programming for 18 years when wrote clojure, mostly in c++, java, and common lisp
    • almost every project used a database
    • was struck by two language designers that talked disparagingly of databases, said they’d never used them
    • situated programs
      • run for long time
      • data-driven
      • have memory that usually resides in db
      • have to handle the weirdness of reality (ex: “two-fer tuesday” for radio station scheduling)
      • interact with other systems and humans
      • leverage code written by others (libraries)
    • effective: producing the intended result
      • prefers it over the word “correctness”; none of his programs ever cared about correctness
    • but: what is computing about?
      • making computers effective in the world
      • computers are effective in the same way people are:
        • generate predictions from experience
        • enable good decisions
      • experience => information => facts
    • programming is NOT about itself, or just algorithms
    • programs are dominated by information processing
    • but that’s not all: when we start talking to the database or other libraries, we need different protocols to talk to them
    • but there’s more! everything continues to mutate over time (db changes, requirements change, libraries change, etc)
    • we aspire to write general purpose languages, but will get very different results depending on your target (phone switches, device drivers, etc)
    • clojure written for information-driven situated programs
    • clojure design objectives
      • create programs out of simpler stuff
      • want a low cognitive load for the language
      • a lisp you can use instead of java/c# (his common lisp programs were never allowed to run in production)
    • says classes and algebraic types are terrible for the information programming problem, claims there are no first-class names, and nothing is composable
    • in contrast to java’s multitude of classes and haskell’s multitude of types, clojure says “just use maps”
    • says pattern matching doesn’t scale, flowing type information between programs is a major source of coupling and brittleness
    • positional semantics (arg-lists) don’t scale, eventually you get a function with 17 args, and no one wants to use it
    • sees passing maps as args as a way to attach names to things, thinks it’s superior to positional args or typed args
    • “types are an anti-pattern for program maintenance”
    • using maps means you can deal with them on a need-to-know basis
    • things he left out deliberately:
      • parochialism: data types
      • “rdf got it right”, allows merging data from different sources, without regard for how the schemas differ
      • “more elaborate your type system, more parochial the types”
      • in clojure, namespace-qualified keys allow data merging without worrying about colliding schemas (should use the reverse-domain scheme, same as java, but not all clojure libraries do)
      • another point: when data goes out over the wire, it’s simple: strings, vectors, maps. clojure aims to have you program the same inside as outside
    • smalltalk and common lisp: both languages that were designed by people for working programmers, and it shows
      • surprisingly, the jvm has a similar sensibility (though java itself doesn’t)
    • also wanted to nail concurrency
      • functional gets you 90% of the way there
    • pulled out the good parts of lisp
    • fixed the bad parts: not everything is a list, packages are bad, cons cell is mutable, lists were kind of functional, but not really
    • edn data model: values only, the heart of clojure, compatible with other languages, too
    • static types: basically disagrees with everything from the Joy of Types talk
    • spec: clojure is dynamic for good reasons, it’s not arbitrary, but if you want checking, it should be your choice, both to use it at all and where to use it
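
The "just use maps" and namespace-qualified key points above can be sketched in a few lines. This is only an illustration, not code from the talk: the function names, the radio-station fields, and the `com.example.*` namespaces are all hypothetical.

```clojure
;; Positional arguments: the caller has to remember the order.
(defn schedule-positional [station day slot artist]
  (str station " plays " artist " on " day " at " slot))

;; Map argument: every value carries its name, and the function only
;; destructures the keys it needs (extra keys are simply ignored).
(defn schedule-by-map [{:keys [station artist day slot]}]
  (str station " plays " artist " on " day " at " slot))

;; Namespace-qualified keys (hypothetical reverse-domain namespaces)
;; let two sources describe the same entity without key collisions.
(def from-billing  {:com.example.billing/id     42
                    :com.example.billing/status :paid})
(def from-shipping {:com.example.shipping/id     "PKG-7"
                    :com.example.shipping/status :in-transit})

;; Both :id and :status survive the merge because the namespaces differ.
(def merged (merge from-billing from-shipping))
```

Calling `(schedule-by-map {:station "KXYZ" :artist "Queen" :day "tuesday" :slot "9am" :extra true})` gives the same string as the positional version, without the caller needing to know the argument order.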

    Learning Clojure and Clojurescript by Playing a Game

    • inspired by the gin rummy card game example in dr scheme for the scheme programming language
    • found the java.awt.Robot class, which can take screenshots and move the mouse, click things
    • decided to combine the two, build a robot that could play gin rummy
    • robot takes a screenshot, finds the cards, their edges, and which ones they are, then plays the game
    • lessons learned:
      • java interop was great
    • when clojurescript came out, decided to rebuild it, but in the browser
    • robot still functions independently, but now takes screenshot of the browser-based game
    • built a third version with datomic as a db to store state, allowing two clients to play against each other
    • absolutely loves the “time travel” aspects of datomic
    • also loves pedestal

    Bayesian Data Analysis in Clojure

    • using clojure for about two years
    • developed toolkit for doing bayesian statistics in clojure
    • why clojure?
      • not as many existing tools as julia or R
      • but: easier to develop new libraries than in julia (and most certainly R)
      • other stats languages like matlab and R don’t require much programming knowledge to get started, but harder to dev new tools in them
    • michaellindon/distributions
      • open-source clojure lib for generating and working with probability distributions in clojure
      • can also provide data and prior to get posterior distribution
      • and do posterior-predictive distributions
    • wrote a way to generate random functions over a set of points (for trying to match noisy non-linear data)
    • was easy in clojure, because of lazy evaluation (can treat the function as defined over an infinite vector, and only pull out the values we need, without blowing up)
    • …insert lots of math that i couldn’t follow…
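
The lazy-evaluation point above (treat a function as defined over an infinite vector, pull out only the values you need) can be sketched with plain lazy sequences. This does not use the michaellindon/distributions library; `noisy-sine` is a hypothetical stand-in for a sampled random function.

```clojure
(defn noisy-sine
  "Hypothetical stand-in for a sampled random function:
  sin(x) plus uniform noise in [-0.05, 0.05)."
  [x]
  (+ (Math/sin (double x)) (* 0.1 (- (rand) 0.5))))

;; An infinite grid of x values: 0.0, 0.1, 0.2, ...
(def xs (map #(/ % 10.0) (range)))

;; Conceptually infinite, but nothing is computed yet.
(def ys (map noisy-sine xs))

;; Only the first five values are ever realized.
(def first-five (doall (take 5 ys)))
```

Because `ys` is lazy, asking for more points later (say `(take 100 ys)`) extends the same sequence instead of recomputing the values already seen.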

    Building Machine Learning Models with Clojure and Cortex

    • came from a python background for machine learning
    • thinks there’s a good intersection between functional programming and machine learning
    • will show how to build a baby classification model in clojure
    • expert systems: dominant theory for AI through 2010s
      • limitations: sometimes we don’t know the rules, and sometimes we don’t know how to teach a computer the rules (even if we can articulate them)
    • can think of the goal of machine learning as being to learn the function F that, when applied to a set of inputs, will produce the correct outputs (labels) for the data
    • power of neural nets: assume that they can make accurate approximations for a function of any dimensionality (using simpler pieces)
    • goal of neural nets is to learn the right coefficients or weights for each of the factors that affect the final label
    • deep learning: a “fat” neural net…basically a series of layers of perceptrons between the input and output
    • why clojure? we already have a lot of good tools in other languages for doing machine learning: tensorflow, caffe, theano, torch, deeplearning4j
    • functional composition: maps well to neural nets and perceptrons
    • also maps well to basically any ML pipeline: data loading, feature extraction, data shuffling, sampling, recursive feedback loop for building the model, etc
    • clojure really strong for data processing, which is a large part of each step of the ML pipeline
      • ex: lazy sequences really help when processing batches of data multiple times
      • can also do everything we need with just the built-in data structures
    • cortex: meant to be the theano of clojure
      • basically: import everything from it, let it do the heavy lifting
    • backend: compute lib executes on both cpu and gpu
    • implements as much of neural nets as possible in pure clojure
    • meant to be highly transparent and highly customizable
    • cortex represents neural nets as DAG, just like tensorflow
      • nodes, edges, buffers, streams
    • basically, a map of maps
      • can go in at any time and see exactly what all the parameters are, for everything
    • basic steps to building model:
      • load and process data (most time consuming step until you get to tuning the model)
      • define minimal description of the model
      • build full network from that description and train it on the data
    • for example: chose a credit card fraud dataset
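
To make the "functional composition maps well to neural nets" point concrete, here is a minimal sketch of a forward pass built from plain functions. It does not use Cortex's API; the `neuron`, `layer`, and `network` names are hypothetical, and the weights are hand-picked rather than trained.

```clojure
(defn dot [ws xs] (reduce + (map * ws xs)))

(defn sigmoid [z] (/ 1.0 (+ 1.0 (Math/exp (- (double z))))))

(defn neuron
  "Returns a function computing sigmoid(weights . inputs + bias)."
  [weights bias]
  (fn [inputs] (sigmoid (+ (dot weights inputs) bias))))

(defn layer
  "A layer is a function that applies every neuron to the same inputs."
  [neurons]
  (fn [inputs] (mapv #(% inputs) neurons)))

;; Hypothetical hand-picked weights, not trained values.
(def hidden (layer [(neuron [0.5 -0.5] 0.0) (neuron [1.0 1.0] -1.0)]))
(def output (layer [(neuron [1.0 -1.0] 0.0)]))

;; The whole network is just function composition: input -> hidden -> output.
(def network (comp output hidden))
```

`(network [1.0 0.0])` returns a one-element vector holding a score between 0 and 1. Training would mean adjusting the weights, which this sketch leaves out.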

    Clojure: Scaling the Event Stream

    • director, programmer of his own company
    • recommends ccm-clj for cassandra-clojure interaction
    • expertise: high-availability streaming systems (for smallish clients)
    • systems he builds deal with “inconvenient” sized data for non-apple-sized clients
    • has own company: troy west, founded three years ago
    • one client: processes billions of emails, logs 10–100 events per email, multiple systems log in different formats, 5K–50K event/s
    • 10–100 TB of data
    • originally, everything logged on disk for analysis after the fact
    • requirements: convert events into meaning, support ad-hoc querying, generate reports, do real-time analysis and alerting, and do it all without falling over at scale or losing uptime
    • early observations:
      • each stream is a seq of immutable facts
      • want to index the stream
      • want to keep the original events
      • want idempotent writes
      • just transforming data
    • originally reached for java, since that’s the language he’s used to using
    • data
      • in-flight: kafka
      • compute over the data: storm (very powerful, might move in the direction of onyx later on)
      • at-rest: cassandra (drives more business to his company than anything else)
    • kafka: partitioning really powerful tool for converting one large problem into many smaller problems
    • storm: makes it easy to spin up more workers to process individual pieces of your computation
    • cassandra: their source of truth
    • query planner, query optimizer: services written in clojure, instead of throwing elasticsearch at the problem
    • recommends: Designing Data-Intensive Applications, by Martin Kleppmann
    • thinks these applications are clojure’s killer app
    • core.async gave them fine-grained control of parallelism
    • recommends using pipeline-async as add-on tool
    • composable channels are really powerful: they let you set off several parallel ops at once, then have another process combine their results as they return and produce another channel
    • but: go easy on the hot sauce, gets very tempting to put it everywhere
    • instaparse lib critical to handling verification of email addresses
    • REPL DEMO
    • some numbers: 0 times static types would have saved the day, 27 repos, 2 team members
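
The early observations above (streams of immutable facts, indexing, idempotent writes) can be illustrated with an in-memory map standing in for the real store. This is a sketch under stated assumptions only: the `:event/id` and `:email/id` keys and both function names are hypothetical, and nothing here talks to Kafka, Storm, or Cassandra.

```clojure
(defn record-event
  "Idempotent write: events are keyed by :event/id, so writing the
  same event twice leaves the store unchanged."
  [store event]
  (assoc store (:event/id event) event))

(defn index-by-email
  "Builds an index from email id to the events logged for that email."
  [store]
  (group-by :email/id (vals store)))

(def events
  [{:event/id "e1" :email/id "m1" :event/type :received}
   {:event/id "e2" :email/id "m1" :event/type :delivered}
   {:event/id "e1" :email/id "m1" :event/type :received}]) ; replayed duplicate

(def store (reduce record-event {} events))
```

Replaying the whole stream against `store` a second time yields an equal map, which is the property that makes retries and reprocessing safe.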

    DAY TWO

    The Power of Lacinia and Hystrix in Production

    • few questions:
      • anyone tried to combine lacinia and hystrix?
      • anyone played with lacinia?
      • anyone used graphql?
      • anyone used hystrix?
    • hystrix : circuit-breaker implementation
    • lacinia: walmart-labs’ graphql
    • why both?
    • simple example: ecommerce site, aldo shoes, came to his company wanting to renovate the whole website
    • likes starting his implementations by designing the model/schema
    • in this case, products have categories, and categories have parent/child categories, etc
    • uses graphviz to write up his model designs
    • initial diagram renders it all into a clojure map
    • they have a tool called umlaut that they used to write a schema in a single language, then generate via instaparse representations in graphql, or clojure schema, etc
    • lacinia resolver: takes a graphql query and returns json result
    • lacinia ships with a react application called GraphiQL, which lets you explore your db through the browser (via live queries, etc)
    • gives a lot of power to the front-end when you do this, lets them change their queries on the fly, without having to redo anything on the backend
    • problem: the images are huge, 3200x3200 px
    • need something smaller to send to users
    • add a new param to the schema: image-obj, holds width and height of the image
    • leave the old image attribute in place, so don’t break old queries
    • can then write new queries on the front-end for the new attribute, fetch only the size of image that you want
    • one thing he’s learned from marathon running (and stolen from the navy seals): “embrace the suck.” translation: the situation is bad. deal with it.
    • his suck: ran into problem where front-end engineers were sending queries that timed out against the back-end
    • root cause: front-end queries hitting backend that used 3rd-party services that took too long and broke
    • wrote a tiny latency simulator: added random extra time to round-trip against db
    • even with 100ms max, latency diagram showed ~6% of the requests (top-to-bottom) took over 500ms to finish
    • now tweak it a bit more: have two dependencies, and one of them has a severe slowdown
    • now latency could go up to MINUTES
    • initial response: reach for bumping the timeouts
    • time for hystrix: introduce a circuit breaker into the system, to protect the system as a whole when an individual piece goes down
    • hystrix has an official clojure wrapper (!)
    • provides a macro: defcommand, wrap it around functions that will call out to dependencies
    • if it detects a long timeout, in the future, it will fail immediately, rather than waiting
    • as part of the macro, can also specify a fallback-fn, to be called when the circuit breaker is tripped
    • adding that in, the latency diagram is completely different. performance stays fast under much higher load
    • fallback strategies:
      • fail fast
      • fail silently
      • send static content
      • use cached content
      • use stubbed content: infer the proper response, and send it back
      • chained fallbacks: a little more advanced, like connecting multiple circuit breakers in a row, in case one fails, the other can take over
    • hystrix dashboard: displays info on every defcommand you’ve got, tracks health, etc
    • seven takeaways
      • MUST embrace change in prod
      • MUST embrace failure: things are going to break, you might as well prepare for it
      • graphql is just part of the equation: if your resolvers get too complex, can introduce hystrix and push the complexity into another service
      • monitor at the function level (via hystrix dashboard)
      • adopt a consumer-driven mindset: the users have the money, don’t leave their money on the table by giving them a bad experience
      • force yourself to think about fallbacks
      • start thinking about the whole product: production issues LOOK to users like production features
    • question: do circuit-breakers introduce latency?
      • answer: a little bit at the upper end, once it’s been tripped
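
The circuit-breaker idea above can be sketched in plain Clojure. This is not Hystrix and not its `defcommand` macro: `circuit-breaker`, `flaky-service`, and the failure threshold are hypothetical, and the sketch omits the timeout detection and the "half-open" recovery that a real circuit breaker provides.

```clojure
(defn circuit-breaker
  "Wraps f. After max-failures consecutive exceptions the circuit
  opens and calls go straight to fallback without invoking f."
  [f fallback max-failures]
  (let [failures (atom 0)]
    (fn [& args]
      (if (>= @failures max-failures)
        (apply fallback args)                 ; open: fail fast
        (try
          (let [result (apply f args)]
            (reset! failures 0)               ; a success resets the count
            result)
          (catch Exception _
            (swap! failures inc)
            (apply fallback args)))))))

;; A dependency that always fails, counting how often it is really called.
(def calls (atom 0))
(defn flaky-service [_]
  (swap! calls inc)
  (throw (ex-info "timeout" {})))

;; "Use cached content" fallback strategy.
(def guarded (circuit-breaker flaky-service (fn [_] :cached) 2))

(def results (mapv guarded (range 5)))
```

All five calls return `:cached`, but the failing dependency is only hit twice. After that the breaker answers immediately, which is what keeps latency flat under load.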

    The Tensors Must Flow

    • works at magento, lives in philly
    • really wants to be sure our future robot masters are written in clojure, not python
    • guildsman: tensorflow library for clojure
    • tensorflow: ML lib from google, recently got a c api so other languages can call into it
    • spoiler alert: don’t get TOO excited. everything’s still a bit of a mess
    • but it DOES work, promise
    • note on architecture: the python client (from google) has access to a “cheater” api that isn’t part of the open c api. thus, there’s some things it can do that guildsman can’t because the api isn’t there
    • also: ye gods, there’s a lot of python in the python client. harder to port everything over to guildsman than he thought
    • very recently, tensorflow started shipping with a java layer built on top of a c++ lib (via jni), which itself sits on top of the tensorflow c api, some people have started building on top of that
    • but not guildsman: it sits directly on the c api
    • in guildsman: put together a plan, then build it, and execute it
    • functions like guildsman/add produce plan maps, instead of executing things themselves
    • simple example: adding two numbers: just one line in guildsman
    • another simple example: have machine learn to solve | x - 2.0 | by finding the value of x that minimizes it
    • tensorflow gives you the tools to find minima/maxima: gradient descent, etc
    • gradient gap: guildsman can use either the clojure gradients, or the c++ ones, but not both at once
      • needs help to port the c++ ones over to clojure (please!)
    • “python occupies the 9th circle of dependency hell”: been using python lightly for years, and still has problems getting dependencies resolved (took a left turn at the virtual environment, started looking for my oculus rift)
    • demo: using mnist dataset, try to learn to recognize handwritten characters
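
The |x - 2.0| example above can be worked by hand with plain gradient descent. The sketch below is pure Clojure and does not use guildsman or tensorflow; the function names, the starting point, the learning rate, and the step count are all assumptions chosen for illustration.

```clojure
(defn loss [x] (Math/abs (- (double x) 2.0)))

(defn gradient
  "Subgradient of |x - 2.0|: -1 left of the minimum, +1 right of it,
  0 exactly at it."
  [x]
  (Math/signum (- (double x) 2.0)))

(defn descend
  "Runs a fixed number of gradient-descent steps starting from x0."
  [x0 learning-rate steps]
  (reduce (fn [x _] (- x (* learning-rate (gradient x))))
          x0
          (range steps)))

;; Hypothetical settings: start at 0.0, step 0.01, 500 iterations.
(def x-min (descend 0.0 0.01 500))
```

With a fixed step size the estimate ends up within about one step of 2.0 and then hovers there. A library like tensorflow would normally pair this with a smarter optimizer.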

    The Dawn of Lisp: How to Write Eval and Apply in Clojure

    • educator, started using scheme in 1994, picked up clojure in 2009
    • origins of lisp: john mccarthy’s paper: recursive functions of symbolic expressions and their computation by machine, part i
    • implementation of the ideas of alonzo church, from his book “the calculi of lambda-conversion”
    • “you can always tell the lisp programmers, they have pockets full of punch cards with lots of closing parentheses on them”
    • steve russell (one of the creators of spacewar!) was the first to actually implement the description from mccarthy’s paper
    • 1962: lisp 1.5 programmer’s manual, included section on how to define lisp in terms of itself (section 1.6: a universal lisp function)
    • alan kay described this definition (of lisp in terms of lisp) as the maxwell equations of software
    • how eval and apply work in clojure:
      • eval: send it a quoted list (data structure, which is also lisp), eval will produce the result from evaluating that list
        • ex: (eval '(+ 2 2)) => 4
      • apply: takes a function and a quoted list, applies that function to the list, then returns the result
        • ex: (apply + '(2 2)) => 4
    • rules for converting the lisp 1.5 spec to clojure
      • convert all m-expressions to s-expressions
      • keep the definitions as close to original as possible
      • drop the use of dotted pairs
      • give all global identifiers a ‘$’ prefix (not really the way clojure says it should be used, but helps the conversion)
      • add whitespace for legibility
    • m-expressions vs s-expressions:
      • F[1;2;3] becomes (F 1 2 3)
      • [X < 0 -> -X; T -> X] becomes (COND ((< X 0) (- X)) (T X))
    • dotted pairs
      • basically (CONS (QUOTE A) (QUOTE B))
    • definitions: $T -> true, $F -> false, $NIL, $cons, $atom, $eq, $null, $car, $cdr, $caar, $cdar, $caddr, $cadar
      • note: anything that cannot be divided is an atom, no relation to clojure atoms
      • last few: various combos of car and cdr for convenience
    • elaborate definitions:
      • $cond: own version of cond to keep syntax close to the original
      • $pairlis: accepts three args: two lists, and a list of existing pairs, combines the first two lists pairwise, and combines with the existing paired list
      • $assoc: lets you pull key-value pair out of an association list (list of paired lists)
      • $evcon: takes list of paired conditions and expressions, plus a context, will return the result of the expression for the first condition that evaluates to true
      • $evlist: takes list of expressions, with a condition, and a context, and then evaluates the result of the condition + the expression in a single list
      • $apply
      • $eval
    • live code demo
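
Two of the helper definitions above are small enough to sketch. This is a guess at their shape based on the descriptions in these notes, not the speaker's code. Following the conversion rules, pairs are two-element lists rather than dotted pairs, and the names keep the `$` prefix.

```clojure
(defn $pairlis
  "Pairs each element of ks with the matching element of vs and puts
  the new pairs in front of the existing association list a."
  [ks vs a]
  (if (empty? ks)
    a
    (cons (list (first ks) (first vs))
          ($pairlis (rest ks) (rest vs) a))))

(defn $assoc
  "Returns the first (key value) pair in a whose key equals k, or nil."
  [k a]
  (cond
    (empty? a)              nil
    (= k (first (first a))) (first a)
    :else                   (recur k (rest a))))

;; Build a context binding x and y on top of an existing binding for z.
(def context ($pairlis '(x y) '(1 2) '((z 3))))
```

An evaluator in this style would call `$pairlis` to bind a function's parameters to its arguments, and `$assoc` to look up a symbol's value in the resulting context.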

    INVITED TALK FROM GUY STEELE: It’s time for a New Old Language

    • “the most popular programming language in computer science”
    • no compiler, but lots of cross-translations
    • would say the name of the language, but doesn’t seem to have one
    • so: CSM (computer science metanotation)
    • has built-in datatypes, expressions, etc
    • it’s beautiful, but it’s getting messed up!
    • walk-throughs of examples, how to read it (drawn from recent ACM papers)
    • “isn’t it odd that language theorists wanting to talk about types, do it in an untyped language?”
    • wrote toy compiler to turn latex expressions of CSM from emacs buffer into prolog code, proved it can run (to do type checking)
    • inference rules: Gentzen Notation (notation for “natural deduction”)
    • BNF: can trace it all the way back to 4th century BCE, with Panini’s sanskrit grammar
    • regexes: took thirty years to settle on a notation (51–81), but basically hasn’t changed since 1981!
    • final form of BNF: not set down till 1996, though based on a paper from 1977
    • but even then, variants persist and continue to be used (especially in books)
    • variants haven’t been a problem, because the common pieces are easy enough to identify and read
    • modern BNF in current papers is very similar to classic BNF, but with 2 changes to make it more concise:
      • use single letters instead of meaningful phrases
      • use bar to indicate repetition instead of … or *
    • substitution notation: started with Church, has evolved and diversified over time
    • current favorite: e[v/x] to represent “replace x with v in e”
    • number in live use has continued to increase over time, instead of variants rising and falling (!)
    • bigger problem: some sub variants are also being used to mean function/map update, which is a completely different thing
    • theory: these changes are being driven by the page limits for computer science journals (all papers must fit within 10 pages)
    • overline notation (dots and parentheses, used for grouping): can go back to 1484, when chuquet used underline to indicate grouping
      • 1702: leibnitz switched from overlines to parentheses for grouping, to help typesetters publishing his books
    • three notations duking it out for 300 years!
    • vectors: notation -> goes back to 1813, and jean-robert argand (for graphing complex numbers)
    • nested overline notation leads to confusion: how do we know how to expand the expressions that are nested?
    • one solution: use an escape from the defaults, when needed, like backtick and tilde notation in clojure
    • conclusions:
      • CSM is a completely valid language
      • should be a subject of study
      • has issues, but those can be fixed
      • would like to see a formal theory of the language, along with tooling for developing in it, checking it, etc
      • thinks there are opportunities for expressing parallelism in it
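
The substitution notation e[v/x] ("replace x with v in e") can be sketched on s-expressions. This is a deliberately naive version for illustration: the `substitute` function is hypothetical, and it ignores binders, so it is not the capture-avoiding substitution that papers usually mean.

```clojure
(require '[clojure.walk :as walk])

(defn substitute
  "Naive e[v/x]: replaces every occurrence of the symbol x in the
  expression e with v. Ignores binders, so it is not capture-avoiding."
  [e v x]
  (walk/postwalk-replace {x v} e))

;; (+ x (* x y))[3/x]
(def result (substitute '(+ x (* x y)) 3 'x))
;; => (+ 3 (* 3 y))
```

The argument order mirrors the notation: expression first, then the replacement value, then the variable being replaced.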

    DAY THREE

    Declarative Deep Learning in Clojure

    • starts with explanation of human cognition and memory
    • at-your-desk memory vs in-the-hammock memory
    • limitation of neural networks: once trained for a task, it can’t be retrained to another without losing the first
      • if you train a NN to recognize cats in photos, you can’t then ask it to analyze a time series
    • ART architecture: uses two layers, F1 and F2, the first to handle data that has been seen before, the second to “learn” on data that hasn’t been encountered before
    • LSTM-cell processing:
      • what should we forget?
      • what’s new that we care about?
      • what part of our updated state should we pass on?
    • dealing with the builder pattern in java: more declarative than sending a set of ordered args to a constructor
    • his lib allows keyword args to be passed in to the builder function, don’t have to worry about ordering or anything
    • by default, all functions produce a data structure that evaluates to a d4j object
    • live demos (but using pre-trained models, no live training, just evaluation)
    • what’s left?
      • graphs
      • front end
      • kafka support
      • reinforcement learning
    Learning Clojure Through Logo
    • disclaimer: personal views, not views of employers
    • logo: language to control a turtle, with a pen that can be up (no lines) or down (draws lines as it moves)
    • …technical difficulties, please stand by…
    • live demo of clojure/script version in the browser
    • turns out logo is a lisp (!): function call is always in first position, give it all args, etc
    • even scratch is basically lisp-like
    • irony: we’re using lisp to teach kids how to program, but then they go off to work in the world of curly braces and semicolons
    • clojure-turtle lib: open-source implementation of the logo commands in clojure
    • more live demos
    • recommends reading seymour papert’s book: “Mindstorms: Children, Computers, and Powerful Ideas”
    • think clojure (with the power of clojurescript) is the best learning language
    • have a tutorial that introduces the turtle, logo syntax, moving the turtle, etc
    • slowly introduces more and more clojure-like syntax, function definitions, etc
    • fairly powerful environment: can add own buttons for repeatable steps, can add animations, etc
    • everything’s in the browser, so no tools to download, nothing to mess with
    • “explaining too early can hurt”: want to start with as few primitives as possible, make the intro slow
    • can create your own lessons in markdown files, can just append the url to the markdown file and it’ll load (!)
    • prefer that you send in the lessons to them, so they can put them in the lessons index for everyone to benefit
    • have even translated the commands over to multiple languages, so you don’t have to learn english before learning programming (!)
    • lib: cban, which has translations of clojure core, can be used to offer translations of your lib code into something other than english
    • clojurescript repls: Klipse (replaces all clojure code in your page with an interactive repl)
    • comments/suggestions/contributions welcome
    → 5:00 AM, Oct 17
  • Clojure/West 2015: Notes from Day Three

    Everything Will Flow

    • Zach Tellman, Factual
    • queues: didn't deal with directly in clojure until core.async
    • queues are everywhere: even software threads have queues for their execution, and correspond to hardware threads that have their own buffers (queues)
    • queueing theory: a lot of math, ignore most
    • performance modeling and design of computer systems: queueing theory in action
    • closed systems: when produce something, must wait for consumer to deal with it before we can produce something else
      • ex: repl, web browser
    • open systems: requests come in without regard for how fast the consumer is using them
      • adding consumers makes the open systems we build more robust
    • but: because we're often adding producers and consumers, our systems may respond well for a good while, but then suddenly fall over (can keep up better for longer, but when gets unstable, does so rapidly)
    • lesson: unbounded queues are fundamentally broken
    • three responses to too much incoming data:
      • drop: valid if new data overrides old data, or if don't care
      • reject: often the only choice for an application
      • pause (backpressure): often the only choice for a closed system, or sub-system (can't be sure that dropping or rejecting would be the right choice for the system as a whole)
      • this is why core.async has the puts buffer in front of their normal channel buffer
    • in fact, queues don't need buffer, so much as they need the puts and takes buffers; which is the default channel you get from core.async
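
The three responses to too much incoming data can be tried out on a bounded queue from the JVM standard library. This sketch uses `java.util.concurrent.ArrayBlockingQueue` rather than core.async. The `offer-dropping-oldest` helper is hypothetical and is only safe with a single producer, since the poll and the second offer are not atomic.

```clojure
(import '(java.util.concurrent ArrayBlockingQueue TimeUnit))

;; Bounded: holds at most 2 items.
(def ^ArrayBlockingQueue queue (ArrayBlockingQueue. 2))

;; reject: offer returns false immediately when the queue is full.
(def accepted (mapv #(.offer queue %) [:a :b :c]))

;; pause (backpressure): a timed offer blocks the producer for up to
;; 50 ms waiting for space, then gives up.
(def timed-out? (not (.offer queue :d 50 TimeUnit/MILLISECONDS)))

;; drop: discard the oldest item to make room for the newest.
(defn offer-dropping-oldest
  "Returns the dropped item, or nil if there was room."
  [^ArrayBlockingQueue q item]
  (when-not (.offer q item)
    (let [dropped (.poll q)]
      (.offer q item)
      dropped)))
```

Here `accepted` is `[true true false]`: the third item is rejected because the queue is already full. Which policy is right depends on the system around the queue, as the notes above point out.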

    Clojure At Scale

    • Anthony Marcar, Walmart Labs
    • redis and cassandra plus clojure
    • 20 services, 70 lein projects, 50K lines of code
    • prefer component over global state
    → 7:00 AM, Apr 29
  • Clojure/West 2015: Notes from Day Two

    Data Science in Clojure

    • Soren Macbeth; yieldbot
    • yieldbot: similar to how adwords works, but not google and not on search results
    • 1 billion pageviews per week: lots of data
    • end up using almost all of the big data tools out there
    • EXCEPT HADOOP: no more hadoop for them
    • lots of machine learning
    • always used clojure, never had or used anything else
    • why clojure?
      • most of the large distributed processing systems run on the jvm
      • repl great for data exploration
      • no delta between prototyping and production code
    • cascalog: was great, enabled them to write hadoop code without hating it the whole time, but still grew to hate hadoop (running hadoop) over time
    • december: finally got rid of last hadoop job, now life is great
    • replaced with: storm
    • marceline: clojure dsl (open-source) on top of the trident java library
    • writing trident in clojure much better than using the java examples
    • flambo: clojure dsl on top of spark's java api
      • renamed, expanded version of climate corp's clj-spark

    Pattern Matching in Clojure

    • Sean Johnson; path.com
    • runs remote engineering team at Path
    • history of pattern matching
      • SNOBOL: 60s and 70s, pattern matching around strings
      • Prolog: 1972; unification at its core
      • lots of functional and pattern matching work in the 70s and 80s
      • 87: Erlang -> from prolog to telecoms; functional
      • 90s: standard ml, haskell...
      • clojure?
    • prolog: unification does spooky things
      • bound match unbound
      • unbound match bound
      • unbound match unbound
    • clojurific ways: core.logic, miniKanren, Learn Prolog Now
    • erlang: one way pattern matching: bound match unbound, unbound match bound
    • what about us? macros!
    • pattern matching all around us
      • destructuring is a mini pattern matching language
      • multimethods dispatch based on pattern matching
      • case: simple pattern matching macro
    • but: we have macros, we can use them to create the language that we want
    • core.match
    • dennis' library defun: macros all the way down: a macro that wraps the core.match macro
      • pattern matching macro for defining functions just like erlang
      • (defun say-hi (["Dennis"] "Hi Dennis!") ([:catty] "Morning, Catty!"))
      • can also use the :guard syntax from core.match in defining your functions' pattern matching
      • not in clojurescript yet...
    • but: how well does this work in practice?
      • falkland CMS, SEACAT -> incidental use
      • POSThere.io -> deliberate use (the sweet spot)
      • clj-json-ld, filter-map -> maximal use
    • does it hurt? ever?
    • limitations
      • guards only accept one argument, workaround with tuples
    • best practices
      • use to eliminate conditionals at the top of a function
      • use to eliminate nested conditionals
      • handle multiple function inputs (think map that might have different keys in it?)
      • recursive function pattern: one def for the start, one def for the work, one def for the finish
        • used all over erlang
        • not as explicit in idiomatic clojure
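
The "pattern matching all around us" bullets (destructuring, multimethods, `case`) can be shown with core Clojure alone. The function and shape names below are hypothetical, and the sketch uses neither core.match nor defun.

```clojure
;; Destructuring: match the shape of a vector and a map in one binding.
(defn describe-point [[x y & more] {:keys [label] :or {label "point"}}]
  (str label " at " x "," y (when more " (extra coordinates ignored)")))

;; Multimethod: dispatch on a value computed from the argument.
(defmulti area :shape)
(defmethod area :circle  [{:keys [radius]}]       (* Math/PI radius radius))
(defmethod area :rect    [{:keys [width height]}] (* width height))
(defmethod area :default [shape] (throw (ex-info "unknown shape" shape)))

;; case: match against compile-time constants, with a default clause.
(defn greet [who]
  (case who
    "Dennis" "Hi Dennis!"
    :catty   "Morning, Catty!"
    "Hello, stranger."))
```

For example, `(area {:shape :rect :width 2 :height 3})` dispatches to the `:rect` method, and `(greet :catty)` picks the keyword clause. Each construct covers one slice of what a full pattern-matching library does in a single form.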
    → 7:00 AM, Apr 28
  • Clojure/West 2015: Notes from Day One

    Life of a Clojure Expression

    • John Hume, duelinmarkers.com (DRW trading)
    • a quick tour of clojure internals
    • giving the talk in org mode (!)
    • disclaimers: no expert, internals can change, excerpts have been mangled for readability
    • most code will be java, not clojure
    • (defn m [v] {:foo "bar" :baz v})
    • minor differences: calculated key, constant values, more than 8 key/value pairs
    • MapReader called from static array of IFns used to track macros; triggered by '{' character
    • PersistentArrayMap used for maps with 8 or fewer key/value pairs
    • eval treats forms wrapped in (do..) as a special case
    • if form is non-def bit of code, eval will wrap it in a 0-arity function and invoke it
    • eval's macroexpand will turn our form into (def m (fn [v] {:foo "bar" :baz v}))
    • checks for duplicate keys twice: once on read, once on analyze, since forms for keys might have been evaluated into duplicates
    • java class emitted at the end with name of our fn tacked on, like: class a_map$m
    • IntelliJ will report a lot of unused methods in the java compiler code; the methods are in fact invoked, but only at load time, via method names passed to ASM as strings
    • no supported api for creating small maps with compile-time constant keys; array-map is slow and does a lot of work it doesn't need to do
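
    A few of the points above can be checked from a REPL. This is a small sketch using only clojure.core; the names small, large, expansion, and duplicate-key-error are mine, and the class names in the comments reflect the current JVM implementation, which (as the talk warned) can change.

```clojure
;; The function from the talk.
(defn m [v] {:foo "bar" :baz v})

;; Small map literals come back as array maps...
(def small (m 1))
(class small)   ; clojure.lang.PersistentArrayMap

;; ...while a literal with more than 8 key/value pairs becomes a hash map.
(def large {:a 1 :b 2 :c 3 :d 4 :e 5 :f 6 :g 7 :h 8 :i 9})
(class large)   ; clojure.lang.PersistentHashMap

;; defn is a macro over def + fn, so the expansion starts with def.
(def expansion (macroexpand '(defn m [v] {:foo "bar" :baz v})))
(first expansion)   ; def

;; The reader rejects duplicate literal keys before eval ever sees the form.
(defn duplicate-key-error []
  (try
    (read-string "{:a 1 :a 2}")
    (catch Exception e
      (.getMessage e))))
```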

    Clojure Parallelism: Beyond Futures

    • Leon Barrett, the climate corporation
    • climate corp: model weather and plants, give advice to farmers
    • wrote Claypoole, a parallelism library
    • map/reduce to compute average: might use future to shove computation of the average divisor (inverse of # of items) off at the beginning, then do the map work, then deref the future at the end
    • future -> future-call: sends fn-wrapped body to an Agent/soloExecutor
    • concurrency vs parallelism: concurrency means things could be re-ordered arbitrarily, parallelism means multiple things happen at once
    • thread pool: recycle a set number of threads to avoid constantly incurring the overhead of creating a new thread
    • agent thread pool: used for agents and futures; program will not exit while threads are there; lifetime of 60 sec
    • future limitations
      • tasks too small for the overhead
      • exceptions get wrapped in ExecutionException, so your try/catches won't work normally anymore
    • pmap: just a parallel map; lazy; runs N-cpu + 3 tasks in futures
      • generates threads as needed; could have problems if you're creating multiple pmaps at once
      • slow task can stall it, since it waits for the first task in the sequence to complete for each trip through
      • also wraps exceptions just like future
    • laziness and parallelism: don't mix
    • core.async
      • channels and coroutines
      • reads like go
      • fixed-size thread pool
      • handy when you've got a lot of callbacks in your code
      • mostly for concurrency, not parallelism
      • can use pipeline for some parallelism; it's like a pmap across a channel
      • exceptions can kill coroutines
    • claypoole
      • pmap that uses a fixed-size thread pool
      • with-shutdown! will clean up thread pool when done
      • eager by default
      • output is an eagerly streaming sequence
      • also get pfor (parallel for)
      • lazy versions are available; can be better for chaining (fast pmap into slow pmap would have speed mismatch with eagerness)
      • exceptions are re-thrown properly
      • no chunking worries
      • can have priorities on your tasks
    • reducers
      • uses fork/join pool
      • good for cpu-bound tasks
      • gives you a parallel reduce
    • tesser
      • distributable on hadoop
      • designed to max out cpu
      • gives parallel reduce as well (fold)
    • tools for working with parallelism:
      • promises to block the state of the world and check things
      • YourKit for jvm profiling
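
    A short sketch of the clojure.core pieces discussed above (future, deref, pmap), without Claypoole or any other dependency. The function names average, failure-cause, and slow-inc are illustrative only.

```clojure
;; The averaging example: start computing the divisor in a future,
;; do the main work, then deref the future at the end.
(defn average [xs]
  (let [divisor (future (/ 1 (count xs)))   ; runs on the agent thread pool
        total   (reduce + xs)]
    (* total @divisor)))                    ; @ blocks until the future is done

;; Exceptions thrown inside a future surface as ExecutionException on deref,
;; so a catch for the original type will not fire; unwrap with getCause.
(defn failure-cause []
  (try
    @(future (throw (IllegalStateException. "boom")))
    (catch java.util.concurrent.ExecutionException e
      (class (.getCause e)))))

;; pmap is a lazy, parallel map; doall forces the whole result.
(defn slow-inc [x]
  (Thread/sleep 10)
  (inc x))

(defn parallel-incs [xs]
  (doall (pmap slow-inc xs)))
```

    In a standalone script, call (shutdown-agents) when finished; otherwise the idle agent pool threads keep the JVM alive for about a minute, as noted above.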

    Boot Can Build It

    • Alan Dipert and Micha Niskin, adzerk
    • why a new build tool?
      • build tooling hasn't kept up with the complexity of deploys
      • especially for web applications
      • builds are processes, not specifications
      • most tools: maven, ant, oriented around configuration instead of programming
    • boot
      • many independent parts that do one thing well
      • composition left to the user
      • maven for dependency resolution
      • builds clojure and clojurescript
      • sample boot project has main method (they used java project for demo)
      • uses '--' for piping tasks together (instead of the real |)
      • filesets are generated and passed to a task, then output of task is gathered up and sent to the next task in the chain (like ring middleware)
    • boot has a repl
      • can do most boot tasks from the repl as well
      • can define new build tasks via deftask macro
      • (deftask build ...)
      • (boot (watch) (build))
    • make build script: (build.boot)
      • #!/usr/bin/env boot
      • write in the clojure code defining and using your boot tasks
      • if it's in build.boot, boot will find it on command line for help and automatically write the main fn for you
    • FileSet: immutable snapshot of the current files; passed to task, new one created and returned by that task to be given to the next one; task must call commit! to commit changes to it (a la git)
    • dealing with dependency hell (conflicting dependencies)
      • pods
      • isolated runtimes, with own dependencies
      • some things can't be passed between pods (such as the things clojure runtime creates for itself when it starts up)
      • example: define pod with env that uses clojure 1.5.1 as a dependency, can then run code inside that pod and it'll only see clojure 1.5.1
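
    The middleware-style task chaining and the immutable FileSet can be modelled in plain Clojure. This is a toy model only, not boot's API: a "fileset" here is just a map of path to contents, and add-file-task, upcase-task, and run-pipeline are hypothetical names.

```clojure
(require '[clojure.string :as str])

;; A task is middleware: it takes the next handler and returns a handler
;; that receives a fileset, produces a new one, and passes it along.
(defn add-file-task [path contents]
  (fn [next-handler]
    (fn [fileset]
      (next-handler (assoc fileset path contents)))))

(defn upcase-task []
  (fn [next-handler]
    (fn [fileset]
      (next-handler
        (into {} (map (fn [[path contents]]
                        [path (str/upper-case contents)])
                      fileset))))))

;; Compose tasks left to right, the way `boot a -- b` chains them,
;; and finish the chain with identity so the last fileset is returned.
(defn run-pipeline [fileset & tasks]
  (((apply comp tasks) identity) fileset))
```

    Because each step returns a new map instead of mutating the old one, the starting fileset is still intact after the pipeline runs, which is the property the FileSet snapshot design relies on.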

    One Binder to Rule Them All: Introduction to Trapperkeeper

    • Ruth Linehan and Nathaniel Smith; puppetlabs
    • back-end service engineers at puppetlabs
    • service framework for long-running applications
    • basis for all back-end services at puppetlabs
    • service framework:
      • code generalization
      • component reuse
      • state management
      • lifecycle
      • dependencies
    • why trapperkeeper?
      • influenced by clojure reloaded pattern
      • similar to component and jig
      • puppetlabs ships on-prem software
      • need something for users to configure, may not have any clojure experience
      • needs to be lightweight: don't want to ship jboss everywhere
    • features
      • turn on and off services via config
      • multiple web apps on a single web server
      • unified logging and config
      • simple config
    • existing services that can be used
      • config service: for parsing config files
      • web server service: easily add ring handler
      • nrepl service: for debugging
      • rpc server service: nathaniel wrote
    • demo app: github -> trapperkeeper-demo
    • anatomy of service
      • protocol: specifies the api contract that that service will have
      • can have any number of implementations of the contract
      • can choose between implementations at runtime
    • defservice: like defining a protocol implementation, one big series of defs of fns: (init [this context] (let ...))
      • handle dependencies in defservice by vector after service name: [[:ConfigService get-in-config] [:MeowService meow]]
      • lifecycle of the service: what happens when initialized, started, stopped
      • don't have to implement every part of the lifecycle
    • config for the service: pulled from file
      • supports .json, .edn, .conf, .ini, .properties, .yaml
      • can specify single file or an entire directory on startup
      • they prefer .conf (HOCON)
      • have to use the config service to get the config values
      • bootstrap.cfg: the config file that controls which services get picked up and loaded into app
      • order is irrelevant: will be decided based on parsing of the dependencies
    • context: way for service to store and access state locally not globally
    • testing
      • should write code as plain clojure
      • pass in context/config as plain maps
      • trapperkeeper provides helper utilities for starting and stopping services via code
      • with-app-with-config macro: offers symbol to bind the app to, plus define config as a map, code will be executed with that app binding and that config
    • there's a lein template for trapperkeeper that stubs out working application with web server + test suite + repl
    • repl utils:
      • start, stop, inspect TK apps from the repl: (go); (stop)
      • don't need to restart whole jvm to see changes: (reset)
      • can print out the context: (:MeowService (context))
    • trapperkeeper-rpc
      • macro for generating RPC versions of existing trapperkeeper protocols
      • supports https
      • defremoteservice
      • with web server on one jvm and core logic on a different one, can scale them independently; can keep web server up even while swapping out or starting/stopping the core logic server
      • future: rpc over ssl websockets (using message-pack in transit for data transmission); metrics, function retrying; load balancing
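
    The "protocol as contract, implementation chosen at runtime, config passed as a plain map" idea can be sketched without Trapperkeeper itself. This is plain Clojure, not the defservice macro; LoudCat, QuietCat, implementations, and build-service are hypothetical names, and only MeowService and meow come from the talk's example.

```clojure
;; The protocol is the API contract for the service.
(defprotocol MeowService
  (meow [this]))

;; Two interchangeable implementations of the same contract.
(defrecord LoudCat [config]
  MeowService
  (meow [_] (str (:sound config) "!!!")))

(defrecord QuietCat [config]
  MeowService
  (meow [_] (:sound config)))

;; Pick an implementation from config at runtime.
(def implementations
  {:loud  ->LoudCat
   :quiet ->QuietCat})

(defn build-service [config]
  (let [constructor (get implementations (:impl config))]
    (constructor config)))
```

    Since the config is an ordinary map, a test can build either implementation directly, which is the testing style recommended above.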

    Domain-Specific Type Systems

    • Nathan Sorenson, sparkfund
    • you can type-check your dsls
    • libraries are often examples of dsls: not necessarily macros involved, but have opinionated way of working within a domain
    • many examples pulled from "How to Design Programs"
    • domain represented as data, interpreted as information
    • type structure: syntactic means of enforcing abstraction
    • abstraction is a map to help a user navigate a domain
      • audience is important: would give different map to pedestrian than to bus driver
    • can also think of abstraction as specification, as dictating what should be built or how many things should be built to be similar
    • showing inception to programmers is like showing jaws to a shark
    • fable: parent trap over complex analysis
    • moral: types are not data structures
    • static vs dynamic specs
      • static: types; things as they are at compile time; definitions and derivations
      • dynamic: things as they are at runtime; unit tests and integration tests; expressed as falsifiable conjectures
    • types not always about enforcing correctness, so much as describing abstractions
    • simon peyton jones: types are the UML of functional programming
    • valuable habit: think of the types involved when designing functions
    • spec-tacular: more structure for datomic schemas
      • from sparkfund
      • the type system they wanted for datomic
      • open source but not quite ready for public consumption just yet
      • datomic too flexible: attributes can be attached to any entity, relationships can happen between any two entities, no constraints
      • use specs to articulate the constraints
      • (defspec Lease [lessee :is-a Corp] [clauses :is-many String] [status :is-a Status])
      • (defenum Status ...)
      • wrote query language that's aware of the defined types
      • uses bi-directional type checking: github.com/takeoutweight/bidirectional
      • can write sensible error messages: Lease has no field 'lesee'
      • can pull type info from their type checker and feed it into core.typed and let core.typed check use of that data in other code (enforce types)
      • does handle recursive types
      • no polymorphism
    • resources
      • practical foundations for programming languages: robert harper
      • types and programming languages: benjamin c pierce
      • study haskell or ocaml; they've had a lot of time to work through the problems of types and type theory
    • they're using spec-tacular in production now, even using it to generate type docs that are useful for non-technical folks to refer to and discuss; but don't feel the code is at the point where other teams could pull it in and use it easily
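
    To make the "articulate the constraints, get readable errors" idea concrete, here is a toy checker in plain Clojure. It is not spec-tacular and does not use its syntax; lease-spec and check are hypothetical, and the checks run at runtime, not statically as a real type checker would.

```clojure
;; A spec is data: a name plus a predicate per allowed field.
(def lease-spec
  {:name   "Lease"
   :fields {:lessee  string?
            :clauses #(and (vector? %) (every? string? %))
            :status  #{:active :expired}}})

;; Return a vector of human-readable problems; empty means the entity conforms.
(defn check [spec entity]
  (vec
    (for [[field value] entity
          :let  [pred (get-in spec [:fields field])]
          :when (or (nil? pred) (not (pred value)))]
      (if (nil? pred)
        (str (:name spec) " has no field '" (name field) "'")
        (str (:name spec) " field '" (name field) "' has an invalid value")))))
```

    A misspelled key such as :lesee is reported by name, in the spirit of the error message quoted above.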

    ClojureScript Update

    • David Nolen
    • ambly: cljs compiled for iOS
    • uses bonjour and webdav to target ios devices
    • creator already has app in app store that was written entirely in clojurescript
    • can connect to device and use repl to write directly on it (!)

    Clojure Update

    • Alex Miller
    • clojure 1.7 is at 1.7.0-beta1 -> final release approaching
    • transducers coming
    • define a transducer as a set of operations on a sequence/stream
      • (def xf (comp (filter odd?) (map inc) (take 5)))
    • then apply transducer to different streams
      • (into [] xf (range 1000))
      • (transduce xf + 0 (range 1000))
      • (sequence xf (range 1000))
    • reader conditionals
      • portable code across clj platforms
      • new extension: .cljc
      • use to select out different expressions based on platform (clj vs cljs)
      • #?(:clj (java.util.Date.) :cljs (js/Date.))
      • can fall through the conditionals and emit nothing (not nil, but literally don't emit anything to be read by the reader)
    • performance has also been a big focus
      • reduced class lookups for faster compile times
      • iterator-seq is now chunked
      • multimethod default value dispatch is now cached
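
    The transducer and reader-conditional examples above, written out with their results. This assumes Clojure 1.7 or later; the var names (as-vector, read-for-clj, and so on) are mine. Reader conditionals normally require a .cljc file, so the sketch goes through read-string with the :read-cond option to show them from ordinary Clojure code.

```clojure
;; One transducer: keep odd numbers, increment them, stop after five.
(def xf (comp (filter odd?) (map inc) (take 5)))

;; The same transformation applied in three different contexts.
(def as-vector (into [] xf (range 1000)))        ; [2 4 6 8 10]
(def as-sum    (transduce xf + 0 (range 1000)))  ; 30
(def as-seq    (sequence xf (range 1000)))       ; (2 4 6 8 10)

;; Reader conditionals: on the JVM the :clj branch is selected.
(def read-for-clj
  (read-string {:read-cond :allow} "#?(:clj :jvm :cljs :js)"))   ; :jvm

;; With no matching branch, nothing is emitted at all (not even nil).
(def read-nothing
  (read-string {:read-cond :allow} "[1 #?(:cljs 2) 3]"))         ; [1 3]
```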
    → 7:00 AM, Apr 27