Michael A. Nielsen, A visual proof that neural nets can compute any function, Chapter 4 of “Neural Networks and Deep Learning”, 2015.
A great interactive essay on neural-network-based computing. The text is readable on mobile, but the interactions aren't touch friendly. I don't blame the author, only the poor tooling we have for publishing universally accessible content.
Bret Victor, Scientific Communication As Sequential Art, 2011.
This is a proposal of what scientific communication could look like. It is composed of text and images in web format, so they are accessible on mobile. In addition, the document is interactive, since the computer can be more powerful than paper and is able to run simulations for deeper insight. Unfortunately, I think such effort is unrealistic to maintain for most content publishers. Given the current state of content publishing tooling, such effort requires computer skills that most people do not possess. We can dream of a world in which such literacy is as expected as it is to write long prose for scientific publications. This will probably require new kinds of tools to “express with computers”.
In computer graphics, photorealism drives the field toward a goal that everyone can relate to and evaluate. Other areas of computing, like user interface design, have harder goals to define. I’ll look at the example of computer animation tools.
Here’s a nice video about the making of the movie Jurassic Park:
The amount of work and invention required to produce convincing-looking dinosaurs is staggering. And it was done in 1993!
I think computer graphics is fortunate to have a clearly defined benchmark. In the physical world, we all have an intimate experience of how light, materials and living creatures behave. Our brain is extremely good at spotting inconsistencies. The public wouldn't settle for less than convincing-looking pictures. That sets a very high bar for the visual effects teams, and drives formidable invention in the field.
Other computing areas where the upper bar isn’t defined precisely, or for which we don’t have a natural equivalent, are much harder to drive up. This includes the very way we use computers to do graphics (the user interface and the graphics programming languages), as well as everyday software we use to share and communicate.
2. Defining other goals
What would be the equivalent of photorealism for user interface design? We could think of “body realism” or “cognitive realism”: the idea that we should match our universal motor and cognitive abilities, and work out how we can best interface with the hand-eye coordination or temporal pattern detection that we can naturally perform.
For example, animators are mental dancers who have an intimate sense of timing. They can capture the nuances of body language and common physical behavior. But in practice, they have to lay out their intuition over a spatial representation of time, and learn abstract animation curves to express the inner dancing that naturally comes to them.
When I see the tool above, I am impressed by its elegance and power. However, when working on actual shipping projects, animation curves are more likely to look like this:
The animator has to handle the abstraction of representing time over space, and deal with the mental complexity that the representation quickly grows into. Learning how to use professional animation tools in order to ship anything interesting is a huge effort—no matter how much the big names in the animation industry downplay it as “just” a tool.
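To make the "time over space" abstraction concrete, here is a minimal sketch of what an animation curve is underneath: a list of keyframes sampled over time, evaluated by interpolation. Real packages interpolate with Bezier or Hermite tangents; this toy version interpolates linearly, and all names are illustrative, not any package's API.

```python
def eval_curve(keys, t):
    """Evaluate an animation curve at time t.
    `keys` is a sorted list of (time, value) keyframes; between keys we
    interpolate linearly (professional tools use tangent-based splines)."""
    if t <= keys[0][0]:
        return keys[0][1]
    if t >= keys[-1][0]:
        return keys[-1][1]
    for (t0, v0), (t1, v1) in zip(keys, keys[1:]):
        if t0 <= t <= t1:
            u = (t - t0) / (t1 - t0)  # normalized position between the keys
            return v0 + u * (v1 - v0)

# A bounce expressed as rotation keyframes: time in frames, value in degrees.
bounce = [(0, 0.0), (12, 45.0), (24, 0.0)]
print(eval_curve(bounce, 6))  # 22.5, halfway up
```

The animator's "inner dancing" lives entirely in where those key times and values are placed; the curve editor is just a spatial view of this list.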
One of the magicians behind Jurassic Park, Phil Tippett, who was in charge of the animatronics and stop-motion animation, had this to say about digital computer animation:
We [in the stop-motion animation team] are used to actually walking up to the puppet, making each of these moves by hand. I’m not used to sitting down at a keyboard and having to hit buttons. It’s kind of like animating with boxing gloves on.
Tippett laments the indirection of animating a digital puppet compared to animating a physical one. But that only points to the opposite problem: animators have to animate their puppets, whether digital or physical, down to the rotation angles of the finger joints. In real life, when we reach to grasp a glass of water, we don't do so with explicit knowledge of our fingers' rotation angles. Instead, these details are implicitly integrated by our motor abilities, and we act on them at a higher level of abstraction. It is that level that's meaningful to us and to the audience of animated pictures.
So it looks like the animation process happens at either one layer of abstraction above or below the layer that’s actually meaningful to the animator—without ever being at the sweet spot. I wonder if this constant up and down the ladder of abstraction is a fundamental limit, i.e. this is what makes an animator an animator, or an arbitrary limit, i.e. this is how the current animation systems work.
Does it make sense to think of an animation system where we direct characters by instructing meaning, instead of tediously manipulating single components? If I want to invoke rhythms like “3, 2, 1, go!” or “↗↘↗↘↗↘”, how would I tell the computer what to do in terms of what I want to do?
Ronald M. Baecker. A Conversational Extensible System for the Animation of Shaded Images (PDF), 1976. This paper is about SHAZAM, which was an animation system made at Xerox PARC. I stole two formulations from the paper: “It’s essential that a computer animator develop an ability to sense which aspects of a system’s limitations are arbitrary and which are fundamental” (p. 6), and “It was impossible to explain to the animator what I was doing in terms of what he was doing” (p. 6).
A peek into the Human Advancement Research Community.
HARC stands for Human Advancement Research Community. It’s a long range research group in computing funded by Y Combinator Research. It includes people like Alan Kay, Dan Ingalls, Bret Victor, Chaim Gingold and Toby Schachman, amongst many others I'd be excited to learn more about.
HARC is inspired by the early ARPA initiative from the 1960s: to be a high-risk, high-gain, far out research fund for computing related areas.
Götz Bachmann, Professor for Digital Cultures at the Institute for Culture and Aesthetics of Digital Media in Lüneburg, Germany, wrote an article about the research project on LIMN, a scholarly journal and art magazine. Here is the link for the article: Utopian Hacks.
This article is the first I saw relating some inside work happening at HARC. Bachmann specifically targets Bret Victor’s research group currently named “Realtalk”. Here is a selection of insightful quotes from the article:
Regarding HARC overall goal:
The overall goal is to create a rupture of a fundamental kind, a jump in technology equivalent to the jump in the 1960s and early 1970s when the quadruple introduction of the microprocessor, the personal computer, the graphical user interface, and the Internet revolutionized what computing could be by turning the computer into a medium.
Turning computing into media was already in the 1960s and 1970s meant to work with technology against technology: by using new computational capabilities, a medium was carved out that complies less with perceptions at the time of what computing is, and more with what a medium that forms a dynamic version of paper could look like.
Regarding Alan Kay’s research in the 1970s about programming and the object orientation idea (which is not what current object oriented programming languages are about):
The first iterations of Smalltalk [a programming language created by Alan Kay and his group] were experiments in object orientation that aimed to model all programming from scratch after a distributed system of message passing. Later versions gave up on this, and after an initial phase of success, Smalltalk eventually lost the battle over the dominant form of object orientation to the likes of C++ and Java.
The work of Alan Kay and his “Learning Research Group” can thus be seen as both a lost holy grail of computing before it was spoiled by a model of computing as capitalism cast in hard.
Kay’s work can be seen as a benchmark in radical engineering, as such enabling us to critique the stalemate and possible decline in quality of most currently available imaginaries about technologies.
About HARC Realtalk research group:
The group is constructing a series of operating systems for a spatial dynamic medium, each building on the experiences of building the last one, and each taking roughly two years to build. The current OS is named “Realtalk” and its predecessor was called “Hypercard in the World” (both names pay respect to historical, heterodox programming environments: Smalltalk in the 1970s and Hypercard in the 1980s).
As if to echo Nietzsche’s [...], a larger goal is to make new thoughts possible, which have until now remained “unthinkable” due to contemporary media’s inadequacies.
Enhanced forms of embodied cognition, and better ways of cooperative generation of ideas could cure the loneliness and pain that are often part of deep thought.
And a final thought-provoking reminder from some HARC researchers’ point of view:
The radical engineers would also be the first to state that the same interim solutions, if stopped in their development and reified too early, are potential sources of hacks in the derogatory sense. The latter is, according to their stories, exactly what happened when, 40 years ago, the prototypes left the labs too soon, and entered the world of Apple, IBM, and Microsoft, producing the accumulation of bad decisions that led to a world where people stare at smartphones.
ARPA. US government science and research fund started in 1958. Many computing pioneers (Ivan Sutherland, Douglas Engelbart, Alan Kay, etc.) and inventions (the Internet, graphical user interfaces) originated from that research group. Wikipedia.
Ivan Sutherland invented a revolutionary way of designing on a computer in 1962. We can find some of his core ideas in modern 3D software.
Here's Alan Kay presenting a revolutionary program made by Ivan Sutherland for his PhD thesis in 1962:
Sketchpad seems smarter than, and different from, common 2D drawing programs. In Sketchpad, you draw without trying too hard, then you assign goals and constraints, and the computer resolves the shape into what you instructed it to be. In contrast, regular 2D drawing programs encourage you to build your shapes precisely, bottom up and step by step.
Alan Kay laments how surprisingly hard it is to find something like Sketchpad available to the general public today.
Design by simulation
The way Sketchpad does things reminds me of how we work with 3D software. Here's an example from Cinema 4D:
In this example, the scene is animated using physics and constraints only:
The head of the character is constrained to aim in the direction of the ball (along the red X axis) with 2 degrees of freedom (the head rotates left and right, and up and down).
The ball is set with an initial velocity toward the character, and a physics simulation deals with the rest.
The result is a goofy character that follows the ball. There was no manual keyframing involved in the making of the animation.
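An aim constraint like the one above can be sketched in a few lines. This is a toy Python version, not Cinema 4D's actual solver, and the function names and angle limits are my own assumptions: given the head and ball positions, compute the two permitted rotations (yaw and pitch) and clamp them to plausible limits.

```python
import math

def aim_head(head_pos, ball_pos,
             max_yaw=math.radians(80), max_pitch=math.radians(45)):
    """Return clamped yaw/pitch angles that point the head at the ball.
    Only two degrees of freedom: left-right (yaw) and up-down (pitch)."""
    dx = ball_pos[0] - head_pos[0]
    dy = ball_pos[1] - head_pos[1]
    dz = ball_pos[2] - head_pos[2]
    yaw = math.atan2(dx, dz)                    # left-right rotation
    pitch = math.atan2(dy, math.hypot(dx, dz))  # up-down rotation
    yaw = max(-max_yaw, min(max_yaw, yaw))      # the constraint: clamp
    pitch = max(-max_pitch, min(max_pitch, pitch))
    return yaw, pitch

# The ball flies past on some path; the head just follows, frame by frame.
head = (0.0, 1.6, 0.0)
for t in range(5):
    ball = (2.0 - t * 1.0, 1.0, 3.0 - t * 1.2)
    yaw, pitch = aim_head(head, ball)
    print(f"t={t}  yaw={math.degrees(yaw):6.1f}  pitch={math.degrees(pitch):6.1f}")
```

No keyframes anywhere: the motion falls out of the constraint plus whatever trajectory the physics simulation feeds in.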
This way of doing things is closer in spirit to Sketchpad's. We set up the initial conditions, we instruct the computer with goals and constraints, and the solvers do the rest. It is a wonderful way to experiment. The possibilities are immense. I'm often paralyzed by the combinatorial explosion of interesting situations we can set up.
However, with this method of design, a great deal of blind trial and error is necessary to build a meaningful scene. It is like reverse engineering biological evolution. Most experiments won't snap into an "aha!" resolution.
It took me several attempts to find the right axis and degrees of freedom to constrain, as well as a careful setup of the initial conditions, before I got the animation above. And I should still tweak the animation to improve the character's performance: I could add a delay between the ball's position and the head tracking, and make the character's body react to the ball when it hits.
The learning curve required to art direct physics- and constraint-based animations is steep. I think there is a huge opportunity to make the discovery process more fun and natural with improved user interfaces and feedback loops.
The tools that were developed for digital computer animation and 3D visual effects, unlike 2D drawing programs, seem to be direct descendants of Sketchpad. I believe this is not a coincidence. We can trace the inception of 3D graphics to the ARPA project and the University of Utah in the 1960s and 70s. For example, Edwin Catmull, the inventor of texture mapping and other fundamental 3D graphics techniques, and now president of Pixar and Disney Animation, was a student of Ivan Sutherland.
Edwin Catmull. 3D computer graphics pioneer. President of Pixar and Disney Animation Studios. Wikipedia.
Texture mapping is a technique used to project a 2D texture on a 3D model. It is the way we dress up 3D objects with colors and materials. Wikipedia.
I was writing a note about a topic of interest. I made a quick and excited draft on my iPad. When I moved on to my PC to write the final note, a whole week flew by and I found myself lost in long paragraphs, over ambitious goals and no clear point to make. I had no idea what I wanted to say.
This happens constantly, including right now.
So here's the point: writing is essential to clarify ideas. But beware of rhetoric. Words and sentences can draw you where you didn't want to go and trap you there.
Did you know people in the Maghreb write their dialect in Latin and use numbers to denote specific Arabic characters?
The Maltese language is fascinating to me. It hits very close to home.
It is a mix of Arabic, Italian, French and English written in Latin script:
In this video, we learn that the Maltese language branched off independently from North African Arabic during the Middle Ages. It detached from Quranic Arabic, and managed to mix Arabic, Romance languages and English into an official form spoken by hundreds of thousands of people.
I knew nothing about Maltese culture and language, but I immediately felt intimate with such a mix.
Moroccans, for example, speak a mix of Arabic, Berber, French and Spanish in a dialect called “darija”. To write darija on phones and computers, people use a mix of Latin characters and... numbers! The numbers make up for specific Arabic characters that Latin script has no equivalent for, like “7” for “ح” or “9” for “ق”. For example, we can write “مرحبا” as “mar7ba”, which means welcome:
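The convention is simple enough to capture in a small lookup table. Here is a Python sketch with a few of the digit substitutions commonly used to write Arabic sounds in Latin script; the table is illustrative and far from complete, and the function name is my own:

```python
# A few of the digit conventions used to write Arabic sounds in Latin
# script (often called "arabizi"); this is an illustrative subset only.
DIGIT_TO_ARABIC = {
    "2": "ء",  # hamza (glottal stop)
    "3": "ع",  # ayn
    "7": "ح",  # hah
    "9": "ق",  # qaf
}

def explain(word):
    """List which Arabic letters the digits in a Latin-script word stand for."""
    return [(ch, DIGIT_TO_ARABIC[ch]) for ch in word if ch in DIGIT_TO_ARABIC]

print(explain("mar7ba"))  # [('7', 'ح')] -- the "7" stands for ح
```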
Is 3D animation with inverse kinematics a valid example of goal driven programming?
I'm interested in computer based tools that allow humans to create complex systems using natural user interfaces. I'm learning about computing tools from the past and present. I want to get the general landscape, and to understand what the capabilities and limitations of the available tools are. One way to do that is to find the right people and learn from them.
Alan Kay is a computer researcher and a member of a remarkable group who pioneered graphical user interfaces at Xerox PARC. Alan Kay claims that the current state of computer programming is sad. He says that we ignored and forgot important insights laid out by brilliant early inventors.
I'll give an example of such an early invention, then I'll present a modern tool, and I'll ask if the modern tool correctly represents the intent of the pioneer.
In 1962, Ivan Sutherland created a revolutionary computer program called Sketchpad:
Alan Kay frequently shows Sketchpad in his hypnotic talks. He defines Sketchpad as the first “non-procedural programming system”. That means the user tells the computer what he wants (goals), instead of telling the computer how to do what he wants (procedures). The audience is usually amazed by how advanced a system from 1962 is, and wonders why such capabilities aren't common in modern graphics software.
Bret Victor, a human-computer interface designer who helped resurface some of these groundbreaking ideas, described Sketchpad as an example of “goal driven programming”.
2. Inverse kinematics
In modern 3D computer animation, there is a fascinating technique used for character animation called inverse kinematics:
Consider the 3D skeleton in the video. Each joint of the skeleton is constrained to move and rotate in some directions only. The animator can then move and rotate specific control points like the wrist or the hip. In the video above, only the hip is moved and rotated. The rest of the skeleton follows along according to the given constraints. The result is a natural-looking motion that doesn't require manually positioning every single joint.
Here's the Wikipedia definition of inverse kinematics:
“An animated figure is modeled with a skeleton of rigid segments connected with joints, called a kinematic chain. The kinematics equations of the figure define the relationship between the joint angles of the figure and its pose or configuration. Inverse kinematics compute the joint angles for a desired pose of the figure.”
In short, inverse kinematics is a solver that runs behind the scenes to satisfy the animator's intent in real time.
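For a two-joint chain, the solver can even be written in closed form. Here is a minimal 2D Python sketch, not the algorithm any particular animation package uses: given a target hand position (the goal), the law of cosines yields the elbow angle, and the shoulder angle aims the chain at the target.

```python
import math

def two_bone_ik(target_x, target_y, l1=1.0, l2=1.0):
    """Analytic inverse kinematics for a 2-joint arm in the plane.
    Given a desired hand position, solve for the shoulder and elbow
    angles -- the opposite direction of forward kinematics."""
    d = math.hypot(target_x, target_y)
    d = min(d, l1 + l2 - 1e-9)  # clamp targets the arm cannot reach
    # Law of cosines gives the elbow bend...
    cos_elbow = (d * d - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    elbow = math.acos(max(-1.0, min(1.0, cos_elbow)))
    # ...and the shoulder angle points the whole chain at the target.
    shoulder = math.atan2(target_y, target_x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
    return shoulder, elbow

def forward(shoulder, elbow, l1=1.0, l2=1.0):
    """Forward kinematics: joint angles back to hand position (for checking)."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y
```

The animator drags the hand; the solver fills in the joints. Real skeletons have many more joints and constraints, so production solvers iterate numerically, but the spirit is the same.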
Can we consider 3D animation with inverse kinematics as an example of goal driven programming?
Alan Kay. How to Invent the Future II (YouTube), 2017. Every talk by Alan Kay is basically a rant on how bad things turned out for modern computing. Highly recommended!
Bret Victor. The Future of Programming (Vimeo), 2013. Bret Victor sets up his presentation in 1973, down to the white shirt and tie. He uses that theatrical trick to provoke the audience about how computing didn't turn out as well as 1973's future promised. Additional notes from the talk can be found at worrydream.com/dbx/
You shouldn't reinvent the wheel. Reinventing the wheel is the best way to understand the wheel.
I worked as a freelance web designer and developer. One of my clients was a director of photography. He wanted an online portfolio. My proposal was to build a static website without using any CMS (“Content Management System”: a computer application that allows the creation and modification of digital content, like WordPress).
I argued that my solution would cost less, take less time to develop, and require cheaper hosting.
On the flip side, updating the portfolio would require a basic understanding of HTML and server management. I offered to include a few HTML and file-transfer lessons in my proposal. I'd show how to add ready-made snippets of HTML into a document and upload it to a server, so my client could add new media content himself. My client was playful and smart. He agreed.
That project never shipped.
I often want to tell something about an experience or an idea I had. But most of the time, it never gets out there. I start by writing a note, and I get lost thinking about presenting it nicely, packaging it into a separate web page, and making it relevant and thorough enough to stand on its own.
These are impossible and counterproductive targets for what are essentially thinking-out-loud notes. As a result, most notes I write end up forgotten on my iPad.
Then I got an idea: create a presentation template, make the process of adding new notes as automatable as possible, and dedicate a page to all the notes. In one word: blog.
It is very tempting to tinker and try to come up with new solutions for every problem. But sometimes the best way to go is to use a proven solution.
Nevertheless, I enjoy traveling the path and reinventing the wheel. It illuminates the history of things and how they came to be.
Oskar Stålberg reinvented the rule of the road (Twitter): “Units got stuck when meeting in narrow passages. I fixed it by introducing left handed traffic. Never felt so clever.”
While reading tutorials or watching presentations, I often come across this sentence: “it's simple!”. The author uses that sentence to introduce an idea and explain something. I think “it's simple” shouldn't be used to explain something.
Using “it's simple” didactically means:
I am enthusiastic about the topic
You can understand what I'm explaining
There shouldn't be any disagreement about what I'm saying
But saying “it's simple” can:
Undermine the listener's efforts
Set false expectations
Imply that anything less than immediate understanding is stupid
Whenever I witness a coincidence, my inner voice shouts biological evolution!
By that I mean: I just witnessed a collision of chains of events that makes sense in my perspective.
Example: I crack my fingers, and at the exact same time, the door cracks because of the wind. Two similar sounds happening at the same time are worthy of notice.
Biological evolution is the history of coincidences happening in a deterministic physical world. Structures of the world can produce simultaneous events that resonate with each other and create something new.
There is a debate in the tech world about mobile computing (smartphones and tablets) vs. desktop computing (desktops and laptops). Below is a perspective for the creative professionals.
I think the problem is about where work happens.
Sitting at a desk is the symbol of getting work done.
Tools like pen, paper and desktop computers have evolved around the desk + chair model. But I can't sit at a desk all the time. I'd rather move freely and choose my spot and body configuration depending on my mood.
I sit at desks because that's where I find the tools I need. I need Cinema 4D, Photoshop, After Effects, Sublime Text, note editors, web browsers, file compressors, server file uploaders, as much processing power as possible, and the ability to handle and switch between tens, sometimes hundreds of elements to ship a project.
Ideally I'd like to do all that wherever I want: sitting, standing, walking, next to my friend, inside, outside, going to bed, waking up, in the kitchen, in the toilets, all with maximum compute power and energy autonomy.
That's why mobile spread wider than desktop: it's about the mobility. Below is one of the tweets that prompted this post:
“No need to debate “kids use phones” or “iPad sits in a drawer”. Reality, laptops sit on desks, used less by most. Mobile = important.”
Right now, mobile devices, i.e. phones and tablets, have nowhere near the power and speed of use that PCs provide. Current mobile devices are extremely frustrating for people with complex workflows: 3D graphics, software development, video editing, etc.
The future is keeping the mobility of mobile devices:
Read and write anywhere
Photo and video anywhere
While matching the speed and depth current desktop computers provide, so you could:
Work at your desk, then get up and...
Edit a 3D model while pacing the room
Update your website while walking outside
Get back to your desk and continue
Creative people dream of continuous tooling availability, but the current crop of mobile hardware and software isn't capable enough for digital creative workflows.
Update 7 April 2017
The question isn't whether or not we can create on mobile—of course we can. Creation has been mobile for thousands of years. Creative people can create with anything. The question is: how will we do on mobile what we currently do on desktop?
The installation looks important and immersive, because it is well staged. It is afforded to human perception in a way that makes it a part of the world we live in. Humans pay attention to it.
Nature is full of mind-blowing structures that science helps capture. But relatively few of these structures are widely afforded to human perception as if they were part of this world. Yet the world is made of these mind-blowing structures.
In Star Trek Voyager, Season 6 Episode 14, “Memorial”, the spaceship's crew experiences unexpected memories of a massacre. They remember the killing of dozens of innocent people. Those memories put the crew in deep pain.
Later on, they discover that their memories were created by this structure:
A civilization has erected this monument. It emits a signal throughout the system. Anyone who comes close enough will experience memories of the massacre.
The structure contains a synaptic transmitter, says Seven of Nine, a member of the crew. It was designed to send neurogenic pulses throughout this system.
This device makes visitors remember arbitrary events. If the device was configured to generate memories of a massacre, a visitor would get a vivid memory of a disaster and feel the loss of many loved ones.
Words alone cannot convey the suffering.
Words alone cannot prevent what happened here from happening again.
Beyond words lies experience.
Beyond experience lies truth.
Make this truth your own.
I think such a device would constitute the second ultimate work of art. It maps directly to the emotional structures of the mind and creates any emotion the author intends to.
The ultimate work of art would be to capture and rebuild the emotion itself.