Simone Weil’s principles for automation (Man / Machine VI)

Philosopher and writer Simone Weil laid out a few principles on automation in her fascinating and often difficult book The Need for Roots. Her view was positive, and she noted that among factory workers the happiest ones seemed to be those who worked with machines. She had strict views on the design of these machines, however, and her views can be summarized in three general principles.

First, these tools of automation need to be safe. Safety comes first, and should also be weighed when deciding what to automate first – the idea that automation can be used to protect workers is an obvious but sometimes neglected one.

Second, the tools of automation need to be general purpose. This is an interesting principle, and one that is not immediately obvious. Weil felt that this was important – when it came to factories – because they could then be repurposed for new social needs and respond to changing social circumstances – most pressingly, and in her time acutely, war.

Third, the machine needs to be designed so that it is used and operated by man. The idea that you would substitute machine for man she found ridiculous for several reasons, not least because we need to work to find purpose and meaning, and any design that eliminates us from the process of work would be socially detrimental.

All of Weil’s principles are applicable and up for debate in our time. I think the safety principle is fairly widely accepted, but we should note that she speaks of individual safety and not our collective safety. In cases where automation technology could pose broader safety challenges in different ways, Weil does not provide us with a direct answer. These need not be apocalyptic scenarios at all, but could simply be questions of systemic failures of connected automation technologies, for example. Systemic safety, individual safety and social safety are all interesting dimensions to explore here – are silicon / carbon hybrid models always safer, more robust, more resilient?

The idea of general-purpose, easily repurposed automation is something that I think reflects how we have seen 3D printing evolve. One idea of 3D printing is exactly this: that we get generic factories that can manufacture anything. But another observation close at hand is that you could imagine Weil’s principle as an argument for general artificial intelligence. Admittedly this is taking it very far, but there is something to it: a general AI & ML model can be broadly and widely taught, and we would avoid narrow guild experts emerging in our industries. That would, in turn, allow for quick learning and evolution as technologies, needs and circumstances change. General-purpose technologies for automation would allow us to change and adapt faster to new ideas, challenges and selection pressures – and would serve us well in a quickly changing environment.

The last point is one that we will need to examine closely. Should we consider it a design imperative to design for complementarity rather than substitution? There are strong arguments for this, not least cost arguments. Any analysis of a process that we want to automate will yield a silicon-carbon cost function that gives us the cost of the process as different parts of it are performed by machines and humans. A hypothesis would be that for most processes this equation will see a distribution across the two, and only for very few will we see a cost equation where the human component is zeroed out – not least because human intelligence is produced at extraordinarily low energy cost and with great resilience. There is even a risk mitigation argument here: you could argue that always including a human element, or designing for complementarity, necessarily generates more resilient and robust systems, as the failure paths of AIs and human intelligence look different and are triggered by different kinds of factors. If, for any system, you can allow for different failure triggers and paths, you seem to ensure that the system self-monitors effectively and reduces risk.
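
To make the idea a little more tangible, here is a minimal sketch of what such a silicon-carbon cost function might look like. All the numbers, parameter names and the shape of the penalty term are invented assumptions for the sake of illustration, not a model of any real process; the point is simply that the minimum of such a curve rarely sits at full automation.

```python
# A minimal, illustrative sketch of a "silicon-carbon" cost function for a
# process split between machines (silicon) and humans (carbon).
# All parameter values are made-up assumptions, chosen only to illustrate
# the hypothesis that the human component is rarely zeroed out.

def process_cost(machine_share,
                 machine_unit_cost=1.0,    # assumed cost per unit of automated work
                 human_unit_cost=3.0,      # assumed cost per unit of human work
                 handoff_cost=0.5,         # assumed coordination cost of mixing the two
                 automation_penalty=4.0):  # assumed cost of losing the human error-catching layer
    """Total cost of a process where `machine_share` (0..1) is done by machines."""
    human_share = 1.0 - machine_share
    cost = machine_share * machine_unit_cost + human_share * human_unit_cost
    # Mixing the two adds some coordination overhead...
    cost += handoff_cost * min(machine_share, human_share)
    # ...but removing the human element entirely removes a cheap, resilient
    # layer with different failure paths, modeled here as a penalty that only
    # kicks in near full automation.
    cost += automation_penalty * max(0.0, machine_share - 0.9) ** 2 * 100
    return cost

if __name__ == "__main__":
    for share in [0.0, 0.25, 0.5, 0.75, 0.9, 1.0]:
        print(f"machine share {share:.2f}: cost {process_cost(share):.2f}")
```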

Weil’s focus on automation is also interesting. Today, in many policy discussions, we see the emergence of principles on AI. One could argue that this is technology-centric principle making, and that ethical and philosophical principles are better applied to the use of a technology – that use-centric principles are more interesting. The use case of automation is admittedly a broad one, but an interesting one to test this on and see if salient differences emerge. How we choose to think about principles also forces us to think about the way we test them. An interesting question is to compare with other technologies that have emerged historically. How would we think about principles on electricity, computation, steam? Or principles on automobiles, telephones and telegraphs? Where do we effectively place principles to construct normative landscapes that benefit us as a society? Principles for driving, for communicating, for selling electricity (and for using it, certifying devices and so on – we could actually have a long and interesting discussion about what it would mean to certify different ML models!).

Finally, it is interesting also to think about the function of work from a moral cohesion standpoint. Weil argues that we have no rights but for the duties we assume. Work, we could add, is a foundational duty that allows us to build those rights. There is a complicated and interesting argument here that ties rights to duties to human work in societies from a sociological standpoint. Discussions about universal basic income are often conducted in sociological isolation, without thinking about the network of social concepts tied up in work. If there is, as Weil assumes, a connection between our work and duties and the rights a society upholds on an almost metaphysical level, we need to re-examine our assumptions here – and look carefully at complementarity design as a foundational social design imperative for just societies.

Notes on attention, fake news and noise #3: The Noise Society 10 years later

This February it is 10 years since I defended my doctoral thesis on what I then called the Noise Society. The main idea was that the vision of an orderly, domesticated and controllable information society – modeled on the post-industrial visions of Bell and others – probably was wrongheaded, and that we would see a much wilder society characterized by an abundance of information and a lack of control; in fact, we would see information grow to a point where its value actually collapsed as the information itself collapsed into noise. Noise, I felt then, was a good description not only of individual disturbances in the signal, but also of the overall cost of signal discovery. A noise society would face very different challenges than an information society.

Copyright in a noise society would not be an instrument of encouraging the production of information so much as a tool for controlling and filtering information in different ways. Privacy would not be about controlling data about us as much as having the ability to consistently project a trusted identity. Free expression would not be about the right to express yourself, but about the right not to be drowned out by others. The design of filters would become key in many different ways.

Looking back now, I feel that I was right in some ways and wrong in many, but that the overall conclusion – that the increase in information and the consequences of this information wealth are at the heart of our challenges with technology – was not far off target. What I am missing in the thesis is a better understanding of what information does. My focus on noise was a consequence of accepting that information was a “thing” rather than a process. Information looks like a noun, but is really a verb.

Revisiting these thoughts, I feel that the greatest mistake was not including Herbert Simon’s analysis of attention as a key concept in understanding information. If I had done that I would have been able to see that noise also is a process, and I would have been able to ask what noise does to a society, theorize that and think about how we would be able to frame arguments of policy in the light of attention scarcity. That would have been a better way to get at what I was trying to understand at the time.

But, luckily, thought is about progress and learning, and not about being right – so what I have been doing in my academic reading and writing for the last three years at least is to emphasize Herbert Simon’s work, and the importance of understanding his major finding that with a wealth of information comes a poverty of attention and a need to allocate attention efficiently.

I believe this can be generalized, and that the information wealth we are seeing is just one aspect of an increasing complexity in our societies. The generalized Simon theorem is this: with a wealth of complexity comes a poverty of cognition and a need to learn efficiently. Simon, in his 1969 talk on this subject, notes that it is only by investing in artificial intelligence that we can do this, and he says that it is obvious to him that the purpose of all of our technological endeavours is to ensure that we learn faster.

Learning, adapting to a society where our problems are an order of magnitude more complex, is key to survival for us as a species.
It follows that I think the current focus on digitization and technology is a mere distraction. What we should be doing is re-organizing our institutions and societies for learning more, and faster. This is where the theories of Hayek and others on knowledge coordination become helpful and important for us, and our ideological discussions should focus on whether we are learning as a society or not. There is a wealth of unanswered questions here, such as how we measure the rate of learning, what the opposite of learning is, how we organize for learning, how technology can help and how it harms learning – questions we need to dig into and understand at a very basic level, I think.

So, looking back at my dissertation – what do I think?

I think I captured a key way in which we were wrong, and I captured a better model – but the model I was working with then was still fatally flawed. It focused on information as a thing not a process, and construed noise as gravel in the machinery. The focus on information also detracts from the real use cases and the purpose of all the technology we see around us. If we were, for once, to take our ambitions “to make the world a better place” seriously, we would have to think about what it is that makes the world better. What is the process that does that? It is not innovation as such, innovation can go both ways. The process that makes our worlds better – individually and as societies – is learning.

In one sense I guess this is just an exercise in conceptual modeling, and the question I seem to be answering is what conceptual model is best suited to understand and discuss issues of policy in the information society. That is fair, and a kind of criticism that I can live with: I believe concepts are crucially important and before we have clarified what we mean we are unable to move at all. But there is a risk here that I recognize as well, and that is that we get stuck in analysis-paralysis. What, then, are the recommendations that flow from this analysis?

The recommendations could be surprisingly concrete for the three policy areas we discussed, and I leave it as an exercise for the reader to think about them. How would you change the data protection frameworks of the world if the key concern was to maximize learning? How would you change intellectual property rights? Free expression? All are interesting to explore and to solve in the light of that one goal. I tend to believe that the regulatory frameworks we end up with would be very different from the ones we have today.

As one part of my research as an adjunct professor at the Royal Institute of Technology I hope to continue exploring this theme and others. More to come.

Ginzburg V: Bertillonian word portraits in the age of tag clouds

Ginzburg dwells on the use of signs to identify individuals in his essay. His main example is the emergence of fingerprinting as a semiotic practice to identify and diversify crowds into individuals. But he also looks at how graphology grew out of the understanding that one’s characters – in writing – reflected one’s character – in psychology. Through the series of examples of signs and symptoms used to identify the individual (also used to warn us of recidivist criminals, as in Dumas) he tries to show that our need for a connection between the semiotic and the biological has roots in our need for accountability and legal responsibility. In passing, Ginzburg also mentions Bertillon’s use of word portraits, a practice that was suggested to escape the simplistic physiognomic descriptions in early criminal records. And here we find something interesting.

Bertillon suggested that a linguistic description of an individual would be more interesting than one in which we try to measure a few biological qualities. He immediately ran into two problems. One was the problem of linguistic ambiguity – how do you create a literal description of an individual that can be used to uniquely identify that individual? The other was that the system was, as Ginzburg notes, wholly negative. Someone who did not fulfill the description could be eliminated, but would a word portrait really uniquely identify someone? Anyone who has read a description of a criminal, or seen a so-called phantom image of one, understands how hard that task really is.

Fast forward to today. Would it not be possible to extract unique linguistic signatures from someone’s Facebook feed or Twitter stream? In fact, would it not even be possible to identify an individual more exactly from their linguistic print than from any biological data? We could imagine a technology that creates a linguistic portrait against which we can be authenticated, and where we would not even know what idiosyncrasies truly and finally identified us uniquely as ourselves.

A Turing test of sorts, a matching against all the traces and threads that we have left online, that would be able to state with a percentage the likelihood that we are the same individual as that of the text sample we are being compared to.
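
As a very rough illustration of what such a matching might involve, here is a toy sketch that builds a character-trigram profile from a person’s known writing and scores an unknown sample against it. The texts and the trigram-plus-cosine approach are assumptions chosen for brevity; real authorship attribution uses far richer features and properly calibrated models before it can state anything like a likelihood.

```python
# A toy sketch of a linguistic "word portrait": build a character-trigram
# profile from known writing and score how similar an unknown text sample is
# to it. Illustration only - not a real authorship attribution system.

from collections import Counter
from math import sqrt

def trigram_profile(text):
    """Normalized character-trigram frequencies for a text."""
    text = " ".join(text.lower().split())
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values()) or 1
    return {gram: n / total for gram, n in counts.items()}

def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    shared = set(p) & set(q)
    dot = sum(p[g] * q[g] for g in shared)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

if __name__ == "__main__":
    # Invented example texts standing in for a person's online traces.
    known_writing = "I tend to believe that concepts are crucially important."
    unknown_sample = "I believe concepts matter a great deal, before anything else."
    portrait = trigram_profile(known_writing)
    score = cosine_similarity(portrait, trigram_profile(unknown_sample))
    print(f"similarity to the word portrait: {score:.0%}")
```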

A lot of interesting questions suggest themselves. A fingerprint can only identify the individual. Can a linguistic word portrait also suggest age? Can it suggest intoxication or any other state of mind? Is it possible to build a piece of software that can show in detail how our language typically changes through our life spans? Or degrades with a number of drinks?

Imagine a filter that detects the tell-tale signs of drunk tweeting and silently holds your tweets until you sober up. Or a filter that suggests that your actual mental age seems closer to 50 than 40 at this point in time. An interesting, if somewhat eerie, possibility perhaps.

What would a word portrait of you look like?

Ginzburg IV: The origins of narrative

A large part of Ginzburg’s essay concerns the nature and origin of narrative. Ginzburg’s hypothesis is as daring as it can be controversial. He writes:

“Perhaps indeed the idea of a narrative, as opposed to spell or exorcism or invocation (Seppilli 1962), originated in a hunting society, from the experience of interpreting tracks. […] The hunter could have been the first ‘to tell a story’ because only hunters knew how to read a coherent sequence of events from the silent (even imperceptible) signs left by their prey” (p 89)

This idea, that gatherers invoked, prayed and cast spells, whereas the real story tellers were hunters, opens a whole space for speculation about the role of narrative in our societies, and helps explain the rise of the detective story to its prominence. The detective story is the hunt, in this case for human prey and predator, and remains a dominating narrative form.

The tracing of prey, the interpretation of tracks across the sands of time becomes the main form of narration. The appeal of this hypothesis is that it also seems to suggest an explanation of the communal nature of narrative. Why do we, as a species, prefer to share stories to the extent that our cultural consumption more or less describes a power law distribution? If we believe that it is because stories originated as hunters’ tales then we know that sharing these meant sharing a set of communal ideas, insights and histories that could be used to create social identities and individual narratives. The stories told by the fireside are the stories that make up an ‘us’ of the individuals attending the fire. And as such they provide and grant an evolutionary advantage.

It also says something about the role of the author. If the author is a hunter, “examining a quarry’s tracks” as Ginzburg suggests, we understand our fascination with the artist, the author, the story teller in a much clearer light. The story teller provided for the community, not only an identity, but the actual food that the tribe needed.

If our minds are neurologically prepared to understand the world only in narratives, as some neuroscientists seem to have argued, then maybe the hunter’s prerogative is even more deeply, biologically seated. The narratives we need to survive are indeed those of the hunter.

But perhaps also those of the hunted.

Ginzburg III: On the serendipity engine

Ginzburg explores the role of traces in understanding the world, and usefully repeats the myth of the three sons of the King of Serendippo. The myth is originally found in several folk tales and, according to Ginzburg’s account of a 1557 re-telling, roughly goes like this: the three princes of Serendippo meet a merchant and tell him that they think an animal has passed by. The merchant, who is missing a camel, asks them to describe the animal, and they go on to say that it is a lame camel, blind in one eye, missing a tooth, carrying a pregnant woman, and bearing honey on one side and butter on the other. The merchant then thinks that they, since they know the camel so well, must be the ones who have stolen it – so he charges them. They manage to be acquitted when they show how they, by simple inferences, together deduced their description.

The genealogical connection to Sherlock Holmes, and to the detective story overall, is abundantly clear, and Ginzburg informs us that Horace Walpole later coined the term “serendipity” to refer to cases where “accident and sagacity” allow for knowledge discovery.

Now, in discussions about the web today some have called for more accident – thinking that the increasing focus on filters will create bubbles around us. That, in itself, is an interesting discussion – but not one that Ginzburg engages in. What he does, however, is suggest that the etymological genealogy of the word presents anyone who wants to design a serendipity engine with two interesting problems.

First, the design of accidents is not trivial. Information follows contexts, and simply randomizing information is hardly likely to give the experience of serendipity – it will rather feel like opening your spam email in the hope of spending time there to become wiser (I just tried that out of curiosity, and did not feel wiser, but rather more misanthropic afterwards). So designing accidental information discovery is one part of the art of building a serendipity engine.

Second, the princes exhibited significant sagacity. And this presents an even more interesting problem. What if the serendipity engine is not possible without users actively taking an interest in the knowledge and information that they consume? There is a deep problem here facing those who argue that we need technological fixes to fight off possible filter bubbles. Maybe filters are – to a certain extent – self-imposed products of our own making?
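
One way to read the first design problem is that pure randomness scores high on surprise but zero on legibility, while a filter bubble scores the reverse. A small sketch of that trade-off follows; the candidate items, the scores and the weighted geometric mean used to blend them are all invented assumptions, offered only to show why designed accidents differ from random noise.

```python
# A small sketch of why randomness is not serendipity: score candidates by
# blending relatedness to the reader's interests with distance from what they
# already know. Items, scores and weights are invented for illustration.

def serendipity_score(relatedness, novelty, w_related=0.6, w_novel=0.4):
    """Weighted geometric mean: either factor near zero drags the score down,
    so both pure spam (no relatedness) and pure bubble (no novelty) do badly."""
    return (relatedness ** w_related) * (novelty ** w_novel)

if __name__ == "__main__":
    candidates = {
        "random spam":              (0.05, 0.95),
        "another article you read": (0.95, 0.05),
        "adjacent-field essay":     (0.60, 0.70),
    }
    for name, (rel, nov) in candidates.items():
        print(f"{name:28s} -> {serendipity_score(rel, nov):.2f}")
```

The second problem, sagacity, resists this kind of treatment entirely, which is rather the point: no scoring function supplies the reader’s own interest.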

Ginzburg II: Complexity, clues and the emergence of conjectural computing

Ginzburg discusses the gulf between the natural sciences, with their increasing abstraction, and the concrete and detailed nature of the human sciences, almost always engaged with the individual case, about which natural science almost always remains silent. The “individuum est ineffabile” imperative will simply not work in history, for example, where the individual case remains a node in a network of clues used to understand and think about history. History, and the social sciences, Ginzburg seems to suggest, need to work from what he calls a conjectural paradigm – and we need to understand the merits and de-merits of that system rather than try for the mathematization of all disciplines.

This particular discussion has developed even further in our time, where the computational turn in the humanities to some represents the logical end point for sciences suffering from physics envy, now consigning themselves to uninteresting discoveries of patterns where deep insights were once to be had. Ginzburg writes:

“Galileo has posed an awkward dilemma for human sciences. Should they achieve significant results from a scientifically weak position, or should they put themselves in a strong scientific position but get meager results?” (p 110)

But Ginzburg’s warning should not, I think, be read as a warning against the use of computer models and methods in the human sciences, but rather as a reminder that what we need are models that allow for the messiness of the individual case. The computational turn can, in fact, mean that we can explore Morellian space for individual cases much more effectively, finding signs and symptoms that we otherwise would not detect as easily. Computer science can augment the search for semiotic patterns even in the individual case. And perhaps this is the way we have to move overall if we want to say something about society at all. Ginzburg notes the relationship between the need for a use of clues and the increasing complexity of the phenomena under study:

“The same conjectural paradigm, in this case used to develop still more sophisticated controls over the individual in society, also holds the potential for understanding society. In a social structure of ever-increasing complexity like that of advanced capitalism, befogged by ideological murk, any claim to systematic knowledge appears as a flight of foolish fancy. To acknowledge this is not to abandon the idea of totality. On the contrary the existence of a deep connection which explains superficial phenomena can be confirmed when it is acknowledged that direct knowledge of such a connection is impossible. Reality is opaque, but there are certain points – clues, symptoms – which allow us to decipher it.” (p 109)

The exploration of Morellian space, the idea of clues and indirect access to deep relationships seem to suggest the need for a computational extension of semiotics, perhaps a kind of conjectural computing. Which, of course, is what we have seen over the past 20 years or so with machine learning, probabilistic AI and more. But what it also suggests is that these methods of computer science actually are, or should be, methods used in the human and social sciences as well. Computer science, then, is not just a branch of, say, mathematics, but a fundamental shift in the way we think about and model an “ever-increasing complexity” and maybe, just maybe, the way that the two cultures ultimately will merge.

Ginzburg here points towards an observation that has, by now, been suggested many times: that computational methods and technologies actually present a different way to think about what constitutes scientific method – from theory to model, from experiment to simulation, from proof to search.

Such a shift encourages new technologies and inventions. The notion of “conjectural computing” as a way of describing it is not new, but helps us think about the nature and philosophy of the conjecture in a way that is rich and thought-provoking. Finally, Ginzburg in discussing this development also points out the rising importance of the aphorism. He writes:

“Side by side with the decline of the systematic approach, the aphoristic one gathers strength – from Nietzsche to Adorno. Even the word aphoristic is revealing. (It is an indication, a symptom, a clue: there is no getting away from our paradigm.) Aphorisms was the title of a famous work by Hippocrates. In the seventeenth century, collections of “Political Aphorisms” began to appear. Aphoristic literature is by definition an attempt to formulate opinions about man and society on the basis of symptoms, clues; a humanity and a society that are diseased, in crisis.” (p 109)

The rise of aphoristic thinking, the possibility and reality of aphoristic algorithms, algorithms that predict on the basis of clues – all of that is in evidence around us.

Ginzburg I: The exploration of Morellian space in the age of data science

Carlo Ginzburg explores, in a magnificent essay entitled “Clues: Morelli, Freud and Sherlock Holmes” featured in The Sign of Three: Dupin, Holmes, Peirce (ed. Eco, U and Sebeok, T, 1988), a series of ideas that touch deeply not only on the nature of semiotics and the divide between the natural sciences and the social and human sciences, but also on any number of threads that interest me deeply. In a series of posts, I will explore and discuss some of those threads – but for anyone interested in engaging with a fascinating piece of writing I can only recommend reading the essay and working through its rich ideas and complex structure carefully.

The focus of Ginzburg’s essay is the art historian Morelli who, under a pseudonym, published a work intended to help attribute paintings more accurately. Morelli’s method was simple, but a stroke of genius. Instead of focusing on the attributes that everyone associated with a master – such as the smile of the Mona Lisa and others in Leonardo’s paintings – Morelli suggested that the attribution of authorship might actually be done more precisely by looking at details in the paintings that the masters were not associated with.

Ginzburg reprints a series of collections of ears, fingernails and noses that Morelli studied carefully and used to re-attribute a large number of works of art in an often spectacular way. Morelli’s method – focusing on the involuntary details and clues provided by the author of a work – signalled a shift in thinking, a focus on symptoms, that Ginzburg attributes in part to Morelli being a doctor, used to collecting symptoms and using them to diagnose an underlying but directly inaccessible phenomenon.

The same method, of course, is also used in detective stories, and as Ginzburg shows, Sherlock Holmes himself, in one story, uses the uniqueness of an ear to establish that the unknown victim of a crime, whose ear is presented to Sherlock, is in fact a blood relative of one of the people he encounters in a case. But back to the method. Morelli focuses on details and patterns not often paid attention to, to draw conclusions about complex artefacts. There is a methodological similarity between this and what we can now do in data science that is interesting. Let’s think about it.

For any problem, we can state that all the data pertaining to it can be divided into two different sets. One set is the data we usually ask for when we try to solve or analyze the problem. This set – let’s call it the canon set of data – exists within a vast space of details and data that we usually do not associate with the problem. The canon set is arrived at through experience, theory and exploration of the problem at hand. We decide that the data in the canon set is important because it has proven to give us at least a certain percentage of success in understanding, and perhaps solving, the problem. Morelli’s suggestion, then, is that for many problems there is something even more efficient in the Morellian space outside of the canon set. The problem in Morelli’s day was that to explore Morellian space you had to be very clever and consciously think outside of the canon set. You had to focus on the search for data in Morellian space that could outperform the canon set of data used to solve the problem. With new pattern recognition technologies we may actually be able to search more extensively through Morellian space and identify data, or clusters of data, that allow us to solve problems more effectively.

Such a search would probably look something like this: first we identify the accessible data we have in a particular case, then we try to understand what the canon set looks like in order to establish a reference case, after which we try different combinations of data in Morellian space to see what could possibly outperform the canonical set of data used for a particular problem.
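
As a rough sketch of that procedure, one could treat the canon set as a baseline feature set and then search combinations of the neglected features for something that predicts better. The example below uses a synthetic dataset, an arbitrary choice of which columns count as the canon, and scikit-learn for the scoring; all of these are illustrative assumptions rather than a description of any particular study.

```python
# A rough sketch of a "Morellian" search: score the canonical feature set as a
# baseline, then try small combinations of the neglected features to see
# whether any outperform it. The data is synthetic and the canon/Morellian
# split is an assumption made purely for illustration.

from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic problem: 12 features, only some of which are truly informative.
X, y = make_classification(n_samples=500, n_features=12, n_informative=4,
                           n_redundant=2, random_state=0)

canon = [0, 1, 2]                                             # data we "usually ask for"
morellian = [i for i in range(X.shape[1]) if i not in canon]  # neglected details

def score(features):
    """Mean cross-validated accuracy using only the given feature columns."""
    model = LogisticRegression(max_iter=1000)
    return cross_val_score(model, X[:, features], y, cv=5).mean()

baseline = score(canon)
print(f"canon set {canon}: accuracy {baseline:.3f}")

# Step three: search small combinations of neglected features for something better.
best = (baseline, canon)
for k in (2, 3):
    for combo in combinations(morellian, k):
        s = score(list(combo))
        if s > best[0]:
            best = (s, list(combo))

print(f"best Morellian combination {best[1]}: accuracy {best[0]:.3f}")
```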

Thinking about any particular problem as typically being solved by a canonical set of data, but swimming in a Morellian space of neglected, detailed clues, is also useful for thinking about the structure of problems overall – and what happens as problems become more complex. That will be the subject of the next post.