Ginzburg V: Bertillonian word portraits in the age of tag clouds

Ginzburg dwells in his essay on the use of signs to identify individuals. His main example is the emergence of fingerprinting as a semiotic practice for resolving crowds into identifiable individuals. But he also looks at how graphology grew out of the idea that one’s characters – in writing – reflected one’s character – in psychology. Through a series of examples of signs and symptoms used to identify the individual (and to warn us of recidivist criminals, as in Dumas), he tries to show that our need for a connection between the semiotic and the biological has its roots in our need for accountability and legal responsibility. In passing, Ginzburg also mentions Bertillon’s use of word portraits, a practice suggested as a way to escape the simplistic physiognomic descriptions in early criminal records. And here we find something interesting.

Bertillon suggested that a linguistic description of an individual would be more interesting than one that merely measures a few biological qualities. He immediately ran into two problems. One was linguistic ambiguity – how do you create a verbal description of an individual precise enough to identify that individual uniquely? The other was that the system was, as Ginzburg notes, wholly negative. Someone who did not fit the description could be eliminated, but would a word portrait really identify someone uniquely? Anyone who has read a description of a criminal, or seen a so-called phantom image of one, understands how hard that task really is.

Fast forward to today. Would it not be possible to extract unique linguistic signatures from someone’s Facebook feed or Twitter stream? In fact, might it even be possible to identify an individual more exactly from their linguistic print than from any biological data? We could imagine a technology that creates a linguistic portrait against which we can be authenticated, and where we would not even know which idiosyncrasies truly and finally identified us uniquely as ourselves.

A Turing test of sorts, a matching against all the traces and threads that we have left online, that would be able to state with a percentage the likelihood that we are the same individual as that of the text sample we are being compared to.
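
As a thought experiment, such a matching can be sketched in a few lines. The snippet below compares character trigram profiles with cosine similarity – one standard stylometric technique – and reports the result as a percentage. The sample texts and the bare trigram features are illustrative simplifications; real authorship attribution uses far richer feature sets and far more text.

```python
from collections import Counter
from math import sqrt

def ngram_profile(text, n=3):
    """Frequency profile of character n-grams, a common stylometric feature."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def similarity(a, b):
    """Cosine similarity between two profiles, in [0, 1]."""
    dot = sum(a[g] * b[g] for g in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

known = ngram_profile("The quick brown fox jumps over the lazy dog, as it always does.")
sample = ngram_profile("The quick brown fox jumped over a lazy dog again.")
unrelated = ngram_profile("Zebras graze quietly beneath mountain snow.")

print(f"claimed author: {100 * similarity(known, sample):.0f}% alike")
print(f"stranger: {100 * similarity(known, unrelated):.0f}% alike")
```

The slightly eerie property is the one noted above: the profile captures idiosyncrasies the writer never chose and could not easily enumerate.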

A lot of interesting questions suggest themselves. A fingerprint can only identify the individual. Can a linguistic word portrait also suggest age? Can it suggest intoxication or any other state of mind? Is it possible to build a piece of software that can show in detail how our language typically changes over our life spans? Or degrades with a number of drinks?

Imagine a filter that detects the tell-tale signs of drunk tweeting and silently holds your tweets until you sober up. Or a filter that suggests that your actual mental age seems closer to 50 than 40 at this point in time. An interesting, if somewhat eerie, possibility perhaps.
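
For what it is worth, a toy version of such a filter is easy to sketch. The “tell-tale signs” below – stretched letters, shouting, exclamation marks – are invented stand-ins for whatever a real trained model would learn, and the threshold is arbitrary:

```python
import re

def intoxication_score(tweet):
    """Crude, invented heuristics standing in for a trained model."""
    signs = 0
    signs += len(re.findall(r"(\w)\1{2,}", tweet))           # "soooo", "heyyy"
    signs += sum(w.isupper() and len(w) > 2 for w in tweet.split())  # shouting
    signs += tweet.count("!") // 2                            # excess punctuation
    return signs

def filter_queue(tweets, threshold=2):
    """Post tweets that look sober; silently hold the rest for later."""
    posted = [t for t in tweets if intoxication_score(t) < threshold]
    held = [t for t in tweets if intoxication_score(t) >= threshold]
    return posted, held

posted, held = filter_queue([
    "Reading Ginzburg on clues and fingerprints tonight.",
    "I LOOOVE everyyybody heyyy!!!!",
])
```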

What would a word portrait of you look like?

Agency and autonomy II: Sorrow, pain and soul

The problem with determining agency is that it looks as if we are determining a quality in an actor in the moment. In fact, that is not what is happening. When Wittgenstein examines psychological states he notes that some of them have what he refers to as “Echte Dauer” – real duration – and some do not. Hence, it works to ask when pain starts and stops, or to speak of an instant of extreme pain, but if we were to do the same when it comes to sorrow the result would be almost comical. “Do you feel sorrow now? When did it start, when does it stop?” – Those are questions that make no sense.

Sorrow, Wittgenstein suggests, is something we discern over time in the tapestry of life, in a sequence of events. The attitude to a soul would probably be the same thing. We decide that something has agency not on the basis of a single moment, but the attitude to a soul is something that grows over time as we adopt and accept a pattern of behaviour as one that exhibits soul.

Could we ever do this for a machine? If we leave aside the rather simplistic answer that if we did, it would not be a machine anymore – a valid if somewhat insipid answer – we seem to end up in a very difficult discussion about the corporeality of soul. Can we say of a disembodied system that it has a soul? Would we ever allow the judgment that a set of algorithms should be viewed as a soul? Or do we think that only something that also has pain, can feel sorrow, fall in love – all of those things – has agency?

We could imagine a world in which agency is a package deal. We only accept the notion of agency in systems that we also believe feel pain, sorrow, joy or sadness. Such a view – let us call it the bundle hypothesis of agency – in its most expansive form says that only human beings, or what we perceive to be human beings, can have agency. If we ever wanted to build a system that exhibited agency, then, we would have to build a system that was, essentially, human.

But we could also imagine the contrary view: someone who accepts that a system can have agency without being able to feel pain or sorrow or any other human feeling. Such a view could be constructed in different ways, but a minimalistic one would state that we should treat systems that are most easily understood as having agency as in fact having it. This is an echo of Dennett’s test for intentionality, where the economy of description of a system determines whether we call it conscious or not. Such a delineation at first feels like a cop-out, since what we are looking for seems to be the answer to the metaphysical question of whether a system has agency or is conscious, but it proves to be an interesting answer from a legal viewpoint.

The reason is this. If we assume that economy of description can be used to delineate systems that we would like to say exhibit agency, then we have a good way of determining what systems we should hold legally liable, and what systems ultimately need to be reduced to their creators and designers in order to allocate responsibility.

Ginzburg IV: The origins of narrative

A large part of Ginzburg’s essay concerns the nature and origin of narrative. Ginzburg’s hypothesis is as daring as it can be controversial. He writes:

“Perhaps indeed the idea of a narrative, as opposed to spell or exorcism or invocation (Seppilli 1962), originated in a hunting society, from the experience of interpreting tracks. […] The hunter could have been the first ‘to tell a story’ because only hunters knew how to read a coherent sequence of events from the silent (even imperceptible) signs left by their prey” (p 89)

This idea – that gatherers invoked, prayed and cast spells, whereas the real story tellers were hunters – opens a whole space for speculation about the role of narrative in our societies, and helps explain the rise and growth of the detective story to its prominence. The detective story is the hunt, in this case for human prey and predator, and it remains a dominating narrative form.

The tracing of prey, the interpretation of tracks across the sands of time becomes the main form of narration. The appeal of this hypothesis is that it also seems to suggest an explanation of the communal nature of narrative. Why do we, as a species, prefer to share stories to the extent that our cultural consumption more or less describes a power law distribution? If we believe that it is because stories originated as hunters’ tales then we know that sharing these meant sharing a set of communal ideas, insights and histories that could be used to create social identities and individual narratives. The stories told by the fireside are the stories that make up an ‘us’ of the individuals attending the fire. And as such they provide and grant an evolutionary advantage.

It also says something about the role of the author. If the author is a hunter, “examining a quarry’s tracks” as Ginzburg suggests, we understand our fascination with the artist, the author, the story teller in a much clearer light. The story teller provided for the community, not only an identity, but the actual food that the tribe needed.

If our minds are neurologically prepared to understand the world only in narratives, as some neuro-scientists seem to have argued, then maybe the hunter’s prerogative is even deeper, biologically seated. The narratives we need to survive are indeed those of the hunter.

But perhaps also those of the hunted.

Ginzburg III: On the serendipity engine

Ginzburg explores the role of traces in understanding the world, and usefully retells the myth of the three sons of the King of Serendippo. The myth is originally found in several folk tales and, in the 1557 re-telling Ginzburg draws on, roughly goes like this: the three princes of Serendippo meet a merchant and tell him that they think an animal has passed by. The merchant, who is missing a camel, asks them to describe the animal, and they go on to say that it is a lame camel, blind in one eye, missing a tooth, carrying a pregnant woman, and bearing honey on one side and butter on the other. The merchant then thinks that since they know the camel so well, they must be the ones who have stolen it – so he charges them. They manage to be acquitted only when they show how they deduced the description together, by simple inferences.

The genealogical connection to Sherlock Holmes, and to the detective story overall, is abundantly clear, and Ginzburg informs us that Horace Walpole later coined the term “serendipity” to refer to cases where “accident and sagacity” allow for knowledge discovery.

Now, in discussions about the web today some have called for more accident – thinking that the increasing focus on filters will create bubbles around us. That, in itself, is an interesting discussion – but not one that Ginzburg engages in. What he does, however, is suggest that the etymological genealogy of the word presents anyone who wants to design a serendipity engine with two interesting problems.

First, the design of accidents is not trivial. Information follows contexts, and simply randomizing information is hardly likely to give the experience of serendipity – it will rather feel like opening your spam email in the hope of becoming wiser (I just tried that out of curiosity, and felt not wiser but rather more misanthropic afterwards). So designing accidental information discovery is one part of the art of building a serendipity engine.

Second, the princes exhibited significant sagacity. And this presents an even more interesting problem. What if the serendipity engine is not possible without users actively taking an interest in the knowledge and information that they consume? There is a deep problem here, facing those who argue that we need technological fixes to fight off possible filter bubbles. Maybe filters are – to a certain extent – products of our own making?
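
One crude way to operationalise “accident and sagacity” in code is to recommend items that fall partly outside a user’s declared interests, yet still share a tag with them – enough overlap for sagacity to have something to work on. The catalogue, tags and interests below are all invented for illustration:

```python
# Toy catalogue mapping items to topic tags; all names are invented.
CATALOGUE = {
    "camel physiology": {"biology", "animals"},
    "desert trade routes": {"history", "geography"},
    "sherlock holmes stories": {"fiction", "detection"},
    "bayesian inference": {"statistics", "detection"},
    "spam filtering": {"statistics", "computing"},
    "victorian london": {"history", "fiction"},
}

def serendipity_candidates(interests, catalogue):
    """Items partly outside the user's interests (the accident) that still
    share at least one tag with them (a hook for sagacity to work on)."""
    picks = []
    for item, tags in catalogue.items():
        if tags & interests and tags - interests:
            picks.append((item, tags & interests))
    return picks

picks = serendipity_candidates({"detection", "history"}, CATALOGUE)
for item, hook in picks:
    print(item, "via", sorted(hook))
```

A purely random recommender drops the overlap condition – and with it, arguably, the serendipity.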

Agency and autonomy I: Agency and an attitude to a soul

The notion of agency is essential to understanding our society. If we cannot say who did something, or what it means to be the one actually acting in a specific case, then all of the language games of legal liability, contractual freedom and intellectual property – to name but a few subjects – falter and fail. Agency lies at the core of our legal philosophies; it is a concept so deeply entrenched that it is easy to miss.

What, then, does it mean to be an agent, to act, to have agency? There is no simple answer here, but there is a simplistic one: we believe that we all act with agency, that whatever we aim towards, what we will, is what we should be responsible for. It is also fairly obvious that we never hold artefacts or systems responsible for their actions. Indeed, we do not think that systems act, they simply function.

We could make an observation here about language games and action. Wittgenstein captures the essence of action in saying that “my attitude to him is an attitude to a soul”, and in that simple sentence he also captures a lot of the complexity around agency and intention. We treat as responsible those systems towards which we have an attitude to a soul. If there is no soul, there can be no agency, and hence no responsibility. We do not arrest the machine that kills a worker. We examine it for flaws, and if necessary we fix it, but when looking for who is responsible we look towards whoever was responsible for keeping the machine working safely. Indeed, we even hesitate to say that the machine “killed” someone, because a machine lacks the ability to do so.

This is about to change, however. We are now entering an age where systems are becoming so complex as to effectively have some agency, or at least act as delegates for our agency. Look at an automated system for evaluating applications for university. Such a system will decline or approve applications. Now, if an application is denied erroneously we still do not fire the machine, but look to the programming and possibly examine the implementation of the software used. But what if the machine is learning, if it is evolving?

The problem here may be one of agency and states. We assume that if the original state of the machine is set by someone, then that someone is where we would look for agency. Now, what if we have a machine that evolves a system for handling applications, and where every state S(1)…S(N) after the initial one is produced by the software itself? Do we then hold the system responsible? Reset it if it has “learned the wrong things”? Or do we turn to whoever set the initial state S(0) and then designed the algorithm by which subsequent states were evolved? Our attitude to a soul seems to hinge not only on individual states of a system, but on something else.
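
The point can be made concrete with a toy model (not any real admissions system): the designer fixes S(0) and the update rule, but every subsequent state is produced by the system itself from the data it encounters.

```python
def evolve(initial_state, update, feedback):
    """The designer fixes S(0) and the update rule; every later state
    S(1)…S(N) is produced by the system itself from the data it sees."""
    states = [initial_state]
    for signal in feedback:
        states.append(update(states[-1], signal))
    return states

def nudge(state, outcome):
    # Update rule chosen by the designer: move 10% toward each outcome.
    return state + 0.1 * (outcome - state)

# S(0): an admission threshold of 0.5 – the designer's choice.
states = evolve(0.5, nudge, feedback=[0.9, 0.8, 0.9, 0.7])
print("S(0):", states[0])
print("S(4):", round(states[-1], 3))
```

Already in this trivial case the question bites: S(4) was chosen by no one, yet it decides who is admitted.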

Agency remains elusive. But the importance of agency is not. It seems quite clear that agency is the basis on which we design legal systems of responsibility and liability, and that we assign a moral value to acts only when the object of our judgment is something towards which, or whom, we have an attitude to a soul. Now, can we ever have an attitude to a soul towards a computational system? If not, how do we meet the challenge of ever more autonomous systems?

Ginzburg II: Complexity, clues and the emergence of conjectural computing

Ginzburg discusses the gulf between the natural sciences, with their increasing abstraction, and the concrete and detailed nature of the human sciences, almost always engaged in the individual case, about which natural science almost always remains silent. The “individuum est ineffabile” imperative will simply not work in history, for example, where the individual case remains a node in a network of clues used to understand and think about history. History, and the social sciences, Ginzburg seems to suggest, need to work from what he calls a conjectural paradigm – and we need to understand the merits and de-merits of that system rather than try for the mathematization of all disciplines.

This particular discussion has developed even further in our time, where the computational turn in the humanities to some represents the logical end point for sciences suffering from physics envy, now consigning themselves to uninteresting discoveries of patterns where deep insights were once to be had. Ginzburg writes:

“Galileo has posed an awkward dilemma for human sciences. Should they achieve significant results from a scientifically weak position, or should they put themselves in a strong scientific position but get meager results?”(p 110)

But Ginzburg’s warning should not, I think, be read as a warning against the use of computer models and methods in the human sciences, but rather as a reminder that what we need are models that allow for the messiness of the individual case. The computational turn can, in fact, mean that we can explore Morellian space for individual cases much more effectively, finding signs and symptoms that we otherwise would not detect as easily. Computer science can augment the search for semiotic patterns even in the individual case. And perhaps this is the way we have to move overall if we want to say something about society at all. Ginzburg notes the relationship between the need for a use of clues and the increasing complexity of the phenomena under study:

“The same conjectural paradigm, in this case used to develop still more sophisticated controls over the individual in society, also holds the potential for understanding society. In a social structure of ever-increasing complexity like that of advanced capitalism, befogged by ideological murk, any claim to systematic knowledge appears as a flight of foolish fancy. To acknowledge this is not to abandon the idea of totality. On the contrary the existence of a deep connection which explains superficial phenomena can be confirmed when it is acknowledged that direct knowledge of such a connection is impossible. Reality is opaque, but there are certain points – clues, symptoms – which allow us to decipher it.” (p 109)

The exploration of Morellian space, the idea of clues and indirect access to deep relationships, seems to suggest the need for a computational extension of semiotics, perhaps a kind of conjectural computing. Which, of course, is what we have seen over the past 20 years or so with machine learning, probabilistic AI and more. But what it also suggests is that these methods of computer science actually are, or should be, methods used in the human and social sciences as well. Computer science, then, is not just a branch of, say, mathematics, but a fundamental shift in the way we think about and model an “ever-increasing complexity” and maybe, just maybe, the way that the two cultures ultimately will merge.
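
A minimal example of conjectural computing in this sense is Bayesian inference from clues: the hidden state is opaque, but each symptom shifts our belief about it. The priors, likelihoods and clue names below are invented for illustration:

```python
from math import prod  # Python 3.8+

# Toy conjectural model: a hidden state deciphered through clues.
# All priors, likelihoods and clue names are invented for illustration.
PRIOR = {"forgery": 0.1, "authentic": 0.9}
LIKELIHOOD = {                        # P(clue | hidden state)
    "odd_earlobe":    {"forgery": 0.8, "authentic": 0.2},
    "odd_fingernail": {"forgery": 0.7, "authentic": 0.2},
}

def posterior(clues):
    """Naive-Bayes posterior over the hidden state given observed clues."""
    unnorm = {h: PRIOR[h] * prod(LIKELIHOOD[c][h] for c in clues) for h in PRIOR}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

print(posterior(["odd_earlobe", "odd_fingernail"]))
```

With no clues the prior dominates; with both clues observed, belief tips toward the initially improbable conjecture – direct knowledge of the hidden state is never required.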

Ginzburg here points towards an observation that has, by now, been suggested many times: that computational methods and technologies actually present a different way to think about what constitutes scientific method – from theory to model, from experiment to simulation, from proof to search.

Such a shift encourages new technologies and inventions. The notion of “conjectural computing” as a way of describing it is not new, but helps us think about the nature and philosophy of the conjecture in a way that is rich and thought-provoking. Finally, Ginzburg in discussing this development also points out the rising importance of the aphorism. He writes:

“Side by side with the decline of the systematic approach, the aphoristic one gathers strength – from Nietzsche to Adorno. Even the word aphoristic is revealing. (It is an indication, a symptom, a clue: there is no getting away from our paradigm.) Aphorisms was the title of a famous work by Hippocrates. In the seventeenth century, collections of “Political Aphorisms” began to appear. Aphoristic literature is by definition an attempt to formulate opinions about man and society on the basis of symptoms, clues; a humanity and a society that are diseased, in crisis.” (p 109)

The rise of aphoristic thinking, the possibility and reality of aphoristic algorithms, algorithms that predict on the basis of clues – all of that is in evidence around us.

Ginzburg I: The exploration of Morellian space in the age of data science

Carlo Ginzburg explores, in a magnificent essay entitled “Clues: Morelli, Freud and Sherlock Holmes”, featured in The Sign of Three: Dupin, Holmes, Peirce (ed. Eco, U and Sebeok, T, 1988), a series of ideas that touch deeply on the nature of semiotics, on the divide between the natural sciences and the social and human sciences, and on any number of threads that interest me. In a series of posts I will explore and discuss some of those threads – but for anyone interested in engaging with a fascinating piece of writing I can only recommend reading the essay itself and working through its rich ideas and complex structure carefully.

The focus of Ginzburg’s essay is the art historian Morelli, who, under a pseudonym, published a work intended to help attribute paintings more accurately. Morelli’s method was simple, but a stroke of genius. Instead of focusing on the attributes everyone associated with a master – the smile of Mona Lisa and others in Leonardo’s paintings, say – Morelli suggested that the attribution of authorship might actually be done more precisely and exactly by looking at details in the paintings that the masters were not associated with.

Ginzburg reprints a series of collections of ears, fingernails and noses that Morelli studied carefully and used to re-attribute a large number of works of art, often spectacularly. Morelli’s method – focusing on the involuntary details and clues provided by the author of a work – signalled a shift in thinking, a focus on symptoms, that Ginzburg attributes in part to Morelli being a doctor, used to collecting symptoms and using them to diagnose an underlying but directly inaccessible phenomenon.

The same method, of course, is also used in detective stories, and as Ginzburg shows, Sherlock Holmes himself, in one story, uses the uniqueness of an ear to establish that the unknown victim of a crime, whose ear is presented to him, is in fact a blood relative of one of the people he encounters in the case. But back to the method. Morelli focuses on details and patterns not often paid attention to, in order to draw conclusions about complex artefacts. There is an interesting methodological similarity between this and what we can now do in data science. Let’s think about it.

For any problem, we can state that all the data pertaining to the problem can be divided into two different sets. One is the set of data we usually ask for when we try to solve or analyze the problem. This set – let’s call it the canon set of data – exists within a vast space of details and data that we usually do not associate with the problem. The canon set is arrived at through experience, theory and exploration of the problem at hand. We decide that the data in the canon set is important because it has proven to give us at least a certain rate of success in understanding, and perhaps solving, the problem. Morelli’s suggestion, then, is that for many problems there is something even more efficient in the Morellian space outside of the canon set. The problem in Morelli’s day was that to explore Morellian space you had to be very clever and consciously think outside of the canon set. You had to focus the search on data in Morellian space that could outperform the canon set of data used to solve the problem. With new pattern recognition technologies we may actually be able to search more extensively through Morellian space and identify data, or clusters of data, that allow us to solve problems more effectively.

Such a search would probably look something like this: first we identify the accessible data we have in a particular case, then we try to understand what the canon set looks like in order to establish a reference case, after which we try different combinations of data in Morellian space to see what could possibly outperform the canonical set of data used for the problem.
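
That search can be mimicked in miniature: exhaustively score feature subsets on a toy attribution problem, using leave-one-out nearest-neighbour accuracy, and compare them against the canon set. The data, feature names and scorer are all invented, and the numbers are rigged so the neglected details win; a real search would use proper datasets, cross-validation and something smarter than brute force:

```python
from itertools import combinations

# Toy attribution data: rows are paintings, columns are measured details.
# Columns 0-1 are the "canon set" (famous attributes); columns 2-3 lie in
# Morellian space (the details nobody thought to look at).
FEATURES = ["smile", "composition", "earlobe", "fingernail"]
X = [
    [0.9, 0.8, 0.1, 0.2],  # master
    [0.8, 0.9, 0.2, 0.1],  # master
    [0.9, 0.7, 0.9, 0.8],  # imitator
    [0.8, 0.8, 0.8, 0.9],  # imitator
]
y = ["master", "master", "imitator", "imitator"]

def loo_accuracy(cols):
    """Leave-one-out accuracy of 1-nearest-neighbour on the chosen columns."""
    correct = 0
    for i in range(len(X)):
        dists = [
            (sum((X[i][c] - X[j][c]) ** 2 for c in cols), y[j])
            for j in range(len(X)) if j != i
        ]
        correct += min(dists)[1] == y[i]
    return correct / len(X)

canon = (0, 1)
best = max(
    (cols for k in (1, 2) for cols in combinations(range(len(FEATURES)), k)),
    key=loo_accuracy,
)
print("canon:", loo_accuracy(canon))
print("best:", [FEATURES[c] for c in best], loo_accuracy(best))
```

On this (rigged) toy data the canon features fail completely, while a single Morellian feature – the earlobe – separates master from imitator perfectly.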

Thinking about any particular problem as typically being solved by a canonical set of data, but swimming in a Morellian space of neglected, detailed clues, is also useful for thinking about the structure of problems overall – and about what happens as problems become more complex. That will be the subject of the next post.