Big Humanities workshop on Big Data



As the IEEE conference on Big Data moved into a new phase, an incredible collection of humanities and technology partners gathered to share stories of what big data means to them.

Ada Lovelace described herself as a poetical scientist and an Analyst. I would call her a storyteller and codifier. She was able to both tell stories and encode them in a repeatable note format… With her father the poet and her mother the mathematician, she learnt from an early age to flit between writing and analysing. She learnt how to communicate and write while applying logic and analysis. I’m a bit of a fan. My time travelling self would no doubt have fallen for the most important person in the history of computer science (I suspect it might have been love in just one direction, so let’s move on, shall we). But before we do, it’s worth noting (and this is stealing from a conversation with the Met Office disrupter Mike Saunby) that working with Babbage not only led to computer science, it also led to Joseph Whitworth devising standards (without standards you wouldn’t be able to replace a light bulb, tune a TV, listen to a record, drive a car, or take a medicine)… But I digress. So what am I trying to say? It’s this: that interdisciplinarity across the humanities is not just a desirable thing, it is the life force of discovery that flows through human existence. And history continues to show us that when it happens, sparks (literally and metaphorically) will fly. Note: October the 15th is Ada Lovelace Day.

Getting back to the conference. It’s been a hard start, I won’t deny it (not least because the four people that might, just might, read this blog have been incredibly patient with my rants about the lack of people…). Today is another day altogether. Today I feel human. I feel connected. Today is different. I apologise for yesterday. No, actually I don’t. I want to shout at being subjected to a difficult two days of struggling with poorly presented academic knowledge. Today is incredible.

In a wonderful interplay between acronyms and metaphors I have been delighted with the truly interdisciplinary work of technologists and humanities scholars collaborating on meaningful human big data, expressing doubt and confidence in equal measure. A dash of humour has been thrown in, but underneath has been a rolling and rising discourse that has not just got under the meaning of digital humanities but has started to get under my skin. I want to know more.

What’s happened? What has happened is that Dr Tobias Blanke and Dr Mark Hedges of King’s College London have put together a remarkable workshop on Big Humanities.

I want the reader to understand that the humanities subjects presented were further from my academic knowledge than the computer science has been. Yet in this set of talks the computer science is so much more vivid and exciting. The canvas of humanities enables me to understand. I don’t know about Victorian poetry texts. Yet I could immerse myself in the understanding of a subject elegantly presented as a visual narrative…

The science appears richer, more understandable, further advanced and more meaningful when presented at the heart of humanities.

Technologists have not held back on owning their subject – OCR is thrown in next to NLP and Cluster Analysis. (And we don’t want this to stop – researchers need to use their languages if they are to give passionate academic talks.) But maybe we need a guide? Some simple How, Why, What, Wows of big data computational techniques: is Hadoop better for real-time analysis, and if so why? What is OCR, and how does it differ from pattern recognition? Etc. This isn’t a barrier – it’s an opportunity.

There has been a wealth of presentations – and you’ll have to go to the workshop organisers for academic knowledge on these. But some thoughts, insights and connections that I have had today go a bit like this.

“[A]t a time when the web is simultaneously transforming the way in which people collaborate and communicate, and merging the spaces which the academic and non-academic communities inhabit, it has never been more important to consider the role which public communities – connected or otherwise – have come to play.” (Dunn & Hedges, 2012)

Making Data Matter

Data mining the 1918 flu pandemic

Viral Networks in 19th Century Newspapers





In the closing comments a panel of speakers came together to start to discuss themes, thoughts and what-nexts for humanities.

I liked Andrew Prescott’s reflection as a scholar: “As historians we don’t know who the user is. Is a curator a user?” My thought on this is that we don’t need a new form of ‘user centred histories’; instead we should rethink how we collaborate, and embrace the idea of historians as participants in a co-design process. Has anyone done this? Are there personas documenting typical (or atypical) historians? Are there design guidelines or ‘branding’ documents for working with history? Is this something we could look at? Would this make working across disciplines an easier thing? Or am I just another voice in a mix of ideas that is just finding its feet?

Another clear, big difference between this workshop and the technology-focussed workshops and talks was the variety of data. And while Variety is a core theme of IEEE Big Data, the definition of variety is actually pretty narrow in terms of the talks I saw. All of the talks mentioned variety and then went on to show something that handled numerical data in a structured database. In humanities it seems the problem is more complex, potentially much harder (for machines) and crosses time and materiality to connect with everywhere humans have made their mark. In these talks it included 19th century newspapers, interviews, travelogues, transcripts, photographs, films, guidebooks, poems, private letters, journals and novels. This richness of data makes the problem of data grow exponentially into new dimensions. There was a lot of talk about language and translation – which should be a reasonably trivial problem once it goes through Google Translate? Right? Yes, but does Google Translate have a setting for 17th century vernacular? Does it have a setting for how a small community in the Lake District describes its world? And how often does language change? How many time zones do we need to encode to capture textual data? The problem was big data, and now I really don’t know what kind of data it is. But isn’t that the deal – it is at this very point of dealing with data, when your head spins and your hard drive melts, that you know you’re dealing with something that’s possibly bigger than big data?

At times the excitement in the room morphed into nervousness. Is this too big? And just like our friends in IEEE who worry about the end of Moore’s law, the humanities were asking: is this the end of theory? No, said Barry Smith, “We must be prepared for failure. As Beckett said: fail and fail better”. He then went on to remind us that “it’s not the first time we’ve had big data. It’s happened before and we must understand the future from the past”.

As someone pointed out in the audience: there is going to be a huge argument in the humanities…

And Christie Walker from the AHRC closed things off with: “It’s going to be great fun to stand back and see what happens”. That indeed it is.


Me, the AHRC and the IEEE Big Data 2013


I trained as an electronic engineer. My PhD exploring digital neural network models of motion perception took me deep into technical detail and gave me that badge of a doctorate in the philosophies of digital electronics. But you know this, right? You know that I radically changed the course of my interests and my professional life when I stepped stage left from the labs of Imperial College and walked into the studios of the Royal College of Art. Why did I do this? I did it because in the studios of Computer Related Design at the RCA there was a connectivity to people. Playful technologies were used to carry out research and explore stories about the way people could interact with digital tech. They explored stories that predicted the future and quite radically positioned, for me and much of the rest of the world, a completely different way to see the route to discovery. I felt re-tuned to a sense of purpose I hadn’t felt since I was hacking my BigTrak in the 1980s. And there was no looking back.

Until now.

In July this year the Arts and Humanities Research Council (AHRC) asked whether I would like to join them at the 2013 IEEE International Conference on Big Data in Santa Clara in October. A chance for me to return to my roots and to reflect on where the arts and humanities research community fits into the Big Data landscape. A chance for the AHRC to throw somebody new into this melting pot and uncover some insights.

As I registered and was given the ubiquitous conference bag I wondered what insights I could harvest in this space, where everyone from global governments to space research agencies to weather scientists to museum curators to big supermarkets and one man and his dog is on their way to mow this particularly large data meadow.

I think we need a bit of context here. So let’s start with what is the IEEE? The Institute of Electrical and Electronics Engineers (pronounced Eye Triple E) traces its roots back to 1884, when a collective of enlightened engineers came together “to support professionals in their nascent field and to aid them in their efforts to apply innovation for the betterment of humanity”. The core value and aim of the betterment of humanity rings true. It is not for the betterment of machines, it is for people. This value of humanity makes me feel good. It makes me feel that the IEEE are the right people to be tackling Big Data. Because isn’t it all about people at the end of the day? Does anyone ever wish they had spent more time with machines? This ideal, which has travelled through from the peak of Victorian society to the 21st century, makes me feel pretty good about the IEEE as a perfect partner for arts and humanities research. And with this in mind I enter the conference room…

Day one – First IEEE workshop on data visualisation.

What is Big Data? Good question – with a pretty clear party line broadcast from the keynote and throughout the 15 or more talks on the day. Big Data is a problem of a number of Vs.

Volume (how much data)
Velocity (how fast is it moving)
Variety (images, text, video, sensor)
Veracity (incomplete, rumours, dirty)

Now don’t get me wrong, but these don’t sound like definitions that lead to the betterment of humanity. These sound like problems for machines. I think we can find some more appropriate Vs to throw in there. How about values, validity, visibility and voices? What is it about problems just for machines that takes a powerful headline such as “Big Data” and pushes it right back into a meeting for computer scientists fine-tuning algorithms? There seems to be no sense of purpose or of any grand challenge to solve. Maybe it’s me, but I wonder if anyone has asked the question why?

And this theme continued. I was disappointed (you might have guessed this) as the focus of every talk was on models for processing more big data at faster speeds. There was very little about the human side of big data. There was, to my surprise at this being the first IEEE workshop on visualisation of big data, nothing visual at all – we got close a couple of times but the speakers quickly moved on as if embarrassed that a bit of beauty and clarity might somehow lessen the science. Is this a crisis of confidence? Almost as if the original aims of the IEEE had been lost? That’s not anyone’s fault; I think it is a global problem of technologists. People who are passionate (and the speakers were incredibly passionate) about machines are incredibly focussed on improving machines. Not improving machines for the betterment of humanity.

And I know I go on about this, but by making things visual, or even going further and making them tangible, we bring them into our human world. When this happens people react. People leap into action and want to know more. And the arts world is no stranger to the realities of engaging people in the physical world. When the UK’s largest public sculpture, the Angel of the North by Antony Gormley, was first commissioned it was met with public outrage, with people voicing real anger at the prospect of money being spent on something landing in their neighbourhood. It cost around £1m and has stood the test of 15 years, with over 90,000 people seeing it every day. It is a thing now to be adored – a beacon of hope for the people of the north. There were even proposals to make an Angel of the South… However, the accountability that comes with something being physical is something that we as data researchers need to be aware of. We know what happens when data is not made available (the MPs’ expenses scandal in 2009). And we should remember this when we present our stories to our community.

The IEEE community appears to have lost sight of people and the sensory world that people live in.

Can we bring together the IEEE community with arts researchers exploring big data in a visual way?

The final talk of the day was given by Klaus Mueller, who presented a case for visual feedback during the long (24 hour) processing periods of big datasets. He illustrated the description of his algorithm with reference to a scientist working in the field who required fast visual feedback on the data she was sampling while flying over the Arctic Circle in a research plane. The visual overview she obtained quickly enabled her to get a picture of the data stories and follow new leads while in the sky. The details are sketchy, but the use of a story is powerful. It tells us what the data was, why speed is important and how visual data is used in the field.

Stories of how people use big data are powerful mechanisms for understanding the role of big data in our society.

There is an opportunity for the AHRC to use its research base to harvest these stories and present them to the wider world.


On the second day we were met with a positive speech from the keynote on the value of the intersection of technology and people. The speaker then dismissed people and spent an hour on technologies: how databases (mostly in SQL) can be managed faster, how they can deal with greater amounts of data and how their team is doing it. Once again people seem to have been left out of this equation.

Talking over lunch we decided that this conference was exactly the thing we needed to spur us, as arts and humanities researchers, to define what the grand challenges are for big data, how they will impact on human lives, and how they can live up to the very clear mission of the original IEEE collective in 1884 – the betterment of humanity. This is something I wanted to return to throughout this week. I’m not sure what these challenges are – I just know that there’s something incredibly powerful that we can bring to this space and something that could define the future of interdisciplinary research. (Note to self: turn down the bold-statement producing effects and think more clearly.)

Tomorrow’s talks in the workshop on Big Data and The Humanities look promising. Maybe here I’ll get back in touch with humanity and maybe, just maybe, we can start to debate the value and meaning of big data to our lives.

More tomorrow.




Hacking in front of an audience – Met Office at the V&A