One of the joys of being a researcher is having the chance to work intensively on a common vision with someone delightful who brings a sharp mind and a complementary approach. In that vein, we had a hugely productive six-week visiting OLnet Project fellowship with Ágnes Sándor, from Xerox. We are both concerned with developing an infrastructure capable of processing the knowledge-level claims that scholars make in their writing. Her work brings a natural language processing dimension to the annotation of significant rhetorical moves in texts, complementing our approach of providing socio-semantic tools for human annotation.

This is being applied to the OLnet Project’s work to build OER Collective Intelligence. Specifically, Ágnes worked on the same set of OER project reports that OLnet has been analysing for the Hewlett Foundation, enabling us to compare and contrast what sense humans and machines make of the same reports.

The natural language parsing technology that Ágnes brings looks for significant patterns in texts, such as those shown below:

Figure: Discourse analysis with the Xerox Incremental Parser

By way of context, when we started our work in KMi on this in 1998, which became the EPSRC Scholarly Ontologies Project (2001-04), there was no computational parsing that we knew of capable of detecting scholarly rhetorical moves. A decade on, there is a growing network of researchers focused on this challenge, the machines have got smarter, and here we are integrating machine and human annotation within Cohere as a sensemaking support system.

In a seminar summarising this work, we concluded that there remain many challenges on the technical, user experience, and theoretical fronts, but we’re passing a very satisfying milestone 🙂 My thanks to Ágnes, and of course KMi colleagues Anna and Michelle, and the (human!) OLnet researchers working on the project reports, for getting us this far. Particular thanks to Giota and Elpida for their extra time to be video interviewed as they compared their analyses with XIP’s. They can rest assured that there’s no doubt that we still need people to make sense of the world!… but when a machine takes only milliseconds to highlight interesting passages in a large report, we’re opening up exciting new vistas in sensemaking…

Integrating Human & Machine Document Annotation for Sensemaking

This event took place on 11th November 2010 at 2:30pm
Knowledge Media Institute, The Open University, UK

Simon Buckingham Shum, Ágnes Sándor

Anna De Liddo & Michelle Bachler

We report on progress made during the collaboration between KMi’s Hypermedia Discourse Group and Ágnes Sándor (Xerox Research Centre Europe, Parsing & Semantics Group). This is the outcome of her six-week OLnet Project Expert Fellowship at the OU, funded by the Hewlett Foundation, to develop Collective Intelligence for the Open Educational Resources (OER) community.

Our research investigates the overlaps and complementarities between the outputs from human analysts making sense of 120 OER project reports, using KMi’s Cohere semantic annotation and knowledge mapping tool, and machine annotation of the corpus by the Xerox Incremental Parser (XIP). XIP’s output is imported into Cohere to explore ways to visualize the combined human+machine output, and we present preliminary results from interviews with some of the analysts to elicit their views on XIP’s annotations.

We will present a video on this work as part of the CSCW 2012 video programme.

In the video we demonstrate the practical application of research on human and machine annotation of online documents to support reflective reading and collective sensemaking. We present an innovative research prototype which integrates discourse analysis software (XIP) with our Cohere Web Annotation and Knowledge-Mapping tool. We visualize an interactive scenario in which the two integrated technologies are used together in a single user experience. This dynamic scenario will give an inspiring vision of future CSCW systems, in which machine and human intelligence are combined to enhance reasoning power.

For the detailed version:

De Liddo, A., Sándor, Á. and Buckingham Shum, S. (2012, In Press). Contested Collective Intelligence: Rationale, Technologies, and a Human-Machine Annotation Study. Computer Supported Cooperative Work. Eprint: