Overview of e-Humanities Group ‘New Trends in e-Humanities’
Updated: 25 November 2013
New Trends in e-Humanities
(formerly called Research meeting)
KNAW e-Humanities Group
The e-Humanities Group generally holds a research meeting every Thursday afternoon, and persons interested in the topics of the presentations are welcome to attend. Meetings are held from 15.00 – 17.00 with one or two presentations on current research, followed by discussion.
The research meetings are held in the facilities of the Meertens Institute, Joan Muyskenweg 25, Amsterdam (for directions, see: http://www.meertens.knaw.nl/cms/en/contact), in the Symposiumzaal located on the ground floor.
Scholars engaged in the broad domain of e-research in the humanities and social sciences are invited to share their work at these meetings; please contact firstname.lastname@example.org
Academic Year: 2013-2014
(dates in reverse chronological order)
Sarven Capadisli, Web technologist
Publishing & Linking Statistical Data
[Details to follow]
Frank van der Most, DANS and eHg
[Title & details to follow]
Susan Reilly, Royal Netherlands Library (KB)
[Title & details to follow]
Joris van Zundert, Huygens Instituut-KNAW
[Title & details to follow]
Huub Everaert, Hogeschool Utrecht
[Title & details to follow]
eHg Annual Lecture
Tim Tangherlini, UCLA
Tracking Trolls: New Challenges from the Folklore Macroscope
Humanities scholars now routinely confront a vexing problem of scale—how does one deal with the complexity presented by the massive electronic resources at our disposal without losing sight of compelling research questions? In a recent article, Katy Börner proposes the theoretically tantalizing concept of the “macroscope.” For Börner, “Macroscopes provide a ‘vision of the whole,’ helping us ‘synthesize’ the related elements and detect patterns, trends, and outliers while granting access to myriad details. Rather than make things larger or smaller, macroscopes let us observe what is at once too great, slow, or complex for the human eye and mind to notice and comprehend” (Börner 2011, 60). The macroscope holds the promise of wedding “close reading” approaches, which have been a fundamental analytical approach in folkloristics since the beginning of the field, to what Franco Moretti has called “Distant Reading” where “Distance… is a condition of knowledge: it allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes—or genres and systems” (Moretti 2000, 57). In this presentation, I explain how we began developing a macroscope for the study of folklore, based on ongoing work with the Evald Tang Kristensen collection of Danish folklore (~250,000 stories, jokes, songs, riddles, and descriptions of everyday life). From problems of acquisition to problems of presentation, from problems of classification to problems of discovery, I explore some of our solutions and some of our unexpected discoveries as we focus the macroscope on this 19th century collection.
Timothy R. Tangherlini teaches folklore, literature and cultural studies at the University of California, where he is a professor in the Scandinavian Section, and the Department of Asian Languages and Cultures. He is also an affiliate of the Center for Digital Humanities, the Center for Medieval and Renaissance Studies, the Religious Studies Program, and a faculty member in the Center for Korean Studies and the Center for European and Eurasian Studies.
He has published widely on folklore, literature, film and critical geography. His main theoretical areas of interest are folk narrative, legend, popular culture, and critical geography. His main geographic areas of interest are the Nordic region (particularly Denmark and Iceland), the United States, and Korea.
He is the author of Interpreting Legend: Danish Storytellers and their Repertoires (1994), Talking Trauma. Paramedics and Their Stories (1998), and the co-editor of Nationalism and the Construction of Korean Identity (1999), and Sitings. Critical Approaches to Korean Geography (2008). He has also produced or co-produced two documentary films, Talking Trauma: Storytelling Among Paramedics (1994) and Our Nation. A Korean Punk Rock Community (2002). His current work focuses on computation and the humanities. In 2012, along with James Abello and Peter Broadwell, Tim Tangherlini published a paper called ‘Computational Folkloristics’ in: Communications of the ACM vol 55, no. 7, pp. 60-70. His most recent book, Danish Folktales, Legends, and Other Stories (2013) is a hybrid publication that includes the rich digital interface, The Danish Folklore Nexus.
Clement Levallois, Erasmus University Rotterdam
NESSHI and GEPHI:
Sociology of science as a breeding ground for tool building in the digital humanities.
This is a report on NESSHI (www.nesshi.eu), a European project in sociology of science started in 2011, and on the tools for data visualization created in the course of this project. Most of these tools were built on top of Gephi, an open-source network visualization software. I will present in detail how Gephi has evolved to become a platform of choice for key steps of data management (not only visualization), likely to be useful in a wide range of scientific domains.
Clement Levallois is a research associate at Erasmus University Rotterdam and at the eHumanities group of the KNAW. Starting in January 2014 he will be assistant professor at EMLyon Business School, in France. With a PhD in history of science based on archival work, Clement shifted to computational modes of research when arriving in the Netherlands in 2008.
Frank van Harmelen, VU University Amsterdam
Who are the members of ABBA? or: The Cluster Heuristic: How a simple heuristic can distinguish correct from incorrect answers.
Wikipedia is a well-known encyclopedic resource in widespread use, both in daily life and as a resource for scientists. But since its start in 2001 there have been discussions about its quality. This has become even more urgent now that a database version of Wikipedia exists. If computers send queries to this encyclopaedic database, how will we ever know which answers are correct?
To our surprise, a very simple heuristic that dates back to research in psychology from the 1970s is very effective in automatically recognising which answers are correct and which are not, without any prior knowledge of the topic.
I will present this heuristic and our experimental findings on its effectiveness.
NO Research meeting due to eHg Workshop ‘Critical perspectives on digital humanities’ at Public Library Amsterdam
Complexity in the Digital Humanities
Complexity pervades all sciences, and will play a pivotal role in twenty-first-century science. The fundamental idea is that we cannot understand a subject through its microscopic constituents alone, but only through their interactions. In recent times, this approach has been ascending in the humanities because of the increasing availability of large amounts of digitised data, ranging from large corpora of digitised texts, such as the Google Books corpus, to online services such as Twitter and Facebook. Moreover, historical archives are being opened up through digitisation, drawing historians into the world of complexity. These developments offer many new possibilities, but also many computational and conceptual challenges. This workshop will reflect on the role of complexity in the digital humanities, and it will cover a broad range of subjects.
Marcel Ausloos is professor emeritus in statistical physics from the University of Liège (Belgium). He has authored over 350 papers in various fields of statistical physics. Over the years, Ausloos applied methods from physics to fields of the social sciences and humanities, ranging from language evolution to financial market crashes.
Diego Garlaschelli completed his PhD in physics at the University of Siena (Italy) in 2005, after which he held various positions in Siena (Italy), Oxford (UK) and Pisa (Italy). Since 2011 he has been an assistant professor at the Lorentz Institute for Theoretical Physics in Leiden (NL) and an associate fellow at the CABDyN Complexity Center in Oxford (UK). His research focuses on complex networks, human behaviour and economics.
Stefan Dormans studied Human Geography at the University of Nijmegen. He obtained his PhD at the Radboud University Nijmegen in 2007, which entailed a narrative analysis of urban tales from two medium-sized Dutch cities. After this, Dormans worked at the Virtual Knowledge Studio for the Humanities and Social Sciences (VKS) and as an Assistant Professor at the Radboud University Nijmegen. Currently, he works as Programme Development Officer at the ICR department of the Nijmegen School of Management.
The workshop is open to all who are interested. There is no fee, but seating is limited so you are kindly requested to register in advance by sending a mail to Anja de Haas (email@example.com).
Vincent Traag, KITLV, Leiden & eHumanities Group, Amsterdam
Jeannette Haagsma, eHumanities Group, Amsterdam
For details and updates, please visit http://ehumanities.nl/complexity-in-the-digital-humanities/
10:15—11:00 Keynote: Marcel Ausloos – Measuring complexity in texts
Abstract: A nonlinear dynamics approach can be used to quantify complexity in written texts. As a first step, a one-dimensional system is examined: two written texts by one author (L. Carroll) are considered, together with one translation into an artificial language, Esperanto. They are mapped into time series, and their shuffled versions are used to obtain a baseline. Two different one-dimensional time series are investigated: (i) one based on word lengths (LTS), (ii) the other on word frequencies (FTS). It is shown that the generalized Hurst exponent and the derived multifractal functions of the original and translated texts show marked differences. The original texts have a skewed structure, in contrast to the mere parabola found for shuffled texts. Moreover, the Esperanto text has more extreme values. This suggests that finished texts behave like cascade models, with multiscale, time-asymmetric features. A discussion of the difference and complementarity of mapping into an LTS or FTS is presented.
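The first step of such an analysis, mapping a text onto a word-length time series (LTS) and constructing a shuffled baseline, can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's actual code; the example text and the tokenisation rule are assumptions made for the sketch.

```python
import random
import re

def word_length_series(text):
    """Map a text onto its word-length time series (LTS):
    the i-th value is the length of the i-th word."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    return [len(w) for w in words]

def shuffled_baseline(series, seed=42):
    """Shuffle the series to destroy its temporal structure while
    keeping the distribution of word lengths intact."""
    baseline = list(series)
    random.Random(seed).shuffle(baseline)
    return baseline

def profile(series):
    """Cumulative deviations from the mean: the 'profile' from
    which (multifractal) detrended fluctuation analysis starts."""
    mean = sum(series) / len(series)
    total, prof = 0.0, []
    for x in series:
        total += x - mean
        prof.append(total)
    return prof

text = "Twas brillig and the slithy toves did gyre and gimble in the wabe"
lts = word_length_series(text)
print(lts)                # word lengths in reading order
print(profile(lts)[-1])   # last profile value is ~0 by construction
```

Comparing fluctuation statistics of the original series against its shuffled baseline is what separates genuine temporal structure from the mere distribution of word lengths.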
11:00—12:00 Session 1 – Literature & Music
13:15—14:00 Keynote: Diego Garlaschelli – Reconciling long-term cultural diversity and short-term collective social behavior
Abstract: An outstanding open problem is whether collective social phenomena occurring over short timescales can systematically reduce cultural heterogeneity in the long run, and whether offline and online human interactions contribute differently to the process. Theoretical models suggest that short-term collective behavior and long-term cultural diversity are mutually excluding, since they require very different levels of social influence. The latter jointly depends on two factors: the topology of the underlying social network and the overlap between individuals in multidimensional cultural space. However, while the empirical properties of social networks are intensively studied, little is known about the large-scale organization of real societies in cultural space, so that random input specifications are necessarily used in models. Here we use a large dataset to perform a high-dimensional analysis of the scientific beliefs of thousands of Europeans. We find that interopinion correlations determine a nontrivial ultrametric hierarchy of individuals in cultural space. When empirical data are used as inputs in models, ultrametricity has strong and counterintuitive effects. On short timescales, it facilitates a symmetry-breaking phase transition triggering coordinated social behavior. On long timescales, it suppresses cultural convergence by restricting it within disjoint groups. Moreover, ultrametricity implies that these results are surprisingly robust to modifications of the dynamical rules considered. Thus the empirical distribution of individuals in cultural space appears to systematically optimize the coexistence of short-term collective behavior and long-term cultural diversity, which can be realized simultaneously for the same moderate level of mutual influence in a diverse range of online and offline settings.
14:00—15:00 Session 2 – Big Data
15:15—16:00 Keynote: Stefan Dormans – TBA
16:00—16:15 Discussion & closing
1. Vincent Traag, KITLV and eHumanities Group
Community structure in complex networks
Many complex networks exhibit some modular structure: they tend to have clusters of nodes with many links inside clusters and few links across clusters, which are commonly called “communities”. In this presentation, I will sketch a broad overview of the topic. Two core problems of community structure will be highlighted and addressed: the so-called resolution limit and the problem of community structure in random networks. Only a few methods do not suffer from the resolution limit, and I will introduce one such method. The second problem will be addressed from the viewpoint of the significance of community structure. Interestingly, the two problems seem to be at odds, so that no method can address both simultaneously. I will conclude with some practical pointers for uncovering community structure and interpreting the results.
1. Traag, V. A., Krings, G. & Van Dooren, P. Significant scales in community detection. Scientific Reports 3, 2930 (2013). http://www.nature.com/srep/2013/131014/srep02930/full/srep02930.html
2. Traag, V. A., Van Dooren, P. & Nesterov, Y. Narrow scope for resolution-limit-free community detection. Physical Review E 84, 016114 (2011). http://link.aps.org/doi/10.1103/PhysRevE.84.016114
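The kind of modular structure described in the abstract can be illustrated with a small sketch. Note that this uses the standard greedy modularity method from the networkx library, which belongs to the modularity-based family that suffers from the resolution limit, not the significance-based method introduced in the talk; the toy graph is an assumption for illustration.

```python
import networkx as nx
from networkx.algorithms import community

# Two 5-cliques joined by a single edge: a textbook example of
# community structure (many links inside, one link across).
G = nx.barbell_graph(5, 0)

# Greedy modularity maximisation (Clauset-Newman-Moore). Methods
# of this family are affected by the resolution limit discussed
# in the presentation, although not on a graph this small.
communities = community.greedy_modularity_communities(G)
print(sorted(sorted(c) for c in communities))
# → [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

On this graph the two cliques are recovered exactly; the resolution limit only bites once small communities are embedded in a much larger network.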
Vincent Traag completed his PhD at the Department of Applied Mathematics at the Université catholique de Louvain in Belgium, after which he joined the e-Humanities Group and the KITLV. In general, he is interested in social networks, social dynamics and conflict, with a desire to combine mathematical modeling and the social sciences.
Originally Traag began studying computer science, but after two years he decided to switch to sociology. However, he found formal analysis lacking in sociology, and he took up an additional year of mathematics. After graduating cum laude in sociology at the University of Amsterdam he joined the research group on “Large Graphs and Networks” at the Department of Applied Mathematics at the Université catholique de Louvain. In his thesis he addresses two topics in network science. The first topic focuses on finding communities—groups of densely connected nodes—in networks. Some methods for finding communities suffer from a drawback: they cannot detect small communities in large graphs. He analyzed this problem, known as the resolution limit, in depth, and showed that only a few methods do not suffer from it. Secondly, he investigated negative links in networks—links that represent conflict or hatred. On the one hand, the problem is similar to community detection: the focus is on finding groups with positive links within groups, but negative links between groups. On the other hand, he analyzed how such a structure of negative and positive links might come about through social dynamics.
2. Frank van der Most, DANS and eHumanities Group
Adding and finding meaning in case-by-case network-graphs of interviews
An explorative experiment in the combination and visualization of relational data and interview-transcription coding.
For the ACUMEN project, I collected career data from and conducted interviews with about 40 university-based researchers and 10 deans, department heads and human resources managers. Career data typically comes in the form of CVs, which are suitable for storing and coding in relational databases. Doing interviews results in notes, transcriptions and coding added to the transcriptions, typically with coding software such as NVivo, Atlas or TAMSAnalyzer. Database software usually does not produce network graphs. Coding software is good at producing network graphs, but bad at dealing with relational data. The problem, then, is how to combine the two, and for ACUMEN I explored a few possibilities. I will present one of these and evaluate its use as a tool for exploration and analysis.
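A minimal sketch of the combination described above: relational career records and transcript codes are merged into one edge list and exported as a DOT graph, which visualisation tools such as Graphviz (or Gephi, via import) can read. All records, code names and the choice of export format here are hypothetical, chosen only for illustration, and do not reflect the ACUMEN data.

```python
# Hypothetical career records, as they might come out of a
# relational database: (researcher, employer, period).
careers = [
    ("R1", "University A", "2001-2005"),
    ("R1", "University B", "2005-2011"),
    ("R2", "University B", "2003-2009"),
]

# Hypothetical codes attached to interview transcripts by
# coding software: (researcher, code).
codes = [
    ("R1", "mobility"),
    ("R2", "mobility"),
    ("R2", "funding pressure"),
]

# Merge both sources into a single edge list keyed on the researcher.
edges = [(person, employer, "worked_at") for person, employer, _ in careers]
edges += [(person, code, "coded_as") for person, code in codes]

def to_dot(edge_list):
    """Serialise the merged edge list as an undirected DOT graph."""
    lines = ["graph interviews {"]
    for src, dst, label in edge_list:
        lines.append(f'  "{src}" -- "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(edges))
```

Because both sources share the researcher as a key, shared employers and shared codes become visible as paths between interviewees in the resulting graph.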
Frank van der Most started his work on the ACUMEN project in the summer of 2011 at the e-Humanities Group. His research interests are research practices, the funding and organization of research, research policies and the interactions between these three. He studied Computer Science at the University of Twente and Sciences and Arts at Maastricht University. From 1997 until 2005 he was involved in research projects in the history of technology, the policy and scientific developments surrounding mad cow disease, and an evaluation of the Norwegian Research Council. During these projects he developed an interest in digital tools for qualitative and historical research, for which he developed a database application. After a failed attempt to commercially exploit these interests he returned to academia and in 2009 defended his doctoral thesis titled ‘Research councils facing new science and technology: The case of nanotechnology in Finland, the Netherlands, Norway and Switzerland’ at the University of Twente. From 2009 until 2011, he did a post-doctoral project on the use and effects of research evaluations at the CIRCLE institute for innovation studies at Lund University. Frank still has a keen interest in digital tools for research and keeps a blog on research policy and practices at www.researchaffairs.net.
No RM due to UDCC Conference 2013, The Hague. Co-organised by eHg
Loet Leydesdorff, Amsterdam School of Communication Research (ASCoR, UvA)
Toward a network theory of innovation: Heterogeneity in relations, positions, and perspectives
Knowledge-based innovations span networks across functional domains such as novelty production in R&D and wealth generation on the market. The indicators of these domains are heavily institutionalized. For example, maps of patents in USPTO cannot easily be compared with publications in the Web of Science; innovations require the mapping of interfaces: different perspectives have to be recombined reflexively. When the systems are animated in parallel (for example, on split screens) one can show delays and feedbacks among the different domains.
In early stages of a knowledge-driven technology, for example, researchers may preferentially attach to the inventors, whereas in a next stage, preferential attachment moves to global “centers of excellence” such as Boston, London, and Seoul. The economic dynamic may be orthogonal; for example, in terms of marketable applications.
CuInSe2 is a material used for coating photovoltaic cells in thin layers. We study these technologies for Alkemade et al. (in preparation) in terms of both patent classes (cognitive diffusion) and inventor addresses (geographical diffusion). The development of inventor addresses in USPTO data is shown at http://www.leydesdorff.net/photovoltaic/cuinse2 (or similarly for PatStat data at http://www.leydesdorff.net/photovoltaic/cuinse2.patstat ). One can see the animations at http://www.leydesdorff.net/photovoltaic/cuinse2/animate.html.
The development of International Patent Classifications in this same set is visualized at http://www.leydesdorff.net/photovoltaic/cuinse2/cuinse2.ppsx. The development of the (Rao-Stirling) diversity shows the three generations of the technology that can inform the interpretation of the geographical diffusion. (Figure 1)
Fernie Maas (VU History Department), Albert Merono (VU Computer Science), Wouter Beek (VU Computer Science)
Dutch Book Trade 1660-1750: using the STCN to gain insight into publishers’ strategies
Despite a stagnating domestic demand near the end of the seventeenth century, Dutch book producers managed to keep up their international market position. In a so-called embedded research project, the Short Title Catalogue, Netherlands (STCN) was used to gain insight into the strategies and decisions of these publishers. The STCN is a retrospective bibliography of publications 1540-1800, containing information on title, author, book producer, language, subject and collation. Historians and computer scientists collaborated to disclose the STCN, and to connect it to other relevant datasets. To explore the possibilities of, and difficulties in, disclosing and linking the bibliography, attention was turned to a particular strategy: publishing scandalous books. Next to explaining the process of converting and querying the STCN data, the presentation will deal with differences in handling data and the advantages of an Open Data approach in humanities research.
Albert Meroño-Peñuela is a PhD student at the VU University Amsterdam and works at Data Archiving and Networked Services and the eHumanities Group of the Royal Netherlands Academy of Arts and Sciences in the Computational Humanities project CEDAR. He holds a bachelor in Informatics Engineering from Universitat Politècnica de Catalunya (FIB-UPC). As a researcher he has previously worked at the Institute of Law and Technology (IDT-UAB), developing models for Law using semantic technologies. His PhD focuses on the Semantic Web and the Humanities, with a particular interest in concept drift and the dynamics of meaning. His research interests also include Linked Open Data, Data Mining, Machine Learning and Computer Graphics, and he is an enthusiast of Free Software. He also likes to spend his time playing music and reading science fiction.
Wouter Beek is currently employed as a researcher at the Free University (VU) in Amsterdam, working on the Free Competition NWO project Pragmatic Semantics (PraSem). He is working on a new semantic paradigm for interpreting existing Semantic Web (SW) data, taking into account the contradictions, ambiguities and context-dependencies that abound. He conducted his previous research at the University of Amsterdam (UvA) within the international DynaLearn project, focusing on the diagnostic evaluation of qualitative models.
Fernie Maas has completed the research master Historical Sciences at Radboud University Nijmegen, specialising in 17th and 18th century cultural history. She graduated with a thesis on early modern Dutch cookery books and ideas of healthy eating. She recently finished an embedded research project on cultural industries in the Dutch Golden Age at VU University Amsterdam, working together with VU computer scientists in converting the Short Title Catalogue, Netherlands. Currently she is involved in developing a joint eHumanities education program at UvA and VU.
Marnix van Berchum, DANS (KNAW)
Linked sources: a network approach to the repertory of sixteenth-century polyphony (an introduction)
Scattered over the many libraries and archives of Europe lie the remnants of past musical cultures. Musical manuscripts and prints provide us with glimpses of the repertories that were circulated, collected and performed. From its beginnings in the 19th century, musicology has been involved with the study of these musical repertories. Sources have been studied from codicological viewpoints, the compositions these sources contain from stylistic angles. My current PhD research approaches the repertory of sixteenth-century music from the perspective of network theory. Musical compositions are regarded as cultural artefacts, contextualised within the transmission of music and the broader socio-economic conditions of a defined historical period. This approach exploits the characteristics of musical sources and their content as networked entities, providing a more formalised view of the term ‘repertory’.
In my presentation I will introduce the research project described above. I will talk about the problems of approaching a distinct period in the history of music from the viewpoint of network theory. How can the network be modelled? What are the characteristics of such a network? What musicological questions can be answered?
The above will be illustrated by a first case study on one of the most famous sets of musical manuscripts from the early decades of the sixteenth century. These manuscripts were produced by the scriptorium of Petrus Alamire and were mainly created for the Habsburg-Burgundian court.
Marnix van Berchum studied Musicology at Utrecht University, and specialized in musical culture of the 15th and 16th century. He graduated with a thesis on the motets of Jachet Berchem (c.1505-1567). He is pursuing his PhD research at Utrecht University, in which he wants to apply the concepts and methods of network theory to the dissemination of music in the sixteenth century. Furthermore, Marnix is Associate Director of the CMME Project (www.cmme.org).
Marnix is currently employed part-time at DANS – the data archive of NWO and KNAW (www.dans.knaw.nl) – where he works, among other things, on the Europeana Cloud and CLARIN-NL projects. He has a wide range of experience in projects related to Open Access, innovations in scholarly communication and ‘digital musicology’.
Lambert Schomaker, University of Groningen (RUG)
New digital-humanities methods for paleography and handwritten manuscript analysis
In this presentation I will address digital and e-Science methods for the analysis of scans of handwritten or complicated machine-print manuscripts. Over the last few years we have developed two systems, Monk and GIWIS, for word retrieval and identification of the hand, respectively. We will start with an introduction of the general philosophy behind the methods: what is expected of image quality, including pre-processing steps, and what the limitations are. We will do this chronologically, from the ‘ingest’ of a collection to the actual use. In the second part of the tutorial, we will zoom in on an interactive e-Science system for the training of word and character shapes by the Monk system. The Monk system currently deals with collections ranging from the Dead Sea Scrolls to medieval manuscripts, and from 17th century captain’s logs to early 20th century administrative indices. Through a continuous interaction between human and the learning machine, word and character shape classes are constructed through agglomeration. By using several recognition methods in sequence, the principle of the Fahrkunst elevator can be used: the system is pushed to increased performance by stepwise shifts. I will show some recent results in mining characters in the Dead Sea Scrolls. The principle of stepwise ‘uplifting’ also takes place at the level of the refinement of transcriptions, which constitutes the latest addition to the Monk tools.
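The agglomeration step mentioned above, grouping word or character shapes into classes, can be illustrated with a toy sketch. This is a single-pass leader clustering over invented two-dimensional feature vectors; it is an assumption for illustration only and not the actual Monk algorithm.

```python
def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerate(features, threshold):
    """Assign each feature vector to the first existing class whose
    representative lies within `threshold`; otherwise start a new
    class (single-pass 'leader' clustering)."""
    classes = []  # list of (representative, member indices)
    for i, f in enumerate(features):
        for rep, members in classes:
            if distance(rep, f) <= threshold:
                members.append(i)
                break
        else:
            classes.append((f, [i]))
    return [members for _, members in classes]

# Hypothetical 2-D features for five word images: two tight groups.
feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2)]
print(agglomerate(feats, threshold=1.0))  # → [[0, 1, 4], [2, 3]]
```

In a human-in-the-loop setting such as the one described, the resulting classes would then be confirmed or split by an annotator, feeding the next training round.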
prof. dr. Lambert Schomaker
Artificial Intelligence & Cognitive Engineering (ALICE) University of Groningen The Netherlands
First RM in new academic season
Lotte Wilms, Steven Claeyssens, Clemens Neudecker (National Library, KB), Hugo Huurdeman (University of Amsterdam, UvA)
Digital Humanities at the Koninklijke Bibliotheek
In the past two decades or so, national libraries have been digitising millions of pages of books, newspapers, magazines and other text based collections. In this digital age, the research landscape is changing rapidly, with scholars able to ask new types of questions and answer them in novel ways by working with a wide variety of materials and in new collaborative modes.
The National Library of the Netherlands (KB) plans to have digitised and OCRed its entire collection of books, periodicals and newspapers from 1470 onwards by the year 2030. Already in 2013, 10% of this enormous task will be completed, resulting in 73 million digitised pages, produced either by the KB itself or via public-private partnerships such as Google Books and ProQuest. Many are already available via various websites (e.g. kranten.kb.nl, statengeneraaldigitaal.nl, anp.kb.nl, earlydutchbooksonline.nl) and we are working on a single entry point to (re)search all sets simultaneously.
All these changes in the research landscape also call for a change in the way the library works with the users and researchers of the material. This presentation will deal with the steps the KB has taken to support researchers’ requests by setting up a Data Services team to simplify access to our data. The Data Services team also works closely with the Research department, where work is also being done to offer (technical) support to users and to gather input to better meet the needs of researchers.
Already curious about our sets? See http://www.kb.nl/en/data-services-apis for more info or go to http://tinyurl.com/KBandDH to leave your feedback.
Steven Claeyssens is the Data Services Coordinator at the National Library of the Netherlands (KB). He has an MA in Germanic Philology from Ghent University and an MA in Book Studies from Leiden University. He has worked at the KB since 2005 as Analytical Bibliographer, Information Specialist and Collection Specialist. He is finishing a PhD on Dutch publishing history.
Lotte Wilms works in the European Projects team of the Research department of the National Library of the Netherlands. She is the KB Project Leader for Europeana Newspapers and coordinates several Digital Humanities efforts from the Research department of the KB. She has a BA in English Language and Culture and an MA in Medieval Studies from Utrecht University. She has worked at the KB since 2008 on various (digitisation) projects.
Clemens Neudecker, M.A., Technical Coordinator for the Research team in the Innovation and Development department of the KB.
Please click on Archive for an overview of former Research Meetings