Brayden King at Northwestern asked me to pass this on.

The Kellogg School of Management at Northwestern University seeks a post-doctoral researcher interested in at least one of the following areas of scholarship: social movements, collective behavior, networks, and organizational theory.  We particularly encourage scholars to apply who have advanced quantitative training, programming skills, and familiarity with “big data” methods. The ideal candidate will have a PhD in sociology, communications, political science, or information sciences.

The post-doctoral position will allow the scholar to advance his or her own research agenda while also working on collaborative projects related to social media and activism. The post-doctoral position will be managed by Brayden King and will be affiliated with the Management and Organizations department and NICO (Northwestern Institute on Complex Systems). The term of this position is negotiable.

To apply, please e-mail a curriculum vitae along with a brief statement of how your research interests are related to this position to Juliana Steers with “MORS Post-Doctoral Position” as the subject. Arrange to have two letters of recommendation e-mailed to the same address. Salary and research budget are competitive and include full medical insurance. Applications are due March 2, 2014.

Northwestern University is an Equal Opportunity, Affirmative Action Employer of all protected classes including veterans and individuals with disabilities.

This is a guest post by Charles Seguin. He is a PhD student in sociology at the University of North Carolina at Chapel Hill.

Sociologists and historians have shown us that national public discourse on lynching underwent a fairly profound transformation during the period from roughly 1880 to 1925. My dissertation studies the sources and consequences of this transformation, but in this blog post I’ll just try to sketch some of its contours. In my dissertation I use machine learning methods to analyze this discursive transformation. However, after reading several hundred lynching articles to train the machine learning algorithms, I think I have a pretty good understanding of the key words and phrases that mark the changes in lynching discourse. In this blog post, then, I’ll be using basic keyword, bigram (word pair), and trigram searches to illustrate some of those changes.
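
For readers who have not run this kind of search before, here is a minimal sketch of what a keyword, bigram, or trigram count can look like in Python. It assumes the articles are already available as plain-text strings. The function names and the two sample sentences are invented for illustration; they are not the code or the corpus used in the dissertation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_phrase(articles, phrase):
    """Count how often a one-, two-, or three-word phrase occurs in each article."""
    target = tuple(tokenize(phrase))
    n = len(target)
    return [Counter(ngrams(tokenize(article), n))[target] for article in articles]


# Hypothetical sample sentences, not drawn from any real newspaper corpus.
articles = [
    "The mob law of the county was condemned. Mob law must end.",
    "The sheriff made no arrest.",
]

print(count_phrase(articles, "mob law"))
```

Tracking how such counts rise or fall across publication years is one simple way to see a phrase enter or leave the discourse, though raw counts would normally be divided by the number of articles per year before comparing periods.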

This is a guest post by Laura K. Nelson. She is a doctoral candidate in sociology at the University of California, Berkeley. She is interested in applying automated text analysis techniques to understand how cultures and logics unify political and social movements. Her current research, funded in part by the NSF, examines these cultures and logics via the long-term development of women’s movements in the United States.

Computer-assisted, or automated, text analysis is finally making its way into sociology, as evidenced by the new issue of Poetics devoted to one technique, topic modeling (Poetics 41, 2013). While these methods have been widely used and explored in disciplines like computational linguistics, digital humanities, and, importantly, political science, only recently have sociologists paid attention to them. In my short time using automated text analysis methods I have noticed two recurring issues, both of which I will address in this post. First, when I’ve presented these methods at conferences, and when I’ve seen others present them, the same two questions are inevitably asked, and they have indeed come up again in response to this issue (more on this below). If you use these methods, you should have a response. Second, those who are attempting to use these methods are often not aware of the full range of techniques under the automated text analysis umbrella and choose a method based on convenience, not knowledge.

I was pleased to see Fabio Rojas make an open invitation for more female scholars on OrgTheory. Writing for a technically-oriented blog, I’ve been painfully aware of the dearth of female voices expressed here. And as computational social scientists, we should be incredibly wary of the possibility of reproducing many of the same kinds of inequalities that have plagued computer science and tech at large. We see this when “big data isn’t big enough,” as Jen Schradie has put it, when non-dominant voices are shushed in myriad different ways online, and I fear it when all our current contributors are men. Sociology has gone a long way to open up space for more “scholars at the margins” (a term I’m taking from Eric Grollman and his blog Conditionally Accepted), but there’s still a long way to go.

This is, then, an open invitation for anyone to contribute to Bad Hessian, especially women, people of color, queer people, people with disabilities, working-class or poor people, fat people, immigrants, and single parents. Our doors are always open for guest contributors and new regular contributors. Computational social science ought to be as committed as possible to not only bringing computational methods into the social sciences, but making sure that everyone, especially those at the margins, has a place to speak to and engage with those methods.

2013 was the first full year of Bad Hessian’s existence, so we’re taking stock of what we’ve accomplished in the past year.

We’ve had 37 posts written by the regular crew plus 5 great guest authors.

We’ve had 51,520 unique visits, 39,412 unique visitors, and 70,772 pageviews. Most people are coming from search engines and we’re getting most social media traffic through Twitter.

The five most popular posts of 2013 (written in 2013) were:

  1. Lipsyncing for your life: a survival analysis of RuPaul’s Drag Race by Alex
  2. A Final Twitter-based Prediction of RuPaul’s Drag Race Season 5 by Alex
  3. Cluster Computing for $0.27/hr using Amazon EC2 and IPython Notebook by Randy Zwitch
  4. RuPaul’s Drag Race Season 5 Finale — Predicting America’s Next Drag Superstar from Twitter by Alex
  5. Has R-help gotten meaner over time? And what does Mancur Olson have to say about it? by Trey

It was a great year for us. What does 2014 bring? I can think of a few things that’ll probably come up.

  1. More stats pedagogy
  2. More IPython
  3. More social science hackathons and data events
  4. More discussions of protest event data
  5. More drag queens (duh)

And I hope more content in general! Is there anything you’d like to see here in 2014? Let us know!

Last month, Mobilization published a special issue on new methods in social movements research, edited by Neal Caren. I was one of the contributors to the issue, submitting a piece born of my master’s work. The piece uses supervised machine learning to analyze Facebook messages from Egypt’s April 6th Movement in its formative months of 2008, corroborated by interviews with April 6th activists.

The abstract:

With the emergence of the Arab Spring and the Occupy movements, interest in the study of movements that use the Internet and social networking sites has grown exponentially. However, our inability to easily and cheaply analyze the large amount of content these movements produce limits our study of them. This article attempts to address this methodological lacuna by detailing procedures for collecting data from Facebook and presenting a class of computer-aided content analysis methods. I apply one of these methods in the analysis of mobilization patterns of Egypt’s April 6 youth movement. I corroborate the method with in-depth interviews from movement participants. I conclude by discussing the difficulties and pitfalls of using this type of data in content analysis and in using automated methods for coding textual data in multiple languages.

You can find the PDF here.

The issue is full of a lot of other great stuff, including:

Studying Online Activism: The Effects of Sampling Design on Findings, Jennifer Earl

How Repertoires Evolve: The Diffusion of Suicide Protest in the Twentieth Century, Michael Biggs

Contextualizing Consequences: A Sociolegal Approach to Social Movement Consequences in Professional Fields, Elizabeth Chiarello

A Methodology for Frame Dynamics: Analyzing Keying Battles in Palestinian Nationalism, Hank Johnston and Eitan Y. Alimi

The Radicalization of Contention in Northern Ireland, 1968-1972: A Relational Perspective, Gianluca De Fazio

This is a guest post by Jen Schradie. Jen is a doctoral candidate in the Department of Sociology at the University of California-Berkeley and the Berkeley Center for New Media. She has a master’s degree in sociology from UC Berkeley and an MPA from the Harvard Kennedy School. Using both statistical methods and qualitative fieldwork, her research is at the intersection of social media, social movements and social class. Her broad research agenda is to interrogate digital democracy claims in light of societal and structural differences. Before academia, she directed six documentary films on social movements confronting corporate power. You can find her at @schradie on Twitter.

Five years ago, Chris Anderson, editor-in-chief of Wired Magazine, wrote a provocative article entitled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete” (2008). He argued that hypothesis testing is no longer necessary with Google’s petabytes of data, which provide all of the answers to how society works. Correlation now “supersedes” causation:

This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.

An easy strawman, Anderson’s piece generated a host of articles in academic journals decrying his claim. The overall consensus, to no surprise, was that the scientific method – i.e. hypothesis testing – is far from over. Most argued, as Pigliucci (2009:534) put it:

But, if we stop looking for models and hypotheses, are we still really doing science? Science, unlike advertising, is not about finding patterns—although that is certainly part of the process—it is about finding explanations for those patterns.

Other analysts focused on the debate around “correlation is not causation.” Some critiqued Anderson on the grounds that correlation can lead you in the wrong direction with spurious noise. Others implicitly pointed to what Box (1976) articulated so well pre-Big Data: that science is an iterative process in which correlation is useful because it can trigger research that uses hypothesis testing.

The Databasement

This weekend, I made it out to Penn State to participate in the GDELT hackathon, sponsored by the Big Data Social Science IGERT and held in the punnily-named Databasement. The hackathon brought together a lot of different groups — political scientists, industry contractors, computer and information scientists, geographers, and — of course — sociologists (I was one of two).

GDELT, as you may remember, is a political events database with nearly 225 million events from 1979 to the present. Hackathon attendees had interests ranging from optimizing and normalizing the database to predicting violent conflict and improving event data in general.

I’m a big fan of Drew Conway’s Data Science Venn Diagram, in which he outlines the three intersecting spheres of skill that the data scientist needs: hacking skills, math and statistics knowledge, and substantive expertise. I’ve used this idiom in thinking through how to bring more sociologists into using computational methods. This has been a matter of getting them to learn how to hack or see the virtues of hacking even if they don’t have a taste for it themselves.

But what I think the diagram is missing, or what at least gets buried underneath the surface, is knowledge of the processes of data production. This is maybe a subtler point which I think gets looped in with “substantive expertise,” but I want to draw this line out as explicitly as possible because I think this is one of data science’s weaker flanks and one of the places where it needs to be strengthened to gain more acceptance within the social sciences.

This is a guest post by Sean J. Taylor, a PhD student in Information Systems at NYU’s Stern School of Business.

Last Thursday and Friday I attended the 2nd annual DataGotham conference in New York City. Alex Hanna asked me to write about my experience there for the benefit of those who were unable to attend, so here’s my take on the event.

Thursday evening was a social event in a really sweet rooftop space in Tribeca with an open bar and great food (a dangerous combination for this still-grad-student). Though I spent a lot of the time catching up with old friends, I would describe the evening as “hanging out on Twitter, but in person.” I met no fewer than a dozen people I had previously known only online. I am continually delighted at how awesomeness on Twitter is a reliable indicator of awesomeness in person. Events like DataGotham are often worth it for this reason alone.