As the Research Director for Listen + Learn Research, I’ve been meaning to share some reflections on the July MRS Big Qual Conference and put my frontline practitioner views down on paper. I’m very pleased to have attended. It’s important for research circles to keep discussing big qual because, despite having been around for some time, it’s still quite new to the mainstream of the research and insight sector. What follows are my admittedly opinionated reflections on what was covered at the conference, what it tells us about the state of play in big qual, and some possible challenges to prevailing opinion.
So what is ‘big qual’ all about?
Most practitioners would agree that it is about the research-grade analysis of unstructured, naturally occurring data and sources – a grander way of describing the stuff people discuss on social media and the wider internet. Most of the speakers and panellists were keen on the ‘bigness’ of big qual. References to millions and billions of data points, completely unattainable for traditional market research, were bandied about quite happily. But, very sensibly, there was also widespread agreement that research-grade big data only makes sense with the ability to decode, understand and make sense of it. Or to do some ‘sensemaking’, to borrow a trendy neologism.
In the broadest of senses, the ‘big’ part is machine-enabled (using clever machine learning models and AI applications) while the ‘qual’ part is human-enabled (using equally clever semioticians, linguists and programmers). Which in itself is all very good. But my reading of the conference is that the emphasis seems to be quite firmly in favour of the ‘big’, machine-enabled part.
Are we really going to need a bigger boat?
The prospect of being able to do away with the need to sample is indeed a tempting one. A research world in which we could harness the views of ‘everybody’ and achieve something like 99% real-life statistical significance (a reference actually made during one panel discussion) is alluring. But is it a) desirable and b) viable?
The market and social research world has happily gone on for decades working with samples of populations. And despite some infamously wonky moments (particularly in the context of election polling), this has broadly held up alright. We see this in our own work time and time again: the marginal impact of expanding a sample diminishes rapidly beyond a few thousand pieces of data.
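As a rough, back-of-the-envelope illustration (a sketch assuming simple random sampling and a proportion near 50%, not a claim about any particular study), the standard margin-of-error calculation shows how quickly those returns diminish:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"n = {n:>9,}: +/- {margin_of_error(n):.2%}")

# n =     1,000: +/- 3.10%
# n =    10,000: +/- 0.98%
# n =   100,000: +/- 0.31%
# n = 1,000,000: +/- 0.10%
```

Going from a thousand to a million data points buys roughly three percentage points of precision, which is rarely what separates a good insight from a bad one.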
So why the somewhat megalomaniac tendency to want to analyse it all? Especially since the bigger the dataset, the less control we have over a pretty fundamental trait of it, namely relevance. It’s a well-known truism that social data is messy: there is lots of commercial content, lots of bots, lots of opportunities to go down the wrong rabbit hole. Given that machines collect data based only on strings or keywords, one may well end up with a data universe consisting chiefly of bot-generated sales offers. And what is the point of sampling 100% of data that is 50% irrelevant or invalid? This is why robust samples, meticulously appraised by human coders, are a compelling proposition in the domain of big qual too. A few thousand pieces of data that are 100% relevant and valid are arguably more robust than a few million that are only 50% relevant and valid, when we don’t know which 50% is which.
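A minimal simulation, with entirely made-up numbers, of the trade-off I have in mind: a small, fully relevant sample versus a huge one in which half the records are bot or sales content that we cannot reliably identify or remove:

```python
import random

random.seed(42)

TRUE_POSITIVE_RATE = 0.60    # assumed share of genuinely positive consumer posts
NOISE_POSITIVE_RATE = 0.95   # assumed share of 'positive' bot / sales posts

def clean_sample(n):
    # n fully relevant posts: the estimate only carries sampling error
    return sum(random.random() < TRUE_POSITIVE_RATE for _ in range(n)) / n

def contaminated_sample(n, noise_share=0.5):
    # n posts, half of them irrelevant noise we cannot separate out
    hits = 0
    for _ in range(n):
        rate = NOISE_POSITIVE_RATE if random.random() < noise_share else TRUE_POSITIVE_RATE
        hits += random.random() < rate
    return hits / n

print(f"5,000 relevant posts:  {clean_sample(5_000):.1%} positive")
print(f"500,000 mixed posts:   {contaminated_sample(500_000):.1%} positive")
# The small clean sample lands within a point or two of the true 60%;
# the huge contaminated one settles around 77-78% and stays there no
# matter how much more data is added, because the error is bias, not noise.
```

More data shrinks the noise, but it does nothing about the bias introduced by the irrelevant half.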
Blade Runner 2021
This brings us to another point, which has been a darling sci-fi topic for decades: the tension between humans and machines. Now, without any doubt machines are gloriously impressive at collecting data according to a given script or specification. Machines offer the access point to the universe of big qual and, if used wisely, act as an initial filter in identifying where the ‘heat’ is. But their ability to understand, interpret and evaluate is severely limited. Paraphrasing one of the conference panellists, nobody would like to be sentenced in court by a machine. And I found a bit of a contradiction in the virtues ascribed to machines: their purported ability to simultaneously identify patterns that are too big and too small for a ‘naked’ human analyst’s eye. How can these two be reconciled? How can an algorithm see both the needle in the haystack and the bird’s eye view? The conference did not, to my mind, provide a credible example of that in practice.
Hand me the ruler
Although some of the cases and briefs presented at the conference were quite impressive (reset segmentations or reformulated consumer strategies), a lot of the metrics used in big qual studies are either quite rudimentary, a bit top-down, or monstrously challenging to develop reliably. Knowing that the volume of mentions of ‘Pfizer’ increased over 2021 and that the sentiment in posts mentioning ‘Pfizer’ grew more positive can only go so far in illuminating attitudes towards vaccines, even if it is based on hundreds of thousands of tweets. On the other hand, running the data through existing lexicons can be both inaccurate and self-serving. And building bespoke analytical frameworks that tackle vast bodies of qual data in a fool-proof manner can be prohibitively expensive and limited in versatility: even the most solid framework developed for toothpaste won’t work on pet food.
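To make the lexicon point concrete, here is a deliberately naive sketch; the word list and the example posts are invented for illustration, not taken from any real lexicon or dataset:

```python
# A toy, invented sentiment lexicon -- not any real published resource.
LEXICON = {"great": 1, "love": 1, "positive": 1,
           "sore": -1, "killer": -1, "negative": -1}

def lexicon_score(text):
    """Sum word-level scores; words outside the lexicon count as neutral."""
    return sum(LEXICON.get(word.strip(".,!").lower(), 0) for word in text.split())

posts = [
    "Second jab done, arm a bit sore but honestly a killer turnaround by the NHS",
    "Tested negative at last, what a relief!",
]

for post in posts:
    print(f"{lexicon_score(post):+d}  {post}")
# Both posts read as clearly positive to a human coder, yet the generic
# word list scores the first at -2 and the second at -1.
```

A real-world lexicon is far richer than this toy, but the underlying failure mode is the same: word lists built for one domain routinely misread the language of another.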
What about the economy, stupid
Touting machine learning big qual solutions as cost-saving compared to human analysis also seems wishful. I accept that such models can be built, used and reused for a tightly defined context, where a study keeps looking into the same thing over and over again, achieving economies of scale with time. But how many research briefs offer that sort of privilege? Having recently spent some days playing with one of the leading machine learning platforms for text analysis, I can attest first-hand that while it certainly has the ability to learn, it takes absolute ages to do so. And it has absolutely no answer for irrelevant or complicated data. For example, if multiple products are mentioned in a product review (as they often are), each with an accompanying value statement, the AI would never make out whether ‘great picture’ relates to the Samsung and ‘awkward interface’ to the Sony, whereas a human analyst would very easily ascribe those attributes to the right brands. That’s ‘sensemaking’ at its best. I just can’t see the argument for investing copious amounts of time and money into machine learning for ad hoc research briefs. Given the returns it offers, it would be borderline disrespectful to the client’s budget.
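The attribution problem is easy to show with a hypothetical review and the kind of naive brand-plus-sentiment co-occurrence counting that document-level tools fall back on; the review text and the heuristic below are invented purely for illustration:

```python
review = ("Replaced my old Sony with the new Samsung last month. "
          "Great picture, though I do not miss the awkward interface on the Sony.")

POSITIVE = {"great"}
NEGATIVE = {"awkward"}
BRANDS = {"samsung", "sony"}

# Naive co-occurrence heuristic: every sentiment word in the review is
# credited to every brand mentioned in it.
words = {w.strip(".,").lower() for w in review.split()}
for brand in sorted(BRANDS & words):
    pos = len(POSITIVE & words)
    neg = len(NEGATIVE & words)
    print(f"{brand}: {pos} positive, {neg} negative")
# samsung: 1 positive, 1 negative
# sony: 1 positive, 1 negative
# Both brands come out identical, even though a human reader instantly
# credits 'great picture' to the Samsung and 'awkward interface' to the Sony.
```

Commercial platforms are more sophisticated than this, of course, but untangling which opinion belongs to which entity remains exactly the kind of judgement a human analyst makes in a heartbeat and a model labours over.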
How can I help you, Madam?
And this brings us to the crux of the matter: we are here to help our clients answer their questions and resolve their challenges. Big qual should be nothing more than another tool or platform serving precisely that purpose. As a number of conference presenters and panellists rightly observed, clients are far more interested in outcomes than in process. They want robust, actionable insight rather than boastful material about how the project they commissioned analysed a zillion data points using complex semantic clustering. And for the foreseeable future, the optimal way of using big qual to that end is to harness machine data collection technology coupled with human analysis of robust, highly relevant and valid samples of data. ‘Big Qual’ should really be ‘Big (enough) Qual’. And at least until our banking chatbots can resolve a disputed payment and our driverless Teslas stop crashing into each other, we and our clients will all be happier for it.
To find out how you can use in-depth human analysis for better business insights, get in touch with us at contact@listenandlearnresearch.com.