When working with social data there’s one question you should ask. We’ll come to that in a minute. First, I want to talk a bit about why this question is so important.
Making the wrong decision hurts.
It wastes money, time and reputation.
It squanders influence and holds us back.
That’s why getting the right data is so important.
With this in mind, we need to chat about social listening’s ‘dirty little secret’.
The platforms appeal because they give us access to social data and the tools to turn it into something useful. They promise to take the mess of social data and make it easy for us to better understand people. At the heart of this is an exchange of trust. They build the system, we trust the outputs.
But is all that glitters gold?
Just because the data looks OK on the surface doesn’t mean it is. How do you know that the clever algorithms are getting it right? That we’re seeing all we should be seeing?
Having worked with social data for nearly a decade, we’ve noticed a tendency for providers to focus on data breadth rather than quality.
It’s not talked about and it’s not measured. So, we’re left to assume that it’s all taken care of.
It’s not.
Having read millions of social media comments, we’ve learnt a lot about the quality of the data you get from social listening platforms.
Spoiler.
It needs to be better.
After all, a great conclusion built on faulty data isn’t great. It’s an illusion.
For example, if your sentiment scores are based on data full of false positives, how can you have any confidence in the results? That shift you noticed could be real or a phantom caused by faulty data.
The first thing to accept is that no social data collection will be 100% accurate.
Errors creep in from two sources:
- The first comes from language. There are three aspects here. People use language in different and unpredictable ways. Listening needs to be inclusive (so as not to miss anything important). And Boolean (the language we use to collect social data) is a blunt tool for dealing with these first two points. This will inevitably lead to data that contains a percentage of false positives (see the sketch after this list).
- The second source comes from the data collection process. Some sources work well (e.g. the Twitter API); others can be much more error-prone. We’ve written about how this happens here.
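To make the first point concrete, here’s a minimal Python sketch of how an inclusive, Boolean-style query drags in false positives. The brand, the comments and the matching logic are all hypothetical; real platforms apply their own query syntax, but the principle is the same.

```python
# A hypothetical illustration of a broad, 'inclusive' query catching false positives.

comments = [
    "Just switched to Orange, their 5G coverage is brilliant",   # relevant (telecoms brand)
    "Orange is my favourite colour for a living room",           # false positive
    "Orange juice with breakfast, every day",                     # false positive
    "Orange customer service kept me on hold for an hour",        # relevant (telecoms brand)
]

# A single broad keyword so nothing important gets missed.
query_terms = ["orange"]

matches = [c for c in comments if any(t in c.lower() for t in query_terms)]

for m in matches:
    print(m)
# All four comments match the query, but only two are about the brand.
```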
So, what do we do about it?
We believe the first step is to get your head around the R number. No, not that one.
In this case, R stands for relevancy. It’s the proportion of data in any social listening analysis that is relevant to the question at hand.
It’s super easy to find out (and definitely worth it). Here’s how.
Download some data, randomise it. Read a sample of 100. Check how many of the comments are relevant to the question you’re looking to answer.
That’s your Relevancy rate.
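If you’d rather script the sampling step, here’s a minimal sketch, assuming your platform export is a CSV with one comment per row in a 'comment' column. The file name and column name are assumptions; adjust them to your own export.

```python
import csv
import random

# Load the exported comments (assumed CSV with a 'comment' column).
with open("social_export.csv", newline="", encoding="utf-8") as f:
    comments = [row["comment"] for row in csv.DictReader(f)]

random.shuffle(comments)   # randomise so the sample isn't biased by date or source
sample = comments[:100]    # read a sample of 100

relevant = 0
for i, comment in enumerate(sample, start=1):
    print(f"[{i}/{len(sample)}] {comment}")
    answer = input("Relevant to your question? (y/n): ")
    if answer.strip().lower() == "y":
        relevant += 1

relevancy_rate = relevant / len(sample) * 100
print(f"Relevancy rate: {relevancy_rate:.0f}%")
```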
Try it. You might be surprised, shocked, flabbergasted by the result.
If you get more than 40-50% the first time around, that’s a good start. The important thing is to know what it is. To look at the data with open eyes.
Then you can see how much faith to put in any automated analysis.
How comfortable would you be in presenting results where the underlying data could be 50/50 right or wrong?
If I were a client in a debrief, this is the first question I’d ask. “How relevant is the underlying data behind these conclusions?”
If they can’t answer. I’d postpone the meeting until they can.
Getting social data right is a process of trial, error and iteration.
It’s a skill you can learn and develop.
It’s easy to work out where you are and track your progress.
And there’s guidance out there to help you get better.
We’ve shared some of our top tips here.
And, if you’re short on time or the right capabilities, we can write your queries for you. Or our Query Builder workshops might be just what you need to upskill your team.
To find out more get in touch.