We’re in a privileged position: we get to read a shed load of social data. This means we know what makes it good and what you might want to look out for.
We’ve worked with a load of social listening tools and are only too happy to share some tips. These are all focused on the process of collecting data – from social media source through to your listening tool.
1. Deep and wide
The promise of social listening is to be able to tap into a vast ocean of content. Most of the time this is fine, but it’s worth keeping a couple of points in mind:
- Most platforms play up the scale of their coverage, and we often hear “we’ve got access to over 150m sources” bandied around. We think it’s worth being super explicit about what access they have and what’s included in their pricing. We learnt (the hard way) to do our own research into what social spaces are out there, and then double-check that the listening provider is able to collect data from them.
- Check their fee structure for adding new content sources, and confirm that the sources you need can be added at all. For example, if you’re doing a project in Russia, it’s worth checking that they can get data from VK.
- “We’ve got a special relationship with Facebook/Twitter/other social media platform”. Yawn. I’ve heard this so many times. What I’m interested in is the specifics: what, exactly, does your exalted position give you that’s not available to other providers? I’ve seen a great example where Mumsnet coverage increased considerably after changes to their licensing deal. Your listening provider should be able to quantify the quality of their relationships.
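If you’ve done your own research into the social spaces that matter, the double-check against a provider’s coverage can be as simple as comparing two lists. Here’s a minimal sketch in Python. The function name and both source lists are made up for illustration, and in practice the provider’s list would come from whatever coverage documentation they give you.

```python
def coverage_gaps(required_sources, provider_sources):
    """Return the sources you need that the provider does not list.

    Names are compared without regard to case or surrounding spaces.
    """
    covered = {name.strip().lower() for name in provider_sources}
    return sorted(
        name for name in required_sources
        if name.strip().lower() not in covered
    )


# Hypothetical lists: your own research vs. the provider's stated coverage.
required = ["VK", "Mumsnet", "Twitter"]
provider = ["twitter", "Facebook", "mumsnet "]

print(coverage_gaps(required, provider))  # ['VK']
```

Anything that comes back is a question for the provider: can they add it, and what will it cost?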
2. Are these the results you were looking for?
Social data infrastructure is crazily diverse. You’ve got millions of different sources, each of which could have its own way of integrating with your listening provider.
What this means in practice is the data capture process might work brilliantly – or it might not.
When it goes wrong, you could be getting duplicates, truncated data, a complete thread in one comment, missing data or lost emojis, among other things. These errors could run across all of the content from that source, potentially polluting large volumes of your search results.
A way around this is to get a sample of the data, preferably by source (at least the main ones) and run through it looking for signs of wrongness. You can then work with your provider to clean this up – or you’ll get a sense of how wrong the data is and can work around that. One way to do this is to take a random sample of the data, manually analyse it, see what the results are, then apply this to your broader data set.
This is really important if you’re relying on automated metrics – the ‘rubbish in, rubbish out’ adage comes to mind here.
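A rough first pass at the “sample by source and look for wrongness” step can be scripted before anyone reads a single post. The sketch below is Python and assumes each post in your export is a dictionary with `source` and `text` keys. That layout, the function names and the thresholds are all assumptions for illustration, not any listening tool’s format. The flags are only hints, and the manual read-through is still what tells you how wrong the data really is.

```python
import random
from collections import Counter, defaultdict


def sample_by_source(posts, per_source=50, seed=0):
    """Draw up to `per_source` random posts from each source for manual review."""
    rng = random.Random(seed)
    grouped = defaultdict(list)
    for post in posts:
        grouped[post["source"]].append(post)
    return {
        source: rng.sample(items, min(per_source, len(items)))
        for source, items in grouped.items()
    }


def flag_problems(posts, max_chars=5000):
    """Count simple warning signs per source (thresholds are illustrative guesses)."""
    text_counts = Counter((p["source"], p["text"].strip()) for p in posts)
    flags = defaultdict(Counter)
    for p in posts:
        source, text = p["source"], p["text"].strip()
        flags[source]["total"] += 1
        if not text:
            flags[source]["empty"] += 1
        elif text_counts[(source, text)] > 1:
            # Every copy of a repeated text is counted, including the first.
            flags[source]["duplicate"] += 1
        if text.endswith(("...", "…")):
            flags[source]["possibly_truncated"] += 1
        if len(text) > max_chars:
            flags[source]["possibly_whole_thread"] += 1
    return {source: dict(counts) for source, counts in flags.items()}


# Made-up posts, purely to show the shape of the input.
posts = [
    {"source": "forum", "text": "Great buggy, would buy again"},
    {"source": "forum", "text": "Great buggy, would buy again"},
    {"source": "forum", "text": "I was going to say that the..."},
    {"source": "blog", "text": ""},
]

print(flag_problems(posts))
print(sample_by_source(posts, per_source=2))
```

A source where most posts are flagged is worth raising with your provider. A source with a few flags is probably just people being people.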
3. Boo, Boo, Boolean
Your boolean search query is where it all begins. It’s the gateway to good data or a load of irrelevance.
Not all platforms have the same capabilities, and the differences really matter. The query tools look simple to use, but you’ve got to really work at them to make sure you’re getting the best data possible to work with.
The more flexible and capable the boolean tools you have to work with, the more accurate your results will be.
As a rough rule of thumb, we’re happy with around 50% relevancy from the query. It’s usually hard to get much more than this if you want to stay open to things you may not have thought of.
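To see where a query sits against that 50% rule of thumb, you can hand-code a random sample of results as relevant or not and project the share onto the full result set. The Python sketch below does that and adds a Wilson score interval to show how much the estimate could wobble. The interval and the function name are our additions for illustration, and the numbers in the example are invented.

```python
import math


def relevancy_estimate(relevant, sample_size, total_results, z=1.96):
    """Estimate query relevancy from a hand-coded random sample.

    Returns (share, low, high, projected_relevant), where low/high are the
    bounds of a Wilson score interval (z=1.96 gives roughly 95% confidence).
    """
    if sample_size <= 0 or not 0 <= relevant <= sample_size:
        raise ValueError("need 0 <= relevant <= sample_size and sample_size > 0")
    share = relevant / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (share + z ** 2 / (2 * sample_size)) / denom
    margin = (z / denom) * math.sqrt(
        share * (1 - share) / sample_size + z ** 2 / (4 * sample_size ** 2)
    )
    return share, centre - margin, centre + margin, round(share * total_results)


# Invented example: 100 of 200 sampled posts were relevant, 40,000 results overall.
share, low, high, projected = relevancy_estimate(100, 200, 40_000)
print(f"Relevancy {share:.0%} (plausible range {low:.0%} to {high:.0%})")
print(f"Roughly {projected:,} relevant posts in the full result set")
```

The bigger the sample you code by hand, the narrower that range gets. This only holds if the sample really is random, not just the first page of results.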
4. You say tomato, I say tomarrrrto
The last tip is about linguistic precision and it’s something to keep in mind when designing your query. It’s about how you choose your terms, rather than what the tool can do with them.
Every topic has an inherent level of linguistic precision. We think of this in terms of how much irrelevant content you’ll get when looking for your terms.
Terms like Sky, Orange or A1 are likely to be very imprecise because they also mean plenty of other things. You’ll attract a lot of false-positive results.
At the other end of the spectrum are very discrete terms which tend to relate to one or a very small set of things.
It’s good to know where you are with this. If you’re working at the imprecise end of the spectrum you’ll need to build in lots of exclusions, make some sacrifices on data breadth, and prepare your stakeholders to understand the extent and limits of your data.
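Before committing exclusions to your real query, it can help to try them out on a sample you’ve already exported. Here’s a minimal Python sketch of an include-and-exclude filter. It is not any listening tool’s boolean syntax, and the function name, terms and example posts are all made up for illustration.

```python
import re


def build_matcher(include_terms, exclude_terms=()):
    """Build a filter: True when a text contains any include term and no
    exclude term. Matching is on whole words and ignores case."""
    if not include_terms:
        raise ValueError("at least one include term is required")

    def compile_terms(terms):
        pattern = "|".join(re.escape(term) for term in terms)
        return re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)

    include = compile_terms(include_terms)
    exclude = compile_terms(exclude_terms) if exclude_terms else None

    def matches(text):
        if include.search(text) is None:
            return False
        return exclude is None or exclude.search(text) is None

    return matches


# Hypothetical exclusions for the brand Orange.
is_brand_mention = build_matcher(["orange"], ["juice", "peel", "colour"])

print(is_brand_mention("My Orange contract is up for renewal"))   # True
print(is_brand_mention("Fresh orange juice every morning"))       # False
print(is_brand_mention("Spilt juice on my Orange phone, help!"))  # False
```

The last example is the sacrifice on data breadth in action: a genuine brand mention is lost because it trips an exclusion. Running your exclusions over a hand-coded sample shows how much of that you’re signing up for.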
Need any help?
Getting under the skin of social data is our bread and butter. We’re a friendly lot, so give us a shout if you’d like to chat any of this through or are struggling with any of it.