Developer Research 101: The right methodology for reliable survey data
By Christina Voskoglou, Senior Director of Research
Suddenly a fine day dawns when your organisation’s key stakeholders agree that you need data to understand your developer audience. Well, ok, most likely that didn’t exactly happen overnight — in fact we know* that nearly 20% of DevRel practitioners struggle to justify the budget of their developer programs and 32% rely on qualitative arguments. But let’s skip that part for now, and fast-forward to that happy moment when there is full buy-in for data-backed developer strategy decisions. Right. You need data. But what data?
First, ask the right questions
Let’s pause here for a second. At SlashData we may have data in our DNA, but we know that plunging head-first into data is not where your quest for answers should begin. Your first step should be, instead, to ensure that you are asking the right questions. However trivial that may sound, you may discover (as many others have) that in fact it is not. If you get the questions wrong, the answers will be meaningless and the time and budget you will have invested in finding data to answer them will be wasted.
Knowing exactly which questions you need answered will help you specify not only what data you need, but also where you should get it from, and how big a sample you should aim for. If it is just a total market share figure you’re after, for example, chances are you don’t need that many data points — neither in terms of sample size, nor in terms of breadth of information collected. If however you are trying to understand what developer personas (or segments) exist out there, where they are located, how they feel about different technologies, and where they’re going next, you’re looking into an undertaking of an entirely different magnitude, and may the Force be with you. Or, more practically, SlashData, as we have been in this business for more than a decade now.
Mind the source
Once you know which questions you need to answer, you should carefully choose your data collection method and sources. If, for example, you want to know how your technology is currently being used, you could use your own telemetry or usage data, or survey your own community. If, however, you need to see what other technologies your developers use, or how competing technologies are being used by your broader audience, you need to look beyond your own community by means of a global survey that targets everyone in your market. Based on our bi-annual survey of developer-facing organisations, we find that about half use their own survey data, and nearly half run qualitative research, when in fact it may be more appropriate to use a different approach.
A common, though less-than-ideal approach, for example, is tracking developer sentiment (usually in the form of Net Promoter Score) based on data collected from current users that interact with you (say, through your website). While this may be a good indication of how your current active users feel, it can not be generalized to represent the sentiment of your whole target audience, as it omits the views of past users who have now left you, and also the views of those who evaluated but rejected your technology in favour of a competitor. Those who abandon a technology are more likely to give it a low recommendation score if asked. By omitting their views and using only current users’ scores you therefore get a positively biased result. This is particularly true in highly competitive low-rigidity markets (such as some cloud services can be) where your current users are more likely to be satisfied fans who stay with you by choice rather than due to technology lock-in effects, while the displeased have already left you to turn to one of your competitors.
That is why in our surveys we always ask both current and past users how they feel about each of the technologies they either use, or have (recently) stopped using, or evaluated but rejected. In this way, we get an unbiased estimate of developer sentiment for the broad range of technologies that we track — allowing us to benchmark them with a high degree of confidence.
It’s not only the size that matters
“Can we discuss sample size already?” I hear you cry. We shall in a moment, I promise, but we need to get something else straight first. You should be collecting a sample of… what exactly? An equally — if not more — important consideration to size is the representativeness of your sample. To use a crude example, there is no point buying a truckload of bananas, when what you’re looking for is apples. Similarly, there is no point collecting an impressively big sample from the US and India alone, for example, when what you’re interested in is global trends. And that’s because developers in different parts of the world behave very differently when it comes to technology choices, as they have different motivations and business models, and may be at different stages in their journey as they form part of developer ecosystems of varied maturity. Our data proves again and again that there are vast differences between regional developer communities. That’s why we go into great lengths, each and every time, to survey developers from more than 150 countries, so that we may truly gauge the pulse of the global developer community.
It’s not only regional diversity, though, that you should carefully balance your sample for. There are several other attributes you should consider, such as the mix of professionals, hobbyists and students, and the size of the organisations that your surveyed professionals work for. The latter is particularly important if, for example, you wish to capture the views of both enterprise developers and startups that are bound to be working with different tools. Demographics such as age may also be important, as you may want to hear from both the young coders, who typically use some technologies more than others (open source software is a prime example here), and from the seasoned developers who may have a deeper understanding but also higher expectations of the tools they use.
You should also ensure that you don’t repeatedly rely on the same pool of developers, say a panel, no matter how big. In such a fast-paced industry, behavioural patterns and user profiles may change without warning. By repeatedly surveying the same people over time, you risk failing to observe the change originating from a different pocket of the developer population than the one your panel comes from. And if you do fail to observe the upcoming trend, you will miss the opportunity to ride the wave of change. This is particularly true for the emerging sectors such as augmented and virtual reality, but also for more ‘exotic’ technologies still in the early stages of their lifecycle, such as DNA computing, self-driving cars, or body-brain computer interfaces.
As we track all of these and many more, we reach out to capture the experiences and the intent of developer populations of all shapes and sizes, from small local meetups to large vendor communities. Our surveys are promoted by more than 70 leading community and media partners each time, and we make sure they are not the same 70 every time, to ensure we are not repeatedly hitting the same pools (or communities) of developers. And while we reach out afresh to the developer population each time, we consistently observe meaningful trends in our data — rather than wild jumps — which proves that we do indeed capture a representative view of the software development industry.
Last but not least, be careful of any incentives you offer to survey takers. These must be carefully designed to appeal to all profiles within your target audience, or you risk creating selection bias, i.e. attracting only developers of specific profiles, rather than a random sample of all developer profiles out there.
Is your data clean?
But no matter how hard you try, you’re bound to get some sample bias — beware of anyone who says they don’t. How do you deal with it?
First, particularly if you are offering incentives, you should clean out all fraudulent — or simply illegitimate — responses. There will always be those who are in it only for the prize, randomly clicking through your survey and diluting results. They may even build smart bots to do that (after all, we are talking about developers here). At SlashData we have developed sophisticated ML algorithms that identify such responses and unceremoniously throw them out. Based on the metadata that our bespoke survey-taking platform tracks, we are able to outsmart the not-so-honest respondents and call them out.
Then, it’s a matter of correcting for over-represented groups. It could be that, despite your best efforts, you attracted disproportionately more hobbyists than you should have done, for example. Or perhaps word got around in a particular language community about this cool survey, and slightly more enthusiasts than what you had hoped for came forth to vouch for their favourite programming language. How do you fix those imbalances? Especially given that you don’t know what the true (or population) proportions are — since that is the very thing you’re trying to estimate. In such cases, some — very very careful — data weighting is in order.
You have to be extremely careful (have I stressed this enough already?) as to how to go about it, first to identify the sources of bias, and then to decide how to correct it without introducing over-correction. But this is a rather long story, and we’ll keep it for another day.
All I shall say here is that at SlashData we treat all the different channels through which we get our data (such as our network of 70+ partners mentioned earlier) as independent samples, which we then compare across a set of parameters which we know may introduce bias. We use ML models to specify the level of correction that should be applied, and take into account all types of bias that a single response may be simultaneously carrying.
What is your margin of error?
We get that question a lot. I hope that by now I have demonstrated that, although important, this should not be your only concern. In fact, the margin of error can be quite misleading if used as the only metric to assess sample and research quality. Let me give you some statistical insight that might shed some light on this problem.
As a quick search can reveal, the margin of error is designed to measure uncertainty in random samples. More specifically, the theory of the margin of error (MoE) applies, strictly speaking, only to questions (but could, under certain circumstances, be generalised to full surveys), and only to perfect random samples. This implies that if the assumption of perfect randomness does not hold (and in the real world in most cases it doesn’t), then the theory collapses and your MoE estimate is meaningless. To go back to our crude example of obtaining a truckload of bananas as a sample of apples, just because you have a truckload (and from a large truck at that), your margin of error estimate will look satisfyingly low. Your calculation, however, will have not accounted anywhere for the fact that these were in fact (loads!) of bananas, not apples, and as such, they make for a useless sample, albeit a tasty one.
That is why at SlashData, instead of just quoting margins of error that when used in isolation may misleadingly inflate confidence in a sample, we focus our efforts on obtaining a sample that is as big, as random and as robust as possible. These are, in fact, the three elements that do lead to a reliable estimate of a margin of error. In other words, it’s not enough to only quote a margin of error. One should also be able to demonstrate that the underlying assumptions of the MoE calculations, namely randomness and normality, are met to a satisfactory degree. So if you’re out there shopping for survey-based research, make sure to first scrutinise any potential sellers for the health of their outreach and sampling methodology. Only then, if satisfied, ask about the margin of error.
Go for a large sample you can dig into
All that said, sample size is, of course, very important. To continue the margin of error discussion, suppose you are faced with a choice of two random samples (that is, samples you can be reasonably sure are close to random). If they both come from the same population, say the global developer population, then, at any given level of confidence, their difference in margin of error will lie in the sample size. Based on our robust developer population sizing research, there are currently (as of Q1 2021) 24.3 million developers in the world. That means, that even at a 99% confidence level, our sample of more than 19,000 developers from across the world yields a margin of error of less than 1% at the question level. If instead you had, for example, a (random) sample of 2,500 developers, your margin of error for the same question would be around 3%.
But having a low margin of error is not the key reason for which you should aim for a large random sample. The main reason is having the ability to dig deeper and slice the data, while still having enough sample left from which to confidently draw conclusions. If, like us, you run unsupervised models, random forests and other ML models to identify developer segments and predict their technology choices, then you need large samples to do it. Otherwise, you end up with a really thin sample that is anything but reliable with regards to the picture of developer personas that it paints. Even if you’re into simply tracking trends for subpopulations of interest, you still need a big-enough sample. In our data dashboards, for example, we give you the option to filter for many attributes, such as age, region, professional status, gender, decision-making power, and much more. If we were to start off with a small sample, filtering would leave you with a tiny, and therefore useless, sample size. For example, filtering in our Developer Population Calculator for those under 25 years of age, who are students, and have up to five years of experience, still leaves us with nearly 4,000 respondents to draw conclusions from.
Are you lost?
Here’s a cheat sheet:
- Ask the right questions. Make sure you accurately specify what business questions you need answered, and by which audience.
- Select the data collection method (such as a large-scale survey, telemetry, qualitative research, etc.) that is best suited for the problem you’re trying to solve.
- Carefully design your developer outreach to obtain a sample that is representative of the population you are interested in.
- Aim for a large sample, so that you may confidently dig into it, if you need to.
- Clean your data from illegitimate, or even fraudulent responses.
- If you’re confident enough that you have a random sample, estimate your margin of error — at the question, not survey, level.
- Check for sample bias and correct for any obvious deviations from randomness, without overfitting (or over-correcting).
In short, as you may have guessed by now, the art of research design and developer outreach is not for the faint-hearted. And it can not be wrapped up in a margin of error figure. But fear not. With more than 10 years of experience in mapping the developer ecosystem through large-scale surveys, we are here to help. All you have to do is get in touch.
*Based on our Developer Program Leader surveys.