Developer Research 101: The right methodology for reliable survey data

By Christina Voskoglou, Senior Director of Research

Then one fine day, your organisation’s key stakeholders agree that you need data to understand your developer audience. Well, ok, most likely that didn’t exactly happen overnight; in fact, we know* that nearly 20% of DevRel practitioners struggle to justify the budget of their developer programs and 32% rely on qualitative arguments. But let’s skip that part for now, and fast-forward to that happy moment when there is full buy-in for data-backed developer strategy decisions. Right. You need data. But what data?

First, ask the right questions

Knowing exactly which questions you need answered will help you specify not only what data you need, but also where you should get it from, and how big a sample you should aim for. If it is just a total market share figure you’re after, for example, chances are you don’t need that many data points — neither in terms of sample size, nor in terms of breadth of information collected. If however you are trying to understand what developer personas (or segments) exist out there, where they are located, how they feel about different technologies, and where they’re going next, you’re looking into an undertaking of an entirely different magnitude, and may the Force be with you. Or, more practically, SlashData, as we have been in this business for more than a decade now.

Mind the source

A common, though less-than-ideal, approach is tracking developer sentiment (usually in the form of a Net Promoter Score) based on data collected from current users who interact with you (say, through your website). While this may be a good indication of how your current active users feel, it cannot be generalised to represent the sentiment of your whole target audience: it omits the views of past users who have since left you, as well as those who evaluated your technology but rejected it in favour of a competitor. Developers who abandon a technology are more likely to give it a low recommendation score if asked, so by using only current users’ scores you get a positively biased result. This is particularly true in highly competitive, low-rigidity markets (as some cloud services can be), where your current users are more likely to be satisfied fans who stay with you by choice rather than due to technology lock-in effects, while the displeased have already left you for one of your competitors.
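
To make that bias concrete, here is a minimal sketch, with made-up scores, of how a Net Promoter Score computed from current users only can diverge from the score of the full target audience once churned users and rejected evaluations are counted in:

    # Toy example: NPS = % promoters (scores 9-10) minus % detractors (scores 0-6).
    # The scores below are illustrative, not real survey data.

    def nps(scores):
        promoters = sum(1 for s in scores if s >= 9)
        detractors = sum(1 for s in scores if s <= 6)
        return 100 * (promoters - detractors) / len(scores)

    current_users   = [9, 10, 8, 9, 7, 10, 9, 6, 8, 9]   # still using the product
    churned_users   = [3, 5, 2, 6, 4]                     # recently stopped using it
    rejected_trials = [4, 6, 3, 5]                        # evaluated, chose a competitor

    print(round(nps(current_users)))                                      # +50
    print(round(nps(current_users + churned_users + rejected_trials)))    # -21

In this toy example the current-users-only score looks like a healthy +50, while the full-audience score is actually negative.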

That is why in our surveys we always ask both current and past users how they feel about each of the technologies they either use, or have (recently) stopped using, or evaluated but rejected. In this way, we get an unbiased estimate of developer sentiment for the broad range of technologies that we track — allowing us to benchmark them with a high degree of confidence.

It’s not only the size that matters

It’s not only regional diversity, though, that you should carefully balance your sample for. There are several other attributes you should consider, such as the mix of professionals, hobbyists and students, and the size of the organisations that your surveyed professionals work for. The latter is particularly important if, for example, you wish to capture the views of both enterprise developers and developers working in startups, who are bound to be using different tools. Demographics such as age may also matter: you may want to hear both from young coders, who typically use some technologies more than others (open source software is a prime example here), and from seasoned developers, who may have a deeper understanding but also higher expectations of the tools they use.
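
As a trivial illustration of what ‘balancing’ means in practice, the sketch below compares the mix you actually obtained against the mix you designed for; the target and achieved shares are entirely hypothetical:

    # Compare achieved sample composition against (hypothetical) target shares.
    achieved = {"professional": 0.77, "hobbyist": 0.15, "student": 0.08}
    target   = {"professional": 0.60, "hobbyist": 0.25, "student": 0.15}

    for group in target:
        gap = achieved[group] - target[group]
        status = "over-represented" if gap > 0.05 else (
                 "under-represented" if gap < -0.05 else "roughly on target")
        print(f"{group:12s} {achieved[group]:>4.0%} vs {target[group]:.0%} target: {status}")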

You should also ensure that you don’t repeatedly rely on the same pool of developers, say a panel, no matter how big it is. In such a fast-paced industry, behavioural patterns and user profiles may change without warning. By repeatedly surveying the same people over time, you risk missing a change that originates in a different pocket of the developer population than the one your panel comes from. And if you fail to observe an upcoming trend, you will miss the opportunity to ride the wave of change. This is particularly true for emerging sectors such as augmented and virtual reality, but also for more ‘exotic’ technologies still in the early stages of their lifecycle, such as DNA computing, self-driving cars, or body-brain computer interfaces.

As we track all of these and many more, we reach out to capture the experiences and the intent of developer populations of all shapes and sizes, from small local meetups to large vendor communities. Our surveys are promoted by more than 70 leading community and media partners each time, and we make sure they are not the same 70 every time, to ensure we are not repeatedly hitting the same pools (or communities) of developers. And while we reach out afresh to the developer population each time, we consistently observe meaningful trends in our data — rather than wild jumps — which proves that we do indeed capture a representative view of the software development industry.

Last but not least, be careful of any incentives you offer to survey takers. These must be carefully designed to appeal to all profiles within your target audience, or you risk creating selection bias, i.e. attracting only developers of specific profiles, rather than a random sample of all developer profiles out there.

Is your data clean?

First, particularly if you are offering incentives, you should clean out all fraudulent — or simply illegitimate — responses. There will always be those who are in it only for the prize, randomly clicking through your survey and diluting results. They may even build smart bots to do that (after all, we are talking about developers here). At SlashData we have developed sophisticated ML algorithms that identify such responses and unceremoniously throw them out. Based on the metadata that our bespoke survey-taking platform tracks, we are able to outsmart the not-so-honest respondents and call them out.
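
Our own algorithms are rather more involved than this, but to give a flavour of what such cleaning looks for, here is a deliberately simple heuristic that flags two classic tell-tale signals: implausibly fast completion times and ‘straight-lining’ (identical answers to every rating question). The field names and thresholds are made up for illustration:

    # A toy filter for illegitimate responses; real fraud detection would use
    # many more metadata signals and a trained model rather than fixed rules.

    def looks_illegitimate(response, min_seconds=120):
        too_fast = response["seconds_to_complete"] < min_seconds
        ratings = response["rating_answers"]
        straight_lined = len(ratings) >= 5 and len(set(ratings)) == 1
        return too_fast or straight_lined

    responses = [
        {"id": 1, "seconds_to_complete": 45,  "rating_answers": [3, 3, 3, 3, 3, 3]},
        {"id": 2, "seconds_to_complete": 780, "rating_answers": [4, 2, 5, 3, 4, 1]},
    ]

    clean = [r for r in responses if not looks_illegitimate(r)]
    print([r["id"] for r in clean])   # -> [2]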

Then it’s a matter of correcting for over-represented groups. It could be that, despite your best efforts, you attracted disproportionately more hobbyists than you should have, for example. Or perhaps word got around in a particular language community about this cool survey, and slightly more enthusiasts than you had hoped for came forth to vouch for their favourite programming language. How do you fix those imbalances, especially given that you don’t know the true (or population) proportions, since that is the very thing you’re trying to estimate? In such cases, some very, very careful data weighting is in order.

You have to be extremely careful (have I stressed this enough already?) about how you go about it: first identifying the sources of bias, and then deciding how to correct for them without over-correcting. But that is a rather long story, and we’ll keep it for another day.

All I shall say here is that at SlashData we treat all the different channels through which we get our data (such as our network of 70+ partners mentioned earlier) as independent samples, which we then compare across a set of parameters which we know may introduce bias. We use ML models to specify the level of correction that should be applied, and take into account all types of bias that a single response may be simultaneously carrying.
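
For illustration only, this is what the simplest form of such a correction, post-stratification weighting on a single attribute, looks like; the achieved sample and the assumed population shares below are made up, and our actual ML-driven approach corrects for several bias dimensions at once:

    # Each respondent in an over-represented group gets a weight below 1,
    # each one in an under-represented group a weight above 1, so that the
    # weighted sample matches the assumed population shares.

    from collections import Counter

    sample = ["hobbyist"] * 400 + ["professional"] * 600          # achieved sample
    population_share = {"hobbyist": 0.25, "professional": 0.75}   # assumed targets

    counts = Counter(sample)
    n = len(sample)
    weights = {g: population_share[g] / (counts[g] / n) for g in counts}

    print(weights)   # hobbyists 0.625, professionals 1.25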

What is your margin of error?

As a quick search will reveal, the margin of error is designed to measure uncertainty in random samples. More specifically, the theory of the margin of error (MoE) applies, strictly speaking, only to individual questions (though it can, under certain circumstances, be generalised to full surveys), and only to perfectly random samples. This implies that if the assumption of perfect randomness does not hold (and in the real world it usually doesn’t), the theory collapses and your MoE estimate is meaningless. Think of a crude example: you want to learn about apples, but you end up with a truckload of bananas as your sample. Because you have a truckload (and from a large truck at that), your margin of error estimate will look satisfyingly low. Your calculation, however, will not have accounted anywhere for the fact that these were in fact (loads of!) bananas, not apples, and as such they make for a useless sample, albeit a tasty one.
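
For reference, the textbook margin-of-error calculation for a proportion is sketched below, and it also shows why a large sample alone is not enough: nothing in the formula knows whether your respondents were apples or bananas.

    # MoE for a proportion: z * sqrt(p * (1 - p) / n), valid only when the
    # sample is genuinely random and n is large enough. The formula itself
    # cannot detect a biased sample.

    import math

    def margin_of_error(p, n, z=1.96):   # z = 1.96 for a 95% confidence level
        return z * math.sqrt(p * (1 - p) / n)

    print(f"{margin_of_error(0.5, 1000):.1%}")    # ~3.1% at n = 1,000
    print(f"{margin_of_error(0.5, 20000):.1%}")   # ~0.7% at n = 20,000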

That is why at SlashData, instead of just quoting margins of error that, when used in isolation, may misleadingly inflate confidence in a sample, we focus our efforts on obtaining a sample that is as big, as random, and as robust as possible. These are, in fact, the three elements that do lead to a reliable estimate of a margin of error. In other words, it’s not enough to quote a margin of error; one should also be able to demonstrate that the underlying assumptions of the MoE calculation, namely randomness and normality, are met to a satisfactory degree. So if you’re out there shopping for survey-based research, first scrutinise any potential sellers on the health of their outreach and sampling methodology. Only then, if satisfied, ask about the margin of error.

Go for a large sample you can dig into

But having a low margin of error is not the main reason to aim for a large random sample. The main reason is the ability to dig deeper and slice the data, while still having enough sample left to confidently draw conclusions from. If, like us, you run unsupervised models, random forests, and other ML models to identify developer segments and predict their technology choices, you need large samples to do it. Otherwise, you end up with a really thin sample that is anything but reliable with regard to the picture of developer personas it paints. Even if you simply want to track trends for subpopulations of interest, you still need a big enough sample. In our data dashboards, for example, we give you the option to filter by many attributes, such as age, region, professional status, gender, decision-making power, and much more. If we were to start off with a small sample, filtering would leave you with a tiny, and therefore useless, sample size. For example, filtering in our Developer Population Calculator for those under 25 years of age, who are students, and have up to five years of experience still leaves us with nearly 4,000 respondents to draw conclusions from.
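
The arithmetic of slicing is unforgiving, as the toy calculation below shows; the starting sample size and the filter shares are hypothetical, chosen only to illustrate how quickly successive filters eat into your sample:

    # Each successive filter multiplies the usable sample away, which is why
    # the starting sample needs to be large. All numbers are illustrative.

    n = 30_000                          # respondents in the starting sample
    filters = {
        "under 25 years old": 0.35,
        "students": 0.40,
        "up to 5 years of experience": 0.85,
    }

    for label, share in filters.items():
        n = int(n * share)
        print(f"after filtering for '{label}': ~{n:,} respondents left")

Start with a few thousand respondents instead of tens of thousands, and the same three filters would leave you with a few hundred at best.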

Are you lost?

  1. Ask the right questions. Make sure you accurately specify what business questions you need answered, and from which audience.
  2. Select the data collection method (such as a large-scale survey, telemetry, qualitative research, etc.) that is best suited for the problem you’re trying to solve.
  3. Carefully design your developer outreach to obtain a sample that is representative of the population you are interested in.
  4. Aim for a large sample, so that you may confidently dig into it, if you need to.
  5. Clean your data of illegitimate, or even fraudulent, responses.
  6. If you’re confident enough that you have a random sample, estimate your margin of error — at the question, not survey, level.
  7. Check for sample bias and correct for any obvious deviations from randomness, without overfitting (or over-correcting).

In short, as you may have guessed by now, the art of research design and developer outreach is not for the faint-hearted. And it cannot be wrapped up in a margin of error figure. But fear not. With more than 10 years of experience in mapping the developer ecosystem through large-scale surveys, we are here to help. All you have to do is get in touch.

*Based on our Developer Program Leader surveys.

We are the analysts of the developer economy. We help the world understand software developers - and vice versa! www.slashdata.co