A team developing a pizza-themed game is facing a dilemma. They are considering whether offering new and unique toppings in a real-money starter pack would improve player conversion, engagement, and satisfaction. Some members of the team believe this is the best direction to take and suggest raising the price of the pack so that players perceive its value as high. They believe the change will have no negative impact.

However, other members of the team disagree. They argue that monetizing pizza toppings in this manner could potentially turn players away. They suggest that separating the changes would be more effective in ensuring that each modification yields positive and measurable results.

Game and product teams face dilemmas like these all the time. Improving KPIs is always a major focus, but so is keeping the game fun. In this instance, the challenge is to determine the right pace of change: implement a significant change quickly and “get on with it,” or proceed slowly and cautiously. The best way to resolve the question is through A/B testing.

What is A/B testing?

A/B testing has been the standard practice for making changes to gaming products for a long time. Although it was originally associated with mobile, today it applies to various types of games on almost all platforms, except for those whose publishing policies make it inherently difficult. The primary goal of A/B testing is to generate positive trends in KPIs. By testing different versions of games against two or more audience cohorts, developers can objectively measure how changes impact their KPIs and determine the best course of action.

However, while the purpose of A/B testing is straightforward, implementing it as a product management methodology for games can be highly complex. Game studios that are new to A/B testing frequently make numerous errors, from overcomplicating their test conditions to improperly applying statistical analysis to evaluate results. Effective A/B testing requires significant rigor, patience, clarity, and a willingness to interpret test results honestly rather than rationalizing negative outcomes.
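On the statistical side, a common failure mode is eyeballing a difference between cohorts instead of testing it. As an illustrative sketch, not tied to any particular analytics stack, a standard two-proportion z-test for conversion rates takes only a few lines of Python (the function name and the example numbers are assumptions for illustration):

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in two conversion rates.

    conv_a / conv_b: converted users in each cohort.
    n_a / n_b: total users in each cohort.
    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)        # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Example: 4.0% vs. 4.6% conversion with 20,000 users per cohort.
z, p = two_proportion_z_test(800, 20000, 920, 20000)  # z ≈ 2.96, p ≈ 0.003
```

With cohorts this large, a 0.6-point lift is statistically significant; with a few hundred users per cohort, the same lift would not be, which is exactly the kind of judgment that should come from the test rather than from intuition.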

In our experience, the most effective approach to A/B testing is to have a clear understanding of the questions you’re trying to answer, and for the team to agree on the objectives of those questions. Once those two things are in place, A/B testing becomes a systematic process in which game studios meticulously vary specific factors within their game and analyze the outcomes. Typically, studios modify only one factor while keeping all other internal and external factors constant, so the impact of that single change can be measured.

Of course, there are many additional considerations to keep in mind when implementing A/B testing. For instance, it’s crucial to have knowledge of the KPI roadmap, understand the gravity of the impact of a specific change, and be aware of the available remote configuration variables that enable the testing of particular aspects of a feature. Some other important considerations throughout the A/B testing process include:

Building a testing plan

To develop an effective A/B test plan, it is essential to craft clear, focused questions that have measurable answers. These questions will guide the A/B testing process and enable the team to evaluate the results objectively. Good questions to ask could include whether changing the price of an item in the game would lead to an increase in visits to the market. On the other hand, a question that cannot be definitively answered would be whether players like the change. By being clear about the questions being asked and the goals of the test, the team can design an effective test that yields meaningful insights.

Cross-team cooperation

If your A/B test depends on new features, it’s important to loop in the engineering team early so they have sufficient time to build any features or changes the test requires.

Tracking documents

Create a central tracking document to chart out your plan. This document should include a list of your A/B tests, ordered by priority, along with the start and end dates, platforms, regions, the hypothesis being tested, KPIs, variants, results, and any relevant links. It’s important to establish this document early on in the process so you can keep track of all the necessary information and easily update it as needed. To maintain consistency, it’s recommended that you use a naming convention that includes both the date and the topic, such as “economy1b_2022620”.
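As a sketch of what one row of such a tracking document might contain, here is an illustrative entry in Python; the field names and values are assumptions, not a prescribed schema, and in practice this often lives in a spreadsheet with the same columns:

```python
# One illustrative row of a central A/B-test tracking document.
# Field names and values are assumptions; adapt them to your own template.
test_entry = {
    "name": "economy1b_2022620",   # naming convention: topic + date
    "priority": 1,
    "start_date": "2022-06-20",
    "end_date": "2022-07-04",
    "platforms": ["iOS", "Android"],
    "regions": ["US", "CA"],
    "hypothesis": "Raising the starter-pack price lifts perceived value "
                  "and improves conversion.",
    "kpis": ["conversion", "D7 retention", "ARPDAU"],
    "variants": ["control", "price_plus_20pct"],
    "results": None,               # filled in after the test ends
    "links": [],
}
```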

Run one A/B test at a time

If you’re just beginning to use A/B testing, run one at a time and see how the process works before trying multiple tests.

When you do run multiple tests, always ensure that the tests do not touch the same parts of your game. Do not, for example, run two separate tests on grain pricing simultaneously; the results will corrupt each other and leave you with no useful information. Good separation looks like testing changes on entirely separate levels. Poor separation looks like testing product prices and pop-up offers at the same time.

User group variants

To ensure accurate results, it’s important that the user groups in A/B testing do not overlap in any way. For instance, if some players in a multiplayer game experience a change in matchmaking while others do not, it is important to avoid matching them together in battles. Doing so would compromise the integrity of the results and make them unreliable. Therefore, it is necessary to carefully design the user groups and ensure that they are completely separate from each other during testing.
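A common way to get stable, non-overlapping groups is deterministic hashing: the same user always lands in the same group for a given test. A minimal Python sketch, where the function name and the salting scheme are assumptions:

```python
import hashlib

def assign_cohort(user_id: str, test_name: str, n_cohorts: int = 2) -> int:
    """Deterministically assign a user to one of n_cohorts groups.

    Salting the hash with the test name gives every test its own
    independent split, while a given user always lands in the same
    group for a given test -- so groups never overlap within a test.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % n_cohorts

# A multiplayer matchmaker could read this value and only match players
# within the same cohort, keeping variant and control players apart.
cohort = assign_cohort("player_42", "matchmaking_2022")
```

In the matchmaking example above, feeding the cohort value into the matchmaker as a hard filter is one way to guarantee that players who experience the change never meet players who do not.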

Limiting the number of variants

To obtain quick results, it’s advisable to restrict the number of variants you test simultaneously. The reason is that the more variants you test, the more users you need to generate reliable results. Therefore, if time is a concern, testing fewer variants will likely produce more conclusive outcomes.
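The relationship between variants and required users can be made concrete with a standard power calculation. This sketch uses the usual normal-approximation formula for detecting a lift in a conversion rate; since each variant needs roughly this many users, total traffic scales linearly with the number of variants (the function name and example rates are assumptions):

```python
from math import ceil
from statistics import NormalDist

def users_per_variant(p_base, p_target, alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect a conversion-rate
    change from p_base to p_target (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_base - p_target) ** 2)

# Detecting a 4.0% -> 4.6% lift at 5% significance and 80% power needs
# tens of thousands of users per variant.
n = users_per_variant(0.040, 0.046)
```

Doubling the number of variants roughly doubles the traffic needed, which is why fewer variants reach conclusive results sooner.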

Testing new users

To ensure you have enough new users for your test, it’s important to loop in the marketing/user acquisition (UA) team early. This is especially true if you are testing first-time user experience (FTUE) completion rates or changes aimed at improving early retention. In such cases, you’ll need a user acquisition plan that can provide you with qualified users in sufficient quantities when you run the test.

Simultaneous tests

To ensure clear and accurate results, avoid starting new tests while others are still running; overlapping tests can muddy the results of both. Always make sure to stop all current tests before starting new ones.

Split tests by platform

To ensure accurate results, it is important to split tests and test results by platform, such as Android versus iOS. Users on each platform can interact with a game very differently due to factors such as culture, demographics, and economic status. When cohorts are mixed across platforms, the results may become corrupted, leading to inaccurate conclusions.
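Made-up numbers illustrate how pooling platforms can mislead (a form of Simpson’s paradox): the variant wins on each platform separately, yet an uneven platform mix makes it look worse in the pooled totals:

```python
# Illustrative (made-up) numbers: the variant beats the control on each
# platform, but an uneven platform mix flips the pooled comparison.
results = {
    ("iOS", "control"):     {"converted": 270, "users": 2500},  # 10.8%
    ("iOS", "variant"):     {"converted": 120, "users": 1000},  # 12.0%
    ("Android", "control"): {"converted": 50,  "users": 1000},  #  5.0%
    ("Android", "variant"): {"converted": 150, "users": 2500},  #  6.0%
}

def rate(platform, cohort):
    r = results[(platform, cohort)]
    return r["converted"] / r["users"]

# Per platform, the variant wins on both.
assert rate("iOS", "variant") > rate("iOS", "control")
assert rate("Android", "variant") > rate("Android", "control")

# Pooled across platforms, the comparison reverses: most variant users
# came from the lower-converting platform.
pooled = {
    cohort: sum(results[(p, cohort)]["converted"] for p in ("iOS", "Android"))
          / sum(results[(p, cohort)]["users"] for p in ("iOS", "Android"))
    for cohort in ("control", "variant")
}
assert pooled["variant"] < pooled["control"]
```

A well-randomized single test keeps the platform mix even between cohorts; this distortion appears precisely when results are mixed across platforms, which is why splitting by platform matters.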

Don’t end your tests early

It is important not to stop an A/B test before the results are available. For example, if you are testing the impact of a change on day-14 retention, you should run the test for the full 14 days after attaining a sufficient number of users. If the test is stopped prematurely, the users who experienced the change may be returned to the general population, where their behavior will differ from that of existing users, invalidating the day-14 test data.

Involve your community

To prevent player dissatisfaction and avoid derailing player strategy, it’s important to inform your community when changes are being made. Even if you end up choosing the control variant, it’s still important to message the players who saw the variant and take the necessary steps to ensure they have no issues shifting back.

Adjusting the A/B test

Finally, it is important to remember that not all A/B tests are created equal, and therefore each of the factors mentioned above should be considered and weighed for each individual test, rather than assuming that they will all work in every situation.

In conclusion, A/B testing can be a powerful tool for game developers looking to optimize their game and improve player engagement. However, it’s important to keep in mind the various factors that can impact the validity of A/B test results, including user group overlap, test duration, variant limits, and platform differences.

By following the tips and best practices outlined in this guide, you can ensure that your A/B testing process is as effective as possible. Remember to start with a clear hypothesis, create a tracking document, limit the number of variants, loop in marketing/UA, and test each variant by platform, among other key steps.

While it’s true that not all A/B tests are the same, by weighing these factors for each test and carefully analyzing the results, you can make data-driven decisions to enhance your game and create a better player experience.