Payment Gateway A/B Testing Guide

How to effectively test different payment providers and methods

Why bother with A/B testing your payment gateway?

In a perfect world, if a user decides to pay for the service or content you are offering, they will simply click all the way through to the payment gateway, and proceed with the payment –  simple as that.

But in real life, payment gateways are not created equal, and usually differ in terms of:

payment conversion rates
payment acceptance rates
level of fraudulent transactions
level of chargebacks
downtimes & timeouts

There are plenty of payment solutions to choose from, but with so many options, how can you tell which one would best fit your business? It's impossible to know before you actually implement the payment gateway and test it on your website or app. Testing different payment gateways one by one would be really time-consuming and, even worse, can cost you some serious money if you end up with a bad solution.

So what do successful merchants do? To make the best choice in the shortest time possible and with the least effort and cost involved, they simultaneously test multiple payment options using A/B testing or split testing. As evidenced below, A/B and split testing are the most reliable, safest, and easiest ways to test your payment gateway setup and choose the configuration with the best balance of conversion and chargeback rates.

How do A/B testing and split testing differ and when to use which?

A/B testing and split testing are often used interchangeably and considered the same thing. But the difference lies in the details. 

Split testing involves directing your traffic to two separate payment gateway URLs. This typically means a 50/50 traffic split between your current payment page and your new payment setup, e.g.:

  • 50% of the traffic goes to yourwebsite.com/your-current-payments
  • 50% of the traffic goes to yourwebsite.com/new-payment-setup

A/B testing, on the other hand, means that your test runs on the same URL: roughly 50% of your visitors see different copy and design elements, whereas the remaining 50% see the unchanged page. 

The setup may look something like this:

  • Version A yourwebsite.com/your-payment-page – a control version with your current payment setup 
  • Version B yourwebsite.com/your-payment-page – a new payment setup shown dynamically on page load

Both methods are controlled experiments that run one or more variations against the original page.
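
To make the mechanics concrete, here is a minimal TypeScript sketch of how a same-URL A/B assignment could work: each new visitor is randomly bucketed into variant A or B, and the choice is persisted in a cookie so returning visitors keep seeing the same payment setup. The cookie name and the two render functions are illustrative assumptions; in practice, your testing tool handles this assignment for you.

```typescript
// Minimal client-side A/B bucketing sketch (illustrative only); real testing
// tools such as Google Optimize or VWO handle this assignment for you.

type Variant = "A" | "B";

const COOKIE_NAME = "payment_ab_variant"; // hypothetical cookie name

// Hypothetical render functions for the control and challenger payment setups.
declare function renderCurrentPaymentSetup(): void;
declare function renderNewPaymentSetup(): void;

function readVariantCookie(): Variant | null {
  const match = document.cookie.match(new RegExp(`${COOKIE_NAME}=(A|B)`));
  return match ? (match[1] as Variant) : null;
}

function assignVariant(): Variant {
  // Keep returning visitors in the same bucket so results stay consistent.
  const existing = readVariantCookie();
  if (existing) return existing;

  // 50/50 random split for new visitors, persisted for 30 days.
  const variant: Variant = Math.random() < 0.5 ? "A" : "B";
  document.cookie = `${COOKIE_NAME}=${variant}; path=/; max-age=${60 * 60 * 24 * 30}`;
  return variant;
}

// On page load, render the control or the new payment setup on the same URL.
if (assignVariant() === "B") {
  renderNewPaymentSetup();
} else {
  renderCurrentPaymentSetup();
}
```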

You can, of course, create more than two variations – such tests are known as A/B/n tests. They require significantly more traffic and conversions and are subject to a much higher risk of errors. 

Below, we will break down the whole process of setting up and running an A/B or split test on your payment gateway setup.

The metrics to use when optimizing your payment gateway for higher success 

Every optimization journey starts with picking specific performance goals and metrics for which you want to see an uplift. The same applies to A/B testing: before you test, you need to know precisely how to measure the success or failure of your tests and the performance of your payment gateway. We want to help you make the right choice by presenting a set of the most popular metrics that merchants use to evaluate payment gateway performance.

Let’s have a closer look at each one of them.

Payment acceptance and decline rate  

Even if a customer provides all the correct credit card data and clicks the right button, their payment may still be declined by the bank. There are multiple reasons for that:

  • The payment gateway works with an offshore acquiring bank that has problems accepting local payments, e.g., it may flag them as fraudulent 
  • The payment gateway's anti-fraud filters are not flexible enough and may block legitimate transactions as fraudulent
  • There is downtime on the payment platform or acquiring bank side, and the payment cannot be processed
  • There are not enough funds in the cardholder's account
  • The payment gateway does not support the card provided by the customer

The most robust payment gateways (e.g., SecurionPay) can track payment acceptance and decline rates in real time and even provide you with in-depth data about each declined transaction. 

These metrics are usually calculated by dividing the number of successful transactions by the total number of payment attempts.
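
As a minimal illustration with made-up numbers, the calculation is simply:

```typescript
// Acceptance and decline rates from raw payment counts (hypothetical numbers).
function acceptanceRate(successfulTransactions: number, paymentAttempts: number): number {
  return successfulTransactions / paymentAttempts;
}

const attempts = 1200;   // total payment attempts in the period
const successful = 1020; // payments accepted by the bank

const acceptance = acceptanceRate(successful, attempts); // 0.85 -> 85%
const decline = 1 - acceptance;                          // 0.15 -> 15%

console.log(`Acceptance: ${(acceptance * 100).toFixed(1)}%, decline: ${(decline * 100).toFixed(1)}%`);
```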

Payment gateway conversion rate

The payment gateway conversion rate is a broader metric than the payment acceptance rate; the acceptance rate is actually an integral part of it. But on top of payment acceptance issues, there are several other reasons why a payment may fail:

  • UX issues make it hard for users to navigate through the payment gateway and provide their payment data
  • There are functional issues preventing some users from actually using the payment gateway
  • There are no payment methods that the user recognizes or trusts
  • The payment gateway uses a language or currency that is not native to the user
  • The payment gateway design looks sloppy and outdated, causing user distrust

This metric is usually calculated by dividing the number of users who successfully reach the end of the payment process (i.e., see the "Thank You For Your Payment" page) by the total number of users who enter the payment gateway process. It can be easily set up in any web analytics software like Google Analytics, as sketched below.
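
One way to feed this metric into Google Analytics is to fire a conversion event when the user lands on the confirmation page. The sketch below assumes a gtag.js (GA4-style) setup; the event parameters and values are illustrative, and with classic goal tracking you could instead register the "Thank You" page URL as a destination goal.

```typescript
// Report a successful payment as a conversion event, assuming gtag.js is
// already installed on the site (GA4-style setup).
declare function gtag(...args: unknown[]): void;

function trackSuccessfulPayment(transactionId: string, value: number, currency: string): void {
  gtag("event", "purchase", {
    transaction_id: transactionId,
    value,
    currency,
  });
}

// Example call on the "Thank You For Your Payment" page (illustrative values).
trackSuccessfulPayment("ord-10293", 49.0, "EUR");
```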

Payment gateway abandonment rate

This metric is related to the payment gateway conversion rate. It's simply the ratio of people who drop off at individual steps of the payment gateway process, calculated from the number of people who never reach the end of the payment process. You can easily set it up in any web analytics software, e.g., using a funnel report in Google Analytics.
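
As a minimal sketch with hypothetical step counts, the per-step drop-off and the overall abandonment rate can be derived from funnel data like this:

```typescript
// Per-step drop-off and overall abandonment rate from funnel step counts
// (step names and numbers are hypothetical).
const funnel = [
  { step: "Payment page viewed", users: 5000 },
  { step: "Card form started", users: 3400 },
  { step: "Card form submitted", users: 2600 },
  { step: "Payment confirmed", users: 2300 },
];

for (let i = 1; i < funnel.length; i++) {
  const dropOff = 1 - funnel[i].users / funnel[i - 1].users;
  console.log(`${funnel[i - 1].step} -> ${funnel[i].step}: ${(dropOff * 100).toFixed(1)}% drop off`);
}

const conversionRate = funnel[funnel.length - 1].users / funnel[0].users;
const abandonmentRate = 1 - conversionRate;
console.log(`Conversion: ${(conversionRate * 100).toFixed(1)}%, abandonment: ${(abandonmentRate * 100).toFixed(1)}%`);
```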

Average subscription value / purchase value

Optimizing the payment gateway is not always about the number of successful transactions. You may actually prefer a payment gateway setup that leads to a lower conversion rate if it offers a bonus in the form of increased purchase value. 

This metric is a simple calculation of how much money your users spend during a single purchase, e.g., whether they choose higher subscription plans.

Time to complete the transaction

Perhaps this is as granular as you can get when optimizing your payment gateway's performance. In essence, the time to complete the transaction allows you to measure the direct impact of UX improvements in the payment form. The time users need to fill in all the form fields and complete the transaction should correlate directly with your gateway conversion rate. 

Usually, this metric is measured as the time difference between the moment the user interacts with the first form field and the point where the payment is marked as successful.
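
A minimal front-end sketch of that measurement might look like the following; the form selector and the way the success callback is wired up are assumptions, and in practice you would send the duration to your analytics tool rather than log it.

```typescript
// Measure the time from the first interaction with the payment form to the
// moment the payment is confirmed (illustrative sketch).
let firstInteractionAt: number | null = null;

const paymentForm = document.querySelector<HTMLFormElement>("#payment-form"); // hypothetical selector

paymentForm?.addEventListener("focusin", () => {
  if (firstInteractionAt === null) {
    firstInteractionAt = performance.now();
  }
});

// Call this from wherever your integration reports a successful payment.
function onPaymentSuccess(): void {
  if (firstInteractionAt === null) return;
  const secondsToComplete = (performance.now() - firstInteractionAt) / 1000;
  console.log(`Time to complete the transaction: ${secondsToComplete.toFixed(1)}s`);
  // In practice, send this value to your analytics tool instead of logging it.
}
```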

Chargeback ratio 

This metric, although not directly influenced by your payment gateway optimization efforts, should be closely monitored during the whole process. Chargeback ratio is the percentage of successful transactions followed by a chargeback claim. Every merchant should know that an increased chargeback ratio spells severe trouble and may be costly, so it’s good to keep it as low as possible. 

After running one or two A/B tests, you may decide to switch to a new payment gateway, and your conversion rate may skyrocket – but so may your chargebacks. Treat your chargeback ratio as a health-check metric that you monitor along with all your main metrics.

Optimize your payments to convert up to 19% more transactions

Merchants who process payments using SecurionPay see a measurable increase in conversion & acceptance rates.

Get in touch with us to boost the performance of your payment gateway

2. Pre-test research – finding what is wrong with your payment gateway setup

Before picking multiple payment gateway solutions and A/B testing them, it is good to know exactly what's wrong with your current payment gateway setup in the first place. 

Identifying all the issues with your current payment gateway setup can give you an idea of what kind of solution you really need. For example, if you discover that your current payment gateway is not optimized for mobile devices, has terrible overall UX, and has a decreased conversion rate in specific regions, you will know what to look for in an alternative payment gateway for your website.

Below, we present a set of tools and research methods that you can use on your own to uncover issues with your payment gateway setup. 

Technical analysis

Technical analysis is all about finding basic bugs in your payment setup, which are your conversion killers. There are three main components of technical analysis:

  • Cross-browser analysis – To perform a cross-browser analysis, simply open Google Analytics and go to the Audience -> Technology -> Browser & OS report. You will see the conversion rate (for the goal of your choice) per browser. Look for browsers where the conversion rate is significantly lower and drop-off rates from the payment process are significantly higher. If you suspect problems with a specific browser in your reports, you can verify this using https://crossbrowsertesting.com and http://www.browserstack.com without having to install all the browsers on your device.

 

  • Cross-device analysis – To perform a cross-device analysis, open up your Google Analytics and go to Audience -> Technology -> Mobile Traffic. The screen shows the conversion rate (for the goal of your choice) per device type and manufacturer. Look for devices for which the conversion rate is significantly lower and drop-off rates from the payment process are considerably higher. If you suspect issues with a specific device type in your reports, you can test it using services like https://crossbrowsertesting.com and http://www.browserstack.com 
  • Speed analysis – If your payment gateway page takes too long to load, it can cause a lot of friction for your users, especially when they try to make the payment on mobile devices with a slow internet connection. To identify loading speed problems, you can use a tool like Google PageSpeed Insights or the Pingdom Website Speed Test.

If your technical analysis indicates that your current payment gateway setup is not optimized for all mobile devices and browsers, the big question is, "does the payment provider offer you enough flexibility to fix that?" More often than not, you are stuck with a pre-defined setup, and there is not much you can do – other than A/B test against other payment gateways that are better optimized for mobile devices and run faster across browsers. 

Web Analytics analysis

Funnel reports are the go-to tool to use when analyzing your payment gateway’s performance in Google Analytics. We assume that if you’re considering running A/B tests on your payment gateway, you have already set up conversion tracking in your Google Analytics account. 

The most obvious use case for funnel reporting in GA is identifying the steps of the payment process where most users drop off. This is perhaps one of the best reports for tracking the payment gateway abandonment rate discussed in the previous chapter. It's also a way to show the performance boost after switching to a new payment gateway. The most basic funnel visualization report could look like this:

There are, however, quite a few limitations to the typical goal funnel visualization in Google Analytics (including backfilling and the lack of segmentation capabilities). Luckily, most of these shortcomings can be overcome (e.g., if you've implemented Enhanced Ecommerce) by using horizontal funnels.

Alternatively, you can use the goal flow report, which looks something like this…

The goal flow report is generally considered more flexible. It allows us to use advanced segments and see retroactive data when a funnel is added or changed. 

The goal flow report offers more detailed information about the paths people take, friction points, and other insights, e.g., where to prioritize your digging or optimization. It shows the most accurate visitor path before they complete a goal. On top of that, it allows users to loop back and, rather than recording an exit, shows when someone goes from step one to step two and then back again. Goal flow reports don't backfill steps if the visitor skips a step in the funnel, and they record the actual order in which the steps of the funnel are viewed.

Web analytics – user segments analysis

Funnel analysis alone might not be enough to tell what’s wrong with your current payment gateway setup. To gain deeper insights, you will have to apply one of the key features of Google Analytics – segmentation. 

You will achieve the best results when applying segmentation to one of the previously discussed funnel reports. 

When analyzing your payment gateway funnel, try applying the following segments:

  • Users attempting to pay from different countries – try segmenting your funnel by user groups from different countries, especially ones with the most significant representation in your overall user audience. Try to answer the following questions: Is my payment gateway optimized for potential customers coming from different countries? Are conversion rates for some specific countries or regions of the world exceptionally low? 
  • Users trying to pay on mobile devices – an extension of the cross-device analysis discussed earlier. Try to answer the following questions: Is my payment gateway responsive enough, and does it work well on all devices? Can users actually pay with one hand? Does my payment gateway offer mobile payments?
  • Users browsing in different languages – segment your payment gateway funnel by users browsing the web in different languages. Try to answer the following questions: Are there any specific languages with an above-average funnel drop-off rate? Does my payment gateway offer dynamic translation options? Does my payment gateway cover all the most popular languages used by my users?
  • Users on slower connection types – segment your payment gateway funnel by users browsing the web on slower internet connections, which can be typical for many developing countries. Try to answer the following question: Will my payment gateway work well on slow internet connections?

Those are just some basic segments that will help you discover the key issues with your payment gateway setup. We recommend that you experiment with more sophisticated segments to reflect your actual user base better.

Using mouse click and scroll analytics

Another useful set of analytical tools you can use to analyze your payment gateway performance is mouse click, movement, and scroll data. We can record what people do with their mouse or trackpad and quantify this information. Mouse tracking provides accurate insights on user activity and allows you to spot trends in their movement in the form of heatmaps. The two most interesting types of heatmaps are click maps and scroll maps.

Click maps

A click map is a visual representation of aggregated data about where people click. The red color means “lots of clicks”. 

Click maps are very useful when analyzing specific payment forms, e.g., longer credit card forms and checkout processes. With this report, you can, for example, identify the credit card form field that causes the most friction. Fields that people click nervously before eventually abandoning the form altogether will show up yellow. If the next form field gets no clicks, you can be fairly sure that the yellow field is the troublemaker. 

Scroll map

Scroll maps come in handy when you want to analyze scroll depth, i.e., how far down people scroll. The longer the page, the fewer people make it all the way down. But it gets a completely new meaning when it comes to payment forms. If people don’t scroll more than 50% down your form, you may have a problem – it could be too long, or something else is causing friction. For example, if you use strong horizontal lines or color changes (e.g., white background suddenly becomes orange), people may understand them as ‘logical ends,’ and assume that whatever follows is no longer connected to what came before.
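
If your heatmap or analytics tool does not already record scroll depth, a minimal sketch of tracking it yourself could look like this (the dataLayer event name is an assumption about a GTM-style setup):

```typescript
// Track the maximum scroll depth reached on the payment page and report it
// when the visitor leaves (illustrative GTM-style sketch).
declare const dataLayer: Record<string, unknown>[];

let maxScrollDepth = 0; // percentage of the page scrolled, 0-100

window.addEventListener("scroll", () => {
  const scrollable = document.documentElement.scrollHeight - window.innerHeight;
  if (scrollable <= 0) return;
  const depth = Math.round((window.scrollY / scrollable) * 100);
  maxScrollDepth = Math.max(maxScrollDepth, depth);
});

window.addEventListener("pagehide", () => {
  dataLayer.push({ event: "scroll_depth", scroll_depth_percent: maxScrollDepth }); // hypothetical event name
});
```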

Using form analytics to uncover friction in the credit card forms

Form analytics is perhaps one of the most insightful research methods to identify issues with your payment gateway forms. According to data from Formisimo, roughly two-thirds of those who start filling out a form never complete it. To find out what’s causing users to abandon the form, you would need to analyze their interaction with each individual field in the form. This is where “form analytics” comes into play. 

Form analytics will analyze form performance down to individual fields and provide you with insights like:

  • How often are forms abandoned or “dropped”?
  • How many times do users re-enter or correct field data?
  • Which fields are regularly ignored or left blank?
  • How many form submissions generate errors?
  • How long does it take to complete a form field (or the entire form)?
  • Which form fields cause the most error messages?
  • Which form fields do people hesitate to fill in (hesitation time in milliseconds)?
  • Which form fields do people leave empty, even though they're required?

Unfortunately, form analytics is not something you get out of the box with Google Analytics. It requires additional setup via Google Tag Manager; GTM expert Simo Ahava has published step-by-step guides on how to do that. The core idea is sketched below.
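
If you go the GTM route, the core idea is to push a custom event to the data layer whenever a user leaves a form field, together with how long they spent in it and whether they left it empty. The sketch below is a minimal illustration of that idea; the form selector and event name are assumptions, and Simo Ahava's guides cover the production-grade version.

```typescript
// Minimal field-level form analytics: push a dataLayer event on every field
// blur with the time spent in the field and whether it was left empty.
declare const dataLayer: Record<string, unknown>[];

const paymentForm = document.querySelector<HTMLFormElement>("#payment-form"); // hypothetical selector
const fieldEnteredAt = new Map<string, number>();

paymentForm?.addEventListener("focusin", (event) => {
  const field = event.target as HTMLInputElement;
  if (field.name) fieldEnteredAt.set(field.name, performance.now());
});

paymentForm?.addEventListener("focusout", (event) => {
  const field = event.target as HTMLInputElement;
  const enteredAt = field.name ? fieldEnteredAt.get(field.name) : undefined;
  if (enteredAt === undefined) return;

  dataLayer.push({
    event: "form_field_interaction", // hypothetical event name
    field_name: field.name,
    time_in_field_ms: Math.round(performance.now() - enteredAt),
    left_empty: field.value.trim() === "",
  });
});
```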

If you don’t feel comfortable with GTM and custom events, you may want to use one of the off-the-shelf solutions like:

  • HotJar
  • Formisimo
  • Zuko
  • Decibel Insights

Whichever solution you choose, insights from form analytics will certainly be worth the money and the hassle. This is the single research method that can uncover the most issues with your payment form UX. 

Using heuristic walkthrough

For the best results, we strongly encourage you to use all of the discussed tools and research methods in parallel. But if, for some reason, you do not have the time and resources to try all the previous methods, the heuristic walkthrough is a quick-and-dirty technique you can use in almost any conditions. Be warned, though, that the quality of insights from a heuristic walkthrough alone will be inferior to what you get when combining it with other methods. 

“By definition: Heuristic analysis is an experience-based assessment where the outcome is not guaranteed to be optimal but might be good enough.”  Its main advantages include:

  • Speed – it can be done fairly quickly.
  • It's cheap and does not require any additional tools.
  • It does not require extensive technical knowledge or coding.
  • It's one of the best methods to analyze competitors' payment gateway setups.
  • It can be deployed in a hassle-free way on many different websites without any additional investment.

A word of caution – the more experience you have in conversion optimization and payment gateways, the better your findings will be. This is not a rookie research method. 

Now, how do you approach payment gateway heuristic walkthrough? First, you will have to find a framework that you will strictly follow. 

Whichever framework you pick, you need to move it into some kind of spreadsheet, e.g., Excel or Google Sheets. What you are aiming at is a simple walkthrough table where each row is a different evaluation criterion, and each column is a different payment gateway you are evaluating (e.g., your gateway vs. your competitor's setup). 

A sample heuristic evaluation spreadsheet for MecLabs framework could look like this:

The output from a heuristic walkthrough should obviously be a list of potential issues with your payment gateway setup. Still, on top of that, you can get some benchmark for your payment gateway setup and see how it stacks up against your competitors’ gateways. Sometimes a simple but structured comparison can be a real eye-opener. 

What’s next?

Now that you have identified all the issues with your current payment setup and its performance, it's time to look for a potential solution. The most intuitive thing to do would be to simply contact your payment processor and request changes in your payment gateway setup. But you would be surprised at how many payment processors don't provide any flexibility or customization. If you end up in a situation where you have to stick to the out-of-the-box setup (often a poorly designed one), it is time to look for an alternative payment solution. 

If this is the case, then the list of discovered issues is, at the same time, a list of everything your next payment gateway will have to deal with. The list will help you find the right payment gateway provider. Then, if you think you’ve found the right one, run an A/B test and see if the new solution can outperform your current setup and boost your revenue. 

Optimize your payments to convert up to 19% more transactions

Merchants who process payments using SecurionPay see a measurable increase in conversion & acceptance rates.

Get in touch with us to boost the performance of your payment gateway

3. Building a strong test hypothesis

A/B testing is a scientific, rigorous method of optimizing the user experience on the web, and thus it requires a methodical approach. Before you go about setting up and running the test, you need to have a hypothesis. Contrary to common belief, A/B testing is not about testing various website changes and design tweaks – it is always about testing your hypothesis. 

So what is a hypothesis in A/B testing? In a nutshell, you “hypothesize” that if you change something in your current payment gateway set-up, based on some research data, it will produce specific results, i.e., conversion and revenue growth. 

A hypothesis defines why you believe a problem occurs. Furthermore, a good hypothesis should:

  • Be testable and measurable.
  • Solve a conversion problem. 
  • Provide market insights. With a well-articulated hypothesis, your split-testing results give you information about your customers, whether the test “wins” or “loses.”

Craig Sullivan offers a simple hypothesis kit:

    1. Because we saw (research data),
    2. We expect that (change) will cause (impact).
    3. We’ll measure this using (data metric).

And an advanced one:

    1. Because we saw (research data),
    2. We expect that (change) for (population) will cause (impact[s]).
    3. We expect to see (data metric[s] change) over a period of (X business cycles).

So, after all the research we discussed earlier, your testing hypothesis may look something like this:

  1. Because we have noticed that 89% of mobile users abandon our credit card payment form, we expect that applying a new mobile-friendly payment gateway will significantly increase the conversion rate among mobile users. We will know by measuring the payment gateway conversion rate and abandonment rate among mobile users.
  2. Because we have noticed that our payment acceptance rate is 45% lower in the EU, we expect that implementing an EU-based payment gateway will significantly decrease payment decline rates for EU-based users. We will know by measuring payment acceptance and decline rates for our EU user base. 
  3. Because we have noticed that users drop out on certain form fields, causing a 67% form abandonment rate, and that it takes those who succeed 20 minutes on average to complete the form, we expect that simplifying the UX of our payment form will decrease the form abandonment rate. 
  4. Because we have noticed that users from Eastern Europe have a 23% lower conversion rate and a 34% higher payment form abandonment rate, we expect that adding local languages and currencies can boost conversion rates for these segments of users. 

Once you have a clearly defined testing hypothesis, you know exactly what you want to test, why you want to test it, and how you are going to measure the success or failure of the test.  Having a clear testing hypothesis will also give you the confidence that you have picked the right payment gateway setup for the test, and help you validate the whole conversion research.

4. Estimating how much time and how many users and conversions you will need for a valid test

Let's quickly wrap up what we have done so far to be able to run a successful A/B test on your payment gateway setup:

  • We have chosen the metrics (KPIs) for which we want to optimize our payment gateway setup, such as acceptance rate or conversion rate
  • We have done the research and investigated what might be hurting our payment gateway performance 
  • Based on the research insights, we have come up with a hypothesis on how we want to improve our payment gateway performance metrics  

Now it is time to launch the A/B test and see whether our hypothesis holds. But before we jump into testing itself, we need to run some numbers and do a bit of math and statistics. 

A/B testing is a scientific method of validating optimization ideas, so it requires some scientific rigour. As a matter of fact, anyone can set up and launch a test, but not everyone should. Whether you should is determined by the number of users going through your payment gateway daily and the number of users paying through your payment setup.

But why do you need these numbers in the first place? Your test results need to be representative of the whole population of your users. You are running your A/B test only on a portion of your user base, so if there are not enough users and conversions in your test sample, the whole test is just not representative. In the A/B testing world, we call it “statistical significance.” To achieve statistical significance at the end of your test, you need to provide enough users and conversions to test in the first place. 

We strongly recommend that you read more about statistical significance in A/B testing to grasp the idea fully. 

How much traffic and how many conversions do I need to run the test?

There are a few quick and straightforward methods to verify whether you have what it takes to launch an A/B test. First, you can have a look at this A/B testing cheat sheet to get a general idea of how many conversions/payments you need on a daily basis to be able to run the test and how long it may take. 

(Image Source)

As you can see, if your payment gateway is seeing only 5 payments per day (or fewer), you will not be able to run a valid A/B test. If your daily number of successful payments is 6, you can run your test, but it will still take you about 35 days to see the results. 

If you still think you have what it takes to run the test, it is time to crunch the numbers. We strongly recommend using the CXL pre-test analysis calculator.

The numbers you need to provide include:

  • Weekly traffic (session or users) – i.e., the number of people visiting your payment gateway URL or app screen each week. You can easily take this number from your web analytics tool. We recommend that you use the number of “users” instead of sessions, page views, or visits. After all, you are running a test for real people and not for some abstract number of page views. 
  • Weekly conversions – i.e., the number of successful payments you are seeing for your payment gateway setup right now. You can get this metric from your web analytics tool if you have set up conversion tracking, or you can pick this number from the administration panel of your payment gateway. If your payment provider does not provide such metrics, you have yet another reason to switch to another payment gateway. 
  • Number of variations – i.e., the number of payment gateways or setups you would like to test. We strongly recommend that you stick to two variants and never test more at once. Testing more variants requires much more traffic and conversions, and drastically increases the probability of statistical error. 

Below you will see additional numbers that are auto-generated based on the figures you provide above:

  • Baseline conversion rate – i.e., the metric that tells you how efficient your payment gateway is at converting users into customers. This is the metric against which you will most likely be running your A/B test. You can easily calculate it on your own by dividing "weekly conversions" by "weekly traffic."
  • Confidence level – i.e., the number that influences how many users and how many conversions you need to be able to run a valid test. A 95% confidence level tells you that at the end of the test, you want a 95% probability that your results are not accidental. It means there is a 95% probability that if you replicate the test or implement your findings, you will receive similar (or better) results. It also means there is a 5% probability that your test results are not valid, and if you implement them, you will actually see a decrease in conversion. Here you can read more about why you should use a 95% statistical confidence level for your tests and what it means exactly.
  • Statistical power – the probability of getting statistically significant test results at level alpha (α) if there is an effect of a certain magnitude. In other words, it's the ability to see a difference between test variations when a difference actually exists. This is a fairly technical metric, and you don't need an in-depth understanding of it to run successful A/B tests. Nevertheless, we encourage you to have a look at this article explaining the role of statistical power in A/B testing.
  • Number of weeks running the test – i.e., how long you will have to run your test in order to achieve 95% statistical confidence at the given "minimum detectable effect" level and the given number of users taking part in the test. This is only the minimum time to run your test. You should never end your test earlier than this unless the test is broken. 
  • Minimum detectable effect – i.e., how much change in conversion rate your test will need to generate to achieve 95% statistical confidence at a given number of users taking part in the test. The more users take part in the test, the less uplift your test will need to generate, and vice versa. 
  • Visitors per variant – i.e., how your test length and minimum detectable effect will vary depending on the number of users that take part in the test. The more users in the test, the faster you will see the results, and the smaller the change in conversion required. 

This is basically all you need to know to validate the numbers before you run an A/B test. You need to know exactly how long the test has to run, how many users should take part in it, and, most importantly, whether you have enough users and conversions to run a valid test. A minimal sketch of the underlying sample-size calculation is shown below.
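
For reference, here is a minimal sketch of the standard sample-size approximation such calculators use when comparing two conversion rates (95% confidence, 80% power). The traffic figures are hypothetical, and the calculators linked above should remain your source of truth.

```typescript
// Approximate required sample size per variant for comparing two conversion
// rates, using z = 1.96 (95% confidence, two-sided) and z = 0.84 (80% power).
function sampleSizePerVariant(baselineRate: number, relativeMde: number): number {
  const zAlpha = 1.96; // 95% confidence level
  const zBeta = 0.84;  // 80% statistical power
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeMde); // expected rate at the minimum detectable effect
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p2 - p1) ** 2);
}

// Example: 3% baseline payment conversion rate, hoping to detect a 15% relative lift.
const perVariant = sampleSizePerVariant(0.03, 0.15);
const weeklyTraffic = 4000; // weekly users reaching the payment gateway (hypothetical)
const weeks = Math.ceil((perVariant * 2) / weeklyTraffic); // two variants share the traffic

console.log(`~${perVariant} users per variant, roughly ${weeks} weeks of testing`);
```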

Of course, there are plenty of other A/B test calculators you can use to double-check your test numbers. For example, check out this A/B test duration calculator by VWO.

And if you feel you are a real CRO and A/B testing pro, you can use this one – one of the most comprehensive A/B testing calculators you can find on the web. This is what real A/B testing ninjas use. 

What if it turns out that you do not have the numbers, or you just feel overwhelmed by the complexity of all the stats behind A/B testing? Well, we have a workaround for A/B testing beginners. 

Bayesian statistics – a simpler and more approachable way of evaluating your payment gateway A/B tests

If you have a small website or your business is only gaining traction, you will most likely have to face the brutal truth – you do not have enough users or conversions (or both) to run an A/B test that would achieve 95% statistical significance. Alternatively, you would have to run your test for a couple of months to achieve valid results – which does not make much business sense either way. 

Everything we have covered so far about checking if you can run a valid A/B test and the 95% statistical significance falls into the “frequentist statistics” category. This is the most popular statistics model taught in schools and used in most A/B testing tools on the market.

But there is also a less popular statistical model that can be applied to A/B testing, known as Bayesian statistics. This model is much easier for non-technical and business people to grasp. However, it comes with some pitfalls. 

Pros of using the Bayesian model:

  • It's easier to understand for business people, especially when it comes to presenting test results
  • It requires less data to run and validate the test
  • You decide when to end the test and how much error risk you can bear
  • It can take into account not only the number of conversions but also the revenue generated by each of the test variants

Cons of using the Bayesian model:

  • It's less accurate than the frequentist model, carrying more risk of error
  • An inexperienced optimizer can be tempted to stop the test too early and call a winner

Let's have a look at Bayesian A/B testing in action. The best way to do this is to head to this great Bayesian A/B test calculator.

The only numbers you have to provide to evaluate the results of your test are:

  • Number of users that will see version A of your payment gateway
  • Number of users that will see version B of your payment gateway 
  • Number of conversions / successful payments for version A
  • Number of conversions / successful payments for version B

That's it. Everything else in this calculator is optional, but we are going to cover it anyway. First, let's see what you get after hitting the Make calculation button. The first thing you will see is a horizontal chart with two bars – green and red.

This chart is by far the simplest presentation of A/B test results you can get for your tests. 

  • The green bar will always be shown for the winning test variation and will have a % metric attached to it. This % metric indicates its chances of winning / outperforming the other variation.
  • The red bar will always be shown for the losing test variation, also with a % metric attached indicating its chances of winning / outperforming the other variation.

Below the chart, you will see another number that should be self-explanatory. The most important thing is the two % numbers on the chart. In the above example, there is a 91% chance that variation A will have a higher conversion rate in real life than variation B. By real life, we mean the post-test implementation of the winning variation and its actual business impact. 

This is as simple as it gets. You have a 91% chance of being right about your test and around a 9% risk of being wrong. And it's your call either to implement the test results or to consider the test unsuccessful, based on this simple probability metric. 

So if you do not have enough data to achieve the 95% probability, you will have to make a decision based on less favourable odds. For example, you may have to decide based on a 70% probability that the test results are valid, and a 30% risk of the test results being misleading. It all comes down to a simple business decision based on calculated risk and probability. Simple as that. 
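
For the curious, here is a minimal sketch of what such a calculator does under the hood: it models each variant's true conversion rate as a Beta posterior and uses Monte Carlo sampling to estimate the probability that one variant beats the other. The example numbers are made up, and a real calculator will be more sophisticated.

```typescript
// Estimate the probability that variant B's true conversion rate beats A's,
// using Beta(1, 1) priors and Monte Carlo sampling of the posteriors.

function gaussian(): number {
  // Box-Muller transform for a standard normal sample.
  const u1 = Math.random() || 1e-12;
  const u2 = Math.random();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

function sampleGamma(shape: number): number {
  // Marsaglia-Tsang sampler, valid for shape >= 1 (true here, since the
  // posterior parameters are counts + 1).
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  while (true) {
    let x: number;
    let v: number;
    do {
      x = gaussian();
      v = 1 + c * x;
    } while (v <= 0);
    v = v * v * v;
    const u = Math.random();
    if (u < 1 - 0.0331 * x ** 4) return d * v;
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

function sampleBeta(a: number, b: number): number {
  const x = sampleGamma(a);
  const y = sampleGamma(b);
  return x / (x + y);
}

function probabilityBBeatsA(
  usersA: number, conversionsA: number,
  usersB: number, conversionsB: number,
  samples = 100000,
): number {
  let bWins = 0;
  for (let i = 0; i < samples; i++) {
    const rateA = sampleBeta(conversionsA + 1, usersA - conversionsA + 1);
    const rateB = sampleBeta(conversionsB + 1, usersB - conversionsB + 1);
    if (rateB > rateA) bWins++;
  }
  return bWins / samples;
}

// Example: 4,000 visitors per variant, 180 payments for A vs. 210 for B (hypothetical).
console.log(probabilityBBeatsA(4000, 180, 4000, 210).toFixed(3));
```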

But there is one more cool use case for the Bayesian calculator. It can actually build a business case for you, taking into account not only the raw number of conversions but also the actual monetary value and the impact on your revenue. 

In the optional section, you can provide the following metrics:

  • How long is your test going to run, e.g., 14 days?
  • What is the average order/transaction/cart or subscription value?
  • What is the minimum revenue yield in 6 months, i.e., how much revenue uplift you would like to see 6 months after implementing your test results?

If you provide these 3 numbers, the calculator will build a business use case for you that will look something like this:

The most important thing you should pay attention to is the section under the chart. Apart from the previous probability of success and error, you will see “the effect on revenue.” On top of pure probability, you can now use an actual revenue value to drive your decisions. Now it is not about being right or wrong, but it is about losing or earning money. 

Back to the first chart and table, you should also see something like this:

So if your plan is to make €45,000 of additional revenue as a result of your test, or if this is the amount you need to break even, you will also get the probability of hitting this number 6 months after the test. 

We encourage you to learn more about the differences between the Bayesian and frequentist statistical models and their impact on A/B testing.

And a must-read for everyone putting their hands on A/B testing – Most Common Testing Statistics Mistakes and How to Avoid Them. 

5. Setting-up and running your payment gateway A/B test

So far, we have mainly been discussing all the preparation that goes into running a successful A/B test. Now we have finally got to the point where we can discuss launching and running the A/B test itself. Let's take a deep dive.

Drafting & designing your test variations

This is probably the most time-consuming, challenging, and costly phase of A/B testing. You know what performance problems you have in your payment gateway setup and what fixes you would like to implement, so now it is time to do it. Whether you want to run an A/B test or a split test on your payment gateway, always draft a wireframe or mockup before you push anything to your developers and designers. 

Drafting your payment gateway test variation is a crucial step – it serves as a safety buffer and a reality check. It helps to make sure that everyone on the team understands what exactly will be tested and why. It allows you to spot any problems and misconceptions before a line of code is written. 

What are pre-test wireframes? Wireframes are a visual guide to elements on a web page. They convey layout, content, and functionality. A wireframe could be anything from a drawing on a whiteboard to a detailed model. 

There are more than a dozen various wireframes and mockup tools on the market. Our favourite choice is Balsamiq, as it’s easy and fast to use and comes with a vast library of predefined web and app layout elements.

(Image Source)

Of course, you can pick from dozens of other tools available. In fact, we encourage you to go ahead and try as many as possible to suit your needs. 

Below you will find some of the rules to follow when drafting wireframes for your payment gateway test:

  1. Everybody be cool, it’s only a mockup – make sure the people you show it to understand that it isn’t the final version.
  2. Avoid Lorem Ipsum – heed the golden rule of website design: “all content comes before design.” Good copy is one of the most important conversion optimization factors, so why remove it from pre-test wireframes? The UX guru Luke Wroblewski aptly notes that “using dummy content or fake information in the Web design process can result in products with unrealistic assumptions and potentially serious design flaws.”
  3. Prepare a few drafts of your wireframes – wireframing, unlike coding, is cheap. It makes sense to create several variations of the payment gateway and present them to your team. This gives you the chance to tweak the designs and iron out imperfections. 
  4. Test your wireframes/prototypes – do quick and dirty testing on your mockups before coding or deploying them with your testing tool. This can lead to interesting insights about the design which you can show to stakeholders/team members.

After you’ve made sure with your team that you are absolutely satisfied with the results and have followed all the rules, you can push your wireframes to designers and developers. 

Choosing & setting up the right testing tool

After pushing your wireframes through the design and development phase, it's time to implement your test variations in the A/B testing tool of your choice. There are more than 20 primary A/B testing tools on the market – and probably a dozen lesser-known ones. We don't want to dive into the details of each tool – instead, we recommend consulting one of the many tool-comparison resources available to help you make up your mind.

For the purpose of this guide, we would like to focus on Google Optimize. This free tool is directly connected with Google Analytics and probably enjoys the most significant adoption on the market among people running A/B tests. There are probably more guides on how to use Google Optimize than on all the other tools combined. 

(Image Source)

On top of that, Google Optimize natively integrates with Google Analytics, giving you automatic access to rich behavioral insights. You'll also be able to target the valuable segments you've already discovered using Analytics.

Because there are tons of excellent step-by-step guides on setting up Google Optimize on your page and implementing your first test, we will point you straight to the horse's mouth – Google's own documentation.

Pre-test quality assurance

There are a lot of things that might go wrong during your A/B testing, especially if you are doing it for the first time. Below, we provide you with a QA checklist for a bulletproof A/B test:

  • Is this the first time you will be running the A/B testing tool? Did you do a dry run? Even if this is not your first time, run a dry A/B test with no real users or on only a very small portion of your users. 
  • Is the testing tool integrated with web analytics? You need two tracking systems in place to verify the quality and correctness of your test data.  
  • Did you check for any sales or marketing activities or campaigns that might influence the results of your test? 
  • Have all test variant templates been QA tested? Have you tested all test variants on key devices and browsers? 
  • Have you checked the variants' page load speed on different devices?
  • Have you let everyone on your team know about the test? 
  • Does this test have one primary KPI / metric for success?
  • Did you test your variants from several locations (office, home, elsewhere)? Did you get valid data in your test reports?
  • Did you test your variants from different sources, e.g., Google, Facebook? Did you get valid data in your test reports?
  • Did you check both variants for HTML & CSS compatibility issues? 
  • Did you check both variants for JavaScript errors, compatibility issues, and library conflicts?
  • Did you check both variants for rendering issues?
  • Did you check that the tags are firing correctly (both analytics and the testing tool)?

Once again, tools listed earlier – such as BrowserStack, CrossBrowserTesting, and PageSpeed Insights – are helpful when doing a pre-flight check-up.

We strongly recommend that you not skip any of the points on this list. Even the smallest misstep can send your test results down the drain. And when you are running a test on your payment gateway setup, any mistake can cost you serious money in lost revenue.

Launching & monitoring your payment gateway A/B test

Once your test is live and the pre-flight checks are done, your job is to monitor it without interfering. A typical A/B test passes through several distinct data phases before the results stabilize.

(Image credit to Craig Sullivan)

 

  • Noise: This is perhaps the most tricky phase of running the A/B test. Many people doing it for the first time may think the test is broken or even worse – can call a winner after 2 or 3 days of testing. It’s called Noise for a reason – the graphs look random. This is just because samples are small, and results are not indicative of later performance. These graphs won’t tell you of any problems at this stage – because such volatility is expected, screw-ups don’t surface. Note: Do not stop the test at this stage or panic because you see big data fluctuations. Use the previously presented checklist to validate the correctness of your test setup. Run a web analytics diagnostic report and cross-check with the A/B testing tool data. 
  • Volatility: The crazy, noisy period is over. The results may still be unclear, but they are starting to show a pattern. These graphs won't tell you of any problems just yet – because this volatility is normal, screw-ups don't surface easily. Run a web analytics diagnostic report and cross-check it with the A/B testing tool data. 
  • Solidifying and Solid: As you get towards the estimated sample required, you’ll start to see predictability and stability in the responses. You should not see sudden changes at this stage, but check if there’s anything odd – if something was broken in the test, you should have surfaced it by now.

When to stop the A/B test and call for the results?

So, when should you stop? Only after your test has been running for the minimum number of days that you have calculated using one of the provided A/B test calculators. Even if you think you can call the winner, never end your test before it runs for the recommended number of days. 

Another important rule regarding the end of an A/B test is to always end it after a full week or business cycle. No matter how many days you need to run the test, never start it in the middle of the week and wrap it up before the weekend. This is very important, as A/B test data is often completely different on regular workdays and weekends – no matter what industry you are in. 

The main three stopping points are:

  • You have been running your A/B test for the minimum number of days calculated based on the number of users and conversions generated during the test.
  • After you run the test for the minimum number of days, your test has reached 95% statistical significance.
  • You end the test after a full business cycle has passed, i.e., if you started the test on a Monday, you wrap it up on a Monday as well, even once the minimum number of test days has passed.

Things get a bit more complicated if your test has been running for the defined minimum number of days, and you still haven’t reached the 95% statistical significance. If there is no chance that your test will reach statistical significance within a predictable period, then it’s time to reach for the Bayesian statistics, as discussed in the previous chapter.  

Just to recap what we have learned about Bayesian statistics so far:

  • You can stop any time you like, but it’s recommended to stick to the rule of a minimum number of test days and full business cycles
  • You can stop the test even if you did not reach statistical significance – you rely on a simple probability trade-off between making the right and the wrong business decision
  • It’s way easier to communicate the test results and resulting decisions to the business owners and the rest of your team.

To wrap it up: if you have enough data (users + conversions), follow frequentist statistics and wait patiently until your test reaches 95% statistical confidence. If you think you may not have enough data, simply run your test for the number of days you established in your initial calculation, and evaluate the results using the Bayesian model afterward. The stopping rules are summarized in the sketch below.
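
Those stopping rules can be wrapped into a tiny helper. This is a sketch of the decision logic only; the inputs come from your own pre-test calculation and from your testing tool.

```typescript
// Decide whether it is safe to stop a running A/B test, following the rules
// above: run at least the pre-calculated minimum number of days, always end
// on a full business cycle, and require significance unless you plan to fall
// back to the Bayesian evaluation. All inputs come from your own pre-test
// calculation and your testing tool.
interface TestStatus {
  daysRunning: number;
  minimumDays: number;           // from the pre-test calculator
  businessCycleDays: number;     // usually 7
  reachedSignificance: boolean;  // 95% significance reported by the testing tool
  evaluateWithBayesian: boolean; // true if you plan to fall back to the Bayesian model
}

function canStopTest(status: TestStatus): boolean {
  const ranLongEnough = status.daysRunning >= status.minimumDays;
  const fullCyclesCompleted = status.daysRunning % status.businessCycleDays === 0;
  const resultUsable = status.reachedSignificance || status.evaluateWithBayesian;
  return ranLongEnough && fullCyclesCompleted && resultUsable;
}

// Example: 28 days in, the minimum was 21 days, and significance was reached.
console.log(canStopTest({
  daysRunning: 28,
  minimumDays: 21,
  businessCycleDays: 7,
  reachedSignificance: true,
  evaluateWithBayesian: false,
}));
```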

6. Analyzing the test results

Analyzing your test results

There are very limited options when it comes to the A/B test results. You can only end up with one of the five following results:

  • Winner – There's a clear winner, and the stats tell us we can be confident. This can be a test with 95% confidence if you are using the frequentist model or, if you are using Bayesian statistics, whatever probability level makes you comfortable with the next step (for us, it would be a 90% probability).
  • Not enough data – There is not enough data for your testing tool to draw a result and call a winning variant. This is when you would usually try to evaluate your test using Bayesian statistics, and if that is still not enough, you should think about how to relaunch the test so that you can capture more users and conversions.
  • Loser – The control has won, and the stats tell us we can be confident. This is when there is, in fact, a clear winner, but it is not the new, optimized version of the payment gateway you are testing. It may seem disappointing to see the shiny new version of the page lose to the old, dusty version, but there are insights to be found in this loss. It may indicate that: a) we drew the wrong conclusions from the pre-test research and tried to solve a problem that does not exist or is irrelevant to user conversion, b) the new variant did not solve the identified problem, or c) the execution of the test went wrong.
  • Inconclusive – Based on the test results, you can't call a clear winner or loser. This usually means that version A was about as efficient as version B at converting users into successful payments. Once again, this might mean: a) we drew the wrong conclusions from the pre-test research and tried to solve a problem that does not exist or is irrelevant to user conversion, b) the new variant did not solve the identified problem, or c) the execution of the test went wrong.
  • Warning – Test variation C is losing $200k per day, and by the end of the experiment, may lose 8.2M – continue y/n?

There are five possible outcomes, but test results definitely have more shades. Regardless of the final test result, there is a lot to be learned from each test. 

Analyzing your test results in web analytics

This is an obligatory post-test activity. Always verify your test results with the back-up data stored in your web analytics reports. If you are running tests using Google Optimize, the case should be relatively simple, as there is a native integration with Google Analytics. This integration automatically generates A/B test reports in GA, independent from the test report you will see in Google Optimize. 

(Image Source)

If you are using a tool other than Google Optimize, you will have to perform some additional tasks to integrate your testing tool with the web analytics tool of your choice. We strongly recommend that you do this before you run any test. 

Double-checking your test results with web analytics data allows you to be more confident in your data and decision making. Your testing tool could be recording data incorrectly. If you have no other source for your test data, you can never be sure whether to trust it. Create multiple sources of data.

If variations all look the same, you may be tempted to call your test inconclusive. But you don’t have to do that! The variation might still beat the control for specific segments.

For example, if there is a lift for returning mobile visitors – and at the same time a drop for new visitors who are desktop users – those segments might cancel each other out, making it seem like there’s “no difference.” 

But proper segmentation is key. Analyze the test across key segments to investigate that possibility. Even though B might lose to A in the overall results, B might still beat A in other segments (organic, Facebook, mobile, etc.).

There are a ton of segments you can analyze. Optimizely lists the following possibilities:

  • Browser type;
  • Source type;
  • Mobile vs. desktop, or by a device;
  • Logged-in vs. logged-out visitors;
  • PPC/SEM campaign;
  • Geographical regions (city, state/province, country);
  • New vs. returning visitors;
  • New vs. repeat purchasers;
  • Power users vs. casual visitors;
  • Men vs. women;
  • Age range;
  • New vs. already-submitted leads;
  • Plan types or loyalty program levels;
  • Current, prospective, and former subscribers;
  • Roles (if your site has, for instance, both a buyer and seller role).

At the very least—assuming you have an adequate sample size—look at these segments:

  • Desktop vs. tablet/mobile;
  • New vs. returning;
  • Traffic that lands on the page vs. traffic from internal links.

Presenting test results to your team and stakeholders

You will likely have to present the test results to the key people in your company. And often, having clear data behind your test results is not enough to get buy-in from key stakeholders. You will still need to convince people that the test you have run is valid and that implementing the findings will have real business impact. 

We really like the A/B testing reporting template provided by Optimizely.

(Image Source)

It is short, clear, and straight to the point. It cuts down all the statistical and technical stuff that is irrelevant to the business people, and covers the following areas:

  • Purpose: Provide a brief description of “why” we are running this test, including your experiment hypothesis.
  • Details: Include the number of variations and a brief description of the differences; ideally, visualize them with screenshots. 
  • Results: Be as concrete as possible. Provide the percentage lift or loss compared to the original, as well as conversion rates by variation.
  • Lessons Learned: This is your chance to share your interpretation of what the numbers mean, and key insights generated from the data. The most important part of result sharing is telling a story that influences the decisions your company makes and generating new questions for future testing.
  • Revenue Impact: Whenever possible, quantify the value of a given percentage lift with year-over-year projected revenue impact.
  • Next steps: What are your recommendations after the test? What is your action plan to use the test findings to improve the business?

If you want to be more detailed about your test results, you can use this template provided by Craig Sullivan.

It is more descriptive than the previous template and dives deeper into the metrics behind the test. Craig also explains in detail how to fill in each section of the report.

Got your test results? Here’s what to do next

If you have reached the end of your test and are planning to deploy the winning setup, we need to give you a few words of caution. No matter how confident you are in your test results, there is always a risk of error. If you measure your test using frequentist statistics and 95% statistical significance, the probability of error will always be 5%. When using the Bayesian model, the probability of error can go as high as 20%. 

So here are a few pieces of post-test advice:

  • Deploy your winning setup, but keep your old payment gateway setup running for a small percentage of your users. For example, keep showing the old setup to 10%–20% of your users and monitor closely whether your KPIs behave the same way they did during the test.
  • Alternatively, deploy your winning variation and start sending 100% of your traffic to the new payment gateway setup – but observe carefully whether it behaves the way the test results indicated it would.
  • Examine how the new setup is affecting specific user segments. You can use the list of segments we provided in the previous chapters. Do not settle for the overall conversion lift; analyze whether any segments were actually hurt by the new variation.

Always be vigilant and ready to roll back to the previous version in case your post-test implementation does not mimic the test results. Give it a week or two, if you can afford it, and then fully validate the effect of the implementation. 

We know that the amount of research and work required to launch the payment gateway A/B test can be overwhelming. But there is no way around it if you want to make it right. 

If you feel like you could use the help of payment gateway optimization experts, feel free to reach out to the SecurionPay team. We will help you with:

  • Setting up proper tracking tools to validate the performance of your payment gateway setup
  • Validating your current payment gateway setup against KPIs like conversion rate, UX best practices, payment acceptance rate, chargebacks, and overall potential to generate revenue and cut losses
  • Recommending changes to your payment gateway setup that you can A/B test on your way to boosting your payment gateway performance
  • Recommending tools and solutions, and even helping with configuration and development.

Optimize your payments to convert up to 19% more transactions

Merchants who process payments using SecurionPay see a measurable increase in conversion & acceptance rates.

Get in touch with us to boost the performance of your payment gateway

Marek Juszczyński

Full-Stack Marketing Expert With a Strong Background in B2B & Conversion Optimization