I’m going to break all the rules of industry best practices and say that you should test radically different ad copy within the same ad group without regard to limiting variables between copy variations.

If you ask any industry veteran about SEM ad copy testing best practices, they will tell you precisely the opposite. But why? The industry-standard thinking is that you must know why A is better than B. But the end goal of testing is to improve your KPIs, which doesn’t require knowing the why (at least not right away).

Consider this: the way we as an industry test ad copy is the equivalent of building keywords by developing a hunch and iteratively launching and testing keywords one modifier at a time. Any SEM professional would agree it would be ridiculous to build keywords based on hunches. (For example, say you have a hunch that “digital camera” will outperform “SLR camera”, so you launch only those two terms to compare which one is better. Then, upon finding that “SLR camera” is better, you test “SLR camera” vs. “DSLR camera”.) To build keywords like this would be to miss out on all the other possible keywords associated with camera shopping. As an industry, we know it is best to try out a variety of different keywords right off the bat and then evaluate which ones worked best once there is sufficient data. Using that data, we identify the little “gem” keywords, review the search query reports associated with them, and add every possible permutation that arises from the historical data.

Why should copy testing be different? What if we first tried every descriptor, copy structure, call to action, etc. without regard to the number of variables tested? Then, with historical data in hand, we could identify the top-performing copies and design one-variable control tests to understand the why behind the performance lifts. Better yet, what if we had access to a data scientist who could analyze ad elements en masse, isolating the elements that tend to perform better across the board, so that we could use that data as the basis for isolated-variable testing? The Boost Media Data Science Team has developed an innovative analysis tool that does just that, providing comprehensive directional data on creative performance that is as useful to copy-test planning as a keyword search query report is to keyword building! For example, using this report we can look at performance across all ads containing “SLR Camera” and “DSLR Camera” and find that, in general, “DSLR” has a better return, indicating that this might be a valuable control test to run.
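To make the idea concrete, here is a minimal sketch of that kind of en-masse analysis (not Boost Media’s actual tool, just an assumed workflow): given an export of ad-level performance data, group ads by whether the copy contains a given phrase and compare aggregate return. The column names (ad_text, cost, revenue) and the compare_phrases helper are hypothetical and will vary by platform export.

```python
import re

import pandas as pd


def compare_phrases(ads: pd.DataFrame, phrase_a: str, phrase_b: str) -> pd.DataFrame:
    """Aggregate spend, revenue, and ROAS for ads whose copy contains each phrase."""
    rows = []
    for phrase in (phrase_a, phrase_b):
        # Word boundaries keep "SLR Camera" from also matching "DSLR Camera" ads.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        subset = ads[ads["ad_text"].str.contains(pattern, case=False, regex=True, na=False)]
        rows.append({
            "phrase": phrase,
            "ads": len(subset),
            "cost": subset["cost"].sum(),
            "revenue": subset["revenue"].sum(),
            "roas": subset["revenue"].sum() / max(subset["cost"].sum(), 1e-9),
        })
    return pd.DataFrame(rows)


# Directional read across all live ads, e.g.:
# summary = compare_phrases(ads, "SLR Camera", "DSLR Camera")
```

The output is directional only, since it ignores differences in keyword mix, match type, and position across the ads being compared, which is exactly why the apparent winners then deserve a proper one-variable control test.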

There are several benefits to first testing dramatically different copies before testing isolated variables:
• Elements that don’t work fail more quickly. Copy that is too similar tends to perform similarly, so it can take weeks or even months of traffic to identify a loser (see the rough sample-size math after this list).
• Before investing spend and valuable impressions on a one-variable test, we already know that both variations are likely better than the average of the other creatives we could have tested.
• In the process of testing radically different copies, we may surface new top-performing keywords thanks to fresh copy-keyword combinations, winning auctions that would otherwise have been lost. This can lead to increased impression volume and to ideas for new ad groups that should be split off on their own.
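For a rough sense of why similar copy is slow to separate, here is a back-of-envelope sketch using a standard two-proportion sample-size approximation (roughly 80% power at 5% two-sided significance). The baseline CTR and lift figures are hypothetical, not client data.

```python
def impressions_needed(baseline_ctr: float, relative_lift: float) -> int:
    """Approximate impressions per variation to detect a CTR lift
    at ~80% power and 5% two-sided significance: n ~= 16 * p(1-p) / delta^2."""
    p1 = baseline_ctr
    p2 = baseline_ctr * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    delta = p2 - p1
    return round(16 * p_bar * (1 - p_bar) / delta ** 2)


# A near-identical variation (~3% relative CTR lift on a 2% baseline) vs.
# a radically different one (~30% lift):
print(impressions_needed(0.02, 0.03))  # ~884,000 impressions per variation
print(impressions_needed(0.02, 0.30))  # ~10,000 impressions per variation
```

In other words, a tiny difference between near-identical copies can take months of impressions to confirm, while a dramatically different copy either proves itself or fails fast.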

So, to revise my earlier statement: find what works first, then ask questions later. It’s nice to know that “ships free” resonates better than “free shipping”, but at the end of the day, the only reason this information is valuable is that we can use it to improve performance.