How do we ensure our Screenshot service works for each browser, as well as special use cases, as we continuously integrate updates to VMs, devices, and infrastructure?
CrossBrowserTesting's Screenshot service has 1500+ browser combinations across desktops and mobile devices. Not to mention the permutations: Single Pages, Long Pages, Basic Auth, Login Profiles, Selenium Scripts, and Local Connections.
We need automated tests to check our services internally and reports to find errors.
Use our Screenshots API
We aren't testing the UI, so we don't need to automate interactions via Selenium. We can simply use our Screenshots API to fire off tests to each browser. And, since each test is marked successful only when it finishes generating both the windowed and full-page images, we can use this data for our pass/fail reporting.
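As a rough illustration, kicking off a run for one configuration is a single authenticated API call. The endpoint path, parameter names (url, browsers), and the test page URL below are assumptions made for the sake of the sketch; the Screenshots API documentation is the authoritative reference.

# Hypothetical sketch: start a screenshot test for one browser configuration.
# CBT_USER / CBT_AUTHKEY are the account credentials; BROWSER_API_NAME is the
# api name of the OS/browser combination under test.
curl -s -u "$CBT_USER:$CBT_AUTHKEY" \
  -X POST "https://crossbrowsertesting.com/api/v3/screenshots/" \
  --data-urlencode "url=https://our-test-page.example.com" \
  --data-urlencode "browsers=$BROWSER_API_NAME"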
Create a Special Account
We set up a special account with a generous allowance of automated minutes and parallel tests so it isn't constrained by our normal plan limitations. This also lets us focus reporting on this particular user, since it only runs these automated screenshot tests.
Implement a persistent queue
It will take time to process all 1500+ browsers, and we need to ensure we test each one. We need full coverage of browsers even if part of the system goes down or the process dies while running.
A queue is great for knowing exactly which browser should be tested next and which browsers remain to be tested. However, an in-memory queue is volatile: restarting the process would mean starting over and retesting browsers already covered. We need it to pick up where it left off in the event of a system failure or when we purposely stop it for system maintenance.
There are many ways we could create a queue, but the simplest by far is a plain old text file. In our script, we call out to our API for a list of all browsers, loop over the results, and write the API name for each OS/browser combination we want to test on its own line in the file. (We chose the "API names" as the identifier because our Screenshots API consumes them in exactly this format, which keeps things simple: no data transformation needed at the time the test runs.)
Our script generates a comprehensive list of these api names, one OS/browser combination per line.
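As a rough sketch of the queue-building step (the browsers endpoint, the flat response shape, the api_name field, and the queue file path here are assumptions, not the exact implementation), it boils down to: fetch the list, flatten it to one identifier per line, and write the file.

# Hypothetical sketch: rebuild the persistent queue file from the browser list API.
# Assumes a flat response with one api_name per OS/browser combination; the real
# response may need an extra flattening step.
QUEUE_FILE=/opt/cbt/QA-ScreenShotTest/browser_queue.txt

curl -s -u "$CBT_USER:$CBT_AUTHKEY" \
  "https://crossbrowsertesting.com/api/v3/screenshots/browsers" \
  | jq -r '.[].api_name' > "$QUEUE_FILE"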
Moderate test runs over time
We didn't want to inundate the system or hog resources from customers, so we had to moderate how often we run screenshot tests. Having the script run a single browser from the queue and then quit let us simply schedule a task to fire it off at whatever rate we prefer.
The obvious answer for scheduling tasks is to create a crontab, but how often should this run?
A screenshot test cannot run longer than 4 minutes due to system limits. At worst, a configuration may sit queued for a couple of minutes before a browser becomes available, so kicking off a new screenshot test every 6 minutes made sense. At any given time we have only one browser running its 6 test cases: Single Page, Long Page, Basic Auth, Login Profile, Selenium Script, and Local Connection.
*/6 * * * * /opt/cbt/QA-ScreenShotTest/runScreenshotTestCases.sh
Putting It All Together
- A cronjob runs our script every 6 minutes.
- The script checks the persistent queue file: if it has no entries or doesn't exist, it fetches the list of all browsers and builds the queue file anew.
- The script pulls the first browser from the queue and writes the file back with that browser removed.
- The script launches several test cases against that browser and then exits (sketched below).
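Pulling those steps together, the per-run logic of runScreenshotTestCases.sh looks roughly like this sketch. The queue file path is illustrative, and buildQueueFile and runTestCase are hypothetical helpers standing in for the API calls shown earlier.

#!/bin/bash
# Hypothetical sketch of the cron-driven runner.
QUEUE_FILE=/opt/cbt/QA-ScreenShotTest/browser_queue.txt

# Rebuild the queue if the file is missing or empty.
if [ ! -s "$QUEUE_FILE" ]; then
  buildQueueFile
fi

# Dequeue the first browser and persist the rest of the queue.
BROWSER=$(head -n 1 "$QUEUE_FILE")
tail -n +2 "$QUEUE_FILE" > "$QUEUE_FILE.tmp" && mv "$QUEUE_FILE.tmp" "$QUEUE_FILE"

# Run the six test cases against this browser, then exit.
for TEST_CASE in single_page long_page basic_auth login_profile selenium_script local_connection; do
  runTestCase "$BROWSER" "$TEST_CASE"
done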
This approach worked well because:
- If the script dies while processing, only one browser is skipped.
- If part of the system gets shut down, the script can pick up where it left off.
- We can control the frequency for how often a browser should run.
After verifying the script was working as intended, all we had to do was set the process in motion by turning on the cronjob.
We have been using Grafana to give us insight into data for other systems, so it made sense to reuse it here. We only needed to hook up our queries to surface results in Grafana's GUI, tweak the data we focus on, and voilà: now we can see what is running well versus what needs attention.
It's hard to get a system working perfectly on the first run, so we do continually tweak the report results to get more granular information, but this has already enabled us to find issues and patterns of failures so they can be addressed.