Online experiments and inaccurate timing. Are we doomed?

Article by Ben Howell

Photo by Anton Makarenko from Pexels

We all want accurate timing for our online psychology experiments, but many hardware- and software-induced confounds reduce the accuracy of scheduled events and reaction time measurements. To our chagrin, timing in our experiments is not as accurate as we would like, so it seems we're doomed. Or are we? In almost every case, the answer is no.


Hardware constraints

Display refresh rate

A display on a computer monitor, laptop or smartphone is a series of static frames updated at regular intervals to create a dynamic moving display. The number of updates per second is called the refresh rate.

The display refresh rate for most smartphones and consumer/business-grade LCD monitors is 60 frames/second (1 frame every ~16.667ms). This means that if you schedule an item to display for 10ms, it will be displayed for a single frame, and therefore for an actual time of ~16.667ms. Keep the refresh rate in mind if you need very high precision and accuracy of rendering; setting durations to multiples of 16.667ms can help. Note, however, that some high-end LCD monitors refresh at up to 240Hz (1 frame every ~4.167ms), so the frame duration on a participant's device may differ. Display refresh rate is by far the most important consideration of those discussed here.
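The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name displayed_duration_ms is hypothetical, and the sketch assumes that a requested duration is rounded to the nearest whole number of frames, with a minimum of one frame. Real platforms may round differently.

```python
import math

def displayed_duration_ms(requested_ms: float, refresh_hz: float = 60.0) -> float:
    """Approximate on-screen duration for a requested duration.

    Assumption (not taken from any specific platform): the request is
    rounded to the nearest whole number of frames, and at least one
    frame is always shown.
    """
    frame_ms = 1000.0 / refresh_hz
    frames = max(1, math.floor(requested_ms / frame_ms + 0.5))
    return frames * frame_ms

print(round(displayed_duration_ms(10), 3))                  # 16.667 (one frame at 60Hz)
print(round(displayed_duration_ms(10, refresh_hz=240), 3))  # 8.333 (two frames at 240Hz)
```

Under these assumptions, the same 10ms request produces different on-screen durations on a 60Hz and a 240Hz display, which is why durations chosen as multiples of the frame duration behave more predictably.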

Display refresh cycle

The time it takes to render a frame, from the plotting of the first pixel at the top left to the plotting of the last pixel at the bottom right, is called the refresh cycle. Typical monitors and laptops complete a refresh cycle in 8ms to 16ms.

Pixel response time

Pixel response time is the time it takes to change the color of a pixel, typically measured as the time to change from one shade of gray to another. This measuring technique is referred to as gray-to-gray (g-t-g or gtg). Typical monitors and laptops have gtg response times of between 1ms and 5ms.

Display touch response times

The time taken for a touch screen display to register and act upon a touch event is referred to as the display touch response time. This is heavily product-dependent and can range from 50ms to hundreds of ms.

USB device polling rate

Devices using USB ports (a typical mouse or keyboard, for example) are polled for updates 125 times/second (1 update every 8ms). Response times for button clicks (made with a mouse) or key presses can therefore only be known to a precision of 8ms.
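A small Python sketch shows what 8ms polling does to event times. It is a simplified model, not a description of any real driver: the function name reported_time_ms is hypothetical, and polls are assumed to occur at exact multiples of the polling interval starting from time zero, whereas the true polling phase on a participant's machine is unknown.

```python
import math

def reported_time_ms(event_ms: float, poll_hz: float = 125.0) -> float:
    """Time at which an input event is first seen by the computer.

    Assumption: polls happen at exact multiples of the polling interval
    (0ms, 8ms, 16ms, ... at 125Hz), so an event is only noticed at the
    first poll at or after the moment it occurred.
    """
    poll_ms = 1000.0 / poll_hz
    return math.ceil(event_ms / poll_ms) * poll_ms

for true_ms in (297.0, 301.0, 304.0):
    print(true_ms, "->", reported_time_ms(true_ms))
# 297.0 -> 304.0
# 301.0 -> 304.0
# 304.0 -> 304.0
```

In this model, key presses made up to 8ms apart are indistinguishable, because they are all picked up by the same poll.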


Software constraints

Timestamp jitter

Believe it or not, web browsers provide less accurate timing now than in the past. Following a series of timing-based attacks in 2017, browser vendors decreased timing accuracy as a mitigation. In the past, browsers could offer timestamps accurate to 5µs (depending on other hardware and software constraints). Today, most browsers offer timestamp precision of between 1ms and 2ms, with some vendors adding jitter in the range of ±1ms (within the original bounds).
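To get a feel for what reduced timestamp precision means for a single reaction time, here is a minimal Python sketch. It assumes the simplest possible model, in which timestamps are rounded down to a fixed resolution; real browsers use vendor-specific clamping and jitter schemes that this does not reproduce. The function names and the example timestamps are hypothetical.

```python
import math

def coarsen_ms(t_ms: float, resolution_ms: float = 1.0) -> float:
    """Round a timestamp down to the given resolution (simplified model)."""
    return math.floor(t_ms / resolution_ms) * resolution_ms

def measured_rt_ms(onset_ms: float, response_ms: float, resolution_ms: float = 1.0) -> float:
    """Reaction time computed from two coarsened timestamps."""
    return coarsen_ms(response_ms, resolution_ms) - coarsen_ms(onset_ms, resolution_ms)

# Hypothetical stimulus onset and response timestamps, in ms.
true_rt = 512.3456 - 100.789
print(round(true_rt, 4))                        # 411.5566
print(measured_rt_ms(100.789, 512.3456, 2.0))   # 412.0
```

With a 2ms resolution the measured value differs from the true value by less than 2ms in this model, which is small next to the other sources of error discussed here.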

The common expectation is that the timing precision offered by web browsers will improve in coming years, once the timing attacks are better understood and mitigation strategies have been developed.

JavaScript

JavaScript is the programming language that runs our web applications inside the web browser. It lets us batch many separate jobs (execution items) and run them in synchronization with the repainting of the display, using a function called requestAnimationFrame.

To ensure that the actual start time of an on-screen item and the display duration of that item are as precise as possible, most computerized experiment platforms (e.g. Psychstudio) synchronize the rendering of on-screen items with calls to requestAnimationFrame. In other words, visual items (e.g. fixations, masks, images, text and other visual stimuli) that are to be presented on screen are scheduled to start on the next frame to be rendered. Many online experiment platforms synchronize the polling of response events (e.g. button and key presses) with frame rendering as well.
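requestAnimationFrame itself is a browser API, so the following Python sketch only simulates the effect of frame-synchronized scheduling. It is not how Psychstudio or any other platform is implemented. The function simulate_presentation is hypothetical, and it assumes perfectly regular frames with no dropped frames: an item appears on the first frame at or after its scheduled onset and is removed on the first frame at which its requested duration has elapsed.

```python
def simulate_presentation(onset_ms, duration_ms, refresh_hz=60.0, n_frames=60):
    """Return (shown_at_ms, hidden_at_ms) for a frame-synchronized item.

    Assumption: frames arrive at perfectly regular intervals and none
    are dropped. Each loop iteration stands in for one frame callback.
    """
    frame_ms = 1000.0 / refresh_hz
    shown_at = None
    hidden_at = None
    for i in range(n_frames):
        now = i * frame_ms  # timestamp handed to the frame callback
        if shown_at is None:
            if now >= onset_ms:
                shown_at = now  # item first drawn on this frame
        elif hidden_at is None and now - shown_at >= duration_ms:
            hidden_at = now  # item removed on this frame
    return shown_at, hidden_at

shown, hidden = simulate_presentation(onset_ms=20, duration_ms=40)
print(round(shown, 3), round(hidden, 3), round(hidden - shown, 3))
# 33.333 83.333 50.0
```

In this simulation, an item scheduled for 20ms with a 40ms duration actually appears at ~33.333ms and stays up for 50ms, because both onset and offset can only happen on frame boundaries.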


Miscellaneous

Many other factors at the participant's end can influence timing precision and accuracy, including, but not limited to, system load, the experiment application being used, the number of applications running concurrently on the participant's device and human factors (e.g. split attention and distraction).


So are we doomed?

Betteridge's law of headlines suggests we aren't, and fortunately so does science. In the vast majority of cases, the subject of investigation is not the absolute values of timing measurements but rather the variation and standard deviation of those measurements. Variability of timing precision and accuracy in software-based experiments (both online and in-lab) is negligible when compared to the variability between and within subjects. Small reaction time effects (~20ms) have been accurately measured in online experiments despite unknown variation between actual and intended timing of stimulus presentation, stimulus onset asynchrony and recording of reaction times (Crump, McDonnell & Gureckis, 2013).

So the answer is no. Even where clocks have a resolution of 30ms, the measurement resolution has a negligible effect on detecting mean reaction time differences (Ulrich & Giray, 1989).
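The intuition behind this result can be illustrated with a small simulation. This is a sketch under made-up assumptions, not a reproduction of Ulrich and Giray's analysis: reaction times are drawn from normal distributions with hypothetical means of 500ms and 520ms and a standard deviation of 100ms, and a coarse clock is modeled by rounding each reaction time down to the clock resolution.

```python
import random
import statistics

def mean_rt_difference(resolution_ms: float, n: int = 10_000, seed: int = 1) -> float:
    """Difference in mean RT between two simulated conditions.

    Assumptions (all hypothetical): normally distributed RTs with means
    of 500ms and 520ms and an SD of 100ms; the clock rounds each RT down
    to a multiple of resolution_ms. A resolution of 0 means a perfect clock.
    """
    rng = random.Random(seed)  # fixed seed: same true RTs for every resolution

    def record(true_rt: float) -> float:
        if resolution_ms <= 0:
            return true_rt
        return (true_rt // resolution_ms) * resolution_ms

    condition_a = [record(rng.gauss(500.0, 100.0)) for _ in range(n)]
    condition_b = [record(rng.gauss(520.0, 100.0)) for _ in range(n)]
    return statistics.mean(condition_b) - statistics.mean(condition_a)

print("perfect clock:", round(mean_rt_difference(0.0), 2))
print("30ms clock:   ", round(mean_rt_difference(30.0), 2))
```

Because the coarse clock shifts both conditions by roughly the same amount on average, the two printed differences should be close to each other: the rounding error largely cancels when means are compared.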


Further reading

  • Chetverikov, A., & Upravitelev, P. (2015). Online versus offline: The Web as a medium for response time data collection. Behavior Research Methods, 48(3), 1086–1099. doi: 10.3758/s13428-015-0632-x
  • Crump, M., McDonnell, J., & Gureckis, T. (2013). Evaluating Amazon's Mechanical Turk as a tool for experimental behavioral research. PLoS ONE, 8(3), e57410. doi: 10.1371/journal.pone.0057410
  • Damian, M. (2010). Does variability in human performance outweigh imprecision in response devices such as computer keyboards? Behavior Research Methods, 42(1), 205–211. doi: 10.3758/BRM.42.1.205
  • de Leeuw, J., & Motz, B. (2015). Psychophysics in a Web browser? Comparing response times collected with JavaScript and Psychophysics Toolbox in a visual search task. Behavior Research Methods, 48(1), 1–12. doi: 10.3758/s13428-015-0567-2
  • Plant, R. (2015). A reminder on millisecond timing accuracy and potential replication failure in computer-based psychology experiments: An open letter. Behavior Research Methods, 48(1), 408–411. doi: 10.3758/s13428-015-0577-0
  • Ulrich, R., & Giray, M. (1989). Time resolution of clocks: Effects on reaction time measurement—Good news for bad clocks. British Journal of Mathematical and Statistical Psychology, 42(1), 1–12. doi: 10.1111/j.2044-8317.1989.tb01111.x
  • van Steenbergen, H., & Bocanegra, B. (2015). Promises and pitfalls of Web-based experimentation in the advance of replicable psychological science: A reply to Plant (2015). Behavior Research Methods, 48(4), 1713–1717. doi: 10.3758/s13428-015-0677-x

Ben Howell
Founder, Psychstudio