The digital learning platform (DLP) partners in SEERNet are working to provide platforms that any researcher can use to conduct experiments. In this post, the Carnegie Learning/UpGrade team reflects on insights and lessons learned from their work with researchers moving from design to implementation in a digital learning platform. If you’re not familiar with UpGrade, see this overview.
When we consider how the UpGrade platform can help learning scientists conduct next-generation education research, we often point to how the framework enables rapid, iterative field testing of learning experiences to see what works best to improve learning outcomes, and in particular, testing those experiences at scale. But before arriving at the point of deploying and scaling such field tests, the UpGrade team works with research partners to review the necessary steps between generating an initial research idea and implementing it in an EdTech product.
One of the first things we review when meeting with a research partner is the experimental design. Evaluating and refining the design can help researchers understand what types of designs are possible with UpGrade, and help us as platform developers determine how best to support researchers’ needs in the context of current platform functionality and the UpGrade product roadmap. Currently, UpGrade supports individual and group random assignment, segmentation of participants, explicit include/exclude lists, and experiments coordinated across elements in a curriculum sequence, among other features. Experimental designs such as within-subjects, factorial, and multi-armed bandit designs are under development or on the UpGrade roadmap for future releases. To the extent that the roadmap is flexible, such development efforts can at times be accelerated for research partners whose needs exceed current offerings. Alternatively, partners may choose to modify their design if the goal is to deploy experiments as quickly as possible.
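To illustrate the distinction between individual and group random assignment, here is a minimal sketch (not UpGrade’s actual implementation; the IDs and experiment names are hypothetical). A common technique is deterministic hashing, so the same unit of assignment, whether a student ID or a class ID, always receives the same condition:

```python
import hashlib

def assign_condition(unit_id: str, experiment_id: str, conditions: list) -> str:
    """Deterministically assign a unit (student or group) to a condition.

    Hashing the unit and experiment IDs together yields a stable, roughly
    uniform assignment: the same unit always gets the same condition for a
    given experiment, and assignments differ across experiments.
    """
    digest = hashlib.sha256(f"{experiment_id}:{unit_id}".encode()).hexdigest()
    return conditions[int(digest, 16) % len(conditions)]

# Individual assignment: hash the student ID.
student_condition = assign_condition("student-123", "hints-study", ["control", "treatment"])

# Group assignment: hash the class ID, so every student in the class
# shares one condition (useful when contamination across students is a concern).
class_condition = assign_condition("class-45", "hints-study", ["control", "treatment"])
```

Group-level assignment trades statistical power for cleaner separation between conditions, which is one reason reviewing the design early with the platform team matters.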
Outside of UpGrade itself, another consideration in the design process is whether the client application (i.e., the EdTech app that students work in, which connects to UpGrade) has the features needed to deliver the experiment as the researchers intend, and if not, how to proceed. For example, our partners in a recent research collaboration wanted to deliver brief surveys to users in Carnegie Learning’s MATHia software. MATHia did not have built-in survey authoring at the time, so the UpGrade/Carnegie Learning team explored two options: integrate external survey software into the user experience, displayed in a popup window while students worked in MATHia, or allocate software developer resources to build a new survey authoring tool within MATHia itself. Ultimately, we chose to build our own tool because 1) it was valuable to keep students within the familiar MATHia user experience rather than introduce a potentially confusing external resource, 2) an in-house tool would allow survey results to be integrated with MATHia’s other data capture, and, importantly, 3) our developers could feasibly develop, test, and deploy the tool in a reasonable timeframe. In either case, it was important to consult with the researchers about the full scope of their study and intended user experience before settling on a solution. Details that seem trivial can be challenging to implement, and such challenges can go unnoticed if platform developers have an incomplete understanding of their partners’ needs.
Overall, when working with researcher-partners to develop and implement their experimental designs, our intent is to understand and communicate clearly about unequivocal needs versus “nice-to-haves,” with experiment plans and mutual timelines fully sketched out. Doing this as early as possible, with frequent check-ins over the course of implementation, lets all stakeholders contribute and ensures transparency between platform developers and researchers.
Conducting field experiments at scale can also affect deployment decisions, such as when and how to deliver experimental content. When researchers approach us with an experiment idea, we need to understand how the experiment will be inserted into students’ curriculum sequence. Some educational research involves pulling students out of their normal instructional context to perform an experiment and then returning them to their usual material. When students are using an adaptive curriculum and may reach the target material asynchronously, such pull-out methods are challenging: researchers must accept that many students will not be in the appropriate experimental “window,” having either already learned the target material or not yet learned its prerequisites.
Alternatively, integrating an experiment into the students’ curriculum requires carefully timing the intervention to the period in the school year in which the student encounters the target instructional material. In traditional research approaches, adaptive curricula make this timing particularly challenging, but running experiments at scale can recast the challenge as an advantage. When a student starts and completes an experiment as a normal part of the instructional process, rather than via the pull-out method, we call this a curriculum-embedded experiment. In an adaptive curriculum, the experiment can run asynchronously as each student reaches the topic under study. At scale, when experiments span classrooms, schools, or districts, even students’ progress through traditional (non-adaptive) curricula is not dictated by a common timeline. In a widely deployed adaptive system such as MATHia, used by students across many districts and states, this asynchronous use of the curriculum can be advantageous for research partners who wish to enroll participants in UpGrade experiments. Rather than aligning experiment design and deployment with discrete weeks in a curriculum sequence, experiments can be run virtually anytime during the school year, given an adequate subject pool. Because curriculum sequences differ across states and individual students, and state standards dictate that particular topics fall earlier or later in the school year, there is no single time that is optimal for all students; deploying an experiment at any time will therefore yield data from some students. At the scale of thousands or millions of students, “some” may still represent far more data than researchers could collect under a more conventional, small-scale approach.
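The experimental “window” idea above can be sketched in a few lines. This is an illustrative simplification, not production logic from UpGrade or MATHia; the topic names are hypothetical. A student qualifies for a curriculum-embedded experiment only if they have completed the prerequisites but have not yet reached the target material:

```python
def eligible_for_experiment(completed_topics: set,
                            target_topic: str,
                            prerequisites: set) -> bool:
    """Check whether a student is in the experimental "window" for a topic.

    A student qualifies only if they have finished all prerequisites but have
    not yet worked on the target material; otherwise they are either past the
    window or not yet ready for it.
    """
    already_learned = target_topic in completed_topics
    ready = prerequisites <= completed_topics  # all prerequisites completed
    return ready and not already_learned

# In the window: prerequisites done, target not yet reached.
eligible_for_experiment({"fractions", "decimals"}, "ratios", {"fractions"})

# Past the window: the student has already learned the target material.
eligible_for_experiment({"fractions", "ratios"}, "ratios", {"fractions"})
```

Because each student is evaluated as they reach the topic, enrollment accumulates asynchronously across the school year rather than requiring a synchronized start date.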
Just as planned experiments can impact platform or application development, platform developers should communicate clearly with research partners about the timing of experimental deployment and the populations expected to be reached. If researchers must target specific schools, gathering information about those schools and their curriculum sequences is essential to align the timing of experiment implementation. In other cases, such as when a broader population is targeted, the reach of a widely deployed adaptive system makes these considerations more flexible.
As platform developers, we want researchers to have trust and confidence that we can implement and deploy experiments that lead us to our common goal of improving learning outcomes. Frequent and careful communication about the overall experimental design framework, how platform and client application features can accommodate design decisions, and the consideration of implementation timing with respect to curriculum sequences are fundamental steps to take when collaborating with research partners. Carnegie Learning and the UpGrade team look forward to continuing to support the research community and learning from these partnerships.
If you are interested in learning more: