It’s not MMA 👊, it’s MmLA! 🤯


While the acronym for Mixed Martial Arts (MMA) has become something of a household name, Multimodal Learning Analytics (MmLA) isn't quite there yet. And it probably won't be anytime soon. But the use of MmLA may become a classroom standard before long, especially with the growing adoption of Artificial Intelligence (AI) in Learning Technology (LT).

Fictional cyborg MMA instructor with the class working out in the background.
Cyborg MMA Instructor concept
Created with DALL-E via Bing at my request

We live in a multimodal world

Everything we do is “multimodal” by definition. We humans have 5 basic senses we’ve been using since birth – sight, hearing, smell, touch, and taste. This combination of senses is how we make sense of the world around us. When we combine the information that is coming from these senses with past learned experiences and behaviors, we can start to intelligently interact with our environment and adapt to the perpetual changes we meet. For example, seeing and smelling a pepper can make us recall the hot taste that may have burned our tongue last time we had spicy food, and make us want to try it again. While some may infer that isn’t a very “intelligent” thing to do (burn your mouth with hot peppers), this combining of real-time information with information stored and now retrieved and applied is the very definition of intelligence (Merriam-Webster, n.d.).

If we are to truly have artificial general intelligence (AGI) (Kanade, 2022), then we must have an artificial means of compiling, in real time, the multimodal data the AI entity will have to deal with. In that sense, multimodal intelligence (MMI) is foundational to AI. Extrapolating to the use of AI in the learning environment, MmLA will lead the way in implementing interactive educational AI. Just as you or I learn and adapt by combining multimodal sensory data with previously learned experiences, the LT will need MmLA to essentially be its mind, so it can intelligently learn, adapt, respond, and interact with the learner it is trying to help.

Feedback, yes. But when?

Providing feedback in a timely manner is crucial in an instructional process. Too little feedback, or feedback at the wrong time, and that feedback is wasted. For example, "study more" is technically feedback, but it is woefully insufficient and not specific enough to be of any real help. Likewise, a detailed list of what to study and where to find the study materials would be fantastic, so long as it is provided in time to be used before the exam takes place.

Ochoa (2022) describes a real-time feedback system for a public speaking student as a representation of the type of system where MmLA could be used to create an automated tutor. This example of providing feedback, perhaps on a heads-up display (HUD) or an on-screen prompt on the presenter's display, shows why an MmLA system must be agile and work in real time. If this coaching experience is to have maximum impact on the student presenter's growth, it should happen while the presentation is in progress, so the student does not have to reconstruct the scenario afterward from a mental picture that will fade within minutes of the presentation's end.
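To make that concrete, here is a minimal sketch of what such a real-time loop might look like. This is my own illustration, not Ochoa's actual system: the feature names (speech rate, gaze fraction, filler-word count) and the thresholds are all hypothetical stand-ins for whatever the sensing pipeline would actually produce.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """One time-slice of fused multimodal features (all measures hypothetical)."""
    speech_rate_wpm: float   # words per minute, from the audio channel
    gaze_on_audience: float  # 0..1 fraction of time, from the video channel
    filler_count: int        # "um"/"uh" detections in this window

def feedback_for(frame: Frame) -> list[str]:
    """Map the fused features to short HUD prompts the presenter can act on now."""
    prompts = []
    if frame.speech_rate_wpm > 180:
        prompts.append("Slow down")
    if frame.gaze_on_audience < 0.5:
        prompts.append("Look up at the audience")
    if frame.filler_count >= 3:
        prompts.append("Pause instead of 'um'")
    return prompts

# e.g. a rushed, notes-bound moment in the talk:
# feedback_for(Frame(200, 0.3, 4))
# -> ["Slow down", "Look up at the audience", "Pause instead of 'um'"]
```

The point of the sketch is the latency budget: each `Frame` must be analyzed and the prompt displayed within the same few seconds it describes, or the coaching moment is gone.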

Challenges abound

While the field of MmLA is not new, it is not mature yet either, as many of the technologies it relies upon are still evolving. Only since the "overnight" success of OpenAI's ChatGPT (Hu, 2023) has the public been so interested in AI as a real thing and not just a plot device for science fiction writers. With these other elements of AI making leaps and bounds in capability and availability, MmLA is on the cusp of benefiting from that growth, provided the MmLA user community plays its cards right. Ochoa mentions several methodological proposals to standardize the terminology of the research so that teams working on the next big thing in MmLA can pull together in the same direction, rather than continue to feed the eddy currents of unstructured research.

The Practicum

The practical challenge is bringing together the multiple modes of data input and combining them into a coherent decision-making process that can synthesize output relevant to the topic at hand, all without creating a privacy problem.

With the world moving more online and digitized every day, there are rising concerns about personal sovereignty over the things that make us unique (Su, 2022). When an MmLA-based AI tutor interacts with a student (especially a minor) and starts to store those experiences in order to learn, track, and support that student, the privacy of that student's data, and the level of confidentiality expected between the AI and the student, become increasingly important. It will be better to address these concerns from the beginning than to spend a lot of time developing something that is handicapped by policy changes later.
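One "from the beginning" design choice is to never store the student's real identity alongside the interaction record. The sketch below shows a common pattern for that, keyed pseudonymization, so analytics can still link a student's sessions over time without exposing who they are. The key name and record fields are my own placeholders, not part of any MmLA standard.

```python
import hmac
import hashlib

# Placeholder secret; in practice this lives in a key vault, not in source code.
SECRET_KEY = b"rotate-me-regularly"

def pseudonymize(student_id: str) -> str:
    """Replace a real student ID with a keyed hash before any MmLA record
    is stored, so sessions stay linkable without revealing identity."""
    digest = hmac.new(SECRET_KEY, student_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]

def log_interaction(student_id: str, event: dict) -> dict:
    """Build the stored record: only the pseudonym plus the event data.
    The raw ID never leaves memory."""
    return {"student": pseudonymize(student_id), **event}
```

Because the hash is keyed, the same student always maps to the same pseudonym, but nobody without the key can reverse the mapping, which is exactly the confidentiality relationship the tutor owes the student.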

What Surprised Me

What surprised me here is that the notion of multimodal AI has been the silent elephant in the room during my entire learning experience with AI up to this point. It hadn't dawned on me that the narrow AI I have been accustomed to working with has focused on a single-input, single-output (SISO) format when (now that it has been pointed out) everything we do is multiple-input, multiple-output (MIMO). If we are to truly have "droids" or robots walking and working among us, they will have to be MIMO creatures as well. So this notion of MmLA should not be squirreled away in a back room somewhere; it should lead the charge in Human-Machine Interactions (HMI).
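The SISO/MIMO contrast can be sketched in a few lines. Both functions below are toy illustrations of my own; the signal names (`quiz`, `gaze`, `voice_stress`) and thresholds are hypothetical, chosen only to show the shape of the interfaces.

```python
def siso_tutor(quiz_score: float) -> str:
    """Narrow AI: one signal in, one verdict out."""
    return "review the material" if quiz_score < 0.7 else "ready to advance"

def mimo_tutor(signals: dict[str, float]) -> dict[str, str]:
    """Multimodal tutor: several channels in, several tailored responses out.
    Each output key addresses a different dimension of the learner's state."""
    out = {}
    if signals.get("quiz", 1.0) < 0.7:
        out["content"] = "review the material"
    if signals.get("gaze", 1.0) < 0.5:
        out["engagement"] = "suggest a short break"
    if signals.get("voice_stress", 0.0) > 0.8:
        out["affect"] = "slow the pacing, offer encouragement"
    return out
```

The structural difference is the whole point: the SISO tutor can only ever say one kind of thing about one kind of evidence, while the MIMO tutor's input and output both grow with each new modality the system learns to sense.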

Eureka! 💡

Mentioned in the text (Ochoa, 2022) is Microsoft’s Platform for Situated Intelligence (https://www.microsoft.com/en-us/ai/ai-lab-platform-for-situated-intelligence). This open-source project is intended to create a standard framework from which researchers and developers can have a shared platform to develop interactive AI – the sort of AI that MmLA can get behind. This looks like a potential platform “front end” or integrated development environment (IDE) for AI and calls for a much closer and deeper look as soon as possible.

I wonder … Platform for Situated Intelligence … PSI… Ψ… you know that Greek letter sort of looks like a multimodal input (top) feeding a MmLA repository (bottom). Doesn’t it? Hmm… 🤔


References

Hu, K. (2023, February 2). ChatGPT sets record for fastest-growing user base – analyst note. Retrieved from Reuters: https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/

Kanade, V. (2022, February 14). What Is General Artificial Intelligence (AI)? Definition, Challenges, and Trends. Retrieved from Spiceworks: https://www.spiceworks.com/tech/artificial-intelligence/articles/what-is-general-ai/

Merriam-Webster. (n.d.). Intelligence. Retrieved November 12, 2023, from Merriam-Webster.com Dictionary: https://www.merriam-webster.com/dictionary/intelligence

Ochoa, X. (2022). Multimodal Learning Analytics: Rationale, Process, Examples, and Direction. In C. Lang, A. F. Wise, A. Merceron, D. Gašević, & G. Siemens (Eds.), Handbook of Learning Analytics (2nd ed., pp. 54-65). Society for Learning Analytics Research (SoLAR). https://doi.org/10.18608/hla22.006

Su, C. (2022, September 9). The Rising Concern of Data Privacy Around the World. Retrieved from BDO: https://www.bdo.com.sg/en-gb/blogs/bdo-cyberdigest/september-2022/the-rising-concern-of-data-privacy-around-the-world
