Asking the Right Questions: Anna Gilbert's Push for Better AI Science
In the midst of today's AI excitement, Anna Gilbert offers a perspective shaped by years of hands-on experience. As the John C. Malone Professor of Electrical & Computer Engineering and Statistics & Data Science, Gilbert's work with machine learning (ML) and artificial intelligence (AI) stretches back well before the current boom. From her early days at AT&T Bell Laboratories to her current role in academia, Gilbert's expertise provides unique insights into the true capabilities and challenges of these rapidly evolving technologies.
Among other things, Gilbert’s research focuses on the design of efficient algorithms for signal processing and the extraction of meaning from massive data sets. It’s an area where ML and AI have proved useful for many years. “AI has been important to my work throughout my career,” she says.
From 1998 to 2004, Gilbert worked at AT&T Bell Laboratories, where research involving artificial neural networks had begun in earnest in the mid-1980s. Since then, an explosive increase in computing power has enabled rapid advances in AI, ML, neural networks, and large language models, including the predictive models behind tools like ChatGPT.
Understandably, then, when asked to weigh in on the recent hype surrounding AI, Gilbert radiates mild amusement.
“There is quite a bit of hype, yes,” Gilbert said. “As soon as you say artificial intelligence – as soon as you put the word ‘intelligence’ in there – people go, ‘Oh my god, it’s HAL, and the machines are going to take over the world!’”
Gilbert says she gets that there’s “a wow factor” when AI performs human-like feats. Still, she observes, “If people really understood what these things were doing, even at a really high level, they would not ask the questions they are asking.”
“Most people are interacting with these programs – bots, algorithms, et cetera – at the level of language,” she said. “And a lot of people are used to evaluating intelligence, or cognition, or whatever, in terms of language – and they think of intelligence in terms of language.”
But, Gilbert cautions, one should not conflate intelligence with language.
“In fact, I would argue that language stuff – mundane, everyday language – is not nearly as sophisticated as, for example, the proper, accurate, precise solution to an engineering question.” (“I know my humanities colleagues would kill me,” she adds, with a laugh.)
That’s not to disparage the importance of language or linguistic pursuits. But, from a computer science perspective, Gilbert points out that it makes sense to think of language as a certain kind of task or problem.
There’s a dictionary of words – a dataset – and a grammar, a set of rules for using those words. Given a sequence of words, then, you can assign a probability to what the next word will be.
“Mundane, everyday language is predictable. Once you realize that, you can then build a model.”
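That intuition can be captured in a toy sketch: count how often each word follows another in a small corpus, then use those counts as next-word probabilities. The corpus and function names below are invented purely for illustration – modern systems use neural networks trained on vastly larger vocabularies – but the underlying move, assigning a probability to the next word, is the same.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each preceding word (a simple bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probabilities(word):
    """Estimate P(next word | current word) from the counts."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("the"))  # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(next_word_probabilities("sat"))  # {'on': 1.0}
```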
However, Gilbert stresses that this predictability doesn't extend to all forms of language. Creative, beautiful, and exciting expressions in poetry, literature, and philosophy often defy simple prediction, highlighting the complexity of human communication and the current limitations of AI in understanding or generating profound, original content.
Predictable, rules-based tasks have long been fertile ground for automation. Tasks like predictive text, translation, generating human-like replies in a chat, or even impersonating a particular person’s style of writing or speech – while they might still seem like magic to the average consumer – were among the first party tricks AI learned.
AI and ML, however, are moving past such simple rules-based applications. In her research and in collaborations across disciplines, Gilbert is interested in the ways that, properly interrogated and implemented, AI might transform and accelerate scientific inquiry.
One of Gilbert’s recent projects, published in Acta Materialia, illustrates the nuances of her approach. Working with Jan Schroers, the Robert Higgin Professor of Mechanical Engineering and Materials Science, Gilbert looked at the complex materials science problem presented by metallic glasses.
Simply put, some metals, when melted and then rapidly cooled, are non-crystalline in their solid state, instead having a glass-like structure. Known as vitrified metals or metallic glasses, these materials have various industrial, medical, and scientific applications.
It would be useful to predict which metals or alloys will display this property, but that has proven difficult. Researchers are still trying to understand the rules that determine whether a metal has the potential to become metallic glass.
ML models have sped up this type of materials science discovery in related fields, saving time and money. Gilbert and her collaborators set out to evaluate an ML model developed to predict metallic glass formation, comparing it with an older model, and found that human learning outperformed both. The researchers also found, however, that the ML model’s predictions improved when physical insights (such as the ratio of the smallest to the largest element in a given alloy) were introduced to the model. They plan to build on the findings by training a new ML model with these added physical insights.
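The published study is the source of record for the actual model and features; the sketch below is only a generic, assumed illustration of the idea it describes – appending a physically motivated feature, here a hypothetical smallest-to-largest element size ratio, to raw composition features before training a classifier. The data are synthetic, so the scores themselves mean nothing; the point is the mechanics of adding such a feature.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in data: composition fractions for three elements per alloy,
# the size of each element, and a made-up glass-forming label.
n_alloys = 200
fractions = rng.dirichlet(np.ones(3), size=n_alloys)   # composition features
sizes = rng.uniform(1.2, 1.8, size=(n_alloys, 3))      # element sizes (arbitrary units)
labels = rng.integers(0, 2, size=n_alloys)             # 1 = forms a metallic glass (synthetic)

# Physically motivated feature: ratio of the smallest to the largest element size.
size_ratio = sizes.min(axis=1) / sizes.max(axis=1)

X_plain = fractions
X_physics = np.column_stack([fractions, size_ratio])

model = RandomForestClassifier(n_estimators=200, random_state=0)
print("composition only:", cross_val_score(model, X_plain, labels, cv=5).mean())
print("with size ratio: ", cross_val_score(model, X_physics, labels, cv=5).mean())
```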
Investigating how to improve an ML model to speed up materials discovery is one avenue of research. Another is bringing a critical eye to the theoretical underpinnings of algorithms already in wide use.
Large data sets have long been a focus for Gilbert. And, she observes, “real-world data is often messy.” Extracting meaning from data is a famously tricky pursuit, one that humans have increasingly relied on AI and ML to tackle.
Gilbert has noted that new algorithms are proposed at such a rapid pace that they often come into common use before the theory that underlies them is sufficiently understood. When implementation outruns theory, distortions and inaccuracies can result. Another recent project, posted to arXiv with the beguiling title “May the force be with you,” looked at ways to address that problem.
Gilbert and her collaborators reexamined a group of algorithms often used for dimensionality reduction. Her team included Stefan Steinerberger, an assistant professor in Yale's mathematics department at the time of the initial work, and Yulan Zhang, then a Yale undergraduate double majoring in mathematics and computer science. Zhang's contributions to the project earned her an undergraduate prize for the research.
High-dimensional datasets require more storage space, more computing power, and more time to process, hence the need to compress them. Dimensionality reduction creates a lower-dimensional representation of such a dataset, reducing noise – like irrelevant or redundant data points – ideally without distorting the meaning of the data.
Using real and simulated datasets, Gilbert and her collaborators looked at t-SNE, a common method for dimensionality reduction. They proposed that the vector field associated with t-SNE (and a family of similar methods, known as non-linear force-based methods) gives additional high-quality information that can be used to refine results.
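For readers unfamiliar with the method, here is a minimal, assumed example of an ordinary t-SNE run using scikit-learn on a standard 64-dimensional test dataset. The force-based refinement the paper proposes is not part of scikit-learn's API and is not shown; this only illustrates the kind of embedding the researchers examined.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 64-dimensional handwritten-digit images, a common test bed for dimensionality reduction.
X, _ = load_digits(return_X_y=True)

# Embed into 2 dimensions. t-SNE iteratively moves points under attractive and
# repulsive "forces" derived from pairwise similarities; the vector field the
# paper studies arises from those forces.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(embedding.shape)  # (1797, 2)
```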
Since the vector field is always automatically computed, their research points to a meaningful refinement in a method already in common use for a range of applications. It’s the kind of tweak Gilbert is known for, the kind that results in better theory as well as better implementation.
“We need to always be asking: are we doing good science?” Gilbert said. “Can we design better algorithms? Where do they not work, and why? Are we testing them correctly? Are we asking the right questions?”