Wednesday, January 5, 2022

Suspect Shenanigans When You Hear Claims of "Mind Reading" Technology

The New Yorker recently published an extremely misleading article titled "The Science of Mind Reading," with the subtitle "Researchers are pursuing age-old questions about the nature of thoughts—and learning how to read them." The article (not written by a neuroscience scholar) provides no actual evidence that anyone is making progress in reading thoughts from a brain.

The article starts out with a dramatic-sounding but extremely dubious narrative. We hear of experts trying to achieve communication with Patient 23, who was assumed to be in a "vegetative state" after a severe injury five years earlier. We read about the experts asking questions while scanning the patient's brain, looking for brain signals that could be interpreted as a "yes" answer or a "no" answer. We are told: "They would pose a question and tell him that he could signal 'yes' by imagining playing tennis, or 'no' by thinking about walking around his house."

We get this narrative, in which the interpretive claims (that the patient was thinking about tennis or imagining walking around his house) are unwarranted and probably untrue:

"Then he asked the first question: 'Is your father’s name Alexander''

The man’s premotor cortex lit up. He was thinking about tennis—yes.

'Is your father’s name Thomas?'

Activity in the parahippocampal gyrus. He was imagining walking around his house—no.

'Do you have any brothers?'

Tennis—yes.

'Do you have any sisters?'

House—no."

The claim that particular regions of the brain "light up" under brain scanning, constantly foisted upon us by scientists and science writers, is untrue. Such claims are visually reinforced by extremely deceptive images in which tiny differences of less than 1 percent are shown in bright red, causing people to mistake very slight differences for major ones. The truth is that all brain regions are active all the time. When a brain is scanned, only tiny signal differences show up, typically no greater than about half of one percent, which is about 1 part in 200. Any brain scan will show dozens of little areas with very slightly greater activity, and there is no reason to think such variations are anything more than slight chance fluctuations. Similarly, if you were to analyze the blood flow in someone's foot, you would find random small variations in blood flow between different regions, with differences of about 1 part in 200.

Because of such random variations, there would never be any warrant for claiming that a person was thinking about a particular thing based on small fluctuations in brain activity. At any moment there might for random reasons be 100 different little areas in the brain that had 1 part in 200 greater activity, and 100 other different little areas in the brain that might have 1 part in 200 less activity.  In this case no evidence has been provided of any ability to read thoughts of a person supposed to be in a vegetative state. We cannot reliably distinguish any signal from the noise. 
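The statistical point is easy to demonstrate. Here is a minimal simulation of my own (the region count and noise level are illustrative assumptions, not figures from any study): give every small brain area a baseline activity of 1.0 plus random noise of about half of one percent, and count how many areas look "elevated" on a single scan purely by chance.

```python
import numpy as np

rng = np.random.default_rng(42)

n_regions = 1000   # assumed number of small brain areas being compared
noise_sd = 0.005   # assumed random fluctuation of ~0.5% (1 part in 200)

# Simulate one "scan": every region is at baseline 1.0 plus random noise.
scan = 1.0 + rng.normal(0.0, noise_sd, n_regions)

# Count regions that happen to sit at least 0.5% above baseline.
spurious_hits = int(np.sum(scan >= 1.005))
print(f"Regions looking 'elevated' purely by chance: {spurious_hits}")
# Roughly 160 of the 1000 noise-only regions exceed the threshold,
# even though no region is genuinely more active than any other.
```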

The New Yorker article describing the case above refers us to a Los Angeles Times article entitled "Brains of Vegetative Patients Show Signs of Life." The article gives us no good evidence that thoughts were read from this patient 23. The article merely mentions that 54 patients in a vegetative state had their brains scanned, and that one of them (patient 23) seemed "several times" to answer "yes" or "no" correctly, based on examining fluctuations of brain activity.  Given random variations in brain activity, you would expect to get such a result by chance if you scanned 54 patients who were completely unconscious. So no evidence of either consciousness or thought reading has been provided.  

A look at the corresponding scientific paper shows that the fluctuations in brain activity were no more than about half of one percent. No paper like this should be taken seriously unless the authors followed a rigorous blinding protocol, under which anyone looking for signs of a "yes" or "no" answer would not know which answer was correct; but the paper makes no mention of any blinding protocol being followed. The paper provides no actual evidence of thought reading by brain scanning, or even of detection of consciousness. We merely have tiny 1-part-in-200 signal variations of a type we would expect to get by chance from scanning one or more of 54 patients who are all unconscious.

The paper tells us that six questions were asked, and the authors seemed impressed that one of the 54 patients appeared to answer all six questions correctly (by means of brain fluctuations that the authors subjectively interpreted). The probability of getting six correct answers to yes-or-no questions by a chance method such as coin-flipping is 1 in 2 to the 6th power, or 1 in 64. So it is not unlikely at all that one such result would turn up in testing 54 patients purely by chance, even if all of the patients were unconscious and none of them understood the instructions they were given.
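The arithmetic can be checked in a few lines (a quick calculation of my own, using the numbers just given):

```python
# Chance that one unconscious patient "answers" all six yes-or-no
# questions correctly by a coin-flip process: (1/2)^6 = 1/64.
p_one = (1 / 2) ** 6

# Chance that at least one of 54 such patients does so.
p_at_least_one = 1 - (1 - p_one) ** 54

print(f"single patient: {p_one:.4f}")             # ~0.0156
print(f"at least 1 of 54: {p_at_least_one:.3f}")  # ~0.57
```

In other words, scanning 54 completely unconscious patients would produce at least one apparent six-for-six "communicator" more often than not.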

The New Yorker article then introduces Princeton scientist Ken Norman, incorrectly describing him as "an expert on thought decoding." Since no progress has been made on decoding thoughts by studying brains, no one should be described as an expert on such a thing. The article then gives us a very misleading passage suggesting that scientists are making some progress in understanding how a brain could produce or represent thoughts:

"Now, Norman explained, researchers had developed a mathematical way of understanding thoughts. Drawing on insights from machine learning, they conceived of thoughts as collections of points in a dense 'meaning space.' They could see how these points were interrelated and encoded by neurons." 

To the contrary, no neuroscientist has the slightest idea of how thoughts could be encoded by neurons, nor have neuroscientists discovered any evidence that any neurons encode thoughts. It is nonsensical to claim that thoughts can be understood as points in a geometric "meaning space." A point in such a space is just a list of numerical coordinates, but thoughts can be vastly more complicated. If I have the thought that I would love to be lounging on a beach at sunset while sipping lemonade, there is no way to express that thought as a set of coordinates.

We then read about an experiment:

"Norman invited me to watch an experiment in thought decoding. A postdoctoral student named Manoj Kumar led us into a locked basement lab at P.N.I., where a young woman was lying in the tube of an fMRI scanner. A screen mounted a few inches above her face played a slide show of stock images: an empty beach, a cave, a forest. 'We want to get the brain patterns that are associated with different subclasses of scenes,' Norman said." 

But then the article goes into a long historical digression, and we never learn the result of this experiment. Norman is often mentioned, but we hear of no convincing work he has done on this topic. Although described as "thought decoding," the attempt described above is merely an attempt to pick up signs of visual perception in the brain. Seeing something is not thinking about it. Most of the alleged examples of high-tech "mind reading" are merely claimed examples of picking up traces of vision by looking at brains, examples that are not properly called "mind reading" (a term that implies reading someone's thoughts).

We get a long discussion that often mentions Ken Norman but fails to present any good evidence of high-tech mind reading. We read this claim about brain imaging: "The scripts and the scenes were real—it was possible to detect them with a machine." But the writer presents no evidence to back up such a claim.

Norman is a champion of a very dubious analytical technique called multi-voxel pattern analysis (MVPA), and seems to think such a technique may help read thoughts from the brain. A paper points out problems with the technique:

"MVPA does not provide a reliable guide to what information is being used by the brain during cognitive tasks, nor where that information is. This is due in part to inherent run to run variability in the decision space generated by the classifier, but there are also several other issues, discussed here, that make inference from the characteristics of the learned models to relevant brain activity deeply problematic." 

In a paper, Norman claims, "This multi-voxel pattern analysis (MVPA) approach has led to several impressive feats of mind reading." Looking up two of the papers cited in support of this claim, I see that only four subjects were used in each study. Looking up another of the cited studies, I find that only five subjects were used. This means none of these studies provided robust evidence (15 subjects per study group being the minimum for a moderately reliable result). This goes on massively in neuroscience papers: authors claim that other papers showed something those papers did not actually show, because the cited studies suffered from poor methodology (usually including far-too-small sample sizes).
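A quick power calculation shows why such tiny samples are inadequate (my own sketch using statsmodels; the assumed effect size of 0.8 is a generous illustration, not a figure from the cited papers):

```python
from statsmodels.stats.power import TTestIndPower

# Assume a large true effect (Cohen's d = 0.8), which is generous.
analysis = TTestIndPower()
for n in (4, 5, 15):
    power = analysis.power(effect_size=0.8, nobs1=n, alpha=0.05)
    print(f"n = {n:2d} per group -> power = {power:.2f}")

# With 4 or 5 subjects per group, power is below 0.2: a real effect
# would usually be missed, and any "positive" finding is likely a
# false alarm or a wildly inflated fluke. Even n = 15 gives only
# about 0.5-0.6 power for this generous effect size.
```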

The New Yorker article then discusses a neuroscientist named Jack Gallant, stating the following: "Jack Gallant, a professor at Berkeley who has used thought decoding to reconstruct video montages from brain scans—as you watch a video in the scanner, the system pulls up frames from similar YouTube clips, based only on your voxel patterns—suggested that one group of people interested in decoding were Silicon Valley investors." Gallant has produced a YouTube clip entitled "Movie Reconstruction from Human Brain Activity."

On the left side of the video we see some visual images. On the right side of the video we see some blurry images entitled "Clip reconstructed from brain activity."  We are left with the impression that scientists have somehow been able to get "movies in the mind" by scanning brains. 

However, such an impression is very misleading, and what is going on smells like smoke and mirrors shenanigans.  The text below the video explains the funky technique used.  The videos entitled "clip reconstructed from brain activity" were produced through some extremely elaborate algorithm that mainly used inputs other than brain activity. Here is the description of the technique used:

"[1] Record brain activity while the subject watches several hours of movie trailers. [2] Build dictionaries (i.e., regression models) that translate between the shapes, edges and motion in the movies and measured brain activity. A separate dictionary is constructed for each of several thousand points at which brain activity was measured....[3] Record brain activity to a new set of movie trailers that will be used to test the quality of the dictionaries and reconstructions. [4] Build a random library of ~18,000,000 seconds (5000 hours) of video downloaded at random from YouTube. (Note these videos have no overlap with the movies that subjects saw in the magnet). Put each of these clips through the dictionaries to generate predictions of brain activity. Select the 100 clips whose predicted activity is most similar to the observed brain activity. Average these clips together. This is the reconstruction."

This bizarre and complicated rigmarole is an elaborate scheme in which brain activity is only one of the inputs, and the main input is a huge volume of footage from YouTube videos. It is very misleading to label the videos "clip reconstructed from brain activity," since the clips are mainly constructed from data other than brain activity. No actual evidence has been produced that anyone detected anything like "movies in the brain." It seems like mere smoke and mirrors, under which output from a variety of sources (produced by a ridiculously complicated process) is passed off as something like "movies in the brain."
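To make the logic of that description concrete, here is a toy sketch of the selection-and-averaging step (my own drastic simplification, with invented array sizes and random numbers standing in for all of the real data):

```python
import numpy as np

rng = np.random.default_rng(1)

n_library = 18_000   # stand-in for the ~18,000,000-second YouTube library
n_voxels = 500       # stand-in for the measured brain-activity points

# Step 2's "dictionaries" predict brain activity for every library clip
# (random numbers here, standing in for the regression predictions).
predicted = rng.normal(size=(n_library, n_voxels))

# Brain activity observed while the subject watched the test clip.
observed = rng.normal(size=n_voxels)

# Tiny 8x8 grayscale frames standing in for each library clip's pixels.
clips = rng.random((n_library, 8, 8))

# Step 4: pick the 100 library clips whose predicted activity best
# matches the observed activity, then average their pixels together.
similarity = predicted @ observed
top_100 = np.argsort(similarity)[-100:]
reconstruction = clips[top_100].mean(axis=0)

# Note where the output comes from: every pixel in "reconstruction"
# is drawn from the YouTube library. The brain data only ranks which
# library clips get averaged.
```

Even in this toy version, the "reconstruction" is an average of library footage; the brain measurements contribute nothing to the output except a ranking.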

Similarly dubious and convoluted methods seem to be at work in two other papers co-authored by Gallant.

In both of these papers we have a byzantine methodology in which bizarre visual montages or artificial video clips are constructed. For example, the second paper resorts to "an averaged high posterior (AHP) reconstruction by averaging the 100 clips in the sampled natural movie prior that had the highest posterior probability." The claim made by the New Yorker, that Gallant has "used thought decoding to reconstruct video montages from brain scans," is incorrect. Instead, Gallant is constructing visual montages using an extremely elaborate and hard-to-justify methodology (the opposite of straightforward), in which brain scans are merely one of many inputs. This is no evidence of technology reading thoughts or imagery from brains. In both papers, only three subjects were used, while 15 subjects per study group is the minimum for a moderately compelling experimental result. And since neither paper used a blinding protocol, the papers fail to provide robust evidence of anything.

The rest of the New Yorker article is mainly along the lines of "well, if we've made this much progress, what wonderful things may be on the horizon?" But no robust evidence has been provided that any progress has been made in reading thoughts or mental imagery from brains. The author has spent quite a while interviewing and walking around with scientist Ken Norman, and has accepted all the claims Norman made hook, line, and sinker, without asking any tough questions, and without critically analyzing the lack of evidence behind his more doubtful claims or the dubious character of the methodologies involved. The article is written by a freelance writer who has written on a very wide variety of topics, and who shows no signs of being a scholar of neuroscience, the brain, or philosophy of mind.

There are no strong neural correlates of either thinking or recall. As discussed here, brain scan studies looking for neural correlates of thinking or recall find only very small differences in brain activity, typically smaller than 1 part in 200. Such differences are what we would expect to see from chance variations, even if a brain does not produce thinking and does not produce recall.  The chart below illustrates the point. 

[Chart: neural correlates of thinking]

What typically goes on in a study claiming to find some neural correlate of thinking or recall is professor pareidolia. Pareidolia occurs when someone hoping to find a pattern reports a pattern that isn't really there, like someone eagerly scanning his toast each day for years until he finally reports finding something that looks to him like the face of Jesus. A professor examining brain scans and eagerly hoping to find some neural signature or correlate of thinking or recall may be as prone to pareidolia as a person scanning the clouds each day hoping to find a shape that looks like an angel.

There are ways for scientists to minimize the chance that they are reporting patterns because of pareidolia. One is the application of a rigorous blinding protocol throughout an experiment. Another is the use of adequate sample sizes, such as 15 or 30 subjects per study group. Most neuroscience experiments fail to follow such standards. The shockingly bad tendencies of many experimental biologists were recently revealed by a replication project that found a pitifully low replication rate and other severe problems in a group of biology experiments chosen for replication.

Postscript: The latest example of needless risk to subjects for the sake of unfounded "mind reading by brain scanning" claims is a study with a preprint entitled "Semantic reconstruction of continuous language from non-invasive brain recordings." The study failed to show good evidence for anything important, as it used way-too-small study group sizes of only three subjects and seven subjects (15 subjects per study group is the minimum for a moderately impressive result). Following Questionable Research Practices, the scientists report no sample size calculation, no blinding protocol, no pre-registration, no control group, and no effect size. The only "statistical significance" reported smells like p-hacking: results at the bare minimum for publication (merely p < .05). For these basically worthless results, seven subjects endured something like 16 hours of brain scanning in a 3T scanner, more than 30 times longer than they would have undergone for a diagnostic MRI. Senselessly, this study has been reported by our ever-credulous science press as a case of reading thoughts by brain scanning. It is no evidence of any such thing.
