Saturday, March 11, 2023

Misleading Tricks of the Latest Claim of Mind-Reading by Brain Scans

In a previous post entitled "Suspect Shenanigans When You Hear Claims of 'Mind Reading' Technology" I discussed some of the tricks used by people claiming that brain scans can reveal mind activity. I discussed one example: a researcher who had used an incredibly elaborate analysis pipeline that included brain scans to create movies. I pointed out that brain scans were only one element in that extremely elaborate set of inputs, and that it was misleading to claim that the output movies were generated from brain scans. I stated this:

"This bizarre and very complicated rigmarole is some very elaborate scheme in which brain activity is only one of the inputs, and the main inputs are lots of footage from Youtube videos.  It is very misleading to identify the videos as 'clips reconstructed from brain activity,' as the clips are mainly constructed from data other than brain activity. No actual evidence has been produced that someone detected anything like 'movies in the brain.' It seems like merely smoke and mirrors under which some output from a variety of sources (produced by a ridiculously complicated process) is being passed off as something like 'movies in the brain.' "

Recently we had another case of the press fooling us with untrue claims about mind-reading brain scans.  The Daily Mail gave us this bogus headline: "Scientists can now read your MIND: AI turns people's thoughts into images with 80% accuracy." Vice.com gave us this equally untrue headline: "Researchers Use AI to Generate Images Based on People's Brain." Upon analyzing the scientific paper that inspired these stories, I was able to figure out how the misleading "sleight of hand" is being done. It's a "fool you" mashup methodology. 

The paper is the one you can read here. It has the very misleading title "High-resolution image reconstruction with latent diffusion models from human brain activity." What is going on is that the researchers used an analysis methodology in which the actual brain scans are a superfluous input. They got their AI outputs using a technique that did not require brain scans at all. The paper title is misleading because it implies that such brain scans were a crucial input, when they were an unnecessary one.

Below is an explanation of how it worked:

(1) There is a Natural Scenes Dataset that was created by some morally dubious excessive-seeming fMRI scanning in which subjects were brain-scanned with high-intensity 7T scanners for about 40 hours each while looking at natural scenes (a medically unnecessary risk to these subjects). That dataset is described in the paper here, entitled "A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence."

That dataset was created using images from the Microsoft Common Objects in Context (COCO) image dataset. The authors of that paper say, "We obtained 73,000 color natural scenes from the richly annotated Microsoft Common Objects in Context (COCO) image dataset." These authors then brain-scanned people with 7T fMRI scanners while they were looking at these images.

(2) The authors of the new paper ("High-resolution image reconstruction with latent diffusion models from human brain activity") traced the images of the Natural Scenes Dataset back to their COCO source, as they admit by saying, "The images used in the NSD experiments were retrieved from MS COCO and cropped to 425 x 425 (if needed)." The COCO database includes text annotations for each image: one or more words identifying what the image shows. The authors of the new paper clearly indicate that they grabbed these text annotations from the COCO database, saying that they used an "average of five text annotations associated to each MS COCO image." 

(3) Having a text phrase associated with each image they used from the Natural Scenes Dataset, a phrase identifying what each image was, the authors fed such text phrases as inputs to the Stable Diffusion generative AI, which can generate multiple images from a text phrase (see the code sketch after this list). In case you have not tried Stable Diffusion, which you can try using this link, it works like the example below. I typed in "spooky snowy castle" as the prompt, and the AI generated four images of spooky snowy castles:


(4) Some additional use was made of the actual brain scans from the Natural Scenes Dataset, but that use was not necessary. It was probably just a little "icing on the cake," a way for the authors to "cover their tracks" with convoluted rigmarole, making it harder for readers to track down the main way their images were generated. 

(5) The authors then incorrectly claimed that they had done "image reconstruction ... from human brain activity." In fact, the human brain activity was a superfluous input, not a necessary part of the process. The method used to get the Stable Diffusion output images would have worked just fine without any brain scan data at all. All you need to get good Stable Diffusion output images of some particular type is a text prompt. And the researchers had obtained the appropriate text prompts by matching the images of the Natural Scenes Dataset with the source of those images (the MS COCO dataset), which has text phrases describing each of the images.
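To make the "backdoor" of steps (2) and (3) concrete, below is a minimal sketch in Python. It is my own illustration, not the authors' actual code, and it assumes the pycocotools and diffusers libraries are installed; the annotation file path and image ID are hypothetical placeholders. Notice that nothing in it reads a single bit of brain scan data:

# Minimal sketch of the "data backdoor": look up the human-written COCO
# caption for an image an NSD subject was shown, then feed that caption to
# a text-to-image model. Paths and the image ID below are hypothetical.
from pycocotools.coco import COCO
from diffusers import StableDiffusionPipeline

# Step (2): retrieve the caption for a COCO image. (NSD images were taken
# from MS COCO, so every NSD image can be traced back to a COCO image ID.)
coco_caps = COCO("annotations/captions_train2017.json")   # hypothetical local path
image_id = 123456                                          # hypothetical COCO image ID
ann_ids = coco_caps.getAnnIds(imgIds=[image_id])
captions = [ann["caption"] for ann in coco_caps.loadAnns(ann_ids)]
prompt = captions[0]   # e.g. a short description like "a man riding a horse on a beach"

# Step (3): hand that caption to a text-to-image model, which will produce
# an image resembling what the subject saw -- with no brain data involved.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
image = pipe(prompt).images[0]
image.save("reconstructed_without_any_brain_scans.png")

Nothing in this sketch touches a single voxel of brain data, yet it yields images resembling what a subject was shown. That is the essence of the backdoor.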

This is a "smoke and mirrors" sleazy trick that you should not be fooled by. The authors incorrectly claimed that they had done "image reconstruction with latent diffusion models from human brain activity," when the human brain activity was not an essential input. The technique used here has no dependency on any brain scan data. The authors claim that they have shown you can "reconstruct high-resolution images with high semantic fidelity from human brain activity." This claim is untrue. Using only the brain activity data, the authors would be unable to create any images corresponding to what the subjects had seen when such brain scans were made. 

There were two cheats here: (1) the use of text annotations (descriptions of the images the brain-scanned subjects saw), descriptive phrases which the subjects never heard or saw; (2) the use of an image-generating AI that took these text phrases as inputs, rather than the brain scans. The authors have not reconstructed what the subjects saw from their brain scans. They have used a sneaky data backdoor to get something they never could have gotten from such brain scans alone. A legitimate attempt to reconstruct what people saw from brain scans would have used only the brain scans, and it would never have succeeded. You cannot identify or reconstruct from brain scans what people saw or thought while their brains were being scanned.  

What is going on here is something rather like the conversation below: 

Jack: Did you know I can tell which restaurant you went to from a list of the items you ordered?
Jill:  Really? Let's try.
Jack: Okay, just give me a receipt you got from some dinner you ordered.
Jill: Okay, here's my receipt from last night. 
Jack: Okay, let me see, you ordered a large pizza and 2 medium Pepsi drinks. Using my astonishing algorithm, I deduce that you went to Santino's Pizza Palace on 34th Street.
Jill: Wow, that's amazing -- you figured out where I ate from what I ordered!

Of course, Jack has done no such thing. Jack is cheating. He simply read the name of the restaurant from the bottom of the receipt. 

As long as I am mentioning the Natural Scenes Dataset, let me mention a very troubling fact about that dataset: it was created by what seems like recklessly excessive scanning of 8 subjects. The paper entitled "A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence" discusses the data collection used to create this dataset: a process in which eight subjects were brain-scanned between 33 and 43 times with 7T scanners more than twice as powerful as the 3T and 1.5T scanners normally used for MRI scans. The paper states this: 

"The total number of 7T fMRI scan sessions were 43, 43, 35, 33, 43, 35, 43, and 33 for subj01–subj08, respectively. The average number of hours of resting-state fMRI conducted for each subject was 2.0 hours, and the average number of hours of task-based fMRI conducted for each subject was 38.5 hours."

This was in addition to other 3T scans the subjects were given.  The paper makes no mention of any consideration of health risks to these people, who received $30 per hour for the medically unnecessary scans. A 7T scanner would presumably have more than twice the risks of the 3T scanners.  Referring to mere 3T MRI scans, the 2022 paper "The effects of repeated brain MRI on chromosomal damage" found that "The total number of damaged cells increased by 3.2% (95% CI 1.5–4.8%) per MRI." The paper was referring to "DNA breaks" that have a possibility of increasing cancer risks. There is no medical need for anyone to receive more than one or a few MRI scans. Scanning subjects for 40 hours with 7T scanners seems rather like playing Russian roulette with the health of subjects, who might one day get cancer or dementia from such excessive scanning. It is dismaying that people were lured into undergoing such risks for "chump change" payments such as $30 per hour. 
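To give a rough sense of why those numbers are alarming, here is a purely illustrative back-of-envelope calculation in Python. It naively assumes that the 3.2%-per-scan figure from the 3T study applies unchanged to each 7T session and accumulates across roughly 40 sessions; nobody knows whether either assumption holds, and the true effect of repeated 7T scanning could be larger or smaller:

# Purely illustrative back-of-envelope arithmetic, NOT an actual risk estimate.
# Naive assumptions: the 3.2%-per-scan increase in damaged cells reported for
# 3T MRI applies unchanged to each 7T session, and the increases accumulate.
per_scan_increase = 0.032    # 3.2% more damaged cells per MRI (3T figure)
sessions = 40                # roughly the number of 7T sessions per NSD subject

additive = sessions * per_scan_increase                    # simple additive tally
compounded = (1 + per_scan_increase) ** sessions - 1       # multiplicative tally

print(f"Additive tally:   {additive:.0%} more damaged cells")    # about 128%
print(f"Compounded tally: {compounded:.0%} more damaged cells")  # roughly 250%

Under either crude tally, the implied cumulative cell damage is far beyond anything a patient would incur from ordinary diagnostic scanning.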

All future claims of generating images from brain scans should be regarded with the greatest suspicion whenever they make any use of the Natural Scenes Dataset. I have explained above how there is a tricky "backdoor" method by which anyone can generate, from that dataset, images very similar to the images that the poor over-scanned subjects saw when the brain scans of that database were made. 

We can expect to see in the future some additional studies using sleazy tricks such as the one described here. There will be more and more confusing methodology papers that use complicated technological mashups that leverage AI and data backdoors.  Don't be fooled by such shenanigans. It is never possible to figure out what someone thought or saw from merely looking at brain scans, and any new paper suggesting otherwise will almost certainly be using complicated trickery designed to hide its sneaky sleight-of-hand. 

Postscript: Above I stated, "You cannot identify or reconstruct from brain scans what people saw or thought while their brains were being scanned." This statement is not at all discredited by papers such as the mistitled paper "The Code for Facial Identity in the Primate Brain." That paper does not meet good standards of experimental neuroscience. The paper is not a pre-registered paper that committed itself to one exact method of analysis before data was analyzed. The paper is one of those papers in which you get the suspicion that the authors were playing around with countless types of statistical analysis before ending up with what they reported. The analysis pipeline they report is some hopelessly convoluted and arbitrary rigmarole that fails to provide any convincing evidence for any such thing as a code for representing faces in brains. The statistics involved are so convoluted a can of worms (or perhaps we should say  "vat of worms") that it smells like irreproducible results.  I may note three fundamental failures:

(1) The lack of pre-registration, leaving the authors free to "keep torturing the data until it confessed" (see the simulation sketch after this list).
(2) The lack of any blinding protocol, a necessity for a paper like this to be taken seriously.
(3) The use of only two monkey subjects (in a correlation study such as this, 15 subjects would be the minimum for a slightly impressive result). 
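To see why the lack of pre-registration matters so much, here is a small illustrative Python simulation (my own sketch, assuming numpy and scipy, not anything from the paper). It generates pure noise and tries twenty analysis variants on it; for simplicity the variants are treated as independent, whereas real analysis variants reuse the same data, but the inflation effect is similar:

# Illustrative simulation of the "keep torturing the data" problem:
# pure noise, analyzed 20 different ways, will usually yield at least
# one "statistically significant" (p < .05) correlation.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_experiments = 2000   # simulated studies
n_variants = 20        # analysis choices tried per study (filters, windows, etc.)
n_samples = 30         # data points per analysis

hits = 0
for _ in range(n_experiments):
    found = False
    for _ in range(n_variants):
        x = rng.normal(size=n_samples)   # pure noise
        y = rng.normal(size=n_samples)   # pure noise, unrelated to x
        _, p = pearsonr(x, y)
        if p < 0.05:
            found = True
            break
    hits += found

print(f"Studies finding at least one 'significant' result: {hits / n_experiments:.0%}")
# Expect roughly 1 - 0.95**20, i.e. about 64%, even though there is nothing to find.

An analyst free to keep trying analyses on noise will "find" something most of the time, which is exactly why pre-registration matters.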

At the NBC News web site, we have a story entitled "From brain waves, this AI can sketch what you're picturing." The title gives the incorrect idea that scientists were trying to reconstruct what people were imagining (a common deceit of stories like this), although the actual study only involved what people were seeing while their brains were scanned. The story claims, "The resulting generated image matched the attributes (color, shape, etc.) and semantic meaning of the original image roughly 84% of the time." That's not a claim made by the scientific paper, which reports an accuracy of only about 21% in its "Results on Different Subjects" section. The study used annotations in the COCO database (text descriptions of the images), so it apparently used the same kind of data backdoor trick described above. Again, we are given the misleading impression that images are being reconstructed (or the content of images guessed) based solely on brain scans, when no such thing is happening. Instead, the content of what someone saw is being guessed based on brain scans plus lots of data other than the brain scans.  

Misrepresentations of what went on in studies of this type are extremely common in the press. We may be told that such and such a study identified what people were thinking or picturing when the study merely involved people whose brains were scanned when they were seeing something or speaking words. 

The latest in misleading poor-quality science papers trying to insinuate mind-reading by brain scans is the paper "Semantic reconstruction of continuous language from non-invasive brain recordings." A few subjects were brain-scanned for 16 hours, and attempts were made to predict what they had heard, using both brain scans and "a generative neural network language model that was trained on a large dataset of natural English word sequences" in order to get some ability to predict the words that would follow from a sequence of previous words. We read, "Given any word sequence, this language model predicts the words that could come next." The meager results produced had a statistical significance of only "p < .05," which is very unimpressive. That's the kind of result you would expect to get by chance in one out of 20 tries. Again, we have the misleading formula of getting output from "brain scans plus some other huge thing," with the small claimed success coming mostly from the other huge thing, not the brain scans. No actual evidence has been provided that you can reconstruct what people were thinking or hearing from brain scans alone. For the sake of this piece of misleading parlor-trick junk science, some subjects had their brains scanned for 16 hours, a medically unnecessary risk that may have increased their chance of getting cancer or dementia. 
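To make the "one out of 20" point above concrete, here is a short illustrative Python simulation (my own sketch, assuming numpy and scipy): it runs many significance tests on pure noise and counts how often p < .05 comes up anyway.

# Illustration of why "p < .05" by itself is weak evidence: when there is
# no real effect at all, about 1 test in 20 still crosses that threshold.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
false_positives = 0
n_tests = 10000
for _ in range(n_tests):
    a = rng.normal(size=30)   # group A: pure noise
    b = rng.normal(size=30)   # group B: pure noise, no real difference
    _, p = ttest_ind(a, b)
    false_positives += p < 0.05

print(f"Fraction of null tests with p < .05: {false_positives / n_tests:.1%}")
# Prints roughly 5%, i.e. about 1 in 20, exactly as described above.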

Ever eager to produce interesting-sounding click-bait that helps increase page views and advertising revenue, the press has jumped on this story, producing some very misleading accounts that do not accurately describe the research and fail to tell us that the results come mainly not from analysis of brain scans, but from a high-tech AI trained to anticipate the most likely words that would follow from some text. 
