Opinion: Lessons for scientists from the All of Us Research Program backlash
As scientists, we are taught early in our training that criticism is a fundamental part of the job. What we are not usually trained to navigate, however, is public backlash, which is exactly what followed the publication of the comprehensive genomic sequencing results from the All of Us Research Program. The study aims to add the genomic information of 1 million volunteers from normally underrepresented genetic backgrounds to datasets that have so far been made up mostly of people of European descent.
The publication faced public criticism almost instantaneously. At the heart of the matter is how the study presented the diversity of race and ethnicity in its dataset. It gets a little technical, but the criticism centers on the researchers' use of a type of graph called uniform manifold approximation and projection (UMAP). UMAP reduces the complexity of a given dataset to something that can be plotted in a classic 2D graph. To do so, it prioritizes keeping similar data points close together, and the distances between the resulting groups are not preserved faithfully. As a result, UMAP is almost designed to find and exaggerate differences, and it can create apparent patterns that might not exist in the original data. In other words, the graph reinforces the misconception that race and ethnicity correspond to neatly distinct genetic groups.

