Good morning, everyone. We'll get started with Doctor Huttenhower. Thank you all for being here for combined surgery and anesthesia grand rounds. We have the pleasure of having Doctor Curtis Huttenhower join us today to present on the molecular detail in human microbiome population studies. He is professor of computational biology and biostatistics at the Harvard T.H. Chan School of Public Health, and an expert in microbiome research. As many of us know, this is a very exciting area of investigation that pretty much applies to every disease, so everyone in the room should have something to relate to in this field of study. So thank you, Doctor Huttenhower. I look forward to hearing what you have to say.
Oh, thanks so much for the invitation to come down this morning. I was just saying we're right down the street, but I haven't been down here in a while, so it's great to come and interact with a new audience. I'm really excited to get feedback on any of the work that we've been doing in microbiome population studies lately, so please interrupt with questions at any point. The lab is just down the street at the School of Public Health. We're split between there and the Broad Institute, and we work generally on a combination of computational methods development for microbial community studies and applications in human population studies. Particularly in the environment of the School of Public Health, this is driven by the past 5 to 10 years' worth of work in linking the microbiome to chronic, often immune or inflammatory, disease.
So I'll speak today quite a bit about inflammatory bowel disease, but we work in IBD along with other conditions like rheumatoid arthritis or type 1 diabetes that are strongly immune-linked and increasingly associated with changes in the microbiome, either causally or responsively, but not necessarily driven by a single pathogen the way that infectious disease might be. And one quick comment on some of the School of Public Health's interest generally in microbiome population studies. I'll mention this again at the end, but we've just recently launched a platform for large-scale human microbiome population studies. The platform itself has the awkward name BIOM-Mass, or Biobank for Microbiome Research in Massachusetts. It was funded by state tax money, so we have to be properly grateful. But more generally, it's a standardized platform that lets us collect and analyze human microbiome samples from large populations quickly, efficiently, and cheaply, including both stool and oral samples, and both sequenceable, molecularly analyzable samples and culture-compatible samples. So we can take them to a dish and grow bugs, or we can take them to mice and humanize them to do more detailed follow-up. I'll say a little bit more about this at the end. Our current flagship launch collection is targeting 25,000 samples from within one of the school's large epidemiological cohorts, the Nurses' Health Study II. We're just starting this collection, and I'm happy to chat a little bit more about both the platform for collecting microbiome samples generally and the specific collection later on. Oh, and I almost forgot to show the picture of the standardized kit that we've developed for the platform, which includes both this small collection of tubes for convenient at-home stool collection.
You can drop it in the mail afterwards. And then there are things like standardized questionnaires and metadata collection instruments to get dietary information, medication information, and environmental information along with that. As a slightly broader background, I'll spend a lot of time today talking about results from studying the human microbiome, and I wanted to give just two slides' worth of background on some of the techniques for studying microbial communities generally, and the human microbiome specifically. I've already mentioned stool and oral microbiome samples. Basically, all microbial community studies start with some type of culture-independent sample, be it a stool tube or a skin swab. And I would break down the types of molecular profiles that we tend to work with into three broad categories, only one of which is sequencing. Sequencing is perhaps the best-known tool currently for studying microbial communities. But in various studies, we and others will integrate both other types of molecular profiling and other types of culture-independent cellular profiling of microbial communities to get a more complete picture of how microbes interact with each other, or with their hosts, be it a human being or a mouse or otherwise. And of course, not all of these methods have to be culture independent. There are increasingly good methods for microculture or high-throughput profiling of cultured microbes to incorporate information like microbial metabolism into a microbial community study, along with other culture-related or high-throughput culture-independent cellular assays, such as direct imaging or things like flow cell sorting to get microbial counts from a community. So these types of cell-targeted assays complement different molecular assays.
Again, those are either sequencing or mostly mass spectrometry-based methods for metabolomics or proteomics that can be carried out on a whole microbial community, the same way they might be on human cells or on a microbial isolate. And then, of course, most of the data that we and others work with for microbial community profiling is sequencing-based, either DNA or RNA, and you'll see two main types of sequencing applied to microbial communities. The first is amplicon-based sequencing, which you'll often see abbreviated as 16S-based sequencing. It targets a single widely conserved microbial gene for sequencing, in which each variant tells you which bug it came from. So amplicon or 16S sequencing is a very cheap, very efficient way of just counting up which bugs are there. Shotgun metagenomic or metatranscriptomic sequencing, instead of targeting just one gene, just one locus, sequences a little bit of everything, exactly like you would for a single complete genome. And so this can give you a more complete picture of a microbial community. It can tell you what genes, pathways, and metabolism are going on, but at the cost of some additional complexity. Amplicon sequencing is typically used to get a microbial census, what I'll call a taxonomic profile, that quantifies which bugs are there. And so, whenever I say bugs or taxa, or occasionally microbial species, any of these are summaries of counting up which microbes are present. For amplicon sequencing data, that typically means just bacteria, occasionally fungi. For shotgun metagenomic profiling, you can use taxonomic information to also quantify archaea or viral members of a microbial community. And with both amplicon and, increasingly, metagenomic profiling, you can zoom in not just to the level of which species are present, but to very specific variants within those species.
So microbes of the same species can differ by as much as 1/3 of their genome. Not SNPs, not individual nucleotide variants like you would see in human genetics, but whole gene gains and losses, hundreds or thousands per genome within the same species. So being able to see in great detail which strains, the exact version of each microbe, are present can help to link those microbial variants to human health phenotypes. And then shotgun metagenomics provides other types of information that can help understand what those microbes are doing in a community, including functional profiling, which lists not just which bugs are there, but which genes and pathways they're carrying. Or, increasingly, you can assemble whole new microbial sequences out of a metagenome without needing to isolate and culture individual microbes. And those near-complete genomes will also in turn tell you which bugs are there and what genes, pathways, and metabolism they're carrying out. So with this sort of one-slide background on the types of data that can be thrown at a microbial community, the types of downstream biology you'll see range anywhere from human population epidemiology down to very basic microbiology. Often we're interested in studies in which we associate some kind of change in the microbiome at the human population scale with a health outcome. So, as we'll see later, if we take a collection of patients with inflammatory bowel disease and look at their response to treatment or the progression of their IBD, we can ask whether there are changes in their gut microbiome that correspond with those health outcomes. If you're interested in more basic molecular biology, it's easy to take these types of data and zoom in on individual enzyme variants.
If you want to get a collection of, for example, mucus-degrading genes from different microbes in the gut microbiome, you can take these data and zoom in on just those metabolic gene families within a community. If you're interested more in comparative genomics or microbial evolution, you can look at variants of the same organism across populations. A classic example might be something like Helicobacter pylori, where the microbial genetics is essentially reflective of the host human genetics. So you can ask about co-evolution, or about interactions of microbes with each other in a community and with the host, using this type of comparative sequence information. Or if you're interested in changes over time, biophysics, or microbial community dynamics, you can ask about changes in microbial transcription, post-transcriptional and molecular activity, or microbial growth. Anytime we sequence a microbial community, we're essentially asking what the copy number of each microbe is in that community. So we can assess changes in how many of which microbes are there over time, or in response to nutrient availability or perturbations like host medications or antibiotics. So any of these downstream biological questions are sort of fair game, given the right slices of analyzed data of any of these raw types, be they sequencing-based or some of those other examples that I showed. So now, to see how all of this works in practice, I'll use many examples today from the Human Microbiome Project, which is an NIH-sponsored project that we've worked with for quite a while now, starting with the first phase of the HMP, which wrapped up several years ago and had the mandate of really defining baseline microbiome variation in a healthy population. I already mentioned that different microbes of the same species can differ by thousands of genes. Different humans, of presumably the same species, typically share 99.9% plus of our DNA.
But we typically don't share more than 10 to 20% of our microbes. So even identical twins, for example, will share 10 or 15% of their gut microbes, and even fewer in more exposed microbial communities like the skin or the oral microbiome. So the HMP1 set out to define what the bounds of this sort of normal variation are. There are many perfectly healthy microbial community configurations that make up the baseline human microbiome in different populations. So I'll use a couple of examples from the HMP1. I'll spend a little more time talking about the second phase of the project, or the HMP2, which had a more functional and host-targeted mandate, in which we specifically wanted to study microbiome responses to disease, so perturbations from this baseline dynamically over time, and ask how host and microbial interactions and molecular activity responded to changes in disease. This is where we were looking specifically at changes during inflammatory bowel disease. I won't describe them today, but there are two other parallel projects, one taking place at Virginia Commonwealth University studying pregnancy and preterm birth, primarily in the vaginal microbiome, and one taking place at Stanford and Jackson Labs looking at changes in the gut microbiome during type 2 diabetes and weight gain. So we looked at the gut microbiome and inflammatory bowel disease, which I think is interesting both as an important health condition in and of itself and as a model for broader, complex microbiome-linked disease. And of course, we're not the first people to think this. It's surprising how far back you can go and find remarkably modern language describing overall ecological shifts, or dysbioses, in the gut microbiome during IBD.
This arose out of some of the earliest work on IBD as a defined condition, starting in the 1940s and 1950s, when antibiotics started to become more widely available as a treatment to which some IBD patients are responsive, but not all, and in the absence of a single culturable pathogen that seemed to cause the disease in a Koch's postulates sense. This was maintained throughout the 1950s and 1960s as more targeted treatments for inflammatory bowel disease became available that aren't precisely antimicrobials, but which can disrupt the gut microbiome generally. And then even more specifically during the 1970s and early 1980s, when genetically modified mice started to be used, some of which, with immune lesions, would recapitulate IBD-like phenotypes and also carry a disrupted gut microbiome. So all of this suggests that something that's not just a single pathogen, but is still microbial, is associated with the disease or diseases. But of course, this became even clearer in the past 10 years or so, during which we could use sequencing to study these gut microbiome shifts in more detail. So very rapidly, our understanding of the microbiome and IBD progressed from this detection of general dysbiosis or reduced ecological diversity, where you start to miss typical gut bugs during inflammation, down into an understanding of which specific clades or groups of microbes tend to become more or less abundant during inflammation in the gut in different individuals, and to some degree how these associate with the major clinical subsets of inflammatory bowel disease. You'll see CD on the slides as an abbreviation for Crohn's disease, and UC as an abbreviation for ulcerative colitis, within inflammatory bowel disease. So I'll show some of the results from studies of the IBD microbiome over the next few slides.
But again, I think this is in some ways even more important as a general model for complex disease linked to the microbiome. Our understanding of genetic conditions has changed over the past 10 or 20 years into a continuum. It runs from single-locus conditions like cystic fibrosis, where the large effect on the disease comes from essentially a single gene, through complex genetic traits like height or intelligence, or diseases like type 1 or type 2 diabetes that have a complex genetic architecture, where there's not just one gene that causes them, but there's a combined, emergent functional property of many small effects spread out over related genes. We see something similar in the microbiome, where the parallel to a Mendelian disease is essentially a pathogenic disease, an infectious disease in which there is a single bug of large effect that causes that condition. Complex disease in the microbiome, like inflammatory bowel disease, seems to share this emergent functional architecture with complex human genetic conditions, in which there's not just one bug that causes the disease, but there's an emergent functional interaction between many microbes in an ecology like the gut. And we'll see more examples like that from IBD. So during the HMP2, in order to study this, we followed just over 100 individuals for 1 year each, collecting mainly stool samples every 2 weeks, along with a collection of other specimens we'll see in a second here. These were roughly 50% Crohn's disease patients, 25% ulcerative colitis patients, and 25% non-IBD controls, some healthy, some with other GI conditions. They were split about fifty-fifty between adult and pediatric patients, and also about fifty-fifty between recent-onset and established disease patients.
So we had a good spread both geographically, from different clinical centers, and phenotypically, across inflammatory bowel disease phenotypes. Again, we followed each of these subjects for one year. They very graciously provided us biweekly stool samples using the predecessor of that home collection kit I showed earlier, along with colonoscopic biopsies at enrollment and roughly quarterly blood draws. And this combination of specimens allowed us, as per the NIH mandate, to get a wide swath of different molecular measures of host and microbial activity during disease. Most of our stool-targeted assays were looking more at microbes, including sequencing, both metagenomics and metatranscriptomics, some of the mass spec-based molecular profiles like metabolomics and proteomics that I mentioned earlier, and virally targeted sequencing. And then, mainly from our tissue specimens, we were able to get molecular profiles of host activity, including gut transcriptomics, epigenetic sequencing both from colon biopsies and from circulating blood, serological profiles, and exome sequencing of the host as well. One of the nice things about the design is that many of these different measures came from the same or nearly the same samples and time points, which allows us to combine and compare them later on. So if we think about the 1600 or so metagenomic sequences from the stool samples as a baseline, about half of those, 800 or so, also included metatranscriptomics. About a third of them, 400 or so, included proteomics, about 500 included metabolomics, and so on, down to about 300 stool samples in particular that were hit with all of the half dozen different molecular profiles that we're able to generate for stool, in addition to this baseline information from biopsies and blood draws as well.
So we were really able to take advantage of and integrate a pretty wide swath of different molecular profiles of the microbiome during analysis of the population. All of these data are being made available on the project's web portal, the Inflammatory Bowel Disease Multiomics Database, or IBDMDB. This includes an overview of the study design and the project, and the protocol documents, as well as detailed information about how we collected and handled all of these samples. So if you're interested in how that kit was put together, what the components were, how the samples were stored, or how the sequencing was carried out, all of those detailed protocol documents are available, along with the raw data and the processed data, alongside the metadata for the population. So anything that we're allowed to share, we've posted publicly on the site. Anything like human sequencing that has to be protected is available through dbGaP instead, so it's still available, but through controlled access. As a starting point from all of these data, it's easy to take any one measurement type and ask how that particular molecular profile of host or microbial activity changes during inflammatory bowel disease. We and others have done this with microbes many times now. One of the newer profiles that we could take a look at here is stool metabolomics, and how small-molecule chemicals that are both host and microbially derived change during inflammatory bowel disease. The overall pattern is similar to what you'll see for many types of gut inflammation, where there's a slightly stronger signal in Crohn's disease and a parallel but slightly less statistically significant signal in ulcerative colitis. Here we've summarized by category collections of individual metabolites that are either enriched or depleted during inflammation.
So, for example, overall the class of sphingolipids, which are a combination of host and microbially derived small molecules, are enriched during IBD. Something like the lactones are depleted. I think this gets more interesting, though, when we start to split this out and integrate different molecular profiles that are enriched or depleted in the gut during inflammatory bowel disease. So we can compare, for example, all of these individual compounds that were more or less abundant during gut inflammation with the microbes that were also more or less abundant. We'll use a consistent annotation here, with red for Crohn's disease and orange for ulcerative colitis. Frequently they overlap, which is the darker red. And then blue is for our non-IBD control individuals. So in this heat map, we're showing correlations, so covariation, between individual microbes that were more or less abundant during inflammatory bowel disease and small-molecule metabolites in the gut that were also more or less abundant. And these are actually correlations after regressing out, after taking out, the signal from covariates like age, antibiotic usage, or other medication, and the effect of the inflammation itself, of the disease itself. So these are associations between bugs and small molecules that are often disease linked, but are stronger than just the effect of the inflammation itself. So it becomes very suggestive of potential mechanisms by which a bug and a small molecule might interact during disease. A correlation might occur here because a bug generates one of these small molecules metabolically in the gut. It might occur because the bug depends on one of these small molecules as a metabolic input instead. Or you might get an association between a bug and a metabolite because that bug interacts ecologically with another microbe that generates or depends on that small molecule.
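To make the idea of correlating after regressing out covariates concrete, here is a minimal sketch in Python. It is not the project's actual analysis pipeline: the simulated data, the choice of ordinary least squares, and all function and variable names (`solve`, `residualize`, `pearson`, `confounded`, `linked`) are assumptions made for illustration. It regresses each feature on disease status and age, then correlates the residuals, so an association that exists only because both features track disease should largely vanish, while a direct microbe-metabolite link should survive.

```python
import random

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def residualize(y, covariates):
    """Residuals of y after least-squares regression on the covariates plus an intercept."""
    X = [[1.0] + list(row) for row in covariates]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Simulated (hypothetical) data: 200 samples with disease status and age as covariates.
random.seed(0)
n = 200
disease = [float(random.randint(0, 1)) for _ in range(n)]
age = [random.uniform(10, 70) for _ in range(n)]
covariates = list(zip(disease, age))

microbe = [2.0 * d + random.gauss(0, 1) for d in disease]     # tracks disease
confounded = [3.0 * d + random.gauss(0, 1) for d in disease]  # tracks disease only
linked = [0.8 * m + random.gauss(0, 1) for m in microbe]      # tracks the microbe directly

raw = pearson(microbe, confounded)
adjusted = pearson(residualize(microbe, covariates), residualize(confounded, covariates))
direct = pearson(residualize(microbe, covariates), residualize(linked, covariates))
print(f"raw={raw:.2f} adjusted={adjusted:.2f} direct={direct:.2f}")
```

In this toy setup, the raw correlation between `microbe` and `confounded` is driven entirely by disease status and shrinks toward zero after adjustment, while the `linked` metabolite stays correlated with the microbe. A real analysis would also need to handle compositional data, repeated measures per subject, and multiple testing, none of which this sketch attempts.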
And we can zoom in again with a third data type accompanying these, the metagenomic and metatranscriptomic shotgun sequencing, to help pick apart these potential hypotheses, by looking at what in the genomes of these bugs might generate or depend on small molecules, or, as we'll see a little bit later, looking at host transcripts to see which host pathways might be sensing and responding to these small molecules. So first, to take a look at some of the microbial mechanisms driving these small-molecule interactions during inflammation, we started with some of the early pilot data from a subset of the total Human Microbiome Project 2 population, specifically the subset of individuals with long, dense time courses. Unsurprisingly, not every individual provided every stool sample over an entire year. So we had an early subpopulation that were the good participants, who provided essentially every biweekly sample. And we looked at the subset of control and Crohn's disease individuals who provided these complete time courses. This is just a quick, very high-level summary to get a sense of what these populations look like. You can see that there's a surprising amount of stability over time within a subject, and this is a general property of the gut microbiome especially, and the human microbiome generally. There's a little less stability specifically in Crohn's disease patients when there are periods of disease activity, and there can be huge differences between individuals. It's a little hard to read, but these colors are just whole phylum-level summaries of which bugs are present. Some individuals are mostly cyan, some individuals are mostly purple, some individuals are mostly blue, and these are completely different phyla of microbes. And again, these are typical differences between even healthy individuals. There's a lot of variation in the microbiome.
So, to help understand why some of this variation is disease linked: because people have such different microbes, even at baseline, many of the taxonomic associations that have been found with respect to IBD are quite broad. They're different between individuals. So there are groups of bugs that are typically enriched or depleted, but they're not the same bugs from person to person. Instead, what seems to be one of the commonalities is that there will be bugs that carry some particular set of molecular or metabolic functions, and those functions tend to be enriched or depleted regardless of which bug they're in. One of the strongest signals in that space for IBD and other gut inflammatory conditions is the response to oxygen in the gut. The lumen of the gut is typically anaerobic. There's more oxygen availability up against the mucosal surface, and this becomes even greater during periods of inflammation, both due to leakage from the host and as an active immune mechanism, an antimicrobial mechanism. So microbes that are aerotolerant, that can tolerate or take advantage of that oxygen availability, tend to become more abundant. They can take advantage of it and grow during inflammation in the gut. We can see this across many different organisms when we group together the class of aerotolerant or facultatively anaerobic organisms in the gut. We see a strong enrichment for aerotolerant organisms in the HMP2 IBD patients compared to controls. And the same thing is true when we look at other populations as well. There's a depletion of aerotolerant organisms in the healthy subjects from the HMP1, or if we take a completely independent inflammatory bowel disease study, there's an enrichment again in the Crohn's disease subset for facultative anaerobes.
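The grouping step described here, summing abundance over a functional class rather than tracking any one organism, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the taxon names, the class membership, and the abundance values are all hypothetical, and the function names `class_abundance` and `mean` are not from any microbiome toolkit.

```python
def class_abundance(profile, members):
    """Sum relative abundance over all taxa in a profile that belong to a functional class."""
    return sum(abundance for taxon, abundance in profile.items() if taxon in members)

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Hypothetical functional class: which taxa are treated as aerotolerant.
aerotolerant = {"taxon_A", "taxon_B"}

# Hypothetical relative-abundance profiles. Each person carries a different
# aerotolerant taxon, so no single taxon is shared across all individuals.
ibd_profiles = [
    {"taxon_A": 0.30, "taxon_C": 0.70},
    {"taxon_B": 0.25, "taxon_D": 0.75},
]
control_profiles = [
    {"taxon_A": 0.05, "taxon_C": 0.95},
    {"taxon_B": 0.02, "taxon_D": 0.98},
]

# Class-level signal: consistent across individuals.
ibd_class = mean([class_abundance(p, aerotolerant) for p in ibd_profiles])
control_class = mean([class_abundance(p, aerotolerant) for p in control_profiles])

# Single-taxon signal: diluted because only some individuals carry taxon_A.
ibd_single = mean([p.get("taxon_A", 0.0) for p in ibd_profiles])
control_single = mean([p.get("taxon_A", 0.0) for p in control_profiles])

print(ibd_class - control_class, ibd_single - control_single)
```

In this toy example the class-level difference between groups is larger than the difference for any one member taxon, which mirrors the point that the functional property can be enriched even when the specific bugs differ from person to person.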
So this represents a whole class of functionally related organisms that might be phylogenetically very different. They can be very different bugs, but they share this functional property. And to get a sense of how individual organisms within that class behave, if you take one bug like Ruminococcus gnavus, you can typically see some enrichment, but it's not as strong as it is for the overall functional class. And this is the organism that we saw earlier that was strongly correlated with specific small molecules in the gut when more abundant as well. So in order to understand why this happens only weakly in individual organisms, we can zoom even further in and ask what it is about certain strains of Ruminococcus gnavus that seems to make them oxygen and inflammation associated. And it turns out that in this subset of HMP2 data, there's a collection of about 200 genes, shown along the Y axis here, that are uniquely present in just the strains, just the variants, of Ruminococcus gnavus that are present in the Crohn's disease patients. These genes never appear in the strains of Ruminococcus gnavus that are carried by healthy control individuals. If you squint hard, you can see that we do know what some of these genes do functionally. There's a subset, unsurprisingly, that deal with aerotolerance and oxygen utilization. There's a subset that are involved in mucosal adhesion and cellular invasion, so they make these strains of Ruminococcus gnavus look a little more pathogen-like. And interestingly, there are a few that take advantage of other host products like mucus availability or iron availability. But even without squinting, you can see that most of these rows are not annotated. We don't know, at the molecular level, what most of these genes that make these strains of Ruminococcus gnavus IBD specific are actually doing. And this is not a problem unique to this one specific example.
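As a rough sketch of how one might pull out genes that appear only in disease-associated strains, the Python below takes per-strain gene sets labeled by the group of the person carrying the strain, and keeps genes seen in at least one Crohn's disease strain and in no control strain. The gene identifiers, the group labels, and the function name `disease_specific_genes` are hypothetical, and this is not presented as the method used in the study. Real strain-level gene presence calls would come from a metagenomic profiling step that is not shown.

```python
def disease_specific_genes(strains, disease_label="CD", control_label="control"):
    """Return genes found in at least one disease strain and in no control strain.

    `strains` is a list of (group_label, set_of_gene_ids) pairs, one per strain.
    """
    in_disease = set().union(*(genes for group, genes in strains if group == disease_label))
    in_control = set().union(*(genes for group, genes in strains if group == control_label))
    return in_disease - in_control

# Hypothetical gene presence calls for four strains of one species.
strains = [
    ("CD", {"core_1", "core_2", "oxy_1", "adh_1"}),
    ("CD", {"core_1", "core_2", "oxy_1"}),
    ("control", {"core_1", "core_2"}),
    ("control", {"core_1", "core_2", "other_1"}),
]

specific = disease_specific_genes(strains)
print(sorted(specific))
```

A set difference like this is very sensitive to small sample sizes and to missed gene calls, so in practice one would want a prevalence threshold and a statistical test rather than a strict present-or-absent rule.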
If we zoom back out and look at the overall functional profile of molecular processes going on in the gut, ignoring which microbes are carrying them and just asking what gene families and what biochemistry are present, then for a collection of typical gut microbiomes we can usually get good biochemical annotations for maybe a quarter or so of the microbial genes that are present. Another quarter are novel sequences. They're so divergent that they don't even look very much like any sequences that have been sequenced from isolated microbes previously. And then the remaining 50% or so of genes look like sequences that we've seen before, but do not have associated biochemistry. We don't know what they're doing. And this is in the gut. The gut is fairly well studied. We have a lot of microbial isolates that have been well characterized from the human gut generally. Things get dicier when you go to less well-studied components of the human microbiome. Skin, for example, in this case skin from the front of the nostrils, tends to contain an even greater fraction of novel and uncharacterized sequences. And I am occasionally glad that we mostly don't work on non-human-associated microbial communities, because most of the microbes and most of the sequences present in environments like soil or ocean water are even less well studied. So both within the human microbiome and more generally, there is a shocking amount of novel microbial biochemistry that we know is disease relevant in the human microbiome. We can throw a dart and hit something like the example I just showed from Ruminococcus gnavus, where a subset of strains are inflammation associated for biochemical reasons that we can only start to understand at this point. It's highly probable that there are things like novel antimicrobials out here in this huge body of non-human-associated novel microbial sequence as well.
So I think of this as job security. There's an amazing range of really interesting microbial biochemistry out there that's health relevant, both within and outside of the human microbiome. In order to start getting a handle on this novel microbial biochemistry, we have several of the more computational methods development projects going on in the lab, some of which are microbial assembly based, and some of which are more biochemically and metabolically based. One project that we just recently finished up aimed to get a handle on what some of these novel microbial sequences are. So, not necessarily sequences where we know what their biochemistry is, what their metabolic purpose is, but at least getting a catalog of new microbial sequences that are typical in the human microbiome, and which bugs they're associated with. This is work, I should say, by Nicola Segata, who was a former postdoc in the group and has his own lab now, and who did a really amazing study of just under 10,000 total shotgun metagenomes from across the body, including the gut, but also the vaginal microbiome, skin, and oral microbiome. Nicola assembled de novo a total of about 150,000 reasonably high-quality microbial genomes from these 10,000 shotgun metagenomes, about 70,000 of which passed the same quality levels as a microbial isolate genome would. So this means that we can go out to public databases of samples that were taken from the wild, sequenced without isolating individual microbes, and get genomes back that meet the same high quality bar as a microbial isolate genome. So getting microbial genomes is surprisingly cheap and easy these days. When we compared these 70,000 or so high-quality metagenomic assemblies with 80,000 pre-existing microbial isolate genomes, we could group them down to roughly 5000 species-level clusters of related microbes.
So, we can't call them species because these are all new bugs, they don't have a name yet, but these clusters are at about the same level of resolution as a typical microbial species would be. 5000 of these were human associated, the other 11 or 12,000 came from these other non-human-associated microbes, and of these 5000, only about a third corresponded to known microbial species. The other 2/3 were, again, species-level bins, and I'll refer to these as SGBs, species-level genome bins, that did not correspond to a previously observed microbe. If we take these 5000 or so species and ask about their phylogenetic relationship, they follow essentially what you would expect from the human microbiome. So again, these colors, if you squint, are by phylum. There's a collection of Bacteroidetes and Firmicutes and Actinobacteria, typical human-associated phyla. But there's novelty spread all the way around the tips of these new organisms, much of which, and I know this is impossible to see on the projector, is associated with understudied environments outside of the gut across the body, with new developmental stages, like very early in life, during childhood, or later in life during aging, or with international populations. So, much of this novelty was specifically associated with understudied geographical populations, non-Westernized or developing areas. If we look in a sort of typical American or European gut, we'll recognize three quarters or more of the microbes that we find. By assembling all of these new bugs, we were still able to find another 10% or so of microbes there, getting this up to 80 or 90% recognizability. In non-Westernized populations, only about 1/3 of the bugs that we encountered were previously isolated. We can bump this up to more like 3/4 recognizability with these newly assembled microbes.
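As a rough illustration of the species-level genome bin idea described above, here is a minimal sketch that groups genomes by pairwise distance and then labels each bin as known or unknown depending on whether it contains an isolate genome. The 5% distance threshold, the single-linkage merging rule, and the genome names and distances are all illustrative assumptions, not details given in the talk, and the study itself may well have used a different clustering procedure.

```python
from itertools import combinations


def cluster_genomes(genomes, distance, threshold=0.05):
    """Group genomes whose pairwise distance is below `threshold`.

    Single-linkage merging via union-find; the threshold is an assumed
    stand-in for "about the resolution of a typical species".
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster representative.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        if distance[frozenset((a, b))] < threshold:
            parent[find(a)] = find(b)

    bins = {}
    for g in genomes:
        bins.setdefault(find(g), set()).add(g)
    return list(bins.values())


def label_bins(bins, isolates):
    """A bin is 'known' if it contains at least one isolate genome."""
    return [("known" if b & isolates else "unknown", b) for b in bins]


# Hypothetical toy data: one isolate genome and three metagenomic assemblies.
genomes = ["isolate_A", "mag_1", "mag_2", "mag_3"]
distance = {
    frozenset(("isolate_A", "mag_1")): 0.02,
    frozenset(("isolate_A", "mag_2")): 0.30,
    frozenset(("isolate_A", "mag_3")): 0.28,
    frozenset(("mag_1", "mag_2")): 0.31,
    frozenset(("mag_1", "mag_3")): 0.29,
    frozenset(("mag_2", "mag_3")): 0.03,
}
labelled = label_bins(cluster_genomes(genomes, distance), isolates={"isolate_A"})
```

In this toy input, one bin contains the isolate and counts as a known species, while the other is made up only of assemblies and would be one of the previously unobserved bins.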
Not all of the newly identified species-level bins were associated with these international undercharacterized populations, though. Some of them are prevalent everywhere. So, one of the most prevalent organisms that we've seen for years in the normal adult American gut microbiome is an unclassified Subdoligranulum organism, which is an understudied genus that only has a few representative microbes in it. By picking apart this single organism using newly assembled metagenomic sequences, it turns out that this bug is actually a collection of 77 different species-level clades that we can now very clearly separate from each other. Only one of them has a corresponding microbial isolate sequence, which is a Gemmiger isolate, so a slightly different genus. It was included right here. The clade overall is over 80% prevalent, so that means that most of us in this room have one of these organisms in our guts right now, and they've not been previously isolated. This has been the first study that picked apart these new organisms into, again, species-level genome bins. Even from among well studied species like the Bacteroides in the gut, for example, we still find a remarkable amount of functional novelty. So, again, different bugs of the same species can differ by hundreds or thousands of genes. The Bacteroides, which are some of the most common organisms in the human gut, are an extreme example of this. And so by assembling collections of new Bacteroides, even from among well studied species, we can quickly double or more the number of gene families associated with these organisms in the human gut, most of which have something to do with things like polysaccharide utilization.
So again, there's a huge amount of sequence novelty and functional metabolic novelty, ranging from whole new sequences and organisms in new populations, through new organisms in well studied populations, to new genes in well studied organisms, that are becoming accessible through studies of the microbiome. In order to get a handle on what some of these gene families and pathways do metabolically and biochemically, I'll spend one slide on an example of one of the computational methods we've developed, which is a system for functional profiling of metagenomes or metatranscriptomes. So, classifying which genes and biochemical pathways are present and assigning them to specific organisms in a community. The name of the system is HUMAnN, it's an acronym for something. But more importantly, it gives you as output a functional profile of which genes or pathways are present in a community. Again, to get a sense of how some of these analyses work, it starts by taking a quality controlled, unassembled shotgun metagenome or metatranscriptome and first figuring out which organisms are present. So it compares all of these short nucleotide reads to reference databases of just the most unique sequences, sequences that are strongly associated with just one specific organism. Because those are a small subset of sequences, it can very quickly use this to figure out which bugs are present. We then draw from pre-built catalogs of all of the genes that can possibly be associated with those organisms. We can quickly map nucleotides to the genes associated with detected organisms and count up hits. This is a computationally very efficient process. So we can quickly get a catalog of how many copies of which genes in which bugs are present in a metagenome or metatranscriptome. Of course, because there's so much sequence novelty, even in well-studied environments like the gut, this won't capture every read.
So the third tier of this processing takes the remaining reads and performs a more computationally expensive translated search. It will compare nucleotides to a protein catalog that's much larger. So, this is a slower process. It has to work harder to translate the nucleotides and shift reading frames. But it only runs on the small subset of reads for which it is necessary. And this won't necessarily identify which bugs a read comes from, but it will help to figure out which genes or biochemical pathways a read comes from. So you can still count up hits and figure out which genes are there, even if you don't know which bugs they're in. And then finally, if there are reads that still can't be classified this way, this is where in the 4th tier we can assemble these reads, we can stick them back together into fragmentary genomes, and just directly characterize those genomes rather than the reads themselves. So what this provides for a microbial community looks something like what you would get for a transcriptomic study of a single organism like a human being. You get back a list of genes. Since they're coming from a whole mixture of microbes in a community, each gene is broken down into the one or more bugs that contribute that gene to the community, when that's known. Sometimes it's not known, because we only see translated protein-level hits. For each of these gene families, we get a relative abundance in units like CPM or RPKM, like what you would get from an RNA-seq experiment. And then for the subset of these gene families that are biochemically characterized, we can put them back together into pathways and quantify how much of each known biochemical pathway occurs in a particular community, and which bugs are contributing that. So in practice, this allows us to summarize the activity, either DNA copy numbers or RNA transcriptional activity, of a whole microbial community without needing to assemble it.
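The tiered search described above can be sketched in a few lines. This is a toy illustration under strong assumptions, not the HUMAnN implementation: exact substring matching stands in for real alignment, the function and variable names are hypothetical, the sequences are made up, and the output is normalized to copies per million without any gene-length correction.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(read):
    """Translate a nucleotide read in all six reading frames."""
    for strand in (read, read.translate(COMPLEMENT)[::-1]):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            yield "".join(CODON_TABLE[c] for c in codons)


def profile(reads, markers, pangenomes, proteins):
    """Toy four-tier profiling; substring matches stand in for alignment."""
    # Tier 1: detect organisms from their unique marker sequences.
    detected = {org for org, marker in markers.items()
                if any(read in marker for read in reads)}
    counts, to_assemble = Counter(), []
    for read in reads:
        # Tier 2: fast nucleotide search, restricted to detected organisms.
        hit = next(((gene, org) for org in sorted(detected)
                    for gene, seq in pangenomes[org].items() if read in seq), None)
        if hit is None:
            # Tier 3: slower translated search; the organism stays unknown.
            hit = next(((family, "unclassified")
                        for family, prot in proteins.items()
                        if any(frame and frame in prot
                               for frame in six_frame_translations(read))), None)
        if hit is None:
            # Tier 4: reads that are still unexplained are kept for assembly.
            to_assemble.append(read)
        else:
            counts[hit] += 1
    total = sum(counts.values())
    cpm = {key: 1e6 * n / total for key, n in counts.items()} if total else {}
    return cpm, to_assemble


# Hypothetical toy inputs.
markers = {"bug_A": "ATGGCCAAA"}
pangenomes = {"bug_A": {"geneX": "ATGGCCAAATTTGGG"}}
proteins = {"famY": "MKLV"}
reads = ["ATGGCC", "TTTGGG", "AAACTGGTT", "CCCCCCCCC"]
cpm, to_assemble = profile(reads, markers, pangenomes, proteins)
```

The first two reads are assigned to a gene in a detected organism, the third is only recognized at the protein level and so has no organism attached, and the last is left over for assembly.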
So, we've used this, for example, to identify uniquely host-associated processes in the human microbiome. If we take a functional profile of all the different biochemical processes that a microbial community is carrying out, many of those will be housekeeping functions. You can see the ribosome, you can see the cell cycle, and you can see basic processes necessary for microbial life. But if we filter those out and ask only which biochemical processes are prevalent across the human body in any host-associated microbial community, and which do not occur in non-host-associated microbes, we end up with a fairly small set of known pathways that looks something like this. I'm sure there are dozens more pathways like this that haven't been studied yet. So, each panel here is one body area across the human microbiome. We're not looking just at the gut now, but across the body. Each column is one metagenome, so one individual that's been sampled. The height of that column corresponds to the abundance of this biochemical pathway, which in this case is vitamin B12 biosynthesis, and the color of each bar indicates which general group of microbes is contributing this process to that specific community. So something like vitamin B12 salvage is present essentially across the human body. It's prevalent in most microbial environments. It is carried out by completely different bugs, both between different environments and between different individuals in the same microbial environment. And this process tends to be absent from non-host-associated microbial communities. So, this is a function something like a police station or public transit. It is an ecological function that needs to be present in a host-associated microbiome, regardless of which microbe is carrying it out. Again, there are only about 20 or so characterized pathways that we can see that perform these sort of keystone ecological functions that are necessary for the human microbiome.
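The filtering step described above, keeping pathways that are prevalent in every host-associated body site but rare in non-host communities, might look something like the following sketch. The prevalence cutoffs, the data layout (one dictionary per sample, mapping each pathway to its per-organism contributions), and the pathway and organism names are assumptions made for illustration only.

```python
def prevalence(samples, pathway):
    """Fraction of samples in which the pathway has nonzero total abundance."""
    present = sum(1 for s in samples if sum(s.get(pathway, {}).values()) > 0)
    return present / len(samples)


def core_host_pathways(host_sites, non_host, high=0.75, low=0.25):
    """Pathways prevalent in every host body site and rare outside a host.

    `high` and `low` are assumed cutoffs, not values from the talk.
    """
    pathways = {p for samples in host_sites.values() for s in samples for p in s}
    core = []
    for p in sorted(pathways):
        in_every_site = all(prevalence(samples, p) >= high
                            for samples in host_sites.values())
        rare_outside = prevalence(non_host, p) <= low
        if in_every_site and rare_outside:
            core.append(p)
    return core


# Hypothetical toy profiles: pathway -> {contributing organism: abundance}.
host_sites = {
    "gut": [
        {"b12_salvage": {"bug_A": 3.0}, "ribosome": {"bug_A": 9.0}},
        {"b12_salvage": {"bug_B": 2.0}, "ribosome": {"bug_B": 8.0}},
    ],
    "oral": [
        {"b12_salvage": {"bug_C": 1.5}, "ribosome": {"bug_C": 7.0}},
        {"b12_salvage": {"bug_D": 2.5}, "ribosome": {"bug_D": 6.0}},
    ],
}
non_host = [
    {"ribosome": {"soil_bug_1": 5.0}},
    {"ribosome": {"soil_bug_2": 4.0}},
]
core = core_host_pathways(host_sites, non_host)
```

The housekeeping pathway is dropped because it is also present outside a host, while the salvage pathway is kept even though a different organism contributes it in every sample.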
I'm sure there are dozens more that just haven't been identified yet out of millions and millions of microbial genes that we see. There are many more processes that are environment specific. So, we can perform the same type of analysis and identify processes like nitrate reduction that follow a similar ecological pattern, but just within certain body sites. So, for example, here, nitrate reduction is prevalent across the oral cavity. It occurs in different parts of the oral microbiome. Again, it's necessary, it's essentially always present in the oral microbiome, but it's carried out by different bugs in slightly different environments or in different individuals within the same environment. And there are many processes like this, like polysaccharide utilization in the gut, that are prevalent, necessary for that biochemical environment, but carried out by different organisms, either in different environments or in different people. So we can quantify this and start to link the ecology of which microbes are present to the biochemistry of what they need to be doing at the molecular level to survive in that biochemical environment and influence the host biochemically, like we saw with Ruminococcus gnavus and IBD earlier. This ranges from processes that we describe as having low within- and between-subject diversity. So, one of the common ecological properties that you'll see, even for environmental ecology, let alone microbial ecology, are these concepts of alpha and beta diversity. Alpha diversity captures how many different bugs are in a community, beta diversity captures how different they are between communities, between people typically. We can do the same thing now for processes. If we take one pathway, like glutaryl-CoA degradation, this is a low alpha and beta diversity process. It's essentially always present in the gut.
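To make the idea of alpha and beta diversity for a single process concrete, here is a minimal sketch that takes, for one pathway, the per-organism contributions in each subject and computes a within-subject and a between-subject diversity. The specific indices (Gini-Simpson for alpha, mean pairwise Bray-Curtis for beta) are assumptions chosen for simplicity; the talk does not say which measures were used, and the organism names are hypothetical.

```python
from itertools import combinations


def relative(contrib):
    """Rescale one subject's per-organism contributions to sum to 1."""
    total = sum(contrib.values())
    return {bug: a / total for bug, a in contrib.items()}


def gini_simpson(contrib):
    """Within-subject diversity: 0 when a single organism contributes everything."""
    return 1.0 - sum(p * p for p in relative(contrib).values())


def bray_curtis(a, b):
    """Between-subject dissimilarity of two contribution profiles (0 to 1)."""
    a, b = relative(a), relative(b)
    shared = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in set(a) | set(b))
    return 1.0 - shared  # both profiles sum to 1, so this equals Bray-Curtis


def contributional_diversity(samples):
    """Mean alpha and mean pairwise beta diversity for one pathway."""
    alpha = sum(gini_simpson(s) for s in samples) / len(samples)
    pairs = list(combinations(samples, 2))
    beta = sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)
    return alpha, beta


# One organism contributes the pathway in every subject (keystone-like).
keystone = [{"bug_A": 4.0}, {"bug_A": 2.0}, {"bug_A": 7.0}]
# One organism per subject, but a different one each time.
swapped = [{"bug_A": 1.0}, {"bug_B": 1.0}, {"bug_C": 1.0}]
# The same mixture of organisms in every subject.
shared_mix = [{"bug_A": 1.0, "bug_B": 1.0}, {"bug_A": 2.0, "bug_B": 2.0}]

keystone_ab = contributional_diversity(keystone)
swapped_ab = contributional_diversity(swapped)
shared_mix_ab = contributional_diversity(shared_mix)
```

The three toy inputs correspond to low alpha and low beta, low alpha and high beta, and high alpha and low beta, respectively.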
It does not differ between people, and it's essentially always contributed specifically by Faecalibacterium prausnitzii. So this is a keystone function ecologically from this specific microbe. Other processes might be low alpha diversity. They are carried out just by one specific bug in a community, but that bug might differ between people. So, this is an example from the vaginal microbiome in which this particular pathway is essentially always present, but it can be contributed by different bugs in different communities. At the other extreme, we often see pathways that are diverse within the subject and carried out by essentially the same bugs between people. And most commonly, and this is actually the most typical behavior of the human microbiome, you can have pathways, like we saw in the previous slide, that need to be present in order for that community to function ecologically, but they can be carried out by many different organisms, and by different organisms in different people. So these are the more typical behaviors, but we see all of these combinations occurring in the wild for different biochemical processes, ranging from complex contributions that are variable between subjects, down to these sort of keystone processes that are contributed by one bug in every individual. So, I'm a little low on time. I could keep talking about the HMP1 and 2 and our other projects forever. The HMP2 in particular includes just an amazing wealth of host and microbial information. I'll try to quickly get in some of the more clinically oriented information about the IBD phenotypes of the HMP2 subjects on this slide. We had three main outcome measures for inflammation over the course of the one year each. The first was patient-reported indices, the HBI for Crohn's disease and the SCCAI for the ulcerative colitis patients.
We compare these with a molecular measure of gut inflammation by reading out fecal calprotectin. And in turn, we compare these with a microbial measure of dysbiosis in the gut. So basically, how inflamed does your microbiome look, as a measure of disease activity. For the folks in this room, it's probably not surprising that no two of these measures agreed with each other particularly well. Neither the patient-reported, nor the host molecular, nor the microbial measure tended to capture the same aspects of disease activity. But, you know, I'm biased, but I think the one that actually measured IBD activity most accurately at the molecular level was the measure of dysbiosis in the microbiome. And we can see this by looking at the many different types of molecular profiles that were captured. Whenever gut microbial configurations were ecologically unusual, whenever the microbiome at a particular point in time looked different than baseline, was significantly different than the control population, this corresponded to changes not just in which microbes were present, but in which microbial processes were transcribed, which host processes were being transcribed, and small molecule biochemical profiles in the gut, including bile acids and other microbially linked metabolites like short chain fatty acids. And most interestingly, it corresponded to molecular markers of disease activity outside of the gut. So there's a small number of circulating serological markers, antibodies that correspond with either ulcerative colitis or Crohn's disease. ASCA tends to be specifically Crohn's associated. ANCA tends to be specifically ulcerative colitis associated. And one of the things we saw that I thought was particularly remarkable is that these enrichments for circulating antibodies were much stronger during periods of active disease as defined by microbial dysbiosis in the gut.
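One way to turn "the microbiome looks different than the control population" into a number is sketched below. The talk does not spell out how the dysbiosis measure was computed, so everything here is an assumption for illustration: the score is the median Bray-Curtis dissimilarity between a sample and a set of control samples, and a sample is called dysbiotic when its score exceeds a high quantile of the scores that controls get against each other. The taxa and abundances are made up.

```python
from statistics import median


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 to 1)."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    return 1.0 - 2.0 * shared / (sum(a.values()) + sum(b.values()))


def dysbiosis_score(sample, reference):
    """Median dissimilarity of one sample to a set of reference samples."""
    return median(bray_curtis(sample, ref) for ref in reference)


def dysbiosis_threshold(controls, quantile=0.9):
    """Assumed cutoff: a high quantile of leave-one-out control scores."""
    scores = sorted(
        dysbiosis_score(c, controls[:i] + controls[i + 1:])
        for i, c in enumerate(controls)
    )
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]


def is_dysbiotic(sample, controls, quantile=0.9):
    return dysbiosis_score(sample, controls) > dysbiosis_threshold(controls, quantile)


# Hypothetical relative abundance profiles.
controls = [
    {"taxon_A": 0.5, "taxon_B": 0.5},
    {"taxon_A": 0.6, "taxon_B": 0.4},
    {"taxon_A": 0.4, "taxon_B": 0.6},
]
unusual_sample = {"taxon_A": 0.05, "taxon_C": 0.95}
typical_sample = {"taxon_A": 0.55, "taxon_B": 0.45}

unusual_flag = is_dysbiotic(unusual_sample, controls)
typical_flag = is_dysbiotic(typical_sample, controls)
```

Applied to each time point from a subject, a score like this gives a per-visit readout that can then be compared against the patient-reported indices and calprotectin.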
So, we're seeing a correspondence between a host phenotype, a disease outcome, a gut ecological marker in the microbiome, and circulating serology reflected in an antibody level that's both disease activity and disease subtype specific. And again, I'm low on time, but we see some of this reflected in host transcriptional measures as well. So, we can break down differentially expressed genes by tissue in the gut. We had two standardized biopsy locations during colonoscopy, one at the ileum and one at the rectum. Most human transcriptional responses tended to actually be tissue-specific, not disease-specific, not microbiome-specific. They were much more responsive to the local biogeography in the gut. But a subset of these were differential with respect to disease status and inflammation. They tended to be more Crohn's-specific in the ileum, unsurprisingly, and more ulcerative colitis-specific at the rectal biopsies. Most of these genes were not directly interacting with the microbiome, so far as we could tell, but a small subset from pathways like IL-17 signaling or complement would associate weakly with microbial changes right at that mucosal surface. So out of all of the host transcriptional activity going on along the colon, there seemed to be a very small number of very specific, both locationally specific and microbially specific, essentially sensory responses that are linking this inflammatory response in the gut to very specific individual microbes at just one location immediately surrounding that transcript in the gut. So to wrap up, again, I can keep telling story after story from the HMP2 subjects. Every individual is their own story. So, each of these slides is just one subject's data over the course of the year. And you can see changes, for example, here, between completely different microbiome states in one of the Crohn's disease patients.
You can see changes over time in which microbes are present, even in the healthy controls, and I'm going through these quite quickly for the sake of time. Or you can zoom way out and just ask, what are all of the different interacting molecular players that we see in the system, be they bugs, or small molecule biochemical metabolites, or host transcripts, or microbial pathways. And so, we can summarize much of these data all the way from individual stories and individual subjects up to the overall molecular picture, and pick out individual microbes or individual small molecules of interest and ask what these processes tend to interact with between the host and the microbiome during disease. So, I've highlighted today some of both what we know and what we don't know about the microbiome, the latter of which, after working on this for almost 10 years now, I think is still really exciting. There are very basic mechanisms, what small molecules microbes in the gut use to talk to each other and to the immune system, that we're still starting to pick apart. We can identify small molecule metabolites and microbial products, like those Ruminococcus gnavus pathways, that are immune sensed. As far as translating this, I haven't talked very much today about how one can modify the microbiome to alleviate this type of inflammatory response. But of course, that's a very active area of research, as is using the microbiome as a readout for host prognostics. So, we're looking right now at whether some of these microbiome changes are predictive of future response to specific IBD treatments or not. And then finally, up at the more population scale, I've tried to give a sense of how some of these microbiome epidemiological studies are carried out. This, of course, can have an effect on things like health policy and what treatments are recommended.
There's a whole line of research that I didn't discuss today about things like early life exposures to antibiotics, and how that influences microbiome development, immune imprinting, and later allergic and atopic outcomes. So, many of the computational methods that I've mentioned today were developed by the lab. I didn't go into most of the details, but they're all available, free, open source, etc., as part of the lab's suite of tools called the bioBakery. So, feel free to take a look there if you want more on the biostatistical or bioinformatics side. There's documentation, tutorials, cloud images, hopefully all of the pieces that you need to do your own microbiome work if you're interested. And then finally, I want to put in one quick plug at the end. As of yesterday, we have online some of the information for our kickoff symposium for the Center for the Microbiome and Public Health that the School of Public Health is launching. So, hopefully, this is now correctly live. It was when we checked it out yesterday. So, I'd like to make you the first to take a look at the symposium online if you're interested in attending in a couple of months, this May, when the school launches the initiative. So, many thanks again for the invitation and for joining this morning. I'm happy to take any questions if there's still time. Thanks so much.
Well, Doctor Huttenhower, I'd like to thank you for bringing to us this morning some remarkable information. Much of it, I'll confess, is far above my head. And I'm sure there are some questions from the audience. The one thing that comes to mind is from some of the later slides, which showed the microbiome changing so dramatically in the patients with inflammatory bowel disease.
How do you sort out whether that is secondary to the therapies that the patients are receiving or whether it has some role in their current disease state?
We've taken a few different approaches to teasing apart the effects of inflammation in the gut from the effects of medication on the gut. One of them is cross-sectional, so we can essentially regress out effects of sufficiently prevalent medications and ask whether bugs or small molecules are perturbed in the same way across the population, regardless of whether an individual is receiving a particular medication, be it biologics or steroids or antibiotics or what have you. The second method is longitudinal: we can use the within-subject changes over time to assess whether the same perturbations are there before and after medication, basically. So, whether something is already going wrong, so to speak, before a subject starts to receive medication, or whether the same thing continues to go wrong over time. The third is actually not so much in this study, but in another pair of studies, RISK and PROTECT, that recruited new-onset Crohn's and ulcerative colitis patients, respectively. In both of those studies, they specifically recruited only subjects that were not previously medicated and assessed which microbiome shifts were present in those individuals before medication. We've done something similar now in rheumatoid arthritis. I'd say in all three of those cases, it's been easy to tease apart the effect of medication versus inflammation. The latter, for most medications, tends to be stronger. There are only a few medications, like antibiotics, metformin for diabetes, and to some degree proton pump inhibitors, that have as large an effect on the microbiome as inflammation.
Some of the other more common IBD meds, like steroids or biologics, anti-TNF, don't seem to have a large effect on the microbiome specifically. They're targeting more host-side processes like the inflammatory response.
Additional questions for Doctor Huttenhower? Doctor Jackson.
Thank you for adding clarity to an incredibly complex field. You know, as surgeons, we know that the GI tract is filled with nooks and crannies, and further, there are diseases which dilate small bowel, etc. So I'm wondering, within an individual, is it possible that there are in fact micro-microbiomes, where there is dysbiosis in a specific portion of the GI tract that can cause disease and yet may be missed by a global scan of just the stool?
I definitely think that's often the case. From other data, we know that we get a good readout of the overall mucosa from a stool sample. There tends to be a pretty close correspondence between stool and sort of the overall average of the colon mucosa. There's not very much data out there on microbiogeography in the colon, because it's hard to get those samples. But from more accessible environments like the skin or the oral microbiome, we can see these amazingly regional changes sometimes. And from the data that I've seen generally in the gut, I would not be surprised if many of these, especially host-microbiome interactions, are highly local, and part of why we don't see more global interactions between, say, a particular host immune pathway and microbial changes is that they're occurring in one spot. That matters if you're interested in how, say, microbes initiate an inflammatory response in IBD, and colorectal cancer is an even better example. There, when we go looking in early-stage CRC, there are minimal, if any, microbiome perturbations.
Later on, you go in and you can see very clear changes in the local microbiome around a later-stage tumor. I would bet that they are still there early on, but we can't see them unless you look in exactly the right place, because a microbial change like Fusobacterium that is initiating an inflammatory response that's going to turn into CRC is just happening in one specific location. So Wendy Garrett, who is on there and studies this in mice, is able to get this exquisite, beautiful local immune response information from a mouse resection that we are rarely able to get in humans.
Any additional quick questions? If not, thank you so much for your presentation. Thank you.
Oh, sorry, so yeah, I saw you had a question right at the end. Sorry. There's a whole literature now on microbiome and central nervous system disease: Parkinson's, autism, yeah.