Escherichia coli

Escherichia coli,[1][2] also known as E. coli,[2] is a Gram-negative, facultative anaerobic, rod-shaped, coliform bacterium of the genus Escherichia that is commonly found in the lower intestine of warm-blooded organisms (endotherms).[3][4] Most E. coli strains are harmless, but some serotypes (EPEC, ETEC etc.) can cause serious food poisoning in their hosts, and are occasionally responsible for food contamination incidents that prompt product recalls.[5][6] The harmless strains are part of the normal microbiota of the gut and have a mutualistic relationship with their hosts, benefiting them by producing vitamin K2[7] (which helps blood to clot) and by preventing colonisation of the intestine by pathogenic bacteria.[8][9] E. coli is expelled into the environment within fecal matter. The bacterium grows massively in fresh fecal matter under aerobic conditions for 3 days, but its numbers decline slowly afterwards.[10]

E. coli and other facultative anaerobes constitute about 0.1% of gut microbiota,[11] and fecal–oral transmission is the major route through which pathogenic strains of the bacterium cause disease. Cells are able to survive outside the body for a limited amount of time, which makes them potential indicator organisms to test environmental samples for fecal contamination.[12][13] A growing body of research, though, has examined environmentally persistent E. coli which can survive for many days and grow outside a host.[14]

The bacterium can be grown and cultured easily and inexpensively in a laboratory setting, and has been intensively investigated for over 60 years. E. coli is a chemoheterotroph whose chemically defined medium must include a source of carbon and energy.[15] E. coli is the most widely studied prokaryotic model organism, and an important species in the fields of biotechnology and microbiology, where it has served as the host organism for the majority of work with recombinant DNA. Under favorable conditions, it takes as little as 20 minutes to reproduce.[16]
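
The 20-minute figure can be turned into a rough back-of-the-envelope estimate. The sketch below assumes idealized, unrestricted exponential growth (no lag phase, no nutrient limitation), which real cultures only approximate for a limited time; the function name and default values are illustrative and not taken from the cited sources.

```python
def cells_after(minutes: float, doubling_time_min: float = 20.0,
                start_cells: float = 1.0) -> float:
    """Idealized cell count after `minutes` of unrestricted exponential growth."""
    if doubling_time_min <= 0:
        raise ValueError("doubling time must be positive")
    # Each doubling time multiplies the population by two.
    return start_cells * 2 ** (minutes / doubling_time_min)


if __name__ == "__main__":
    for hours in (1, 2, 4, 8):
        print(f"{hours} h: {cells_after(hours * 60):.3g} cells from one cell")
```

Under these assumptions a single cell gives rise to eight cells after one hour; the estimate breaks down once nutrients become limiting.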

E. coli is a Gram-negative, nonsporulating, facultative anaerobic bacterium that makes ATP by aerobic respiration if oxygen is present but can switch to fermentation or anaerobic respiration if oxygen is absent.[17] Cells are typically rod-shaped, and are about 2.0 μm long and 0.25–1.0 μm in diameter, with a cell volume of 0.6–0.7 μm³.[18][19][20]
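
As a plausibility check on these dimensions, the cell can be approximated as a cylinder capped by two hemispheres. This geometric model is an assumption made here for illustration, not something stated in the cited sources, and the 0.7 μm diameter is simply one value picked from the quoted range.

```python
import math


def spherocylinder_volume(length_um: float, diameter_um: float) -> float:
    """Volume (in cubic micrometres) of a rod modelled as a cylinder with hemispherical caps."""
    if length_um < diameter_um:
        raise ValueError("total length must be at least the diameter")
    radius = diameter_um / 2
    cylinder = math.pi * radius ** 2 * (length_um - diameter_um)  # straight section
    caps = (4 / 3) * math.pi * radius ** 3                        # two hemispheres = one sphere
    return cylinder + caps


if __name__ == "__main__":
    print(f"{spherocylinder_volume(2.0, 0.7):.2f} cubic micrometres")
```

With a length of 2.0 μm and a diameter of 0.7 μm, this simple model lands inside the 0.6–0.7 μm³ range quoted above.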

E. coli is Gram-negative because its cell wall is composed of a thin peptidoglycan layer and an outer membrane. During the staining process, E. coli picks up the color of the counterstain safranin and stains pink. The outer membrane surrounding the cell wall provides a barrier to certain antibiotics such that E. coli is not damaged by penicillin.[15]

Strains that possess flagella are motile. The flagella have a peritrichous arrangement.[21] E. coli also attaches to and effaces the microvilli of the intestines via an adhesion molecule known as intimin.[22]

E. coli can live on a wide variety of substrates and uses mixed acid fermentation in anaerobic conditions, producing lactate, succinate, ethanol, acetate, and carbon dioxide. Since many pathways in mixed-acid fermentation produce hydrogen gas, these pathways require the levels of hydrogen to be low, as is the case when E. coli lives together with hydrogen-consuming organisms, such as methanogens or sulphate-reducing bacteria.[23]

In addition, E. coli's metabolism can be rewired to solely use CO2 as the source of carbon for biomass production. In other words, this obligate heterotroph's metabolism can be altered to display autotrophic capabilities by heterologously expressing carbon fixation genes as well as formate dehydrogenase and conducting laboratory evolution experiments. This may be done by using formate to reduce electron carriers and supply the ATP required in anabolic pathways inside of these synthetic autotrophs.[24]

E. coli have three native glycolytic pathways: EMPP, EDP, and OPPP. The EMPP employs ten enzymatic steps to yield two pyruvates, two ATP, and two NADH per glucose molecule while OPPP serves as an oxidation route for NADPH synthesis. Although the EDP is the more thermodynamically favorable of the three pathways, E. coli do not use the EDP for glucose metabolism, relying mainly on the EMPP and the OPPP. The EDP mainly remains inactive except for during growth with gluconate.[25]

When growing in the presence of a mixture of sugars, bacteria will often consume the sugars sequentially through a process known as catabolite repression. By repressing the expression of the genes involved in metabolizing the less preferred sugars, cells will usually first consume the sugar yielding the highest growth rate, followed by the sugar yielding the next highest growth rate, and so on. In doing so, the cells ensure that their limited metabolic resources are used to maximize the rate of growth. The classic example of this in E. coli is growth on glucose and lactose, where E. coli will consume glucose before lactose. Catabolite repression has also been observed in E. coli in the presence of other non-glucose sugars, such as arabinose, xylose, sorbitol, rhamnose, and ribose. In E. coli, glucose catabolite repression is regulated by the phosphotransferase system, a multi-protein phosphorylation cascade that couples glucose uptake and metabolism.[26]
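
The sequential-use rule described above can be illustrated with a toy simulation. This is a deliberately simplified sketch: it assumes that only the single preferred sugar is consumed at any time, that growth is exponential at a fixed rate per sugar, and that biomass yield is constant. The growth rates, amounts, and yield coefficient are hypothetical values chosen for illustration, and the model does not represent the phosphotransferase system itself.

```python
def simulate_sequential_use(amounts, growth_rates, biomass=0.01,
                            yield_coeff=0.5, dt=0.01, max_steps=1_000_000):
    """Return (order in which sugars are depleted, final biomass).

    amounts      -- dict: sugar name -> starting amount (arbitrary units)
    growth_rates -- dict: sugar name -> growth rate supported by that sugar (per hour)
    """
    remaining = dict(amounts)
    depletion_order = []
    for _ in range(max_steps):
        available = [s for s, amount in remaining.items() if amount > 0]
        if not available:
            break
        # Catabolite repression (simplified): use only the sugar that
        # supports the highest growth rate among those still available.
        preferred = max(available, key=lambda s: growth_rates[s])
        wanted = growth_rates[preferred] * biomass * dt / yield_coeff
        consumed = min(remaining[preferred], wanted)
        remaining[preferred] -= consumed
        biomass += consumed * yield_coeff
        if remaining[preferred] <= 0:
            depletion_order.append(preferred)
    return depletion_order, biomass


if __name__ == "__main__":
    order, final_biomass = simulate_sequential_use(
        amounts={"lactose": 1.0, "glucose": 1.0},          # hypothetical amounts
        growth_rates={"glucose": 1.0, "lactose": 0.7},     # hypothetical rates
    )
    print(order, round(final_biomass, 3))
```

In this toy model glucose is exhausted before any lactose is touched, mirroring the glucose-before-lactose behaviour; real cultures also show a lag between the two growth phases, which the sketch omits.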

Optimum growth of E. coli occurs at 37 °C (98.6 °F), but some laboratory strains can multiply at temperatures up to 49 °C (120 °F).[27] E. coli grows in a variety of defined laboratory media, such as lysogeny broth, or any medium that contains glucose, ammonium phosphate monobasic, sodium chloride, magnesium sulfate, potassium phosphate dibasic, and water. Growth can be driven by aerobic or anaerobic respiration, using a large variety of redox pairs, including the oxidation of pyruvic acid, formic acid, hydrogen, and amino acids, and the reduction of substrates such as oxygen, nitrate, fumarate, dimethyl sulfoxide, and trimethylamine N-oxide.[28] E. coli is classified as a facultative anaerobe. It uses oxygen when it is present and available. It can, however, continue to grow in the absence of oxygen using fermentation or anaerobic respiration. The ability to continue growing in the absence of oxygen is an advantage to bacteria because their survival is increased in environments where water predominates.[15]

Redistribution of fluxes between the three primary glucose catabolic pathways: EMPP (red), EDP (blue), and OPPP (orange) via the knockout of pfkA and overexpression of EDP genes (edd and eda).

The bacterial cell cycle is divided into three stages. The B period occurs between the completion of cell division and the beginning of DNA replication. The C period encompasses the time it takes to replicate the chromosomal DNA. The D period refers to the stage between the conclusion of DNA replication and the end of cell division.[29] The doubling rate of E. coli is higher when more nutrients are available. However, the lengths of the C and D periods do not change, even when the doubling time becomes less than the sum of the C and D periods. At the fastest growth rates, replication begins before the previous round of replication has completed, resulting in multiple replication forks along the DNA and overlapping cell cycles.[30]
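
The relationship between doubling time and the fixed C and D periods can be sketched numerically. The default values of 40 minutes for C and 20 minutes for D are illustrative assumptions, not figures given in this article, and the function is a simplified reading of the description above rather than a validated cell-cycle model.

```python
import math


def replication_overlap(doubling_time_min: float,
                        c_period_min: float = 40.0,   # assumed, illustrative
                        d_period_min: float = 20.0):  # assumed, illustrative
    """Summarize how one cell cycle relates to the C and D periods."""
    if doubling_time_min <= 0:
        raise ValueError("doubling time must be positive")
    c_plus_d = c_period_min + d_period_min
    return {
        # Cell cycles overlap when cells divide faster than C + D allows.
        "cycles_overlap": doubling_time_min < c_plus_d,
        # A new round starts once per doubling time and lasts C minutes,
        # so at most ceil(C / doubling time) rounds are in progress at once.
        "max_concurrent_rounds": math.ceil(c_period_min / doubling_time_min),
        # The B period is whatever time is left over, if any.
        "b_period_min": max(0.0, doubling_time_min - c_plus_d),
    }


if __name__ == "__main__":
    for tau in (20, 30, 60, 90):
        print(tau, replication_overlap(tau))
```

With these assumed values, a 20-minute doubling time implies two rounds of replication in progress at once, whereas a 90-minute doubling time leaves a 30-minute B period and no overlap.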

The number of replication forks in fast-growing E. coli typically follows 2^n (n = 1, 2 or 3). This only happens if replication is initiated simultaneously from all origins of replication, and is referred to as synchronous replication. However, not all cells in a culture replicate synchronously. In that case, cells do not have a power-of-two number of replication forks, and replication initiation is referred to as asynchronous.[31] Asynchrony can be caused by mutations in, for instance, DnaA[31] or the DnaA initiator-associating protein DiaA.[32]
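
The power-of-two rule for fork counts lends itself to a one-line check. The helper below is a hypothetical illustration of that rule only; it is not a description of how synchrony is measured experimentally.

```python
# Fork counts expected under synchronous initiation: 2**n for n = 1, 2, 3.
SYNCHRONOUS_FORK_COUNTS = {2 ** n for n in (1, 2, 3)}  # {2, 4, 8}


def is_consistent_with_synchrony(fork_count: int) -> bool:
    """True if the observed fork count matches the 2**n pattern (n = 1, 2, 3)."""
    if fork_count < 0:
        raise ValueError("fork count cannot be negative")
    return fork_count in SYNCHRONOUS_FORK_COUNTS


if __name__ == "__main__":
    for forks in (2, 4, 6, 8):
        print(forks, is_consistent_with_synchrony(forks))
```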

E. coli and related bacteria possess the ability to transfer DNA via bacterial conjugation or transduction, which allows genetic material to spread horizontally through an existing population. Transduction, which is mediated by bacterial viruses called bacteriophages,[33] is the process by which the gene encoding Shiga toxin spread from Shigella bacteria to E. coli, helping to produce E. coli O157:H7, the Shiga toxin-producing strain of E. coli.

E. coli encompasses an enormous population of bacteria that exhibit a very high degree of both genetic and phenotypic diversity. Genome sequencing of many isolates of E. coli and related bacteria shows that a taxonomic reclassification would be desirable. However, this has not been done, largely due to its medical importance,[34] and E. coli remains one of the most diverse bacterial species: only 20% of the genes in a typical E. coli genome are shared among all strains.[35]

In fact, from the more constructive point of view, the members of genus Shigella (S. dysenteriae, S. flexneri, S. boydii, and S. sonnei) should be classified as E. coli strains, a phenomenon termed taxa in disguise.[36] Similarly, other strains of E. coli (e.g. the K-12 strain commonly used in recombinant DNA work) are sufficiently different that they would merit reclassification.

A strain is a subgroup within the species that has unique characteristics that distinguish it from other strains. These differences are often detectable only at the molecular level; however, they may result in changes to the physiology or lifecycle of the bacterium. For example, a strain may gain pathogenic capacity, the ability to use a unique carbon source, the ability to take upon a particular ecological niche, or the ability to resist antimicrobial agents. Different strains of E. coli are often host-specific, making it possible to determine the source of fecal contamination in environmental samples.[12][13] For example, knowing which E. coli strains are present in a water sample allows researchers to make assumptions about whether the contamination originated from a human, another mammal, or a bird.

A common subdivision system of E. coli, though not one based on evolutionary relatedness, is by serotype, which is based on major surface antigens (O antigen: part of the lipopolysaccharide layer; H antigen: flagellin; K antigen: capsule), e.g. O157:H7.[37] It is, however, common to cite only the serogroup, i.e. the O-antigen. At present, about 190 serogroups are known.[38] The common laboratory strain has a mutation that prevents the formation of an O-antigen and is thus not typeable.
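
Serotype labels of this form are easy to split into their antigen components. The parser below is a minimal sketch that assumes labels consist only of colon-separated O, K, and H antigens followed by a number, as in the examples in this article; other notations that occur in practice are not handled, and the function and class names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Serotype(NamedTuple):
    o: Optional[str]  # O antigen (lipopolysaccharide)
    k: Optional[str]  # K antigen (capsule)
    h: Optional[str]  # H antigen (flagellin)


def parse_serotype(label: str) -> Serotype:
    """Parse labels such as 'O157:H7' or 'O1:K1:H7' into antigen numbers."""
    antigens = {"O": None, "K": None, "H": None}
    for part in label.strip().split(":"):
        match = re.fullmatch(r"([OKH])(\d+)", part.strip())
        if match is None:
            raise ValueError(f"unrecognized antigen designation: {part!r}")
        antigens[match.group(1)] = match.group(2)
    return Serotype(o=antigens["O"], k=antigens["K"], h=antigens["H"])


def serogroup(label: str) -> Optional[str]:
    """Return only the O-antigen part, e.g. 'O157' for 'O157:H7'."""
    o_antigen = parse_serotype(label).o
    return f"O{o_antigen}" if o_antigen is not None else None


if __name__ == "__main__":
    print(parse_serotype("O157:H7"))
    print(serogroup("O1:K1:H7"))
```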

Like all lifeforms, new strains of E. coli evolve through the natural biological processes of mutation, gene duplication, and horizontal gene transfer; in particular, 18% of the genome of the laboratory strain MG1655 was horizontally acquired since the divergence from Salmonella.[39] E. coli K-12 and E. coli B strains are the most frequently used varieties for laboratory purposes. Some strains develop traits that can be harmful to a host animal. These virulent strains typically cause a bout of diarrhea that is often self-limiting in healthy adults but is frequently lethal to children in the developing world.[40] More virulent strains, such as O157:H7, cause serious illness or death in the elderly, the very young, or the immunocompromised.[40][41]

The genera Escherichia and Salmonella diverged around 102 million years ago (credibility interval: 57–176 mya), which coincides with the divergence of their hosts: the former being found in mammals and the latter in birds and reptiles.[42] This was followed by a split of an Escherichia ancestor into five species (E. albertii, E. coli, E. fergusonii, E. hermannii, and E. vulneris). The last E. coli ancestor split between 20 and 30 million years ago.[43]

The long-term evolution experiments using E. coli, begun by Richard Lenski in 1988, have allowed direct observation of genome evolution over more than 65,000 generations in the laboratory.[44] For instance, E. coli typically do not have the ability to grow aerobically with citrate as a carbon source, which is used as a diagnostic criterion to differentiate E. coli from other closely related bacteria such as Salmonella. In this experiment, one population of E. coli unexpectedly evolved the ability to aerobically metabolize citrate, a major evolutionary shift with some hallmarks of microbial speciation.

In the microbial world, predator-prey relationships can be established similar to those observed in the animal world. E. coli has been shown to be the prey of multiple generalist predators, such as Myxococcus xanthus. In this predator-prey relationship, a parallel evolution of both species is observed through genomic and phenotypic modifications. In the case of E. coli, the modifications affect two traits involved in its virulence: mucoid production (excessive production of the exoplasmic acid alginate) and suppression of the OmpT gene. Each adaptation that benefits one of the species in later generations is counteracted by the evolution of the other, following a co-evolutionary model described by the Red Queen hypothesis.[45]

E. coli is the type species of the genus (Escherichia) and in turn Escherichia is the type genus of the family Enterobacteriaceae, where the family name does not stem from the genus Enterobacter + "i" (sic.) + "aceae", but from "enterobacterium" + "aceae" (enterobacterium being not a genus, but an alternative trivial name to enteric bacterium).[46][47]

The original strain described by Escherich is believed to be lost; consequently, a new type strain (neotype) was chosen as a representative: the neotype strain is U5/41T,[48] also known under the deposit names DSM 30083,[49] ATCC 11775,[50] and NCTC 9001,[51] which is pathogenic to chickens and has an O1:K1:H7 serotype.[52] However, in most studies, either O157:H7, K-12 MG1655, or K-12 W3110 was used as a representative E. coli. The genome of the type strain has only lately been sequenced.[48]

Many strains belonging to this species have been isolated and characterised. In addition to serotype (vide supra), they can be classified according to their phylogeny, i.e. the inferred evolutionary history, as shown below where the species is divided into six groups.[53][54] Particularly the use of whole genome sequences yields highly supported phylogenies. Based on such data, five subspecies of E. coli were distinguished.[48]

The link between phylogenetic distance ("relatedness") and pathology is weak,[48] e.g. the O157:H7 serotype strains, which form a clade ("an exclusive group"), group E below, are all enterohaemorrhagic strains (EHEC), but not all EHEC strains are closely related. In fact, four different species of Shigella are nested among E. coli strains (vide supra), while E. albertii and E. fergusonii are outside this group. Indeed, all Shigella species were placed within a single subspecies of E. coli in a phylogenomic study that included the type strain,[48] and for this reason a corresponding reclassification is difficult. All commonly used research strains of E. coli belong to group A and are derived mainly from Clifton's K-12 strain (λ+ F+; O16) and to a lesser degree from d'Herelle's Bacillus coli strain (B strain; O7).

The first complete DNA sequence of an E. coli genome (laboratory strain K-12 derivative MG1655) was published in 1997. It is a circular DNA molecule 4.6 million base pairs in length, containing 4288 annotated protein-coding genes (organized into 2584 operons), seven ribosomal RNA (rRNA) operons, and 86 transfer RNA (tRNA) genes. Despite having been the subject of intensive genetic analysis for about 40 years, many of these genes were previously unknown. The coding density was found to be very high, with a mean distance between genes of only 118 base pairs. The genome was observed to contain a significant number of transposable genetic elements, repeat elements, cryptic prophages, and bacteriophage remnants.[55]
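
The "very high" coding density can be sanity-checked with rough arithmetic from the figures in this paragraph. The estimate below is a back-of-the-envelope sketch: it assumes genes are evenly spaced, ignores RNA genes and overlapping genes, and uses the rounded 4.6 million base pair genome size, so the result is only indicative.

```python
def mean_gene_length_bp(genome_bp: float, n_genes: int,
                        mean_intergenic_bp: float) -> float:
    """Approximate mean gene length if each gene is followed by one mean-sized gap."""
    return genome_bp / n_genes - mean_intergenic_bp


def coding_fraction(genome_bp: float, n_genes: int,
                    mean_intergenic_bp: float) -> float:
    """Approximate fraction of the genome occupied by protein-coding genes."""
    gene_len = mean_gene_length_bp(genome_bp, n_genes, mean_intergenic_bp)
    return gene_len * n_genes / genome_bp


if __name__ == "__main__":
    genome_bp, n_genes, gap_bp = 4_600_000, 4288, 118  # figures quoted above
    print(round(mean_gene_length_bp(genome_bp, n_genes, gap_bp)), "bp per gene")
    print(f"{coding_fraction(genome_bp, n_genes, gap_bp):.0%} coding")
```

Under these simplifications the average gene is a little under 1,000 base pairs long and roughly nine-tenths of the genome is protein-coding.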

More than three hundred complete genomic sequences of Escherichia and Shigella species are known. The genome sequence of the type strain of E. coli was added to this collection before 2014.[48] Comparison of these sequences shows a remarkable amount of diversity; only about 20% of each genome represents sequences present in every one of the isolates, while around 80% of each genome can vary among isolates.[35] Each individual genome contains between 4,000 and 5,500 genes, but the total number of different genes among all of the sequenced E. coli strains (the pangenome) exceeds 16,000. This very large variety of component genes has been interpreted to mean that two-thirds of the E. coli pangenome originated in other species and arrived through the process of horizontal gene transfer.[56]
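
The distinction between genes shared by every isolate and the pangenome maps directly onto set intersection and union. The example below uses three tiny, made-up "genomes" with hypothetical gene labels purely to show the computation; it does not reflect real strain contents.

```python
def core_and_pan_genome(genomes):
    """Return (core genes, pangenome) for a dict of strain name -> set of genes."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*gene_sets)  # present in every isolate
    pan = set.union(*gene_sets)          # present in at least one isolate
    return core, pan


if __name__ == "__main__":
    # Hypothetical toy data: gene labels are placeholders, not real annotations.
    toy_genomes = {
        "strain_1": {"geneA", "geneB", "geneC", "geneD"},
        "strain_2": {"geneA", "geneB", "geneE"},
        "strain_3": {"geneA", "geneB", "geneC", "geneF"},
    }
    core, pan = core_and_pan_genome(toy_genomes)
    print(sorted(core), len(pan))
    for name, genes in toy_genomes.items():
        print(name, f"{len(core) / len(genes):.0%} of genes are core")
```

As more genomes are added, the core can only shrink or stay the same while the pangenome can only grow or stay the same, which is why a few thousand genes per genome can add up to a pangenome of more than 16,000.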

Genes in E. coli are usually named in accordance with the uniform nomenclature proposed by Demerec et al.[57] Gene names are 3-letter acronyms that derive from their function (when known) or mutant phenotype and are italicized. When multiple genes have the same acronym, the different genes are designated by a capital letter that follows the acronym and is also italicized. For instance, recA is named after its role in homologous recombination plus the letter A. Functionally related genes are named recB, recC, recD etc. The proteins are named by uppercase acronyms, e.g. RecA, RecB, etc. When the genome of E. coli strain K-12 substr. MG1655 was sequenced, all known or predicted protein-coding genes were numbered (more or less) in their order on the genome and abbreviated by b numbers, such as b2819 (= recD). The "b" names were named after Fred Blattner, who led the genome sequence effort.[55] Another numbering system was introduced with the sequence of another E. coli K-12 substrain, W3110, which was sequenced in Japan and hence uses numbers starting with JW... (Japanese W3110), e.g. JW2787 (= recD).[58] Hence, recD = b2819 = JW2787. Note, however, that most databases have their own numbering system, e.g. the EcoGene database[59] uses EG10826 for recD. Finally, ECK numbers are specifically used for alleles in the MG1655 strain of E. coli K-12.[59] Complete lists of genes and their synonyms can be obtained from databases such as EcoGene or Uniprot.
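
The naming conventions above can be summarized in a small lookup sketch. The recD identifiers are the ones given in this paragraph; the regular expressions are simplifying assumptions about each identifier's shape (for example, any number of digits is accepted), and the function names are hypothetical.

```python
import re

# Synonyms for recD as listed in the text above.
RECD_SYNONYMS = {
    "gene": "recD",
    "b_number": "b2819",    # MG1655 numbering
    "jw_number": "JW2787",  # W3110 numbering
    "ecogene": "EG10826",   # EcoGene accession
}

# Assumed identifier shapes, checked in order.
_PATTERNS = [
    ("b number", re.compile(r"b\d+")),
    ("JW number", re.compile(r"JW\d+")),
    ("EcoGene accession", re.compile(r"EG\d+")),
    ("ECK number", re.compile(r"ECK\d+")),
    ("gene name", re.compile(r"[a-z]{3}[A-Z]?")),
    ("protein name", re.compile(r"[A-Z][a-z]{2}[A-Z]?")),
]


def classify_identifier(identifier: str) -> str:
    """Guess which naming scheme an identifier belongs to."""
    for label, pattern in _PATTERNS:
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


def protein_name(gene_name: str) -> str:
    """Derive the protein name from a gene name, e.g. recA -> RecA."""
    if classify_identifier(gene_name) != "gene name":
        raise ValueError(f"not a gene name: {gene_name!r}")
    return gene_name[0].upper() + gene_name[1:]


if __name__ == "__main__":
    for value in RECD_SYNONYMS.values():
        print(value, "->", classify_identifier(value))
    print(protein_name("recA"))
```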

Several studies have investigated the proteome of E. coli. By 2006, 1,627 (38%) of the 4,237 open reading frames (ORFs) had been identified experimentally.[60] The 4,639,221–base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer.[55]

The interactome of E. coli has been studied by affinity purification and mass spectrometry (AP/MS) and by analyzing the binary interactions among its proteins.

Protein complexes. A 2006 study purified 4,339 proteins from cultures of strain K-12 and found interacting partners for 2,667 proteins, many of which had unknown functions at the time.[61] A 2009 study found 5,993 interactions between proteins of the same E. coli strain, though these data showed little overlap with those of the 2006 publication.[62]

Binary interactions. Rajagopala et al. (2014) have carried out systematic yeast two-hybrid screens with most E. coli proteins, and found a total of 2,234 protein-protein interactions.[63] This study also integrated genetic interactions and protein structures and mapped 458 interactions within 227 protein complexes.

E. coli belongs to a group of bacteria informally known as coliforms that are found in the gastrointestinal tract of warm-blooded animals.[64] E. coli normally colonizes an infant's gastrointestinal tract within 40 hours of birth, arriving with food or water or from the individuals handling the child. In the bowel, E. coli adheres to the mucus of the large intestine. It is the primary facultative anaerobe of the human gastrointestinal tract.[65] (Facultative anaerobes are organisms that can grow in either the presence or absence of oxygen.) As long as these bacteria do not acquire genetic elements encoding for virulence factors, they remain benign commensals.[66]

Due to the low cost and speed with which it can be grown and modified in laboratory settings, E. coli is a popular expression platform for the production of recombinant proteins used in therapeutics. One advantage to using E. coli over another expression platform is that E. coli naturally does not export many proteins into the periplasm, making it easier to recover a protein of interest without cross-contamination.[67] The E. coli K-12 strains and their derivatives (DH1, DH5α, MG1655, RV308 and W3110) are the strains most widely used by the biotechnology industry.[68] The nonpathogenic E. coli strains Nissle 1917 (EcN; Mutaflor) and O83:K24:H31 (Colinfant)[69][70] are used as probiotic agents in medicine, mainly for the treatment of various gastrointestinal diseases,[71] including inflammatory bowel disease.[72] It is thought that the EcN strain might impede the growth of opportunistic pathogens, including Salmonella and other coliform enteropathogens, through the production of microcin proteins and the production of siderophores.[73]

Most E. coli strains do not cause disease, naturally living in the gut,[74] but virulent strains can cause gastroenteritis, urinary tract infections, neonatal meningitis, hemorrhagic colitis, and Crohn's disease. Common signs and symptoms include severe abdominal cramps, diarrhea, hemorrhagic colitis, vomiting, and sometimes fever. In rarer cases, virulent strains are also responsible for bowel necrosis (tissue death) and perforation without progressing to hemolytic-uremic syndrome, peritonitis, mastitis, sepsis, and Gram-negative pneumonia. Very young children are more susceptible to develop severe illness, such as hemolytic uremic syndrome; however, healthy individuals of all ages are at risk to the severe consequences that may arise as a result of being infected with E. coli.[65][75][76][77]

Some strains of E. coli, for example O157:H7, can produce Shiga toxin (classified as a bioterrorism agent). The Shiga toxin causes inflammatory responses in target cells of the gut, leaving behind lesions which result in the bloody diarrhea that is a symptom of a Shiga toxin-producing E. coli (STEC) infection. This toxin further causes premature destruction of the red blood cells, which then clog the body's filtering system, the kidneys, in some rare cases (usually in children and the elderly) causing hemolytic-uremic syndrome (HUS), which may lead to kidney failure and even death. Signs of hemolytic-uremic syndrome include decreased frequency of urination, lethargy, and paleness of the cheeks and the inside of the lower eyelids. In 25% of HUS patients, complications of the nervous system occur, which in turn cause strokes. In addition, this strain causes the buildup of fluid (since the kidneys do not work), leading to edema around the lungs, legs, and arms. This fluid buildup, especially around the lungs, impedes the functioning of the heart, causing an increase in blood pressure.[78][76][77]

Uropathogenic E. coli (UPEC) is one of the main causes of urinary tract infections.[79] It is part of the normal microbiota in the gut and can be introduced in many ways. In particular for females, the direction of wiping after defecation (wiping back to front) can lead to fecal contamination of the urogenital orifices. Anal intercourse can also introduce this bacterium into the male urethra, and in switching from anal to vaginal intercourse, the male can also introduce UPEC to the female urogenital system.

Enterotoxigenic E. coli (ETEC) is the most common cause of traveler's diarrhea, with as many as 840 million cases worldwide in developing countries each year. The bacteria, typically transmitted through contaminated food or drinking water, adhere to the intestinal lining, where they secrete either of two types of enterotoxins, leading to watery diarrhea. The rate and severity of infections are higher among children under the age of five, including as many as 380,000 deaths annually.[80]

Certain strains of E. coli are a major cause of foodborne illness. In May 2011, one E. coli strain, O104:H4, was the subject of a bacterial outbreak that began in Germany. The outbreak started when several people in Germany were infected with enterohemorrhagic E. coli (EHEC) bacteria, leading to hemolytic-uremic syndrome (HUS), a medical emergency that requires urgent treatment. The outbreak affected not only Germany but also 15 other countries, including regions in North America.[81] On 30 June 2011, the German Bundesinstitut für Risikobewertung (BfR; Federal Institute for Risk Assessment) announced that seeds of fenugreek from Egypt were likely the cause of the EHEC outbreak.[82]

Some studies have demonstrated an absence of E. coli in the gut flora of subjects with the metabolic disorder phenylketonuria. It is hypothesized that the absence of these normal bacteria impairs the production of the key vitamins B2 (riboflavin) and K2 (menaquinone), which are implicated in many physiological roles in humans such as cellular and bone metabolism, and so contributes to the disorder.[83]

Carbapenem-resistant E. coli (carbapenemase-producing E. coli) are resistant to the carbapenem class of antibiotics, considered the drugs of last resort for such infections. They are resistant because they produce an enzyme called a carbapenemase that disables the drug molecule.[84]

The time between ingesting the STEC bacteria and feeling sick is called the "incubation period". The incubation period is usually 3–4 days after the exposure, but may be as short as 1 day or as long as 10 days. The symptoms often begin slowly with mild belly pain or non-bloody diarrhea that worsens over several days. HUS, if it occurs, develops an average of 7 days after the first symptoms, when the diarrhea is improving.[85]

Diagnosis of infectious diarrhea and identification of antimicrobial resistance are performed using a stool culture with subsequent antibiotic sensitivity testing. Culturing gastrointestinal pathogens requires a minimum of 2 days and a maximum of several weeks. The sensitivity (true positive) and specificity (true negative) rates for stool culture vary by pathogen, and a number of human pathogens cannot be cultured. For culture-positive samples, antimicrobial resistance testing takes an additional 12–24 hours to perform.
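
For readers unfamiliar with the terms, sensitivity and specificity are simple ratios computed from a test's results against a reference standard. The sketch below uses the standard definitions; the counts in the usage example are hypothetical and do not describe any real stool-culture study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly infected samples that the test flags as positive."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no infected samples to evaluate")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of truly uninfected samples that the test reports as negative."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no uninfected samples to evaluate")
    return true_negatives / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(f"sensitivity: {sensitivity(90, 10):.0%}")
    print(f"specificity: {specificity(95, 5):.0%}")
```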

Current point of care molecular diagnostic tests can identify E. coli and antimicrobial resistance in the identified strains much faster than culture and sensitivity testing. Microarray-based platforms can identify specific pathogenic strains of E. coli and E. coli-specific AMR genes in two hours or less with high sensitivity and specificity, but the size of the test panel (i.e., total pathogens and antimicrobial resistance genes) is limited. Newer metagenomics-based infectious disease diagnostic platforms are currently being developed to overcome the various limitations of culture and all currently available molecular diagnostic technologies.

The mainstay of treatment is the assessment of dehydration and replacement of fluid and electrolytes. Administration of antibiotics has been shown to shorten the course of illness and duration of excretion of enterotoxigenic E. coli (ETEC) in adults in endemic areas and in traveller's diarrhea, though the rate of resistance to commonly used antibiotics is increasing and they are generally not recommended.[86] The antibiotic used depends upon susceptibility patterns in the particular geographical region. Currently, the antibiotics of choice are fluoroquinolones or azithromycin, with an emerging role for rifaximin. Oral rifaximin, a semisynthetic rifamycin derivative, is an effective and well-tolerated antibacterial for the management of adults with non-invasive traveller's diarrhea. Rifaximin was significantly more effective than placebo and no less effective than ciprofloxacin in reducing the duration of diarrhea. While rifaximin is effective in patients with E. coli-predominant traveller's diarrhea, it appears ineffective in patients infected with inflammatory or invasive enteropathogens.[87]

ETEC is the type of E. coli that most vaccine development efforts are focused on. Antibodies against the LT and major CFs of ETEC provide protection against LT-producing, ETEC-expressing homologous CFs. Oral inactivated vaccines consisting of toxin antigen and whole cells, i.e. the licensed recombinant cholera B subunit (rCTB)-WC cholera vaccine Dukoral, have been developed. There are currently no licensed vaccines for ETEC, though several are in various stages of development.[88] In different trials, the rCTB-WC cholera vaccine provided high (85–100%) short-term protection. An oral ETEC vaccine candidate consisting of rCTB and formalin-inactivated E. coli bacteria expressing major CFs has been shown in clinical trials to be safe, immunogenic, and effective against severe diarrhoea in American travelers but not against ETEC diarrhoea in young children in Egypt. A modified ETEC vaccine, consisting of recombinant E. coli strains over-expressing the major CFs and a more LT-like hybrid toxoid called LCTBA, is undergoing clinical testing.[89][90]

Other proven prevention methods for E. coli transmission include handwashing and improved sanitation and drinking water, as transmission occurs through fecal contamination of food and water supplies. Additionally, thoroughly cooking meat and avoiding consumption of raw, unpasteurized beverages, such as juices and milk, are other proven methods for preventing E. coli infection. Lastly, cross-contamination of utensils and work spaces should be avoided when preparing food.[91]

Escherichia coli bacterium, 2021, Illustration by David S. Goodsell, RCSB Protein Data Bank
This painting shows a cross-section through an Escherichia coli cell. The characteristic two-membrane cell wall of gram-negative bacteria is shown in green, with many lipopolysaccharide chains extending from the surface and a network of cross-linked peptidoglycan strands between the membranes. The genome of the cell forms a loosely-defined "nucleoid", shown here in yellow, and interacts with many DNA-binding proteins, shown in tan and orange. Large soluble molecules, such as ribosomes (colored in reddish purple), mostly occupy the space around the nucleoid.

Because of its long history of laboratory culture and ease of manipulation, E. coli plays an important role in modern biological engineering and industrial microbiology.[93] The work of Stanley Norman Cohen and Herbert Boyer in E. coli, using plasmids and restriction enzymes to create recombinant DNA, became a foundation of biotechnology.[94]

E. coli is a very versatile host for the production of heterologous proteins,[95] and various protein expression systems have been developed which allow the production of recombinant proteins in E. coli. Researchers can introduce genes into the microbes using plasmids which permit high level expression of protein, and such protein may be mass-produced in industrial fermentation processes. One of the first useful applications of recombinant DNA technology was the manipulation of E. coli to produce human insulin.[96]

Many proteins previously thought difficult or impossible to be expressed in E. coli in folded form have been successfully expressed in E. coli. For example, proteins with multiple disulphide bonds may be produced in the periplasmic space or in the cytoplasm of mutants rendered sufficiently oxidizing to allow disulphide-bonds to form,[97] while proteins requiring post-translational modification such as glycosylation for stability or function have been expressed using the N-linked glycosylation system of Campylobacter jejuni engineered into E. coli.[98][99][100]

Modified E. coli cells have been used in vaccine development, bioremediation, production of biofuels,[101] lighting, and production of immobilised enzymes.[95][102]

Strain K-12 is a mutant form of E. coli that over-expresses the enzyme alkaline phosphatase (ALP).[103] The mutation arises from a defect in the gene that constantly codes for the enzyme. A gene that produces its product without any inhibition is said to have constitutive activity. This mutant form is used to isolate and purify the enzyme.[103]

Strain OP50 of Escherichia coli is used for maintenance of Caenorhabditis elegans cultures.

Strain JM109 is a mutant form of E. coli that is recA and endA deficient. The strain can be used for blue/white screening when the cells carry the fertility factor episome.[104] Lack of recA decreases the possibility of unwanted recombination of the DNA of interest, and lack of endA inhibits plasmid DNA decomposition. Thus, JM109 is useful for cloning and expression systems.

E. coli is frequently used as a model organism in microbiology studies. Cultivated strains (e.g. E. coli K-12) are well adapted to the laboratory environment and, unlike wild-type strains, have lost their ability to thrive in the intestine. Many laboratory strains have also lost the ability to form biofilms.[105][106] Such features protect wild-type strains from antibodies and other chemical attacks, but require a large expenditure of energy and material resources. E. coli is often used as a representative microorganism in research on novel water treatment and sterilisation methods, including photocatalysis. With standard plate count methods, in which a sample is serially diluted and grown on agar plates, the concentration of viable organisms, expressed as colony-forming units (CFUs), in a known volume of treated water can be determined, allowing the performance of different materials to be compared.[107]
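The arithmetic behind a plate count can be illustrated with a minimal sketch. It assumes tenfold serial dilutions and a single countable plate; the function names and all numbers below are hypothetical and are not taken from the cited study.

```python
import math


def cfu_per_ml(colonies: int, tenfold_dilutions: int, plated_volume_ml: float) -> float:
    """Estimate viable cells (CFU/mL) in the undiluted sample.

    colonies          -- colonies counted on one plate (hypothetical input)
    tenfold_dilutions -- number of 1:10 dilution steps made before plating
    plated_volume_ml  -- volume of diluted sample spread on the plate, in mL
    """
    if colonies < 0 or tenfold_dilutions < 0 or plated_volume_ml <= 0:
        raise ValueError("counts and dilutions must be non-negative, volume positive")
    # Each 1:10 step divides the concentration by 10, so multiply back up.
    return colonies * (10 ** tenfold_dilutions) / plated_volume_ml


def log_reduction(cfu_before: float, cfu_after: float) -> float:
    """Log10 reduction in viable count achieved by a treatment."""
    if cfu_before <= 0 or cfu_after <= 0:
        raise ValueError("both counts must be positive")
    return math.log10(cfu_before / cfu_after)


# Hypothetical example: 150 colonies after four tenfold dilutions, 0.1 mL plated.
untreated = cfu_per_ml(150, 4, 0.1)   # about 1.5e7 CFU/mL
# Hypothetical treated sample: 150 colonies after one tenfold dilution.
treated = cfu_per_ml(150, 1, 0.1)     # about 1.5e4 CFU/mL
print(f"log reduction: {log_reduction(untreated, treated):.1f}")
```

Comparing treatments by log reduction rather than raw counts is a common convention, since viable counts can span many orders of magnitude.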

In 1946, Joshua Lederberg and Edward Tatum first described the phenomenon known as bacterial conjugation using E. coli as a model bacterium,[108] and it remains the primary model to study conjugation.[109] E. coli was an integral part of the first experiments to understand phage genetics,[110] and early researchers, such as Seymour Benzer, used E. coli and phage T4 to understand the topography of gene structure.[111] Prior to Benzer's research, it was not known whether the gene was a linear structure, or if it had a branching pattern.[112]

E. coli was one of the first organisms to have its genome sequenced; the complete genome of E. coli K-12 was published in Science in 1997.[55]

From 2002 to 2010, a team at the Hungarian Academy of Sciences created a strain of Escherichia coli called MDS42, which is now sold by Scarab Genomics of Madison, WI under the name "Clean Genome E. coli".[113] In this strain, 15% of the genome of the parental strain (E. coli K-12 MG1655) was removed to improve its efficiency as a molecular biology tool. The deletions removed IS elements, pseudogenes and phages, resulting in better maintenance of plasmid-encoded toxic genes, which are often inactivated by transposons.[114][115][116] Biochemistry and replication machinery were not altered.

By combining nanotechnology with landscape ecology, complex habitat landscapes can be generated with details at the nanoscale.[117] In such synthetic ecosystems, evolutionary experiments with E. coli have been performed to study the spatial biophysics of adaptation in an on-chip island biogeography.

Studies are also being performed attempting to program E. coli to solve complicated mathematics problems, such as the Hamiltonian path problem.[118]
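For readers unfamiliar with the problem: a Hamiltonian path is a route through a graph that visits every node exactly once. The sketch below is a conventional brute-force search, shown only to make the problem statement concrete. It does not represent how the bacterial approach works; the function name and the example graph are hypothetical, and edges are assumed to be directed.

```python
from itertools import permutations


def hamiltonian_path(nodes, edges):
    """Return one path that visits every node exactly once, or None.

    nodes -- list of node labels
    edges -- iterable of (source, target) pairs, treated as directed
    """
    edge_set = set(edges)
    # Try every ordering of the nodes; the number of orderings grows
    # factorially, which is why the problem is hard for large graphs.
    for order in permutations(nodes):
        if all((a, b) in edge_set for a, b in zip(order, order[1:])):
            return list(order)
    return None


# Hypothetical three-node graph with edges A->B, B->C and A->C.
print(hamiltonian_path(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")]))
```

The factorial growth in candidate orderings is what makes the problem an attractive test case for unconventional forms of computation.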

In other studies, non-pathogenic E. coli has been used as a model microorganism to study how simulated microgravity (on Earth) affects the organism.[119][120]

In 1885, the German-Austrian pediatrician Theodor Escherich discovered this organism in the feces of healthy individuals. He called it Bacterium coli commune because it is found in the colon. Early classifications of prokaryotes placed these in a handful of genera based on their shape and motility (at that time Ernst Haeckel's classification of bacteria in the kingdom Monera was in place).[90][121][122]

Bacterium coli became the type species of the now invalid genus Bacterium when it was revealed that the former type species ("Bacterium triloculare") was missing.[123] Following a revision of Bacterium, it was reclassified as Bacillus coli by Migula in 1895[124] and later reclassified in the newly created genus Escherichia, named after its original discoverer.[125]

In 1996, an outbreak of E. coli food poisoning in Wishaw, Scotland, killed 21 people, making it the world's worst such outbreak to that date.[126] This death toll was exceeded in 2011, when the Germany E. coli O104:H4 outbreak, linked to organic fenugreek sprouts, killed 53 people.