
This Is How We’ll Vaccinate the World Against COVID-19


Image of a maze with the Earth at the center and a hypodermic needle at the bottom.

Photo-illustration: Edmon de Haro

In a triumph of science, the first two large-scale trials to report the effectiveness of vaccines against SARS-CoV-2—the deadly, highly contagious virus that causes COVID-19—were both great successes right out of the gate. In November, the pharmaceutical giant Pfizer and the much younger biotech company Moderna both reported that their vaccines were about 95 percent effective in preventing cases of COVID-19. The news came just 10 months after the virus was first isolated and sequenced in a lab in China.

As of early December, 50 other candidate vaccines were making their way through human clinical trials, according to the World Health Organization. Thirteen of those vaccines were already in the final stage before approval, each being tested on tens of thousands of volunteers to check for side effects and measure efficacy: how well the shots protect against the disease. One of those, made by AstraZeneca and the University of Oxford, also showed promising—though less clear—efficacy results in late November.

But even before those vaccines neared the finish line, the heaviest burdens of ending the pandemic and restoring the global economy had shifted from the scientists to the engineers. Our hopes now hinge on the technologists who are challenged with manufacturing and transporting billions of doses of new, highly complex biotech products—and the public health officials figuring out how best to distribute them to a world that can hardly wait.

Throughout 2020, vaccine producers and their suppliers constructed new factories and otherwise increased their capacity while governments, international agencies, and philanthropies signed billion-dollar contracts, preordering doses by the hundreds of millions. In the United States, the federal initiative known as Operation Warp Speed deployed a budget of more than US $12 billion to develop, test, and mass-produce new vaccines along with the vials, syringes, and other materials needed to deliver them to an anxious populace.

Moncef Slaoui, the initiative’s chief scientist, told IEEE Spectrum in October that the U.S. government had already begun stockpiling two vaccines (from Pfizer and Moderna), and that commercial-scale production was beginning on two others. “So if and when they are approved” by regulators at the U.S. Food and Drug Administration (FDA), he said, “those can be used in the [U.S.] population immediately.”

Creation and deployment of a new vaccine against a novel disease normally takes at least a decade. The audacious goal of Operation Warp Speed and like-minded efforts in other nations is to complete this feat in less than two years. The pace is every bit as intense as the space race of the 1960s, but the stakes are far higher.

There are plenty of reasons for skepticism. “When was the last time anybody made a billion of anything safely and reliably?” asks Arthur Caplan, a bioethics professor at NYU Grossman School of Medicine. “Never,” he says. “Plants go offline, crap breaks, you can’t find a part.” Caplan argues that we should expect snafus: “There’s a ton of things that can go wrong just on manufacturing.”

But also consider this: In 2019, brewers in the United States used applied microbiology to ferment, filter, fill, package, and distribute nearly 50 billion bottles and cans of beer—all in copacetic single-dose units, most of it refrigerated.

Will the university, industry, and government teams grappling with the vaccine challenge be able to bring together the interrelated technical systems that must work in concert—including massive bioreactors and purification lines, acres of fast-fill vials, and thousands of planeloads of ultracold shipping containers? Can humanity really pull this off?

Somewhat surprisingly, the answer so far appears to be: Yes, we can.

Not everything will go smoothly. Paul Offit, a member of the COVID-19 vaccine working group at the U.S. National Institutes of Health, sat down in June to talk with the editor of the Journal of the American Medical Association about the steep road ahead. “The hardest thing about making a vaccine is mass-producing it,” Offit said. “You have to have the right buffering agent, the right stabilizing agent. You have to have the right vial. You have to do real-time stability studies to make sure that when the vaccine leaves the manufacturing plant, that the time it takes to get from the tarmac to the person’s arm does not cause any problems. Because, remember, when you’re shipping vaccines, they’re going to be exposed to high temperatures and low temperatures, and you have to make sure that you have a stable product.”

Take, for example, the RNA-based vaccine that Pfizer and its German partner BioNTech developed—the first to be approved by the FDA. This kind of vaccine contains slightly altered pieces of the virus’s genetic material (RNA) encased in nanometer-size fatty blobs, which fuse with human cells and cause them to produce the SARS-CoV-2 spike protein, thus triggering an immune response in the body. None of the vaccine experts interviewed for this article had dared to hope that any COVID-19 vaccine—let alone an RNA-based vaccine, a type that’s never before been commercialized—would achieve a 95 percent efficacy rate.

But that stellar effectiveness can wink out if the vaccine gets too warm for too long. As Offit emphasized, temperature affects all vaccines; most (including AstraZeneca’s) must remain between 2 °C and 8 °C to retain potency. RNA vaccines, however, are especially unstable.

At its assembly plants in Kalamazoo, Mich., and Puurs, Belgium, Pfizer has warehouses full of ultracold freezers to store its vaccine at –70 °C. Workers pack the frozen vials into custom-built containers that each hold about 1,000 of them, along with a layer of dry-ice pellets. Also in the box is a GPS-enabled thermal sensor that transmits the temperature and location of the package as it moves via trucks and planes to distribution centers throughout the world.

Distributors are rapidly scaling up too. UPS has said that it’s building two warehouses full of deep freezers—one in Louisville, Ky., and another in the Netherlands—that are capable of storing enough COVID-19 vaccines to inoculate millions of people. FedEx, which routinely delivers about 500,000 dry-ice-packed shipments a month, is doing the same in Memphis, Indianapolis, and Paris.

Rich Gottwald, president of the Compressed Gas Association, says that a nationwide shortage of carbon dioxide last spring spurred CO2 producers to work closely with vaccine makers, ensuring that dry ice will be there when and where they need it. “There may be some challenges in getting the vaccine distributed, but dry ice is not one of those challenges,” he says.

Most of these trips from factory to pharmacy or clinic should take no more than three days, and Pfizer’s vaccine stays fresh for up to 10 days in its container when unopened. Once thawed, the liquids must be kept in pharmacy-grade refrigerators and used within five days. Moderna claims its RNA vaccine can be transported and stored in deep freezers at –20 °C for up to six months and then refrigerated at distribution points for up to 30 days.

Unfortunately, only technologically advanced nations will be able to manage all these logistical complexities. In September, the shipping company DHL analyzed the transport challenges posed by a global rollout of COVID-19 vaccines. Its report concluded that mass distribution of vaccines requiring dry ice for storage will be feasible in only about two dozen countries, accounting for 2.5 billion people. All of Africa, most of South America, and much of Asia would struggle to put such a vaccine to widespread use.

In contrast, DHL estimates, around 60 countries would find it quite possible to inoculate their combined 5 billion residents with vaccines like AstraZeneca’s, which can be stored and transported at refrigerator temperatures of 2 °C to 8 °C (a typical temperature in pharmaceutical supply chains). Both ease of transport and substantially lower manufacturing costs favor more traditional vaccines, such as those that use harmless viruses to trigger an immune response. AstraZeneca’s vaccine, for example, is expected to sell for about a third the cost of the RNA vaccines.

In the hope of making coronavirus vaccines available to even the poorest nations, the World Health Organization, the Coalition for Epidemic Preparedness Innovations, and Gavi, the Vaccine Alliance have joined together to form the COVAX initiative. The coalition has been raising money to secure 2 billion vaccine doses through 2021 for the 90-plus low- and middle-income countries expected to participate, many of which can’t afford to buy or make vaccines on their own. As of mid-November, COVAX reported about US $2 billion in pledged donations, but it said at least $5 billion more is needed to achieve its goal.

These front-runners are just the opening salvo in what will be a protracted battle against SARS-CoV-2. Reinforcements, in the form of other vaccine options, should arrive in 2021 and will be crucial in bringing this pandemic to an end.

“No one manufacturer is going to be able to scale up and make enough doses for 7 billion people,” says Leonard Friedland, director of scientific affairs and public health at GSK Vaccines. “So I hope they all work.”

Pfizer said in July that it was aiming to produce 100 million doses of its product by the end of 2020, but by November it had halved that estimate. The hardest part for Pfizer has been mixing the synthetic pieces of RNA with fatty acids and cholesterol to form delivery particles of just the right size, says Slaoui of Operation Warp Speed. “These mixing operations are very complex,” he says.

And there is likely to be a shortage of cholesterol needed for the lipid nanoparticles, warns Jake Becraft, CEO of Strand Therapeutics, a biotech company in Massachusetts that is developing RNA vaccines of its own. “The simple fact is that those supply chains were nowhere near ready for the demand of billions of vaccines,” Becraft says. Some capacity can be redirected to support COVID-19 vaccine production, he says, “but it will also come at the cost of a lot of drugs in the pipeline for diseases like cystic fibrosis and cancer” that require the same ingredients.

Nevertheless, Pfizer has projected that it will produce up to 1.3 billion doses of COVID-19 vaccine by the end of 2021. Because each person’s inoculation requires two doses spaced two or three weeks apart, that should be enough to protect roughly 650 million people. The U.S. government has prepurchased 100 million of those doses, with an option to buy 500 million more.

As of press time, Moderna was hoping that its vaccine would be ready for broad release to the public in late December, assuming that all went smoothly with its licensing application to the FDA. The company signed up a manufacturing partner, Lonza Group, which is scaling up global manufacturing to be able to deliver 100 million doses a year from its site in Portsmouth, N.H., and another 300 million doses a year from a larger facility in Visp, Switzerland.

Meanwhile, in China, the companies Sinopharm and Sinovac have late-stage trials underway on three vaccines that contain intact coronavirus, which is harvested from live cell cultures and then chemically treated so that it cannot reproduce inside a person. This technology, used to make the annual flu vaccine and many others, has a long track record of success. And China has lots of manufacturing capacity for making inactivated-virus vaccines, notes John Moore, a professor of immunology at Weill Cornell Medical School in New York. Sinopharm is reportedly gearing up to produce 1 billion doses of its vaccine in 2021, if the product succeeds in trials.

But drugmakers elsewhere have largely steered clear of vaccines made from live cells infected with the SARS-CoV-2 virus, which pose obvious dangers to workers. The need for “biosafety level 3” facilities designed and certified to handle such biohazards makes such products harder to scale up, according to Kate Bingham, who chairs the U.K. government’s Vaccines Taskforce.

Of the remaining five vaccines in final-stage trials, four (including the AstraZeneca vaccine) are made by inserting a key gene from the coronavirus into a largely harmless human or chimpanzee adenovirus. After injection, these viral vector vaccines produce the important SARS-CoV-2 protein fragment inside the body, triggering an immune reaction.

The tricky part is harvesting enough of the engineered adenoviruses from the cell cultures in which they are grown. “The biggest issue as we scale up has been optimizing the infection step,” Slaoui says. Stirring 2,000 liters of living cells well enough to let the virus infect most of them—but gently enough so as not to rupture many of them—has proven difficult.

A similar scale-up challenge comes up in the production of the final kind of vaccine, one made by Novavax in Gaithersburg, Md. The company makes its protein vaccine in a factory in Morrisville, N.C., by growing huge batches of armyworm moth cells, which it has genetically engineered to churn out copies of a subunit of the coronavirus’s spike protein. After breaking up the cells and purifying the slurry, workers mix the desired protein with harmless microscopic particles that will carry the virus fragment into the body to trigger an immune response.

Here, Slaoui says, the big challenge is to bust up the cells in a way that doesn’t completely overwhelm the purification process with unwanted moth proteins. The company has a clinical trial underway in the United Kingdom, but in late November it delayed planned trials in the United States and Mexico because production was not scaling up as quickly as anticipated.

Nevertheless, Novavax has promised the U.S. government 100 million doses as they come off its production lines, and the company claims it has the capacity at a plant it bought in the Czech Republic to make a billion more doses in 2021.

If several vaccines gain approval and begin ramping up production in parallel, could there be what engineers call a “common mode” failure? The vaccines may vary, for example, but so far they’re all packaged the same way—in 5-milliliter vials made of a special kind of glass—and then injected into the arm via syringe.

“Syringes are probably less of a problem than vials and stoppers,” says Georges Benjamin, executive director of the American Public Health Association. “If I was wanting to pay attention to what can go wrong, it’d be that.”

Vaccines are so potent that each vial typically holds enough for five doses. Moderna claims its RNA vaccine is stronger still, so doctors can get 10 doses from every vial. On the one hand, that means that a 1,000-vial container of Moderna vaccine could give 10,000 people one of the two doses they will need. On the other hand, every vial that breaks wastes that many more doses.

The problem with frozen vaccines isn’t that ultracold temperatures make vials brittle, says Robert Schaut, the scientific director of pharmaceutical technologies at the glass-making company Corning. “You’re already below its glass-transition temperature, unlike a plastic or other material. So glass is exactly as strong at –70 °C as it is at room temperature,” he points out. “But when you cool vaccine down to those temperatures, the liquid expands and puts a lot of stress on the glass.”

Two years ago, Corning came out with a stronger, aluminosilicate glass that can be prestressed during vial manufacture by replacing sodium atoms in the materials with potassium atoms. That switch introduces hundreds of megapascals of compressive stress into the material—plenty enough to resist breakage during freezing or transport, Schaut says. He claims that the stronger glass vials also eliminate flaking and dramatically reduce tiny particles dislodged during the filling process, which in the past has led to recalls of conventional glass vials.

More useful still, the new vials are slippery. At the fill-finish stage of vaccine production, when big batches of vials are jostling along through the machinery, the slick coating on the vials lets them glide past each other more easily. Reducing jams on manufacturing lines adds 20 to 50 percent to the throughput, Schaut says, and once lines are moving smoothly, operators can double their speed.

Since the first quarter of 2020, Corning has been shipping millions of vials a month to its Operation Warp Speed partners from its plants outside Corning, N.Y. The company used part of its $204 million government contract to speed construction of a new factory in North Carolina, set to come online next year. Schaut says Corning should now be able to churn out 164 million vials a year—enough to ship at least 820 million doses of vaccine. 

“We set the objective to have enough vaccine to immunize the U.S. population by the first half of 2021,” said Slaoui of Operation Warp Speed, in October. “And that definitely will be the case. We will have 600 million to 700 million doses or more by May or June 2021.”

Thanks to unprecedented government investments, an impressively coordinated scramble by several industries, and some fortuitous technological advances, Slaoui’s boast seems credible. Since April, Stacy Springs and Donovan Guttieres at M.I.T.’s Center for Biomedical Innovation have been collecting data about each step of the supply, production, and distribution chains for COVID-19 vaccines. They have built models to investigate dependencies and identify critical points where shortages could interrupt production.

So far, Springs says, they have seen companies and agencies cooperating to spot problems and fix them: “A lot of the manufacturers are already moving to dual sourcing of materials and putting in other safety nets, so that they’re not going to be in a position where they don’t have what they need.” Although governments have been competing with one another to some extent to preorder vaccine for their own people, “there’s a lot of goodwill and sharing going on within the industry,” she says.

It is indeed encouraging to learn that the immense efforts being mounted now to vaccinate the world against COVID-19 are being undertaken in a cooperative spirit. Perhaps, after a year of divisiveness and social isolation, the realization is dawning that we’re all in this together. 


via IEEE Spectrum Biomedical

December 16, 2020 at 01:30AM

Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads


Cell lines

Cell lines for Puerto Rican individuals HG00731, HG00732 and HG00733 have been previously described19.

HiFi PacBio sequencing

Isolated DNA was prepared for HiFi library preparation as described3. Briefly, DNA was sheared to an average size of about 15 kbp using Covaris gTUBE, and the quantity and size were checked using Qubit (Thermo Fisher) and FEMTO Pulse (Agilent) instruments. Fragments underwent library preparation using the Template Prep Kit v1 (PacBio) and then fractionation on a SageELF (Sage Science) instrument. After evaluating size, fractions averaging 11, 13 or 15 kbp were sequenced on a Sequel II (PacBio) instrument using Sequel II chemistry v1 or v2EA (Early Access beta). After sequencing, raw data were analyzed with SMRT Link 7.1 or 8.0 using the CCS protocol with a cutoff minimum of three passes and estimated accuracy of 0.99. In total, 18 SMRT Cell 8Ms were run for the Puerto Rican trio (HG00731, HG00732 and HG00733) for an average yield per sample of 91 Gbp of HiFi reads (Supplementary Table 7).

Strand-seq data analysis

All Strand-seq data in a FASTQ format were obtained from publicly available sources (‘Data availability’). At every step that requires alignment of short-read Strand-seq data to the squashed or clustered de novo assembly (Fig. 1), we used BWA-MEM (version 0.7.15-r1140) with the default parameters. In the next step, we filtered out all secondary and supplementary alignments using SAMtools (version 1.9). Subsequently, duplicate reads were marked using Sambamba (version 0.6.8). For every Strand-seq data analysis, we filtered out reads with mapping quality less than 10 as well as all duplicate reads.
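The read filters described above can be expressed as a predicate over the SAM flag and mapping quality of each alignment. The following is a minimal sketch, not the pipeline's actual code: the function name `keep_alignment` and the example records are hypothetical, while the flag bit values come from the SAM format specification.

```python
# SAM flag bits as defined in the SAM format specification.
SECONDARY = 0x100      # 256: secondary alignment
DUPLICATE = 0x400      # 1024: PCR or optical duplicate
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment

MIN_MAPQ = 10  # mapping-quality threshold stated in the text


def keep_alignment(flag: int, mapq: int) -> bool:
    """Return True if an alignment passes the filters described above.

    Hypothetical helper: drops secondary, supplementary and duplicate
    alignments, and anything with mapping quality below MIN_MAPQ.
    """
    if flag & (SECONDARY | SUPPLEMENTARY):
        return False
    if flag & DUPLICATE:
        return False
    return mapq >= MIN_MAPQ


# Made-up (flag, mapq) pairs for illustration only.
examples = [(99, 60), (355, 60), (1123, 60), (2147, 60), (99, 5)]
kept = [pair for pair in examples if keep_alignment(*pair)]
```

Only the first record survives here: the others carry the secondary, duplicate or supplementary bit, or fall below the mapping-quality cutoff.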

Squashed genome assembly

Initially, squashed assemblies were constructed to produce a set of unphased contigs. We assembled HiFi reads using the Peregrine assembler.

All Peregrine (v0.1.5.5) assemblies were run using the following command:

asm {reads.fofn} 36 36 36 36 36 36 36 36 36 --with-consensus \
--shimmer-r 3 --best_n_ovlp 8 --output {assembly.dir}

Clustering contigs into chromosomal scaffolds

We used the R package SaaRclust15 to cluster de novo squashed assemblies into chromosomal scaffolds. SaaRclust takes as input Strand-seq reads aligned to the squashed de novo assembly in BAM format. Given our parameter settings, we discarded contigs shorter than 100 kbp from further analysis. Remaining contigs were partitioned into variable-sized bins of 200,000 Strand-seq mappable positions. The counts of aligned reads per bin, separated by directionality (+/Crick or −/Watson), were used as input for SaaRclust, which divides contigs into a user-defined number of clusters (set to n = 100|150). Contigs genotyped as Watson–Crick (WC) in most cells were discarded. We further removed contigs that could be assigned to multiple clusters with probability P < 0.25 (Supplementary Fig. 23). Subsequently, SaaRclust merges clusters that share the same strand inheritance across multiple Strand-seq libraries. Shared strand inheritance is used to construct a graph of connected components (clusters), and the most connected subgraphs are reported, resulting in approximately 24 clusters; that is, one cluster should ideally represent one human chromosome.

Next, we defined misoriented contigs within each cluster as those having opposing directionality in every Strand-seq library. We used hierarchical clustering to detect groups of minus-oriented and plus-oriented contigs. To synchronize contig directionality, we switched the direction of one group of contigs from plus to minus or vice versa.

Contigs synchronized by direction were then subjected to positional ordering within a cluster. We again used contig strand state coinheritance as a proxy to infer physical distance for each contig pair in every Strand-seq library. The resultant coinheritance matrix serves as input for the ‘Traveling Salesman Algorithm’ implemented in the R package TSP (version 1.1–7)41, which attempts to order contigs based on strand state coinheritance.

Because the initial squashed assembly might contain assembly errors, SaaRclust can detect and correct errors such as bins of the same contig being assigned to different clusters (‘Chimeric contig’) or bins of the same contig that differ in directionality (‘Misoriented contig’). Lastly, we export clustered, reoriented and ordered contigs into a single FASTA file with a single FASTA record per cluster. A complete list of parameters used to run SaaRclust in this study is reported below:

SaaRclust command:

scaffoldDenovoAssembly(bamfolder = <>, outputfolder = <>, = TRUE, = TRUE, pairedEndReads = TRUE, bin.size = 200000, step.size = 200000, = 0.25, bin.method = 'dynamic', min.contig.size = 100000, assembly.fasta = assembly.fasta, concat.fasta = TRUE, num.clusters = 100|150, remove.always.WC = TRUE, mask.regions = FALSE)

Variant calling

Clustered assemblies in full chromosomal scaffolds are then used for realignment of long PacBio reads. To call variants in HiFi reads, we use DeepVariant38 v0.9.0, which uses a deep neural network with a pre-trained model (--model_type=PACBIO). For the variant calling, HiFi reads were aligned using pbmm2 v1.1.0 with settings align --log-level DEBUG --preset CCS --min-length 5000 and filtered with samtools view -F 2308. After variant calling, we select only heterozygous SNVs using BCFtools v1.9.

For both PacBio CLR and ONT reads, we use the LongShot variant caller:

longshot --no_haps --force_overwrite --auto_max_cov \
--bam {alignments} --ref {clustered_assm} \
--region {contig} --sample_id {individual} --out {output}

Phasing chromosomal scaffolds

To create completely phased chromosomal scaffolds, we used a combination of Strand-seq and long-read phasing18. First, we realigned Strand-seq data on top of the clustered assemblies as stated previously. Only regions that inherit a Watson and Crick template strand from each parent are informative for phasing and are detected using breakpointR42. Haplotype-informative regions are then exported using the breakpointR function called ‘exportRegions’. Using the set of haplotype-informative regions together with positions of heterozygous SNVs, we ran StrandPhaseR18 to phase SNVs into whole-chromosome haplotypes. Such sparse haplotypes are then used as a haplotype backbone for long-read phasing using WhatsHap to increase density of phased SNVs.

breakpointR command (run and export of results):

breakpointr(inputfolder = <>, outputfolder = <>, windowsize = 500000, binMethod = 'size', pairedEndReads = TRUE, pair2frgm = FALSE, min.mapq = 10, filtAlt = TRUE, background = 0.1, minReads = 50)
exportRegions(datapath = <>, file = <>, collapseInversions = TRUE, collapseRegionSize = 5000000, minRegionSize = 5000000, state = 'wc')

StrandPhaseR command:

strandPhaseR(inputfolder = <>, positions = <SNVs.vcf>, WCregions = <hap.informtive.regions>, pairedEndReads = TRUE, min.mapq = 10, min.baseq = 20, num.iterations = 2, translateBases = TRUE, splitPhasedReads = TRUE)

WhatsHap command:

whatshap phase --chromosome {chromosome} --reference {reference.fasta} {input.vcf} {input.bam} {input.vcf_sparse_haplotypes}

Haplotagging PacBio reads

Having completely phased chromosomal scaffolds at sufficient SNV density allows us to split long PacBio reads into their respective haplotypes using WhatsHap. This step can be performed in two ways: splitting all reads across all clusters into two bins per haplotype or splitting reads into two bins per cluster and per haplotype. Both strategies consist of the same two steps: 1) labeling all reads with their respective haplotype (‘haplotagging’) and 2) splitting the input reads only by haplotype or by haplotype and cluster (‘haplosplitting’). The WhatsHap commands are identical in both cases except for limiting WhatsHap to a specific cluster during haplotagging and discarding reads from other clusters to separate the reads by haplotype and cluster:

whatshap haplotag [--regions {cluster}] --output {output.bam} --reference {input.fasta} --output-haplotag-list {output.tags} {input.vcf} {input.bam}
whatshap split [--discard-unknown-reads] --pigz --output-h1 {output.hap1} --output-h2 {output.hap2} --output-untagged {output.un} --read-lengths-histogram {output.hist} {input.fastq} {input.tags}

Creating haplotype-specific assemblies

After haplotagging and haplosplitting, the long HiFi reads separated by haplotype were then used to create fully haplotype-resolved assemblies. Our haplotagging and haplosplitting strategy enabled us to examine two types of haploid assemblies per input long-read dataset: the two haplotype-only assemblies (short: h1 and h2), plus the haploid assemblies created by using also all untagged reads—that is, all reads that could not be assigned to a haplotype (short: h1-un and h2-un). Hence, for each input read dataset, this amounts to four ‘genome-scale’ assemblies. We focused our analyses on the read sets h1-un (H1) and h2-un (H2). Final phased assemblies were created using parameters stated in the ‘Squashed genome assembly’ section.
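The four read sets described above can be illustrated with a short sketch. It assumes the haplotag list has already been parsed into a dictionary from read name to "H1" or "H2"; the function name `build_read_sets`, the dictionary layout and the read names are hypothetical and do not reflect the WhatsHap output format.

```python
def build_read_sets(read_ids, tags):
    """Partition reads into h1, h2 and untagged, then form h1-un and h2-un.

    read_ids: iterable of read names.
    tags: dict mapping read name -> "H1" or "H2" (assumed layout); reads
          missing from the dict, or mapped to anything else, count as
          untagged.
    """
    h1, h2, untagged = [], [], []
    for rid in read_ids:
        hap = tags.get(rid)
        if hap == "H1":
            h1.append(rid)
        elif hap == "H2":
            h2.append(rid)
        else:
            untagged.append(rid)
    return {
        "h1": h1,
        "h2": h2,
        "h1-un": h1 + untagged,  # haplotype 1 reads plus all untagged reads
        "h2-un": h2 + untagged,  # haplotype 2 reads plus all untagged reads
    }


# Made-up read names for illustration only.
read_sets = build_read_sets(["r1", "r2", "r3"], {"r1": "H1", "r2": "H2"})
```

In this toy case the untagged read `r3` appears in both `h1-un` and `h2-un`, which is why those sets are larger than the haplotype-only ones.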

SD analysis

SDs were defined as resolved or unresolved based on their alignments to SDs defined in GRCh38 using minimap2 with the following parameters: --secondary=no -a --eqx -Y -x asm20 -m 10000 -z 10000,50 -r 50000 --end-bonus=100 -O 5,56 -E 4,1 -B 5 (ref. 33). Alignments that extended a minimum number of base pairs beyond the annotated SDs were considered to be resolved. The percent of resolved SDs was determined for minimum extension varying from 10,000 to 50,000 bp, and the average was reported. This analysis is adapted from Vollger et al.34.
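The resolved-SD criterion and its averaging over extension thresholds can be sketched as follows. Two details are assumptions made for illustration, since the text does not state them: the alignment is required to extend past the SD on both flanks, and the thresholds are sampled in 10,000-bp steps. All function names and coordinates are hypothetical.

```python
def is_resolved(sd_start, sd_end, aln_start, aln_end, min_ext):
    """True if the alignment extends at least min_ext bp past both SD ends.

    Requiring both flanks is an assumption of this sketch.
    """
    return aln_start <= sd_start - min_ext and aln_end >= sd_end + min_ext


def percent_resolved(sds, min_ext):
    """sds: list of (sd_start, sd_end, aln_start, aln_end) tuples."""
    resolved = sum(1 for sd in sds if is_resolved(*sd, min_ext))
    return 100.0 * resolved / len(sds)


def mean_percent_resolved(sds, extensions=range(10_000, 50_001, 10_000)):
    """Average the resolved percentage over the extension thresholds.

    The 10,000-bp step is assumed; the text gives only the 10,000 to
    50,000 bp range.
    """
    values = [percent_resolved(sds, ext) for ext in extensions]
    return sum(values) / len(values)


# Made-up coordinates: the first SD is flanked by 60 kbp on each side,
# the second by 15 kbp (left) and 20 kbp (right).
toy_sds = [
    (100_000, 150_000, 40_000, 210_000),
    (500_000, 520_000, 485_000, 540_000),
]
average = mean_percent_resolved(toy_sds)
```

With these toy values both SDs count as resolved at the 10,000-bp threshold and only the first at the larger thresholds, giving an average of 60%.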

SD collapse analysis

Collapses were identified using the methods described in Vollger et al.34. In brief, the method identifies regions in the assemblies that are at least 15 kbp in length and have read coverage exceeding the mean coverage plus three standard deviations. Additionally, collapses with more than 75% common repeat elements (identified with RepeatMasker) or TRs (identified with Tandem Repeats Finder43) are excluded.
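The coverage criterion can be illustrated with a minimal sketch. It assumes read depth has been summarized in fixed 1-kbp windows and uses the population standard deviation; both choices, along with the function name and the toy data, are assumptions of this example rather than details from Vollger et al. The repeat-content exclusion step is not implemented here.

```python
from statistics import mean, pstdev


def find_collapses(coverage, window=1_000, min_len=15_000, n_sd=3.0):
    """Return (start, end) regions whose coverage exceeds mean + n_sd * SD.

    coverage: per-window read depth (window size in bp is an assumption).
    Only runs of consecutive high-coverage windows spanning at least
    min_len bp are reported.
    """
    coverage = list(coverage)
    threshold = mean(coverage) + n_sd * pstdev(coverage)
    regions = []
    run_start = None
    # A sentinel value below any threshold closes a run at the end.
    for i, cov in enumerate(coverage + [float("-inf")]):
        if cov > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if (i - run_start) * window >= min_len:
                regions.append((run_start * window, i * window))
            run_start = None
    return regions


# Toy data: flat 30x coverage with a 20-kbp and a 5-kbp spike at 90x.
toy_cov = [30] * 1020
for i in range(500, 520):
    toy_cov[i] = 90
for i in range(100, 105):
    toy_cov[i] = 90
collapses = find_collapses(toy_cov)
```

The 5-kbp spike exceeds the coverage threshold but is too short to be reported, so only the 20-kbp region is returned.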

BAC clone insert sequencing

BAC clones from the VMRC62 clone library were selected from random regions of the genome not intersecting with an SD (n = 77). DNA from positive clones was isolated, screened for genome location and prepared for long-insert PacBio sequencing as previously described (Segmental Duplication Assembler (SDA))34. Libraries were sequenced on the PacBio RS II with P6-C4 chemistry (17 clones) or the PacBio Sequel II with Sequel II 2.0 chemistry (S/P4.1-C2/5.0-8 M; 60 clones). We performed de novo assembly of pooled BAC inserts using Canu v1.5 (Koren et al.26) for the 17 PacBio RS II BACs and using the PacBio SMRT Link v8.0 Microbial assembly pipeline (Falcon + Raptor) for the 60 Sequel II BACs. After assembly, we removed vector sequence pCCBAC1, re-stitched the insert and then polished with Quiver or Arrow. Canu is specifically designed for assembly with long error-prone reads, whereas Quiver/Arrow is a multi-read consensus algorithm that uses the raw pulse and base-call information generated during SMRT (single-molecule, real-time) sequencing for error correction. We reviewed PacBio assemblies for misassembly by visualizing the read depth of PacBio reads in Parasight, using coverage summaries generated during the resequencing protocol.

Assembly polishing and error correction

Assembly misjoints are visible using Strand-seq as recurrent changes in strand state inheritance along a single contig. Strand state changes can result from a double-strand break (DSB) repaired by homologous recombination during DNA replication, causing an SCE1. DSBs are random independent events that occur naturally during a cell’s lifespan and, therefore, are unlikely to occur at the same position in multiple single cells2. Instead, a strand state change at the same genomic position in a population of cells indicates a process other than a DSB (such as a genomic SV or genome misassembly)13,44,45. A complete switch from WW (Watson–Watson) to CC (Crick–Crick) strand state or vice versa at about 50% frequency is observed when part of a contig is misoriented (Supplementary Fig. 6). All detected misassemblies in the final phased assemblies (Supplementary Table 1) were corrected with SaaRclust using the following parameters:

scaffoldDenovoAssembly(bamfolder = <>, outputfolder = <>, = TRUE, = TRUE, pairedEndReads = TRUE, bin.size = 200000, step.size = 200000, = 0.9, bin.method = 'dynamic', ord.method = 'greedy', min.contig.size = 100000, = 500000, assembly.fasta = assembly.fasta, concat.fasta = FALSE, num.clusters = 100|150, remove.always.WC = TRUE, mask.regions = FALSE)

Common assembly breaks

To detect recurrent breaks in our assemblies, we searched for assembly gaps present in at least one phased assembly completed by Flye (for CLR PacBio reads) or Peregrine (for HiFi PacBio reads). For this, we mapped all haplotype-specific contigs to GRCh38 using minimap2 using the same parameters as in the SD analysis method. We defined an assembly break as a gap between two subsequent contigs. We searched for reoccurring assembly breaks in 500-kbp non-overlapping bins and filtered out contigs smaller than 100 kbp. Each assembly break was defined as a range between the first and the last breakpoint found in any given genomic bin and was annotated based on the overlap with known SDs, gaps, centromeres and SV callsets19, allowing overlaps within 10-kbp distance from the breakpoint boundaries.
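The binning of breakpoints into 500-kbp non-overlapping bins can be sketched as below. This is an illustrative simplification: it assumes breakpoints have already been extracted as (chromosome, position) pairs from contigs of at least 100 kbp, and it omits the annotation against SDs, gaps, centromeres and SV callsets. The function name and the coordinates are hypothetical.

```python
from collections import defaultdict

BIN_SIZE = 500_000  # non-overlapping bin size stated in the text


def summarize_breaks(breakpoints, bin_size=BIN_SIZE):
    """Group breakpoints by genomic bin.

    breakpoints: iterable of (chrom, position) pairs pooled over assemblies.
    Returns {(chrom, bin_index): (first_pos, last_pos, count)}, where the
    first-to-last range corresponds to the assembly break as defined above.
    """
    bins = defaultdict(list)
    for chrom, pos in breakpoints:
        bins[(chrom, pos // bin_size)].append(pos)
    return {
        key: (min(positions), max(positions), len(positions))
        for key, positions in bins.items()
    }


# Made-up breakpoints for illustration only.
toy_breaks = [("chr1", 1_200_000), ("chr1", 1_350_000), ("chr1", 2_600_000)]
break_summary = summarize_breaks(toy_breaks)
```

Bins with a count above one are candidates for recurrent breaks; here the first two breakpoints share a bin and the third stands alone.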

Base accuracy

Phred-like QV calculations were made by aligning the final assemblies to 77 sequenced and assembled BACs from VMRC62 falling within unique regions of the genome (>10 kbp away from the closest SD) where at least 95% of the BAC sequence was aligned. Insertions and deletions of size N were counted as N errors, and the QV was calculated with the following formula: QV = -10 × log10(1 - (percent identity/100)).
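As a minimal sketch of the QV formula above, the following Python snippet converts a percent identity into a Phred-like QV. The helper names and the choice of denominator (matches plus errors) are assumptions made for illustration; the source specifies only the formula and that an indel of size N counts as N errors.

```python
import math


def percent_identity(matches, mismatches, indel_lengths):
    """Percent identity where an indel of size N counts as N errors.

    Assumption: identity = matches / (matches + errors); the source does
    not spell out the denominator.
    """
    errors = mismatches + sum(indel_lengths)
    return 100.0 * matches / (matches + errors)


def phred_qv(identity_percent):
    """QV = -10 * log10(1 - identity/100), as stated in the text."""
    error_rate = 1.0 - identity_percent / 100.0
    if error_rate <= 0.0:
        return math.inf  # no observed errors, so the QV is unbounded
    return -10.0 * math.log10(error_rate)


# 4 mismatches plus two 3-bp indels over 9,990 matching bases
identity = percent_identity(9990, 4, [3, 3])
qv = phred_qv(identity)
```

Under these assumptions, one error per 1,000 aligned bases corresponds to a QV of about 30, and one per 10,000 to about 40.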

Each assembly was polished twice with Racon28 using the haplotype-partitioned HiFi FASTQs. The alignment and polishing steps were run with the following commands:

minimap2 -ax map-pb --eqx -m 5000 -t {threads} --secondary=no {ref} {fastq} | samtools view -F 1796 - > {sam}
racon {fastq} {sam} {ref} -u -t {threads} > {output.fasta}

The HG00733 ONT assemblies were polished with MarginPolish/HELEN32 (git commit 4a18ade) following developer recommendations. The alignments were created with minimap2 v2.17 and used for polishing as follows:

minimap2 -ax map-ont -t {threads} {assembly} {reads} |
samtools sort -@ {threads} |
samtools view -hb -F 0x104 > {output}
marginpolish {alignments} {assembly} MP_r941_guppy344_human.json \
--threads {threads} --produceFeatures --outputBase {output}
helen polish --image_dir {mp_out} --model_path HELEN_r941_guppy344_human.pkl \
--threads {threads} --output_dir {output} --output_prefix HELEN

QV estimates based on variant callsets lifted back to the human reference hg38 were derived as follows: Genome in a Bottle46 high-confidence region sets (release v3.3.2) for individuals HG001, HG002 and HG005 were downloaded, and the intersection of all regions (BEDTools v2.29.0 ‘multiinter’47) was used as proxy for high-confidence regions in other samples (covering ~2.17 Gbp). For all samples, variant callsets based on Illumina short-read alignments against the respective haploid assembly were generated using BWA 0.7.17 and FreeBayes v1.3.1 as follows:

bwa mem -t {threads} -R {read_group} {index_prefix} {reads_mate1} {reads_mate2} | samtools view -u -F 3840 - |
samtools sort -l 6 {output_bam}

The BAM files were sorted with SAMtools v1.9 and duplicates marked with Sambamba v0.6.6 ‘markdup’. The variant calls with FreeBayes were generated as follows:

freebayes --use-best-n-alleles 4 --skip-coverage {cov_limit} --region {assembly_contig} -f {assembly_fasta} \
--bam {bam_child} --bam {bam_parent1} --bam {bam_parent2}

Options ‘--use-best-n-alleles’ and ‘--skip-coverage’ were set following developer recommendations to increase processing speed. Variants were further filtered with BCFtools v1.9 for quality and read depth: ‘QUAL >=10 && INFO/DP<MEAN+3*STDDEV’. Variants were converted into BED format using vcf2bed v2.4.37 (ref. 48) with parameters ‘--snvs’, ‘--insertions’ and ‘--deletions’. The alignment information for lifting variants from the haploid assemblies to the human hg38 reference was generated with minimap2 v2.17-r941, and the liftover was realized with paftools (part of the minimap2 package):

minimap2 -t {threads} -cx asm20 --cs --secondary=no -Y -m 10000 -z 10000,50 -r 50000 --end-bonus=100 -O 5,56 -E 4,1 -B 5 hg38.fasta {input_hap_assembly} > {hap-assm}_to_hg38.paf
paftools.js liftover -l 10000 {input_paf} {input_bed} > {output.hg38.bed}

The lifted variants were intersected with our custom set of high-confidence regions using BEDTools ‘intersect’. The total number of base pairs in homozygous variants was then computed as the sum over the length (as reported by FreeBayes as LEN) of all variants located in the high-confidence regions. Because not all variants could be lifted from the haploid to the hg38 reference assembly, we cannot know whether these variants would fall into the ‘high-confidence’ category. We thus computed a second, more conservative, QV estimate counting also all homozygous calls as error that were not lifted to the hg38 reference.

Hi-C based scaffolding and validation

To independently evaluate the accuracy of our scaffolds, we used proximity ligation data for NA12878 and HG00733 (‘Data availability’). By aligning Hi-C data to our scaffolds produced by SaaRclust, we can visually confirm that long-range Hi-C interactions are limited to each cluster reported by SaaRclust.

In addition, we attempted to reproduce Hi-C-based scaffolds presented by Garg et al.12 for NA12878 using 3D-DNA49. Input to this pipeline was created with Juicer50 and an Arima Genomics Hi-C script, which are both publicly available.

Arima script -i {squashed_asm} -e {cut-Sequence} -o {cut-sites.txt}

Juicer -g {genome_id} -s {enzyme} -z {squashed_asm} -r -p {chrom.sizes} -y {cut-sites.txt}

3D-DNA {squashed_asm} {juicer_merged_no_dups}

SV, indel and SNV detection

Methods for SV, indel and SNV calling are similar to previous HiFi assembly work33 but were adapted for phased assemblies. Variants were called against the GRCh38 primary assembly (that is, no alternate, patch or decoy sequences), which includes chromosomes and unplaced/unlocalized contigs. Mapping was performed with minimap2 2.17 (ref. 51) using parameters --secondary=no -a -t 20 --eqx -Y -x asm20 -m 10000 -z 10000,50 -r 50000 --end-bonus=100 -O 5,56 -E 4,1 -B 5, as described previously33. Alignments were then sorted with SAMtools v1.9 (ref. 52).

To obtain variant calls, alignments were processed with, which was derived in the SMRT-SV v2 pipeline (,54, to parse CIGAR string operations to make variant calls30.

Alignment records from assemblies often overlap, which would produce duplicate variant calls with possible different representations (fragmented or shifted). For each haplotype, we constructed a tiling path covering GRCh38 once and traversing loci most centrally located within alignment records. Variants within the path were chosen, and variants outside the tiling path (that is, potential duplicates) were dropped from further analysis.

After obtaining a callset for H1 and H2 independently, we then merged the two haplotypes into a single callset. For homozygous SV and indel calls, an H2 variant must intersect an H1 variant by 1) 50% reciprocal overlap (RO) or 2) within 200 bp and a 50% reciprocal size overlap (RO if variants were shifted to maximally intersect). For homozygous SNV calls, the position and alternate base must match exactly. The result is a unified phased callset containing homozygous and heterozygous variants. Finally, we filtered out variants in pericentromeric loci where callsets are difficult to reproduce54 and loci where we found a collapse in the assembly of either haplotype.
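The two matching criteria for homozygous SV and indel calls can be sketched in Python as follows. This is an illustration only, not the authors' implementation: the function names are hypothetical, variants are modelled as half-open reference intervals (start, end) of the same type on the same chromosome, "within 200 bp" is interpreted as the gap between the two intervals, and an insertion would need to be represented as (position, position + length).

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions between intervals a and b."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))


def gap_between(a, b):
    """Distance between two intervals; 0 if they touch or overlap."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def variants_match(h1, h2, min_ro=0.5, max_gap=200):
    """True if an H2 call matches an H1 call under either criterion."""
    if h1[1] <= h1[0] or h2[1] <= h2[0]:
        raise ValueError("intervals must have positive length")
    # Criterion 1: at least 50% reciprocal overlap.
    if reciprocal_overlap(h1, h2) >= min_ro:
        return True
    # Criterion 2: within 200 bp and at least 50% reciprocal size overlap,
    # i.e. the RO the calls would have if shifted to maximally intersect.
    len1, len2 = h1[1] - h1[0], h2[1] - h2[0]
    size_ro = min(len1, len2) / max(len1, len2)
    return gap_between(h1, h2) <= max_gap and size_ro >= min_ro
```

For example, two 1-kbp deletions separated by 100 bp fail the first criterion but pass the second, so they would be merged into one homozygous call under these assumptions.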

We intersected RefSeq annotations from the UCSC RefSeq track and evaluated the effect on genes noting frameshift SVs and indels in coding regions by quantifying the number of bases affected per variant on genic regions. If an insertion or deletion changes coding sequence for any isoform of a gene by a non-modulo-3 number of bases, we flag the gene as likely disrupted.

Variants falling within TRs and SDs were also annotated using UCSC hg38 tracks. For TR and SD BED files, we merged records allowing regions within 200 bp to overlap with BEDTools47. SVs and indels that were at least 50% contained within an SD or TR region were annotated as SD or TR. For RefSeq analysis, we classified genes as contained within TR or SD by intersecting exons with the collapsed TR and SD regions allowing any overlap.

Phasing accuracy estimates

To evaluate phasing accuracy, we determined SNVs in our phased assemblies based on their alignments to GRCh38. This procedure is described in the ‘SV, indel and SNV detection’ section in the Methods. We evaluate phasing accuracy of our assemblies in comparison to trio-based phasing for HG00733 (ref. 19) and NA12878 (ref. 46). In all calculations, we compare only SNV positions that are shared between our SNV calls and those from trio-based phasing. To count the number of switch errors between our phased assemblies and trio-based phasing, we compare all neighboring pairs of SNVs along each haplotype and recode them into a string of 0s and 1s depending on whether the neighboring alleles are the same (0) or not (1). The absolute number of differences in such binary strings is counted between our haplotypes and the trio-based haplotypes (per chromosome). The switch error rate is reported as a fraction of counted differences of the total number of compared SNVs (per haplotype). Similarly, we calculate the Hamming distance as the absolute number of differences between our SNVs and trio-based phasing (per chromosome) and report it as a fraction of the total number of differences to the total number of compared SNVs (per haplotype).
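The switch error and Hamming calculations described above can be sketched in Python as follows. The function names and the 0/1 allele encoding are assumptions for illustration; each input list holds the alleles of one haplotype at the SNV positions shared with the trio-based phasing, in genomic order.

```python
def transition_string(haplotype):
    """Recode neighboring allele pairs: 0 if the same, 1 if different."""
    return [0 if a == b else 1 for a, b in zip(haplotype, haplotype[1:])]


def switch_error_rate(ours, truth):
    """Differences between transition strings over the number of SNVs."""
    if len(ours) != len(truth) or not ours:
        raise ValueError("haplotypes must be non-empty and of equal length")
    ours_t, truth_t = transition_string(ours), transition_string(truth)
    differences = sum(x != y for x, y in zip(ours_t, truth_t))
    return differences / len(ours)


def hamming_rate(ours, truth):
    """Allele-by-allele differences over the number of SNVs."""
    if len(ours) != len(truth) or not ours:
        raise ValueError("haplotypes must be non-empty and of equal length")
    differences = sum(a != b for a, b in zip(ours, truth))
    return differences / len(ours)


# One switch after the fourth SNV flips the phase of the remaining four.
truth = [0, 1, 0, 1, 1, 0, 0, 1]
ours = [0, 1, 0, 1, 0, 1, 1, 0]
switch_rate = switch_error_rate(ours, truth)
hamming = hamming_rate(ours, truth)
```

The example illustrates why both metrics are reported: a single switch counts once in the switch error rate but changes every downstream allele in the Hamming comparison.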

MHC analysis

We extracted the MHC, defined as chr6:28000000–34000000, by mapping each haplotype sequence against GRCh38 and extracting any primary or supplementary alignments to this region. We created a dotplot for each haplotype’s MHC region using Dot from DNAnexus (Supplementary Fig. 11). We created phased VCFs for both the CCS and Shasta assemblies using the two haplotype files as input to Dipcall. Then, we compared the phasing between the haplotype files using the compare module within WhatsHap. This results in a switch error rate of 0.48% (six sites) and a Hamming error rate of 0.28% (four sites) from 1,433 common heterozygous sites between the VCFs.

Detection of loss of heterozygosity regions

To localize regions of decreased heterozygosity, we calculated the SNV diversity as a fraction of heterozygous variants between H1 and H2 within 200-kbp-long genomic bins (sliding by 10 kbp). In the next step, we rescaled SNV diversity values to a vector of 0s and 1s by setting values <25th quantile to 0 and those >25th quantile to 1. Then, we used the R package fastseg55 to find change points in the previously created vector of 0s and 1s while reporting segments with a minimal length of 200 (diversity values per bin). In turn, we genotyped each segment based on its median value. Segments with median value ≤0.05 were genotyped as ‘LOH’ (loss of heterozygosity), whereas the rest were genotyped as ‘NORM’ (normal level of heterozygosity).
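The rescaling and genotyping steps can be sketched in Python as follows. This is a simplified illustration: the change-point detection performed by fastseg is not reimplemented, the function names are hypothetical, values equal to the cutoff are assigned 1 (the source leaves that case open), and the 25% cutoff is computed with the default method of the standard library, which may differ slightly from the R quantile definition.

```python
import statistics


def binarize_diversity(values):
    """Rescale per-bin SNV diversity to 0/1 around the 25% cutoff."""
    cutoff = statistics.quantiles(values, n=4)[0]
    # Assumption: values equal to the cutoff are treated as 1.
    return [0 if v < cutoff else 1 for v in values]


def genotype_segment(binary_values, max_median=0.05):
    """Label a segment 'LOH' if its median is <= 0.05, else 'NORM'."""
    if statistics.median(binary_values) <= max_median:
        return "LOH"
    return "NORM"


# Toy per-bin diversity values; the first three bins lack heterozygous SNVs.
diversity = [0.0, 0.0, 0.0, 0.04, 0.05, 0.06,
             0.07, 0.08, 0.09, 0.10, 0.11, 0.12]
flags = binarize_diversity(diversity)
# Segment boundaries are hand-picked here; in the study they come from fastseg.
labels = [genotype_segment(flags[:3]), genotype_segment(flags[3:])]
```

Because the vector is binary, a segment median of at most 0.05 effectively means that more than half of its bins fell below the cutoff.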

Detection of misassembled contigs

To detect assembly errors in squashed or phased assemblies, we used our SaaRclust package. First, we aligned Strand-seq reads to an assembly in question and then ran SaaRclust with the following parameters:

scaffoldDenovoAssembly(bamfolder = <>, outputfolder = <>, = TRUE, = TRUE, pairedEndReads = TRUE, bin.size = 200000, step.size = 200000,, bin.method = 'fixed', ord.method = 'greedy', min.contig.size = 100000, assembly.fasta = assembly.fasta, concat.fasta = FALSE, num.clusters = 100, remove.always.WC = TRUE, mask.regions = FALSE, desired.num.clusters = 24)

The list of predicted assembly errors (misassembled contigs) is reported by SaaRclust in an RData object with the prefix ‘putativeAsmErrors_*’.

Likely disrupted genes

Using RefSeq intersect counts, we found all genes with at least one non-modulo-3 insertion or deletion within the coding region of any isoform (that is, frameshift). We filtered out any genes not fully contained within a consensus region of the two haplotypes, which we defined as regions where both H1 and H2 had exactly one aligned contig. If a gene had multiple non-modulo-3 events, whether in the same isoform or not, the gene was counted once.
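The frameshift rule can be sketched in Python as follows. The input layout, a list of (gene, isoform, affected coding bases) tuples, and the function name are assumptions for illustration; the consensus-region filter is assumed to have been applied beforehand.

```python
def likely_disrupted_genes(events):
    """Return the set of genes with at least one non-modulo-3 indel.

    events: iterable of (gene, isoform, coding_bases_affected), where the
    last field is the number of coding bases inserted or deleted.
    """
    disrupted = set()
    for gene, _isoform, coding_bases in events:
        if abs(coding_bases) % 3 != 0:
            # A set ensures a gene with several events is counted once.
            disrupted.add(gene)
    return disrupted


events = [
    ("GENE_A", "iso1", 3),  # in-frame, does not flag the gene
    ("GENE_A", "iso2", 4),  # frameshift
    ("GENE_A", "iso1", 1),  # second frameshift, same gene
    ("GENE_B", "iso1", 6),  # in-frame
    ("GENE_C", "iso1", 2),  # frameshift
]
flagged = likely_disrupted_genes(events)
```

In this toy input, GENE_A carries two frameshift events across isoforms but contributes a single entry to the result.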

Variant comparisons

We compared variants to previously published callsets by intersecting them with the same RO/Size-RO strategy used to merge haplotypes. For HGSVC comparisons, we also excluded variant calls on unplaced contigs, unlocalized contigs and chrY of the reference, which were not reported by the HGSVC study (that is, we retained chr1-22 and chrX). To quantify the number of missed variants proximal to another, we took variants that failed to intersect an HGSVC variant and found the distance to the nearest variant of the same type (INS versus INS and DEL versus DEL).

Robust and reproducible implementation

The basic workflow of our study is implemented in a reproducible and scalable Snakemake56 pipeline that has been successfully tested in compute environments ranging from single servers to high-performance cluster setups (‘Code availability’). Major tasks in the pipeline, such as read alignment or assembly, have been designed as self-contained ‘start-to-finish’ jobs, automating even trivial steps, such as downloading the publicly available datasets used in this study. Owing to the considerable size of the input data, we strongly recommend deploying this pipeline only on compute infrastructure tailored to resource-intensive and highly parallel workloads.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.


via Bioinformatics : subject feeds

December 8, 2020 at 03:06AM

2021 preview: How soon will a covid-19 vaccine return life to normal?

By Michael Le Page, Clare Wilson, Donna Lu, Graham Lawton and Adam Vaughan

People enjoy the sun while social distancing in Melbourne, Australia, in October 2020 after restrictions eased

Darrian Traynor/Getty Images

IF 2020 felt hellish, be warned that we aren’t out of the fire yet, even if we are moving in the right direction. Welcome to 2021, aka purgatory.

There is little doubt that vaccines hold the key to ending the pandemic. A recent modelling study predicted that vaccinating just 40 per cent of US adults over the course of 2021 would reduce the coronavirus infection rate by around 75 per cent and cut hospitalisations and deaths from covid-19 by more than …


via New Scientist – Health

December 26, 2020 at 08:21AM

Why the COVID-19 Pandemic Has Caused a Widespread Existential Crisis

“The ol’ quarantine move-in,” a friend joked a couple months ago, when I told her I’d decided to live with my boyfriend of almost two years.

I can add all the caveats I want—my lease was up and we probably would have moved in together this year anyway—but I know I’m a statistic. I’m just one of the countless people who have made huge life decisions during this massively chaotic and unsettling pandemic year.

Of course, there is significant privilege in having the time and ability to choose to make a life shift right now, when many people are facing changes they most certainly did not ask for: losses of jobs, savings, homes, friends, family, security. But among those lucky enough to make them voluntarily, life adjustments are coming fast and frequently.

My Instagram feed feels like a constant stream of engagements, pandemic weddings, moving trucks, career announcements and newly adopted pets. Three of my closest friends decamped from major cities to houses in the suburbs in 2020; one bought a house, got married and decided to change careers over the course of about six months.

I’m in my late twenties, so to some degree this comes with the territory. But something about the COVID-19 pandemic, about the unending strangeness of the year 2020, seems to have paved the way for even more change than usual. It’s hard to plan two weeks in the future—who knows what will be open, what we’ll feel safe doing—but, with our previous lifestyles already uprooted, it feels easier than ever to plant new ones. My friends and I joke that when we catch up from our respective quarantines, there is either nothing new, or everything.

We’re not alone. The U.S. population seems to be making changes to the way it lives, works and relates en masse.

A Pew Research Center poll found that, as of June, 22% of American adults had either moved because of the pandemic or knew someone who did. That trend apparently continued into the fall: About 20% more houses sold in November 2020 compared to November 2019, according to U.S. Census Bureau data. The reasons for that trend are likely many. Among them, months of indoor time seem to have prompted many people to look for homes that offer more space, and those who can work from home suddenly have more freedom to move beyond the commuting distance of an office.

Meanwhile, about a quarter of U.S. adults said they’re considering a career shift due to the pandemic, found a November report from HR company Morneau Shepell. That’s not surprising, given that traditional workplaces have been partially replaced (at least for now) by teleworking and many people who cannot work from home must grapple with an entirely new risk-benefit analysis associated with clocking in. The numerous Americans who lost jobs in 2020 also have no choice but to reconsider their employment.

In the world of relationships, jewelers are reporting double-digit increases in engagement ring sales, the Washington Post reported in December. In the 2020 installment of Match’s annual Singles in America report, more than half of respondents said they’re prioritizing dating and rethinking the qualities they search for in a partner, likely sparked by the complete social upheaval of this year.

It will take years for researchers to fully understand the effect coronavirus had on the U.S. population, and it’s unlikely there will be one single lifestyle shift that characterizes the pandemic. Right now, the dominant trend seems to be change itself. The COVID-19 pandemic appears to have spurred a collective reckoning with our values, lifestyles and goals—a national existential crisis of sorts.

Freelance journalist and author Nneka Okona has lived in Atlanta for almost five years, but it often didn’t feel that way. Okona, 34, traveled a total of about 100,000 miles in 2019, so she was rarely home. Even when she vowed to take a month or two off from traveling, she’d get antsy and book a last-minute getaway. To say pandemic lockdowns and social distancing changed her lifestyle would be a massive understatement.

“It was such a drastic change. I realized maybe a couple months into the pandemic that I actually was not doing well, mental-health-wise,” Okona says. She started seeing a therapist, who helped her realize she was suffering depression after slamming the brakes on her action-oriented life.

Almost a year into the pandemic, Okona says she’s doing much better mentally and reflecting on her life in ways that weren’t possible when she was constantly on the go. “With the movement I was just so distracted,” she says. “It was easier to ignore a lot of things I needed to focus on because I didn’t have time.” Now, she says, she’s thinking critically about where she wants to live, whether she wants to continue freelancing and in what form she’ll continue her travel habit moving forward.

Reevaluation is a common reaction to sudden, strange stillness like that brought on by the pandemic, says Dr. Elinore McCance-Katz, who leads the U.S. Substance Abuse and Mental Health Services Administration. “It gives people a lot of time to review their lives and think about what life could look like moving forward,” she says. “For many people, that’s not a bad thing, for them to really spend time taking an inventory of what their life is like currently and what they want it to be like.”

Quarantine also creates a perfect storm for making big decisions, says Jacqueline Gollan, a psychiatry professor at Northwestern University’s Feinberg School of Medicine who studies decision making. Many people are stuck at home for most of their waking hours, watching one day bleed into the next. When it feels like nothing noteworthy is going on, people may try to make things happen.

“People have a basic bias toward action,” Gollan says. “People will want to take action on something, whatever it is, rather than delay action [even] when that’s the best option.”

That natural inclination may be ratcheted up even further when people are trying to relieve negative emotions associated with the pandemic, Gollan says. In addition to a general preference for action over inaction, humans are also likely to seek out situations—new relationships, living situations, jobs—that seem like they’ll relieve stress, sadness or other bad feelings. That’s particularly likely during something as emotionally taxing as a pandemic.

Coronavirus has also reminded people of their own mortality, Gollan says. “People are realizing that life is short, and they’re reprioritizing,” she says. That’s an expected reaction: Studies show that natural disasters and other traumatic events can prompt people to make big decisions like getting married, often in a search for security or comfort.

Crises can also make people analyze and change their values. People tend to become more religious after natural disasters, research shows, perhaps out of a desire to understand or cope with difficult and inexplicable situations. Similarly, a Pew Research Center report from October 2020 found that 86% of U.S. adults thought there were lesson(s) humankind should learn from the COVID-19 pandemic. When asked to specify what those lessons were, people gave Pew more than 3,700 answers—some practical (the importance of wearing a mask), some spiritual (“We need to pray more and pray harder”) and some personal (we should “value humankind and intimacy”).

Relationships are often the first thing to get a makeover when people take a hard look at their lives, says Amanda Gesselman, associate director for research at the Kinsey Institute, a research center that focuses on sex and relationships. Gesselman’s research shows many people, particularly those in their twenties and thirties, are spending more time than usual on dating apps during the pandemic, and report having deeper conversations with the people they meet there, compared to before the pandemic.

“A big trend right now is really focusing on what kind of connections you want,” she says.

It’s not all warm and fuzzy, though. Rachel Dack, a Maryland-based psychotherapist and relationship coach, says she is indeed seeing many clients think critically about what they want in a relationship—and that leads to breakups and divorces as well as engagements and cohabitations.

In Match’s recent survey, about a quarter of singles said stay-at-home orders caused them to end a relationship. Some preliminary data also suggest more couples than normal are divorcing this year, though not all researchers agree with that assessment. For every relationship moving forward, Dack says, another seems to be splintering—perhaps not surprising, given pressures like financial stress or the tension of forced 24/7 togetherness. Researchers have observed that phenomenon in the aftermath of other crises; stressful times can both end and promote relationships.

Mass traumas can force change in other unpleasant ways, too. Both the 1918 flu pandemic and the 2008 recession led to noticeable decreases in the U.S. birth rate. National or global crises can also cause or compound mental health and substance abuse issues at the population level, as the COVID-19 pandemic has already done.

Research shows that rates of depression and anxiety have skyrocketed during the pandemic, which is one reason Gollan says it’s wise to think carefully about making any serious choices right now. “We’re notoriously not very good at predicting the consequences of a future decision,” Gollan says, and we’re also prone to “optimism bias,” the tendency to believe our decisions will work out in the end and that the future will be largely positive. That’s not always the case, though. Decisions can and do backfire, especially when they’re made under duress.

That’s not to say all change is bad. For many people, the pandemic has kickstarted a genuinely valuable process of reevaluation—it’s been a disruption so jarring it forces introspection. The luxury of extra free time, for those who have it, can also make it easier to define and act upon values and priorities.

The trick, Gollan says, is leaning into the natural inclination for change without toppling over the edge. Don’t act just because you think you should, and resist the urge to make life-altering changes based solely on temporary factors, she says. (The pandemic will end, though it might not feel like it.) “Stress test” your planned decision by seeking out information or perspectives that challenge it, Gollan suggests—before it’s too late to undo.

As we spoke, I wondered whether Gollan would approve of my decision to move in with my boyfriend. I haven’t had any regrets so far, but maybe I’ve been blinded by optimism and a desire for comfort amidst all the difficulty of this year. Did I stress test the plan enough? Should we have waited until the pandemic ended and our heads cleared?

I’m not sure what an expert would say. But if 2020 has taught me anything, it’s that I cannot begin to predict what the future—or even tomorrow—will bring. I’m happy where I am, and that feels like more than enough as a historically awful year comes to a close. Maybe it’s the optimism bias at work. But optimism, psychologically biased or not, feels like a worthy antidote to a year marked by tragedy and sadness and stress. I’m going to hang onto it where I can.


via Healthland

December 29, 2020 at 11:41PM

Deep Neural Networks Help to Explain Living Brains

In the winter of 2011, Daniel Yamins, a postdoctoral researcher in computational neuroscience at the Massachusetts Institute of Technology, would at times toil past midnight on his machine vision project. He was painstakingly designing a system that could recognize objects in pictures, regardless of variations in size, position and other properties — something that humans do with ease. The system was a deep neural network, a type of computational device inspired by the neurological wiring of living brains.

“I remember very distinctly the time when we found a neural network that actually solved the task,” he said. It was 2 a.m., a tad too early to wake up his adviser, James DiCarlo, or other colleagues, so an excited Yamins took a walk in the cold Cambridge air. “I was really pumped,” he said.

It would have counted as a noteworthy accomplishment in artificial intelligence alone, one of many that would make neural networks the darlings of AI technology over the next few years. But that wasn’t the main goal for Yamins and his colleagues. To them and other neuroscientists, this was a pivotal moment in the development of computational models for brain functions.

DiCarlo and Yamins, who now runs his own lab at Stanford University, are part of a coterie of neuroscientists using deep neural networks to make sense of the brain’s architecture. In particular, scientists have struggled to understand the reasons behind the specializations within the brain for various tasks. They have wondered not just why different parts of the brain do different things, but also why the differences can be so specific: Why, for example, does the brain have an area for recognizing objects in general but also for faces in particular? Deep neural networks are showing that such specializations may be the most efficient way to solve problems.

Similarly, researchers have demonstrated that the deep networks most proficient at classifying speech, music and simulated scents have architectures that seem to parallel the brain’s auditory and olfactory systems. Such parallels also show up in deep nets that can look at a 2D scene and infer the underlying properties of the 3D objects within it, which helps to explain how biological perception can be both fast and incredibly rich. All these results hint that the structures of living neural systems embody certain optimal solutions to the tasks they have taken on.

These successes are all the more unexpected given that neuroscientists have long been skeptical of comparisons between brains and deep neural networks, whose workings can be inscrutable. “Honestly, nobody in my lab was doing anything with deep nets,” said the MIT neuroscientist Nancy Kanwisher. “Now, most of them are training them routinely.”

Deep Nets and Vision

Artificial neural networks are built with interconnecting components called perceptrons, which are simplified digital models of biological neurons. The networks have at least two layers of perceptrons, one for the input layer and one for the output. Sandwich one or more “hidden” layers between the input and the output and you get a “deep” neural network; the greater the number of hidden layers, the deeper the network.

Deep nets can be trained to pick out patterns in data, such as patterns representing the images of cats or dogs. Training involves using an algorithm to iteratively adjust the strength of the connections between the perceptrons, so that the network learns to associate a given input (the pixels of an image) with the correct label (cat or dog). Once trained, the deep net should ideally be able to classify an input it hasn’t seen before.
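To make the idea of iteratively adjusting connection strengths concrete, here is a minimal Python sketch of a single perceptron trained with the classic perceptron learning rule. It is deliberately much simpler than the deep nets discussed here, which stack many layers and are typically trained with back-propagation; the toy task (the logical AND of two inputs) and all names are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Nudge weights toward the correct label until no sample is misclassified."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, label in samples:
            delta = label - predict(weights, bias, x)
            if delta != 0:
                errors += 1
                # Strengthen or weaken each connection in proportion to its input.
                weights = [w + learning_rate * delta * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * delta
        if errors == 0:
            break  # every training example is now classified correctly
    return weights, bias


# Toy data set: output 1 only when both inputs are 1.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
predictions = [predict(weights, bias, x) for x, _ in AND_SAMPLES]
```

Each pass over the data changes the weights only when a prediction is wrong, which is the same basic principle, at a far smaller scale, as the training loop of a deep net.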

In their general structure and function, deep nets aspire loosely to emulate brains, in which the adjusted strengths of connections between neurons reflect learned associations. Neuroscientists have often pointed out important limitations in that comparison: Individual neurons may process information more extensively than “dumb” perceptrons do, for example, and deep nets frequently depend on a kind of communication between perceptrons called back-propagation that does not seem to occur in nervous systems. Nevertheless, for computational neuroscientists, deep nets have sometimes seemed like the best available option for modeling parts of the brain.

Researchers developing computational models of the visual system have been influenced by what we know of the primate visual system, particularly the ventral visual stream, the pathway responsible for recognizing people, places and things. (A largely separate pathway, the dorsal visual stream, processes information for seeing motion and the positions of things.) In humans, this ventral pathway begins in the eyes and proceeds to the lateral geniculate nucleus in the thalamus, a sort of relay station for sensory information. The lateral geniculate nucleus connects to an area called V1 in the primary visual cortex, downstream of which lie areas V2 and V4, which finally lead to the inferior temporal cortex. (Nonhuman primate brains have homologous structures.)

The key neuroscientific insight is that visual information processing is hierarchical and proceeds in stages: The earlier stages process low-level features in the visual field (such as edges, contours, colors and shapes), whereas complex representations, such as whole objects and faces, emerge only later in the inferior temporal cortex.

Those insights guided the design of the deep net by Yamins and his colleagues. Their deep net had hidden layers, some of which performed a “convolution” that applied the same filter to every portion of an image. Each convolution captured different essential features of the image, such as edges. The more basic features were captured in the early stages of the network and the more complex features in the deeper stages, as in the primate visual system. When a convolutional neural network (CNN) like this one is trained to classify images, it starts off with randomly initialized values for its filters and learns the correct values needed for the task at hand.
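The convolution step, applying the same filter to every portion of an image, can be illustrated with a short Python sketch. The filter below is a hand-set vertical-edge detector chosen for illustration, whereas a CNN would learn its filter values during training; the function name and toy image are assumptions, and, as in most CNN libraries, the filter is applied without flipping.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over every position where it fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Responds strongly where brightness increases from left to right.
VERTICAL_EDGE = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# Toy 4x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(4)]
feature_map = convolve2d_valid(image, VERTICAL_EDGE)
```

The resulting feature map is large where the filter window straddles the dark-to-bright boundary and zero over the uniform bright region, which is the sense in which an early convolutional layer "captures" edges.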

The team’s four-layer CNN could recognize eight categories of objects (animals, boats, cars, chairs, faces, fruits, planes and tables) depicted in 5,760 photo-realistic 3D images. The pictured objects varied greatly in pose, position and scale. Even so, the deep net matched the performance of humans, who are extremely good at recognizing objects despite variation.

Unbeknownst to Yamins, a revolution brewing in the world of computer vision would also independently validate the approach that he and his colleagues were taking. Soon after they finished building their CNN, another CNN called AlexNet made a name for itself at an annual image recognition contest. AlexNet, too, was based on a hierarchical processing architecture that captured basic visual features in its early stages and more complex features at higher stages; it had been trained on 1.2 million labeled images presenting a thousand categories of objects. In the 2012 contest, AlexNet routed all other tested algorithms: By the metrics of the competition, AlexNet’s error rate was only 15.3%, compared to 26.2% for its nearest competitor. With AlexNet’s victory, deep nets became legitimate contenders in the field of AI and machine learning.

Yamins and other members of DiCarlo’s team, however, were after a neuroscientific payoff. If their CNN mimicked a visual system, they wondered, could it predict neural responses to a novel image? To find out, they first established how the activity in sets of artificial neurons in their CNN corresponded to activity in almost 300 sites in the ventral visual stream of two rhesus macaques.

Then they used the CNN to predict how those brain sites would respond when the monkeys were shown images that weren’t part of the training data set. “Not only did we get good predictions … but also there’s a kind of anatomical consistency,” Yamins said: The early, intermediary and late-stage layers of the CNN predicted the behaviors of the early, intermediary and higher-level brain areas, respectively. Form followed function.

Kanwisher remembers being impressed by the result when it was published in 2014. “It doesn’t say that the units in the deep network individually behave like neurons biophysically,” she said. “Nonetheless, there is shocking specificity in the functional match.”

Specializing for Sounds

After the results from Yamins and DiCarlo appeared, the hunt was on for other, better deep-net models of the brain, particularly for regions less well studied than the primate visual system. For example, “we still don’t really have a very good understanding of the auditory cortex, particularly in humans,” said Josh McDermott, a neuroscientist at MIT. Could deep learning help generate hypotheses about how the brain processes sounds?

That’s McDermott’s goal. His team, which included Alexander Kell and Yamins, began designing deep nets to classify two types of sounds: speech and music. First, they hard-coded a model of the cochlea — the sound-transducing organ in the inner ear, whose workings are understood in great detail — to process audio and sort the sounds into different frequency channels as inputs to a convolutional neural network. The CNN was trained both to recognize words in audio clips of speech and to recognize the genres of musical clips mixed with background noise. The team searched for a deep-net architecture that could perform these tasks accurately without needing a lot of resources.

Three sets of architectures seemed possible. The deep net’s two tasks could share only the input layer and then split into two distinct networks. At the other extreme, the tasks could share the same network for all their processing and split only at the output stage. Or it could be one of the dozens of variants in between, where some stages of the network were shared and others were distinct.

Unsurprisingly, the networks that had dedicated pathways after the input layer outdid the networks that fully shared pathways. However, a hybrid network — one with seven common layers after the input stage and then two separate networks of five layers each — did almost as well as the fully separate network. McDermott and colleagues chose the hybrid network as the one that worked best with the least computational resources.
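One rough way to see the trade-off is to count layers. The sketch below is an illustration only. It assumes that every variant is 12 layers deep per task (matching the hybrid's seven shared plus five task-specific layers) and that every layer costs the same; the study does not state either assumption, and `total_layers` is a hypothetical helper.

```python
def total_layers(shared, depth=12, tasks=2):
    """Count the layers in a network whose first `shared` layers serve all tasks.

    Assumes each task needs `depth` layers in total and every layer costs the same.
    """
    if not 0 <= shared <= depth:
        raise ValueError("shared must be between 0 and depth")
    return shared + tasks * (depth - shared)


fully_separate = total_layers(shared=0)   # split right after the input
hybrid = total_layers(shared=7)           # seven common layers, then 2 x 5
fully_shared = total_layers(shared=12)    # split only at the output

print(fully_separate, hybrid, fully_shared)
```

Under these assumptions the hybrid needs 17 layers against 24 for the fully separate design, which gives a feel for why a network that performs almost as well could be the cheaper choice.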

When they pitted that hybrid network against humans in these tasks, it matched up well. It also matched up to earlier results from a number of researchers that suggested the non-primary auditory cortex has distinct regions for processing music and speech. And in a key test published in 2018, the model predicted the brain activity in human subjects: The model’s intermediate layers anticipated the responses of the primary auditory cortex, and deeper layers anticipated higher areas in the auditory cortex. These predictions were substantially better than those of models not based on deep learning.

“The goal of the science is to be able to predict what systems are going to do,” said McDermott. “These artificial neural networks get us closer to that goal in neuroscience.”

Kanwisher, initially skeptical of deep learning’s usefulness for her own research, was inspired by McDermott’s models. Kanwisher is best known for her work in the mid-to-late 1990s showing that a region of the inferior temporal cortex called the fusiform face area (FFA) is specialized for the identification of faces. The FFA is significantly more active when subjects stare at images of faces than when they’re looking at images of objects such as houses. Why does the brain segregate the processing of faces from that of other objects?

Traditionally, answering such “why” questions has been hard for neuroscience. So Kanwisher, along with her postdoc Katharina Dobs and other colleagues, turned to deep nets for help. They used a computer-vision successor to AlexNet, a much deeper convolutional neural network called VGG, and trained two separate deep nets on specific tasks: recognizing faces and recognizing objects.

The team found that the deep net trained to recognize faces was bad at recognizing objects and vice versa, suggesting that these networks represent faces and objects differently. Next, the team trained a single network on both tasks. They found that the network had internally organized itself to segregate the processing of faces and objects in the later stages of the network. “VGG spontaneously segregates more at the later stages,” Kanwisher said. “It doesn’t have to segregate at the earlier stages.”

This agrees with the way the human visual system is organized: Branching happens only downstream of the shared earlier stages of the ventral visual pathway (the lateral geniculate nucleus and areas V1 and V2). “We found that functional specialization of face and object processing spontaneously emerged in deep nets trained on both tasks, like it does in the human brain,” said Dobs, who is now at Justus Liebig University in Giessen, Germany.

“What’s most exciting to me is that I think we have now a way to answer questions about why the brain is the way it is,” Kanwisher said.

Layers of Scents

More such evidence is emerging from research tackling the perception of smells. Last year, the computational neuroscientist Robert Yang and his colleagues at Columbia University designed a deep net to model the olfactory system of a fruit fly, which has been mapped in great detail by neuroscientists.

The first layer of odor processing involves olfactory sensory neurons, each of which expresses only one of about 50 types of odor receptors. All the sensory neurons of the same type, about 10 on average, reach out to a single nerve cluster in the next layer of the processing hierarchy. Because there are about 50 such nerve clusters on each side of the brain in this layer, this establishes a one-to-one mapping between types of sensory neurons and corresponding nerve clusters. The nerve clusters have multiple random connections to neurons in the next layer, called the Kenyon layer, which has about 2,500 neurons, each of which receives about seven inputs. The Kenyon layer is thought to be involved in high-level representations of the odors. A final layer of about 20 neurons provides the output that the fly uses to guide its smell-related actions (Yang cautions that no one knows whether this output qualifies as classification of odors).
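The wiring described above can be sketched in a few lines of Python. This is a toy reconstruction from the approximate numbers in the text, not a model from Yang's paper. The constant names, the function `build_wiring` and the choice to draw each Kenyon neuron's seven inputs uniformly from distinct clusters are assumptions made for illustration.

```python
import random

NUM_RECEPTOR_TYPES = 50   # odor receptor types, and nerve clusters per side
NUM_KENYON = 2500         # neurons in the Kenyon layer
INPUTS_PER_KENYON = 7     # approximate inputs per Kenyon neuron


def build_wiring(seed=0):
    """Return (receptor_to_cluster, kenyon_inputs) for one side of the brain."""
    rng = random.Random(seed)
    # One-to-one: sensory neurons of type i all converge on cluster i.
    receptor_to_cluster = {i: i for i in range(NUM_RECEPTOR_TYPES)}
    # Sparse and random: each Kenyon neuron samples a few distinct clusters
    # (uniform sampling is an assumption of this sketch).
    kenyon_inputs = [
        rng.sample(range(NUM_RECEPTOR_TYPES), INPUTS_PER_KENYON)
        for _ in range(NUM_KENYON)
    ]
    return receptor_to_cluster, kenyon_inputs


receptor_to_cluster, kenyon_inputs = build_wiring()
print(len(kenyon_inputs), kenyon_inputs[0])
```

The two mappings mirror the two stages in the text: a fixed one-to-one step followed by a sparse, random fan-out.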

To see if they could design a computational model to mimic this process, Yang and colleagues first created a data set to mimic smells, which don’t activate neurons in the same way as images. If you superimpose two images of cats, adding them pixel by pixel, the resulting image may look nothing like a cat. However, if you mix the odors from two apples, the result will likely still smell like an apple. “That’s a critical insight that we used to design our olfaction task,” said Yang.

They built their deep net with four layers: three that modeled processing layers in the fruit fly and an output layer. When Yang and colleagues trained this network to classify the simulated odors, they found that the network converged on much the same connectivity as seen in the fruit fly brain: a one-to-one mapping from layer 1 to layer 2, and then a sparse and random (7-to-1) mapping from layer 2 to layer 3.

This similarity suggests that both evolution and the deep net have reached an optimal solution. But Yang remains wary about their results. “Maybe we just got lucky here, and maybe it doesn’t generalize,” he said.

The next step in testing will be to evolve deep networks that can predict the connectivity in the olfactory system of some animal not yet studied, which can then be confirmed by neuroscientists. “That will provide a much more stringent test of our theory,” said Yang, who will move to MIT in July 2021.

Not Just Black Boxes

Deep nets are often derided for being unable to generalize to data that strays too far from the training data set. They’re also infamous for being black boxes. It’s impossible to explain a deep net’s decisions by examining the millions or even billions of parameters shaping it. Isn’t a deep-net model of some part of the brain merely replacing one black box with another?

Not quite, in Yang’s opinion. “It’s still easier to study than the brain,” he said.

Last year, DiCarlo’s team published results that took on both the opacity of deep nets and their alleged inability to generalize. The researchers used a version of AlexNet to model the ventral visual stream of macaques and figured out the correspondences between the artificial neuron units and neural sites in the monkeys’ V4 area. Then, using the computational model, they synthesized images that they predicted would elicit unnaturally high levels of activity in the monkey neurons. In one experiment, when these “unnatural” images were shown to monkeys, they elevated the activity of 68% of the neural sites beyond their usual levels; in another, the images drove up activity in one neuron while suppressing it in nearby neurons. Both results were predicted by the neural-net model.

To the researchers, these results suggest that the deep nets do generalize to brains and are not entirely unfathomable. “However, we acknowledge that … many other notions of ‘understanding’ remain to be explored to see whether and how these models add value,” they wrote.

The convergences in structure and performance between deep nets and brains do not necessarily mean that they work the same way; there are ways in which they demonstrably do not. But it may be that there are enough similarities for both types of systems to follow the same broad governing principles.

Limitations of the Models

McDermott sees potential therapeutic value in these deep net studies. Today, when people lose hearing, it’s usually due to changes in the ear. The brain’s auditory system has to cope with the impaired input. “So if we had good models of what the rest of the auditory system was doing, we would have a better idea of what to do to actually help people hear better,” McDermott said.

Still, McDermott is cautious about what the deep nets can deliver. “We have been pushing pretty hard to try to understand the limitations of neural networks as models,” he said.

In one striking demonstration of those limitations, the graduate student Jenelle Feather and others in McDermott’s lab focused on metamers, which are physically distinct input signals that produce the same representation in a system. Two audio metamers, for example, have different wave forms but sound the same to a human. Using a deep-net model of the auditory system, the team designed metamers of natural audio signals; these metamers activated different stages of the neural network in the same way the audio clips did. If the neural network accurately modeled the human auditory system, then the metamers should sound the same, too.
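The definition of a metamer can be illustrated with a toy example. The sketch below does not use the lab's deep-net model. The "model stage" here is a deliberately crude, hypothetical representation (`band_energy`) that keeps only the energy in each short window of samples, so any two signals with the same window energies are metamers for it.

```python
def band_energy(signal, window=4):
    """Toy 'model stage': keep only the energy in each window of samples."""
    return [
        sum(x * x for x in signal[i:i + window])
        for i in range(0, len(signal), window)
    ]


original = [1, -2, 3, 0, 2, 2, -1, 1]
# Flip the sign of the second window: a physically different waveform.
candidate = original[:4] + [-x for x in original[4:]]

print(original != candidate)                              # different inputs
print(band_energy(original) == band_energy(candidate))   # same representation
```

Both lines print `True`: the inputs differ, yet the toy representation cannot tell them apart. Whether a human listener would also hear them as the same is a separate empirical question, and that is what the experiment tested.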

But that’s not what happened. Humans recognized the metamers that produced the same activation as the corresponding audio clips in the early stages of the neural network. However, this did not hold for metamers with matching activations in the deeper stages of the network: those metamers sounded like noise to humans. “So even though under certain circumstances these kinds of models do a very good job of replicating human behavior, there’s something that’s very wrong about them,” McDermott said.

At Stanford, Yamins is exploring ways in which these models are not yet representative of the brain. For instance, many of these models need loads of labeled data for training, while our brains can learn effortlessly from as little as one example. Efforts are underway to develop unsupervised deep nets that can learn as efficiently. Deep nets also learn using an algorithm called backpropagation, which most neuroscientists think cannot work in real neural tissue because the tissue lacks the appropriate connections. “There’s been some big progress made in terms of somewhat more biologically plausible learning rules that actually do work,” Yamins said.

Josh Tenenbaum, a cognitive neuroscientist at MIT, said that while all these deep-net models are “real steps of progress,” they are mainly doing classification or categorization tasks. Our brains, however, do much more than categorize what’s out there. Our vision system can make sense of the geometry of surfaces and the 3D structure of a scene, and it can reason about underlying causal factors — for example, it can infer in real time that a tree has disappeared only because a car has passed in front of it.

To understand this ability of the brain, Ilker Yildirim, formerly at MIT and now at Yale University, worked with Tenenbaum and colleagues to build something called an efficient inverse graphics model. It begins with parameters that describe a face to be rendered on a background, such as its shape, its texture, the direction of lighting, the head pose and so on. A computer graphics program called a generative model creates a 3D scene from the parameters; then, after various stages of processing, it produces a 2D image of that scene as viewed from a certain position. Using the 3D and 2D data from the generative model, the researchers trained a modified version of AlexNet to predict the likely parameters of a 3D scene from an unfamiliar 2D image. “The system learns to go backwards from the effect to the cause, from the 2D image to the 3D scene that produced it,” said Tenenbaum.

The team tested their model by verifying its predictions about activity in the inferior temporal cortex of rhesus macaques. They presented macaques with 175 images, showing 25 individuals in seven poses, and recorded the neural signatures from “face patches,” visual processing areas that specialize in face recognition. They also showed the images to their deep learning network. In the network, the activation of the artificial neurons in the first layer represents the 2D image and the activation in the last layer represents the 3D parameters. “Along the way, it goes through a bunch of transformations, which seem to basically get you from 2D to 3D,” Tenenbaum said. They found that the last three layers of the network corresponded remarkably well to the last three layers of the macaques’ face processing network.

This suggests that brains use combinations of generative and recognition models not just to recognize and characterize objects but to infer the causal structures inherent in scenes, all in an instant. Tenenbaum acknowledges that their model doesn’t prove that the brain works this way. “But it does open the door to asking those questions in a more fine-grained mechanistic way,” he said. “It should be … motivating us to walk through it.”

Editor’s note: Daniel Yamins and James DiCarlo receive research funding from the Simons Collaboration on the Global Brain, which is part of the Simons Foundation, the organization that also funds this editorially independent magazine. Simons Foundation funding decisions have no bearing on Quanta’s coverage.


via Quanta Magazine

October 29, 2020 at 12:12AM

The Economics of Coffee in One Chart

Breaking Down the Economics of Coffee

What goes into your morning cup of coffee, and what makes it possible?

The obvious answer might be coffee beans, but when you start to account for additional costs, the scope of a massive $200+ billion coffee supply chain becomes clear.

From the labor of growing, exporting, and roasting the coffee plants to the materials like packaging, cups, and even stir sticks, there are many underlying costs that factor into every cup of coffee consumed.

The above graphic breaks down the costs incurred by retail coffee production for one pound of coffee, equivalent to about 15 cups of 16-ounce brewed coffee.

The Difficulty of Pricing Coffee

Measuring and averaging out a global industry is a complicated ordeal.

Not only do global coffee prices constantly fluctuate, but each country also has differences in availability, relative costs, and the final price of a finished product.

That’s why a 16 oz cup of brewed coffee in the U.S. doesn’t cost the same as one in the U.K., Japan, or anywhere else in the world. Even within a country, differences in companies’ access to wholesale beans affect the final price.

To counteract these discrepancies, the infographic above uses figures sourced from the Specialty Coffee Association (SCA), which are illustrative but based on the organization’s Benchmarking Report and Coffee Price Report.

What they end up with is an estimated set price of $2.80 for a brewed cup of coffee at a specialty coffee store. Each store and indeed each country will see a different price, but that gives us the foundation to start backtracking and breaking down the total costs.

From Growing Beans to Exporting Bags

To make coffee, you must have the right conditions to grow it.

The two major types of coffee, Arabica and Robusta, are produced primarily in subequatorial countries. The plants originated in Ethiopia, were first grown in Yemen in the 1600s, then spread around the world by way of European colonialism.

Today, Brazil is far and away the largest producer and exporter of coffee, with Vietnam the only other country accounting for a double-digit percentage of global production.

Country | Coffee Production (60 kg bags) | Share of Global Coffee Production
Brazil | 64,875,000 | 37.5%
Vietnam | 30,024,000 | 17.4%
Colombia | 13,858,000 | 8.0%
Indonesia | 9,618,000 | 5.6%
Ethiopia | 7,541,000 | 4.4%
Honduras | 7,328,000 | 4.2%
India | 6,002,000 | 3.5%
Uganda | 4,704,000 | 2.7%
Peru | 4,263,000 | 2.5%
Other | 24,629,000 | 14.2%

How much money do growers make on green coffee beans? Prices fluctuate from year to year, and have ranged from below $0.50/lb in 2001 to above $2.10/lb in 2011.

But if you’re looking for the money in coffee, you won’t find it at the source. Fairtrade estimates that 125 million people worldwide depend on coffee for their livelihoods, but many of them are unable to earn a reliable living from it.

Instead, one of the biggest profit margins is made by the companies exporting the coffee. In 2018 the ICO Composite price (which tracks both Arabica and Robusta coffee prices) averaged $1.09/lb, while the SCA lists exporters as charging a price of $3.24/lb for green coffee.

Roasting Economics

Roasters might be charged $3.24/lb for green coffee beans from exporters, but that’s far from the final price they pay.

First, beans have to be imported, which adds $0.31/lb in shipping and importer fees. Once the actual roasting begins, the cost of labor and certification and the inevitable losses along the way add another $1.86/lb before general business expenses.

By the end of it, roasters see a total illustrated cost of $8.73/lb.

Roaster Economics | $/lb
Sales Price | $9.40
Total Cost | $8.73
Pre-tax Profit | $0.67
Taxes | $0.23
Net Profit | $0.44
Net Profit (%) | 7.1%

When it comes to their profit margin, roasters quote a selling price of around $9.40/lb. After taxes, roasters see a net profit of roughly $0.44/lb or 7.1%.
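The roaster figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative. One caveat: dividing by the sales price, $0.67 is about 7.1% and $0.44 is about 4.7%, so the 7.1% figure appears to match the pre-tax margin. That reading is an inference from the arithmetic, not something the SCA figures state.

```python
sales_price = 9.40   # $/lb, roaster's selling price
total_cost = 8.73    # $/lb, illustrated total cost
taxes = 0.23         # $/lb

pre_tax_profit = round(sales_price - total_cost, 2)
net_profit = round(pre_tax_profit - taxes, 2)

# Margins expressed as a share of the sales price (an assumption about the base).
pre_tax_margin = round(100 * pre_tax_profit / sales_price, 1)
net_margin = round(100 * net_profit / sales_price, 1)

print(pre_tax_profit, net_profit)    # 0.67 0.44
print(pre_tax_margin, net_margin)    # 7.1 4.7
```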

Retail Margins

For consumers purchasing quality roasted coffee beans directly through distributors, a price of $14.99 or more for a 1 lb bag of whole roasted coffee is standard. Retailers, however, are able to access coffee closer to the stated wholesale prices and add their own costs to the equation.

For a store, one pound of roasted coffee beans translates into about 15 cups of 16-ounce (475 ml) brewed coffee. At a price of $2.80/cup, that yields $42.00/lb of coffee.

That doesn’t sound half bad until you start to factor in the costs. Material costs include the coffee itself, the cups and lids (often charged separately), the stir sticks and even the condiments. After all, containers of half-and-half and ground cinnamon don’t pay for themselves.

Together, these add up to a retail material cost of $13.00/lb. That still leaves a healthy gross profit of $29.00/lb, but running a retail store is an expensive business. Add the costs of operations, including labor, leasing, marketing, and administrative costs, and the total costs quickly ramp up to $35.47/lb.

In fact, when accounting for additional costs for interest and taxes, the SCA figures give retailers a net profit of $2.90/lb or 6.9%, slightly less than that of roasters.
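The retail chain of numbers can be reproduced the same way. This is a minimal sketch using only the figures quoted above, with the net margin taken as a share of revenue per pound; the variable names are illustrative.

```python
cups_per_lb = 15
price_per_cup = 2.80
material_cost = 13.00   # $/lb: coffee, cups, lids, stir sticks, condiments
net_profit = 2.90       # $/lb, as given in the SCA figures

revenue = round(cups_per_lb * price_per_cup, 2)
gross_profit = round(revenue - material_cost, 2)
net_margin = round(100 * net_profit / revenue, 1)

print(revenue, gross_profit, net_margin)   # 42.0 29.0 6.9
```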

A Massive Global Industry

Coffee production is a big industry for one reason: coffee consumption is truly a universal affair with 2.3 million cups of coffee consumed globally every minute. By total volume sales, coffee is the fourth most-consumed beverage in the world.

That makes the retail side of the market a major factor. Dominated by companies like Nestlé and Jacobs Douwe Egberts, global retail coffee sales in 2017 reached $83 billion, with an average yearly expenditure of $11 per capita globally.

Of course, some countries are bigger coffee drinkers than others. The largest global consumers by tonnage are the U.S. and Brazil (the latter despite also being the largest producer and exporter), but per capita consumption is significantly higher in European countries like Norway and Switzerland.

The next time you sip your coffee, consider the multilayered and vast global supply chain that makes it all possible.

The post The Economics of Coffee in One Chart appeared first on Visual Capitalist.


via Visual Capitalist

October 25, 2020 at 03:17AM

CRISPR weapon spread by bacterial sex could destroy deadly superbugs

By Michael Le Page

Illustration of carbapenem-resistant Acinetobacter sp. bacteria


Bacteria armed with a CRISPR-based weapon that infects other microbes during the bacterial equivalent of sex could help us kill off dangerous antibiotic-resistant superbugs – if regulators approve their use. While the approach has huge promise, its reliance on genetically engineered bacteria is likely to be controversial.

“We would be releasing genetically modified killing machines into the environment. What could go wrong?” says David Edgell at Western University in Canada.

There are two main problems with conventional antibiotic drugs. First, they often kill beneficial bacteria along with dangerous ones and disrupt microbiomes. This …


via New Scientist – Health

October 26, 2020 at 06:19PM

Long covid: Why are some people sick months after catching the virus?

By Jessica Hamzelou

Vanessa Branchi

THE argument for naturally obtained herd immunity as a solution to the coronavirus pandemic has made a return in recent weeks. But letting the virus spread among younger people, who are less likely to die from covid-19, could lead to devastating consequences. Estimates suggest that there could already be millions of people around the world living with “long covid” – what appears to be a debilitating syndrome that follows a coronavirus infection.

As personal stories of long-term problems accumulate, researchers and health bodies are learning more about what might cause these long-lasting symptoms, and how best to treat them. …


via New Scientist – Health

October 30, 2020 at 03:30AM

More on a world rate of profit

Back in July, I wrote a post on a new approach to a world rate of profit and how to measure it.  I won’t go over the arguments again as you can read that post and previous ones on the subject.  But in that July post, I said I would follow up on the decomposition of the world rate of profit and the factors driving it.  And I would try to relate the change in the rate of profit to the regularity and intensity of crises in the capitalist mode of production. And I would consider the question of whether, if there is a tendency for the rate of profit to fall as Marx argued, it could reach zero eventually; and what does that tell us about capitalism itself?  I am not sure I can answer all those points in this post, but here goes.

First, let me repeat the results of the measurement of a world rate of profit offered in the July post.  Based on data now available in Penn World Tables 9.1 (IRR series), I calculated that the average (weighted) rate of profit on fixed assets for the top G20 economies from 1950 to 2017 (latest data) looked like this in the graph below.

Source: Penn World Tables, author’s calculations

I have divided the series into four periods that I think define different situations in the world capitalist economy. There is the ‘golden age’ immediately after WW2 where profitability is high and even rising. Then there is the now well documented (and not disputed) collapse in the rate of profit from the mid-1960s to the global slump of the early 1980s. Then there is the so-called neoliberal recovery where profitability recovers, but peaks in the late 1990s at a level still well below the golden age. And finally, there is the period that I call the Long Depression where profitability heads back down, with a jerk up from the mild recession of 2001 to 2007, just before the Great Recession. Recovery in profitability since the end of the GR has been minuscule.

So Marx’s law of profitability is justified empirically. But is it justified theoretically? Could there be other reasons for the secular fall in profitability than those proposed by Marx? Marx’s theory was that capitalists competing with each other to increase profits and gain market share would try to undercut their rivals by reducing costs, particularly labour costs. So investment in machinery and technology would be aimed at shedding labour – machines to replace workers. But as new value depends on labour power (machines do not create value without labour power), there would be a tendency for new value (and particularly surplus value) to fall relative to the increase in investment in machinery and plant (constant capital in Marx’s terms).

So over time, there would be a rise in constant capital relative to investment in labour (variable capital), i.e. a rise in the organic composition of capital (OCC). This was the key tendency in Marx’s law of profitability. This tendency could be counteracted if capitalists could force up the rate of exploitation (or surplus value) from the employed workforce. Thus if the organic composition of capital rises more than the rate of surplus value, the rate of profit will fall – and vice versa. If this applies to the rate of profit as measured, it lends support to Marx’s explanation of the falling rate of profit since 1950.
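The relationship can be written out in the standard textbook notation, which is an assumption here since the post does not define symbols: $s$ for surplus value, $c$ for constant capital and $v$ for variable capital. The measure used in this post is a rate of profit on fixed assets, so it may not map onto these symbols exactly.

```latex
r = \frac{s}{c + v} = \frac{s/v}{c/v + 1}
```

Here $s/v$ is the rate of surplus value and $c/v$ is the organic composition of capital. The rate of profit $r$ falls whenever $c/v + 1$ grows proportionally faster than $s/v$, and rises in the opposite case.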

Well, here is a graph of the decomposition of the rate of profit for the G20 economies. The graph shows that the long-term decline in profitability is matched by a long-term rise in the OCC. So Marx’s main explanation for a falling rate of profit, namely a rise in the organic composition of capital, is supported.

Source: Penn World Tables, author’s calculations

What about the rate of surplus value?  If that rises faster than the OCC, the rate of profit should rise and vice versa.  Well, here are the variables broken down into the four periods I described above.  They show the percentage change in each period.

Source: Penn World Tables, author’s calculations

For the whole period 1950-2017, the G20 rate of profit fell over 18%, the organic composition of capital rose 12.6% and the rate of surplus value actually fell over 8%. In the golden age, the rate of profit rose 11%, because the rate of surplus value rose more (16%) than the OCC (4%). In the profitability crisis of 1966-82, the rate of profit plummeted 35% because, although the OCC also fell 6%, the rate of surplus value dropped 38%. In the neoliberal recovery period, the rate of profit rose 24% because although the OCC rose 11%, the rate of surplus value rose 37% (a real squeeze on workers’ wages and conditions). In the final period since 1997 when the rate of profit fell 10% to 2017, the OCC rose a little (4%) but the rate of surplus value dropped a little (7%).

These results confirm Marx’s law as an appropriate explanation of the movement in the world rate of profit since 1950 – I know of no alternative explanation that explains this better.

So will the rate of profit eventually fall to zero and what does that mean?  Well, if the current rate of secular fall in the G20 economies continues, it is going to take a very long time to reach zero – well into the next century!  Among the G7 economies, however, if the average annual fall in profitability experienced in the last 20 years or so is continued, then the G7 rate will reach zero by 2050.  But of course, there could be a new period of revival in the rate of profit, probably driven by the destruction of capital values in a deep slump and by a severe restriction on labour’s share of value by reactionary governments.

Nevertheless, what the secular fall in the profitability of capital does tell you is that capitalism is hopelessly unable to develop the productive forces and take billions out of poverty and towards a world of abundance and harmony with nature. Capitalism as a system is already past its sell-by date.

Finally, can we relate falling profitability to regular and recurring crises of production and investment in capitalism? In my book, Marx 200, I explain that connection and in the July post I showed a close correlation between falling profitability of capital and a fall in the total mass of profits. Marx argued that, as average profitability of capital in an economy falls, capitalists compensate for this by increasing investment and production to boost the mass of profit. He called this a double-edged law: falling profitability and rising profits. However, at a certain point, such is the fall in profitability that the mass of profits stops rising and starts to fall – this is the crux point for the beginning of an ‘investment strike’ leading to a slump in production, employment and eventually incomes and workers’ spending. Only when there is a sufficient reduction in costs for capitalists, bringing about a rise in profitability and profits, will the ‘business cycle’ resume.

What is happening right now?  Well, as we have seen above, global profitability was already at a low point in 2017 and still below the pre-Great Recession peak.  By any measured guess, it was even lower in 2019.  And I have updated my measure of the mass of profits in the corporate sector of the major economies (US, UK, Germany, Japan, China).  Even before the pandemic broke and the lockdowns began, global corporate profits had turned negative, suggesting a slump was on its way in 2020 anyway.

We read about the huge profits that the large US tech and online distribution companies (FAANGS) are making.  But they are the exception.  Vast swathes of corporations (large and small) globally are struggling to sustain profit levels as profitability stays low and/or falls. Now the pandemic slump has driven global corporate profits down by around 25% in the first half of 2020 – a bigger fall than in the Great Recession.

Source: National accounts, author’s calculations

Profits recovered fast after the Great Recession.  It may not be so quick this time.


via Michael Roberts Blog

September 20, 2020 at 06:42PM

After the Coronavirus (The After Time), by Michael Shermer (1/2)

(Michael Shermer, American Scholar)

In the summer of 2020, as I write this, I sometimes think the world resembles the one Herman Melville described in his novel.

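“All that most maddens and torments; all that stirs up the lees of things; all truth with malice in it; all that cracks the sinews and cakes the brain; all the subtle demonisms of life and thought; all evil, to crazy Ahab, were visibly personified, and made practically assailable in Moby Dick. He piled upon the whale’s white hump the sum of all the general rage and hate felt by his whole race from Adam down; and then, as if his chest had been a mortar, he burst his hot heart’s shell upon it.”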

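Who would not feel madness and torment when they cannot even say a last goodbye to a family member drawing a final breath from COVID-19 in an intensive care unit? Who would not feel their sinews crack and their brain cake after months of isolation, with their thoughts and opinions silenced and their social lives restricted?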

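The further into the future we try to look, the blurrier our vision becomes and the thicker the fog of uncertainty grows. What will 2020 mean in 2030? In 2050, 30 years from now? In 2120, a century from now? Even superforecasters trained in Bayesian reasoning and big-data analysis do no better than a coin toss when predicting more than five years ahead. And I am no superforecaster. When I was asked to forecast the future of the planet, I thought of the preface that Will and Ariel Durant wrote for The Lessons of History: “It is a precarious enterprise, and only a fool would try to compress a hundred centuries into a hundred pages of hazardous conclusions. We proceed.”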

The Before Time and the After Time

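In the 1966 Star Trek episode “Miri,” the title character, a young girl, tells a bewildered Captain Kirk that all the adults on her planet have died and only the children remain. “In the Before Time, the grown-ups started to get sick. We hid, and they all died.” The linguist Ben Zimmer, who traced the origins of “Before Time,” says the term often refers to the world before a plague and is old enough to appear in the King James Version of the Book of Samuel: “Beforetime in Israel, when a man went to enquire of God, thus he spake, Come, and let us go to the seer: for he that is now called a Prophet was beforetime called a Seer.” Marina Koren, a columnist at The Atlantic, says COVID-19 has revived this old term: “Nostalgia for the world before the coronavirus swept the country makes the period people call ‘the Before Time’ feel like the distant past.”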

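If the “Before Time” means the world before the apocalypse, there should also be a term, the “After Time,” that foretells what follows once the apocalyptic era ends. “Apocalyptic” often implies the complete destruction of the world, but the original meaning of the Greek word behind it is “revelation,” or “uncovering something previously unknown.” In that sense, I want to find out what this period will reveal to us by clearing away the factors that get in the way of such predictions. Those factors are the reasons Yogi Berra said, “It’s tough to make predictions, especially about the future.” Here are four of them.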

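The first is the availability heuristic: we estimate future probabilities based on what we already know, especially what is emotionally familiar and comes easily to mind. For example, because we hear about plane crashes so often, we overestimate the chance of dying in one. The second is the negativity bias: we are more sensitive to risks than to rewards, and to negative stimuli than to positive ones. The third is the point made by Philip Tetlock and Dan Gardner in Superforecasting: when the predictions of most so-called experts were checked against what actually happened, they turned out to be no better than a monkey throwing darts. The experts were overconfident and did not properly check their own predictions (this is also known as confirmation bias), and for all their scientific trappings they were subject to the same cognitive biases and illusions as everyone else. The fourth is perhaps the biggest obstacle to forecasting: the world is both highly contingent and chaotic. At particular inflection points, a small chance event can change the entire course of history, and such events are all but impossible to predict.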

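The question is whether these factors will skew how we predict the future from the vantage point of 2020. Is COVID-19 powerful enough to pull humanity in an entirely new direction? Or will this, too, be washed away by the waves of history, as has always happened before?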

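Most of the beneficial social changes humanity has achieved came not through violent revolution or destructive upheaval but through gradual change built on existing institutions. In The Moral Arc, published in 2015, I examined how progress came about across a range of areas, from politics and economics to civil rights and criminal justice, war and civility, governance and violent crime. In nearly every case, a gradual, systematic approach to problem solving was by far the more successful way to build a safer and more equal society. Can that trend continue through the coronavirus era and beyond? Let us look for an answer.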

Superforecasting COVID-19

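Even Anthony Fauci, the best forecaster we have (director of the NIH's infectious disease institute for 36 years and the leading authority on the coronavirus; translator's note), does not know how this pandemic will end. First, it matters how the virus evolves: whether a less deadly variant emerges or (less likely) a more lethal one. So far it has not changed much genetically from the form first identified in December 2019, which is good news for vaccine development. The death rate is also falling, which evolutionary theory can explain: it benefits the virus not to kill its host too quickly. In fact, from the virus's point of view, the best outcome is for an infected person to show no symptoms for several weeks, and that is what is happening now.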

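If a vaccine is developed and manufactured, and if billions of people receive it within 12 to 18 months, a return to our former lives is possible, as the optimists hope. This future is quite plausible because many government agencies, public institutions, and private companies are working toward it. On the other hand, there are viruses such as HIV that remain highly lethal and for which no vaccine has been developed. There are others, like the influenza virus, that keep mutating, so a new vaccine has to be made each time. It is also possible that vaccine opponents will keep us from reaching herd immunity, leaving the virus in circulation.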

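Either way, SARS-CoV-2 and its variants are unlikely to disappear completely, because the complete disappearance of a particular virus is very rare in nature. (Smallpox is the exception.) And even if the coronavirus did vanish, there are countless other viruses that carry deadly diseases, some perhaps deadlier than COVID-19, which means we must keep working to prevent future pandemics regardless of how the fight against this virus goes. Given all this, what will we look like in the near future, and in the distant future?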

The Economy and Industry

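As with every crisis in recorded history, the economy will eventually recover. But with more than $2 trillion newly printed, there is a real possibility of economy-wrecking inflation. Stimulus on this scale is unprecedented, and recovering from it will take years. Does that mean the economy will be ruined in the end? I don't think so. When a friend worried that American independence would ruin Britain's economy, Adam Smith replied, “There is a great deal of ruin in a nation.”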

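Even if the worst-case scenario following inflation does not come to pass, businesses that were barely breaking even may close for good: small schools without sufficient endowments, small and midsize religious institutions, small newspapers, magazines, and media companies, and department stores and shopping malls that have been unable to reopen. This will be hard on many people, but in the long run it is not simply bad news. It may be a somewhat accelerated version of what the Austrian economist Joseph Schumpeter called “creative destruction,” the process by which existing industries are destroyed so that new ones can take hold.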

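Amazon is already reaping the benefits of the new era. When a market opens up, competitors enter, and we are watching rivals such as Walmart and Target try to make inroads into Bezos's empire. And who knows whether, at this very moment, someone in a garage is developing the technology that will become the next Apple, Google, or Amazon? Monopolies rarely last long. Meanwhile, abandoned department stores, shopping malls, and other buildings can be converted into warehouses, fitness clubs, hospitals, museums, and even apartments, and that shift is already under way. Of the 1,500 shopping malls built since the 1950s, about 500 have closed. Some 60 of those have been remodeled into buildings that provide housing and offices, and 75 are being redeveloped. A mall in Lakewood, Colorado, is a good example. Closed in 2000, it is being redeveloped as 22 blocks that include a park of about 11,000 pyeong (roughly 36,000 square meters), with a total of about 8,400 pyeong (roughly 28,000 square meters) of offices and apartments that can house 2,000 people. The malls that close because of the coronavirus will create new markets in the same way.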

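Until Andrew Yang championed it in the 2020 presidential primaries, universal basic income was little more than an idea. During the primaries, hardly anyone imagined that the government would send checks to tens of millions of people to make up for lost income. Yet within weeks of his dropping out of the race, exactly that happened. Whether this kind of support becomes established government policy will depend on what economic analysis of the relief payments shows.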

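Once travel restrictions are lifted and airlines find ways to carry passengers both cost-effectively and safely, business travel, which has all but vanished, will resume. But it is unlikely to return to its pre-coronavirus level. When nearly all communication can happen virtually, why bother to physically move the atoms you are made of?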

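As more of our lives move online, resistance to and regulation of newly in-demand services such as videoconferencing and remote communication will diminish. Some activities will of course still happen face to face, but with even contracts and legal documents now handled remotely, there will not be many of them.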

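Doctors and patients, too, will shift more and more to online tools such as telemedicine and virtual reality, at first out of necessity and later out of convenience. No one wants to drive to the doctor's office, check in, and sit through a long wait for a consultation that lasts an average of 17 minutes. Most treatment cannot be delivered remotely, of course, but any approach that eases the burden even slightly, for doctor or patient, will be readily adopted.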



via NewsPeppermint

September 4, 2020 at 07:56AM