Reverse Engineering / Forward Simulation: REFS™

Some companies aggregate databases of disease pathways from the scientific literature. Others go a step further, assembling these disease pathways to form in silico networks of disease, and then use simulation to predict outcomes based on these pathways.

These aggregation approaches are limited by what actually appears in the published literature, which covers at most 5 to 10 percent of the biological circuitry of human cells. Within this limited body of information, those molecular variables that are most obviously altered in disease or in response to a drug are predominant. Thus, the molecular variables that may change subtly but that are central, or causal, to the disease or drug response are missed. The literature will continue growing, but it will take a long time to reach a level of completeness where one can make confident predictions and discoveries.

GNS takes a different approach. Our REFS™ technology takes biological data of all kinds as direct input and rapidly builds a series of highly focused molecular models of the disease or drug interactions being studied; taken as an ensemble, these models best explain the data. The models are then rapidly queried through experiments run entirely in silico, in order to uncover the molecular markers that are most causally related to the disease or drug response being studied.

Reverse Engineering and Forward Simulation

Reverse Engineering

Using powerful supercomputing capabilities and our own unique platform, we uncover the most likely causal, or "directed," network connections among all the measured variables: the connections that best describe the system that would have given rise to the data, covering both the molecular changes and their relationship to any given set of endpoints or phenotypes. These connections are the key elements shared by most of the models in the ensemble.

Because the reverse-engineering process is probabilistic in nature, we generate network models that are accompanied by confidence intervals. These help us understand whether enough data has been collected to generate meaningful answers and, if not, where additional data should be collected. There is never enough data to deterministically identify the one unique system that could have given rise to the data; the output of the reverse-engineering step is therefore an ensemble of networks that together reflect the system being studied.
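To make the idea of an ensemble with attached confidence more concrete, here is a minimal Python sketch. It is not the REFS implementation, which is not described here. It assumes each network in the ensemble is simply a set of directed (source, target) edges, and it uses the fraction of networks containing an edge as a stand-in confidence score. The function name edge_confidence and the toy node names are hypothetical.

```python
from collections import Counter


def edge_confidence(ensemble):
    """Return, for each directed edge, the fraction of networks that contain it.

    Assumption: edge frequency across the ensemble is used as a simple
    confidence score. This is an illustrative stand-in, not the actual method.
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one network")
    counts = Counter(edge for network in ensemble for edge in set(network))
    return {edge: n / len(ensemble) for edge, n in counts.items()}


# Hypothetical ensemble: each network is a set of (source, target) edges.
ensemble = [
    {("drug", "geneA"), ("geneA", "endpoint")},
    {("drug", "geneA"), ("geneB", "endpoint")},
    {("drug", "geneA"), ("geneA", "endpoint"), ("geneB", "endpoint")},
    {("drug", "geneB"), ("geneA", "endpoint")},
]

confidence = edge_confidence(ensemble)

# List edges from most to least supported by the ensemble.
for edge, score in sorted(confidence.items(), key=lambda item: -item[1]):
    print(edge, score)
```

In a sketch like this, edges with low scores would be natural candidates for the "where additional data should be collected" question, since the ensemble disagrees about them.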

In sum, we can rapidly reverse-engineer the causal and quantitative connections among drugs, the disease system, and endpoints directly from the biological data themselves.

Forward Simulation

The ensemble of reverse-engineered models, though a powerful tool, is not sufficient on its own: it is simply too big to "look at" in order to generate confident knowledge about the disease or drug. Thus, we have developed the further capability to quantitatively examine "interventions" on the ensemble of models in as many ways as one can conceive, a process called "Forward Simulation."

For example, we can measure the effects of different doses of a drug "in silico" on the endpoints of interest, and identify related markers of efficacy and toxicity.

We can also systematically introduce in silico perturbations, e.g., knocking down or upregulating each gene, protein, or other biological entity reflected in the collected data, to identify potential mechanisms of disease. These results, obtained in hours or days rather than years, can easily be validated in "wet lab" experiments.
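The systematic knock-down described above can be illustrated with a small Python sketch. It makes several simplifying assumptions that do not come from the text: each model in the ensemble is a linear network stored as a dictionary mapping a node to its weighted parents, nodes are evaluated in a fixed topological order, and a knock-down is modeled by clamping a node to zero. The names simulate and knockdown_effects, and all node names and weights, are hypothetical.

```python
from statistics import mean


def simulate(model, order, inputs, clamp=None):
    """Propagate values through one linear model in topological order.

    `clamp` pins the given nodes to fixed values, which models an intervention.
    """
    clamp = clamp or {}
    values = {}
    for node in order:
        if node in clamp:
            values[node] = clamp[node]   # the intervention overrides the parents
        elif node in inputs:
            values[node] = inputs[node]  # externally set value, e.g. a drug dose
        else:
            parents = model.get(node, {})
            values[node] = sum(w * values[p] for p, w in parents.items())
    return values


def knockdown_effects(ensemble, order, inputs, genes, endpoint):
    """Average change in the endpoint across the ensemble when each gene is set to 0."""
    effects = {}
    for gene in genes:
        deltas = []
        for model in ensemble:
            baseline = simulate(model, order, inputs)[endpoint]
            perturbed = simulate(model, order, inputs, clamp={gene: 0.0})[endpoint]
            deltas.append(perturbed - baseline)
        effects[gene] = mean(deltas)
    return effects


# Hypothetical two-model ensemble; the models disagree on the geneB -> endpoint edge.
order = ["drug", "geneA", "geneB", "endpoint"]
ensemble = [
    {"geneA": {"drug": 1.0}, "geneB": {"drug": 0.5},
     "endpoint": {"geneA": 2.0, "geneB": 1.0}},
    {"geneA": {"drug": 1.0}, "geneB": {"drug": 0.5},
     "endpoint": {"geneA": 2.0}},
]

effects = knockdown_effects(ensemble, order, {"drug": 1.0}, ["geneA", "geneB"], "endpoint")
print(effects)
```

Changing the value passed for "drug" in the inputs gives the analogous dose sweep, and ranking genes by the size of their averaged effect gives a shortlist of candidates for wet-lab follow-up.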

The most critical difference between GNS' approach and others' is that we let the biology tell us what is important. Sometimes genes or other biological entities whose expression levels change little in disease or upon the administration of a drug are still very relevant. Clustering does not catch these key genes, but elucidating the network of drug-disease biology does. GNS returns to the known biology after its unbiased models and predictions are made, so that they may be put into the context of what is known. Typically, both known and new biology are recovered from the process.