Nam Nguyen

4-5 PM, Dec 17, 2020

https://stonybrook.zoom.us/j/94214254415?pwd=K1VoQml4cFdlVW51VW41dWtid2tJdz09



The molecular mechanisms and functions of complex biological systems
remain elusive. Recent high-throughput techniques, such as
next-generation sequencing, have generated a wide variety of
multiomics datasets that enable the identification of biological
functions and mechanisms from multiple facets. However, integrating
these large-scale multiomics data and discovering functional insights
remain challenging tasks. To address these challenges, machine
learning has been broadly applied to analyze multiomics data. In
particular, multiview learning is more effective than previous
integrative methods at learning the data's heterogeneity and revealing
cross-talk patterns. Although it has been applied in various contexts,
such as computer vision and speech recognition, multiview learning has
not yet been widely applied to biological data--specifically,
multiomics data. Therefore, we have developed a framework called
multiview empirical risk minimization (MV-ERM) for unifying multiview
learning methods (Nguyen et al., PLoS Computational Biology, 2020).
MV-ERM enables potential applications to understanding multiomics
data, including genomics, transcriptomics, and epigenomics, with the
aim of discovering functional and mechanistic interpretations across
omics.
Based on MV-ERM, we have developed the following methods:
ManiNetCluster, Varmole and ECMarker.
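As a sketch of the idea (the notation below is ours, not taken from the paper): with V views, a multiview empirical risk minimization objective couples the per-view empirical risks with an alignment penalty,

```latex
\min_{f_1,\dots,f_V}\;
\sum_{v=1}^{V}\frac{1}{n}\sum_{i=1}^{n}
\ell\!\left(f_v\!\left(x_i^{(v)}\right)\right)
\;+\;\lambda\sum_{u<v}\mathrm{align}\!\left(f_u, f_v\right)
```

where each view (e.g., transcriptomics or epigenomics) has its own mapping f_v, and the alignment term ties the learned representations together. Particular choices of the loss and of the alignment penalty recover particular multiview learning methods.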



(1) ManiNetCluster (Nguyen et al., BMC Genomics, 2019) is a manifold
learning method that simultaneously aligns and clusters gene networks
(e.g., co-expression) to systematically reveal the links of genomic
function between different phenotypes. Specifically, ManiNetCluster
employs manifold alignment to uncover and match local and non-linear
structures among networks, and identifies cross-network functional
links. We demonstrated that ManiNetCluster aligns orthologous genes
from their developmental expression profiles across model organisms
better than state-of-the-art methods do. This indicates the
potential non-linear interactions of evolutionarily conserved genes
across species in development. Furthermore, we applied ManiNetCluster
to time series transcriptome data measured in the green alga
Chlamydomonas reinhardtii to discover the genomic functions linking
various metabolic processes between the light and dark periods of a
diurnally cycling culture;
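The alignment step can be sketched with a toy joint-Laplacian construction (the function, the coupling weight mu, and the toy matrices are our illustration, not ManiNetCluster's actual API):

```python
import numpy as np

def manifold_align(W1, W2, C, mu=1.0, dim=2):
    """Simplified joint-Laplacian manifold alignment.

    W1, W2 : within-network similarity matrices (e.g. co-expression).
    C      : n1 x n2 correspondence matrix (e.g. orthologous gene pairs).
    Returns low-dimensional coordinates for the genes of both networks
    in a shared latent space, so matched genes land near each other.
    """
    n1 = W1.shape[0]
    # Block similarity matrix coupling the two networks through C.
    W = np.block([[W1, mu * C], [mu * C.T, W2]])
    L = np.diag(W.sum(axis=1)) - W          # graph Laplacian of the joint graph
    vals, vecs = np.linalg.eigh(L)
    Y = vecs[:, 1:dim + 1]                  # skip the trivial constant eigenvector
    return Y[:n1], Y[n1:]

# Toy example: two 4-gene networks with a known one-to-one correspondence.
rng = np.random.default_rng(0)
A = rng.random((4, 4)); W1 = (A + A.T) / 2; np.fill_diagonal(W1, 0)
B = rng.random((4, 4)); W2 = (B + B.T) / 2; np.fill_diagonal(W2, 0)
Y1, Y2 = manifold_align(W1, W2, np.eye(4))
```

The smallest non-trivial eigenvectors of the joint Laplacian give a common embedding; clustering in that space then yields cross-network functional modules.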



(2) Varmole (Nguyen et al., Bioinformatics, 2020) is an interpretable
deep learning method that simultaneously reveals genomic functions and
mechanisms while predicting phenotype from genotype. In particular,
Varmole embeds multi-omic networks into a deep neural network
architecture and prioritizes variants, genes, and regulatory linkages
via biological drop-connect, without requiring prior feature selection.
With an application to schizophrenia, we demonstrate that Varmole
provides an effective alternative to recent statistical methods that
associate functional omics data (e.g., gene expression) with genotype
and phenotype and that link variants to individual genes in population
studies such as genome-wide association studies;
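The biological drop-connect idea can be sketched as a masked first layer (a toy NumPy version; the mask, sizes, and function name are our illustration, not Varmole's implementation):

```python
import numpy as np

def masked_linear(x, W, b, mask):
    """First layer of a drop-connect-style network (sketch).

    'mask' encodes prior biology: mask[i, j] = 1 only if input feature i
    (a SNP or gene) has a known link (e.g., an eQTL or regulatory edge)
    to unit j. Masked-out weights are forced to zero, so the network can
    only use biologically supported connections.
    """
    return x @ (W * mask) + b

rng = np.random.default_rng(1)
n_in, n_hid = 5, 3
W = rng.normal(size=(n_in, n_hid))
b = np.zeros(n_hid)
mask = (rng.random((n_in, n_hid)) < 0.5).astype(float)  # hypothetical prior network
x = rng.normal(size=(2, n_in))
h = masked_linear(x, W, b, mask)
```

Because the mask zeroes unsupported weights, the surviving weights map directly onto variant-gene and gene-gene links, which is what makes the learned model interpretable.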



(3) ECMarker (Jin*, Nguyen*, et al., Bioinformatics, 2020) is an
interpretable and scalable machine learning model that predicts gene
expression biomarkers for disease phenotypes and simultaneously
reveals underlying regulatory mechanisms. In particular, ECMarker is
built on the integration of semi-restricted and discriminative
restricted Boltzmann machines, a neural network model for
classification that allows lateral connections at the input gene
layer. With application to the
gene expression data of non-small cell lung cancer (NSCLC) patients,
we found that ECMarker not only achieved a relatively high accuracy
for predicting cancer stages but also identified biomarker genes and
gene networks that imply the regulatory mechanisms of lung cancer
development.
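The classification part of a discriminative RBM can be sketched through its free-energy form (random toy weights; this omits the lateral input connections of the semi-restricted part, and all names and sizes are our illustration):

```python
import numpy as np

def drbm_predict_proba(x, W, U, b_class, c_hid):
    """Class posterior of a discriminative RBM (sketch).

    x       : (n_vis,) gene-expression vector
    W       : (n_vis, n_hid) visible-to-hidden weights
    U       : (n_class, n_hid) class-to-hidden weights
    b_class : (n_class,) class biases; c_hid : (n_hid,) hidden biases
    p(y|x) is proportional to
    exp(b_y + sum_j softplus(c_j + U[y, j] + x . W[:, j])).
    """
    pre = c_hid + x @ W                              # (n_hid,)
    scores = b_class + np.logaddexp(0.0, pre + U).sum(axis=1)  # softplus terms
    scores -= scores.max()                           # numerical stability
    p = np.exp(scores)
    return p / p.sum()

rng = np.random.default_rng(2)
n_vis, n_hid, n_class = 6, 4, 2
proba = drbm_predict_proba(
    rng.normal(size=n_vis),
    rng.normal(size=(n_vis, n_hid)),
    rng.normal(size=(n_class, n_hid)),
    np.zeros(n_class),
    np.zeros(n_hid),
)
```

Because the class posterior has this closed form, the model can be trained discriminatively, and the learned weights W can be read off as gene-level biomarkers.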



Finally, we propose a novel multiview learning method, Malignomics, to
predict phenotypes from heterogeneous multi-omic features. Malignomics
will first align multi-omic features by deep manifold alignment onto a
common latent space, better capturing nonlinear relationships across
omics. This deep alignment aims to preserve both global consistency
and local smoothness across omics and reveal higher-order nonlinear
interactions (i.e., manifolds) among cross-omic features. Second, it
uses these manifold structures to regularize the classifiers for
predicting phenotypes. This manifold regularization highlights
cross-omic feature manifolds and prioritizes the features and
interactions relevant to the phenotypes. The prioritized
multi-omic features will further reveal underlying phenotypic
functions and mechanisms and thus enhance the biological
interpretation of Malignomics. We will apply Malignomics to
multi-omics data in neuropsychiatric disorders, and prioritize gene
regulatory networks linking risk variants, regulatory elements, and
genes for the disorders. We will also compare Malignomics with
state-of-the-art methods, and investigate how manifold regularization
can improve the understanding of multi-omics functions and the
prediction of diseases.
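The manifold-regularization step can be sketched on a toy classifier (a Laplacian-penalized logistic regression in NumPy; the data, hyperparameters, and function name are our illustration, not the proposed Malignomics system):

```python
import numpy as np

def manifold_regularized_logreg(X, y, W_sim, lam=0.01, lr=0.05, steps=300):
    """Logistic regression with a graph-Laplacian penalty (sketch).

    W_sim couples samples that lie close on the (cross-omic) manifold;
    the penalty lam * f^T L f pushes coupled samples toward similar
    decision scores f = X w, which is the essence of manifold
    regularization of a classifier.
    """
    L = np.diag(W_sim.sum(axis=1)) - W_sim   # graph Laplacian
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        f = X @ w
        p = 1.0 / (1.0 + np.exp(-f))         # logistic predictions
        grad = X.T @ (p - y) / len(y) + 2 * lam * X.T @ (L @ f)
        w -= lr * grad
    return w

# Toy data: two well-separated sample groups with a Gaussian similarity graph.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-1, 0.3, (10, 2)), rng.normal(1, 0.3, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
W_sim = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
w = manifold_regularized_logreg(X, y, W_sim)
acc = ((X @ w > 0).astype(int) == y).mean()
```

In the full method, the similarity graph would come from the deep manifold alignment of the omics layers rather than from raw feature distances, so the penalty carries cross-omic structure into the phenotype classifier.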



Place:  https://stonybrook.zoom.us/j/99167126152?pwd=TFpEYzM0aFhiOFJxSFJEb1JSS3YyQT09  

Time: 3 PM EST, Dec 16, 2020

Abstract: 

Shadows provide useful cues for analyzing visual scenes but also hamper many computer vision algorithms, such as image segmentation, object detection, and tracking. For these reasons, shadow detection and shadow removal have been well studied in computer vision.

Early work on shadow detection and removal focused on physical illumination models of shadows. These methods can express, identify, and remove shadows in a physically plausible manner. However, these models are often hard to optimize and are slow during inference due to their reliance on hand-designed image features. Recently, deep-learning approaches have achieved breakthroughs in performance for both shadow detection and removal. They learn to extract useful features through training while being extremely efficient during inference. However, these models are data-dependent, opaque, and ignore the physical aspects of shadows. Thus, they often generalize poorly and produce inconsistent results.

We propose incorporating physical illumination constraints of shadows into deep-learning models. These constraints force the networks to more closely follow the physics of shadows, enabling them to systematically and realistically modify shadows in images. For shadow detection, we present a novel Generative Adversarial Network (GAN)-based model in which the generator learns to generate images with realistic attenuated shadows that can be used to train a shadow detector. For shadow removal, we propose a method that uses deep networks to estimate the unknown parameters of a shadow image formation model that removes shadows. The system outputs high-quality shadow-free images with little or no image artifacts and achieves state-of-the-art performance in shadow removal when trained in a fully supervised setting. Moreover, the system is easy to train and constrain, since the shadow removal mapping is strictly defined by the simplified illumination model with interpretable parameters. Thus, it can be trained even with a much weaker form of supervision. In particular, we show that we can use two sets of patches, shadow and shadow-free, to train our shadow decomposition framework via an adversarial system. These patches are cropped from the shadow images themselves.
Therefore, this is the first deep-learning method for shadow removal that can be trained without any shadow-free images, providing an alternative solution to the paired data dependency issue. The advantage of this training scheme is even more pronounced when tested on a novel domain such as video shadow removal where the method can be fine-tuned on a testing video with only the shadow masks generated by a pre-trained shadow detector and further improves shadow removal results.
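The simplified illumination model can be sketched as a per-channel linear relighting composited through a shadow mask (a toy example; here the parameters w, b and the mask are given, whereas the actual system regresses them with deep networks):

```python
import numpy as np

def remove_shadow(img, mask, w, b):
    """Relight shadow pixels with a linear illumination model (sketch).

    Each shadowed pixel is mapped to its lit counterpart by a
    per-channel linear function, lit = w * shadowed + b, and the result
    is composited with the input using 'mask' (1 inside the shadow).
    """
    relit = np.clip(w * img + b, 0.0, 1.0)
    return mask[..., None] * relit + (1.0 - mask[..., None]) * img

# Toy image in [0, 1]: right half darkened by a synthetic shadow.
img = np.full((4, 8, 3), 0.8)
mask = np.zeros((4, 8))
mask[:, 4:] = 1.0
shadowed = img * np.where(mask[..., None] == 1, 0.4, 1.0)
restored = remove_shadow(shadowed, mask, w=np.array([2.5, 2.5, 2.5]), b=0.0)
```

Because the mapping is fully determined by a few interpretable parameters, supervision can target those parameters instead of requiring paired shadow/shadow-free images.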

The Institute for AI-Driven Discovery and Innovation hosts Dr. Mary
Simoni for a talk on her music and its intersection with AI, as part
of the Music and AI Seminars series.

The event will be held on Thursday, December 10, 2020, at 3:00 PM.

Abstract: Mary Simoni, Dean of Humanities, Arts & Social Sciences at
Rensselaer Polytechnic Institute, will discuss her research on the use
of computer algorithms and technology in the composition and
performance of music. The talk will feature compositions inspired by
Augmented Transition Networks (ATNs), works that employ motion
tracking to control synthesis parameters, and a work in progress that
applies machine learning to training data juxtaposing classical music
with COVID-19. During this talk, participants will be introduced to
several technologies that support music information retrieval, machine
learning, and algorithmic composition such as jSymbolic, Weka, and
Common Music.

Zoom details below:
https://stonybrook.zoom.us/j/98236706900?pwd=bDFEZFZtaHBWU0cyL0wxK3UrdUpIdz09
Meeting ID: 982 3670 6900
Passcode: 133945