In multiway data, each sample is measured by multiple sets of
correlated attributes. We develop a probabilistic framework, known as
pTucker, for modeling structural dependency from partially observed
multi-dimensional array data. Latent components
associated with individual array dimensions are jointly retrieved
while the core tensor is integrated out. The resulting algorithm
is capable of handling large-scale data sets. We verify the
usefulness of this approach by comparing against classical models
on applications to modeling amino acid fluorescence, collaborative
filtering, and a number of benchmark multiway array data sets.
[pdf]
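As a rough illustration of the multilinear structure behind a Tucker-style model, the sketch below reconstructs single entries of a three-way array from per-mode latent factors and a core tensor, and measures squared error over observed entries only. This is a plain, non-probabilistic Tucker reconstruction and not the pTucker algorithm itself, in which the core tensor is integrated out. All names (`tucker_entry`, `observed_sse`, `core`, `U`, `V`, `Z`) and the toy numbers are hypothetical.

```python
def tucker_entry(core, U, V, Z, i, j, k):
    """Entry (i, j, k) of a three-way Tucker reconstruction.

    core[p][q][r] is the core tensor; U, V and Z hold one latent row
    per index along modes 1, 2 and 3 respectively.
    """
    total = 0.0
    for p, u in enumerate(U[i]):
        for q, v in enumerate(V[j]):
            for r, z in enumerate(Z[k]):
                total += core[p][q][r] * u * v * z
    return total


def observed_sse(data, core, U, V, Z):
    """Sum of squared errors over observed entries only.

    data maps (i, j, k) -> value; missing entries are simply absent.
    """
    return sum(
        (value - tucker_entry(core, U, V, Z, i, j, k)) ** 2
        for (i, j, k), value in data.items()
    )


# Toy example: a 2 x 2 x 2 core and two indices per mode (made-up values).
core = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
Z = [[1.0, 0.0], [0.0, 1.0]]
data = {(0, 0, 0): 1.0, (1, 1, 1): 2.5}  # two of the eight entries observed

print(tucker_entry(core, U, V, Z, 1, 1, 1))
print(observed_sse(data, core, U, V, Z))
```

Storing only the observed entries in a dictionary is one simple way to handle partially observed arrays; unobserved cells never enter the error.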
In Web-based services with dynamic content (such as news articles),
recommender systems face the difficulty of identifying high-quality
new items in a timely manner and providing recommendations for new users.
We propose a feature-based machine learning approach to
personalized recommendation that is capable of handling the
cold-start issue effectively. The proposed framework is general and flexible
for other personalized tasks. The superior performance of our
approach is verified on a large-scale data set collected from the
Today-Module on Yahoo! Front Page, with comparison against six
competitive approaches.
[pdf][slides]
We consider the case when relationships are postulated to exist due to hidden common
causes. We discuss how the resulting graphical model differs from Markov
networks, and how it describes different types of real-world relational processes.
A Bayesian nonparametric classification model is built upon this graphical representation
and evaluated with several empirical studies.
GOTO Ricardo Silva's homepage for [pdf], [data] and [code]
In this paper we model relational random variables on the edges of a network using
Gaussian processes (GPs). We describe appropriate GP priors, i.e., covariance
functions, for directed and undirected networks connecting homogeneous or heterogeneous
nodes. The framework suggests an intimate connection between link
prediction and transfer learning, which were traditionally two separate topics. [pdf]
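One common way to obtain a covariance function over edges is to combine a node-level kernel across the endpoints of two edges, and to symmetrize over endpoint matchings when edges are undirected. The sketch below illustrates that idea only; the RBF node kernel, the function names and the toy node features are assumptions, and the paper's exact constructions may differ.

```python
import math


def node_kernel(x, y, lengthscale=1.0):
    """RBF kernel between two node feature vectors (an assumed choice)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq / lengthscale ** 2)


def directed_edge_kernel(edge1, edge2):
    """Covariance between directed edges (i -> j) and (k -> l)."""
    (i, j), (k, l) = edge1, edge2
    return node_kernel(i, k) * node_kernel(j, l)


def undirected_edge_kernel(edge1, edge2):
    """Covariance between undirected edges {i, j} and {k, l}.

    Summing over both endpoint matchings makes the value independent
    of the order in which each edge lists its endpoints.
    """
    (i, j), (k, l) = edge1, edge2
    return (node_kernel(i, k) * node_kernel(j, l)
            + node_kernel(i, l) * node_kernel(j, k))


# Hypothetical two-dimensional node features.
a, b, c = (0.0, 0.0), (1.0, 0.0), (0.0, 2.0)
print(directed_edge_kernel((a, b), (a, c)))
print(undirected_edge_kernel((a, b), (c, a)))
```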
Censored targets, such as the time to events in survival analysis,
can generally be represented by intervals on the real line. In
this paper, we propose a novel support vector technique (named SVCR)
for regression on censored targets. Interestingly,
this approach provides a general formulation for both standard
regression and binary classification tasks.
[pdf][longer version]
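To see how interval targets can cover both regression and classification, consider a loss that is zero when the prediction falls inside the target interval and grows linearly outside it. The sketch below is an illustration under that assumption, not the exact SVCR objective; `interval_loss` and the example numbers are hypothetical.

```python
import math


def interval_loss(prediction, lower, upper):
    """Zero inside [lower, upper], linear penalty outside it."""
    return max(0.0, lower - prediction) + max(0.0, prediction - upper)


# Right-censored survival time: the event is known to occur after t = 5.
print(interval_loss(3.0, 5.0, math.inf))  # prediction falls before the bound
print(interval_loss(7.0, 5.0, math.inf))  # prediction is consistent

# Standard regression with target y and tolerance eps: [y - eps, y + eps].
y, eps = 2.0, 0.1
print(interval_loss(2.5, y - eps, y + eps))

# Binary classification with label +1: the interval [1, inf) yields the
# hinge loss; label -1 would use (-inf, -1] instead.
print(interval_loss(0.3, 1.0, math.inf))
```

With a degenerate interval around the target the loss behaves like an epsilon-insensitive regression loss, and with a half-line it behaves like the hinge loss, which is the sense in which one formulation can serve both tasks.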
Correlation between instances is often modelled via a kernel
function using input attributes of the instances. Relational
knowledge can further reveal additional pairwise correlations
between variables of interest. In this paper, we develop a class
of models which incorporates both reciprocal relational
information and input attributes using Gaussian process
techniques. This approach provides a novel non-parametric Bayesian
framework with a data-dependent prior for supervised learning
tasks. We also apply this framework to semi-supervised learning.
Experimental results on several real world data sets verify the
usefulness of this algorithm.
[pdf]
We present two new support vector approaches for ordinal regression.
These approaches find the concentric spheres with minimum volume that
contain most of the training samples.
[pdf]
We consider the problem of utilizing unlabeled data for Gaussian
process inference. Using a geometrically motivated data-dependent
prior, we propose a graph-based construction of semi-supervised
Gaussian processes. We demonstrate this approach empirically on several
classification problems.
[pdf]
In this paper, we propose a new basis
selection criterion for building sparse GP regression
models that provides promising gains in accuracy as well as
efficiency over previous methods.
Our algorithm is much faster than that of Smola and Bartlett,
while in generalization it greatly outperforms the
information gain approach proposed by Seeger et al., especially
in the quality of predictive distributions.
[ps][code]
In this paper, we propose a probabilistic kernel approach to
preference learning based on Gaussian processes. A new likelihood
function is proposed to capture the preference relations in the
Bayesian framework. The generalized formulation is also applicable to
many multiclass problems.
[ps][code]
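A likelihood for pairwise preferences can be sketched as follows, assuming each item has a latent function value observed under independent Gaussian noise, so that the probability of preferring u over v is a normal CDF of the scaled difference. The function names and the `noise_std` parameter are assumptions made for illustration.

```python
import math


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_likelihood(f_u, f_v, noise_std=1.0):
    """P(u is preferred over v) given latent values f_u and f_v.

    The difference of two independent noise terms has standard
    deviation sqrt(2) * noise_std, hence the scaling below.
    """
    return std_normal_cdf((f_u - f_v) / (math.sqrt(2.0) * noise_std))


print(preference_likelihood(1.0, 1.0))  # equal latent values
print(preference_likelihood(2.0, 0.0))  # u has the larger latent value
```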
In this paper, we propose two new support vector formulations for
ordinal regression, which optimize multiple thresholds to define
parallel discriminant hyperplanes for the ordinal scales. Both
approaches guarantee that the thresholds are properly ordered at the
optimal solution.
[ps][code]
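Once a scoring function and a set of ordered thresholds are available, prediction with parallel hyperplanes reduces to locating the score among the thresholds. The sketch below shows only this prediction step, with a hypothetical linear scorer and made-up thresholds; it does not implement the training formulations.

```python
from bisect import bisect_left


def linear_score(weights, x):
    """Projection of x onto the shared direction of the hyperplanes."""
    return sum(w * v for w, v in zip(weights, x))


def predict_rank(score, thresholds):
    """Rank in 1..len(thresholds) + 1 for thresholds sorted ascending.

    The rank is one plus the number of thresholds strictly below the score.
    """
    return 1 + bisect_left(thresholds, score)


thresholds = [-1.0, 0.5, 2.0]  # three thresholds separate four ordinal scales
weights = [0.5, -1.0]          # hypothetical learned direction
for x in ([0.0, 2.0], [1.0, 0.5], [2.0, 0.0], [6.0, 0.0]):
    print(x, predict_rank(linear_score(weights, x), thresholds))
```

Properly ordered thresholds matter here: the binary search is only meaningful when the threshold list is sorted.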
In this paper, we present a probabilistic approach to ordinal
regression in Gaussian processes. In the Bayesian framework of
Gaussian processes, we propose a likelihood function for ordinal
variables that is a generalization of the probit function.
Two inference techniques, based on Laplace approximation and
expectation propagation respectively, are applied for model
selection.
[ps][code]
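One way to generalize the probit function to ordinal variables is to give each rank the probability mass that a noisy latent value places between two consecutive thresholds. The sketch below assumes Gaussian noise and ordered thresholds; `ordinal_likelihood`, `thresholds` and `noise_std` are hypothetical names, and no inference (Laplace approximation or expectation propagation) is shown.

```python
import math


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordinal_likelihood(f, rank, thresholds, noise_std=1.0):
    """P(y = rank | f) for rank in 1..len(thresholds) + 1.

    The outermost bounds are -inf and +inf, so the probabilities over
    all ranks sum to one.
    """
    bounds = [-math.inf] + list(thresholds) + [math.inf]
    upper = (bounds[rank] - f) / noise_std
    lower = (bounds[rank - 1] - f) / noise_std
    return std_normal_cdf(upper) - std_normal_cdf(lower)


thresholds = [-1.0, 0.5, 2.0]
probs = [ordinal_likelihood(0.3, r, thresholds) for r in range(1, 5)]
print(probs, sum(probs))

# With a single threshold at zero, the top rank recovers the probit function.
print(ordinal_likelihood(0.7, 2, [0.0]), std_normal_cdf(0.7))
```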
In this paper, we describe a gene selection algorithm
based on Gaussian processes to discover consistent gene expression
patterns associated with ordinal clinical phenotypes. The
technique of automatic relevance determination is applied to
represent the significance level of the genes in a Bayesian framework.
[pdf][code]
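Automatic relevance determination is typically realized with a covariance function that has one non-negative relevance parameter per input feature. The sketch below shows such a kernel under that assumption; the function name, the squared-exponential form and the toy expression values are illustrative and not taken from the paper.

```python
import math


def ard_kernel(x, y, relevances):
    """Squared-exponential kernel with one relevance weight per feature.

    A relevance near zero removes the corresponding feature (gene) from
    the covariance, so learned relevances can be read as importances.
    """
    sq = sum(k * (a - b) ** 2 for k, a, b in zip(relevances, x, y))
    return math.exp(-0.5 * sq)


sample_a = [1.0, 5.0]
sample_b = [1.0, -3.0]
print(ard_kernel(sample_a, sample_b, [1.0, 1.0]))  # both genes count
print(ard_kernel(sample_a, sample_b, [1.0, 0.0]))  # second gene is ignored
```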
In this paper, we present a graphical model that extends segmental
semi-Markov models (SSMM) to exploit multiple sequence alignment
profiles for protein structure prediction. A novel parameterized model
is proposed as the likelihood function for the SSMM. By incorporating
the information from long range interactions in beta-sheets, this model
is capable of carrying out inference on contact maps.
[pdf][webserver]
In this paper, we use the soft insensitive loss function
in likelihood evaluation, and describe a Bayesian framework in a
stationary Gaussian process. Bayesian methods are used to implement
model adaptation, while keeping the merits of support vector
regression, such as quadratic programming and sparseness. Moreover,
confidence intervals are provided in prediction.
[code]
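A soft insensitive loss can be sketched as a smoothed version of the epsilon-insensitive loss: flat near zero, quadratic in a transition band, and linear beyond it. The parameterization below, with `eps` controlling the insensitive zone and `beta` in (0, 1] controlling the smoothing, is an assumption chosen so that the pieces join continuously with matching slopes.

```python
def soft_insensitive_loss(delta, eps=0.1, beta=0.3):
    """Smoothed epsilon-insensitive loss of the residual delta.

    Assumes eps > 0 and 0 < beta <= 1. The loss is zero for
    |delta| <= (1 - beta) * eps, quadratic up to (1 + beta) * eps,
    and equal to |delta| - eps beyond that.
    """
    a = abs(delta)
    inner = (1.0 - beta) * eps
    outer = (1.0 + beta) * eps
    if a <= inner:
        return 0.0
    if a <= outer:
        return (a - inner) ** 2 / (4.0 * beta * eps)
    return a - eps


for residual in (0.0, 0.05, 0.1, 0.13, 1.0):
    print(residual, soft_insensitive_loss(residual))
```

The flat zone is what preserves sparseness, while the quadratic band keeps the loss differentiable, which is convenient when the loss is used inside a likelihood.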