J.C. Bioch (Cor)
http://repub.eur.nl/ppl/1990/
List of Publications
http://repub.eur.nl/
RePub, Erasmus University Repository

Estimating the Market Share Attraction Model using Support Vector Regressions
http://repub.eur.nl/pub/21926/
Wed, 01 Sep 2010. G.I. Nalbantov, Ph.H.B.F. Franses, P.J.F. Groenen, J.C. Bioch
We propose to estimate the parameters of the Market Share Attraction Model (Cooper and Nakanishi, 1988; Fok and Franses, 2004) in a novel way by using a nonparametric technique for function estimation called Support Vector Regressions (SVR) (Smola, 1996; Vapnik, 1995). Traditionally, the parameters of the Market Share Attraction Model are estimated via a Maximum Likelihood (ML) procedure, assuming that the data are drawn from a conditional Gaussian distribution. However, if the distribution is unknown, Ordinary Least Squares (OLS) estimation may seriously fail (Vapnik, 1982). One way to tackle this problem is to introduce a linear loss function over the errors and a penalty on the magnitude of model coefficients. This leads to qualities such as robustness to outliers and avoidance of the problem of overfitting. This kind of estimation forms the basis of the SVR technique, which, as we will argue, makes it a good candidate for estimating the Market Share Attraction Model. We test the SVR approach to predict (the evolution of) the market shares of 36 car brands simultaneously and report promising results.

SVM-Maj: a majorization approach to linear support vector machines with different hinge errors
http://repub.eur.nl/pub/12011/
Thu, 01 Nov 2007. P.J.F. Groenen, G.I. Nalbantov, J.C. Bioch
Support vector machines (SVMs) are becoming increasingly popular for the prediction of a binary dependent variable. SVMs perform very well with respect to competing techniques. Often, the solution of an SVM is obtained by switching to the dual. In this paper, we stick to the primal support vector machine (SVM) problem, study its effective aspects, and propose varieties of convex loss functions: the standard absolute hinge error as well as the quadratic hinge and Huber hinge errors. We present an iterative majorization algorithm that minimizes each of the adaptations. In addition, we show that many of the features of an SVM are also obtained by an optimal scaling approach to regression. We illustrate this with an example from the literature and compare different methods on several empirical data sets.

Instance-Based penalization techniques for classification
http://repub.eur.nl/pub/8218/
Sat, 06 Jan 2007. G.I. Nalbantov, J.C. Bioch, P.J.F. Groenen
Several instance-based large-margin classifiers have recently been put forward in the literature: Support Hyperplanes, Nearest Convex Hull classifier, and Soft Nearest Neighbor. We examine those techniques from a common fit-versus-complexity framework and study the links between them. Finally, we compare the performance of these techniques vis-a-vis each other and other standard classification methods.

Estimating the market share attraction model using support vector regressions
http://repub.eur.nl/pub/8528/
Mon, 01 Jan 2007. G.I. Nalbantov, Ph.H.B.F. Franses, J.C. Bioch, P.J.F. Groenen
We propose to estimate the parameters of the Market Share Attraction Model (Cooper & Nakanishi, 1988; Fok & Franses, 2004) in a novel way by using a non-parametric technique for function estimation called Support Vector Regressions (SVR) (Vapnik, 1995; Smola, 1996). Traditionally, the parameters of the Market Share Attraction Model are estimated via a Maximum Likelihood (ML) procedure, assuming that the data are drawn from a conditional Gaussian distribution. However, if the distribution is unknown, ML estimation may seriously fail (Vapnik, 1982). One way to tackle this problem is to introduce a linear loss function over the errors and a penalty on the magnitude of model coefficients. This leads to qualities such as robustness to outliers and avoidance of the problem of overfitting. This kind of estimation forms the basis of the SVR technique, which, as we will argue, makes it a good candidate for estimating the Market Share Attraction Model. We test the SVR approach to predict (the evolution of) the market shares of 36 car brands simultaneously and report stronger results than when using an ML estimation procedure.

Nearest convex hull classification
http://repub.eur.nl/pub/8217/
Fri, 01 Dec 2006. G.I. Nalbantov, P.J.F. Groenen, J.C. Bioch
Consider the classification task of assigning a test object to one of two or more possible groups, or classes. An intuitive way to proceed is to assign the object to the class to which its distance is minimal. As a distance measure to a class, we propose here to use the distance to the convex hull of that class; hence the name Nearest Convex Hull (NCH) classification for the method. Convex-hull overlap is handled through the introduction of slack variables and kernels. In spirit and computationally, the method is therefore close to the popular Support Vector Machine (SVM) classifier. Advantages of the NCH classifier are its robustness to outliers, good regularization properties and relatively easy handling of multi-class problems. We compare the performance of NCH against state-of-the-art techniques and report promising results.

Nonlinear support vector machines through iterative majorization and I-splines
http://repub.eur.nl/pub/7889/
Wed, 19 Jul 2006. P.J.F. Groenen, J.C. Bioch, G.I. Nalbantov
To minimize the primal support vector machine (SVM) problem, we propose to use iterative majorization. To allow for nonlinearity of the predictors, we use (non)monotone spline transformations. An advantage over the usual kernel approach in the dual problem is that the variables can be easily interpreted. We illustrate this with an example from the literature.

Classification with support hyperplanes
http://repub.eur.nl/pub/8012/
Wed, 19 Jul 2006. G.I. Nalbantov, J.C. Bioch, P.J.F. Groenen
A new classification method is proposed, called Support Hyperplanes (SHs). To solve the binary classification task, SHs consider the set of all hyperplanes that do not make classification mistakes, referred to as semi-consistent hyperplanes. A test object is classified using the semi-consistent hyperplane that is farthest away from it. In this way, a good balance between goodness-of-fit and model complexity is achieved, where model complexity is proxied by the distance between a test object and a semi-consistent hyperplane. This idea of complexity resembles the one imputed in the width of the so-called margin between two classes, which arises in the context of Support Vector Machine learning. Class overlap can be handled via the introduction of kernels and/or slack variables. The performance of SHs against standard classifiers is promising on several widely used empirical data sets.

Solving and interpreting binary classification problems in marketing with SVMs
http://repub.eur.nl/pub/7038/
Wed, 09 Nov 2005. J.C. Bioch, P.J.F. Groenen, G.I. Nalbantov
Marketing problems often involve binary classification of customers into "buyers" versus "non-buyers" or "prefers brand A" versus "prefers brand B". These cases require binary classification models such as logistic regression and linear and quadratic discriminant analysis. A promising recent technique for the binary classification problem is the Support Vector Machine (Vapnik, 1995), which has achieved outstanding results in areas ranging from bioinformatics to finance. In this paper, we compare the performance of the Support Vector Machine against standard binary classification techniques on a marketing data set and elaborate on the interpretation of the obtained results.

Induction of Ordinal Decision Trees
http://repub.eur.nl/pub/271/
Mon, 10 Feb 2003. J.C. Bioch, V. Popova
This paper focuses on the problem of monotone decision trees from the point of view of the multicriteria decision aid methodology (MCDA). By taking into account the preferences of the decision maker, an attempt is made to bring closer similar research within machine learning and MCDA. The paper addresses the question of how to label the leaves of a tree in a way that guarantees the monotonicity of the resulting tree. Two approaches, dynamic and static labeling, are proposed for that purpose and compared experimentally.
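The monotonicity requirement that such leaf labelings must preserve can be stated directly on the data: whenever one example dominates another componentwise, its label must not be smaller. A minimal illustrative check (a sketch with hypothetical ordinal data, not the paper's labeling algorithm):

```python
from itertools import combinations

def dominates(x, y):
    """True if x <= y in every attribute (componentwise order)."""
    return all(a <= b for a, b in zip(x, y))

def monotonicity_violations(examples):
    """Return pairs (x, y) with x <= y componentwise but label(x) > label(y)."""
    viol = []
    for (x, cx), (y, cy) in combinations(examples, 2):
        if dominates(x, y) and cx > cy:
            viol.append((x, y))
        elif dominates(y, x) and cy > cx:
            viol.append((y, x))
    return viol

# Hypothetical ordinal data set: (attribute vector, class label).
data = [((1, 1), 0), ((2, 2), 1), ((3, 3), 2), ((2, 3), 0)]
print(monotonicity_violations(data))  # → [((2, 2), (2, 3))]
```

A labeling is monotone exactly when this function returns an empty list; the last example above violates the constraint because (2, 2) is dominated by (2, 3) yet carries a higher label.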
The paper further considers the problem of splitting criteria in the context of monotone decision trees. Two criteria from the literature are compared experimentally, the entropy criterion and the number of con criterion, in an attempt to find out which one better fits the specifics of monotone problems and which one better handles monotonicity noise.

Monotone Decision Trees and Noisy Data
http://repub.eur.nl/pub/207/
Mon, 17 Jun 2002. J.C. Bioch, V. Popova
The decision tree algorithm for monotone classification presented in [4, 10] requires strictly monotone data sets. This paper addresses the problem of noise due to violation of the monotonicity constraints and proposes a modification of the algorithm to handle noisy data. It also presents methods for controlling the size of the resulting trees while keeping the monotonicity property, whether the data set is monotone or not.

Modular Decomposition of Boolean Functions
http://repub.eur.nl/pub/190/
Mon, 08 Apr 2002. J.C. Bioch
Modular decomposition is a thoroughly investigated topic in many areas such as switching theory, reliability theory, game theory and graph theory. Most applications can be formulated in the framework of Boolean functions. In this paper we give a unified treatment of modular decomposition of Boolean functions based on the idea of generalized Shannon decomposition. Furthermore, we discuss some new results on the complexity of modular decomposition. We propose an O(mn) algorithm for the recognition of a modular set of a monotone Boolean function f with m prime implicants and n variables. Using this result we show that the computation of the modular closure of a set can be done in time O(mn^2). On the other hand, we prove that the recognition problem for general Boolean functions is coNP-complete.

Version Spaces and Generalized Monotone Boolean Functions
http://repub.eur.nl/pub/187/
Tue, 19 Mar 2002. J.C. Bioch, T. Ibaraki
We consider generalized monotone functions f: X -> {0,1} defined for an arbitrary binary relation <= on X by the property x <= y implies f(x) <= f(y). These include the standard monotone (or positive) Boolean functions, regular Boolean functions and other interesting functions as special cases. It is shown that a class of functions is closed under conjunction and disjunction (i.e., a distributive lattice) if and only if it is the class of monotone functions with respect to some quasi-order.
Subsequently, we consider the monoid of all conjunctive operators on a set and show that this monoid is algebraically isomorphic to the monoid of all binary relations on this set. In this development, two operators, positive content and positive closure, play an important role.
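The closure property above can be checked numerically on a small domain (an illustrative sketch with hypothetical functions, not taken from the paper): the pointwise conjunction and disjunction of two functions that are monotone with respect to the same relation are again monotone.

```python
from itertools import product

# Domain: bit vectors of length 3; relation: componentwise order (a quasi-order).
X = list(product([0, 1], repeat=3))
leq = lambda x, y: all(a <= b for a, b in zip(x, y))

def is_monotone(f):
    """Check that x <= y implies f(x) <= f(y) over the whole domain."""
    return all(f(x) <= f(y) for x in X for y in X if leq(x, y))

# Two hypothetical monotone Boolean functions.
f = lambda x: x[0] & x[1]
g = lambda x: x[1] | x[2]

conj = lambda x: f(x) & g(x)   # pointwise conjunction
disj = lambda x: f(x) | g(x)   # pointwise disjunction

print(is_monotone(f), is_monotone(g))        # True True
print(is_monotone(conj), is_monotone(disj))  # True True
```

By contrast, a non-monotone function such as `lambda x: 1 - x[0]` fails the check, so the class of monotone functions with respect to this order is indeed closed under conjunction and disjunction while its complement is not.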
The results are then applied to the version space of all monotone hypotheses of a set of binary examples, also called the class of all monotone extensions of a partially defined Boolean function, to clarify its lattice-theoretic properties.

The Algorithmic Complexity of Modular Decomposition
http://repub.eur.nl/pub/99/
Thu, 14 Jun 2001. J.C. Bioch
Modular decomposition is a thoroughly investigated topic in many areas such as switching theory, reliability theory, game theory and graph theory. We propose an O(mn) algorithm for the recognition of a modular set of a monotone Boolean function f with m prime implicants and n variables. Using this result we show that the computation of the modular closure of a set can be done in time O(mn^2). On the other hand, we prove that the recognition problem for general Boolean functions is NP-complete. Moreover, we introduce the so-called generalized Shannon decomposition of a Boolean function as an efficient tool for proving theorems on Boolean function decompositions.

Bankruptcy Prediction with Rough Sets
http://repub.eur.nl/pub/76/
Thu, 22 Feb 2001. J.C. Bioch, V. Popova
The bankruptcy prediction problem can be considered an ordinal classification problem. The classical theory of Rough Sets describes objects by discrete attributes, and does not take into account the ordering of the attribute values. This paper proposes a modification of the Rough Set approach applicable to monotone datasets. We introduce, respectively, the concepts of monotone discernibility matrix and monotone (object) reduct. Furthermore, we use the theory of monotone discrete functions developed earlier by the first author to represent and to compute decision rules. In particular, we use monotone extensions, decision lists and dualization to compute classification rules that cover the whole input space. The theory is applied to the bankruptcy prediction problem.

Mining frequent itemsets in memory-resident databases
http://repub.eur.nl/pub/61/
Tue, 05 Dec 2000. W.H.L.M. Pijls, J.C. Bioch
Due to present-day memory sizes, a memory-resident database has become a practical option. Consequently, new methods designed for mining in such databases are desirable. In the case of disk-resident databases, breadth-first search methods are commonly used. We propose a new algorithm, based upon depth-first search in a set-enumeration tree. For memory-resident databases, this method turns out to be superior to breadth-first search.

Quasi-monotone decision trees for ordinal classification
http://repub.eur.nl/pub/446/
Thu, 01 Jan 1998. R. Potharst, J.C. Bioch, R. van Dordregt
In many classification problems the domains of the attributes and the classes are linearly ordered. Since the known decision tree methods generate non-monotone trees, these methods are not suitable for monotone classification problems. We already provided order-preserving tree-generation algorithms for multi-attribute classification problems with k linearly ordered classes in a previous paper. For real-world datasets it is important to consider approximate solutions to handle problems like speed, tree size and noise. In this report we develop a new decision tree algorithm that generates quasi-monotone decision trees. This algorithm outperforms classical algorithms such as those of Quinlan with respect to prediction, and beats algorithms that generate strictly monotone decision trees with respect to speed. This report contains proofs of all presented results.

Dualisation, decision lists and identification of monotone discrete functions
http://repub.eur.nl/pub/757/
Thu, 01 Jan 1998. J.C. Bioch
Many data-analysis algorithms in machine learning, datamining and a variety of other disciplines essentially operate on discrete multi-attribute data sets. By means of discretisation or binarisation, numerical data sets can also be successfully analysed. Therefore, in this paper we introduce the theory of (partially defined) discrete functions as an important theoretical tool for the analysis of multi-attribute data sets. In particular we study monotone (partially defined) discrete functions. Compared with the theory of Boolean functions, relatively little is known about (partially defined) monotone discrete functions. It appears that decision lists are useful for the representation of monotone discrete functions. Since dualisation is an important tool in the theory of (monotone) Boolean functions, we study the interpretation and properties of the dual of a (monotone) binary or discrete function. We also introduce the dual of a pseudo-Boolean function. The results are used to investigate extensions of partially defined monotone discrete functions and the identification of monotone discrete functions. In particular we present a polynomial-time algorithm for the identification of so-called stable discrete functions.

Monotone Decision Trees
http://repub.eur.nl/pub/522/
Wed, 01 Jan 1997. J.C. Bioch, T. Petter, R. Potharst
Report EUR-FEW-CS-97-07. In many classification problems the domains of the attributes and the classes are linearly ordered. Often, classification must preserve this ordering: this is called monotone classification. Since the known decision tree methods generate non-monotone trees, these methods are not suitable for monotone classification problems. In this report we provide a number of order-preserving tree-generation algorithms for multi-attribute classification problems with k linearly ordered classes.

Bivariate decision trees
http://repub.eur.nl/pub/458/
Mon, 01 Jan 1996. J.C. Bioch, O. van der Meer, R. Potharst
Decision trees with tests based on a single variable, as produced by methods such as ID3 and C4.5, often require a large number of tests to achieve an acceptable accuracy. This makes interpretation of these trees, which is an important reason for their use, disputable. Recently, a number of methods for constructing decision trees with multivariate tests have been presented. Multivariate decision trees are often smaller and more accurate than univariate trees; however, the use of linear combinations of the variables may result in trees that are hard to interpret. In this paper we consider trees with tests based on combinations of at most two variables. We show that bivariate decision trees are an interesting alternative to both uni- and multivariate trees.

On the use of simple classifiers for the initialisation of one-hidden-layer neural nets
http://repub.eur.nl/pub/1436/
Sun, 01 Jan 1995. J.C. Bioch, R. Carsouw, R. Potharst
In this report we discuss the use of two simple classifiers to initialise the input-to-hidden layer of a one-hidden-layer neural network. These classifiers divide the input space into convex regions that can be represented by membership functions. These functions are then used to determine the weights of the first layer of a feedforward network.
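The general idea of seeding first-layer weights from a simple classifier can be sketched as follows (a minimal NumPy illustration with hypothetical data and a nearest-centroid classifier; the report's own classifiers and membership functions are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D data: two Gaussian blobs (classes 0 and 1).
X0 = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(50, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(50, 2))

# A very simple classifier: the nearest-centroid decision boundary,
# whose normal vector is the difference of the class centroids.
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
w = m1 - m0                         # normal vector of the boundary
b = -w @ (m0 + m1) / 2              # boundary passes through the midpoint

# Copy that hyperplane into the first layer of a one-hidden-layer
# network instead of using purely random initial weights.
n_hidden = 4
W1 = rng.normal(scale=0.01, size=(n_hidden, 2))
b1 = np.zeros(n_hidden)
W1[0], b1[0] = w, b                 # first hidden unit encodes the classifier

hidden = np.tanh(X0 @ W1.T + b1)    # forward pass through the hidden layer
print(hidden.shape)                 # (50, 4)
```

The remaining hidden units keep their small random weights and are free to refine the regions during subsequent training; only the seeded unit starts from an informative decision boundary.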