VARIABLE SELECTION IN MULTIVARIATE FUNCTIONAL DATA CLASSIFICATION

Statistics in Transition New Series

Polish Statistical Association

Central Statistical Office of Poland

Subject: Economics , Statistics & Probability

ISSN: 1234-7655
eISSN: 2450-0291

Tomasz Górecki / Mirosław Krzyśko / Waldemar Wołyński

Keywords: multivariate functional data, variable selection, dCov, HSIC, classification

Citation Information: Statistics in Transition New Series, Volume 20, Issue 2, pp. 123–138, DOI: https://doi.org/10.21307/stattrans-2019-018

License : (CC BY-NC-ND 4.0)

Published Online: 22-July-2019

ABSTRACT

A new variable selection method is considered in the setting of classification with multivariate functional data (Ramsay and Silverman (2005)). Variable selection is a dimensionality reduction method which replaces the whole vector process with a low-dimensional vector that still yields a comparable classification error. Various classifiers appropriate for functional data are used. The proposed variable selection method is based on the functional distance covariance (dCov) given by Székely and Rizzo (2009, 2012) and the Hilbert-Schmidt Independence Criterion (HSIC) given by Gretton et al. (2005). The method is a modification of the procedure given by Kong et al. (2015). The proposed methodology is illustrated with a real data example.
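
To make the selection idea concrete, the sketch below ranks the functional variables of a sample of curves by the empirical distance covariance between each variable's basis-coefficient matrix and an indicator coding of the class labels, using the energy package cited in the references. This is only an illustrative sketch of the dCov-based ranking step, not the authors' exact procedure; the names coef_list, keep and rank_variables_dcov are hypothetical, and an HSIC estimate could be substituted for dcov() at the same point.

library(energy)   # dcov(): sample distance covariance of Székely and Rizzo

# coef_list: hypothetical list of n x K numeric matrices, one per functional
#            variable (e.g. basis coefficients obtained with fda::Data2fd)
# y:         class labels of the n curves
# keep:      number of top-ranked variables to retain
rank_variables_dcov <- function(coef_list, y, keep = 2) {
  # code the labels as a 0/1 indicator matrix so dcov() receives two numeric samples
  y_ind <- model.matrix(~ cls - 1, data = data.frame(cls = factor(y)))

  # dependence score of each functional variable with the class labels
  scores <- vapply(coef_list, function(C) dcov(as.matrix(C), y_ind), numeric(1))

  # order variables by decreasing dependence and keep the strongest ones
  ord <- order(scores, decreasing = TRUE)
  list(ranking = ord, scores = scores[ord], selected = ord[seq_len(keep)])
}

The coefficient matrices of the selected variables can then be passed to any classifier appropriate for functional data.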

REFERENCES

ANDO, T., (2009). Penalized optimal scoring for the classification of multi-dimensional functional data, Statistical Methodology, 6, pp. 565–576.

BERRENDERO, J. R., CUEVAS, A., TORRECILLA, J. L., (2016). Variable selection in functional data classification: a maxima-hunting proposal, Statistica Sinica, 26 (2), pp. 619–638.

DELAIGLE, A., HALL, P., (2012). Methodology and theory for partial least squares applied to functional data, Annals of Statistics, 40, pp. 322–352.

FERRATY, F., VIEU, P., (2003). Curve discrimination. A nonparametric functional approach. Computational Statistics & Data Analysis, 44, pp. 161–173.

FERRATY, F., VIEU, P., (2009). Additive prediction and boosting for functional data. Computational Statistics & Data Analysis, 53 (4), pp. 1400–1413.

GÓRECKI, T., KRZYŚKO, M., WASZAK, Ł., WOŁYŃSKI, W., (2014). Methods of reducing dimension for functional data, Statistics in Transition new series, 15, pp. 231–242.

GÓRECKI, T., KRZYŚKO, M., WOŁYŃSKI, W., (2016). Multivariate functional regression analysis with application to classification problems, In: Analysis of Large and Complex Data, Studies in Classification, Data Analysis, and Knowledge Organization, Eds.: Wilhelm Adalbert F. X., Kestler Hans A., Springer International Publishing, pp. 173–183.

GRETTON, A., BOUSQUET, O., SMOLA, A., SCHÖLKOPF, B., (2005). Measuring statistical dependence with Hilbert-Schmidt norms. In: Algorithmic Learning Theory (S., Jain, H. U., Simon and E., Tomita, eds.), Lecture Notes in Computer Science, 3734, pp. 63–77, Springer, Berlin.

HASTIE, T. J., TIBSHIRANI, R. J., BUJA, A., (1995). Penalized discriminant analysis, Annals of Statistics, 23, pp. 73–102.

HORVÁTH, L., KOKOSZKA, P., (2012). Inference for Functional Data with Applications, Springer, New York.

JACQUES, J., PREDA, C., (2014). Model-based clustering for multivariate functional data, Computational Statistics & Data Analysis, 71, pp. 92–106.

KONG, J., WANG, S., WAHBA, G., (2015). Using distance covariance for improved variable selection with application to learning genetic risk models, Statistics in Medicine, 34, pp. 1708–1720.

KUHN, M., Contributions from Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, Brenton Kenkel, the R Core Team, Michael Benesty, Reynald Lescarbeau, Andrew Ziem, Luca Scrucca, Yuan Tang, Can Candan and Tyler Hunt, (2018). caret: Classification and Regression Training, R package version 6.0-80, https://CRAN.R-project.org/package=caret.

R Core Team (2018). R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/.

RAMSAY, J. O., SILVERMAN, B.W., (2005). Functional Data Analysis, Springer, New York.

RAMSAY, J. O., WICKHAM, H., GRAVES, S., HOOKER, G., (2018). fda: Functional Data Analysis, R package version 2.4.8, https://CRAN.R-project.org/package=fda.

RIZZO, M. L., SZÉKELY, G. J., (2018). energy: E-Statistics: Multivariate Inference via the Energy of Data, R package version 1.7-5, https://CRAN.R-project.org/package=energy.

ROSSI, F., DELANNAY, N., CONAN-GUEZ, B., VERLEYSEN, M., (2005). Representation of functional data in neural networks, Neurocomputing, 64, pp. 183–210.

ROSSI, F., VILLA, N., (2006). Support vector machines for functional data classification, Neurocomputing, 69, pp. 730–742.

ROSSI, N., WANG, X., RAMSAY, J. O., (2002). Nonparametric item response function estimates with the EM algorithm, Journal of Educational and Behavioral Statistics, 27, pp. 291–317.

SCHÖLKOPF, B., SMOLA, A. J., MÜLLER, K. R., (1998). Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10, pp. 1299–1319.

SZÉKELY, G. J., RIZZO, M. L., BAKIROV, N. K., (2007). Measuring and testing dependence by correlation of distances, The Annals of Statistics, 35 (6), pp. 2769–2794.

SZÉKELY, G. J., RIZZO, M. L., (2009). Brownian distance covariance, Annals of Applied Statistics, 3 (4), pp. 1236–1265.

SZÉKELY, G. J., RIZZO, M. L., (2012). On the uniqueness of distance covariance, Statistics & Probability Letters, 82 (12), pp. 2278–2282.

SZÉKELY, G. J., RIZZO, M. L., (2013). The distance correlation t-test of independence in high dimension. Journal of Multivariate Analysis, 117, pp. 193–213.
