School of Computing

Staff Profiles

Dr Jaume Bacardit

Reader in Machine Learning


I have been a Reader in Machine Learning at the School of Computing, Newcastle University, since 2017. I am affiliated with the Interdisciplinary Computing and Complex BioSystems (ICOS) research group. My areas of expertise are machine learning, bioinformatics and biomedical data analytics.


My research interests include the development of machine learning methods for large-scale problems and their application to challenging domains, mostly involving biological data.

Academic background

I received a BEng and MEng in Computer Engineering and a PhD in Computer Science from the Ramon Llull University in Barcelona, Spain in 1998, 2000 and 2004, respectively.

My PhD thesis involved the adaptation and application of a class of rule-based machine learning methods called Learning Classifier Systems to Data Mining tasks in terms of scalability, knowledge representations and generalisation capacity.

From 2005 to 2007 I was a postdoc at the University of Nottingham working on Protein Structure Prediction. From 2008 to 2013 I was a Lecturer in Bioinformatics at the University of Nottingham, and from 2014 to 2017 I was a Senior Lecturer in Biodata Mining at Newcastle University.



I have published papers on algorithmic advances to improve the scalability of machine learning methods, tackling challenges such as high-dimensional spaces, large sets of records, postprocessing operators and the use of data-intensive computing technologies such as GPUs and MapReduce.

The main focus of my applied research on biological data is knowledge discovery: analyzing the structure of data mining models to discover useful knowledge, such as (panels of) biomarkers or functional networks, and in this way bringing the data mining process closer to the domain experts. My methods have been applied to a variety of biological and biomedical domains (the process of germination in plants, cancer in humans, and osteoarthritis in both humans and model organisms) and to data from multiple biotechnologies: transcriptomics, proteomics, lipidomics, etc.

I lead the data analytics work of three large projects: D-BOARD (Biomedical, EU FP7, 6M euros), APPROACH (Biomedical, EU Innovative Medicines Initiative, 15M euros) and Portabolomics (Synthetic Biology, UK EPSRC, £4.3M). I am also a co-investigator in the Critical project (Cybersecurity, UK EPSRC, £2M).



  • Lazzarini N, Runhaar J, Bay-Jensen AC, Thudium CS, Bierma-Zeinstra SMA, Henrotin Y, Bacardit J. A machine learning approach for the identification of new biomarkers for knee osteoarthritis development in overweight and obese women. Osteoarthritis and Cartilage 2017, 25(12), 2014-2021.
  • Baron S, Lazzarini N, Bacardit J. Characterising the influence of rule-based knowledge representations in biological knowledge extraction from transcriptomics data. In: EvoApplications 2017: 20th European Conference on the Applications of Evolutionary Computation. 2017, Amsterdam: Springer.
  • Lazzarini N, Bacardit J. RGIFE: a ranked guided iterative feature elimination heuristic for the identification of biomarkers. BMC Bioinformatics 2017, 18, 322.
  • Garcia-Piquer A, Bacardit J, Fornells A, Golobardes E. Scaling-up multiobjective evolutionary clustering algorithms using stratification. Pattern Recognition Letters 2017, 93, 69-77.
  • Lazzarini N, Widera P, Williamson S, Heer R, Krasnogor N, Bacardit J. Functional networks inference from machine learning models. BioData Mining 2016, 9, 28.
  • Gutierrez PD, Lastra M, Bacardit J, Benitez JM, Herrera F. GPU-SME-kNN: Scalable and memory efficient kNN and lazy learning using GPUs. Information Sciences 2016, 373, 165-182.
  • Franco MA, Bacardit J. Large-scale experimental evaluation of GPU strategies for evolutionary machine learning. Information Sciences 2016, 330, 385-402.
  • Swan AL, Stekel DJ, Hodgman C, Allaway D, Alqahtani MH, Mobasheri A, Bacardit J. A machine learning heuristic to identify biologically relevant and minimal biomarker panels from omics data. BMC Genomics 2015, 16(Suppl 1), S2.
  • Martinez-Ballesteros M, Bacardit J, Troncoso A, Riquelme JC. Enhancing the scalability of a genetic algorithm to discover quantitative association rules in large-scale datasets. Integrated Computer-Aided Engineering 2015, 22(1), 21-39.
  • Triguero I, Peralta D, Bacardit J, García S, Herrera F. MRPR: A MapReduce solution for prototype reduction in big data classification. Neurocomputing 2015, 150(PA), 331-345.
  • Eduati F, Mangravite L, Wang T, Tang H, Bare J, Huang R, Norman T, Kellen M, Menden M, Yang J, Zhan X, Zhong R, Xiao G, Xia M, Abdo N, Kosyk O, Eduati F, Bare J, Norman T, Kellen M, Menden M, Friend S, Stolovitzky G, Dearry A, Tice R, Huang R, Xia M, Simeonov A, Abdo N, Kosyk O, Rusyn I, Wright F, Wang T, Tang H, Zhan X, Yang J, Zhong R, Xiao G, Xie Y, Tang H, Yang J, Wang T, Xiao G, Xie Y, Alaimo S, Amadoz A, Ammad-Ud-din M, Azencott C, Bacardit J, Barron P, Bernard E, Beyer A, Bin S, van-Bömmel A, Borgwardt K, Brys A, Caffrey B, Chang J, Chang J, Christodoulou E, Clément-Ziza M, Cohen T, Cowherd M, Demeyer S, Dopazo J, Elhard J, Falcao A, Ferro A, Friedenberg D, Giugno R, Gong Y, Gorospe J, Granville C, Grimm D, Heinig M, Hernansaiz R, Hochreiter S, Huang L, Huska M, Jiao Y, Klambauer G, Kuhn M, Kursa M, Kutum R, Lazzarini N, Lee I, Leung M, Lim W, Liu C, López F, Mammana A, Mayr A, Michoel T, Mongiovì M, Moore J, Narasimhan R, Opiyo S, Pandey G, Peabody A, Perner J, Pulvirenti A, Rawlik K, Reinhardt S, Riffle C, Ruderfer D, Sander A, Savage R, Scornet E, Sebastian-Leon P, Sharan R, Simon-Gabriel C, Stoven V, Sun J, Tang H, Teixeira A, Tenesa A, Vert J, Vingron M, Wang T, Walter T, Whalen S, Wisniewska Z, Wu Y, Xiao G, Xie Y, Xu H, Yang J, Zhan X, Zhang S, Zhao J, Zheng W, Zhong R, Ziwei D, Friend S, Dearry A, Simeonov A, Tice R, Rusyn I, Wright F, Stolovitzky G, Xie Y, Saez-Rodriguez J. Prediction of human population responses to toxic compounds by a collaborative competition. Nature Biotechnology 2015, 33, 933-940.
  • Triguero I, del Rio S, Lopez V, Bacardit J, Benitez JM, Herrera F. ROSEFW-RF: The winner algorithm for the ECBDL'14 big data competition: An extremely imbalanced big data bioinformatics problem. Knowledge-Based Systems 2015, 87, 69-79.
  • Triguero I, Peralta D, Bacardit J, Garcia S, Herrera F. A combined MapReduce-windowing two-level parallel scheme for evolutionary prototype generation. In: 2014 IEEE Congress on Evolutionary Computation (CEC). 2014, Beijing, China: IEEE.
  • Bacardit J, Widera P, Lazzarini N, Krasnogor N. Hard Data Analytics Problems Make for Better Data Analysis Algorithms: Bioinformatics as an Example. Big Data 2014, 2(3), 164-176.
  • Blakes J, Raz O, Feige U, Bacardit J, Widera P, Ben-Yehezkel T, Shapiro E, Krasnogor N. Heuristic for Maximizing DNA Reuse in Synthetic DNA Library Assembly. ACS Synthetic Biology 2014, 3(8), 529-542.
  • Garcia-Piquer A, Fornells A, Bacardit J, Orriols-Puig A, Golobardes E. Large-Scale Experimental Evaluation of Cluster Representations for Multiobjective Evolutionary Clustering. IEEE Transactions on Evolutionary Computation 2014, 18(1), 36-53.
  • Alkurashi MM, May ST, Kong K, Bacardit J, Haig D, Elsheikha HM. Susceptibility to experimental infection of the invertebrate locusts (Schistocerca gregaria) with the apicomplexan parasite Neospora caninum. PeerJ 2014, 2, e674.
  • Gibbs DJ, Bacardit J, Bachmair A, Holdsworth MJ. The eukaryotic N-end rule pathway: conserved mechanisms and diverse functions. Trends in Cell Biology 2014, 24(10), 603-611.
  • Khoury GA, Liwo A, Khatib F, Zhou H, Chopra G, Bacardit J, Bortot LO, Faccioli RA, Deng X, He Y, Krupa P, Li J, Mozolewska MA, Sieradzan AK, Smadbeck J, Wirecki T, Cooper S, Flatten J, Xu K, Baker D, Cheng J, Delbem ACB, Floudas CA, Keasar C, Levitt M, Popovic Z, Scheraga HA, Skolnick J, Crivelli SN, Foldit P. WeFold: A coopetition for protein structure prediction. Proteins: structure, function, and bioinformatics 2014, 82(9), 1850-1868.
  • Swan AL, Hillier KL, Smith JR, Allaway D, Liddell S, Bacardit J, Mobasheri A. Analysis of mass spectrometry data from the secretome of an explant model of articular cartilage exposed to pro-inflammatory and anti-inflammatory stimuli using machine learning. BMC Musculoskeletal Disorders 2013, 14, 349.
  • Swan AL, Mobasheri A, Allaway D, Liddell S, Bacardit J. Application of Machine Learning to Proteomics Data: Classification and Biomarker Identification in Postgenomics Biology. OMICS: A Journal of Integrative Biology 2013, 17(12), 595-610.
  • Franco MA, Krasnogor N, Bacardit J. GAssist vs. BioHEL: critical assessment of two paradigms of genetics-based machine learning. Soft Computing 2013, 17(6), 953-981.
  • Calian DA, Bacardit J. Integrating memetic search into the BioHEL evolutionary learning system for large-scale datasets. Memetic Computing 2013, 5(2), 95-130.
  • Bacardit J, Llorà X. Large-scale data mining using genetics-based machine learning. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2013, 3(1), 37-61.
  • Franco MA, Krasnogor N, Bacardit J. Analysing BioHEL using challenging boolean functions. Evolutionary Intelligence 2012, 5(2), 87-102.
  • Bacardit J, Widera P, Márquez-Chamorro A, Divina F, Aguilar-Ruiz JS, Krasnogor N. Contact map prediction using a large-scale ensemble of rule sets and the fusion of multiple predicted structural features. Bioinformatics 2012, 28(19), 2441-2448.
  • Franco MA, Krasnogor N, Bacardit J. Post-processing operators for decision lists. In: Fourteenth International Conference on Genetic and Evolutionary Computation - GECCO '12. 2012.
  • Fainberg HP, Bodley K, Bacardit J, Li D, Wessely F, Mongan NP, Symonds ME, Clarke L, Mostyn A. Reduced Neonatal Mortality in Meishan Piglets: A Role for Hepatic Fatty Acids? PLoS ONE 2012, 7(11), e49101.
  • Glaab E, Bacardit J, Garibaldi JM, Krasnogor N. Using Rule-Based Machine Learning for Candidate Disease Gene Prioritization and Sample Classification of Cancer Gene Expression Data. PLoS ONE 2012, 7(7), e39932.
  • Bassel GW, Glaab E, Marquez J, Holdsworth MJ, Bacardit J. Functional Network Construction in Arabidopsis Using Rule-Based Machine Learning on Large-Scale Data Sets. Plant Cell 2011, 23(9), 3101-3116.
  • Franco MA, Krasnogor N, Bacardit J. Modelling the initialisation stage of the ALKR representation for discrete domains and GABIL encoding. In: 13th Annual Conference on Genetic and Evolutionary Computation - GECCO '11. 2011.
  • Smith RE, Jiang MK, Bacardit J, Stout M, Krasnogor N, Hirst JD. A learning classifier system with mutual-information-based fitness. Evolutionary Intelligence 2010, 3(1), 31-50.
  • Bacardit J, Browne W, Drugowitsch J, Bernadó-Mansilla E, Butz MV, ed. Learning Classifier Systems. Springer, 2010.
  • Franco MA, Krasnogor N, Bacardit J. Speeding up the evaluation of evolutionary learning systems using GPGPUs. In: 12th Annual Conference on Genetic and Evolutionary Computation. 2010.
  • Bacardit J, Krasnogor N. A mixed discrete-continuous attribute list representation for large scale classification domains. In: 11th Annual Conference on Genetic and Evolutionary Computation - GECCO '09. 2009.
  • Bacardit J, Stout M, Hirst JD, Valencia A, Smith RE, Krasnogor N. Automated Alphabet Reduction for Protein Datasets. BMC Bioinformatics 2009, 10, 6.
  • Bacardit J, Burke EK, Krasnogor N. Improving the scalability of rule-based evolutionary learning. Memetic Computing 2009, 1(1), 55-67.
  • Alcala-Fdez J, Sanchez L, Garcia S, delJesus MJ, Ventura S, Garrell JM, Otero J, Romero C, Bacardit J, Rivas VM, Fernandez JC, Herrera F. KEEL: a software tool to assess evolutionary algorithms for data mining problems. Soft Computing 2009, 13(3), 307-318.
  • Bacardit J, Krasnogor N. Performance and Efficiency of Memetic Pittsburgh Learning Classifier Systems. Evolutionary Computation Journal 2009, 17(3), 307-342.
  • Stout M, Bacardit J, Hirst JD, Smith RE, Krasnogor N. Prediction of topological contacts in proteins using learning classifier systems. Soft Computing 2009, 13(3), 245-258.
  • Bacardit J, Stout M, Hirst JD, Krasnogor N. Data Mining in Proteomics with Learning Classifier Systems. In: Learning Classifier Systems in Data Mining. Springer, 2008, pp.17-46.
  • Bacardit J, Krasnogor N. Empirical Evaluation of Ensemble Techniques for a Pittsburgh Learning Classifier System. In: Learning Classifier Systems. Springer, 2008, pp.255-268.
  • Bacardit J, Bernadó-Mansilla E, Butz MV, Kovacs T, Llorà X, Takadama K, ed. Learning Classifier Systems. 2008.
  • Bacardit J, Bernado-Mansilla E, Butz MV. Learning Classifier Systems: Looking Back and Glimpsing Ahead. In: Learning Classifier Systems. Springer, 2008, pp.1-21.
  • Stout M, Bacardit J, Hirst JD, Krasnogor N. Prediction of recursive convex hull class assignments for protein residues. Bioinformatics 2008, 24(7), 916-923.
  • Bacardit J, Stout M, Hirst JD, Sastry K, Llorà X, Krasnogor N. Automated alphabet reduction method with evolutionary algorithms for protein structure prediction. In: 9th Annual Conference on Genetic and Evolutionary Computation - GECCO '07. 2007.
  • Bacardit J, Garrell JM. Bloat Control and Generalization Pressure Using the Minimum Description Length Principle for a Pittsburgh Approach Learning Classifier System. In: Learning Classifier Systems. Springer, 2007, pp.59-79.
  • Bacardit J, Butz MV. Data Mining in Learning Classifier Systems: Comparing XCS with GAssist. In: Learning Classifier Systems. Springer, 2007, pp.282-290.
  • Bacardit J, Goldberg DE, Butz MV. Improving the Performance of a Pittsburgh Learning Classifier System Using a Default Rule. In: Learning Classifier Systems. Springer, 2007, pp.291-307.
  • Bacardit J, Stout M, Krasnogor N, Hirst JD, Blazewicz J. Coordination number prediction using learning classifier systems: performance and interpretability. In: 8th Annual Conference on Genetic and Evolutionary Computation - GECCO '06. 2006.
  • Stout M, Bacardit J, Hirst JD, Krasnogor N, Blazewicz J. From HP Lattice Models to Real Proteins: Coordination Number Prediction Using Learning Classifier Systems. In: Applications of Evolutionary Computing. Springer, 2006, pp.208-220.
  • Stout M, Bacardit J, Hirst JD, Blazewicz J, Krasnogor N. Prediction Of Residue Exposure And Contact Number For Simplified Hp Lattice Model Proteins Using Learning Classifier Systems. In: 7th International FLINS Conference on Applied Artificial Intelligence. 2006.
  • Bacardit J, Krasnogor N. Smart crossover operator with multiple parents for a Pittsburgh learning classifier system. In: 8th Annual Conference on Genetic and Evolutionary Computation - GECCO '06. 2006.
  • Bacardit J. Analysis of the initialization stage of a Pittsburgh approach learning classifier system. In: 7th Annual Conference on Genetic and Evolutionary Computation - GECCO '05. 2005.
  • Bacardit J, Garrell JM. Analysis and Improvements of the Adaptive Discretization Intervals Knowledge Representation. In: 5th Annual Conference on Genetic and Evolutionary Computation - GECCO '03. 2004.
  • Aguilar-Ruiz J, Bacardit J, Divina F. Experimental Evaluation of Discretization Schemes for Rule Induction. In: 6th Annual Conference on Genetic and Evolutionary Computation - GECCO '04. 2004.
  • Bacardit J, Goldberg DE, Butz MV, Llorà X, Garrell JM. Speeding-Up Pittsburgh Learning Classifier Systems: Modeling Time and Accuracy. In: Parallel Problem Solving from Nature - PPSN VIII. 2004.
  • Teixido M, Belda I, Roselló X, González S, Fabre M, Llorá X, Bacardit J, Garrell JM, Vilaró S, Albericio F, Giralt E. Development of a Genetic Algorithm to Design and Identify Peptides that can Cross the Blood-Brain Barrier. QSAR & Combinatorial Science 2003, 22(7), 745-753.
  • Bacardit J, Garrell JM. Evolving Multiple Discretizations with Adaptive Intervals for a Pittsburgh Rule-Based Learning Classifier System. In: Genetic and Evolutionary Computation Conference - GECCO 2003. 2003, Springer.
  • Bacardit J, Garrell JM. Evolution of Multi-adaptive Discretization Intervals for a Rule-Based Genetic Learning System. In: Iberoamerican Conference in Artificial Intelligence - IBERAMIA2002. 2002, Springer.