Active Learning with Interactive Response Time and its Application to the Diagnosis of Parasites

Priscila Tiemi Maeda Saito†⋆
Advisor: Alexandre Xavier Falcão†
Co-Advisor: Pedro Jussieu de Rezende†
†Institute of Computing, University of Campinas, Brazil
⋆Department of Computing, Federal University of Technology - Paraná, Brazil
August 26-29, 2015
IC (UNICAMP) SIBGRAPI 2015 maeda@ic.unicamp.br

Sections: Introduction, DROP, RDS, Experiments, Conclusions

Introduction

- Data acquisition technologies ⇒ large unlabeled datasets
- How to organize them?
- Image annotation by an expert

[Figure: example images of Giardia duodenalis, Taenia, Ascaris lumbricoides, Ancylostomatidae, and Schistosoma mansoni]

Motivation

Can the expert label a small number of images and the computer annotate the remaining ones with high accuracy?

[Figure: large unlabeled dataset, some labeled training samples (Giardia duodenalis, Taenia, Ascaris lumbricoides, Ancylostomatidae, Schistosoma mansoni), expert, supervised classifier. Questions: How many? Where? Which samples?]

Active learning techniques have been investigated to answer this question. These techniques aim to identify the most informative images for manual annotation in a few computer learning iterations. In each iteration, the computer selects images from the dataset and suggests their labels to the expert, who can accept or correct the labels for the next learning iteration.

However, these techniques usually adopt a common strategy, which requires at each learning iteration:
1. classification of the entire image dataset,
2. reorganization of all images according to some (sorting) criterion, and
3. selection of the most informative ones to train the classifier.
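The three steps of the common strategy can be sketched as follows. This is only an illustrative sketch, not the method of any particular technique surveyed in the slides: the 1-nearest-neighbor classifier, the margin-based sorting criterion, and the names `nearest_per_class` and `standard_iteration` are all assumptions made for the example.

```python
import math

def nearest_per_class(x, labeled):
    """Distance from x to the closest labeled sample of each class."""
    best = {}
    for point, label in labeled:
        d = math.dist(x, point)
        if label not in best or d < best[label]:
            best[label] = d
    return best

def standard_iteration(unlabeled, labeled, batch_size):
    """One iteration of the common strategy: classify all, sort all, select."""
    scored = []
    for x in unlabeled:  # step 1: classify the entire dataset
        dists = sorted(nearest_per_class(x, labeled).items(), key=lambda kv: kv[1])
        suggested = dists[0][0]  # label suggested to the expert
        # Assumed criterion: a small margin between the two closest classes
        # means the sample is uncertain, hence treated as informative.
        margin = dists[1][1] - dists[0][1] if len(dists) > 1 else math.inf
        scored.append((margin, x, suggested))
    scored.sort(key=lambda t: t[0])  # step 2: reorganize all samples
    return [(x, s) for _, x, s in scored[:batch_size]]  # step 3: select
```

Every call touches every unlabeled sample, so the cost of a single iteration grows with the size of the whole dataset rather than with the number of samples shown to the expert.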
Standard Paradigm

[Diagram: learning cycle. Labels: (Large) Learning Set, Training, Classifier, Classification, Organization, Selection, Selected Samples, Expert, Annotated Samples]

For large datasets, it is impractical to process the entire dataset at every iteration!

Proposed Paradigm

[Diagram: preprocessing followed by a learning cycle. Labels: (Large) Learning Set, Preprocessing (Reduction, Organization), Organized Learning Set, Selection and Classification, Selected Samples, Expert, Annotated Samples, Training, Classifier]

How can we reduce the dataset and organize the reduced set for effective active learning?

Methodology

Key Challenges
- How many samples should be used in the learning process?
- How to ensure that these samples will be the most informative for training the classifier?

Goals
- keeping expert involvement to a minimum
- achieving high accuracies early

This PhD research presents a solution that
- reduces the dataset to a subset that potentially includes samples from all classes for the first iteration and the most informative ones for the remaining iterations,
- organizes the most informative samples, such that the most useful ones from distinct classes will be selected first, and
- selects a small number of samples per iteration (i.e., 2 × c samples, for c classes).

Main Contributions

- Development of a novel active learning paradigm
  - DROP - Data Reduction and Organization Paradigm
- Development of new active learning methods
  - Cluster-OPF-Rand - Boundary-based Reduction
  - DBE - Decreasing Boundary Edges
  - MST-BE - Minimum-Spanning Tree Boundary Edges
  - ASSL-OPF - Active Semi-Supervised Learning
  - RDS - Root Distance-Based Sampling
- Validation in a real environment by an experienced expert in parasitology using a realistic scenario
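To illustrate why a one-time preprocessing step keeps each iteration cheap, here is a minimal sketch of a learning cycle that consumes an already organized list in batches of 2 × c samples. The function name `drop_learning_cycle` and the callbacks `ask_expert` and `train` are hypothetical, and the sketch simply walks the organized list in order; the slides do not spell out the actual selection rule of each proposed method.

```python
def drop_learning_cycle(organized, num_classes, ask_expert, train, max_iterations):
    """Consume a pre-organized list in batches of 2 * num_classes samples."""
    batch_size = 2 * num_classes
    annotated = []
    classifier = None
    position = 0
    for _ in range(max_iterations):
        batch = organized[position:position + batch_size]
        if not batch:
            break  # the organized set is exhausted
        position += len(batch)
        # The current classifier (none in the first iteration) suggests labels;
        # the expert accepts or corrects each suggestion.
        suggestions = [classifier(x) if classifier else None for x in batch]
        labels = [ask_expert(x, s) for x, s in zip(batch, suggestions)]
        annotated.extend(zip(batch, labels))
        classifier = train(annotated)  # retrain on all annotations so far
    return classifier, annotated
```

In this sketch only the current batch is classified per iteration, so the work done between two expert interactions depends on 2 × c and on the number of annotated samples, not on the size of the original dataset.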
Preprocessing - Reduction

[Diagram: preprocessing pipeline. Labels: Learning Set, Clustering, Clusters, Reduction, Reduced Set, Organized Set (Root and Boundary Samples), Root Set, Boundary Set, First Iteration, Remaining Iterations, Selected Samples, Expert, Training, Classifier]

- Learning set ⇒ grouped by any clustering technique
- Reduced set ⇒ cluster roots and boundary samples between distinct clusters
- Boundary sample ⇒ a sample that has, among its k nearest neighbors (k-NN), at least one whose label is different
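The reduction rule above can be sketched in a few lines. This is a sketch under stated assumptions: the cluster assignment (`cluster_of`) and the root indices (`roots`) are supplied by the caller, since the slides allow any clustering technique, the "label" compared among neighbors is taken to be the cluster label, and the brute-force neighbor search and the function names are illustrative only.

```python
import math

def k_nearest(i, points, k):
    """Indices of the k samples closest to sample i (excluding i itself)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def reduce_learning_set(points, cluster_of, roots, k):
    """Return sorted indices of cluster roots plus boundary samples."""
    keep = set(roots)  # roots are always kept
    for i in range(len(points)):
        # Boundary sample: at least one of its k-NN lies in another cluster.
        if any(cluster_of[j] != cluster_of[i] for j in k_nearest(i, points, k)):
            keep.add(i)
    return sorted(keep)
```

For example, with one-dimensional points `[(0.0,), (1.0,), (2.0,), (2.9,), (3.5,), (5.0,), (6.0,)]`, clusters `[0, 0, 0, 0, 1, 1, 1]`, roots `[1, 5]`, and `k = 1`, only the two samples facing each other across the cluster border (indices 3 and 4) are added to the roots.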
Boundary samples may make it possible to locate the boundary between classes.

Preprocessing - Organization

1. Cluster-OPF-Rand randomly selects samples from the reduced set
2. Decreasing Boundary Edges (DBE) organizes the reduced set in decreasing order of the weights of its boundary edges
3. Minimum-Spanning Tree Boundary Edges (MST-BE) organizes the MST boundary edges from the reduced set in decreasing weight order
4. Active Semi-Supervised Learning (ASSL-OPF) integrates semi-supervised learning and a priori reduction and organization criteria
5. Root Distance-Based Sampling (RDS) pre-organizes the data and then balances the selection of diverse and uncertain samples

Diagnosis of Parasites - Scenarios

Impurities ⇒ exceedingly abundant and quite similar to some species of parasites
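Returning to the organization strategies listed above, a DBE-style ordering might look like the sketch below. It rests on several assumptions not stated in the slides: a boundary edge is taken to be a pair of sample indices from different clusters, its weight is the Euclidean distance between the two samples, and the boundary edges are supplied by the caller. The function name is hypothetical.

```python
import math

def organize_by_decreasing_edges(points, boundary_edges):
    """Order samples by visiting boundary edges from heaviest to lightest."""
    weighted = sorted(
        boundary_edges,
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
        reverse=True,  # decreasing weight order
    )
    order, seen = [], set()
    for i, j in weighted:
        for s in (i, j):
            if s not in seen:  # each sample appears once in the organized set
                seen.add(s)
                order.append(s)
    return order
```

An MST-BE-style variant would apply the same ordering only to the boundary edges that belong to a minimum spanning tree of the reduced set; building that tree is omitted here.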
Diagnosis of Parasites - Scenario without Impurities

[Diagram: Organized Set of unlabeled samples (9,?), (1,?), (8,?), (6,?), (5,?), (4,?), (2,?), (3,?), (7,?), (10,?). Labels: Selection Strategies, Learning Cycle, Training, Final Classifier]

- Without impurities ⇒ the reduction strategy is very effective
- Roots of the clusters ⇒ representative samples from each class
- Boundary samples ⇒ informative samples from different classes
sn Ciclo de Realimentação Selector Non-Annotated Dataset Annotated Dataset Reduced Learning Set Reduction Organization (Large) Learning Set Training Expert Selection and Classification Classifier Organized Learning SetLearning Cycle Annotated Samples Selected Samples Selector Training Expert Selection and Classification Classifier Learning Cycle Annotated Samples Selected Samples Selection Strategies classify Non-Annotated Samples Learning Cycle Preprocessing Active Semi-Supervised Learning Supervised Learning Cycle Non-Supervised Preprocessing Descriptor Extraction Final Classifier Training (9,?) (1,?) (8,?) (6,?) (5,?) (4,?) (2,?) (3,?) (7,?) (10,?) Organized Set major challenge ⇒ unbalanced classes and several clusters in the feature space IC (UNICAMP) SIBGRAPI 2015 maeda@ic.unicamp.br 16 Introduction DROP RDS Experiments Conclusions Diagnosis of Parasites - Scenario with Impurities Analysis and Organization Strategies Imagens rotuladas por grupo Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Image Databases Non-Annotated Dataset Processo Não-Supervisionado Processo Supervisionado Seletor de Imagens ok? sn Extração de Descritores Base de Imagens Base de Descritores Ciclo de Realimentação Extração de Descritores Base de Imagens Base de Descritores Extração de Descritores Análise e Ordenação Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Base de Imagens Base de Descritores Seletor de Imagens ok? 
sn Ciclo de Realimentação Selector Non-Annotated Dataset Annotated Dataset Reduced Learning Set Reduction Organization (Large) Learning Set Training Expert Selection and Classification Classifier Organized Learning SetLearning Cycle Annotated Samples Selected Samples Selector Training Expert Selection and Classification Classifier Learning Cycle Annotated Samples Selected Samples Selection Strategies classify Non-Annotated Samples Learning Cycle Preprocessing Active Semi-Supervised Learning Supervised Learning Cycle Non-Supervised Preprocessing Descriptor Extraction Final Classifier Training (9,?) (1,?) (8,?) (6,?) (5,?) (4,?) (2,?) (3,?) (7,?) (10,?) Organized Set With impurities ⇒ the reduction strategy is considerably less effective IC (UNICAMP) SIBGRAPI 2015 maeda@ic.unicamp.br 16 Introduction DROP RDS Experiments Conclusions Diagnosis of Parasites - Scenario with Impurities Analysis and Organization Strategies Imagens rotuladas por grupo Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Image Databases Non-Annotated Dataset Processo Não-Supervisionado Processo Supervisionado Seletor de Imagens ok? sn Extração de Descritores Base de Imagens Base de Descritores Ciclo de Realimentação Extração de Descritores Base de Imagens Base de Descritores Extração de Descritores Análise e Ordenação Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Base de Imagens Base de Descritores Seletor de Imagens ok? 
sn Ciclo de Realimentação Selector Non-Annotated Dataset Annotated Dataset Reduced Learning Set Reduction Organization (Large) Learning Set Training Expert Selection and Classification Classifier Organized Learning SetLearning Cycle Annotated Samples Selected Samples Selector Training Expert Selection and Classification Classifier Learning Cycle Annotated Samples Selected Samples Selection Strategies classify Non-Annotated Samples Learning Cycle Preprocessing Active Semi-Supervised Learning Supervised Learning Cycle Non-Supervised Preprocessing Descriptor Extraction Final Classifier Training (9,?) (1,?) (8,?) (6,?) (5,?) (4,?) (2,?) (3,?) (7,?) (10,?) Organized Set Roots of the clusters ⇒ representative samples from each class IC (UNICAMP) SIBGRAPI 2015 maeda@ic.unicamp.br 16 Introduction DROP RDS Experiments Conclusions Diagnosis of Parasites - Scenario with Impurities Analysis and Organization Strategies Imagens rotuladas por grupo Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Image Databases Non-Annotated Dataset Processo Não-Supervisionado Processo Supervisionado Seletor de Imagens ok? sn Extração de Descritores Base de Imagens Base de Descritores Ciclo de Realimentação Extração de Descritores Base de Imagens Base de Descritores Extração de Descritores Análise e Ordenação Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Base de Imagens Base de Descritores Seletor de Imagens ok? 
Boundary samples ⇒ do not correspond to informative samples
RDS Method - Diagnosis of Parasites
[Figure: RDS pipeline. Preprocessing reduces and organizes the (large) learning set into an organized learning set. In the learning cycle, samples are selected and classified, the expert annotates the selected samples, and the annotated samples are used to train the classifier.]
RDS Method - Scenario with Impurities
Organization strategy ⇒ balances the selection of diverse and uncertain samples
Samples from each cluster ⇒ to obtain the most diverse samples
Samples closer to the roots whose labels are distinct from those of their roots
Samples in decreasing order of distance from their roots ⇒ to obtain the most uncertain samples
[Figure: screenshots of the user interface, as used by the parasitologist to verify the labels of selected objects: (a) images with labels given by the classifier; (b) images with labels corrected/confirmed by the parasitologist.]
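The organization strategy above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based data layout, the Euclidean distance, and the `per_cluster` batch size are all hypothetical. Because the slides mention both closeness to the roots and a decreasing distance order, the sort direction is left as a flag rather than fixed.

```python
from math import dist

def organize_by_root_distance(samples, cluster_of, roots, descending=False):
    """Build one list per cluster: non-root sample ids sorted by distance to the root.

    samples: dict sample id -> feature tuple (hypothetical layout)
    cluster_of: dict sample id -> cluster id
    roots: dict cluster id -> root sample id
    """
    lists = {c: [] for c in roots}
    for sid, c in cluster_of.items():
        if sid != roots[c]:
            lists[c].append(sid)
    for c, ids in lists.items():
        root_vec = samples[roots[c]]
        ids.sort(key=lambda s: dist(samples[s], root_vec), reverse=descending)
    return lists

def select_for_annotation(lists, roots, predicted, per_cluster=1):
    """Take from every cluster (diversity) the first samples whose predicted
    label differs from the label of the cluster root (uncertainty)."""
    chosen = []
    for c, ids in lists.items():
        root_label = predicted[roots[c]]
        picked = [s for s in ids if predicted[s] != root_label][:per_cluster]
        chosen.extend(picked)
        for s in picked:
            ids.remove(s)  # a selected sample is not offered again
    return chosen
```

Drawing from every cluster list is what provides diversity here, while the label disagreement with the root stands in for uncertainty. Only the per-cluster lists are scanned, so the sketch never reclassifies or re-sorts the whole learning set inside an iteration.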
Giardia duodenalis and some impurity components are difficult cases for class discrimination.

Experiments
Application on parasite imagery (proprietary data: Laboratory of Visual, Biomedical and Health Informatics, University of Campinas, http://www.liv.ic.unicamp.br)

Dataset   Samples   Classes   Features
d1        1,944     15        260
d2        5,948     16        260
d3        141,059   16        260

≈ 80% learning set, ≈ 20% test set

Experiments
Application on several domains: image segmentation, handwritten digits, forest cover type, faces, cowhide

Dataset     Samples   Classes   Features   Source
Statlog     2,310     7         18         Public data: UCI Machine Learning Repository
Pendigits   10,992    10        16         Public data: UCI Machine Learning Repository
Covertype   581,012   7         54         Public data: UCI Machine Learning Repository
Faces       1,864     54        162        Public data: The Computer Vision Laboratory, University of Notre Dame
Cowhide     1,690     5         63         Proprietary data: Institute of Computing, Federal University of Mato Grosso do Sul

UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets
The Computer Vision Laboratory, University of Notre Dame: www.nd.edu/~cvrl/CVRL

Results - Parasites Dataset d1 (1,944 samples, ≈ 1,455 learning samples, ≈ 489 test samples, 15 classes, 260 features)
[Figure: accuracies and total annotated, panels (a) and (b).]
Best results in a scenario without impurities ⇒ MST-BE
Accuracy of 98.24% ± 0.62 with 9% of the images annotated

Results - Parasites Dataset d2 (5,948 samples, ≈ 4,458 learning samples, ≈ 1,490 test samples, 16 classes, 260 features)
Accuracies ± standard deviations

Clustering   Method     Classifier   Accuracy   Std. dev.
OPF          MST-BE     OPF          89.18%     ± 1.18
OPF          MST-BE     SVM          85.96%     ± 1.72
Kmeans       MST-BE     OPF          83.19%     ± 1.51
Kmeans       MST-BE     SVM          81.40%     ± 1.83
OPF          RDS        OPF          91.58%     ± 0.90
OPF          RDS        SVM          90.27%     ± 1.79
Kmeans       RDS        OPF          87.86%     ± 1.50
Kmeans       RDS        SVM          84.90%     ± 1.53
-            Al-SVM     SVM          77.93%     ± 1.61
-            Rand OPF   OPF          74.07%     ± 2.10

Best results in a scenario with impurities ⇒ RDS
MST-BE and RDS are superior to Al-SVM and Rand OPF
RDS outperformed MST-BE

Results - Parasites Dataset d3 (141,059 samples, 16 classes, 260 features) - OPF RDS OPF Method
Practical experiment performed by the parasitologist
[Figure: panels (a) and (b).]
Remarkable result in a realistic scenario ⇒ accuracy of 88% with 6.9% of the images annotated
Low sensitivity rates from the traditional diagnosis procedure based on visual analysis ⇒ 48.3% up to 75.9%

Results - Parasites Dataset d3 (141,059 samples, 16 classes, 260 features) - OPF RDS OPF Method
Accuracies for each class

Species                             Accuracy
Entamoeba histolytica / E. dispar   60.16%
Giardia duodenalis                  72.83%
Entamoeba coli                      86.75%
Endolimax nana                      84.82%
Iodamoeba bütschlii                 47.50%
Blastocystis hominis                79.03%
Ascaris lumbricoides                94.40%
Enterobius vermicularis             91.43%
Ancylostomatidae                    92.24%
Strongyloides stercoralis           91.96%
Trichuris trichiura                 95.15%
Hymenolepis nana                    93.95%
Hymenolepis diminuta                95.97%
Taenia spp.                         96.48%
Schistosoma mansoni                 91.38%
Impurities                          80.36%

Conclusions
Proposal of a novel learning paradigm that
  is computationally and iteratively efficient,
  avoids processing the entire dataset at each iteration,
  affords interactive response time.
Development of new active learning strategies that
  select the most useful (most diverse and most uncertain) samples,
  provide high classification accuracy quickly,
  identify samples from all classes,
  reduce the human effort to a minimum.
Conclusions
Evaluation of the proposed active learning strategies with
  different clustering and classification techniques,
  baseline learning strategies,
  datasets from different application domains, of different sizes, and with feature spaces of various dimensions and numbers of classes.
Evaluation and validation in a real environment by an expert in parasitology, using a realistic scenario:
  the solution is effective and suitable for laboratory routine.
Promising active learning methodology:
  effective and more efficient in practice for large datasets,
  a valuable contribution to active machine learning.

Overview of the Contributions
[Figure: overview of the pipeline. Non-supervised preprocessing (descriptor extraction, analysis and organization strategies) is followed by a learning cycle in which selection strategies choose non-annotated samples, the classifier labels them, the expert confirms or corrects the labels, and the classifier is retrained on the annotated samples until the result is satisfactory.]
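The learning cycle summarized in the overview can be sketched as a short loop. This is a minimal illustration under assumptions, not the authors' code: `active_learning_cycle`, the callback names, the batch size, and the toy 1-nearest-neighbour classifier are all hypothetical. The point it tries to show is that the organized set is computed once, before the loop, and each iteration only touches the samples it selects.

```python
def active_learning_cycle(organized, oracle, train, select,
                          batch_size=2, max_iters=10, satisfied=None):
    """Select from the pre-organized set, let the expert confirm or correct
    the suggested labels, retrain, and repeat until the expert is satisfied."""
    annotated = {}      # sample id -> label confirmed by the expert
    classifier = None   # no classifier exists before the first iteration
    remaining = list(organized)
    for _ in range(max_iters):
        if not remaining:
            break
        batch = select(remaining, classifier, batch_size)
        for sid in batch:
            suggestion = classifier(sid) if classifier else None
            annotated[sid] = oracle(sid, suggestion)
            remaining.remove(sid)
        classifier = train(annotated)
        if satisfied and satisfied(classifier, annotated):
            break
    return classifier, annotated

# Toy data (hypothetical): one feature per sample and its true class.
values = {0: 0.1, 1: 0.2, 2: 0.9, 3: 1.0, 4: 0.5}
truth = {0: "A", 1: "A", 2: "B", 3: "B", 4: "A"}

def train(annotated):
    # 1-nearest-neighbour over the annotated samples only
    def classify(sid):
        nearest = min(annotated, key=lambda a: abs(values[a] - values[sid]))
        return annotated[nearest]
    return classify

def select(remaining, classifier, k):
    return remaining[:k]  # follow the precomputed organization order

def oracle(sid, suggestion):
    return truth[sid]     # the expert always returns the correct label

clf, labels = active_learning_cycle([0, 2, 4, 1, 3], oracle, train, select,
                                    batch_size=2, max_iters=1)
```

After a single iteration the expert has labeled only samples 0 and 2, yet the toy classifier already separates samples 1 and 3 correctly, which is the kind of behaviour the slides aim for with far larger datasets.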
Publications
Journals
Saito, P. T. M. and Suzuki, C. T. N. and Gomes, J. F. and Falcão, A. X. and de Rezende, P. J. Robust Active Learning for the Diagnosis of Parasites. Pattern Recognition (PR), 2015. pp. 1–12. Impact Factor: 3.096.
Saito, P. T. M. and de Rezende, P. J. and Falcão, A. X. and Suzuki, C. T. N. and Gomes, J. F. An Active Learning Paradigm Based on a Priori Data Reduction and Organization. Expert Systems with Applications (ESwA), 2014. vol. 41, no. 14, pp. 6086–6097. Impact Factor: 2.240.
Saito, P. T. M. and Nakamura, R. Y. M. and Amorim, W. P. and Papa, J. P. and de Rezende, P. J. and Falcão, A. X. Choosing the most effective pattern classification model under learning-time constraint. PLoS ONE, 2015. pp. 1–23. Impact Factor: 3.534.

Publications
Conferences
Saito, P. T. M. and Amorim, W. P. and Falcão, A. X. and de Rezende, P. J. and Suzuki, C. T. N. and Gomes, J. F. and de Carvalho, M. H. Active Semi-Supervised Learning using Optimum-Path Forest. 22nd International Conference on Pattern Recognition (ICPR), 2014. pp. 3798–3803. Qualis A1.
Saito, P. T. M. and de Rezende, P. J. and Falcão, A. X. and Suzuki, C. T. N. and Gomes, J. F. A data reduction and organization approach for efficient image annotation. Proceedings of the 28th Annual ACM Symposium on Applied Computing (SAC), 2013. Coimbra, Portugal. pp. 53–57. Qualis A1.
Saito, P. T. M. and de Rezende, P. J. and Falcão, A. X. and Suzuki, C. T. N. and Gomes, J. F. Improving Active Learning With Sharp Data Reduction. In: WSCG Communication Proceedings of 20th WSCG International Conference on Computer Graphics, Visualization and Computer Vision (WSCG), 2012. Plzen, Czech Republic. pp. 27–34. Qualis B1.
Vargas, J. E. and Saito, P. T. M. and Falcão, A. X. and de Rezende, P. J. and dos Santos, J. A. Superpixels-based interactive classification of very high resolution images. 27th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2014. pp. 173–179. Qualis B1.

Future Extensions
Investigation of techniques to make the RDS method more robust to possible expert mislabeling during active learning:
  consider multiple experts throughout the annotation process;
  develop a mechanism that identifies possible mislabeling according to the previous learning iterations.
Development of new ways to explore the reduction and organization of data.
Another direction is active learning in multi-label problems, wherein each image can belong to multiple categories simultaneously.

Questions/Discussions
Thank You!
DBE Method - Preprocessing - Reduction and Organization
[Figure: DBE pipeline. Preprocessing reduces and organizes the (large) learning set into an organized learning set. In the learning cycle, samples are selected and classified, the expert annotates the selected samples, and the annotated samples are used to train the classifier.]
Learning set ⇒ grouped by any clustering technique
Reduced set ⇒ cluster roots and boundary edges between distinct clusters
Boundary edge ⇒ pair of samples classified into distinct clusters
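A small Python sketch may help make the reduced set concrete. The slides do not state which graph supplies the candidate pairs, so the k-nearest-neighbour graph used here is an assumption, as are the function names, the Euclidean distance, and the value of `k`.

```python
from math import dist

def boundary_edges(points, cluster_of, k=2):
    """Return {(i, j): weight} for k-NN pairs whose endpoints lie in distinct clusters.

    points: dict sample id -> feature tuple (hypothetical layout)
    cluster_of: dict sample id -> cluster id
    """
    edges = {}
    ids = sorted(points)
    for i in ids:
        neighbours = sorted((j for j in ids if j != i),
                            key=lambda j: dist(points[i], points[j]))[:k]
        for j in neighbours:
            if cluster_of[i] != cluster_of[j]:
                # store each undirected edge once, keyed by (smaller id, larger id)
                edges[(min(i, j), max(i, j))] = dist(points[i], points[j])
    return edges

def reduced_set(roots, edges):
    """Cluster roots plus every endpoint of a boundary edge."""
    kept = set(roots)
    for i, j in edges:
        kept.update((i, j))
    return kept
```

With two well-separated groups, only the samples that face the other group survive the reduction, together with the roots, so the learning cycle later works on a small fraction of the original set.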
Organized set ⇒ based on the decreasing weight order of boundary edges
The largest edges may allow selecting sample pairs from distinct classes
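The ordering step can be sketched in a few lines of Python. Placing the roots first is an assumption drawn from the diagram labels (root set in the first iteration, boundary set in the remaining iterations); the function name and the edge dictionary layout are hypothetical.

```python
def organize(roots, edges):
    """Roots first, then endpoints of boundary edges in decreasing weight order.

    roots: iterable of root sample ids
    edges: dict (i, j) -> weight of the boundary edge between samples i and j
    """
    ordered = list(roots)
    seen = set(ordered)
    for (i, j), _weight in sorted(edges.items(), key=lambda e: e[1], reverse=True):
        for s in (i, j):
            if s not in seen:  # a sample may touch several boundary edges
                seen.add(s)
                ordered.append(s)
    return ordered
```

Because both endpoints of an edge are emitted together, each step of the list offers the expert a pair that the clustering placed in different groups, heaviest edges first.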
noSelection Strategies Training Classifier Expert classify no yes selection Non-Annotated Samples Non-Supervised Preprocessing Descriptor Extraction classification Annotated Samples Final Classifier Reduction Learning Set Reduced Set Organized Set (Root and Boundary Samples) 1 8 Expert Annotated Classifier Root Set Selected Samples Clustering Boundary Set First Iteration Remaining Iterations Training 1 2 3 7 6 8 4 59 (9,?) (1,2) (8,1) (6,?) (5,?) (4,3) (2,?) (3,2) (7,3) (10,1) Samples 7 9 3 6 4 2 5 1 8 7 9 3 6 4 2 5 Reduction 1 8 9 7 5 23 4 6 1 8 9 7 5 23 4 6 Learning set ⇒ grouped by any clustering technique Reduced set ⇒ Cluster roots and boundary edges between distinct clusters IC (UNICAMP) SIBGRAPI 2015 maeda@ic.unicamp.br 37 Introduction DROP RDS Experiments Conclusions MST-BE Method - Preprocessing - Reduction and Organization Analysis and Organization Strategies Imagens rotuladas por grupo Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Image Databases Non-Annotated Dataset Processo Não-Supervisionado Processo Supervisionado Seletor de Imagens ok? sn Extração de Descritores Base de Imagens Base de Descritores Ciclo de Realimentação Extração de Descritores Base de Imagens Base de Descritores Extração de Descritores Análise e Ordenação Marcação de Classes Imagens Anotadas Aprendizado Supervisionado Classificador Supervisionado Usuário Base de Imagens Base de Descritores Seletor de Imagens ok? sn Ciclo de Realimentação Selector Non-Annotated Dataset Annotated Dataset Reduced Learning Set Reduction Organization (Large) Learning Set Training Expert Selection and Classification Classifier Organized Learning SetLearning Cycle Annotated Samples Selected Samples Selector Training Expert Selection and Classification Classifier Learning Cycle Annotated Samples Selected Samples iteration?1st is satisfied? 
noSelection Strategies Training Classifier Expert classify no yes selection Non-Annotated Samples Learning Cycle Preprocessing Active Semi-Supervised Learning Supervised Learning Cycle Non-Supervised Preprocessing Descriptor Extraction classification Annotated Samples Final Classifier Analysis and Organization Strategies Image Databases Non-Annotated Dataset iteration?1st is satisfied? noSelection Strategies Training Classifier Expert classify no yes selection Non-Annotated Samples Non-Supervised Preprocessing Descriptor Extraction classification Annotated Samples Final Classifier Organization Learning Set Organized Set (Root and Boundary Samples) 1 8 Expert Annotated Classifier Root Set Selected Samples Clustering Boundary Set First Iteration Remaining Iterations Training 1 2 3 7 6 8 4 59 (9,?) (1,2) (8,1) (6,?) (5,?) (4,3) (2,?) (3,2) (7,3) (10,1) Samples 7 9 3 6 4 2 5
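The reduction and organization steps described on these slides can be illustrated with a small sketch. It is not the authors' implementation. It rests on several assumptions: cluster labels and cluster roots come from some external clustering step; samples are compared with the Euclidean distance; and a boundary edge is an edge of a minimum spanning tree whose endpoints fall in distinct clusters (suggested by the method's name, not stated on the slides). The names `minimum_spanning_tree` and `reduce_and_organize` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]
Edge = Tuple[float, int, int]  # (weight, sample u, sample v)


def minimum_spanning_tree(points: List[Point]) -> List[Edge]:
    """Prim's algorithm on the complete Euclidean graph, O(n^2) time."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known cost to connect each sample
    parent = [-1] * n
    best[0] = 0.0
    edges: List[Edge] = []
    for _ in range(n):
        # Pick the cheapest sample that is not in the tree yet.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax the connection cost of the remaining samples.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def reduce_and_organize(
    points: List[Point], cluster_of: List[int], roots: List[int]
) -> Tuple[List[int], List[Edge]]:
    """Return (reduced set, boundary edges in decreasing weight order).

    Assumption: boundary edges are MST edges linking distinct clusters.
    """
    mst = minimum_spanning_tree(points)
    boundary = [e for e in mst if cluster_of[e[1]] != cluster_of[e[2]]]
    boundary.sort(key=lambda e: e[0], reverse=True)  # organization step
    reduced = list(roots)  # cluster roots come first
    for _, u, v in boundary:
        for s in (u, v):
            if s not in reduced:
                reduced.append(s)
    return reduced, boundary
```

Under these assumptions, the roots would be offered to the expert in the first iteration, and the endpoints of the boundary edges would then be visited from the heaviest edge to the lightest in the remaining iterations.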