
IF67B C71 aula25


Introduction DROP RDS Experiments Conclusions
Active Learning with Interactive Response Time
and its Application to the Diagnosis of Parasites
Priscila Tiemi Maeda Saito†⋆
Advisor: Alexandre Xavier Falcão†
Co-Advisor: Pedro Jussieu de Rezende†
†Institute of Computing, University of Campinas, Brazil
⋆Department of Computing, Federal University of Technology - Paraná, Brazil
August 26-29, 2015
IC (UNICAMP) SIBGRAPI 2015 maeda@ic.unicamp.br
Introduction
Data acquisition technologies ⇒ large unlabeled datasets
How to organize them?
Image annotation by the expert:
Giardia duodenalis
Taenia
Ascaris lumbricoides
Ancylostomatidae
Schistosoma mansoni
Motivation
Can the expert label a small number of images and the computer
annotate the remaining ones with high accuracy?
[Figure: from a large unlabeled dataset, the expert labels some training samples (Giardia duodenalis, Taenia, Ascaris lumbricoides, Ancylostomatidae, Schistosoma mansoni) that train a supervised classifier. Open questions: How many? Where? Which samples?]
Motivation
Active learning techniques have been investigated to answer this
question.
These techniques aim to identify the most informative images for
manual annotation in a few computer learning iterations.
In each iteration, the computer selects images from the dataset and
suggests their labels to the expert, who can accept/correct labels for
the next learning iteration.
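The loop described above can be sketched as follows. This is only an illustration under stated assumptions: the nearest-centroid classifier, the margin-based notion of "most informative", and the names `train_centroids`, `suggest`, `active_learning`, and `expert` are all hypothetical stand-ins, not the methods of this work.

```python
import math

def train_centroids(labeled):
    """Nearest-centroid stand-in classifier: one mean point per class."""
    groups = {}
    for point, label in labeled:
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for label, pts in groups.items()
    }

def suggest(centroids, point):
    """Return (suggested_label, margin); a small margin means low confidence."""
    ranked = sorted((math.dist(point, c), label) for label, c in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else math.inf
    return ranked[0][1], margin

def active_learning(pool, expert, seed, batch_size, iterations):
    """Select samples, suggest labels, let the expert accept or correct, retrain."""
    labeled = [(pool[i], expert(pool[i])) for i in seed]
    unlabeled = [i for i in range(len(pool)) if i not in seed]
    corrections = 0
    for _ in range(iterations):
        centroids = train_centroids(labeled)
        # Assumed informativeness criterion: smallest margin first.
        unlabeled.sort(key=lambda i: suggest(centroids, pool[i])[1])
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        for i in batch:
            suggested, _ = suggest(centroids, pool[i])
            confirmed = expert(pool[i])  # same label = accepted, different = corrected
            corrections += suggested != confirmed
            labeled.append((pool[i], confirmed))
    return train_centroids(labeled), corrections
```

Here `expert` is any callable that returns the true label of a sample; the count of corrections gives a rough measure of how much work the suggestions save the expert.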
Motivation
However, these techniques usually adopt a common strategy, which
requires at each learning iteration:
1 classification of the entire image dataset,
2 reorganization of all images according to some (sorting) criterion, and
3 selection of the most informative ones to train the classifier.
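A minimal sketch of one iteration of this common strategy, to make the per-iteration cost visible. The entropy criterion and the names `entropy`, `standard_iteration`, and `classify` are assumptions for illustration; the slides only say "some (sorting) criterion".

```python
import math

def entropy(probabilities):
    """Shannon entropy of one sample's class-probability vector."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

def standard_iteration(dataset, classify, n_select):
    """One iteration of the common strategy; every sample is processed."""
    # 1. Classify the entire dataset (one classifier call per sample).
    scored = [(entropy(classify(x)), i) for i, x in enumerate(dataset)]
    # 2. Reorganize all samples by the criterion (highest entropy first).
    scored.sort(key=lambda item: (-item[0], item[1]))
    # 3. Select the most informative samples for the expert.
    return [i for _, i in scored[:n_select]]
```

Steps 1 and 2 grow with the size of the whole dataset and are repeated at every iteration, which is the cost the proposed paradigm tries to avoid.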
Standard Paradigm
[Figure: standard paradigm. In each learning cycle, the (large) learning set goes through classification, organization, and selection; the expert annotates the selected samples, and the annotated samples train the classifier.]
For large datasets, it is impractical to process the entire dataset at every
iteration!
Proposed Paradigm
[Figure: proposed paradigm. Preprocessing: the (large) learning set is reduced and organized into an organized learning set. Learning cycle: samples are selected and classified from the organized learning set, the expert annotates the selected samples, and the annotated samples train the classifier.]
How can we reduce the dataset and organize the reduced set for
effective active learning?
Methodology
Key Challenges
How many samples should be used in the learning process?
How to ensure that these samples will be the most informative for
training the classifier?
Goals
keeping expert involvement to a minimum
achieving high accuracies early
Methodology
This PhD research presents a solution that
reduces the dataset to a subset that potentially includes
samples from all classes for the first iteration and
the most informative ones for the remaining iterations,
organizes the most informative samples, such that the most useful ones
from distinct classes will be selected first, and
selects a small number of samples per iteration (i.e., 2 x c samples, for
c classes).
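The three steps above can be sketched as a skeleton in which reduction and organization run once, before the learning cycle. All names (`drop_loop`, `reduce_fn`, `organize_fn`, `train_fn`, `expert`) are hypothetical, and the sketch takes samples strictly in the pre-organized order, which is a simplification.

```python
def drop_loop(dataset, reduce_fn, organize_fn, train_fn, expert, n_classes, iterations):
    """Reduce and organize once, then label 2 * c samples per iteration."""
    # Preprocessing: performed a single time, outside the learning cycle.
    organized = organize_fn(reduce_fn(dataset))
    batch_size = 2 * n_classes
    labeled, classifier = [], None
    for it in range(iterations):
        batch = organized[it * batch_size:(it + 1) * batch_size]
        if not batch:
            break  # the organized set is exhausted
        # The expert annotates only the small selected batch.
        labeled.extend((x, expert(x)) for x in batch)
        classifier = train_fn(labeled)
    return classifier, labeled
```

With c = 5 classes, for instance, the expert would see 2 × 5 = 10 samples per iteration, regardless of how large the original dataset is.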
Main Contributions
Development of a novel active learning paradigm
DROP - Data Reduction and Organization Paradigm
Development of new active learning methods
Cluster-OPF-Rand - Boundary-based Reduction
DBE - Decreasing Boundary Edges
MST-BE - Minimum-Spanning Tree Boundary Edges
ASSL-OPF - Active Semi-Supervised Learning
RDS - Root Distance-Based Sampling
Validation in a real environment
by an experienced expert in parasitology using a realistic scenario
Preprocessing - Reduction
[Figure: learning set → clustering → reduction → organized set (root and boundary samples). The root set feeds the first iteration and the boundary set the remaining iterations; the expert annotates the selected samples, which train the classifier.]
Learning set ⇒ grouped by any clustering technique
Reduced set ⇒ Cluster roots and boundary samples between distinct clusters
Boundary sample ⇒ a sample for which at least one of its k nearest neighbors (k-NN) has a different cluster label
Boundary samples may make it possible to select samples from the boundary between classes
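The reduction rule above can be sketched as follows. The clustering itself is taken as given (the slides allow any clustering technique), so `cluster_of` (cluster label per sample) and `roots` (one representative index per cluster) are assumed inputs; the brute-force neighbor search and the function names are illustrative only.

```python
import math

def k_nearest(points, i, k):
    """Indices of the k nearest neighbors of points[i] (brute force)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def reduce_set(points, cluster_of, roots, k):
    """Keep cluster roots plus samples with a k-NN in a different cluster."""
    boundary = [
        i for i in range(len(points))
        if any(cluster_of[j] != cluster_of[i] for j in k_nearest(points, i, k))
    ]
    reduced = sorted(set(roots) | set(boundary))
    return reduced, boundary
```

Samples deep inside a cluster have all k neighbors in the same cluster and are dropped, so the reduced set shrinks as clusters become more compact and better separated.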
Preprocessing - Organization
1 Cluster-OPF-Rand
randomly selects samples from the reduced set
2 Decreasing Boundary Edges - DBE
organizes the reduced set based on the decreasing weight order of its
boundary edges
3 Minimum-Spanning Tree Boundary Edges - MST-BE
organizes the MST boundary edges from the reduced set in decreasing
weight order
4 Active Semi-Supervised Learning - ASSL-OPF
integrates semi-supervised learning and a priori reduction and
organization criteria
5 Root Distance-Based Sampling - RDS
pre-organizes the data and then properly balances the selection of
diverse and uncertain samples
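Two of these orderings can be illustrated with a rough sketch. The slides do not define a boundary edge or its weight, so the code assumes that a boundary edge joins a sample to one of its k nearest neighbors in a different cluster and that its weight is the Euclidean distance between the two. `random_order` and `decreasing_edge_order` are hypothetical names, loosely inspired by Cluster-OPF-Rand and DBE, not the original implementations.

```python
import math
import random

def random_order(reduced, seed=None):
    """Random ordering of the reduced set (in the spirit of Cluster-OPF-Rand)."""
    shuffled = list(reduced)
    random.Random(seed).shuffle(shuffled)
    return shuffled

def boundary_edges(points, cluster_of, k):
    """Assumed definition: edges (weight, i, j) to a k-NN in another cluster."""
    edges = set()
    n = len(points)
    for i in range(n):
        neighbors = sorted((j for j in range(n) if j != i),
                           key=lambda j: math.dist(points[i], points[j]))[:k]
        for j in neighbors:
            if cluster_of[i] != cluster_of[j]:
                a, b = min(i, j), max(i, j)  # store each edge once
                edges.add((math.dist(points[a], points[b]), a, b))
    return edges

def decreasing_edge_order(points, cluster_of, k):
    """Visit boundary edges from heaviest to lightest, listing their endpoints."""
    ordered = []
    for _, i, j in sorted(boundary_edges(points, cluster_of, k), reverse=True):
        for sample in (i, j):
            if sample not in ordered:
                ordered.append(sample)
    return ordered
```

Either function returns a list of sample indices that a learning cycle could consume from the front, a few samples per iteration.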
Diagnosis of Parasites - Scenarios
Impurities ⇒ exceedingly abundant and quite similar to some species of
parasites
Diagnosis of Parasites - Scenario without Impurities
Without impurities ⇒ the reduction strategy is very effective
Roots of the clusters ⇒ representative samples from each class
Boundary samples ⇒ informative samples from different classes
Diagnosis of Parasites - Scenario with Impurities
Major challenge ⇒ unbalanced classes and several clusters in the feature space
With impurities ⇒ the reduction strategy is considerably less effective
Roots of the clusters ⇒ representative samples from each class
Boundary samples ⇒ do not correspond to informative samples
RDS Method - Diagnosis of Parasites
[Figure: pipeline overview. A preprocessing stage reduces and organizes the (large) learning set; in the learning cycle, a selector picks samples from the organized learning set, the expert annotates them, and the annotated samples train the classifier.]
RDS Method - Scenario with Impurities
[Figure: pipeline diagram. Non-supervised preprocessing (descriptor extraction, reduction and organization of the learning set) feeds a supervised learning cycle (sample selection, expert annotation, classifier training). The organized set holds samples 1 to 10, all still unlabeled.]
Organization strategy ⇒ balances the selection of diverse and uncertain samples
Samples from each cluster ⇒ to obtain the most diverse samples
Samples closer to the roots and whose labels are distinct from those of the roots
Samples in decreasing order of distance from their roots ⇒ to obtain the most uncertain samples
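The organization and selection rules listed on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `organize` and `select`, the Euclidean distance, the `per_cluster` parameter, and the toy data are all hypothetical, and the cluster assignments and roots are taken as given from any clustering step. The sketch follows the decreasing-distance ordering stated above.

```python
import math

def organize(samples, clusters, roots):
    """For each cluster, list its non-root samples by decreasing distance to the root."""
    lists = {c: [] for c in roots}
    for i, c in enumerate(clusters):
        if i != roots[c]:
            lists[c].append(i)
    for c, members in lists.items():
        root = samples[roots[c]]
        members.sort(key=lambda i: math.dist(samples[i], root), reverse=True)
    return lists

def select(lists, roots, predicted, per_cluster=1):
    """Take, from every cluster, the first listed samples whose predicted
    label differs from the label predicted for the cluster root."""
    chosen = []
    for c, members in lists.items():
        root_label = predicted[roots[c]]
        disagreeing = [i for i in members if predicted[i] != root_label]
        chosen.extend(disagreeing[:per_cluster])
    return chosen

# Toy data (hypothetical): two clusters on a line, roots at indices 0 and 3.
samples = [(0.0,), (1.0,), (3.0,), (10.0,), (11.0,), (14.0,)]
clusters = [0, 0, 0, 1, 1, 1]
roots = {0: 0, 1: 3}
predicted = ["A", "A", "B", "B", "A", "B"]  # labels from the current classifier

lists = organize(samples, clusters, roots)
print(lists)
print(select(lists, roots, predicted))
```

Taking samples from every cluster keeps the selection diverse, while the label disagreement with the root is used here as a simple proxy for uncertainty.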
RDS Method - Scenario with Impurities
[Figure: panels (a) and (b)]
Screenshots of the user interface used by the parasitologist to verify the labels of selected objects: (a) images with labels given by the classifier; (b) images with labels corrected or confirmed by the parasitologist. Giardia duodenalis and some impurity components are difficult cases for class discrimination.
Experiments
Application to parasite imagery [1]

Dataset   Samples   Classes   Features
d1        1,944     15        260
d2        5,948     16        260
d3        141,059   16        260

Split: ≈ 80% learning set, ≈ 20% test set

[1] Proprietary data: Laboratory of Visual, Biomedical and Health Informatics, University of Campinas, http://www.liv.ic.unicamp.br
Application to several domains [2] [3] [4]: image segmentation, handwritten digits, forest cover type, faces, cowhide

Dataset         Samples   Classes   Features
Statlog [2]     2,310     7         18
Pendigits [2]   10,992    10        16
Covertype [2]   581,012   7         54
Faces [3]       1,864     54        162
Cowhide [4]     1,690     5         63

[2] Public data: UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets
[3] Public data: The Computer Vision Laboratory, University of Notre Dame, www.nd.edu/~cvrl/CVRL
[4] Proprietary data: Institute of Computing, Federal University of Mato Grosso do Sul
Results - Parasites Dataset d1 [5]
[Figure: accuracies and total of annotated samples, panels (a) and (b)]
Best results in a scenario without impurities ⇒ MST-BE
Accuracy: 98.24% ± 0.62 with 9% of the images annotated

[5] 1,944 samples, ≈ 1,455 learning samples, ≈ 489 test samples, 15 classes, 260 features
Results - Parasites Dataset d2 [6]
Accuracies ± standard deviations

Clustering   Strategy   Classifier   Accuracy   Std dev
OPF          MST-BE     OPF          89.18%     ± 1.18
OPF          MST-BE     SVM          85.96%     ± 1.72
Kmeans       MST-BE     OPF          83.19%     ± 1.51
Kmeans       MST-BE     SVM          81.40%     ± 1.83
OPF          RDS        OPF          91.58%     ± 0.90
OPF          RDS        SVM          90.27%     ± 1.79
Kmeans       RDS        OPF          87.86%     ± 1.50
Kmeans       RDS        SVM          84.90%     ± 1.53
-            Al-SVM     -            77.93%     ± 1.61
-            Rand OPF   -            74.07%     ± 2.10

Best results in a scenario with impurities ⇒ RDS
MST-BE and RDS are superior to Al-SVM and Rand OPF
RDS outperformed MST-BE

[6] 5,948 samples, ≈ 4,458 learning samples, ≈ 1,490 test samples, 16 classes, 260 features
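As a quick check of the comparison between RDS and MST-BE, the accuracy gain for each clustering and classifier pairing can be computed directly from the figures reported above. The snippet only transcribes those numbers; the dictionary layout and variable names are illustrative choices, and the standard deviations are not taken into account.

```python
# Accuracies (%) transcribed from the d2 results:
# (clustering, classifier) -> accuracy
mst_be = {
    ("OPF", "OPF"): 89.18,
    ("OPF", "SVM"): 85.96,
    ("Kmeans", "OPF"): 83.19,
    ("Kmeans", "SVM"): 81.40,
}
rds = {
    ("OPF", "OPF"): 91.58,
    ("OPF", "SVM"): 90.27,
    ("Kmeans", "OPF"): 87.86,
    ("Kmeans", "SVM"): 84.90,
}

# Gain of RDS over MST-BE in percentage points, per configuration.
gains = {cfg: round(rds[cfg] - mst_be[cfg], 2) for cfg in mst_be}

for (clustering, classifier), gain in gains.items():
    print(f"{clustering:>6} + {classifier}: +{gain:.2f} points")
```

In every pairing the RDS accuracy is higher than the MST-BE accuracy, by 2.40 to 4.67 percentage points.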
Results - Parasites Dataset d3 [7] - OPF RDS OPF Method
Practical experiment performed by the parasitologist
[Figure: panels (a) and (b)]
Remarkable result in a realistic scenario ⇒ accuracy of 88% with 6.9% of the images annotated
Low sensitivity rates from the traditional diagnosis procedure based on visual analysis ⇒ 48.3% up to 75.9%

[7] 141,059 samples, 16 classes, 260 features
Results - Parasites Dataset d3 [8] - OPF RDS OPF Method
Accuracies for each class

Species                             Accuracy
Entamoeba histolytica / E. dispar   60.16%
Giardia duodenalis                  72.83%
Entamoeba coli                      86.75%
Endolimax nana                      84.82%
Iodamoeba bütschlii                 47.50%
Blastocystis hominis                79.03%
Ascaris lumbricoides                94.40%
Enterobius vermicularis             91.43%
Ancylostomatidae                    92.24%
Strongyloides stercoralis           91.96%
Trichuris trichiura                 95.15%
Hymenolepis nana                    93.95%
Hymenolepis diminuta                95.97%
Taenia spp.                         96.48%
Schistosoma mansoni                 91.38%
Impurities                          80.36%

[8] 141,059 samples, 16 classes, 260 features
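To see which classes pull the result down, the per-class accuracies above can be summarized with a short script. This is only an illustrative calculation on the reported numbers: the 80% threshold is an arbitrary choice for the example, and the unweighted mean computed here need not match an overall accuracy, since the classes are unbalanced.

```python
# Per-class accuracies (%) transcribed from the d3 results.
per_class = {
    "Entamoeba histolytica / E. dispar": 60.16,
    "Giardia duodenalis": 72.83,
    "Entamoeba coli": 86.75,
    "Endolimax nana": 84.82,
    "Iodamoeba bütschlii": 47.50,
    "Blastocystis hominis": 79.03,
    "Ascaris lumbricoides": 94.40,
    "Enterobius vermicularis": 91.43,
    "Ancylostomatidae": 92.24,
    "Strongyloides stercoralis": 91.96,
    "Trichuris trichiura": 95.15,
    "Hymenolepis nana": 93.95,
    "Hymenolepis diminuta": 95.97,
    "Taenia spp.": 96.48,
    "Schistosoma mansoni": 91.38,
    "Impurities": 80.36,
}

# Unweighted (macro) mean over the 16 classes.
macro_mean = sum(per_class.values()) / len(per_class)

# Classes below an illustrative 80% threshold, weakest first.
weakest = sorted((acc, name) for name, acc in per_class.items() if acc < 80.0)

print(f"macro mean: {macro_mean:.2f}%")
for acc, name in weakest:
    print(f"{name}: {acc:.2f}%")
```

The four classes below the threshold are all protozoa, which agrees with the earlier remark that Giardia duodenalis is a difficult case for class discrimination.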
Conclusions
Proposal of a novel learning paradigm that
  - is computationally and iteratively efficient,
  - avoids processing the entire dataset at each iteration,
  - affords interactive response times.
Development of new active learning strategies that
  - select the most useful (most diverse and most uncertain) samples,
  - provide high classification accuracy quickly,
  - identify samples from all classes,
  - reduce the human effort to a minimum.
Evaluation of the proposed active learning strategies with
  - different clustering and classification techniques,
  - baseline learning strategies,
  - datasets from different application domains, with different sizes, numbers of classes, and feature-space dimensions.
Evaluation and validation in a real environment
  - by an expert in parasitology, using a realistic scenario;
  - the solution is effective and suitable for laboratory routine.
Promising active learning methodology
  - effective and more efficient in practice for large datasets,
  - a valuable contribution to active machine learning.
Overview of the Contributions
[Figure: overview of the active semi-supervised learning pipeline. Non-supervised preprocessing (descriptor extraction, reduction and organization of the learning set) is followed by a supervised learning cycle: in the first iteration the expert annotates the selected samples; in later iterations the classifier labels the selected samples and the expert corrects or confirms them; the cycle repeats until the result is satisfactory.]
Publications
Journals
  - Saito, P. T. M. and Suzuki, C. T. N. and Gomes, J. F. and Falcão, A. X. and de Rezende, P. J. Robust Active Learning for the Diagnosis of Parasites. Pattern Recognition (PR), 2015. pp. 1–12. Impact Factor: 3.096.
  - Saito, P. T. M. and de Rezende, P. J. and Falcão, A. X. and Suzuki, C. T. N. and Gomes, J. F. An Active Learning Paradigm Based on a Priori Data Reduction and Organization. Expert Systems with Applications (ESwA), 2014. vol. 41, no. 14, pp. 6086–6097. Impact Factor: 2.240.
  - Saito, P. T. M. and Nakamura, R. Y. M. and Amorim, W. P. and Papa, J. P. and de Rezende, P. J. and Falcão, A. X. Choosing the most effective pattern classification model under learning-time constraint. PLOS ONE, 2015. pp. 1–23. Impact Factor: 3.534.
Conferences
  - Saito, P. T. M. and Amorim, W. P. and Falcão, A. X. and de Rezende, P. J. and Suzuki, C. T. N. and Gomes, J. F. and de Carvalho, M. H. Active Semi-Supervised Learning using Optimum-Path Forest. 22nd International Conference on Pattern Recognition (ICPR), 2014. pp. 3798–3803. Qualis A1.
  - Saito, P. T. M. and de Rezende, P. J. and Falcão, A. X. and Suzuki, C. T. N. and Gomes, J. F. A data reduction and organization approach for efficient image annotation. Proceedings of the 28th Annual ACM Symposium on Applied Computing (SAC), 2013. Coimbra, Portugal. pp. 53–57. Qualis A1.
  - Saito, P. T. M. and de Rezende, P. J. and Falcão, A. X. and Suzuki, C. T. N. and Gomes, J. F. Improving Active Learning With Sharp Data Reduction. In: WSCG Communication Proceedings of 20th WSCG International Conference on Computer Graphics, Visualization and Computer Vision (WSCG), 2012. Plzen, Czech Republic. pp. 27–34. Qualis B1.
  - Vargas, J. E. and Saito, P. T. M. and Falcão, A. X. and de Rezende, P. J. and dos Santos, J. A. Superpixels-based interactive classification of very high resolution images. 27th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2014. pp. 173–179. Qualis B1.
Future Extensions
Investigation of techniques to make the RDS method more robust to possible mislabeling by the expert during active learning:
  - consider multiple experts throughout the annotation process;
  - develop a mechanism that identifies possible mislabeling based on the previous learning iterations.
Development of new ways to exploit the reduction and organization of data.
Extension towards active learning in multi-label problems, wherein each image can belong to multiple categories simultaneously.
Questions/Discussions
Thank You!
Active Learning with Interactive Response Time
and its Application to the Diagnosis of Parasites
Priscila Tiemi Maeda Saito†⋆
Advisor: Alexandre Xavier Falcão†
Co-Advisor: Pedro Jussieu de Rezende†
†Institute of Computing, University of Campinas, Brazil
⋆Department of Computing, Federal University of Technology - Paraná, Brazil
August 26-29, 2015
DBE Method - Preprocessing - Reduction and Organization
[Figure: pipeline overview with the preprocessing stage (reduction and organization of the learning set) highlighted.]
[Figure: DBE preprocessing. The learning set is clustered, reduced to root and boundary samples, and arranged into an organized set. The diagram separates a root set (first iteration) from a boundary set (remaining iterations); selected samples go to the expert, and the annotated samples train the classifier.]
Learning set ⇒ grouped by any clustering technique
Reduced set ⇒ Cluster roots and boundary edges between distinct clusters
Boundary edge ⇒ pair of samples classified into distinct clusters
Organized set ⇒ boundary edges sorted in decreasing order of weight
Selecting the largest edges first may yield sample pairs from distinct classes
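
A minimal sketch of this ordering, assuming each boundary edge is a tuple `(sample_a, sample_b, weight)`. The function name `organize_by_boundary_weight` and the example edges are hypothetical. Listing each sample only once is also an assumption, since the slides do not say how repeated samples are handled.

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (sample_a, sample_b, weight)


def organize_by_boundary_weight(boundary_edges: List[Edge]) -> List[int]:
    """Return samples in the order they appear on boundary edges, heaviest edge first."""
    ordered = sorted(boundary_edges, key=lambda edge: edge[2], reverse=True)
    seen = set()
    organized = []
    for a, b, _ in ordered:
        for sample in (a, b):
            if sample not in seen:  # list each sample only once
                seen.add(sample)
                organized.append(sample)
    return organized


# Hypothetical boundary edges.
example_edges = [(2, 6, 1.5), (3, 4, 2.0), (5, 3, 0.9)]
organized = organize_by_boundary_weight(example_edges)
print(organized)
```

Samples at the front of the list are the endpoints of the heaviest boundary edges, which would be offered to the expert first.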
MST-BE Method - Preprocessing - Reduction and Organization
[Figure: MST-BE pipeline. Diagram labels: Preprocessing, (Large) Learning Set, Reduction, Organization, Organized Learning Set, Learning Cycle, Selection and Classification, Selected Samples, Expert, Annotated Samples, Training, Classifier.]
[Figure: reduction step. Diagram labels: Non-Annotated Dataset, Descriptor Extraction, Learning Set, Clustering, Reduction, Reduced Set, Organized Set (Root and Boundary Samples), Root Set, Boundary Set, First Iteration, Remaining Iterations, Selected Samples, Expert, Annotated Samples, Training, Classifier.]
Learning set ⇒ grouped by any clustering technique
Reduced set ⇒ Cluster roots and boundary edges between distinct clusters
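
One way to obtain such boundary edges is sketched below. It assumes that the "MST" in the slide title refers to a minimum spanning tree built over the learning set, and that edge weights are Euclidean distances between feature vectors. Neither detail is stated on the slides. The function names and the toy points are hypothetical, and the cluster labels are taken as given, since the slide allows any clustering technique.

```python
import math
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, float]  # (sample_a, sample_b, weight)


def minimum_spanning_tree(points: Sequence[Sequence[float]]) -> List[Edge]:
    """Prim's algorithm on the complete Euclidean graph (O(n^2) time)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each sample
    best_from = [-1] * n         # tree endpoint of that link
    best_dist[0] = 0.0
    tree: List[Edge] = []
    for _ in range(n):
        # Pick the sample outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            tree.append((best_from[u], u, best_dist[u]))
        # Relax the links from the newly added sample.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return tree


def mst_boundary_edges(points: Sequence[Sequence[float]], cluster_of: Sequence[int]) -> List[Edge]:
    """Keep only the tree edges whose endpoints belong to distinct clusters."""
    return [(a, b, w) for a, b, w in minimum_spanning_tree(points)
            if cluster_of[a] != cluster_of[b]]


# Hypothetical toy data: five points on a line, two clusters.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
cluster_of = [0, 0, 0, 1, 1]
boundary = mst_boundary_edges(points, cluster_of)
print(boundary)
```

In this toy case the only tree edge crossing the two clusters joins samples 2 and 3, so those two samples, together with the cluster roots, would form the reduced set.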
[Figure: organization step. Diagram labels: Learning Set, Clustering, Organization, Organized Set (Root and Boundary Samples), Root Set, Boundary Set, First Iteration, Remaining Iterations, Selected Samples, Expert, Annotated Samples, Training, Classifier.]
