<!DOCTYPE html>
<html lang = "en">
<head>
<link rel="stylesheet" href="style.css">
<title>fbGAN | Experiments</title>
</head>
<body class = container>
<section id = 'navbar'>
<div class="topnav">
<a href="./index.html"> Home</a>
<a href="./approach.html"> Approach </a>
<a href="./dataset.html"> Dataset </a>
<a href="./training.html"> Training </a>
<a class="current" href="./experiments.html"> Experiments </a>
<a href="./connect.html">Connect</a>
</div>
</section>
<section id ="exper">
<h1 class = 'training-intro-header' > Experiments</h1>
<p>Select an experiment to explore.</p>
<ul>
<!-- FIRST SECTION -->
<li>
<button type="button" class="collapsible">Optimizing for single features</button>
<div class="content">
<ul>
<li>
<button type="button" class="collapsible">Common single features ( H / E) </button>
<div class="content">
First, we optimized the sequences for the three most common features: C, H, and E.
Feature C, however, is already present in almost every sequence, with an average score
of 100% throughout the run, so only features E and H are shown. During each step (epoch),
2-5 logs were collected, and the average of their scores is shown.
<br>
Despite the fluctuation in the average score per step, we can observe a general upward trend,
which indicates an increasing probability of the feature being present in the sequence.
<iframe id="igraph" scrolling="no" class ='exper' style="border:none;" seamless="seamless"
src="./images/experiments/E_experiment.html" height="800" width="100%"></iframe>
The probability of the feature H being present gradually increases from 63% to 75-85% at
steps 6-9.
<iframe id="igraph" scrolling="no" class ='exper' style="border:none;" seamless="seamless"
src="./images/experiments/H_experiment.html" height="800" width="100%"></iframe>
</div>
</li>
<li>
<button type="button" class="collapsible">Rare single feature (I) </button>
<div class="content">
Feature I (pi-helix) is one of the rarest features and is found in only around 15% of known proteins.
Optimizing for this feature proved to be the hardest, with an average score of 0% during a number of epochs.
In our previous experiments, we used a score of 0.8 or higher as the threshold for adding a sequence to the dataset.
However, because the feature is so rare, we lowered the threshold to 0.1 for this experiment. In other words, if there was even a 10% chance of the sequence containing a pi-helix, we added it to the dataset. <br>
After some time, the generator starts producing sequences that exceed a 20% chance of belonging to the pi-helix class.
At that point, the dataset approaches a million sequences, which is problematic to keep in memory.
Future optimizations to our code might include getting rid of old sequences while continually adding new and better ones.
Alternatively, RAM problems could be avoided with a flexible threshold that rises as the generator outputs more successful sequences.
<iframe id="igraph" scrolling="no" class ='exper' style="border:none;" seamless="seamless" src="./images/experiments/I_experiment.html" height="800" width="100%"></iframe>
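<p>As a rough illustration of the two ideas above (dropping weaker sequences and raising the threshold over time), here is a minimal JavaScript sketch. It is not taken from the fbGAN code: the names <code>pruneDataset</code>, <code>updateThreshold</code>, and <code>feedbackStep</code> are hypothetical, the <code>{ seq, score }</code> record shape is assumed, and using the batch mean as the new threshold is a choice made only for this example.</p>

```javascript
// Hypothetical sketch: none of these names come from the fbGAN code base.

// Keep only the `capacity` best-scoring sequences so memory use stays bounded.
function pruneDataset(dataset, capacity) {
  return [...dataset]
    .sort((a, b) => b.score - a.score)
    .slice(0, capacity);
}

// Move the threshold up to the mean score of the latest batch (an assumption),
// never lowering it and never exceeding `maxThreshold`.
function updateThreshold(threshold, batchScores, maxThreshold = 0.8) {
  if (batchScores.length === 0) return threshold;
  const mean = batchScores.reduce((sum, s) => sum + s, 0) / batchScores.length;
  return Math.min(maxThreshold, Math.max(threshold, mean));
}

// One feedback step: admit sequences at or above the current threshold,
// prune the dataset to `capacity`, then adapt the threshold for the next step.
function feedbackStep(state, batch, capacity) {
  const accepted = batch.filter((item) => item.score >= state.threshold);
  const dataset = pruneDataset([...state.dataset, ...accepted], capacity);
  const threshold = updateThreshold(state.threshold, batch.map((item) => item.score));
  return { dataset, threshold };
}
```

<p>Starting from a threshold of 0.1, the threshold only rises once the generator's average score exceeds it, and the dataset never grows beyond the chosen capacity.</p>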
</div>
</li>
</ul>
</div>
</li>
<!-- SECOND SECTION -->
<li>
<button type="button" class="collapsible">Optimizing for combination of features</button>
<div class="content">
<ul>
<li>
<button type="button" class="collapsible">Combination of common features (C,H,E) </button>
<div class="content">
To see whether the network is able to learn several features at once, we performed
simultaneous enrichment for the three most common features. The initial score for feature C was already 100%,
so no improvement can be observed there.
While collecting more logs per epoch would smooth out the fluctuation, we can observe
an increase of 30-40% for feature E and of up to 20% for feature H.
<iframe id="igraph" scrolling="no" class ='exper' style="border:none;" seamless="seamless" src="./images/experiments/CHE_experiment.html" height="800" width="100%"></iframe>
</div>
</li>
<li>
<button type="button" class="collapsible">Combination of common and rare features (T,E,G) </button>
<div class="content">
Next, we optimized for a combination of rare and common features.
Specifically, two rare features (T,G) were enriched simultaneously with a more common feature (E).
<iframe id="igraph" scrolling="no" class ='exper' style="border:none;" seamless="seamless" src="./images/experiments/ETG_experiment.html" height="800" width="100%"></iframe>
</div>
</li>
</ul>
</div>
</li>
<!-- THIRD SECTION -->
<li>
<button type="button" class="collapsible">Mode collapse</button>
<div class="content">
GAN models are known to be prone to mode collapse, where the generator simply repeats
one output over and over. In the future, we aim to add a similarity penalty to the sequences
(Linder et al. 2020) that would force the generator to create different sequences.
In the meantime, we used a simple edit distance metric, which measures sequence similarity, to
monitor how different the generated sequences are before and after optimization. The following graph
shows this for the optimization of feature H (before and after optimization).
<iframe id="igraph" class ='exper' scrolling="no" style="border:none;" seamless="seamless"
src="./images/experiments/edit_H3.html" height="500" width="100%"></iframe>
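<p>For readers who want to reproduce this kind of diversity check, the following is a minimal JavaScript sketch. It assumes the metric is the Levenshtein edit distance averaged over all pairs of generated sequences; the function names <code>editDistance</code> and <code>meanPairwiseDistance</code> are hypothetical, and this is not the code used to produce the graph.</p>

```javascript
// Hypothetical sketch: assumes Levenshtein distance as the "edit distance" metric.

// Levenshtein edit distance between two strings, using two rolling rows.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Mean edit distance over all unordered pairs of sequences.
// Values close to 0 suggest the generator keeps repeating the same output.
function meanPairwiseDistance(sequences) {
  let total = 0;
  let pairs = 0;
  for (let i = 0; i < sequences.length; i++) {
    for (let j = i + 1; j < sequences.length; j++) {
      total += editDistance(sequences[i], sequences[j]);
      pairs++;
    }
  }
  return pairs === 0 ? 0 : total / pairs;
}
```

<p>Comparing this value for a sample of sequences before and after optimization gives a single number per sample; the pairwise loop is quadratic in the sample size, so it is best applied to a modest random sample rather than the whole dataset.</p>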
</div>
</li>
</ul>
</section>
<script>
// Collect every button that opens or closes a collapsible panel.
var coll = document.getElementsByClassName("collapsible");
var i;
for (i = 0; i < coll.length; i++) {
coll[i].addEventListener("click", function() {
// Mark the button as active and show or hide the panel that follows it.
this.classList.toggle("active");
var content = this.nextElementSibling;
if (content.style.display === "block") {
content.style.display = "none";
} else {
content.style.display = "block";
}
});
}
</script>
</body>
</html>