
38:04
yes

42:45
do you normally mix different activations a lot?

45:03
thank you

01:05:04
ok!

01:27:45
what is epsilon? just a small value?

01:29:01
ok :-)

01:45:17
what does "applied element-wise" mean? do we take each r_i and add \delta?

01:48:20
yes

02:01:27
do you decrease the learning rate also for algos like Adam?

02:05:13
yes!