Figure 10
From: Learning without loss
Training error vs. work for the depth 3 majority-gate circuit data, compared for RRR with 50, 100 and 200 iterations per batch. Results are averages of 10 runs with random initial weights.