Whenever you use nn.Parameter(torch.Tensor(...)), the underlying tensor is allocated uninitialized, so it can contain nan values and cause training to fail. If you skip nn.init.xavier_uniform_ (or another explicit initializer), the safer way to create the parameter is nn.Parameter(torch.rand(...)) or nn.Parameter(torch.randn(...)). For example, in GAT.py.
See also: https://discuss.pytorch.org/t/nn-parameter-contains-nan-when-initializing/44559
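A minimal sketch of the difference (the shapes here are illustrative, not the ones used in GAT.py):

```python
import torch
import torch.nn as nn

# torch.Tensor(3, 4) allocates *uninitialized* memory: the values are whatever
# happens to be in that memory, possibly nan or huge numbers.
w_uninit = nn.Parameter(torch.Tensor(3, 4))

# Option 1: draw the initial values from a distribution directly.
w_rand = nn.Parameter(torch.rand(3, 4))    # uniform on [0, 1)
w_randn = nn.Parameter(torch.randn(3, 4))  # standard normal

# Option 2: keep the empty allocation but initialize it explicitly,
# e.g. with Xavier/Glorot uniform initialization.
w_xavier = nn.Parameter(torch.empty(3, 4))
nn.init.xavier_uniform_(w_xavier)

# A simple guard when debugging nan losses: check the parameters themselves.
for name, p in [("w_rand", w_rand), ("w_randn", w_randn), ("w_xavier", w_xavier)]:
    assert not torch.isnan(p).any(), f"{name} contains nan"
```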