Fixing the multiprocessing error with PyTorch's trainloader
In PyTorch, I tried to load the training dataset with multiple worker processes:

trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=3)
which raised the following error:

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
The message says that a new process was started before the current process finished its bootstrapping phase, which probably means the child processes were not started with fork and the proper idiom is missing from the main module.
After some digging, the cause turned out to be this: on Windows, Python's multiprocessing uses the spawn start method by default. A spawned child re-imports the main module, so if the module-level code is not protected by an `if __name__ == '__main__'` guard, every child runs the same top-level code as its parent, tries to spawn children of its own, and so on until the program crashes.
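This re-import behaviour is not specific to PyTorch; a minimal sketch with only the standard multiprocessing module (no torch, names like `double` are illustrative) shows the same guard at work:

```python
import multiprocessing as mp

def double(x):
    # Work done in the child process. It is defined at module top level,
    # outside the guard, so a spawned child can import it.
    return x * 2

if __name__ == '__main__':
    # On Windows the default start method is 'spawn': each worker
    # re-imports this module, and only the guard below stops the
    # re-imported copy from spawning workers of its own. force=True
    # just makes the demo use 'spawn' even where 'fork' is the default.
    mp.set_start_method('spawn', force=True)
    with mp.Pool(2) as pool:
        print(pool.map(double, [1, 2, 3]))  # [2, 4, 6]
```

Remove the guard and run this on Windows (or with 'spawn' forced anywhere) and you get exactly the RuntimeError above.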
The fix is simple:
move the code that starts the worker processes under the `if __name__ == '__main__'` guard.
import torch
import torchvision
import torchvision.transforms as transforms

if __name__ == '__main__':
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                              shuffle=True, num_workers=3)
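To check that a guarded layout really survives the spawn start method, you can run a small guarded script in a subprocess; the script below is a hypothetical stdlib-only stand-in for the CIFAR-10 code (`add_one` is illustrative), assuming a working Python interpreter at `sys.executable`:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# A guarded script: under 'spawn', each worker re-imports the file,
# but the guard keeps the Pool creation out of the children.
script = textwrap.dedent("""
    import multiprocessing as mp

    def add_one(x):
        return x + 1

    if __name__ == '__main__':
        mp.set_start_method('spawn', force=True)
        with mp.Pool(2) as pool:
            print(pool.map(add_one, [1, 2]))
""")

with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as fh:
    fh.write(script)
    path = fh.name

try:
    out = subprocess.run([sys.executable, path],
                         capture_output=True, text=True, timeout=60)
finally:
    os.unlink(path)

print(out.stdout.strip())  # [2, 3]
```

The script finishes cleanly instead of recursing, because the re-imported copy in each worker never reaches the `Pool` call.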
Supplement: multiprocessing errors with pytorch's Dataloader
When DataLoader is used to load data with multiple worker processes during training, the same multiprocessing problem causes an error:

dataloader = DataLoader(transformed_dataset, batch_size=4, shuffle=True, num_workers=4)

Here the num_workers parameter is the number of worker processes used to load the data. Any value other than 0 enables multiprocessing and triggers the error below (num_workers=0 loads everything in the main process, which also sidesteps it):
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable.
Adding if __name__ == '__main__': before the code that consumes the data solves the problem:

if __name__ == '__main__':  # this guard resolves the multiprocessing issue
    for i_batch, sample_batched in enumerate(dataloader):
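A common convention (not required by PyTorch, just a tidier arrangement of the same fix) is to move the whole entry point into a main() function and call it under the guard; a stdlib-only sketch, with `square` and `main` as illustrative names:

```python
import multiprocessing as mp

def square(x):
    return x * x

def main():
    # Keeping the whole data-loading/training entry point inside main()
    # makes it impossible for a spawned worker, which re-imports this
    # module, to reach the Pool creation on import.
    with mp.Pool(2) as pool:
        results = pool.map(square, range(4))
    print(results)  # [0, 1, 4, 9]

if __name__ == '__main__':
    main()
```

With this layout, module import has no side effects, so the script is safe under both the fork and spawn start methods.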
The above is my personal experience; I hope it gives everyone a useful reference, and I hope you will keep supporting WalkonNet.