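My understanding of shuffle=True in the PyTorch DataLoader
Understanding shuffle=True:
I used to be unsure what shuffle actually does. Suppose the data is a, b, c, d and batch_size=2. With shuffling enabled, which of the following happens?
1. Batches are taken in order and then shuffled internally, i.e. a, b is taken first and only a, b are shuffled;
2. The whole dataset is shuffled first, and batches are taken afterwards.
It turns out to be the second.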
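To make the difference between the two candidates concrete, here is a minimal pure-Python sketch. The function names `shuffle_within_batches` and `shuffle_then_batch` are made up for illustration and are not part of PyTorch. It collects which pairs of items can end up in the same batch under each strategy.

```python
import random


def shuffle_within_batches(items, batch_size, rng):
    """Candidate 1: cut into batches in order, then shuffle inside each batch."""
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    for batch in batches:
        rng.shuffle(batch)  # each slice is a new list, so `items` is untouched
    return batches


def shuffle_then_batch(items, batch_size, rng):
    """Candidate 2: shuffle the whole sequence, then cut into batches."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


data = ["a", "b", "c", "d"]
rng = random.Random(0)

# Record every distinct batch membership seen over many simulated epochs.
h1_pairs = {frozenset(b) for _ in range(200) for b in shuffle_within_batches(data, 2, rng)}
h2_pairs = {frozenset(b) for _ in range(200) for b in shuffle_then_batch(data, 2, rng)}

print("candidate 1 batch memberships:", sorted(sorted(p) for p in h1_pairs))
print("candidate 2 batch memberships:", sorted(sorted(p) for p in h2_pairs))
```

Under candidate 1, a can only ever share a batch with b. Under candidate 2, a can be paired with any other item, which is the behaviour observed with shuffle=True.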
The DataLoader docstring and source code show this:
shuffle (bool, optional): set to ``True`` to have the data reshuffled at every epoch (default: ``False``).
if shuffle:
    sampler = RandomSampler(dataset)  # what we get here are indices, not the data itself
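The sampler can also be driven by hand to see the indices it produces. The sketch below assumes a reasonably recent PyTorch in which `RandomSampler` accepts a `generator` argument. The four-element list and the seed are placeholders chosen for illustration. `BatchSampler` is used here to group the shuffled indices, which mirrors what the DataLoader does when `batch_size` is given.

```python
import torch
from torch.utils.data import BatchSampler, RandomSampler

# Placeholder data source: the sampler only needs its length.
dataset = ["a", "b", "c", "d"]

# A seeded generator makes the permutation reproducible (the seed value is arbitrary).
generator = torch.Generator().manual_seed(0)

# Step 1: RandomSampler yields a random permutation of the indices 0..len(dataset)-1.
sampler = RandomSampler(dataset, generator=generator)

# Step 2: BatchSampler groups those already-shuffled indices into batches.
batch_sampler = BatchSampler(sampler, batch_size=2, drop_last=False)

index_batches = list(batch_sampler)
print(index_batches)                                   # two lists of two indices each
print([[dataset[i] for i in batch] for batch in index_batches])
```

Every index appears exactly once across the batches, and the grouping follows the shuffled order rather than the original one.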
Addendum: a simple test of how shuffle=True works in the PyTorch DataLoader
Here is the code:
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class DealDataset(Dataset):
    """Loads iris.csv: every column except the last is a feature, the last is the label."""

    def __init__(self):
        # np.loadtxt with a float dtype requires a purely numeric, comma-separated file.
        xy = np.loadtxt('./iris.csv', delimiter=',', dtype=np.float32)
        # Alternative with pandas (requires `import pandas as pd`):
        # data = pd.read_csv("iris.csv", header=None)
        # xy = data.values
        self.x_data = torch.from_numpy(xy[:, 0:-1])
        self.y_data = torch.from_numpy(xy[:, [-1]])
        self.len = xy.shape[0]

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len


dealDataset = DealDataset()
train_loader2 = DataLoader(dataset=dealDataset, batch_size=2, shuffle=True)
# print(dealDataset.x_data)  # uncomment to see the rows in their original order
for i, data in enumerate(train_loader2):
    inputs, labels = data
    # Each batch holds two rows drawn from the shuffled order.
    print(inputs)
    # print("batch", i, "inputs", inputs.size(), "labels", labels.size())
[Figure: the simple dataset]

Result after shuffling: on every pass the whole dataset is randomly shuffled first and then split into mini-batches of size n.
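The same conclusion can be checked without iris.csv. The following sketch swaps in a stand-in dataset of the numbers 0 to 7 (an assumption made so the example is self-contained) and records the order in which they are served over three epochs. The seeded `generator` passed to the DataLoader is only there for reproducibility.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the iris rows: eight one-feature samples with values 0..7.
features = torch.arange(8, dtype=torch.float32).unsqueeze(1)
dataset = TensorDataset(features)

loader = DataLoader(
    dataset,
    batch_size=2,
    shuffle=True,
    generator=torch.Generator().manual_seed(0),  # arbitrary seed
)

epochs = []
for epoch in range(3):
    # Flatten the batches of this epoch back into one list of sample values.
    order = [int(v) for (batch,) in loader for v in batch.flatten()]
    epochs.append(order)
    print(f"epoch {epoch}: {order}")
```

Each epoch visits every sample exactly once, and reading the printed list two values at a time gives the mini-batches of that epoch.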
