LibreOffice on macOS: fixing a build problem

Building LibreOffice depends on graphite, and building graphite in turn depends on pkg-config. I did not want to use Homebrew or MacPorts, preferring to download the sources and build them myself.
pkg-config built successfully, and I pointed the relevant environment variables at it.
Yet make kept failing with: checking for bogus pkg-config... configure: error: yes, from unknown origin. This *will* break the build. Please modify your PATH variable so that $PKG_CONFIG is no longer found by configure scripts.

Digging into configure.ac showed that configure aborts as soon as the pkg-config check fails.
Here pkg-config is only needed to build graphite, and graphite is a "smart font" system developed specifically to handle the complexity of the world's lesser-known writing systems.
For Chinese fonts it probably makes little difference, so I tried skipping this check by patching the code as follows:

diff --git a/configure.ac b/configure.ac
index 99ccaf54f748..dbb727422dec 100644
--- a/configure.ac
+++ b/configure.ac
@@ -7099,7 +7099,8 @@ if test $_os = Darwin; then
                 if test -z "$($PKG_CONFIG --list-all |grep -v '^libpkgconf')" ; then
                     AC_MSG_RESULT([yes, accepted since no packages available in default searchpath])
                 else
-                    AC_MSG_ERROR([yes, from unknown origin. This *will* break the build. Please modify your PATH variable so that $PKG_CONFIG is no longer found by configure scripts.])
+                    # AC_MSG_ERROR([yes, from unknown origin. This *will* break the build. Please modify your PATH variable so that $PKG_CONFIG is no longer found by configure scripts.])
+                    echo here ;
                 fi
             fi
         fi

With the error commented out, make indeed completed successfully.

Building pkg-config on macOS

Download the source

curl http://pkgconfig.freedesktop.org/releases/pkg-config-0.29.2.tar.gz -o pkg-config-0.29.2.tar.gz

Build

tar -xf pkg-config-0.29.2.tar.gz
cd pkg-config-0.29.2
./configure  --with-internal-glib
make

make then fails with the following errors:

gatomic.c:392:10: error: incompatible integer to pointer conversion passing 'gssize' (aka 'long') to parameter of type 'gpointer' (aka 'void *') [-Wint-conversion]
  392 |   return g_atomic_pointer_add ((volatile gpointer *) atomic, val);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./gatomic.h:170:46: note: expanded from macro 'g_atomic_pointer_add'
  170 |     (gssize) __sync_fetch_and_add ((atomic), (val));                         \
      |                                              ^~~~~
gatomic.c:416:10: error: incompatible integer to pointer conversion passing 'gsize' (aka 'unsigned long') to parameter of type 'gpointer' (aka 'void *') [-Wint-conversion]
  416 |   return g_atomic_pointer_and ((volatile gpointer *) atomic, val);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./gatomic.h:177:45: note: expanded from macro 'g_atomic_pointer_and'
  177 |     (gsize) __sync_fetch_and_and ((atomic), (val));                          \
      |                                             ^~~~~
gatomic.c:440:10: error: incompatible integer to pointer conversion passing 'gsize' (aka 'unsigned long') to parameter of type 'gpointer' (aka 'void *') [-Wint-conversion]
  440 |   return g_atomic_pointer_or ((volatile gpointer *) atomic, val);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./gatomic.h:184:44: note: expanded from macro 'g_atomic_pointer_or'
  184 |     (gsize) __sync_fetch_and_or ((atomic), (val));                           \
      |                                            ^~~~~
gatomic.c:464:10: error: incompatible integer to pointer conversion passing 'gsize' (aka 'unsigned long') to parameter of type 'gpointer' (aka 'void *') [-Wint-conversion]
  464 |   return g_atomic_pointer_xor ((volatile gpointer *) atomic, val);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./gatomic.h:191:45: note: expanded from macro 'g_atomic_pointer_xor'
  191 |     (gsize) __sync_fetch_and_xor ((atomic), (val));                          \
      |                                             ^~~~~
4 errors generated.

Code changes

diff --git a/glib/glib/gatomic.c b/glib/glib/gatomic.c
index eb2fe46..33fbddc 100644
--- a/glib/glib/gatomic.c
+++ b/glib/glib/gatomic.c
@@ -386,10 +386,10 @@ gboolean
  * Since: 2.30
  **/
 gssize
-(g_atomic_pointer_add) (volatile void *atomic,
+(g_atomic_pointer_add) (volatile gssize *atomic,
                         gssize         val)
 {
-  return g_atomic_pointer_add ((volatile gpointer *) atomic, val);
+  return g_atomic_pointer_add ((volatile gssize *) atomic, val);
 }
 
 /**
@@ -410,10 +410,10 @@ gssize
  * Since: 2.30
  **/
 gsize
-(g_atomic_pointer_and) (volatile void *atomic,
+(g_atomic_pointer_and) (volatile gsize *atomic,
                         gsize          val)
 {
-  return g_atomic_pointer_and ((volatile gpointer *) atomic, val);
+  return g_atomic_pointer_and ((volatile gsize *) atomic, val);
 }
 
 /**
@@ -434,10 +434,10 @@ gsize
  * Since: 2.30
  **/
 gsize
-(g_atomic_pointer_or) (volatile void *atomic,
+(g_atomic_pointer_or) (volatile gsize *atomic,
                        gsize          val)
 {
-  return g_atomic_pointer_or ((volatile gpointer *) atomic, val);
+  return g_atomic_pointer_or ((volatile gsize *) atomic, val);
 }
 
 /**
@@ -458,10 +458,10 @@ gsize
  * Since: 2.30
  **/
 gsize
-(g_atomic_pointer_xor) (volatile void *atomic,
+(g_atomic_pointer_xor) (volatile gsize *atomic,
                         gsize          val)
 {
-  return g_atomic_pointer_xor ((volatile gpointer *) atomic, val);
+  return g_atomic_pointer_xor ((volatile gsize *) atomic, val);
 }
 
 #elif defined (G_PLATFORM_WIN32)
diff --git a/glib/glib/gatomic.h b/glib/glib/gatomic.h
index e7fd1f2..21746da 100644
--- a/glib/glib/gatomic.h
+++ b/glib/glib/gatomic.h
@@ -66,16 +66,16 @@ gboolean                g_atomic_pointer_compare_and_exchange (volatile void  *a
                                                                gpointer        oldval,
                                                                gpointer        newval);
 GLIB_AVAILABLE_IN_ALL
-gssize                  g_atomic_pointer_add                  (volatile void  *atomic,
+gssize                  g_atomic_pointer_add                  (volatile gssize  *atomic,
                                                                gssize          val);
 GLIB_AVAILABLE_IN_2_30
-gsize                   g_atomic_pointer_and                  (volatile void  *atomic,
+gsize                   g_atomic_pointer_and                  (volatile gsize  *atomic,
                                                                gsize           val);
 GLIB_AVAILABLE_IN_2_30
-gsize                   g_atomic_pointer_or                   (volatile void  *atomic,
+gsize                   g_atomic_pointer_or                   (volatile gsize  *atomic,
                                                                gsize           val);
 GLIB_AVAILABLE_IN_ALL
-gsize                   g_atomic_pointer_xor                  (volatile void  *atomic,
+gsize                   g_atomic_pointer_xor                  (volatile gsize  *atomic,
                                                                gsize           val);
 
 GLIB_DEPRECATED_IN_2_30_FOR(g_atomic_add)
diff --git a/glib/glib/gdataset.c b/glib/glib/gdataset.c
index 006bdc1..1793716 100644
--- a/glib/glib/gdataset.c
+++ b/glib/glib/gdataset.c
@@ -1188,7 +1188,7 @@ g_datalist_set_flags (GData **datalist,
   g_return_if_fail (datalist != NULL);
   g_return_if_fail ((flags & ~G_DATALIST_FLAGS_MASK) == 0);
 
-  g_atomic_pointer_or (datalist, (gsize)flags);
+  g_atomic_pointer_or ((gsize *)datalist, (gsize)flags);
 }
 
 /**
@@ -1211,7 +1211,7 @@ g_datalist_unset_flags (GData **datalist,
   g_return_if_fail (datalist != NULL);
   g_return_if_fail ((flags & ~G_DATALIST_FLAGS_MASK) == 0);
 
-  g_atomic_pointer_and (datalist, ~(gsize)flags);
+  g_atomic_pointer_and ((gssize *)datalist, ~(gsize)flags);
 }
 
 /**

With these changes the build succeeds.

Exporting the torch FashionMNIST dataset as image files

import numpy as np
import struct
 
from PIL import Image
import os

path_home='./data/FashionMNIST/raw/'
data_file = path_home+'train-images-idx3-ubyte'
# the images file is 47040016 bytes in total; dropping the 16-byte header leaves 47040000 bytes of pixel data
data_file_size = 47040016
data_file_size = str(data_file_size - 16) + 'B'
 
data_buf = open(data_file, 'rb').read()
 
magic, numImages, numRows, numColumns = struct.unpack_from(
    '>IIII', data_buf, 0)
datas = struct.unpack_from(
    '>' + data_file_size, data_buf, struct.calcsize('>IIII'))
datas = np.array(datas).astype(np.uint8).reshape(
    numImages, 1, numRows, numColumns)
 
label_file = path_home+'train-labels-idx1-ubyte'
 
# the labels file is 60008 bytes in total; dropping the 8-byte header leaves 60000 label bytes
label_file_size = 60008
label_file_size = str(label_file_size - 8) + 'B'
 
label_buf = open(label_file, 'rb').read()
 
magic, numLabels = struct.unpack_from('>II', label_buf, 0)
labels = struct.unpack_from(
    '>' + label_file_size, label_buf, struct.calcsize('>II'))
labels = np.array(labels).astype(np.int64)
 
datas_root = 'mnist_train'
if not os.path.exists(datas_root):
    os.mkdir(datas_root)
 
for i in range(10):
    file_name = datas_root + os.sep + str(i)
    if not os.path.exists(file_name):
        os.mkdir(file_name)
 
for ii in range(numLabels):
    img = Image.fromarray(datas[ii, 0, 0:28, 0:28])
    label = labels[ii]
    file_name = datas_root + os.sep + str(label) + os.sep + \
        'mnist_train_' + str(ii) + '.png'
    img.save(file_name)
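
The custom-dataset demo in the next post reads an annotations CSV whose rows are "relative/path.png,label". The original posts do not show how that CSV is produced, so here is a minimal sketch of one way to build it from the folder layout written above; the output name imglist_train.csv and the directory layout are my assumptions.

import csv
import os

datas_root = 'mnist_train'      # folder produced by the export script above
out_csv = 'imglist_train.csv'   # hypothetical annotations file name

with open(out_csv, 'w', newline='') as f:
    writer = csv.writer(f)
    for label in sorted(os.listdir(datas_root)):
        label_dir = os.path.join(datas_root, label)
        if not os.path.isdir(label_dir):
            continue
        for name in sorted(os.listdir(label_dir)):
            if name.endswith('.png'):
                # first column: image path relative to the image root; second column: the class label (the digit folder name)
                writer.writerow([os.path.join(label, name), label])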

torch custom-dataset training demo

import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from torch import nn
import os
import pandas as pd
from torchvision.io import decode_image


device = torch.accelerator.current_accelerator().type if torch.accelerator.is_available() else "cpu"
#  Reverse the one-hot encoding (one-hot encoding is not strictly required in this example)
def arc_one_hot(x,list=torch.tensor([0,1,2,3,4,5,6,7,8,9],dtype=torch.float).to(device)):
    return x@list
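# Illustration (my own example, not from the original post): the dot product with [0..9]
# maps a one-hot vector back to its class index, e.g.
#   arc_one_hot(torch.tensor([0,0,0,1,0,0,0,0,0,0], dtype=torch.float).to(device))  # -> tensor(3.)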
#  1. Create a custom dataset
class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        self.img_labels = pd.read_csv(annotations_file, header=None) # by default pandas treats the first row as a header and skips it, so pass header=None when the CSV has no header row
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.img_labels)

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        #print(img_path)
        image = decode_image(img_path).float().div(255) # must be converted to float, otherwise training fails
        #print(image.shape)
        label = self.img_labels.iloc[idx, 1]
        #print(label)
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        # one-hot encode the label
        # print(label)
        new_transform = Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))
        label = new_transform(label)
        # print("------fuck")
        # print(label)
        return image, label
    

# check whether the CSV has a header row
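# Expected CSV layout (my assumption based on how __getitem__ joins the path; not shipped with the post):
#   0/mnist_train_1.png,0
#   5/mnist_train_42.png,5
# i.e. the first column is a path relative to img_dir, the second column is the integer class label.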
csv_path='/Users/mnist_test_cus_data/imglist_train.csv'
img_dir='/Users/mnist_test_cus_data/imgs_train/'
batch_size = 64
# create an instance of the custom dataset
mydataset = CustomImageDataset(annotations_file=csv_path, img_dir=img_dir, transform=None, target_transform=None)

# load the data with a DataLoader
mydataloader = DataLoader(mydataset, batch_size, shuffle=True, num_workers=0) # num_workers=4 raised an error on macOS
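# Note (my reading of the macOS error above, not verified in the original post): macOS defaults to the
# "spawn" multiprocessing start method, so num_workers > 0 generally requires iterating the DataLoader
# under an if __name__ == "__main__": guard with a picklable dataset; otherwise worker start-up fails.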
print(len(mydataloader))
print(len(mydataloader.dataset))
# print(mydataset[59999])
# print(mydataset[0][0])
# print(mydataset[0][1])
# exit()

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
    target_transform = Lambda(lambda y: torch.zeros(10,dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))
)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

# iterate over the DataLoader
#for batch in mydataloader:
#    images, labels = batch
    #print(images.size(), labels.size())
    #print(images)
    #print(labels)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")

    # print(X)
    # print(y)
    # print(arc_one_hot(y.to(device)))

    break
print(len(mydataloader))
# exit()


#  2. Visualize the data
def showdata():
    labels_map = {
        0: "T-Shirt",
        1: "Trouser",
        2: "Pullover",
        3: "Dress",
        4: "Coat",
        5: "Sandal",
        6: "Shirt",
        7: "Sneaker",
        8: "Bag",
        9: "Ankle Boot",
    }
    figure = plt.figure(figsize=(8, 8))
    cols, rows = 3, 3
    xxx=''
    for i in range(1, cols * rows + 1):
        sample_idx = torch.randint(len(mydataset), size=(1,)).item()
        img, label = mydataset[sample_idx]
        figure.add_subplot(rows, cols, i)
        # reverse the one-hot encoding
        label=arc_one_hot(label.to(device)).item()
        plt.title(labels_map[label])
        plt.axis("off")
        xxx=img
        plt.imshow(img.squeeze(), cmap="gray")
    plt.show()
    print(xxx.shape)
    print('------')
    print(xxx.squeeze().shape)

    # Display image and label.
    train_features, train_labels = next(iter(mydataloader))
    print(f"Feature batch shape: {train_features.size()}")
    print(f"Labels batch shape: {train_labels.size()}")
    img = train_features[0].squeeze()
    label = train_labels[0]
    plt.imshow(img, cmap="gray")
    plt.show()
    print(f"Label: {label}")
    # exit()


# 3. Define the model

print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten() # flatten each image into a vector
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

# 4. Define the loss function and optimizer
loss_fn = nn.CrossEntropyLoss() # cross-entropy
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# 5. Training
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    print("size="+str(size))
    model.train() # enable Batch Normalization and Dropout (dropout randomly drops neurons to curb overfitting; it is disabled at test time)
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        # Compute prediction error
        pred = model(X)
        # print("---------->pred=")
        # print(pred)
        # print(y)
        # print("-----------------------<")
        # no need to reverse the one-hot encoding here: CrossEntropyLoss also accepts class-probability (one-hot) targets
        # y=arc_one_hot(y)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward() # compute gradients
        optimizer.step() # update the parameters using the gradients
        optimizer.zero_grad() # reset the gradients to zero

        if batch % 100 == 0: # print every 100 batches
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
        # exit()
# 6. Testing
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            #print(y)
            # no need to reverse the one-hot encoding here
            #y=arc_one_hot(y)
            #print(y)
            pred = model(X)
            #print(pred)
            test_loss += loss_fn(pred, y).item()
            # count correct predictions
            # >>> xx==zz
            # tensor([ True, False, False,  True, False,  True, False,  True, False, False,
            #          True, False, False,  True, False, False,  True, False,  True, False,
            #          True, False, False, False, False,  True,  True, False,  True, False,
            #         False,  True, False, False, False, False, False, False,  True,  True,
            #         False, False, False,  True, False, False,  True, False, False,  True,
            #         False,  True, False,  True,  True, False, False, False, False,  True,
            #         False,  True,  True,  True])
            # >>> (xx==zz).type(torch.float)
            # tensor([1., 0., 0., 1., 0., 1., 0., 1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.,
            #         1., 0., 1., 0., 0., 0., 0., 1., 1., 0., 1., 0., 0., 1., 0., 0., 0., 0.,
            #         0., 0., 1., 1., 0., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0., 1., 0., 1.,
            #         1., 0., 0., 0., 0., 1., 0., 1., 1., 1.])
            # >>> (xx==zz).type(torch.float).sum()
            # tensor(25.)
            # >>> (xx==zz).type(torch.float).sum().item()
            # 25.0
            yy=arc_one_hot(y)
            correct += (pred.argmax(1) == yy).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

#  7. Train and test
def do_train():
    epochs = 5
    for t in range(epochs):
        print(f"Epoch {t+1}\n-------------------------------")
        train(mydataloader, model, loss_fn, optimizer)
        test(test_dataloader, model, loss_fn)
    print("Done!")
    __save_model__()

# for var_name in model.state_dict():
#     print(var_name, "\t", model.state_dict()[var_name])

# for var_name in optimizer.state_dict():
#     print(var_name, "\t", optimizer.state_dict()[var_name])

#  8. Save the model
def __save_model__():
    path="model.pth"
    torch.save(model.state_dict(), path)
    print("Saved PyTorch Model State to "+path)
#  9. Load the model
def load_model():
    model = NeuralNetwork().to(device)
    model.load_state_dict(torch.load("model.pth", weights_only=True))
    return model
#  10. Test the model
def test_model():
    model = load_model()
    classes = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]

    model.eval()
    x, y = test_data[0][0], test_data[0][1]
    with torch.no_grad():
        x = x.to(device)
        pred = model(x)
        # decode the one-hot label; both tensors must be on the same device to compute
        y=arc_one_hot(y.to(device)).int()
        print(y)

        print(f"Predicted: {pred[0].argmax(0)}, Actual: {y}")
        predicted, actual = classes[pred[0].argmax(0)], classes[y]
        print(f'Predicted: "{predicted}", Actual: "{actual}"')

def do_test_model():
    load_model()
    test_model()
def do_train_model():
    showdata()
    do_train()
    test_model()
def main():
    do_train_model()
    #do_test_model()
    

if __name__ == '__main__':
    main()

bash zsh awk: organizing files according to a list

Background

We have a pile of files in directory A and need to copy the ones named in the file list list.txt into directory B; both A and list.txt contain a large number of entries.

1. The file list

list.txt looks something like this:

mnist_test_644.png
mnist_test_2180.png
mnist_test_122.png
mnist_test_2816.png

2. bash

b=`sed 'r/g' list.txt`
for i in $b;
do
cp -r "A/"$i "B/";
done

3. zsh

for i (${(s: :)$(<list.txt)});
do
cp -r "A/"$i "B/";
done

Alternatively (not recommended: if lines in the txt contain spaces this causes errors; the version above copes better):

fl=$(<list.txt)
for i (${(f)fl});
do
cp -r "A/"$i "B/";
done

4. awk: assemble the commands and pipe them to sh

awk '{print "cp -r A/"$1" B/;" }' list.txt | sh

bash zsh awk: string splitting

Contents of fl.txt:
12
34
56
78


Bash

1. IFS defines the field separators; the default is space, tab, newline, and carriage return

bash-3.2$ a="a b c d"
bash-3.2$ for i in $a;
> do
> echo $i","
> done
a,
b,
c,
d,

bash-3.2$ b=`sed 'r/g' fl.txt`
bash-3.2$ for i in $b; do echo $i","; done
12,
34,
56,
78,

a="a,b,c,d"
# split on newlines only
IFS=$'\n'

bash-3.2$ a="a,b,c,d"
bash-3.2$ for i in $a;
> do 
> echo $i;
> done
a,b,c,d

Set the separator to a comma:
bash-3.2$ IFS=$','
bash-3.2$ for i in $a; do  echo $i; done
a
b
c
d

2. Build an array using the separator
bash-3.2$ aa="hello,shell,split,test"
bash-3.2$ array=(${aa//,/})
bash-3.2$ for i in ${array[@]}
> do
> echo $i
> done
helloshellsplittest

bash-3.2$ array=(${aa/\n/,/})
bash-3.2$ for i in ${array[@]}; do echo $i; done
hello
shell
split
test

bash-3.2$ echo ${array[0]}
hello
bash-3.2$ echo ${array[1]}
shell


Zsh
Zsh does not split on spaces, tabs, newlines, or carriage returns by default.

1. (f) splits by line

str=$(<fl.txt)

% echo $str
12 
34 
56 
78

for i (${(f)str}){
echo $i"#"
}
12#
34# 
56# 
78#

Note that writing it inline like this does not work:
for i (${(f)$(<fl.txt)});
do 
echo $i",";
done
12 34 56 78,

Echoing the file contents directly and echoing them via a variable behave differently:
echo $(<fl.txt)
12 34 56 78
aa=$(<fl.txt)
echo $aa
12
34
56
78


You need to use the (s:chr:) form instead:
for i (${(s: :)$(<fl.txt)});
do 
echo $i",";
done


Or read the file with sed:
aa=`sed 'r/g' fl.txt`;
for i (${(f)aa});
do 
echo $i",";
done



2. (s:chr:)

s='foo,bar,baz'
# plain s also works (p, w and @ turn up as well); the ':' delimiter can be replaced with another character
for i  in ${(ps:,:)s} ; do   
echo "$i END"
done
foo END
bar END
baz END


awk

bash-3.2$ aa=`awk '{print $1}' fl.txt`
bash-3.2$ for i in $aa
> do
> echo $i
> done
12
34
56
78

Converting ipynb to Markdown

In VS Code you cannot copy content out of a Jupyter Notebook .ipynb file, which makes sharing inconvenient, and the built-in export feature is close to useless: it requires installing XeTeX, and the official site says a full install takes more than 7 GB ("By default, everything is installed (7+GB)"), which put me off straight away.

Since an ipynb is essentially written in Markdown anyway, wouldn't it be better to convert it to Markdown directly?

Many posts online use the following command, but I hit the error Jupyter command `jupyter-nbconvert` not found. It may have been a problem with my environment variables.

jupyter nbconvert --to markdown 'abc.ipynb'

I recommend the command below instead; it does not depend on how the environment is set up:

python3 -m nbconvert --to markdown 'abc.ipynb'
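
If you would rather stay inside Python entirely, nbconvert also exposes an exporter API. The snippet below is a minimal sketch of that route; the input name abc.ipynb and the output name abc.md are just placeholders.

from nbconvert import MarkdownExporter

# convert the notebook to a Markdown string plus extracted resources (e.g. images)
body, resources = MarkdownExporter().from_filename('abc.ipynb')

with open('abc.md', 'w', encoding='utf-8') as f:
    f.write(body)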

torch Tensors

Tensors are a specialized data structure very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model's parameters.

Tensors are similar to NumPy ndarrays, except that tensors can run on GPUs or other hardware accelerators. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data (see Bridge with NumPy). Tensors are also optimized for automatic differentiation (more on that later in the Autograd section). If you're familiar with ndarrays, you'll be right at home with the Tensor API. If not, read on!

import torch
import numpy as np

Initializing a Tensor

Tensors can be initialized in various ways. Take a look at the following examples.

Directly from data

Tensors can be created directly from data. The data type is automatically inferred.

data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)

From a NumPy array

Tensors can be created from NumPy arrays (and vice versa – see Bridge with NumPy).

np_array = np.array(data)
x_np = torch.from_numpy(np_array)
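
To make the shared-memory point concrete, here is a small check of my own (not part of the quoted tutorial): an in-place change to the NumPy array is visible through the tensor created with torch.from_numpy, because both refer to the same buffer.

np_array[0, 0] = 99
print(x_np)  # the tensor now shows 99 at position [0, 0], since it shares memory with np_array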


torch Quickstart

Quickstart

This section runs through the API for common tasks in machine learning. Refer to the links in each section to dive deeper.

Working with data

PyTorch has two primitives to work with data: torch.utils.data.DataLoader and torch.utils.data.Dataset. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset.

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio, all of which include datasets. For this tutorial, we will be using a TorchVision dataset.

The torchvision.datasets module contains Dataset objects for many real-world vision data like CIFAR, COCO (full list here). In this tutorial, we use the FashionMNIST dataset. Every TorchVision Dataset includes two arguments: transform and target_transform to modify the samples and labels respectively.

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)
print(len(training_data))
60000
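
As a next step (my own sketch based on the DataLoader description above, not a verbatim quote from the tutorial), the two datasets can be wrapped in DataLoaders and iterated in batches:

batch_size = 64

train_dataloader = DataLoader(training_data, batch_size=batch_size, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

# each iteration yields one batch of images and labels
for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")  # torch.Size([64, 1, 28, 28])
    print(f"Shape of y: {y.shape} {y.dtype}")     # torch.Size([64]) torch.int64
    break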