- Sep 2023
The xv6 shell uses the above calls to run programs on behalf of users. The main structure of the shell is simple; see main (user/sh.c:145). The main loop reads a line of input from the user with getcmd. Then it calls fork, which creates a copy of the shell process. The parent calls wait, while the child runs the command. For example, if the user had typed “echo hello” to the shell, runcmd would have been called with “echo hello” as the argument. runcmd (user/sh.c:58) runs the actual command. For “echo hello”, it would call exec (user/sh.c:78). If exec succeeds then the child will execute instructions from echo instead of runcmd. At some point echo will call exit, which will cause the parent to return from wait in main (user/sh.c:145).
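In the xv6 shell, when the user enters a command, the shell creates a child process to execute it, while the parent process waits for the child to finish. The flow follows the passage above: read a line, fork, run the command in the child, wait in the parent.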
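The fork / exec / wait flow described above can be sketched in Python on a POSIX system. This is only an illustration of the pattern, not the xv6 code: `run_command` is a hypothetical helper, and the exit code 127 for a failed exec is an assumption borrowed from common shell convention.

```python
import os
import sys


def run_command(argv):
    """Run argv in a child process and return its exit status."""
    pid = os.fork()                      # duplicate the current process
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)     # does not return if it succeeds
        except OSError:
            os._exit(127)                # exec failed (assumed exit code)
    # Parent: block until the child exits, like the shell calling wait.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1                            # child was terminated by a signal


if __name__ == "__main__":
    code = run_command([sys.executable, "-c", "print('hello')"])
    print("child exited with", code)
```

As in the shell, the child never returns to the caller's code once exec succeeds; only the parent continues after the wait.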
The file must have a particular format, which specifies which part of the file holds instructions, which part is data, at which instruction to start, etc.
The exec() system call: only files in the file system that have this particular format can be executed.
Often set to (0,1,2) to enable aux.
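Ahhagar objective: at a given resolution, choose the smallest bitrate beyond which a further increase in bitrate brings no significant improvement in perceptual quality.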
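A minimal sketch of that selection rule, assuming a measured list of (bitrate, quality) points for one resolution. `pick_bitrate`, the quality scale, and the `min_gain` threshold for "significant" are all hypothetical and not taken from the source.

```python
def pick_bitrate(ladder, min_gain=1.0):
    """Return the smallest bitrate after which quality stops improving noticeably.

    ladder: list of (bitrate_kbps, quality_score) pairs for one resolution.
    min_gain: assumed threshold; a gain below it counts as "not significant".
    """
    points = sorted(ladder)
    for i, (rate, quality) in enumerate(points):
        # Best quality reachable by spending more bits than this point.
        best_later = max((q for _, q in points[i + 1:]), default=quality)
        if best_later - quality < min_gain:
            return rate
    return points[-1][0]  # only reached when min_gain <= 0


ladder = [(1000, 70.0), (2000, 85.0), (3000, 92.0), (4000, 92.5), (6000, 92.8)]
chosen = pick_bitrate(ladder)
```

With these made-up numbers the rule settles on 3000 kbps, since no higher bitrate adds a full quality point.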
NN input
zero initialized before being fed into the Transformer decoder
The queries are initialized to 0, and the model learns from the prior knowledge in the image features.
$M_{l-1} \in \{0, 1\}^{N \times H_l W_l}$ is the binarized output (thresholded at 0.5) of the resized mask prediction of the previous $(l-1)$-th Transformer decoder layer.
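The mask prediction is a binary output: M(x,y) = 1 marks the foreground region of the mask, and 0 marks regions that are not attended to. M0 is produced from X0; each later mask is the resized, binarized mask prediction of the previous decoder layer.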
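A small sketch of the binarization step and of the additive attention mask built from it (0 on foreground, negative infinity elsewhere). The function names are hypothetical, and treating a probability of exactly 0.5 as foreground is an assumption; resizing is left out.

```python
NEG_INF = float("-inf")


def binarize(mask_probs, threshold=0.5):
    """Turn an H x W grid of mask probabilities into 0/1 values."""
    # Assumption: a value exactly at the threshold counts as foreground.
    return [[1 if p >= threshold else 0 for p in row] for row in mask_probs]


def attention_bias(binary_mask):
    """Flatten the mask into an additive bias: 0.0 where M = 1, -inf where M = 0."""
    return [0.0 if m == 1 else NEG_INF for row in binary_mask for m in row]


probs = [[0.9, 0.2],
         [0.4, 0.7]]
mask = binarize(probs)
bias = attention_bias(mask)
```

Adding this bias to the attention logits before the softmax gives zero weight to every location outside the predicted mask.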
the slow convergence of Transformer-based models is due to global context in the cross-attention layer
The improvement adopted is masked attention, which attends to the context of local regions, combined with self-attention to fuse context information.
we propose an efficient multi-scale strategy to utilize high-resolution features.
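The multi-scale, high-resolution features produced by the pixel decoder are fed into the Transformer decoder layers as keys and values, and the features are fused at the end.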
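One way to picture this, assuming the round-robin schedule described in the Mask2Former paper (one feature scale per decoder layer, cycling from low to high resolution). The scale labels and the layer count of 9 are assumptions for illustration.

```python
# Assumed pixel-decoder output scales, lowest resolution first.
SCALES = ["1/32", "1/16", "1/8"]


def scale_for_layer(layer_index):
    """Pick which feature scale supplies keys/values for a decoder layer."""
    return SCALES[layer_index % len(SCALES)]


# Nine decoder layers means three passes over the three scales.
schedule = [scale_for_layer(i) for i in range(9)]
```

Each layer only ever attends to one resolution, which keeps the cost of using the high-resolution maps bounded.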
masked attention operator
Cross-attention attends only to the region inside the mask, so it extracts local features.
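A minimal pure-Python sketch of masked cross-attention for a single query, assuming scaled dot-product scores. `masked_attention` is a hypothetical name; the residual connection and learned projections of the real layer are omitted, and the fallback to full attention for an empty mask is an assumption.

```python
import math


def masked_attention(query, keys, values, foreground):
    """Single-query cross-attention restricted to foreground locations.

    query: list of d floats; keys/values: one vector per image location;
    foreground: list of 0/1 flags, one per location (the binarized mask).
    """
    if not any(foreground):
        # Assumption: with an empty mask, attend to every location instead.
        foreground = [1] * len(keys)
    scale = 1.0 / math.sqrt(len(query))
    scores = [
        scale * sum(q * k for q, k in zip(query, key)) if keep else float("-inf")
        for key, keep in zip(keys, foreground)
    ]
    top = max(scores)                                 # finite: one location is kept
    weights = [math.exp(s - top) for s in scores]     # exp(-inf) is 0.0
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
values = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
out = masked_attention([1.0, 0.0], keys, values, foreground=[1, 0, 0])
```

Even though the third key scores highest, it lies outside the mask, so the output is just the value at the single foreground location.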
- Aug 2023
receives a belief state bct ∈ B from the environment
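I don't quite understand this: isn't the belief state produced by the agent? Or is the point here that the loss or reward is computed from the state produced by the agent and the belief state that comes from the environment?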
with no signs of saturation
W-MSA and SW-MSA denote window based multi-head self-attention using regular and shifted window partitioning configurations
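The two partitioning configurations can be sketched on a tiny grid of token indices. This assumes the grid size is divisible by the window size and that the shifted configuration is obtained by cyclically rolling the grid by half a window; the helper names are hypothetical, and the attention masking that the real model applies inside wrapped-around windows is omitted.

```python
def window_partition(grid, window):
    """Split an H x W grid into non-overlapping window x window blocks."""
    height, width = len(grid), len(grid[0])
    return [
        [grid[r][c] for r in range(top, top + window) for c in range(left, left + window)]
        for top in range(0, height, window)
        for left in range(0, width, window)
    ]


def cyclic_shift(grid, shift):
    """Roll the grid up and to the left by `shift` positions, wrapping around."""
    height, width = len(grid), len(grid[0])
    return [
        [grid[(r + shift) % height][(c + shift) % width] for c in range(width)]
        for r in range(height)
    ]


grid = [[r * 4 + c for c in range(4)] for r in range(4)]   # token ids 0..15
regular = window_partition(grid, 2)                        # W-MSA windows
shifted = window_partition(cyclic_shift(grid, 2 // 2), 2)  # SW-MSA windows
```

The first shifted window groups tokens 5, 6, 9, 10, which sit in four different regular windows, so information can cross the regular window boundaries.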
For long sequences, the indices can grow large in magnitude. If you normalize the index value to lie between 0 and 1, it can create problems for variable length sequences as they would be normalized differently.
This is why the raw index cannot be used as a token's positional encoding.
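A short sketch of the problem and of one common alternative. The sinusoidal encoding shown here is the standard scheme from the original Transformer paper, assumed to be what the quoted tutorial goes on to use; the function names and the `base` parameter name are hypothetical.

```python
import math


def normalized_position(index, length):
    """Index scaled to [0, 1]; the result depends on the sequence length."""
    return index / (length - 1)


def sinusoidal_encoding(position, dim, base=10000.0):
    """Fixed encoding with bounded values that do not depend on sequence length."""
    encoding = []
    for i in range(0, dim, 2):
        angle = position / (base ** (i / dim))
        encoding.append(math.sin(angle))   # even dimension
        encoding.append(math.cos(angle))   # odd dimension
    return encoding[:dim]


# The same token index gets different normalized values in different sequences.
short_seq = normalized_position(2, length=5)
long_seq = normalized_position(2, length=11)

# The sinusoidal encoding of position 2 is the same in both sequences.
pe = sinusoidal_encoding(2, dim=4)
```

Every component stays in [-1, 1] regardless of how long the sequence is, which avoids both the large-magnitude and the inconsistent-normalization problems.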
using ReLU is dramatically worse and in fact causes the performance of SE-ResNet-50 to drop below that of the ResNet-50 baseline
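To show where that nonlinearity sits, here is a bare-bones squeeze-and-excitation block in pure Python, assuming the usual squeeze (global average pool), two fully connected layers, and a sigmoid gate. Biases are omitted, and `se_block`, `w1`, and `w2` are hypothetical names with hand-picked toy weights.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def se_block(feature_maps, w1, w2):
    """Rescale each channel by a gate computed from global channel statistics.

    feature_maps: list of C maps, each an H x W list of lists.
    w1: hidden x C weights; w2: C x hidden weights (no biases in this sketch).
    """
    # Squeeze: one average value per channel.
    squeezed = [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]
    # Excitation: FC + ReLU, then FC + sigmoid (the gate the quote refers to).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in fmap] for fmap, g in zip(feature_maps, gates)]


scaled = se_block([[[2.0]], [[4.0]]], w1=[[1.0, 1.0]], w2=[[0.0], [0.0]])
```

The sigmoid keeps each gate in (0, 1); the quoted ablation is about swapping this final gate for a ReLU.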
- Jul 2023
Pyramid Scene Parsing Network
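- Scene parsing assigns a specific semantic label to every pixel of an image. The difficulty of the task depends on the complexity of the scenes and the diversity of the labels. The datasets are LMO, PASCAL VOC, and ADE20K, in increasing order of difficulty.
- Analysis of the figure showing FCN failure cases: FCN confuses objects that have similar shapes but appear in very different scenes, which leads to the conclusion that it lacks global prior knowledge. -> Use an improved spatial pyramid pooling to obtain global pixel information.
- Baseline network of this paper: FCN with dilated convolutions.
- baseline model: Global average pooling
- When learning global context information, global pooling alone works poorly: the pixel annotations in complex scene images (ADE20K) are complicated, and fusing everything directly into a single vector loses too much information. A finer-grained treatment of the scene is needed (global context is exploited through context aggregation over different regions, which is also a form of pooling).
- Pyramid pooling learns the scene context at multiple scales.
- With ResNet as the backbone network, the convolutions can be changed to 3×3, which saves space and adds nonlinear transformations.
- Dilated convolutions enlarge the receptive field. With the kernel size unchanged, they capture the global information and contextual dependencies of the input image more effectively.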
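The last two points can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions throughout and uses the standard formula for the effective size of a dilated kernel; the helper names are hypothetical.

```python
def effective_kernel(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stacked stride-1 convs, given (kernel, dilation) pairs."""
    field = 1
    for kernel, dilation in layers:
        field += effective_kernel(kernel, dilation) - 1
    return field


# A 3x3 kernel covers a 5x5 area with dilation 2 and a 9x9 area with dilation 4.
span_d2 = effective_kernel(3, 2)
span_d4 = effective_kernel(3, 4)

# Three stacked 3x3 convs see as far as one 7x7 conv, with fewer weights
# per input/output channel pair (3 * 9 = 27 versus 49).
stacked_field = stacked_receptive_field([(3, 1)] * 3)
weights_stacked, weights_single = 3 * 3 * 3, 7 * 7
```

The stacked version also applies a nonlinearity after each of the three layers, which is the "more nonlinear transformations" point in the notes.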
Figure 4
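- cls module: the main classifier, computes loss1
- aux module: with resnet50 as the backbone, aux_loss is the auxiliary classifier; it takes the output of the fourth-stage convolutions as input and computes loss2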
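A sketch of how the two losses might be combined for a single pixel, assuming both heads use cross-entropy and that the auxiliary term is down-weighted. The weight 0.4 is the value I recall from the PSPNet paper and should be treated as an assumption; `total_loss` and `aux_weight` are hypothetical names.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum - logits[target]


def total_loss(main_logits, aux_logits, target, aux_weight=0.4):
    """loss1 from the main classifier plus a weighted loss2 from the auxiliary one."""
    loss1 = cross_entropy(main_logits, target)
    loss2 = cross_entropy(aux_logits, target)
    return loss1 + aux_weight * loss2


loss = total_loss(main_logits=[2.0, 0.5, -1.0], aux_logits=[1.0, 1.0, 0.0], target=0)
```

In training this would be averaged over all pixels; at test time only the main classifier is used.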
Figure 3
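Assuming the backbone is resnet50:
- The 7×7 kernel of the first convolution is replaced by three 3×3 convolutions
- Dilated convolutions are used in the 3×3 convolutions of the six layers of the fourth stage (dilation=(2,2)) and the three layers of the fifth stage (dilation=(4,4))
- -> multi-scale pooling -> concatenate along the channel dimension -> final convolutions that classify the combined features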
Table 1
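Compares the pooling scales (1×1, 2×2, 3×3, 6×6), the pooling scheme (MAX or AVE), and the presence or absence of DR (Dimension Reduction: a 1x1 convolution layer applied after the pooling operation to reduce the number of channels of the feature map; it reduces the model's computation and memory usage and can improve generalization) - 1×1 is global pooling and only changes the number of channels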
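A pure-Python sketch of the average-pooling half of the pyramid on a single-channel map, assuming the bin boundaries used by typical adaptive average pooling (floor for the start, ceiling for the end). The function names are hypothetical; the 1x1 dimension reduction, upsampling, and channel concatenation are left out.

```python
def ceil_div(a, b):
    return -(-a // b)


def adaptive_avg_pool(feature, bins):
    """Average-pool an H x W map down to a bins x bins grid."""
    height, width = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        top, bottom = (i * height) // bins, ceil_div((i + 1) * height, bins)
        row = []
        for j in range(bins):
            left, right = (j * width) // bins, ceil_div((j + 1) * width, bins)
            cells = [feature[r][c] for r in range(top, bottom) for c in range(left, right)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


feature = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
pyramid = {bins: adaptive_avg_pool(feature, bins) for bins in (1, 2, 3, 6)}
```

The 1×1 level collapses the whole map into one global average, while the 6×6 level keeps the most spatial detail; the network combines all four.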
Figure 2
- May 2023
colab.research.google.com
much lower learning rate
initialize lower
Note that by default PyTorch handles images that are arranged [channel, height, width], but matplotlib expects images to be [height, width, channel], hence we need to permute our images before plotting them.
A new import is the _LRScheduler which we will use to implement our learning rate finder.
We will also show how to initialize the weights of our neural network and how to find a suitable learning rate using a modified version of the learning rate finder.
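The idea behind a learning rate finder can be sketched without any framework: raise the learning rate exponentially over a few iterations, track a smoothed loss, and stop once it diverges. The names, the smoothing factor, and the divergence factor below are assumptions for illustration, not the tutorial's actual code.

```python
def exponential_lrs(start_lr, end_lr, num_iter):
    """Learning rates growing geometrically from start_lr to end_lr (num_iter >= 2)."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_iter - 1)) for i in range(num_iter)]


def find_lr(lrs, losses, smooth=0.05, diverge_factor=5.0):
    """Return the learning rate with the lowest smoothed loss before divergence."""
    best_loss, best_lr, running = float("inf"), lrs[0], None
    for lr, loss in zip(lrs, losses):
        # Exponential moving average of the loss to damp batch noise.
        running = loss if running is None else smooth * loss + (1 - smooth) * running
        if running < best_loss:
            best_loss, best_lr = running, lr
        if running > diverge_factor * best_loss:
            break   # loss has blown up; larger rates are not worth trying
    return best_lr


lrs = exponential_lrs(1e-7, 10.0, num_iter=9)
```

In practice the losses would come from one training step per learning rate, and a rate somewhat below the one at the minimum is usually the safer pick.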
- Apr 2023
an end-to-end RLHF pipeline