27 Matching Annotations
  1. Sep 2023
    1. The xv6 shell uses the above calls to run programs on behalf of users. The main structure of the shell is simple; see main (user/sh.c:145). The main loop reads a line of input from the user with getcmd. Then it calls fork, which creates a copy of the shell process. The parent calls wait, while the child runs the command. For example, if the user had typed “echo hello” to the shell, runcmd would have been called with “echo hello” as the argument. runcmd (user/sh.c:58) runs the actual command. For “echo hello”, it would call exec (user/sh.c:78). If exec succeeds then the child will execute instructions from echo instead of runcmd. At some point echo will call exit, which will cause the parent to return from wait in main (user/sh.c:145).

      In the xv6 shell, when the user types a command, the shell creates a child process to run it while the parent waits for the child to finish. The flow is as follows (a small Python sketch of the same pattern appears after the list):

      1. The shell calls fork from its main loop (user/sh.c:145) to create a child process. The child is a copy of the parent, including its program, data, and file descriptors.

      2. In the child, the shell parses the user's command and decides what to do. This logic lives in runcmd (user/sh.c:58). Depending on the command type, runcmd calls different functions, for example exec to run an executable program or pipe to set up pipe communication.

      3. If the command is an executable program, the shell calls exec (user/sh.c:78) to run it. The child then executes the new program's code in place of the original shell code.

      4. At some point the program running in the child finishes its work and calls exit. This causes the parent to return from wait (user/sh.c:145) and continue with other work.
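
      A minimal Python sketch of the same fork/exec/wait pattern, using the POSIX calls exposed by the os module rather than xv6's C code (the "echo hello" command is just the example from the excerpt):

        import os, sys

        def run(cmd_argv):
            pid = os.fork()                 # create a child that is a copy of this process
            if pid == 0:
                # child: replace itself with the requested program (like runcmd -> exec)
                try:
                    os.execvp(cmd_argv[0], cmd_argv)
                except FileNotFoundError:
                    print(f"exec {cmd_argv[0]} failed")
                    sys.exit(1)
            else:
                # parent: block until the child calls exit
                os.wait()

        run(["echo", "hello"])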

    2. The file must have a particular format, which specifies which part of the file holds instructions, which part is data, at which instruction to start, etc.

      The exec() system call can only run files in the file system that follow a particular executable format; in xv6 this is the ELF format, which tells the kernel where the instructions and data are and where execution should start.
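
      As a quick illustration of "particular format": an ELF loader first checks the magic bytes at the start of the file before trusting the header fields. A tiny check in Python (the path /bin/ls is only an example and assumes a Unix host, not xv6):

        # Check the four ELF magic bytes at the start of an executable;
        # a loader does this before parsing the program headers.
        with open("/bin/ls", "rb") as f:   # example path
            magic = f.read(4)
        print(magic == b"\x7fELF")         # True for an ELF binary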

    1. select the minimum bitrate

      Ahhagar objective: at a given resolution, once raising the bitrate further brings no noticeable gain in perceived quality, select the minimum such bitrate.

    2. inputs

      NN input


    1. zero initialized before being fed into the Transformer decoder

      The queries are initialized to zero, and the model learns them from the prior knowledge contained in the image features.

    2. M_{l−1} ∈ {0, 1}^{N×H_l W_l} is the binarized output (thresholded at 0.5) of the resized mask prediction of the previous (l−1)-th Transformer decoder layer.

      The mask prediction is binarized: M(x, y) = 1 marks the foreground of the mask and 0 marks regions that are not attended to. M_0 is predicted from the initial query features X_0, and the attention mask used by layer l is the resized, binarized mask prediction of layer l−1.
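
      A minimal PyTorch sketch of that binarize-and-resize step (shapes and names are illustrative, not the paper's code):

        import torch
        import torch.nn.functional as F

        N, H, W = 100, 64, 64                      # illustrative sizes
        mask_logits = torch.randn(1, N, H, W)      # prediction from decoder layer l-1

        # Resize to the feature resolution (H_l, W_l) attended by layer l,
        # binarize at 0.5, and flatten to N x (H_l * W_l)
        H_l, W_l = 32, 32
        resized = F.interpolate(mask_logits, size=(H_l, W_l), mode="bilinear",
                                align_corners=False)
        M = (resized.sigmoid() > 0.5).flatten(2)   # boolean mask, shape [1, N, H_l*W_l]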

    3. the slow convergence of Transformer-based models is due to global context in the cross-attention layer

      The adopted fix is masked attention, which restricts cross-attention to the local (foreground) region, plus self-attention to fuse context information across queries.

    4. we propose an efficient multi-scale strategy to utilize high-resolution features.

      The multi-scale high-resolution features produced by the pixel decoder are fed as keys and values into successive Transformer decoder layers, and the features are fused at the end.
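
      A rough sketch of the round-robin feeding, assuming three flattened feature scales and plain cross-attention layers (masked attention, self-attention and FFNs are omitted; all names and sizes are illustrative):

        import torch
        import torch.nn as nn

        embed_dim, num_layers = 256, 9
        decoder_layers = nn.ModuleList(
            nn.MultiheadAttention(embed_dim, num_heads=8) for _ in range(num_layers)
        )

        queries = torch.zeros(100, 1, embed_dim)   # [num_queries, batch, d], zero-initialized
        # three scales from the pixel decoder, flattened to [H_l*W_l, batch, d]
        scales = [torch.randn(hw, 1, embed_dim) for hw in (32 * 32, 64 * 64, 128 * 128)]

        for l, layer in enumerate(decoder_layers):
            feats = scales[l % len(scales)]              # cycle through the scales
            queries, _ = layer(queries, feats, feats)    # queries attend to K = V = feats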

    5. masked attention operator

      Cross-attention is restricted to the foreground region of the predicted mask, so it extracts local features.
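
      A minimal sketch of such a masked cross-attention step (single head, no projections; a real implementation also has to handle queries whose mask is entirely empty):

        import torch

        def masked_attention(Q, K, V, M):
            # Q: [N, d] queries; K, V: [HW, d] flattened image features;
            # M: [N, HW] binary mask, 1 = foreground of each query's current mask
            logits = Q @ K.T / Q.shape[-1] ** 0.5
            logits = logits.masked_fill(M == 0, float("-inf"))  # block background positions
            return logits.softmax(dim=-1) @ V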

  2. Aug 2023
    1. receives a belief state b^c_t ∈ B from the environment

      I don't quite follow this: isn't the belief state produced by the agent? Or does it mean that the state produced by the agent and the belief state coming from the environment are used together to compute a loss or reward?


    1. with no signs of saturation

      No signs of saturation? I don't quite understand what this means.

    1. W-MSA and SW-MSA denote window based multi-head self-attention using regular and shifted window partitioning configurations

      The difference between W-MSA and SW-MSA.
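
      A toy sketch of how the two variants partition the feature map differently (window size and shapes are made up; only the partitioning is shown, not the attention itself):

        import torch

        x = torch.randn(1, 8, 8, 32)    # toy feature map [batch, H, W, C]
        window = 4

        # W-MSA: partition into regular non-overlapping window x window patches
        regular = x.view(1, 8 // window, window, 8 // window, window, 32)

        # SW-MSA: cyclically shift by half a window first, so tokens that sat on
        # window borders in the regular partition now share a window
        shifted = torch.roll(x, shifts=(-window // 2, -window // 2), dims=(1, 2))
        shifted = shifted.view(1, 8 // window, window, 8 // window, window, 32)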

    1. For long sequences, the indices can grow large in magnitude. If you normalize the index value to lie between 0 and 1, it can create problems for variable length sequences as they would be normalized differently.

      The reason why the raw index cannot be used as a token's positional encoding.
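
      For contrast, the standard sinusoidal encoding from the Transformer paper keeps every value bounded in [-1, 1] regardless of sequence length; a small sketch:

        import torch

        def sinusoidal_positional_encoding(max_len, d_model):
            # Each position is encoded by sines/cosines of different frequencies,
            # so values stay in [-1, 1] no matter how long the sequence is.
            pos = torch.arange(max_len).unsqueeze(1).float()
            i = torch.arange(0, d_model, 2).float()
            angles = pos / (10000 ** (i / d_model))
            pe = torch.zeros(max_len, d_model)
            pe[:, 0::2] = torch.sin(angles)
            pe[:, 1::2] = torch.cos(angles)
            return pe

        pe = sinusoidal_positional_encoding(max_len=50, d_model=128)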

    1. using ReLU is dramatically worse and in fact causes the performance of SE-ResNet-50 to drop below that of the ResNet-50 baseline

      Question: isn't the baseline the 24.7 / 7.8 result in Table 2?
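
      A minimal SE block sketch, just to make explicit which activation the excerpt is talking about (the final sigmoid gate, not the ReLU between the two FC layers); the reduction ratio and shapes are illustrative:

        import torch
        import torch.nn as nn

        class SEBlock(nn.Module):
            """Squeeze-and-Excitation: global average pool, two FC layers, sigmoid gate."""
            def __init__(self, channels, reduction=16):
                super().__init__()
                self.fc1 = nn.Linear(channels, channels // reduction)
                self.fc2 = nn.Linear(channels // reduction, channels)

            def forward(self, x):                  # x: [B, C, H, W]
                s = x.mean(dim=(2, 3))             # squeeze
                s = torch.relu(self.fc1(s))
                s = torch.sigmoid(self.fc2(s))     # replacing this sigmoid with ReLU
                return x * s[:, :, None, None]     #   is what the excerpt says hurts accuracy

        se = SEBlock(64)
        out = se(torch.randn(2, 64, 8, 8))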

    2. SE-PRE

      It performs well, so why is it not used?

  3. Jul 2023
    1. Pyramid Scene Parsing Network

      Notes gathered from Zhihu:

      • Scene parsing assigns a semantic label to every pixel of an image. Its difficulty depends on the complexity of the scene and the diversity of the labels; the datasets LMO, PASCAL VOC and ADE20K are increasingly hard in that order.
      • From the failure cases of FCN, the paper observes that FCN confuses objects that look similar but come from very different scenes, and concludes that the model lacks a global prior. -> It adopts an improved spatial pyramid pooling to obtain global pixel context.
      • The paper's baseline network: an FCN with dilated convolutions.
      • Baseline model: global average pooling.
      • For learning global context, plain global pooling works poorly: in complex scenes (ADE20K) the pixel labels are intricate, and collapsing everything into a single vector throws away too much information. A finer treatment of the scene is needed (aggregating context over different regions, still via pooling, to exploit global context).

      Takeaways from the whole paper:

      1. Pyramid-style pooling lets the model learn scene context at multiple scales (see the sketch after this list).
      2. Using ResNet as the backbone network allows the convolutions to be turned into 3x3 ones, saving space and adding non-linear transformations.
      3. Dilated convolution enlarges the receptive field: with the kernel size unchanged, it captures the global information and contextual relations of the input image more effectively.
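
      A compact PyTorch sketch of such a pyramid pooling module (PSPNet-style; BN/ReLU after the 1x1 convolutions are omitted for brevity, and all sizes are illustrative):

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class PyramidPooling(nn.Module):
            """Pool the feature map at several scales, reduce channels with 1x1 convs,
            upsample back, and concatenate with the input."""
            def __init__(self, in_ch, scales=(1, 2, 3, 6)):
                super().__init__()
                out_ch = in_ch // len(scales)
                self.stages = nn.ModuleList(
                    nn.Sequential(nn.AdaptiveAvgPool2d(s), nn.Conv2d(in_ch, out_ch, 1))
                    for s in scales
                )

            def forward(self, x):
                h, w = x.shape[2:]
                pooled = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                        align_corners=False) for stage in self.stages]
                return torch.cat([x] + pooled, dim=1)

        ppm = PyramidPooling(2048)
        out = ppm(torch.randn(1, 2048, 60, 60))   # -> [1, 4096, 60, 60]
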
    2. Figure 4

      Swapping the FCN backbone for ResNet: deeper networks perform better but suffer from vanishing and exploding gradients, so the paper uses deep supervision. Example: after stage 4 of ResNet-101 (res4b22), an additional classifier is attached. The loss2 computed by this classifier and the loss1 computed by the final classifier are combined with a weighted sum and backpropagated together, which improves learning.

      The last step of the code implementation:

      • cls module: the main classifier, which computes loss1
      • aux module: with ResNet-50 as the backbone, aux_loss is the auxiliary classifier; it takes the output of the stage-4 convolutions as input and computes loss2 (see the sketch after this list)
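
      A minimal sketch of how the two losses might be combined (the function and variable names are hypothetical; the 0.4 weight is the value used in the PSPNet paper):

        import torch.nn.functional as F

        def total_loss(main_logits, aux_logits, target, aux_weight=0.4):
            loss1 = F.cross_entropy(main_logits, target)   # final classifier (cls)
            loss2 = F.cross_entropy(aux_logits, target)    # auxiliary classifier (aux)
            return loss1 + aux_weight * loss2              # weighted sum, backprop through both
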
    3. Figure 3

      Assuming the backbone is ResNet-50: the first 7x7 convolution is replaced by three 3x3 convolutions; dilated convolution is used in the 3x3 convolutions of the six blocks of stage 4 (dilation=(2,2)) and the three blocks of stage 5 (dilation=(4,4)); then -> multi-scale pooling -> channel-wise concatenation -> a final convolution for classification.

    4. Table 1

      Compares different pooling scales (1x1, 2x2, 3x3, 6x6), pooling schemes (MAX vs. AVE), and with/without DR (Dimension Reduction: a 1x1 convolution after pooling that reduces the number of channels of the feature map, cutting computation and memory use and improving generalization). The 1x1 setting is global pooling and only changes the number of channels.

    5. Figure 2

      Figure 2 exposes FCN's problems: mismatched relationships (row 1), confusion between similar categories (row 2), and missing small objects (row 3).


  4. May 2023
    1. Note that by default PyTorch handles images that are arranged [channel, height, width], but matplotlib expects images to be [height, width, channel], hence we need to permute our images before plotting them.
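
      A quick illustration of that permute (the tensor shape is made up):

        import torch
        import matplotlib.pyplot as plt

        img = torch.rand(3, 32, 32)        # PyTorch layout: [channel, height, width]
        plt.imshow(img.permute(1, 2, 0))   # matplotlib wants [height, width, channel]
        plt.show()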

    2. A new import is the _LRScheduler which we will use to implement our learning rate finder.
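
      A minimal sketch of the kind of scheduler such a learning rate finder builds on: a subclass of _LRScheduler that sweeps the learning rate exponentially from each base LR up to end_lr over num_iter steps (the class name and arguments are illustrative, not necessarily the tutorial's exact code):

        from torch.optim.lr_scheduler import _LRScheduler

        class ExponentialLRSweep(_LRScheduler):
            def __init__(self, optimizer, end_lr, num_iter, last_epoch=-1):
                self.end_lr = end_lr
                self.num_iter = num_iter
                super().__init__(optimizer, last_epoch)

            def get_lr(self):
                # fraction of the sweep completed so far (last_epoch counts steps)
                r = self.last_epoch / max(self.num_iter - 1, 1)
                return [base_lr * (self.end_lr / base_lr) ** r for base_lr in self.base_lrs]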

    3. We will also show how to initialize the weights of our neural network and how to find a suitable learning rate using a modified version of the learning rate finder.
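
      A small example of the kind of weight initialization meant here (Kaiming-normal is just one common choice; the tutorial's exact scheme may differ):

        import torch.nn as nn

        def initialize_parameters(m):
            # Kaiming-normal for conv/linear weights, zeros for biases
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)

        model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                              nn.Flatten(), nn.Linear(16 * 30 * 30, 10))
        model.apply(initialize_parameters)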

  5. Apr 2023