accumulate
to accumulate
blurry reconstruction
Traditional striding/pooling produces blurry reconstructions; because pack/unpack loses no information, its reconstruction quality is much better.
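A minimal sketch of why packing is lossless while pooling is not, using a plain space-to-depth/depth-to-space pair in NumPy (the `pack`/`unpack` names and the `(channels, height, width)` layout are my assumptions for illustration; PackNet's actual layers additionally involve learned 3D convolutions):

```python
import numpy as np

def pack(x, r=2):
    # Space-to-depth: fold each r x r spatial block into channels (lossless).
    c, h, w = x.shape
    x = x.reshape(c, h // r, r, w // r, r)                      # split spatial dims into blocks
    return x.transpose(0, 2, 4, 1, 3).reshape(c * r * r, h // r, w // r)

def unpack(y, r=2):
    # Depth-to-space: the exact inverse of pack.
    cr2, h, w = y.shape
    c = cr2 // (r * r)
    y = y.reshape(c, r, r, h, w)
    return y.transpose(0, 3, 1, 4, 2).reshape(c, h * r, w * r)

x = np.random.rand(3, 4, 6)
assert pack(x).shape == (12, 2, 3)            # resolution halved, channels x4
assert np.array_equal(unpack(pack(x)), x)     # perfect reconstruction, nothing lost
```

A 2x2 max-pool, by contrast, keeps one value out of every four, so no inverse exists.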
illustrate
to illustrate; to explain clearly
2D convolutions are not designed to directly leverage the tiled structure of this feature space. Instead, we propose to first learn to expand this structured
2D convolutions are not directly suited to this tiled feature space.
The resulting tensor is at a reduced resolution, but in contrast to striding or pooling
The resulting tensor has reduced resolution, but unlike striding or pooling, the operation loses no information.
folding the spatial dimensions of convolutional feature maps into extra feature channels
Fold the spatial dimensions of the convolutional feature maps into extra feature channels.
information loss is not a necessary condition to learn representations capable of generalizing to different scenarios
Surprising claim: losing information is not required for learning representations that generalize.
Lv between the magnitude of the pose-translation component of the pose network prediction t̂ and the measured instantaneous velocity scalar v multiplied by the time difference between target and source frames ΔT(t→s),
A new velocity-supervision loss: it ties the translation magnitude ‖t̂‖ between two frames to the travelled distance v·ΔT.
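A sketch of the idea (variable names are mine): the loss is the gap between the predicted translation magnitude and the distance implied by the measured speed, L_v = | ‖t̂‖ − v·ΔT |, which pins down the metric scale.

```python
import numpy as np

def velocity_loss(t_hat, v, dt):
    # L_v = | ||t_hat|| - v * dt |: ties the predicted translation magnitude
    # to the distance actually travelled, resolving the monocular scale ambiguity.
    return abs(np.linalg.norm(t_hat) - v * dt)

# a car at 10 m/s with frames 0.1 s apart travels 1.0 m between frames
loss = velocity_loss(np.array([0.8, 0.0, 0.0]), v=10.0, dt=0.1)
assert np.isclose(loss, 0.2)   # predicted 0.8 m vs true 1.0 m
```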
instantaneous
instantaneous
Mt is a binary mask that avoids computing the photometric loss on the pixels that do not have a valid mapping
Pixels without a valid mapping are excluded from the photometric loss.
an appearance matching loss term Lp
appearance matching loss
Synthesis
synthesis
Photometric
photometric; photometry
impractical
impractical
alleviating
to alleviate
velocity
velocity
and rely on the ground-truth LiDAR measurements to scale their depth estimates appropriately for evaluation purposes
Ground-truth LiDAR is used to rescale the depth estimates for evaluation.
additional
where a depth and pose network are simultaneously learned from unlabeled monocular videos
Depth and ego-motion networks are learned jointly from unlabeled monocular videos.
provided an alternative strategy involving training a monocular depth network with stereo cameras, without requiring ground-truth depth labels
A good idea.
the availability of target depth labels became challenging
This calls the ground-truth data into question.
amortize
to amortize; to spread out a cost
not require supervised pretraining on ImageNet to achieve state-of-the-art results
Does pretraining necessarily improve performance?
DDAD enables much more accurate depth evaluation at range
Higher accuracy at long range.
Our third contribution is a new dataset:
The third contribution is a new dataset.
is a novel loss
The second contribution is a novel loss function based on camera velocity, designed to resolve the scale ambiguity inherent in monocular vision.
inherent scale ambiguity
the inherent scale ambiguity
velocity
velocity
propagate
to propagate
PackNet, for high-resolution self-supervised monocular depth estimation
main contribution
estimation have mostly focused on engineering the loss function
Recent work on self-supervised monocular depth estimation has mostly focused on designing the loss function.
Although self-supervised, our method outperforms other self-, semi-, and fully supervised methods on the KITTI benchmark
important
prerequisite
prerequisite
DDAD
Yet another new dataset.
inductive
inductive
symmetrical
symmetrical
ubiquitous
ubiquitous
exploit redundancies
to exploit redundancies
catastrophic
catastrophic
exploit the motion parallax
exploit motion parallax
which is a degenerate case for traditional structure from motion
Traditional structure-from-motion degenerates in this case.
identical
identical
concatenation
concatenation
Higher resolution gives the Base-* methods an advantage in depth accuracy, but on the other hand these methods are more prone to outliers.
The authors promoting their own approach.
This enhances the smoothness of estimated flow fields and the sharpness of motion discontinuities
Exactly: it constrains neighbouring depths not to differ too much.
Scale invariant gradient loss
Measures smoothness around a pixel at multiple scales — a very clever design.
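A rough sketch of the idea, not DeMoN's exact formulation (the paper compares normalized gradients over five spacings h ∈ {1, 2, 4, 8, 16} with an L2 norm; here a mean absolute difference over a few spacings):

```python
import numpy as np

def scale_inv_grad(f, h):
    # Normalized discrete gradient at spacing h: relative, not absolute, differences.
    gx = (f[:, h:] - f[:, :-h]) / (np.abs(f[:, h:]) + np.abs(f[:, :-h]) + 1e-8)
    gy = (f[h:, :] - f[:-h, :]) / (np.abs(f[h:, :]) + np.abs(f[:-h, :]) + 1e-8)
    return gx, gy

def scale_inv_grad_loss(pred, gt, spacings=(1, 2, 4)):
    # Compare normalized gradients at several spacings; penalizes relative depth
    # differences, so it is invariant to a global rescaling of the map.
    loss = 0.0
    for h in spacings:
        for gp, gg in zip(scale_inv_grad(pred, h), scale_inv_grad(gt, h)):
            loss += np.mean(np.abs(gp - gg))
    return loss

gt = np.random.rand(16, 16) + 1.0
assert scale_inv_grad_loss(gt, gt) == 0.0
assert scale_inv_grad_loss(2.0 * gt, gt) < 1e-6   # invariant to global scale
```

The normalization by |f(i+h)| + |f(i)| is what makes the gradient scale-invariant: doubling the whole depth map doubles both numerator and denominator.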
discontinuities
不连续
The rotation r = θv is an angle-axis representation with angle θ and axis v. The translation t is given in Cartesian coordinates.
r is expressed in angle-axis form; t in Cartesian coordinates.
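For reference, the angle-axis vector r = θv converts to a rotation matrix via Rodrigues' formula; a small self-contained sketch:

```python
import numpy as np

def angle_axis_to_matrix(r):
    # Rodrigues' formula: rotation vector r = theta * v  ->  3x3 rotation matrix.
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    v = r / theta
    K = np.array([[0, -v[2], v[1]],
                  [v[2], 0, -v[0]],
                  [-v[1], v[0], 0]])   # cross-product (skew-symmetric) matrix of the axis
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# 90 degrees about z maps the x-axis onto the y-axis
R = angle_axis_to_matrix(np.array([0.0, 0.0, np.pi / 2]))
assert np.allclose(R @ np.array([1.0, 0.0, 0.0]), [0.0, 1.0, 0.0], atol=1e-9)
```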
is identical to the bootstrap net, but it takes additional inputs.
Same architecture, but with additional inputs.
By feeding the optical flow estimate into the second encoder-decoder we let it make use of motion parallax
Feeding optical flow into the second encoder-decoder lets it exploit motion parallax.
gets the image pair as input and outputs the initial depth and motion estimates.
Input: image pair. Output: initial depth and motion estimates.
takes as input the optical flow, its confidence, the image pair, and the second image warped with the estimated flow field. Based on these inputs it estimates depth, surface normals, and camera motion.
Second encoder-decoder. Input: the optical flow from the first stage, its confidence, the image pair, and the second image warped with the estimated flow. Output: depth, surface normals, camera motion.
Schematic representation
schematic diagram
The network estimates not only depth and motion, but additionally surface normals, optical flow between the images and confidence of the matching
The network jointly estimates: 1. depth; 2. camera motion; 3. surface normals; 4. optical flow; 5. matching confidence.
successive
successive
egomotion
ego-motion
integrated
integrated
prone
prone to (errors)
homogeneous
homogeneous
scale-invariant
scale-invariant
emphasizes
emphasizes
invariant
invariant
magnitude
magnitude
deviations
deviations
penalize
to penalize
and stimulate synergy of the two tasks without over-fitting to a specific scenario
synergy
images with unknown camera motion can be determined only up to scale
The monocular scale ambiguity: without known motion or scene scale, absolute scale cannot be recovered.
Parameterization
parameterization
discontinuities
discontinuities
ambiguity
ambiguity
The second encoder-decoder
Input: image pair, optical flow, flow confidence, warped image. Output: depth, surface normals, camera motion.
recursively
recursively
DeMoN takes an image pair as input and predicts the depth map of the first image and the relative pose of the second camera.
Input: an image pair. Output: the depth map of the first image and the relative pose of the second camera.
explicitly
explicitly
implicitly
implicitly
temporal
temporal
semi-dense
semi-dense
Outliers
outliers
unconstrained pair of images
An arbitrary, unconstrained image pair.
In this paper, we succeed for the first time in training a convolutional network to jointly estimate the depth and the camera motion from an unconstrained pair of images.
The main contribution.
unconstrained
unconstrained
variational
variational
approximations
approximations
arguably
arguably
propagate
to propagate
aggregate
to aggregate
farthest
farthest
translation-invariant
translation-invariant
points that is irregular and orderless
A point cloud is an irregular, unordered set of points, unlike an image.
(1) point feature learning, (2) point mixture, and (3) flow refinement.
1. point feature learning; 2. point mixture; 3. flow refinement.
Figure 2
1. extract features (akin to downsampling); 2. learn motion; 3. upsample.
given P and Q
Given two point clouds, recover the motion between them.
XYZ only
Two point clouds containing only XYZ coordinates; they may differ in size, and there is no strict one-to-one correspondence between individual points.
key contributions
Key contributions.
fine tuning
Fine-tuning is required.
generalizability
Generalizability: trained on synthetic data, yet still performs well on real data.
our network
The main idea; read carefully.
given input point clouds from two consecutive frames (point cloud 1 and point cloud 2), our network estimates a translational flow vector for every point in the first frame to indicate its motion between the two frames
Given point clouds from two consecutive frames, the network outputs a translational flow vector for each point in cloud 1 describing its motion toward cloud 2.
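FlowNet3D learns this mapping end-to-end; a naive nearest-neighbour baseline (my own illustration, not the paper's method) produces output of the same shape, one translation vector per point of frame 1:

```python
import numpy as np

def nn_flow(p1, p2):
    # For each point in frame 1, the flow vector to its nearest neighbour in
    # frame 2. Same output format as FlowNet3D: an (N1, 3) array of flow vectors.
    d = np.linalg.norm(p1[:, None, :] - p2[None, :, :], axis=-1)   # (N1, N2) distances
    return p2[d.argmin(axis=1)] - p1                               # (N1, 3) flow vectors

# sanity check: a pure small translation of a sparse grid is recovered exactly
g = np.arange(4.0)
p1 = np.stack(np.meshgrid(g, g, g), -1).reshape(-1, 3)   # 64 points, spacing 1
flow = nn_flow(p1, p1 + np.array([0.1, 0.0, 0.0]))
assert np.allclose(flow, [0.1, 0.0, 0.0])
```

This baseline fails for large motions or occlusions, which is exactly what the learned feature matching in the point-mixture stage is for.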
(7)
Eq. (7): equal volumes; see margin note.
(6)
Eq. (6): equal areas.
mitigate
to mitigate
the parallelization
Parallel acceleration; arguably an unfair comparison.
Although this method seems very simple, it works surprisingly well as shown in the results
Indeed.
Piecewise processing
piecewise processing
In-frame motion compensation
Mainly addresses motion compensation.
cannot be constantly tracked
A key difference from spinning (line-scan) LiDARs.
we employ the LiDAR reflectivity as the 4th dimensional measurement
Makes sense.
local smoothness
LOAM
Fig. 6
Definitions and computations on the point cloud (a highlight).
To increase the localization and mapping accuracy, we remove any of the following points
Point-filtering rules worth borrowing: 1. points near the edge of the field of view (large curvature, > 17°); 2. points with abnormally large or small I(P); 3. points whose incidence angle θ is too large or too small; 4. at occlusion boundaries, discard the farther point.
(4)
Eq. (4): computes the incidence angle.
Small intensity I(P)
Physical meaning of I(P): the point is far away or the surface reflectivity is low.
Fig. 5
pipeline
Our contributions are
Main contributions: 1. adapt LOAM to the Livox scanning pattern; 2. small tricks that optimize LOAM; 3. (the key point) handling motion distortion.
Previous work was based on spinning LiDARs,
Earlier work mainly targeted spinning line-scan LiDARs.
better performance but cannot run in real-time
A trade-off: you rarely get both.
linearly interpolating the LiDAR pose
Seems feasible.
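A sketch of the idea for de-skewing a point captured at fraction s of a sweep: lerp the translation, slerp the rotation. Quaternions are `[w, x, y, z]` and the helper is my own, not the paper's code:

```python
import numpy as np

def interp_pose(t0, q0, t1, q1, s):
    # Linearly interpolate a pose between sweep start (t0, q0) and end (t1, q1):
    # lerp the translation, slerp the rotation (unit quaternions [w, x, y, z]).
    t = (1 - s) * t0 + s * t1
    d = float(np.dot(q0, q1))
    if d < 0.0:                     # take the shorter arc
        q1, d = -q1, -d
    if d > 0.9995:                  # nearly parallel: plain lerp is stable
        q = (1 - s) * q0 + s * q1
    else:
        th = np.arccos(d)
        q = (np.sin((1 - s) * th) * q0 + np.sin(s * th) * q1) / np.sin(th)
    return t, q / np.linalg.norm(q)

# halfway through a sweep that translates 1 m in x while yawing 180 degrees
t, q = interp_pose(np.zeros(3), np.array([1.0, 0, 0, 0]),
                   np.array([1.0, 0, 0]), np.array([0.0, 0, 0, 1.0]), 0.5)
assert np.allclose(t, [0.5, 0, 0])
assert np.allclose(q, [np.sqrt(0.5), 0, 0, np.sqrt(0.5)])   # 90 degree yaw
```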
[12] , [17] and [18]
These mainly address LiDAR motion distortion; worth a careful read.
key-points based method
Designs hand-crafted feature points.
generalized ICP
An improved ICP variant that exploits local planar structure.
ICP algorithm
Pros, cons, and applicable scenarios of ICP: well suited to dense point cloud registration, poorly suited to sparse clouds.
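For reference, a minimal point-to-point ICP, the classic variant (GICP generalizes the objective using local covariances); iteration count and convergence handling are simplified:

```python
import numpy as np

def icp(src, dst, iters=20):
    # Minimal point-to-point ICP: alternate nearest-neighbour association and
    # closed-form (SVD / Kabsch) rigid alignment. Needs dense-enough clouds
    # for nearest-neighbour matches to be meaningful.
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        cur = src @ R.T + t
        nn = dst[np.linalg.norm(cur[:, None] - dst[None], axis=-1).argmin(1)]
        mc, mn = cur.mean(0), nn.mean(0)
        U, _, Vt = np.linalg.svd((cur - mc).T @ (nn - mn))
        dR = Vt.T @ U.T
        if np.linalg.det(dR) < 0:            # avoid reflections
            Vt[-1] *= -1
            dR = Vt.T @ U.T
        R, t = dR @ R, dR @ (t - mc) + mn
    return R, t

# recover a small rigid motion of a dense grid
th = 0.05
Rtrue = np.array([[np.cos(th), -np.sin(th), 0], [np.sin(th), np.cos(th), 0], [0, 0, 1]])
ttrue = np.array([0.1, 0.05, 0.0])
g = np.arange(4.0)
src = np.stack(np.meshgrid(g, g, g), -1).reshape(-1, 3)
R, t = icp(src, src @ Rtrue.T + ttrue)
assert np.allclose(R, Rtrue, atol=1e-6) and np.allclose(t, ttrue, atol=1e-6)
```

With a sparse, irregularly sampled cloud the nearest-neighbour step associates wrong points, which is exactly why the note says ICP suits dense clouds.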
new features
New characteristics: 1. small field of view; 2. irregular sampling; 3. non-repetitive scanning; 4. motion blur.
This regular scanning greatly simplifies the feature extraction
Indeed.
linear (et) and angular (er) errors
Evaluation metrics.
Fig. 9. Statistics from real sensor data: Dispersion in the localization of the reference points (a) and calibration deviation from the final result (linear, in m, and angular, in rad) (b)
Experimental conclusions.
Table VI shows the linear (et) and angular (er) calibration errors
Evaluation metrics used by other work.
2)
Fig. 6
As the distance increases
A universal problem; little can be done.
Experimentation in the synthetic suite can be divided into three different focus points:
The three aspects the simulation experiments validate.
Gazebo simulator
simulation
except for the tuning of the pass-through filters mentioned in Sec. IV,
So not entirely automatic after all.
without user intervention
Then how exactly are those filters tuned?
On the one hand,
The simulation experiments are the main focus, with many quantitative results.
On the other hand
The real-world experiments are brief: the point cloud is reprojected onto the image, and that is all.
Umeyama registration procedure
Once correspondences are fixed, registration becomes an optimization problem.
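With the four hole centers matched, the rigid transform has a closed form (Umeyama/Kabsch; a sketch without Umeyama's scale term, since the calibration is rigid):

```python
import numpy as np

def umeyama_rigid(A, B):
    # Closed-form rigid alignment: find R, t minimizing sum ||R a_i + t - b_i||^2
    # over matched point pairs (rows of A and B).
    ma, mb = A.mean(0), B.mean(0)
    U, _, Vt = np.linalg.svd((B - mb).T @ (A - ma))
    S = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:           # keep a proper rotation, no reflection
        S[2, 2] = -1
    R = U @ S @ Vt
    return R, mb - R @ ma

# four coplanar "hole centers" in one sensor frame and the same centers, moved
A = np.array([[0., 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]])
th = np.pi / 6
Rt = np.array([[np.cos(th), -np.sin(th), 0], [np.sin(th), np.cos(th), 0], [0, 0, 1]])
B = A @ Rt.T + np.array([0.3, -0.2, 1.5])
R, t = umeyama_rigid(A, B)
assert np.allclose(R, Rt, atol=1e-9) and np.allclose(t, [0.3, -0.2, 1.5], atol=1e-9)
```

Four non-collinear correspondences are more than enough: three already determine the rigid transform, and the fourth adds redundancy against noise.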
Distances from this point to the other three determine the correct ordering
A similar idea.
Note that this condition would not be fulfilled when calibrating a front-facing 360° LiDAR scanner and a rear-looking camera, for instance
What does this mean? Presumably that the two sensors must share a field of view.
that is, pairs of points representing the same reference points in both clouds must be associated
The four holes must be matched one-to-one between the two data sources.
The goal of the registration step is to find the optimal transformation parameters θ̂ so that when the resulting transformation T̂ is applied
The goal.
2)
Did not understand this part.
The time necessary to complete the procedure depends on the sensor's framerate but is rarely longer than a few seconds.
Hard to believe.
N data frames
Accumulate N frames; N = 30 in the experiments.
To increase the robustness of the algorithm,
Several robustness measures. Multi-frame accumulation is clearly necessary: from a single frame, the extraction steps above may well find nothing.
requires the detection of ArUco markers,
For the image, ArUco marker detection is enough.
Pp
Back-project the 2D points to 3D. Note that after all the thresholding above, each frame must contain exactly four accurate points — will that really work well?
2D circle segmentation is used to extract a model of the pattern holes present in P4
3. 2D circle detection on P4, with many thresholds to identify the four holes — tedious.
XY-plane coincides with π and projecting all the 3D points onto π
2. Project onto the plane to obtain P4, a set of 2D points.
P3 contains only points representing the edges of the calibration target; that is, the outer borders and the holes.
P3 keeps only points on the target's edges: intersect the edge cloud P2 with the plane fitted in the first step, with a threshold to improve precision.
impose that the plane must be roughly vertical in sensor coordinates
Must it really be vertical?
RANSAC is applied to P1
1. RANSAC fits a plane to the filtered cloud.
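A minimal RANSAC plane fit of the kind used here (the threshold and iteration count are my placeholders, not the paper's settings):

```python
import numpy as np

def ransac_plane(pts, iters=200, thresh=0.02, rng=np.random.default_rng(0)):
    # Sample 3 points, form their plane n.x + d = 0, count inliers within
    # `thresh` of it; keep the best-supported model.
    best_n, best_d, best_in = None, 0.0, -1
    for _ in range(iters):
        a, b, c = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(b - a, c - a)
        if np.linalg.norm(n) < 1e-9:          # degenerate (collinear) sample
            continue
        n = n / np.linalg.norm(n)
        d = -n @ a
        inliers = int(np.sum(np.abs(pts @ n + d) < thresh))
        if inliers > best_in:
            best_n, best_d, best_in = n, d, inliers
    return best_n, best_d, best_in

# 100 points on the plane z = 1 plus 10 off-plane outliers
g = np.arange(10.0)
xx, yy = np.meshgrid(g, g)
plane = np.stack([xx.ravel(), yy.ravel(), np.ones(100)], 1)
outliers = np.array([[0.1 * i, 0.2 * i, 0.3 + 0.04 * i] for i in range(10)])
n, d, n_in = ransac_plane(np.vstack([plane, outliers]))
assert abs(n[2]) > 0.99 and n_in >= 100   # recovers the dominant plane
```

In the calibration setting the dominant plane is the target board, and the inlier set becomes P1's planar subset.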
The intended outcome is an estimate of the 3D location of the centers in sensor coordinates.
Now estimating the 3D positions of the four hole centers.
PS
The stereo point cloud after Sobel edge extraction.
Sobel filter is applied over the image
In the stereo case, edges are extracted from the image itself.
PS1
The point cloud after filtering.
the calibration target is expected to have some texture
Then why is plain white paper used in the real-world experiments?
PS0 can be straightforwardly obtained
The point cloud produced by stereo depth estimation.
PL2
The point cloud after edge-point extraction.
same scan plane
(1)
Eq. (1): edge-point extraction for the multi-beam LiDAR.
PL1
The filtered cloud; it is not entirely clear how points are removed — manually?
pass-through filters are applied in the three Cartesian coordinates to remove points outside the area where the target is to be placed
Careful preprocessing: discard all points outside the region where the target sits.
The monocular alternative is substantially different as it relies on the ArUco markers instead
The monocular variant relies on ArUco markers.
In all cases, the output of this stage is a set of four 3D points representing the center of the holes in the target, in local coordinates
The localization output: the 3D coordinates of the four hole centers, in local coordinates.
Three different variants of the procedure are proposed here, one per sensor type
A reference-point localization method for each of the three sensor types.
localization of
stage 1
registration
stage 2
the segmentation
stage 1
The proposed calibration algorithm, illustrated in Fig. 3
pipeline
As noticeable in the two different embodiments shown in Fig. 2,
The calibration target: four ArUco markers at the corners and four circular holes in the middle. As later sections show, this layout must be fixed.