Reproducing Neuralangelo
1. Setting up the environment
Create an instance with the following configuration:

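Before configuring the environment, switch the instance's pip index to the Tsinghua mirror (after switching, shut the instance down and start it again):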

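In autodl-tmp/neuralangelo/neuralangelo.yaml, change python=3.8 to python=3.9 and pin the versions of cudatoolkit, numpy and pytorch to avoid version incompatibilities later. (If the versions are not pinned, conda resolves whatever versions it finds in the conda-forge and pytorch channels, and the result is an incompatible set of packages. The pinned versions here were suggested by GPT for the existing setup: Ubuntu 20.04, CUDA 11.8 and an RTX 4090D GPU.) The modified file: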
name: neuralangelo
channels:
  - conda-forge
  - pytorch
dependencies:
  # general
  - gpustat
  - gdown
  - cudatoolkit=11.8
  - cmake
  # python general
  - python=3.9
  - pip
  - numpy>=1.21,<1.24
  - scipy
  - ipython
  - jupyterlab
  - cython
  - ninja
  - diskcache
  # pytorch
  - pytorch=2.0.0
  - torchvision
  - pip:
    - -r requirements.txt
In autodl-tmp/neuralangelo/requirements.txt, comment out the line that installs tiny-cuda-nn. The modified file:
addict
gdown
#git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
gpustat
icecream
imageio-ffmpeg
imutils
ipdb
k3d
kornia
lpips
matplotlib
mediapy
nvidia-ml-py3
open3d
opencv-python-headless
OpenEXR
pathlib
pillow
plotly
pyequilib
pyexr
PyMCubes
pyquaternion
pyyaml
requests
scikit-image
scikit-video
scipy
seaborn
tensorboard
termcolor
tqdm
trimesh
wandb
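Run the following commands to create the virtual environment and install the required dependencies and packages (the file edits above assume the repository is already cloned; in that case skip the git clone line):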
cd autodl-tmp
git clone https://github.com/NVlabs/neuralangelo.git
cd neuralangelo
conda env create --file neuralangelo.yaml
Output like the following indicates that the installation succeeded:
Successfully installed Flask-3.0.3 OpenEXR-3.2.4 PyMCubes-0.1.6 absl-py-2.1.0 addict-2.4.0 blinker-1.8.2 click-8.1.7 configargparse-1.7 contourpy-1.2.1 cycler-0.12.1 dash-2.17.1 dash-core-components-2.0.0 dash-html-components-2.0.0 dash-table-5.0.0 docker-pycreds-0.4.0 fonttools-4.53.1 gitdb-4.0.11 gitpython-3.1.43 grpcio-1.65.5 icecream-2.1.3 imageio-2.35.1 imageio-ffmpeg-0.5.1 imutils-0.5.4 ipdb-0.13.13 ipywidgets-8.1.3 itsdangerous-2.2.0 joblib-1.4.2 jupyterlab-widgets-3.0.11 k3d-2.16.1 kiwisolver-1.4.5 kornia-0.7.3 kornia-rs-0.1.5 lazy-loader-0.4 lpips-0.1.4 markdown-3.7 matplotlib-3.9.2 mediapy-1.2.2 msgpack-1.0.8 nvidia-ml-py3-7.352.0 open3d-0.18.0 opencv-python-headless-4.10.0.84 pandas-2.2.2 pathlib-1.0.1 plotly-5.23.0 protobuf-5.27.3 pyequilib-0.5.8 pyexr-0.4.0 pyparsing-3.1.2 pyquaternion-0.9.9 retrying-1.3.4 scikit-image-0.24.0 scikit-learn-1.5.1 scikit-video-1.1.11 seaborn-0.13.2 sentry-sdk-2.13.0 setproctitle-1.3.3 smmap-5.0.1 tenacity-9.0.0 tensorboard-2.17.1 tensorboard-data-server-0.7.2 termcolor-2.4.0 threadpoolctl-3.5.0 tifffile-2024.8.10 traittypes-0.2.1 trimesh-4.4.6 tzdata-2024.1 wandb-0.17.7 werkzeug-3.0.3 widgetsnbextension-4.0.11
done
#
# To activate this environment, use
#
# $ conda activate neuralangelo
#
# To deactivate an active environment, use
#
# $ conda deactivate
Then activate the neuralangelo environment:
conda activate neuralangelo
Then install tiny-cuda-nn separately (its installation fails often, which is why it is installed on its own here):
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
Run the following command; if it prints tiny-cuda-nn installed successfully, tiny-cuda-nn was installed correctly:
python -c "import torch; from tinycudann import NetworkWithInputEncoding; print('tiny-cuda-nn installed successfully')"
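The version pins from neuralangelo.yaml can be checked after installation as well. The sketch below is only an illustration, not part of the Neuralangelo repository: the helper names are hypothetical, and it assumes the pip distribution names are numpy and torch (conda calls the latter pytorch). It uses only the standard library, so it also runs where the packages are missing.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '1.23.5' or '2.0.0+cu118' into a tuple of up to three ints."""
    return tuple(int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3])


def in_range(version, lower, upper):
    """True when lower <= version < upper (half-open range, like the yaml pin)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


def report(package, lower, upper):
    """Describe whether an installed package falls inside the pinned range."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    status = "ok" if in_range(installed, lower, upper) else "outside pinned range"
    return f"{package} {installed}: {status}"


if __name__ == "__main__":
    # Ranges mirror the pins in neuralangelo.yaml: numpy>=1.21,<1.24 and pytorch=2.0.0.
    print(report("numpy", "1.21", "1.24"))
    print(report("torch", "2.0.0", "2.0.1"))
```

Run it inside the activated neuralangelo environment; a line ending in "outside pinned range" suggests conda or pip resolved a different version than intended.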
2. Preparing the DTU test dataset
Run the following command to initialize and update the repository's colmap submodule:
git submodule update --init --recursive
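Run the following commands to download the DTU dataset preprocessed by the NeuS authors and generate the json file:
PATH_TO_DTU=datasets/dtu # change to the DTU dataset root directory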
bash projects/neuralangelo/scripts/preprocess_dtu.sh ${PATH_TO_DTU}
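If the command above reports that it cannot connect to Dropbox, the dataset has to be downloaded manually. Download neus.zip, the DTU data preprocessed by the NeuS authors, from https://www.dropbox.com/sh/w0y8bbdmxzik3uk/AAAaZffBiJevxQzRskoOYcyja?dl=1 to your local machine and extract it to get six archives. Upload data_DTU.zip from among them to the AutoDL directory autodl-tmp/neuralangelo/datasets/dtu (a directory you create yourself):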

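After the manual download, run the following commands to extract the archive and delete data_DTU.zip:
cd ~/autodl-tmp/neuralangelo/datasets/dtu
unzip data_DTU.zip
rm data_DTU.zip
Then generate the json file (its format is the same as Instant-NGP's):
cd ~/autodl-tmp/neuralangelo
python3 projects/neuralangelo/scripts/convert_dtu_to_json.py --dtu_path datasets/dtu
A successful run prints the following: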

3. Training Neuralangelo on DTU
Modify the configuration file for training Neuralangelo on DTU, autodl-tmp/neuralangelo/projects/neuralangelo/configs/dtu.yaml, as follows:
# -----------------------------------------------------------------------------
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# -----------------------------------------------------------------------------
_parent_: projects/neuralangelo/configs/base.yaml
model:
    object:
        sdf:
            mlp:
                inside_out: False
            encoding:
                coarse2fine:
                    init_active_level: 4
        s_var:
            init_val: 1.4
    appear_embed:
        enabled: False
data:
    type: projects.neuralangelo.data
    root: datasets/dtu/dtu_scan63  # change this to the DTU scene you want to train on
    train:
        image_size: [1200,1600]
        batch_size: 1
        subset:
    val:
        image_size: [300,400]
        batch_size: 1
        subset: 1
        max_viz_samples: 16
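The _parent_ key means dtu.yaml only lists the values that differ from base.yaml. A minimal sketch of that kind of recursive override is shown below; it is an illustration of the idea with a hypothetical deep_update helper and trimmed-down dictionaries, not the repository's actual config loader.

```python
import copy


def deep_update(base, override):
    """Return a copy of base where nested keys found in override replace the originals."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend instead of replacing the whole subtree.
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed-down stand-ins for base.yaml and dtu.yaml (values taken from the listings).
base = {
    "model": {
        "object": {"s_var": {"init_val": 3.0, "anneal_end": 0.1}},
        "appear_embed": {"enabled": False, "dim": 8},
    },
    "data": {"root": "datasets/nerf-synthetic/lego", "num_workers": 2},
}
dtu = {
    "model": {"object": {"s_var": {"init_val": 1.4}}},
    "data": {"root": "datasets/dtu/dtu_scan63"},
}

cfg = deep_update(base, dtu)
print(cfg["model"]["object"]["s_var"])
print(cfg["data"])
```

Keys that dtu.yaml does not mention, such as anneal_end or num_workers, keep their base.yaml values, which is why edits to base.yaml affect every dataset config.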
Once the DTU test dataset is ready, start training with:
EXPERIMENT=dtu
GROUP=dtu_group
NAME=dtu_scan63
CONFIG=projects/neuralangelo/configs/${EXPERIMENT}.yaml
GPUS=1 # use >1 for multi-GPU training!
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar
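Running the command as-is produces two errors. The first relates to the wandb version or an incorrect wandb initialization; the second is a multi-process communication error, possibly related to GPU resource allocation or the data loader's pin_memory option.
For the first problem I chose to disable wandb outright. (WandB, short for "Weights and Biases", is an experiment-management tool for machine learning that helps track, record and visualize training: experiment tracking, data visualization, model management, team collaboration, report generation and so on.) wandb only records the training process, and getting training to run at all mattered more to me than logging it, so I disabled wandb by editing autodl-tmp/neuralangelo/projects/neuralangelo/configs/base.yaml, autodl-tmp/neuralangelo/train.py and autodl-tmp/neuralangelo/projects/neuralangelo/trainer.py.
For the second problem, try disabling pin_memory in the configuration or reducing num_workers in the data-loading section; this is also changed in base.yaml.
The modified autodl-tmp/neuralangelo/train.py: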
'''
-----------------------------------------------------------------------------
Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
NVIDIA CORPORATION and its licensors retain all intellectual property
and proprietary rights in and to this software, related documentation
and any modifications thereto. Any use, reproduction, disclosure or
distribution of this software and related documentation without an express
license agreement from NVIDIA CORPORATION is strictly prohibited.
-----------------------------------------------------------------------------
'''
import argparse
import os
import imaginaire.config
from imaginaire.config import Config, recursive_update_strict, parse_cmdline_arguments
from imaginaire.utils.cudnn import init_cudnn
from imaginaire.utils.distributed import init_dist, get_world_size, master_only_print as print, is_master
from imaginaire.utils.gpu_affinity import set_affinity
from imaginaire.trainers.utils.logging import init_logging
from imaginaire.trainers.utils.get_trainer import get_trainer
from imaginaire.utils.set_random_seed import set_random_seed

def parse_args():
    parser = argparse.ArgumentParser(description='Training')
    parser.add_argument('--config', help='Path to the training config file.', required=True)
    parser.add_argument('--logdir', help='Dir for saving logs and models.', default=None)
    parser.add_argument('--checkpoint', default=None, help='Checkpoint path.')
    parser.add_argument('--seed', type=int, default=0, help='Random seed.')
    parser.add_argument('--local_rank', type=int, default=os.getenv('LOCAL_RANK', 0))
    parser.add_argument('--single_gpu', action='store_true')
    parser.add_argument('--debug', action='store_true')
    parser.add_argument('--profile', action='store_true')
    parser.add_argument('--show_pbar', action='store_true')
    parser.add_argument('--wandb', action='store_true', help="Enable using Weights & Biases as the logger")
    #parser.add_argument('--wandb_name', default='default', type=str)
    parser.add_argument('--resume', action='store_true')
    args, cfg_cmd = parser.parse_known_args()
    return args, cfg_cmd


def main():
    args, cfg_cmd = parse_args()
    set_affinity(args.local_rank)
    cfg = Config(args.config)
    cfg_cmd = parse_cmdline_arguments(cfg_cmd)
    recursive_update_strict(cfg, cfg_cmd)
    # If args.single_gpu is set to True, we will disable distributed data parallel.
    if not args.single_gpu:
        # this disables nccl timeout
        os.environ["NCLL_BLOCKING_WAIT"] = "0"
        os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "0"
        cfg.local_rank = args.local_rank
        init_dist(cfg.local_rank, rank=-1, world_size=-1)
    print(f"Training with {get_world_size()} GPUs.")
    # set random seed by rank
    set_random_seed(args.seed, by_rank=True)
    # Global arguments.
    imaginaire.config.DEBUG = args.debug
    # Create log directory for storing training results.
    cfg.logdir = init_logging(args.config, args.logdir, makedir=True)
    # Print and save final config
    if is_master():
        cfg.print_config()
        cfg.save_config(cfg.logdir)
    # Initialize cudnn.
    init_cudnn(cfg.cudnn.deterministic, cfg.cudnn.benchmark)
    # Initialize data loaders and models.
    trainer = get_trainer(cfg, is_inference=False, seed=args.seed)
    trainer.set_data_loader(cfg, split="train")
    trainer.set_data_loader(cfg, split="val")
    trainer.checkpointer.load(args.checkpoint, args.resume, load_sch=True, load_opt=True)
    # Initialize Wandb.
    #trainer.init_wandb(cfg,
    #                   project=args.wandb_name,
    #                   mode="disabled" if args.debug or not args.wandb else "online",
    #                   resume=args.resume,
    #                   use_group=True)
    trainer.mode = 'train'
    # Start training.
    trainer.train(cfg,
                  trainer.train_data_loader,
                  single_gpu=args.single_gpu,
                  profile=args.profile,
                  show_pbar=args.show_pbar)
    # Finalize training.
    trainer.finalize(cfg)


if __name__ == "__main__":
    main()
The modified autodl-tmp/neuralangelo/projects/neuralangelo/configs/base.yaml:
# -----------------------------------------------------------------------------
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# -----------------------------------------------------------------------------
logging_iter: 9999999999999 # disable the printing logger
max_iter: 500000
wandb_scalar_iter: 9999999999
wandb_image_iter: 9999999999
validation_iter: 5000
speed_benchmark: False
checkpoint:
    save_iter: 20000
trainer:
    type: projects.neuralangelo.trainer
    ema_config:
        enabled: False
        load_ema_checkpoint: False
    loss_weight:
        render: 1.0
        eikonal: 0.1
        curvature: 5e-4
    init:
        type: none
    amp_config:
        enabled: False
    depth_vis_scale: 0.5
model:
    type: projects.neuralangelo.model
    object:
        sdf:
            mlp:
                num_layers: 1
                hidden_dim: 256
                skip: []
                activ: softplus
                activ_params:
                    beta: 100
                geometric_init: True
                weight_norm: True
                out_bias: 0.5
                inside_out: False
            encoding:
                type: hashgrid
                levels: 16
                hashgrid:
                    min_logres: 5
                    max_logres: 11
                    dict_size: 22
                    dim: 8
                    range: [-2,2]
                coarse2fine:
                    enabled: True
                    init_active_level: 4
                    step: 5000
            gradient:
                mode: numerical
                taps: 4
        rgb:
            mlp:
                num_layers: 4
                hidden_dim: 256
                skip: []
                activ: relu_
                activ_params: {}
                weight_norm: True
            mode: idr
            encoding_view:
                type: spherical
                levels: 3
        s_var:
            init_val: 3.
            anneal_end: 0.1
    background:
        enabled: True
        white: False
        mlp:
            num_layers: 8
            hidden_dim: 256
            skip: [4]
            num_layers_rgb: 2
            hidden_dim_rgb: 128
            skip_rgb: []
            activ: relu
            activ_params: {}
            activ_density: softplus
            activ_density_params: {}
        view_dep: True
        encoding:
            type: fourier
            levels: 10
        encoding_view:
            type: spherical
            levels: 3
    render:
        rand_rays: 512
        num_samples:
            coarse: 64
            fine: 16
            background: 32
        num_sample_hierarchy: 4
        stratified: True
    appear_embed:
        enabled: False
        dim: 8
optim:
    type: AdamW
    params:
        lr: 1e-3
        weight_decay: 1e-2
    sched:
        iteration_mode: True
        type: two_steps_with_warmup
        warm_up_end: 5000
        two_steps: [300000,400000]
        gamma: 10.0
data:
    type: projects.nerf.datasets.nerf_blender
    root: datasets/nerf-synthetic/lego
    use_multi_epoch_loader: True
    num_workers: 2
    preload: False
    num_images: # The number of training images.
    train:
        image_size: [800,800]
        batch_size: 2
        subset:
    val:
        image_size: [400,400]
        batch_size: 2
        subset: 4
        max_viz_samples: 16
    readjust:
        center: [0.,0.,0.]
        scale: 1.
The modified autodl-tmp/neuralangelo/projects/neuralangelo/trainer.py:
'''
-----------------------------------------------------------------------------
Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
NVIDIA CORPORATION and its licensors retain all intellectual property
and proprietary rights in and to this software, related documentation
and any modifications thereto. Any use, reproduction, disclosure or
distribution of this software and related documentation without an express
license agreement from NVIDIA CORPORATION is strictly prohibited.
-----------------------------------------------------------------------------
'''
import torch
import torch.nn.functional as torch_F
#import wandb
from imaginaire.utils.distributed import master_only
from imaginaire.utils.visualization import wandb_image
from projects.nerf.trainers.base import BaseTrainer
from projects.neuralangelo.utils.misc import get_scheduler, eikonal_loss, curvature_loss

class Trainer(BaseTrainer):

    def __init__(self, cfg, is_inference=True, seed=0):
        super().__init__(cfg, is_inference=is_inference, seed=seed)
        self.metrics = dict()
        self.warm_up_end = cfg.optim.sched.warm_up_end
        self.cfg_gradient = cfg.model.object.sdf.gradient
        if cfg.model.object.sdf.encoding.type == "hashgrid" and cfg.model.object.sdf.encoding.coarse2fine.enabled:
            self.c2f_step = cfg.model.object.sdf.encoding.coarse2fine.step
            self.model.module.neural_sdf.warm_up_end = self.warm_up_end

    def _init_loss(self, cfg):
        self.criteria["render"] = torch.nn.L1Loss()

    def setup_scheduler(self, cfg, optim):
        return get_scheduler(cfg.optim, optim)

    def _compute_loss(self, data, mode=None):
        if mode == "train":
            # Compute loss only on randomly sampled rays.
            self.losses["render"] = self.criteria["render"](data["rgb"], data["image_sampled"]) * 3 # FIXME:sumRGB?!
            self.metrics["psnr"] = -10 * torch_F.mse_loss(data["rgb"], data["image_sampled"]).log10()
            if "eikonal" in self.weights.keys():
                self.losses["eikonal"] = eikonal_loss(data["gradients"], outside=data["outside"])
            if "curvature" in self.weights:
                self.losses["curvature"] = curvature_loss(data["hessians"], outside=data["outside"])
        else:
            # Compute loss on the entire image.
            self.losses["render"] = self.criteria["render"](data["rgb_map"], data["image"])
            self.metrics["psnr"] = -10 * torch_F.mse_loss(data["rgb_map"], data["image"]).log10()

    def get_curvature_weight(self, current_iteration, init_weight):
        if "curvature" in self.weights:
            if current_iteration <= self.warm_up_end:
                self.weights["curvature"] = current_iteration / self.warm_up_end * init_weight
            else:
                model = self.model_module
                decay_factor = model.neural_sdf.growth_rate ** (model.neural_sdf.anneal_levels - 1)
                self.weights["curvature"] = init_weight / decay_factor

    def _start_of_iteration(self, data, current_iteration):
        model = self.model_module
        self.progress = model.progress = current_iteration / self.cfg.max_iter
        if self.cfg.model.object.sdf.encoding.coarse2fine.enabled:
            model.neural_sdf.set_active_levels(current_iteration)
            if self.cfg_gradient.mode == "numerical":
                model.neural_sdf.set_normal_epsilon()
                self.get_curvature_weight(current_iteration, self.cfg.trainer.loss_weight.curvature)
        elif self.cfg_gradient.mode == "numerical":
            model.neural_sdf.set_normal_epsilon()
        return super()._start_of_iteration(data, current_iteration)

    @master_only
    def log_wandb_scalars(self, data, mode="val"):
        # wandb logging disabled
        pass

    @master_only
    def log_wandb_images(self, data, mode="val", max_samples=None):
        # wandb logging disabled
        pass

    def train(self, cfg, data_loader, single_gpu=False, profile=False, show_pbar=False):
        self.progress = self.model_module.progress = self.current_iteration / self.cfg.max_iter
        super().train(cfg, data_loader, single_gpu, profile, show_pbar)
Then run the training command again:
EXPERIMENT=dtu
GROUP=dtu_group
NAME=dtu_scan63
CONFIG=projects/neuralangelo/configs/${EXPERIMENT}.yaml
GPUS=1 # use >1 for multi-GPU training!
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar
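A successful run prints the following:

Training runs for 500,000 iterations. Each pass takes about 6 seconds, which appears to be per epoch rather than per iteration: the final checkpoint name, epoch_10204_iteration_000500000, implies 10,204 epochs of 49 iterations each. Training Neuralangelo on DTU scene 63 therefore takes about 18.7 hours. All training outputs are stored in autodl-tmp/neuralangelo/logs/dtu_group/dtu_scan63.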
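The time estimate can be reproduced from the checkpoint file name. The sketch below is a worked calculation under one assumption, namely that the quoted 6 seconds is the duration of one epoch (one pass over the 49 training images) and not of one iteration; the helper names are hypothetical.

```python
import re


def parse_checkpoint_name(name):
    """Extract (epoch, iteration) from 'epoch_XXXXX_iteration_XXXXXXXXX_checkpoint.pt'."""
    match = re.search(r"epoch_(\d+)_iteration_(\d+)_checkpoint\.pt$", name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {name}")
    return int(match.group(1)), int(match.group(2))


def estimate_hours(epochs, seconds_per_epoch):
    """Total wall-clock time in hours for the given number of epochs."""
    return epochs * seconds_per_epoch / 3600.0


epochs, iterations = parse_checkpoint_name("epoch_10204_iteration_000500000_checkpoint.pt")
print("iterations per epoch:", round(iterations / epochs))
print("hours at 6.0 s/epoch:", round(estimate_hours(epochs, 6.0), 1))
print("hours at 6.6 s/epoch:", round(estimate_hours(epochs, 6.6), 1))
```

At exactly 6 seconds per epoch the total is about 17 hours; the reported 18.7 hours corresponds to roughly 6.6 seconds per epoch, which is consistent with "about 6 seconds" plus validation and checkpointing overhead.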
If training is interrupted, resume from a checkpoint file with:
EXPERIMENT=dtu
GROUP=dtu_group
NAME=dtu_scan63
CONFIG=projects/neuralangelo/configs/${EXPERIMENT}.yaml
GPUS=1 # use >1 for multi-GPU training!
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar \
--checkpoint=logs/${GROUP}/${NAME}/xxx.pt \
--resume
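The xxx.pt placeholder has to be replaced with a real checkpoint file. A small helper like the following could pick the most recent one; it is a sketch that assumes checkpoints follow the epoch_XXXXX_iteration_XXXXXXXXX_checkpoint.pt naming seen later in this post, and the function name is hypothetical.

```python
import re
from pathlib import Path

CHECKPOINT_PATTERN = re.compile(r"epoch_(\d+)_iteration_(\d+)_checkpoint\.pt$")


def latest_checkpoint(logdir):
    """Return the checkpoint path with the highest iteration number, or None."""
    best_path = None
    best_iteration = -1
    for path in Path(logdir).glob("*_checkpoint.pt"):
        match = CHECKPOINT_PATTERN.search(path.name)
        if match and int(match.group(2)) > best_iteration:
            best_iteration = int(match.group(2))
            best_path = path
    return best_path


if __name__ == "__main__":
    # Path taken from the training command above.
    print(latest_checkpoint("logs/dtu_group/dtu_scan63"))
```

The printed path can then be passed to --checkpoint in place of xxx.pt.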
Extract the mesh surface with:
CHECKPOINT=logs/dtu_group/dtu_scan63/epoch_10204_iteration_000500000_checkpoint.pt
OUTPUT_MESH=output/dtu_63.ply
CONFIG=logs/dtu_group/dtu_scan63/config.yaml
RESOLUTION=2048
BLOCK_RES=128
GPUS=1 # use >1 for multi-GPU mesh extraction
torchrun --nproc_per_node=${GPUS} projects/neuralangelo/scripts/extract_mesh.py \
--config=${CONFIG} \
--checkpoint=${CHECKPOINT} \
--output_file=${OUTPUT_MESH} \
--resolution=${RESOLUTION} \
--block_res=${BLOCK_RES}
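The mesh extracted for DTU scene 63:

The figure below shows DTU scene 63 reconstructed by NeuS:

The difference between the two is obvious: Neuralangelo reconstructs almost every visible apple stem, and the dents in the apple surface are less severe. Its geometric quality is higher than NeuS's, which is presumably why Neuralangelo scores higher.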
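4. Preparing the TNT test dataset
First download the Tanks and Temples dataset from the Tanks and Temples Benchmark website. The file structure should look as follows (the downloaded images need to be moved into the images_raw folder):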

Then use IrfanView to rename the images in images_raw to 000000.png, 000001.png and so on, convert them from jpg to png, and resize the images of every TNT scene to 1920 wide by 1080 high.
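For readers without IrfanView, the same renaming, conversion and resizing could be scripted. The following is a rough sketch rather than the procedure actually used here: the function names and the output directory are hypothetical, and it assumes Pillow is available (pillow is listed in requirements.txt). Resizing straight to 1920x1080 does not preserve the aspect ratio of images with a different shape.

```python
from pathlib import Path


def plan_renames(image_dir, output_dir, extensions=(".jpg", ".jpeg")):
    """Pair each source image (sorted by name) with a 000000.png-style target path."""
    sources = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    return [(src, Path(output_dir) / f"{index:06d}.png") for index, src in enumerate(sources)]


def convert_all(image_dir, output_dir, size=(1920, 1080)):
    """Convert every jpg to a resized png; size is (width, height)."""
    from PIL import Image  # imported lazily so plan_renames works without Pillow

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in plan_renames(image_dir, output_dir):
        with Image.open(src) as image:
            image.convert("RGB").resize(size, Image.LANCZOS).save(dst)


if __name__ == "__main__":
    # Hypothetical folders: downloaded jpgs in, renamed pngs out.
    convert_all("Barn/downloaded_jpg", "Barn/images_raw")
```

Sorting by file name keeps the numbering stable between runs, so the frame order matches the original capture order as long as the source names sort correctly.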
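Then compress the folder, upload it to autodl-tmp/neuralangelo/datasets/tanks_and_temples on the server used for "building NeuS datasets", and extract it there. (The tanks_and_temples/Barn directory has to be created by hand. I used that server because colmap is already installed on it; I tried reinstalling colmap but hit errors again, so on a whim I reused the old environment.) The figure below shows the Courthouse scene of the Tanks and Temples dataset: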

Then run the following commands to preprocess the data with colmap and generate a json file in the same format as Instant-NGP:
cd autodl-tmp/neuralangelo/datasets/tanks_and_temples
rm -rf .ipynb_checkpoints
cd Barn
rm -rf .ipynb_checkpoints
cd images_raw
rm -rf .ipynb_checkpoints
cd ~/autodl-tmp/neuralangelo
PATH_TO_TNT=datasets/tanks_and_temples # Modify this to be the Tanks and Temples root directory.
bash projects/neuralangelo/scripts/preprocess_tnt.sh ${PATH_TO_TNT}
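If the process finishes without errors and the final output contains writing …….to transform.json, the Tanks and Temples data was preprocessed successfully. After preprocessing, the scene directory contains the following files (shown here for the Barn scene):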
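A quick sanity check on the generated json file is to count its frames and compare the result with num_images in the dataset config. The sketch below assumes the Instant-NGP layout, in which the file holds a frames list whose entries carry file_path and transform_matrix; the function name and the example path are hypothetical.

```python
import json


def count_frames(json_path):
    """Return the number of frames, raising if any frame lacks the expected keys."""
    with open(json_path, "r", encoding="utf-8") as handle:
        meta = json.load(handle)
    frames = meta.get("frames", [])
    incomplete = [
        index for index, frame in enumerate(frames)
        if "file_path" not in frame or "transform_matrix" not in frame
    ]
    if incomplete:
        raise ValueError(f"frames missing keys: {incomplete}")
    return len(frames)


if __name__ == "__main__":
    # Example path only; point it at the json file the preprocessing step wrote.
    print(count_frames("datasets/tanks_and_temples/Barn/transforms.json"))
```

If colmap failed to register some images, the count will be lower than the number of files in the image folder, and num_images should follow the json file.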

5. Training Neuralangelo on the TNT dataset
Modify the configuration file for training Neuralangelo on TNT, autodl-tmp/neuralangelo/projects/neuralangelo/configs/tnt.yaml, as follows:
# -----------------------------------------------------------------------------
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# -----------------------------------------------------------------------------
_parent_: projects/neuralangelo/configs/base.yaml
model:
    object:
        sdf:
            mlp:
                inside_out: False # True for Meetingroom.
            encoding:
                coarse2fine:
                    init_active_level: 8
    appear_embed:
        enabled: True # enable the appearance embedding; training on TNT needs it
        dim: 8
data:
    type: projects.neuralangelo.data
    root: datasets/tanks_and_temples/Barn  # change to the TNT scene you want to train on
    num_images: 410 # set to the number of images in the dataset
    train:
        image_size: [1080,1920] # set to the resolution of the dataset
        batch_size: 4 # a training batch size of 16 on multiple GPUs is recommended for good TNT results
        subset:
    val:
        image_size: [300,540]
        batch_size: 4 # a validation batch size of 16 on multiple GPUs is recommended for good TNT results
        subset: 1
        max_viz_samples: 16
Modify autodl-tmp/neuralangelo/projects/neuralangelo/configs/base.yaml for TNT training as follows:
# -----------------------------------------------------------------------------
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# -----------------------------------------------------------------------------
logging_iter: 9999999999999 # disable the printing logger
max_iter: 500000
wandb_scalar_iter: 100
wandb_image_iter: 10000
validation_iter: 5000
speed_benchmark: False
checkpoint:
    save_iter: 250000
trainer:
    type: projects.neuralangelo.trainer
    ema_config:
        enabled: False
        load_ema_checkpoint: False
    loss_weight:
        render: 1.0
        eikonal: 0.1
        curvature: 5e-4
    init:
        type: none
    amp_config:
        enabled: False
    depth_vis_scale: 0.5
model:
    type: projects.neuralangelo.model
    object:
        sdf:
            mlp:
                num_layers: 1
                hidden_dim: 256
                skip: []
                activ: softplus
                activ_params:
                    beta: 100
                geometric_init: True
                weight_norm: True
                out_bias: 0.5
                inside_out: False
            encoding:
                type: hashgrid
                levels: 16
                hashgrid:
                    min_logres: 5
                    max_logres: 11
                    dict_size: 22
                    dim: 8
                    range: [-2,2]
                coarse2fine:
                    enabled: True
                    init_active_level: 4
                    step: 5000
            gradient:
                mode: numerical
                taps: 4
        rgb:
            mlp:
                num_layers: 4
                hidden_dim: 256
                skip: []
                activ: relu_
                activ_params: {}
                weight_norm: True
            mode: idr
            encoding_view:
                type: spherical
                levels: 3
        s_var:
            init_val: 3.
            anneal_end: 0.1
    background:
        enabled: True
        white: False
        mlp:
            num_layers: 8
            hidden_dim: 256
            skip: [4]
            num_layers_rgb: 2
            hidden_dim_rgb: 128
            skip_rgb: []
            activ: relu
            activ_params: {}
            activ_density: softplus
            activ_density_params: {}
        view_dep: True
        encoding:
            type: fourier
            levels: 10
        encoding_view:
            type: spherical
            levels: 3
    render:
        rand_rays: 512
        num_samples:
            coarse: 64
            fine: 16
            background: 32
        num_sample_hierarchy: 4
        stratified: True
    appear_embed:
        enabled: True # enable the appearance embedding; training on TNT needs it
        dim: 8
optim:
    type: AdamW
    params:
        lr: 1e-3
        weight_decay: 1e-2
    sched:
        iteration_mode: True
        type: two_steps_with_warmup
        warm_up_end: 5000
        two_steps: [300000,400000]
        gamma: 10.0
data:
    type: projects.nerf.datasets.nerf_blender
    root: datasets/tanks_and_temples/Barn # path to the TNT scene
    use_multi_epoch_loader: True
    num_workers: 2
    preload: False
    num_images: 410 # number of images in the dataset
    train:
        image_size: [800,800]
        batch_size: 4 # a training batch size of 16 on multiple GPUs is recommended for good TNT results
        subset:
    val:
        image_size: [400,400]
        batch_size: 4 # a validation batch size of 16 on multiple GPUs is recommended for good TNT results
        subset: 4
        max_viz_samples: 16
    readjust:
        center: [0.,0.,0.]
        scale: 1.
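Once the TNT test dataset is ready, start training with the following command (training and extracting a colored model requires the changes described in Neuralangelo代码修改.md):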
EXPERIMENT=tnt
GROUP=tnt_group
NAME=tnt_Barn
CONFIG=projects/neuralangelo/configs/${EXPERIMENT}.yaml
GPUS=8 # training on TNT needs multiple GPUs
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar \
--wandb_name="Neuralangelo_Color" \
--wandb
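A successful run prints the following:

Training runs for 500,000 iterations at roughly two minutes per pass. Because the TNT dataset is large, the run takes a long time; I estimate about two days.
If training is interrupted, resume from a checkpoint file with: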
EXPERIMENT=tnt
GROUP=tnt_group
NAME=tnt_Courthouse
CONFIG=projects/neuralangelo/configs/${EXPERIMENT}.yaml
GPUS=2 # use >1 for multi-GPU training!
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar \
--checkpoint=logs/${GROUP}/${NAME}/xxx.pt \
--resume \
--wandb_name="Neuralangelo_Color" \
--wandb
Extract the mesh surface with the following command (extracting a colored surface requires the changes described in Neuralangelo代码修改.md):
CHECKPOINT=logs/tnt_group/tnt_Courthouse/epoch_00226_iteration_000250000_checkpoint.pt
OUTPUT_MESH=output/tnt_Courthouse.ply
CONFIG=logs/tnt_group/tnt_Courthouse/config.yaml
RESOLUTION=2048
BLOCK_RES=128
GPUS=6 # use >1 for multi-GPU mesh extraction
torchrun --nproc_per_node=${GPUS} projects/neuralangelo/scripts/extract_mesh.py \
--config=${CONFIG} \
--checkpoint=${CHECKPOINT} \
--output_file=${OUTPUT_MESH} \
--resolution=${RESOLUTION} \
--block_res=${BLOCK_RES}
6. Training Neuralangelo on the big-leaf plant dataset
Neuralangelo uses the same dataset format as NeuS, i.e. the following files:

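However, Neuralangelo adds the multi-resolution hash positional encoding from Instant-NGP, so it also needs a json file in the same format as Instant-NGP's. The dataset format Neuralangelo uses is therefore: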

Building a plant dataset in Neuralangelo's format is then straightforward: take the plant dataset already processed for NeuS and upload it to autodl-tmp/neuralangelo/datasets/zhiwu:

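The json file is still missing. I generated it the same way as for the DTU dataset: first move the plant dataset into autodl-tmp/neuralangelo/datasets/dtu and rename it to dtu_scanXX with a scene number that DTU does not use, e.g. dtu_scan200, then run the following commands to generate its json file:
cd ~/autodl-tmp/neuralangelo
python3 projects/neuralangelo/scripts/convert_dtu_to_json.py --dtu_path datasets/dtu
As the figure below shows, a json file was also generated for dtu_scan200, which is actually our plant dataset: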

Then move dtu_scan200 back to autodl-tmp/neuralangelo/datasets/zhiwu and rename it to the plant dataset's name. The big-leaf plant dataset is now ready in Neuralangelo's format:
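The move, convert and move-back routine can be wrapped in two small helpers so that the temporary dtu_scan200 name is never left behind. This is a sketch with hypothetical function names; the conversion script itself is still run by hand between the two calls.

```python
import shutil
from pathlib import Path


def stage_as_dtu_scan(dataset_dir, dtu_root, scan_id):
    """Move a NeuS-format dataset to <dtu_root>/dtu_scan<scan_id> and return the new path."""
    target = Path(dtu_root) / f"dtu_scan{scan_id}"
    if target.exists():
        raise FileExistsError(f"{target} already exists; pick an unused scene number")
    shutil.move(str(dataset_dir), str(target))
    return target


def unstage(staged_dir, original_dir):
    """Move the staged dataset (now containing the json file) back to its own name."""
    shutil.move(str(staged_dir), str(original_dir))
    return Path(original_dir)


if __name__ == "__main__":
    plant = Path("datasets/zhiwu/dayezhiwu")
    staged = stage_as_dtu_scan(plant, "datasets/dtu", 200)
    input("Run convert_dtu_to_json.py now, then press Enter to move the data back...")
    unstage(staged, plant)
```

Refusing to overwrite an existing dtu_scan directory guards against accidentally clobbering a real DTU scene.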

Then write the yaml configuration file for the plant dataset, autodl-tmp/neuralangelo/projects/neuralangelo/configs/zhiwu.yaml:
# your_plant_dataset.yaml
_parent_: projects/neuralangelo/configs/base.yaml  # every dataset config inherits from base.yaml
model:
    object:
        sdf:
            mlp:
                inside_out: False
            encoding:
                coarse2fine:
                    init_active_level: 4
        s_var:
            init_val: 1.4
    appear_embed:
        enabled: False
data:
    type: projects.neuralangelo.data # dataset processing module
    root: datasets/zhiwu/dayezhiwu # dataset root directory
    train:
        image_size: [1728,1152] # adjust to the image size of the plant dataset
        batch_size: 1
        subset: # set this to train on a subset of the data
    val:
        image_size: [300,400]
        batch_size: 1
        subset: 1
        max_viz_samples: 16
Modify autodl-tmp/neuralangelo/projects/neuralangelo/configs/base.yaml, autodl-tmp/neuralangelo/train.py and autodl-tmp/neuralangelo/projects/neuralangelo/trainer.py in the same way as for training on DTU.
Start training Neuralangelo on the big-leaf plant dataset with:
EXPERIMENT=zhiwu
GROUP=zhiwu_group
NAME=dayezhiwu
CONFIG=projects/neuralangelo/configs/${EXPERIMENT}.yaml
GPUS=1 # use >1 for multi-GPU training!
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar
Extract the mesh surface with:
CHECKPOINT=logs/zhiwu_group/N3/epoch_04098_iteration_000500000_checkpoint.pt
OUTPUT_MESH=output/N3.ply
CONFIG=logs/zhiwu_group/N3/config.yaml
RESOLUTION=2048
BLOCK_RES=128
GPUS=1 # use >1 for multi-GPU mesh extraction
torchrun --nproc_per_node=${GPUS} projects/neuralangelo/scripts/extract_mesh.py \
--config=${CONFIG} \
--checkpoint=${CHECKPOINT} \
--output_file=${OUTPUT_MESH} \
--resolution=${RESOLUTION} \
--block_res=${BLOCK_RES}
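Training runs for 500,000 iterations at about 16 seconds per pass (apparently per epoch rather than per iteration, given the total), so training Neuralangelo on the big-leaf plant dataset takes about 19.1 hours.
The mesh extracted for the big-leaf plant: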

The big-leaf plant reconstructed by NeuS:

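The enlarged leaves below show that the texture of Neuralangelo's reconstruction is far finer than that of NeuS's, which again indicates that Neuralangelo's texture detail and geometric quality are better than NeuS's: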


I also trained Neuralangelo on the N3 plant dataset. The mesh extracted for N3:

N3 reconstructed by NeuS:

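To my surprise, Neuralangelo reconstructed the stems of N3. I had expected it to fail on the stems just as NeuS did, because among the surface reconstruction methods I had tried, only the colored ones, Color-NeuS and NeuralRecon-W, recovered the N3 stems; all the others failed. I had therefore assumed that consistent color output at the sample points was what helped the network tell different parts of the geometry apart, improving geometric quality enough to recover the stems. Since Neuralangelo is not a colored surface reconstruction method and has no such color-consistency constraint, I naturally assumed it could not reconstruct the N3 stems. Yet it did. So color consistency at sample points is not the only thing that can recover the N3 stems: Neuralangelo's two innovations, numerical gradients and progressive level-of-detail optimization, can do so as well.
Neuralangelo's first innovation, numerical gradients, lets backpropagation update the surface globally rather than only locally, which makes the whole surface smoother (the apples in DTU scene 63 are less dented and their surfaces are flatter and smoother). Its second innovation is to progressively optimize both the numerical-gradient step size ϵ and the hash-grid resolution. The aim is to capture the overall shape of the object first (which is how the plant stems get reconstructed) and then refine step by step to recover fine texture detail (which is why even the bottom stems of N3 are reconstructed). Together, these two innovations greatly help the network recognize the geometry of the object's surface and raise the geometric quality of the reconstruction, so that the N3 stems are recovered completely.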
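To make the numerical-gradient idea concrete, here is a minimal sketch using central differences on an analytic sphere SDF. It is an illustration only: the sphere and the six-sample scheme are stand-ins chosen for clarity, whereas the configuration above uses taps: 4 and a learned SDF.

```python
import math


def sphere_sdf(point, radius=1.0):
    """Signed distance from a 3D point to a sphere centered at the origin."""
    x, y, z = point
    return math.sqrt(x * x + y * y + z * z) - radius


def numerical_gradient(sdf, point, eps):
    """Central-difference gradient: (f(x + eps) - f(x - eps)) / (2 * eps) per axis."""
    gradient = []
    for axis in range(3):
        plus = list(point)
        minus = list(point)
        plus[axis] += eps
        minus[axis] -= eps
        gradient.append((sdf(plus) - sdf(minus)) / (2.0 * eps))
    return gradient


normal = numerical_gradient(sphere_sdf, (1.0, 2.0, 2.0), eps=1e-3)
print(normal)
print(math.sqrt(sum(c * c for c in normal)))
```

The gradient at a point depends on SDF values a distance eps away. With a hash-grid encoding, a large eps means those neighboring samples fall into other grid cells, so one surface-normal loss term can update several cells at once; this is the intuition behind the smoother surfaces described above.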

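Neuralangelo, NeuralRecon-W, Color-NeuS
The figure above shows that Neuralangelo reconstructs the stems at the bottom of the N3 plant, and its geometric quality is higher than that of NeuralRecon-W and Color-NeuS. My explanation is that NeuralRecon-W and Color-NeuS rely on consistent color output at sample points to help the network identify the N3 stems, but the bottom of the stems is too close in color to the green leaves lying on the pot, so color consistency cannot separate the stem base and reconstruction fails there. Why does Neuralangelo succeed? It does not rely on color consistency to improve geometry. I believe it recovers the stem base because of its second innovation, progressive level-of-detail optimization. It first uses a large numerical-gradient step ϵ and a low-resolution (coarse) hash grid to capture the overall shape, at which stage the stem base and the pot are fused into one blob. It then uses a smaller ϵ and a higher-resolution (fine) hash grid to optimize detail, gradually separating the stem base from the pot until their shapes are distinct.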
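The coarse-to-fine behavior can be illustrated with the numbers from base.yaml (levels: 16, min_logres: 5, max_logres: 11, init_active_level: 4, step: 5000, warm_up_end: 5000). The schedule below is a sketch modeled on those values; the function names are hypothetical and the exact rules in the repository may differ.

```python
def level_resolutions(levels=16, min_logres=5, max_logres=11):
    """Grid resolution per level, growing geometrically from 2**5 to 2**11."""
    growth = 2.0 ** ((max_logres - min_logres) / (levels - 1))
    return [(2.0 ** min_logres) * growth ** level for level in range(levels)]


def schedule(iteration, levels=16, init_active_level=4, step=5000, warm_up_end=5000):
    """Return (active hash-grid levels, numerical-gradient eps) for an iteration."""
    # One more level is annealed in every `step` iterations after warm-up.
    annealed = max((iteration - warm_up_end) // step + 1, 1)
    annealed = min(annealed, levels)
    active = max(annealed, init_active_level)
    # eps follows the cell size of the annealed level: coarse early, fine late.
    eps = 1.0 / level_resolutions(levels)[annealed - 1]
    return active, eps


for it in (0, 25000, 50000, 100000, 500000):
    active, eps = schedule(it)
    print(f"iter {it:>6}: active levels = {active:2d}, eps = {eps:.5f}")
```

Early in training eps equals the coarsest cell size (1/32) and only a few coarse levels are active, which matches the stage where the stem base and the pot are still one blob; by the end all 16 levels are active and eps has shrunk to 1/2048, which allows the thin stems to separate from the pot.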