Running Kaldi AIShell v1 S5 Step by Step, Part 4: DNN (nnet3, xent, MPE)

  • Acknowledgments
  • Machine configuration
    • Problem: the graphics card is old and there is only one GPU. How do you still train a TDNN model?
  • Part 11: nnet3 DNN
  • Part 12: nnet3 training, decoding, and scoring
  • Part 13: Training iterations and lattice depth
  • Part 14: Chain

Acknowledgments

Thanks to AIShell for its exploration on the road to commercialization. Looking forward to v3.

Machine configuration

sv@HP:~$ sudo lsb_release -a
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic

sv@HP:~$ cat /proc/cpuinfo | grep model\ name
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
sv@HP:~$ cat /proc/meminfo | grep MemTotal
MemTotal:       16321360 kB
sv@HP:~$ lspci | grep 'VGA'
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)

Problem: the graphics card is old and there is only one GPU. How do you still train a TDNN model?

**Answer:**
Set both num-jobs-initial and num-jobs-final to 1, reduce the number of epochs to 2 or 3, and put the GPU into exclusive compute mode:
sv@HP:~/lkaldi/egs/aishell/s5$ sudo nvidia-smi -c 3
[sudo] password for sv: 
Set compute mode to EXCLUSIVE_PROCESS for GPU 00000000:01:00.0.
All done.
sv@HP:~/lkaldi/egs/aishell/s5$ sudo nvidia-smi
Wed Jan 16 10:31:58 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| 27%   31C    P8     7W / 151W |    225MiB /  8116MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1432      G   /usr/lib/xorg/Xorg                           125MiB |
|    0      1645      G   /usr/bin/gnome-shell                          94MiB |
|    0      2622      G   /opt/firefox/firefox-bin                       3MiB |
+-----------------------------------------------------------------------------+
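
Concretely, that advice becomes the following steps/nnet3/train_dnn.py options; the complete invocation with all remaining arguments appears verbatim in the training log in Part 12:

    steps/nnet3/train_dnn.py \
      --trainer.num-epochs 2 \
      --trainer.optimization.num-jobs-initial 1 \
      --trainer.optimization.num-jobs-final 1 \
      --use-gpu true \
      ...    # all other options exactly as in the invocation logged in Part 12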

Part 11: nnet3 DNN

After training (the full walkthrough follows in Parts 12 and 13), the scores of every model so far can be collected with the loop below. Although the lines print %WER, they come from the cer_* files, so for Chinese these are character error rates; the cer_LMWT_PENALTY suffix records the language-model weight and word-insertion penalty of the best combination picked by utils/best_wer.sh.

sv@HP:~/lkaldi/egs/aishell/s5$ for x in exp/*/decode_test exp/*/*/decode_test; do [ -d $x ] && grep WER $x/cer_* | utils/best_wer.sh; done 2>/dev/null

%WER 36.59 [ 38335 / 104765, 849 ins, 3183 del, 34303 sub ] exp/mono/decode_test/cer_10_0.0
%WER 18.83 [ 19727 / 104765, 971 ins, 1161 del, 17595 sub ] exp/tri1/decode_test/cer_13_0.5
%WER 18.79 [ 19684 / 104765, 957 ins, 1142 del, 17585 sub ] exp/tri2/decode_test/cer_14_0.5
%WER 16.84 [ 17643 / 104765, 791 ins, 991 del, 15861 sub ] exp/tri3a/decode_test/cer_14_0.5
%WER 13.63 [ 14277 / 104765, 762 ins, 639 del, 12876 sub ] exp/tri4a/decode_test/cer_13_0.5
%WER 8.68 [ 9097 / 104765, 355 ins, 464 del, 8278 sub ] exp/nnet3/tdnn_sp/decode_test/cer_14_1.0

Part 12: nnet3 training, decoding, and scoring

sv@HP:~/lkaldi/egs/aishell/s5$ local/nnet3/run_tdnn.sh
local/nnet3/run_ivector_common.sh: expected file data/train/feats.scp to exist

The first attempt aborts because data/train/feats.scp (produced by the MFCC stage of the main run.sh) is missing; regenerate it and try again. The second attempt then refuses to reuse a stale speed-perturbed directory:

sv@HP:~/lkaldi/egs/aishell/s5$ local/nnet3/run_tdnn.sh
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
utils/data/perturb_data_dir_speed_3way.sh: data/train_sp/feats.scp already exists: refusing to run this (please delete data/train_sp/feats.scp if you want this to run)

After deleting data/train_sp/feats.scp as the script suggests, the third run goes through:

sv@HP:~/lkaldi/egs/aishell/s5$ local/nnet3/run_tdnn.sh
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
utils/data/perturb_data_dir_speed_3way.sh: making sure the utt2dur and the reco2dur files are present
... in data/train, because obtaining it after speed-perturbing
... would be very slow, and you might need them.
utils/data/get_utt2dur.sh: data/train/utt2dur already exists with the expected length.  We won't recompute it.
utils/data/get_reco2dur.sh: data/train/reco2dur already exists with the expected length.  We won't recompute it.
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train, in data/train_sp_speed0.9
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_speed0.9
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train, in data/train_sp_speed1.1
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_speed1.1
utils/data/combine_data.sh data/train_sp data/train data/train_sp_speed0.9 data/train_sp_speed1.1
utils/data/combine_data.sh: combined utt2uniq
utils/data/combine_data.sh [info]: not combining segments as it does not exist
utils/data/combine_data.sh: combined utt2spk
utils/data/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/data/combine_data.sh: combined utt2dur
utils/data/combine_data.sh: combined reco2dur
utils/data/combine_data.sh [info]: not combining feats.scp as it does not exist everywhere
utils/data/combine_data.sh: combined text
utils/data/combine_data.sh [info]: not combining cmvn.scp as it does not exist everywhere
utils/data/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/data/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/data/combine_data.sh: combined wav.scp
utils/data/combine_data.sh [info]: not combining spk2gender as it does not exist
fix_data_dir.sh: kept all 360294 utterances.
fix_data_dir.sh: old files are kept in data/train_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: generated 3-way speed-perturbed version of data in data/train, in data/train_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp
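
Note the count: 3-way speed perturbation turns AIShell's 120098 training utterances into 3 × 120098 = 360294. No new audio is stored; utils/data/perturb_data_dir_speed.sh rewrites wav.scp so that sox generates the 0.9x and 1.1x versions on the fly (sox's speed effect shifts tempo and pitch together). A sketch of the added entries, with an illustrative path:

    sp0.9-BAC009S0002W0122 sox -t wav /data/train/S0002/BAC009S0002W0122.wav -t wav - speed 0.9 |
    sp1.1-BAC009S0002W0122 sox -t wav /data/train/S0002/BAC009S0002W0122.wav -t wav - speed 1.1 |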
local/nnet3/run_ivector_common.sh: making MFCC features for low-resolution speed-perturbed data
steps/make_mfcc_pitch.sh --cmd run.pl --mem 8G --nj 70 data/train_sp exp/make_mfcc/train_sp mfcc_perturbed
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for train_sp
steps/compute_cmvn_stats.sh data/train_sp exp/make_mfcc/train_sp mfcc_perturbed
Succeeded creating CMVN stats for train_sp
fix_data_dir.sh: kept all 360294 utterances.
fix_data_dir.sh: old files are kept in data/train_sp/.backup
local/nnet3/run_ivector_common.sh: aligning with the perturbed low-resolution data
steps/align_fmllr.sh --nj 30 --cmd run.pl --mem 8G data/train_sp data/lang exp/tri5a exp/tri5a_sp_ali
steps/align_fmllr.sh: feature type is lda
steps/align_fmllr.sh: compiling training graphs
steps/align_fmllr.sh: aligning data in data/train_sp using exp/tri5a/final.alimdl and speaker-independent features.
steps/align_fmllr.sh: computing fMLLR transforms
steps/align_fmllr.sh: doing final alignment.
steps/align_fmllr.sh: done aligning data.
steps/diagnostic/analyze_alignments.sh --cmd run.pl --mem 8G data/lang exp/tri5a_sp_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri5a_sp_ali/log/analyze_alignments.log
403 warnings in exp/tri5a_sp_ali/log/align_pass2.*.log
2 warnings in exp/tri5a_sp_ali/log/fmllr.*.log
385 warnings in exp/tri5a_sp_ali/log/align_pass1.*.log

A few hundred alignment warnings out of 360294 utterances (well under 0.5%) is normal at this scale and nothing to worry about.
local/nnet3/run_ivector_common.sh: creating high-resolution MFCC features
utils/copy_data_dir.sh: copied data from data/train_sp to data/train_sp_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires
utils/copy_data_dir.sh: copied data from data/dev to data/dev_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires
utils/copy_data_dir.sh: copied data from data/test to data/test_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires
utils/data/perturb_data_dir_volume.sh: data/train_sp_hires/feats.scp exists; moving it to data/train_sp_hires/.backup/ as it wouldn't be valid any more.
utils/data/perturb_data_dir_volume.sh: added volume perturbation to the data in data/train_sp_hires
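
Volume perturbation also works by rewriting wav.scp rather than writing new audio: each utterance's pipe gets a random gain appended. That is why the stale feats.scp is moved to .backup first; features extracted from the unscaled audio would no longer match. A sketch of a resulting entry (illustrative path and gain):

    sp0.9-BAC009S0002W0122 sox -t wav /data/train/S0002/BAC009S0002W0122.wav -t wav - speed 0.9 | sox --vol 0.6 -t wav - -t wav - |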
steps/make_mfcc_pitch.sh --nj 10 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 8G data/train_sp_hires exp/make_hires/train_sp mfcc_perturbed_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for train_sp_hires
steps/compute_cmvn_stats.sh data/train_sp_hires exp/make_hires/train_sp mfcc_perturbed_hires
Succeeded creating CMVN stats for train_sp_hires
fix_data_dir.sh: kept all 360294 utterances.
fix_data_dir.sh: old files are kept in data/train_sp_hires/.backup
utils/copy_data_dir.sh: copied data from data/train_sp_hires to data/train_sp_hires_nopitch
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires_nopitch
utils/data/limit_feature_dim.sh: warning: removing data/train_sp_hires_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires_nopitch
steps/compute_cmvn_stats.sh data/train_sp_hires_nopitch exp/make_hires/train_sp mfcc_perturbed_hires
Succeeded creating CMVN stats for train_sp_hires_nopitch
steps/make_mfcc_pitch.sh --nj 10 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 8G data/dev_hires exp/make_hires/dev mfcc_perturbed_hires
steps/make_mfcc_pitch.sh: moving data/dev_hires/feats.scp to data/dev_hires/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for dev_hires
steps/compute_cmvn_stats.sh data/dev_hires exp/make_hires/dev mfcc_perturbed_hires
Succeeded creating CMVN stats for dev_hires
fix_data_dir.sh: kept all 14326 utterances.
fix_data_dir.sh: old files are kept in data/dev_hires/.backup
utils/copy_data_dir.sh: copied data from data/dev_hires to data/dev_hires_nopitch
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires_nopitch
utils/data/limit_feature_dim.sh: warning: removing data/dev_hires_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires_nopitch
steps/compute_cmvn_stats.sh data/dev_hires_nopitch exp/make_hires/dev mfcc_perturbed_hires
Succeeded creating CMVN stats for dev_hires_nopitch
steps/make_mfcc_pitch.sh --nj 10 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 8G data/test_hires exp/make_hires/test mfcc_perturbed_hires
steps/make_mfcc_pitch.sh: moving data/test_hires/feats.scp to data/test_hires/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for test_hires
steps/compute_cmvn_stats.sh data/test_hires exp/make_hires/test mfcc_perturbed_hires
Succeeded creating CMVN stats for test_hires
fix_data_dir.sh: kept all 7176 utterances.
fix_data_dir.sh: old files are kept in data/test_hires/.backup
utils/copy_data_dir.sh: copied data from data/test_hires to data/test_hires_nopitch
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires_nopitch
utils/data/limit_feature_dim.sh: warning: removing data/test_hires_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires_nopitch
steps/compute_cmvn_stats.sh data/test_hires_nopitch exp/make_hires/test mfcc_perturbed_hires
Succeeded creating CMVN stats for test_hires_nopitch
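
At this point each data set exists in two flavours: *_hires (40 high-resolution MFCCs plus 3 pitch features, 43 dims in total, the network input) and *_nopitch (the 40 MFCC dims only, used for the UBM and iVector extractor, which are trained without pitch). The dimensions are easy to confirm:

    feat-to-dim scp:data/train_sp_hires/feats.scp -            # 43 = 40 MFCC + 3 pitch
    feat-to-dim scp:data/train_sp_hires_nopitch/feats.scp -    # 40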
local/nnet3/run_ivector_common.sh: computing a subset of data to train the diagonal UBM.
utils/data/subset_data_dir.sh: reducing #utt from 360294 to 90073
local/nnet3/run_ivector_common.sh: computing a PCA transform from the hires data.
steps/online/nnet2/get_pca_transform.sh --cmd run.pl --mem 8G --splice-opts --left-context=3 --right-context=3 --max-utts 10000 --subsample 2 exp/nnet3/diag_ubm/train_sp_hires_nopitch_subset exp/nnet3/pca_transform
Done estimating PCA transform in exp/nnet3/pca_transform
local/nnet3/run_ivector_common.sh: training the diagonal UBM.
steps/online/nnet2/train_diag_ubm.sh --cmd run.pl --mem 8G --nj 30 --num-frames 700000 --num-threads 8 exp/nnet3/diag_ubm/train_sp_hires_nopitch_subset 512 exp/nnet3/pca_transform exp/nnet3/diag_ubm
steps/online/nnet2/train_diag_ubm.sh: Directory exp/nnet3/diag_ubm already exists. Backing up diagonal UBM in exp/nnet3/diag_ubm/backup.Uyl
steps/online/nnet2/train_diag_ubm.sh: initializing model from E-M in memory, 
steps/online/nnet2/train_diag_ubm.sh: starting from 256 Gaussians, reaching 512;
steps/online/nnet2/train_diag_ubm.sh: for 20 iterations, using at most 700000 frames of data
Getting Gaussian-selection info
steps/online/nnet2/train_diag_ubm.sh: will train for 4 iterations, in parallel over
steps/online/nnet2/train_diag_ubm.sh: 30 machines, parallelized with 'run.pl --mem 8G'
steps/online/nnet2/train_diag_ubm.sh: Training pass 0
steps/online/nnet2/train_diag_ubm.sh: Training pass 1
steps/online/nnet2/train_diag_ubm.sh: Training pass 2
steps/online/nnet2/train_diag_ubm.sh: Training pass 3
local/nnet3/run_ivector_common.sh: training the iVector extractor
steps/online/nnet2/train_ivector_extractor.sh --cmd run.pl --mem 8G --nj 10 data/train_sp_hires_nopitch exp/nnet3/diag_ubm exp/nnet3/extractor
steps/online/nnet2/train_ivector_extractor.sh: Directory exp/nnet3/extractor already exists. Backing up iVector extractor in exp/nnet3/extractor/backup.ZTF
steps/online/nnet2/train_ivector_extractor.sh: doing Gaussian selection and posterior computation
Accumulating stats (pass 0)
Summing accs (pass 0)
Updating model (pass 0)
Accumulating stats (pass 1)
Summing accs (pass 1)
Updating model (pass 1)
Accumulating stats (pass 2)
Summing accs (pass 2)
Updating model (pass 2)
Accumulating stats (pass 3)
Summing accs (pass 3)
Updating model (pass 3)
Accumulating stats (pass 4)
Summing accs (pass 4)
Updating model (pass 4)
Accumulating stats (pass 5)
Summing accs (pass 5)
Updating model (pass 5)
Accumulating stats (pass 6)
Summing accs (pass 6)
Updating model (pass 6)
Accumulating stats (pass 7)
Summing accs (pass 7)
Updating model (pass 7)
Accumulating stats (pass 8)
Summing accs (pass 8)
Updating model (pass 8)
Accumulating stats (pass 9)
Summing accs (pass 9)
Updating model (pass 9)
utils/data/modify_speaker_info.sh: copied data from data/train_sp_hires_nopitch to exp/nnet3/ivectors_train_sp/train_sp_sp_hires_nopitch_max2, number of speakers changed from 1020 to 180399
utils/validate_data_dir.sh: Successfully validated data-directory exp/nnet3/ivectors_train_sp/train_sp_sp_hires_nopitch_max2
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --mem 8G --nj 30 exp/nnet3/ivectors_train_sp/train_sp_sp_hires_nopitch_max2 exp/nnet3/extractor exp/nnet3/ivectors_train_sp
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_train_sp using the extractor in exp/nnet3/extractor.
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --mem 8G --nj 8 data/dev_hires_nopitch exp/nnet3/extractor exp/nnet3/ivectors_dev
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_dev using the extractor in exp/nnet3/extractor.
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --mem 8G --nj 8 data/test_hires_nopitch exp/nnet3/extractor exp/nnet3/ivectors_test
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_test using the extractor in exp/nnet3/extractor.
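
Two details in this stage are easy to miss. utils/data/modify_speaker_info.sh caps each pseudo-speaker at 2 utterances, which is why the speaker count jumps from 1020 to 180399 (roughly 360294 / 2); training-time iVectors then vary more between examples, a standard trick for robustness. And the iVectors are extracted online (re-estimated as each utterance progresses, by default every 10 frames) with dimension 100; you can verify that the same way the training log does later:

    feat-to-dim scp:exp/nnet3/ivectors_train_sp/ivector_online.scp -   # prints 100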
local/nnet3/run_tdnn.sh: creating neural net configs
tree-info exp/tri5a_sp_ali/tree 
steps/nnet3/xconfig_to_configs.py --xconfig-file exp/nnet3/tdnn_sp/configs/network.xconfig --config-dir exp/nnet3/tdnn_sp/configs/
nnet3-init exp/nnet3/tdnn_sp/configs//init.config exp/nnet3/tdnn_sp/configs//init.raw 
LOG (nnet3-init[5.5.164~1-9698]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to exp/nnet3/tdnn_sp/configs//init.raw
nnet3-info exp/nnet3/tdnn_sp/configs//init.raw 
nnet3-init exp/nnet3/tdnn_sp/configs//ref.config exp/nnet3/tdnn_sp/configs//ref.raw 
LOG (nnet3-init[5.5.164~1-9698]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to exp/nnet3/tdnn_sp/configs//ref.raw
nnet3-info exp/nnet3/tdnn_sp/configs//ref.raw 
nnet3-init exp/nnet3/tdnn_sp/configs//ref.config exp/nnet3/tdnn_sp/configs//ref.raw 
LOG (nnet3-init[5.5.164~1-9698]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to exp/nnet3/tdnn_sp/configs//ref.raw
nnet3-info exp/nnet3/tdnn_sp/configs//ref.raw 
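
steps/nnet3/xconfig_to_configs.py compiles the human-readable network.xconfig into the init.config/ref.config files that nnet3-init consumes above. For orientation, here is an illustrative TDNN xconfig in the spirit of this recipe, not the exact AIShell file (the layer dims and splice indexes below are assumptions); only the input dims (43 + 100) and the output size (3040) are taken from this run's logs:

    input dim=100 name=ivector
    input dim=43 name=input
    # splice frames, append the iVector, then an LDA-like fixed transform
    fixed-affine-layer name=lda input=Append(-2,-1,0,1,2,ReplaceIndex(ivector, t, 0)) affine-transform-file=exp/nnet3/tdnn_sp/configs/lda.mat
    relu-batchnorm-layer name=tdnn1 dim=850
    relu-batchnorm-layer name=tdnn2 dim=850 input=Append(-1,2)
    relu-batchnorm-layer name=tdnn3 dim=850 input=Append(-3,3)
    relu-batchnorm-layer name=tdnn4 dim=850 input=Append(-7,2)
    output-layer name=output dim=3040 max-change=1.5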
2019-01-15 01:14:35,894 [steps/nnet3/train_dnn.py:36 - <module> - INFO ] Starting DNN trainer (train_dnn.py)
steps/nnet3/train_dnn.py --stage=-10 --cmd=run.pl --mem 8G --feat.online-ivector-dir exp/nnet3/ivectors_train_sp --feat.cmvn-opts=--norm-means=false --norm-vars=false --trainer.num-epochs 2 --trainer.optimization.num-jobs-initial 1 --trainer.optimization.num-jobs-final 1 --trainer.optimization.initial-effective-lrate 0.0015 --trainer.optimization.final-effective-lrate 0.00015 --egs.dir  --cleanup.remove-egs true --cleanup.preserve-model-interval 500 --use-gpu true --feat-dir=data/train_sp_hires --ali-dir exp/tri5a_sp_ali --lang data/lang --reporting.email= --dir=exp/nnet3/tdnn_sp
['steps/nnet3/train_dnn.py', '--stage=-10', '--cmd=run.pl --mem 8G', '--feat.online-ivector-dir', 'exp/nnet3/ivectors_train_sp', '--feat.cmvn-opts=--norm-means=false --norm-vars=false', '--trainer.num-epochs', '2', '--trainer.optimization.num-jobs-initial', '1', '--trainer.optimization.num-jobs-final', '1', '--trainer.optimization.initial-effective-lrate', '0.0015', '--trainer.optimization.final-effective-lrate', '0.00015', '--egs.dir', '', '--cleanup.remove-egs', 'true', '--cleanup.preserve-model-interval', '500', '--use-gpu', 'true', '--feat-dir=data/train_sp_hires', '--ali-dir', 'exp/tri5a_sp_ali', '--lang', 'data/lang', '--reporting.email=', '--dir=exp/nnet3/tdnn_sp']
2019-01-15 01:14:35,980 [steps/nnet3/train_dnn.py:177 - train - INFO ] Arguments for the experiment
{'ali_dir': 'exp/tri5a_sp_ali',
 'backstitch_training_interval': 1,
 'backstitch_training_scale': 0.0,
 'cleanup': True,
 'cmvn_opts': '--norm-means=false --norm-vars=false',
 'combine_sum_to_one_penalty': 0.0,
 'command': 'run.pl --mem 8G',
 'compute_per_dim_accuracy': False,
 'dir': 'exp/nnet3/tdnn_sp',
 'do_final_combination': True,
 'dropout_schedule': None,
 'egs_command': None,
 'egs_dir': None,
 'egs_opts': None,
 'egs_stage': 0,
 'email': None,
 'exit_stage': None,
 'feat_dir': 'data/train_sp_hires',
 'final_effective_lrate': 0.00015,
 'frames_per_eg': 8,
 'initial_effective_lrate': 0.0015,
 'input_model': None,
 'lang': 'data/lang',
 'max_lda_jobs': 10,
 'max_models_combine': 20,
 'max_objective_evaluations': 30,
 'max_param_change': 2.0,
 'minibatch_size': '512',
 'momentum': 0.0,
 'num_epochs': 2.0,
 'num_jobs_compute_prior': 10,
 'num_jobs_final': 1,
 'num_jobs_initial': 1,
 'online_ivector_dir': 'exp/nnet3/ivectors_train_sp',
 'preserve_model_interval': 500,
 'presoftmax_prior_scale_power': -0.25,
 'prior_subset_size': 20000,
 'proportional_shrink': 0.0,
 'rand_prune': 4.0,
 'remove_egs': True,
 'reporting_interval': 0.1,
 'samples_per_iter': 400000,
 'shuffle_buffer_size': 5000,
 'srand': 0,
 'stage': -10,
 'train_opts': [],
 'use_gpu': 'yes'}
2019-01-15 01:14:42,571 [steps/nnet3/train_dnn.py:227 - train - INFO ] Initializing a basic network for estimating preconditioning matrix
2019-01-15 01:14:42,814 [steps/nnet3/train_dnn.py:237 - train - INFO ] Generating egs
steps/nnet3/get_egs.sh --cmd run.pl --mem 8G --cmvn-opts --norm-means=false --norm-vars=false --online-ivector-dir exp/nnet3/ivectors_train_sp --left-context 16 --right-context 12 --left-context-initial -1 --right-context-final -1 --stage 0 --samples-per-iter 400000 --frames-per-eg 8 --srand 0 data/train_sp_hires exp/tri5a_sp_ali exp/nnet3/tdnn_sp/egs
File data/train_sp_hires/utt2uniq exists, so augmenting valid_uttlist to
include all perturbed versions of the same 'real' utterances.
steps/nnet3/get_egs.sh: creating egs.  To ensure they are not deleted later you can do:  touch exp/nnet3/tdnn_sp/egs/.nodelete
steps/nnet3/get_egs.sh: feature type is raw
feat-to-dim scp:exp/nnet3/ivectors_train_sp/ivector_online.scp - 
steps/nnet3/get_egs.sh: working out number of frames of training data
steps/nnet3/get_egs.sh: working out feature dim
steps/nnet3/get_egs.sh: creating 52 archives, each with 394272 egs, with
steps/nnet3/get_egs.sh:   8 labels per example, and (left,right) context = (16,12)
steps/nnet3/get_egs.sh: copying data alignments
copy-int-vector ark:- ark,scp:exp/nnet3/tdnn_sp/egs/ali.ark,exp/nnet3/tdnn_sp/egs/ali.scp 
LOG (copy-int-vector[5.5.164~1-9698]:main():copy-int-vector.cc:83) Copied 360290 vectors of int32.
steps/nnet3/get_egs.sh: Getting validation and training subset examples.
steps/nnet3/get_egs.sh: ... extracting validation and training-subset alignments.
... Getting subsets of validation examples for diagnostics and combination.
steps/nnet3/get_egs.sh: Generating training examples on disk
steps/nnet3/get_egs.sh: recombining and shuffling order of archives on disk
steps/nnet3/get_egs.sh: removing temporary archives
steps/nnet3/get_egs.sh: removing temporary alignments
steps/nnet3/get_egs.sh: Finished preparing training examples.
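
A quick sanity check on those numbers:

    52 archives × 394272 egs × 8 frames/eg ≈ 1.64 × 10^8 frames
    1.64 × 10^8 / (100 frames/s × 3600 s/h) ≈ 456 hours of features

which is consistent with the roughly 150-hour AIShell training set tripled by speed perturbation.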

Part 13: Training iterations and lattice depth

2019-01-15 01:47:45,352 [steps/nnet3/train_dnn.py:275 - train - INFO ] Computing the preconditioning matrix for input features
2019-01-15 01:52:14,035 [steps/nnet3/train_dnn.py:286 - train - INFO ] Computing initial vector for FixedScaleComponent before softmax, using priors^-0.25 and rescaling to average 1
2019-01-15 01:52:24,348 [steps/nnet3/train_dnn.py:293 - train - INFO ] Preparing the initial acoustic model.
2019-01-15 01:52:28,117 [steps/nnet3/train_dnn.py:318 - train - INFO ] Training will run for 2.0 epochs = 832 iterations
2019-01-15 01:52:28,117 [steps/nnet3/train_dnn.py:352 - train - INFO ] Iter: 0/831    Epoch: 0.00/2.0 (0.0% complete)    lr: 0.001500    
…… about 180,000 characters of per-iteration log omitted here; use your imagination ……
2019-01-15 15:14:54,786 [steps/nnet3/train_dnn.py:352 - train - INFO ] Iter: 827/831    Epoch: 1.99/2.0 (99.4% complete)    lr: 0.000152    
2019-01-15 15:15:53,379 [steps/nnet3/train_dnn.py:352 - train - INFO ] Iter: 828/831    Epoch: 1.99/2.0 (99.5% complete)    lr: 0.000152    
2019-01-15 15:16:51,659 [steps/nnet3/train_dnn.py:352 - train - INFO ] Iter: 829/831    Epoch: 1.99/2.0 (99.6% complete)    lr: 0.000151    
2019-01-15 15:17:50,104 [steps/nnet3/train_dnn.py:352 - train - INFO ] Iter: 830/831    Epoch: 2.00/2.0 (99.8% complete)    lr: 0.000151    
2019-01-15 15:18:49,024 [steps/nnet3/train_dnn.py:352 - train - INFO ] Iter: 831/831    Epoch: 2.00/2.0 (99.9% complete)    lr: 0.000150    
2019-01-15 15:19:47,727 [steps/nnet3/train_dnn.py:398 - train - INFO ] Doing final combination to produce final.mdl
2019-01-15 15:19:47,727 [steps/libs/nnet3/train/frame_level_objf/common.py:491 - combine_models - INFO ] Combining set([832, 644, 774, 654, 784, 664, 794, 674, 804, 684, 814, 694, 824, 704, 714, 724, 734, 744, 624, 754, 634, 764]) models.
2019-01-15 15:20:09,585 [steps/nnet3/train_dnn.py:407 - train - INFO ] Getting average posterior for purposes of adjusting the priors.
2019-01-15 15:22:16,422 [steps/nnet3/train_dnn.py:418 - train - INFO ] Re-adjusting priors based on computed posteriors
2019-01-15 15:22:16,570 [steps/nnet3/train_dnn.py:428 - train - INFO ] Cleaning up the experiment directory exp/nnet3/tdnn_sp
steps/nnet2/remove_egs.sh: Finished deleting examples in exp/nnet3/tdnn_sp/egs
exp/nnet3/tdnn_sp: num-iters=832 nj=2..1 num-params=12.3M dim=43+100->3040 combine=-0.49->-0.48 (over 11) loglike:train/valid[553,831,combined]=(-0.52,-0.49,-0.48/-0.75,-0.75,-0.74) accuracy:train/valid[553,831,combined]=(0.822,0.832,0.836/0.779,0.784,0.784)
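
The one-line summary is worth unpacking: dim=43+100->3040 is the 43-dim MFCC+pitch input plus the 100-dim online iVector, mapped to 3040 pdf targets from the tri5a tree, and the model ends at 0.836 train / 0.784 validation frame accuracy. The iteration count also follows from the egs layout and the single-GPU job settings (up to rounding, per the logic in steps/nnet3/train_dnn.py):

    num_iters = num_archives × frames_per_eg × num_epochs / num_jobs
              = 52 × 8 × 2 / 1 = 832

matching the "2.0 epochs = 832 iterations" announced when training started.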
steps/nnet3/decode.sh --nj 40 --cmd run.pl --mem 8G --online-ivector-dir exp/nnet3/ivectors_dev exp/tri5a/graph data/dev_hires exp/nnet3/tdnn_sp/decode_dev
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --mem 8G --iter final exp/tri5a/graph exp/nnet3/tdnn_sp/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/nnet3/tdnn_sp/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,3,19) and mean=8.6
steps/diagnostic/analyze_lats.sh: see stats in exp/nnet3/tdnn_sp/decode_dev/log/analyze_lattice_depth_stats.log
score best paths
+ steps/score_kaldi.sh --cmd 'run.pl --mem 8G' data/dev_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_dev
steps/score_kaldi.sh --cmd run.pl --mem 8G data/dev_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_dev
steps/score_kaldi.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ steps/scoring/score_kaldi_cer.sh --stage 2 --cmd 'run.pl --mem 8G' data/dev_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_dev
steps/scoring/score_kaldi_cer.sh --stage 2 --cmd run.pl --mem 8G data/dev_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_dev
steps/scoring/score_kaldi_cer.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ echo 'local/score.sh: Done'
local/score.sh: Done
score confidence and timing with sclite
Decoding done.
steps/nnet3/decode.sh --nj 20 --cmd run.pl --mem 8G --online-ivector-dir exp/nnet3/ivectors_test exp/tri5a/graph data/test_hires exp/nnet3/tdnn_sp/decode_test
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --mem 8G --iter final exp/tri5a/graph exp/nnet3/tdnn_sp/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/nnet3/tdnn_sp/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,4,26) and mean=11.2
steps/diagnostic/analyze_lats.sh: see stats in exp/nnet3/tdnn_sp/decode_test/log/analyze_lattice_depth_stats.log
score best paths
+ steps/score_kaldi.sh --cmd 'run.pl --mem 8G' data/test_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_test
steps/score_kaldi.sh --cmd run.pl --mem 8G data/test_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_test
steps/score_kaldi.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ steps/scoring/score_kaldi_cer.sh --stage 2 --cmd 'run.pl --mem 8G' data/test_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_test
steps/scoring/score_kaldi_cer.sh --stage 2 --cmd run.pl --mem 8G data/test_hires exp/tri5a/graph exp/nnet3/tdnn_sp/decode_test
steps/scoring/score_kaldi_cer.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ echo 'local/score.sh: Done'
local/score.sh: Done
score confidence and timing with sclite
Decoding done.
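
With decoding finished on dev and test, the best CER for this model can be pulled out with the same idiom as the summary loop in Part 11; it is the 8.68% test CER shown there (cer_14_1.0, i.e. LM weight 14 and insertion penalty 1.0), versus 13.63% for tri4a:

    grep WER exp/nnet3/tdnn_sp/decode_test/cer_* | utils/best_wer.sh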


Part 14: Chain

Next: Running Kaldi AIShell v1 S5 Step by Step, Part 5: chain DNN
Back: Running Kaldi AIShell v1 S5 Step by Step, Part 3: Triphone (TriPhone)
Back: Running Kaldi AIShell v1 S5 Step by Step, Part 2: Monophone (MonoPhone)
Back: Running Kaldi AIShell v1 S5 Step by Step, Part 1: Before Mono

Other reference: Complete results of running TIMIT in Kaldi (including DNN)
