As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so users and
developers can get a better understanding of the goings on of the TVM community.
Feedback and suggestions are welcomed so that we can further improve these updates.
During July 2021 we welcomed many new contributors to the project. Importantly, we welcomed @comaniac and @junrushao1994 as new PMC members! Thanks to everyone for the hard work and contributions! Much of the discussion around RFCs now happens in the new RFC process, including Automatic Mixed Precision Pass and Static Memory Planning. Feel free to check them out.
The forum received 129k pageviews and 2.9k user visits in the last month.
More improvements along with details are listed below.
Fix type relation for batch_matmul #8376
Resize 1D #8346
Fix index order in conv2d computation for Arm CPU. #8361
Add support of conv2d with NHWC for Mali #8422
Add support of conv2d with NHWC for Bifrost #8430
Modify create_executor to pass params #8418
Batch_matmul to dense optimization #8440
Add ConvInteger support. #8456
Add RandomUniform converter and tests to onnx frontend. #8426
Allow importing models with malformed Loop nodes. #8475
Switch from CompileEngine to TECompiler in Interpreter #8486
Support resize in the ONNX conversion #8455
Fix bug in test_op_level3 #8508
Extend FakeQuantizationToInteger to more ops #8241
Change Default "opt_level" of Sequential from 2 to 0 #8634
Remove dead code from depthwise_conv2d for Intel graphics #8381
Enforce attaching storage scope to PointerType #8366
Remove scope attribute from Buffer class #8463
Remove AttrStmt with storage_scope key #8516
Unify the shared pass prefix between vm and graph #8526
Avoid Override Generic Op Strategy in "hls.py" #8614
Fix for broken link in apps for wasm-standalone dir #8045
Add docs for Pass Instrument #8220
Corrected typo in googletest build instructions. #8459
Fix scipy docs inv #8619
TVM install addenda for M1 Macs #8568
fix storage rewrite index remap #8338
Bugfix for zero number arguments tir functions. #8515
cast disparate floating point types for binary ops #8517
specialize #8354
Add Nucleo stm32l4r5zi board to zephyr #8386
Add fixture to zephyr test #8393
Fix Stack Size Issue for Zephyr AOT Demo on Physical Hardware #8453
Fix clock skew on virtualbox #8395
Add zephyr cortex-r5 board to Zephyr #8519
Set the number of cores based on the VM sizing #8624
Fix platform name in base-box-tool #8612
Wrap 'If' if it has multiple outputs #8385
Parametrize ONNX Unit tests #8621
dense_tensorcore/batch_matmul_tensorcore support int8/int4 #8402
Improve injective schedule to enable half2 #8457
Initial support for dynamic shared memory #8466
Support multiple TIR-level dynamic shared memory allocations #8571
Bugfix for topi.prod #8416
Add support for arbitrary dtypes to CSRMV and CSRMM #8437
Parameterize conv2d and depthwise_conv2d tests #8433
minor change on assert statement in conv2d_NCHWc_int8.cuda #8554
Fix nn.pool*d issue with 'vectorize' function and add unit tests #8541
Add transpose_a/b & dynamic shape support for batch matmul #8527
Improve the performance of scatter_nd #8479
Stridedslice and concat_v2 fix #8483
Added support for TensorList ops #8454
Check LLVM enabled/installed #8414
Add support for unpack with dim 0 after tensorlist stack #8558
Vc/pytorch lstm #8447
Minimal type checking on TIR schedule #8367
Update the tvmc tutorial with additional requirements #8334
Add pass for splitting kernel with huge number of args #8313
Minor bugfix to arm_compute_lib build scripts #8377
Add support for log_softmax #8369
Allow multiprocessing spawn to work (on macOS llvm at least) #8363
Allow tvmc to compile models with AOT executor in MLF #8331
Support QLinearAdd from onnx runtime com.microsoft contrib ops. #8305
Fix np.int and np.float usage in the tree. #8389
Add "operator" style to Model Library Format #8072
macOS is now supported by TVMC #8396
Remove unused conversion #8397
Support aten::flip #8398
Inverse affine map #8384
Add Compute Library tests to Jenkins for AArch64 CI #8394
add aten::masked_fill_ in pytorch frontend #8403
Cleanup more uses of np.bool and np.int. #8399
Revert "Actually add Compute Library tests to the Jenkins File (#8394)" #8400
TECompiler: Staged refactor and removal of compile engine #7518
fix keras install #8391
Add missing annotation for requires_gpu in test_topi_dense.py #8387
Minor updates to pass pylint locally. #8424
Fix x86 dense schedule extern ops #8420
Fix Relay pattern rewrite #8425
Simplify MatchFusePattern in InverseAffineMap #8427
Improve XGBTuner document #8428
TVMScript Parser support BufferSlice indices #8408
Replace RuntimeError in _lookup_task with deferred error. #8421
fix flaky TF crop_and_resize #8431
Fix address and port reported by android_rpc to tracker #8405
Fix undefined symbols by adding library #8446
Extend type checking and annotation for TIR #8429
Add qnn batch_matmul operator #8401
Use PAPI to collect hardware performance counters on CPU and CUDA #7983
Fix cpp_rpc connection to rpc_tracker #8388
Minor fixes to unit tests for cudnn/vulkan targets #8462
Add default op attribute registration to __init__.py #8460
Fix auto-scheduling after 9c6658721 #8478
FoldScaleAxis became non-recursive #8325
Remove compile_engine header #8471
DeviceType enums match dlpack #8407
fix typo #8484
Fix the shape function of conv & Add dynamic support for conv2d nhwc #8480
add multi functions support in partition pass #8464
Fix _get_yolo_detections #8477
apps: microtvm: Disable CONFIG_FPU for Zephyr runtime #8055
Support tir.abs node in tvm script #8488
Allow serialization of function attrs which are strings #8485
Re-enable ref_input #8113
Fix dynamic batching when use_implicit_batch=False #8461
fix zero iter bug in arith #8494
Add missing shape functions for relay.nn operations #8489
Better error message for src/runtime/module.cc if function cannot be loaded. #8496
Update Docker CI #8193
Re-enabled tests and updated module hashes #8498
Keep CODEOWNERS file up to date. #8500
Rename runtime-config to executor-config and add documentation for Model Library Format #8270
Enable ONNX tests that needed onnxruntime 1.7.0 #8502
Fix #8093, Enhance Buffer Index Simplify #8204
Organize CodeOwners File #8512
Fuse, Split #8467
Fix script printer StructuralEqual check failure #8499
Add json output to profiling reports #8503
Fix the repetitive cast in script printing #8531
Fix TypeKey2Index for root Object #8547
Split out libinfo.cc into a separate target. #8520
Mimic the TFLite 2.4 reader's behaviour #8538
Remove unused variable in topi cpp test #8549
Add explicit type cast to print. #8524
Specifically check handle for recursion during shutdown #8548
Add a --context-path for build.sh #8557
Handling a corner case in TRT RemoveDropout pass #8506
Re-enable Compute library tests. #8573
Fix AutoScheduler test to cover Conv2D Winograd #8539
Fix Coreml Input Shape Handling #8562
Fix task extraction with TE compiler #8560
add support for softmax and log_softmax with MIOpen #8543
Added default non-verbose to download_testdata(), pass to download() #8533
Disable pip cache when creating Docker images #8575
wasm32-standalone app repaired #8563
Bug fix for numpy scalar input in vm #8553
Reduce testing time of LSTM tests #8583
Prioritize discrete GPUs as device_id=0. #8588
speed up reference resize kernel #8592
Delete pytest-results as part of CI workspace preparation #8594
Use SizeVar instead of Var when convert Any in the GetShape function #8555
Fix storage_access not visiting else branch #8525
Reduction Factoring (RFactor) #8544
Support for match_buffer from subregion #8585
Recover rpc server support #8604
Add caching to CMake #8373
Add support for AOT in external code generation tests #8591
Fix global pip cache disable change #8590
Fix Initial Memory Misalignment #8487
Remove QEMU Install #8518
Remove unused parameter. #8580
Docker env for Arm® Ethos™-U55 Port #8514
Instruction and Trace #8615
Introduce --interface-api={c,packed} parameter #8280
Fix test_external_codegen, broken by #8591 #8630
Rewrote PointerValueTypeRewrite transform #8528
Framework for device querying for all targets. #8602
Add graph_executor get_input_index API. #8633
Disallow fp16 conversion for arange op #8644
Allow spaces in target attributes #8587
Several minor corrections to the device property query #8651
Fix depthwise conv2d on non-cuda GPU platforms #8379
Fix wrong log of tir pass VerifyMemory #8445
Explicitly retain __hash__ of StringImm #8449
Update stale relay.Module API in docs/comments #8411
Remove unused variable in GraphExecutorCodegen #8465
Compiler supports input with a slash #8481
Minor misspelling #8476
Enhance robustness of DefuseOps #8564
Add USE_PAPI configuration to config.cmake #8567
Fix a typo in include/tvm/ir/function.h #8617
hotfix check_grad perf regression #8581
Fix broadcast type func with incomplete type #8438
Fix the integer overflow problem of the scatter_nd op. #8415
do not simplify 'Any() - Any()' to 0 #8266
Visit each input param of the function in ExprVisitor visit_function #8521
Correct class number in Golang frontend sample #8511
fix android rpc app undefined reference problem #8530
fix illegal memory access bug in reduce op schedule by constraining thread_y #8566
Preserve IRModule type definition and imports in NameMangleExtFuncs #8523
Fix #8536 Get Target When Heterogeneous Execution #8537