SambaTune Release Notes
For detailed release-dependent information for code examples, check the README and configuration files in
Release 1.18
Features
- Added a tensor parallel test case.
- Added a display of section type details to the full-stack-tracing tool, which is discussed in the Stack Tracing report.
- Improved the logger to dump more information for debugging.
- Improved the instrumentation workflows.
Release 1.17
Release 1.16
Features
- Added full stack tracing capability. See the Stack Tracing report.
- Added the ability to profile apps that do training by using the actual run command. See the unet_e2e.yaml example below.
- Added the ability to specify the compile app and the run app separately in YAML inputs. See the unet_e2e.yaml example below.
- Improvements to reports:
  - Improved collated report output to consolidate with other reports.
  - Improved section report output to support hypersection.
  - Improved DDR and PCIe reports to support hypersection.
  - Improved performance insights to support hypersection.
Bug Fixes
- Fixed model sweep and samba sweep failures that occurred when mac_resources.json was empty or arch was not specified.
unet_e2e.yaml example
This unet_e2e.yaml example illustrates some changes we made in release 1.16.
# This example assumes that the sambaflow-apps-datascale-image-segmentation package is installed.
app: /opt/sambaflow/apps/image/segmentation/compile.py
compile-args: >
  --batch-size 64
  --mac-v2
  ...
run-app: /opt/sambaflow/apps/image/segmentation/hook.py
run-args: >
  --num-workers 4
  --data-dir $DATA_DIR
  --log-dir $LOG_DIR
  ...
env:
  DATA_DIR: DATA_DIR
  LOG_DIR: LOG_DIR

# This example assumes that the sambaflow-apps-datascale-image-segmentation package is installed.
app: /opt/sambaflow/apps/image/segmentation/compile.py
compile-args: >
  --batch-size 64
  --in-channels=3
  --in-width=256
  --in-height=256
  --enable-conv-tiling
  --mac-v2
  --mac-human-decision $UNET_HD
run-app: /opt/sambaflow/apps/image/segmentation/hook.py
run-args: >
  --data-transform-config /opt/sambaflow/apps/image/segmentation/segmentation/datasets/data_transforms_config.yaml
  --num-workers 4
  --mode train
  --in-channels=3
  --in-width=256
  --in-height=256
  --init-features 32
  --batch-size 64
  --epochs 1
  --data-dir $DATA_DIR
  --data-cache-dir $DATA_CACHE_DIR
  --log-dir $LOG_DIR
  --samba-runtime-profile
  --run-benchmark
  --benchmark-steps 10
  --benchmark-warmup-steps 2
env:
  UNET_HD: UNET_HD
  DATA_DIR: DATA_DIR
  DATA_CACHE_DIR: DATA_CACHE_DIR
  LOG_DIR: LOG_DIR
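
The example references variables such as $DATA_DIR in compile-args and run-args and lists them under env. A quick consistency check before profiling can catch a variable that is referenced but never declared. The following is a minimal sketch, not part of SambaTune: it assumes the YAML has already been parsed into a Python dict (for instance with a YAML library), and it assumes that $NAME tokens in the argument strings are meant to be resolved from the env mapping. The function name find_undeclared_vars and the sample dict are hypothetical.

```python
import re

# Matches $NAME or ${NAME} style references inside an argument string.
_VAR_PATTERN = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def find_undeclared_vars(config):
    """Return sorted variable names used in compile-args/run-args but absent from env."""
    declared = set((config.get("env") or {}).keys())
    referenced = set()
    for key in ("compile-args", "run-args"):
        # A missing or empty key is treated as an empty argument string.
        referenced.update(_VAR_PATTERN.findall(config.get(key) or ""))
    return sorted(referenced - declared)


# Hypothetical parsed config, loosely based on the unet_e2e.yaml example.
# UNET_HD is deliberately left out of env to show what the check reports.
example = {
    "app": "/opt/sambaflow/apps/image/segmentation/compile.py",
    "compile-args": "--batch-size 64 --mac-v2 --mac-human-decision $UNET_HD",
    "run-app": "/opt/sambaflow/apps/image/segmentation/hook.py",
    "run-args": "--num-workers 4 --data-dir $DATA_DIR --log-dir $LOG_DIR",
    "env": {"DATA_DIR": "DATA_DIR", "LOG_DIR": "LOG_DIR"},
}

print(find_undeclared_vars(example))  # prints ['UNET_HD']
```

An empty result means every referenced variable has an entry under env. The check only looks at names; it does not verify that the values point to real paths.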