How to use the command dvc diff (with examples)
The dvc diff command shows changes in DVC tracked files and directories. It can compare a Git commit, tag, or branch against the current workspace, or compare two Git revisions with each other. Because DVC identifies data by hash, the command reports which paths were added, modified, or deleted rather than line-by-line content changes. It can also print the hashes of the affected files and format its output as JSON or Markdown.
Use case 1: Compare DVC tracked files at a Git commit, tag, or branch against the current workspace
Code:
dvc diff commit_hash/tag/branch
Motivation: This use case is helpful for checking how the DVC tracked data in your workspace has changed since a given Git commit, tag, or branch. It shows which files and directories have been added, modified, or deleted relative to that revision.
Explanation: In this use case, the dvc diff command is given a single commit hash, tag, or branch name. DVC treats that revision as the old state and compares its tracked files with the current workspace. If no revision is given, the comparison is made against HEAD.
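Example output (illustrative; the exact layout may vary between DVC versions). Note that dvc diff lists changed paths by status and does not show line-level changes:
Modified:
    test.txt

files summary: 0 added, 0 deleted, 1 modified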
Use case 2: Compare changes in DVC tracked files from one Git commit to another
Code:
dvc diff revision_b revision_a
Motivation: This use case is useful for comparing the changes in DVC tracked files between two specific Git commits. It allows users to see the differences in files, directories, or data between two different revisions within the DVC repository.
Explanation: In this use case, the dvc diff command is given two Git revisions. The first argument (revision_b in the command above) is treated as the old state and the second (revision_a) as the new state, so the output describes what changed going from revision_b to revision_a. Swapping the arguments swaps what is reported as added and deleted.
Example output (illustrative; the command lists changed paths by status, not line-level changes):
Modified:
    test.txt

files summary: 0 added, 0 deleted, 1 modified
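When the same comparison has to run from a script, it can help to assemble the command in code. The following is a minimal Python sketch, not part of DVC itself: build_dvc_diff_args and run_dvc_diff are hypothetical helper names, and the --json spelling is an assumption based on recent DVC releases (older ones used --show-json). It assumes dvc is on the PATH and that the script runs inside a DVC repository.

```python
import subprocess
from typing import List, Optional


def build_dvc_diff_args(a_rev: Optional[str] = None,
                        b_rev: Optional[str] = None,
                        show_hash: bool = False,
                        as_json: bool = False) -> List[str]:
    """Build the argument list for `dvc diff [options] [a_rev] [b_rev]`."""
    if b_rev is not None and a_rev is None:
        raise ValueError("b_rev can only be given together with a_rev")
    args = ["dvc", "diff"]
    if show_hash:
        args.append("--show-hash")
    if as_json:
        # Assumption: recent DVC releases spell this flag --json.
        args.append("--json")
    # Positional revisions: the old side first, then the new side.
    args.extend(rev for rev in (a_rev, b_rev) if rev is not None)
    return args


def run_dvc_diff(a_rev: Optional[str] = None,
                 b_rev: Optional[str] = None,
                 show_hash: bool = False,
                 as_json: bool = False) -> str:
    """Run dvc diff in the current directory and return its stdout."""
    completed = subprocess.run(
        build_dvc_diff_args(a_rev, b_rev, show_hash, as_json),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

For example, run_dvc_diff("revision_b", "revision_a") would run the command shown in this use case and return its text output.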
Use case 3: Compare DVC tracked files, along with their latest hash
Code:
dvc diff --show-hash commit
Motivation: This use case is helpful when you need to know exactly which version of each file is involved. DVC identifies file content by its hash, so a changed hash confirms that the content of a tracked file differs between the commit and the workspace.
Explanation: In this use case, the dvc diff command is used with the --show-hash flag and the desired commit. In addition to listing the changed paths, the command prints the hash of each one; for a modified file it shows both the old and the new hash.
Example output (illustrative; hashes are abbreviated and the exact layout may vary between DVC versions):
Modified:
    e9576ae..3499c7c  test.txt

files summary: 0 added, 0 deleted, 1 modified
Use case 4: Compare DVC tracked files, displaying the output as JSON
Code:
dvc diff --show-json --show-hash commit
Motivation: This use case is useful for obtaining the differences in DVC tracked files in a machine-readable format. It allows users to programmatically process the changes in the files within the DVC repository.
Explanation: In this use case, the dvc diff command is used with the --show-json and --show-hash flags. The command prints the changes as a JSON object and includes the hash of each changed path. Recent DVC releases document the JSON flag as --json; if --show-json is rejected, check dvc diff --help for the spelling your version accepts.
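Example output (illustrative and pretty-printed here; hashes are abbreviated). The changes are grouped by status rather than shown as line-level diffs:
{
  "added": [],
  "deleted": [],
  "modified": [
    {
      "path": "test.txt",
      "hash": {
        "old": "e9576ae",
        "new": "3499c7c"
      }
    }
  ],
  "renamed": []
}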
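To illustrate the kind of programmatic processing this format allows, here is a small Python sketch that turns the JSON text into a mapping from status to paths. It assumes the layout documented for recent DVC releases: an object with added, deleted, modified, and renamed lists whose entries carry a path and, with --show-hash, a hash. The function name summarize_diff, the SAMPLE string, and the handling of renamed entries as old/new path pairs are assumptions made for this example.

```python
import json
from typing import Dict, List

STATUSES = ("added", "deleted", "modified", "renamed")

# Hypothetical sample in the assumed layout; hashes are abbreviated.
SAMPLE = (
    '{"added": [{"path": "data/new.csv"}], "deleted": [], '
    '"modified": [{"path": "test.txt", '
    '"hash": {"old": "e9576ae", "new": "3499c7c"}}], "renamed": []}'
)


def summarize_diff(json_text: str) -> Dict[str, List[str]]:
    """Map each status to the list of paths reported under it."""
    # Treat empty output the same as an empty object (no changes).
    data = json.loads(json_text) if json_text.strip() else {}
    summary: Dict[str, List[str]] = {}
    for status in STATUSES:
        paths = []
        for entry in data.get(status, []):
            path = entry["path"]
            # Assumption: renamed entries hold {"old": ..., "new": ...} paths.
            if isinstance(path, dict):
                path = f'{path["old"]} -> {path["new"]}'
            paths.append(path)
        summary[status] = paths
    return summary
```

Calling summarize_diff(SAMPLE)["modified"] yields a list containing test.txt, which a script could then use to decide, for instance, whether a retraining step needs to run.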
Use case 5: Compare DVC tracked files, displaying the output as Markdown
Code:
dvc diff --show-md --show-hash commit
Motivation: This use case is helpful for obtaining a human-readable format of the differences in DVC tracked files. It allows users to easily review the changes made to the files within the DVC repository.
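Explanation: In this use case, the dvc diff command is used with the --show-md and --show-hash flags. The command prints the changes as a Markdown table and includes the hash of each changed path, which is convenient for pasting into pull requests or reports. Recent DVC releases document the Markdown flag as --md; if --show-md is rejected, check dvc diff --help for the spelling your version accepts.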
Example output (illustrative; hashes are abbreviated and column widths may differ):
| Status   | Hash             | Path     |
|----------|------------------|----------|
| modified | e9576ae..3499c7c | test.txt |
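If a report should contain only some of the changes (for example, only modified paths), one option is to build a similar table yourself from status and path pairs. The Python sketch below is an illustration of that idea and does not reproduce DVC's own formatting; to_markdown_table is a hypothetical helper name.

```python
from typing import Iterable, Tuple


def to_markdown_table(rows: Iterable[Tuple[str, str]]) -> str:
    """Render (status, path) pairs as a GitHub-flavored Markdown table."""
    lines = ["| Status | Path |", "|---|---|"]
    for status, path in rows:
        # Escape pipes so a path cannot break the table layout.
        safe_path = path.replace("|", "\\|")
        lines.append(f"| {status} | {safe_path} |")
    return "\n".join(lines)
```

The returned string can be written to a file or posted as a comment by whatever tooling your project already uses.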
Conclusion:
The dvc diff command is a powerful tool for comparing changes in DVC tracked files and directories. Its options control both the scope of the comparison (workspace against a revision, or one revision against another) and the output format (plain text, JSON, or Markdown, with or without hashes). With it, users can quickly see which tracked files and directories differ between commits, tags, or branches in their DVC repositories.