How to use the command 'git filter-repo' (with examples)
The git filter-repo command is a versatile tool for rewriting Git history, with far better performance and ease of use than its predecessor, git filter-branch. It is particularly handy when you need to alter the history of a repository, whether to remove sensitive data, reorganize files, or purge unnecessary content while retaining the rest of the commit history. One notable advantage of git filter-repo is that it automates many steps that other tools require you to perform manually, which streamlines repository maintenance significantly.
Use case 1: Replace a sensitive string in all files
Code:
git filter-repo --replace-text <(echo 'find==>replacement')
Motivation:
Passwords, API keys, and other sensitive strings that were accidentally committed to a repository remain readable in its history even after a later commit deletes them. Protecting against a security breach therefore means removing or replacing the string in every commit that contains it. git filter-repo simplifies this task by replacing every occurrence of the string in file contents across the entire repository history with a single command.
Explanation:
git filter-repo: Starts the tool that rewrites the repository history.
--replace-text: Tells git filter-repo to replace text in file contents across the entire history. The option expects a file of replacement expressions.
<(echo 'find==>replacement'): Uses process substitution (a feature of shells such as bash and zsh) to supply the expression as if it were a file. Here, 'find' is the sensitive string and 'replacement' is the string that replaces it.
Example output:
After executing the command, every occurrence of 'find' in file contents throughout the repository’s history is replaced by 'replacement'. The command reports its progress rather than listing the individual changes, so inspect the rewritten history to confirm the replacement.
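When several strings have to be scrubbed, or the shell lacks process substitution, the same expressions can be kept in an ordinary file. The following is a minimal sketch, not a definitive recipe: the file name rules.txt, the throwaway demo repository, and the sample secrets are all invented for illustration. It assumes one expression per line, with a regex: prefix switching from literal to regular-expression matching. The --force flag is there only because the demo repository is not a fresh clone, which git filter-repo otherwise expects.

```shell
set -eu

# Hypothetical rules file: one "old==>new" expression per line.
work="$(mktemp -d)"
rules="$work/rules.txt"
cat > "$rules" <<'EOF'
hunter2==>REDACTED
regex:token=[A-Za-z0-9]+==>token=REDACTED
EOF

# Run the demo only when git and git filter-repo are both available.
if git filter-repo --version >/dev/null 2>&1; then
  repo="$work/demo"
  git init -q "$repo"
  cd "$repo"
  git config user.name "Demo User"
  git config user.email "demo@example.com"

  # Commit a file containing two made-up secrets.
  printf 'password=hunter2\ntoken=abc123\n' > config.txt
  git add config.txt
  git commit -q -m "Add config with secrets"

  # --force: needed here only because this is not a fresh clone.
  git filter-repo --force --replace-text "$rules"

  # Show every version of the file in history to confirm the rewrite.
  git log -p --all -- config.txt
else
  echo "git filter-repo is not installed; skipping the demo"
fi
```

A line in the rules file that has no ==> part is replaced with ***REMOVED*** by default.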
Use case 2: Extract a single folder, keeping history
Code:
git filter-repo --path path/to/folder
Motivation:
Developers sometimes need to extract a single directory from a larger repository, either to create a more focused repository or to move a project component into a new project. This is particularly useful when splitting a large monolithic repository into smaller, more manageable pieces without losing the history of changes to that folder.
Explanation:
git filter-repo: Begins the process of rewriting history.
--path path/to/folder: Specifies the path to the folder you want to extract. Only changes related to this folder in the history are kept, and everything else is discarded.
Example output:
Once executed, the repository contains only the specified folder, still at its original path, with its complete history intact. Everything else is pruned away, including commits that touched only other paths.
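The effect of --path is easiest to see on a small repository. The sketch below builds a throwaway repository with hypothetical services/api and docs folders, keeps only services/api, and then lists what survived. All directory names, file names, and commit messages are assumptions made for the demonstration, and --force is used only because the demo repository is not a fresh clone.

```shell
set -eu

# Run the demo only when git and git filter-repo are both available.
if git filter-repo --version >/dev/null 2>&1; then
  repo="$(mktemp -d)/demo"
  git init -q "$repo"
  cd "$repo"
  git config user.name "Demo User"
  git config user.email "demo@example.com"

  # Three commits: one touching both folders, one per folder afterwards.
  mkdir -p services/api docs
  echo 'api v1' > services/api/main.txt
  echo 'notes' > docs/notes.txt
  git add .
  git commit -q -m "Add api and docs"
  echo 'api v2' > services/api/main.txt
  git commit -q -am "Update api"
  echo 'more notes' >> docs/notes.txt
  git commit -q -am "Update docs"

  # Keep only the history of services/api.
  git filter-repo --force --path services/api

  git ls-files        # only services/api/main.txt remains
  git log --oneline   # "Update docs" is gone: it became empty and was pruned
else
  echo "git filter-repo is not installed; skipping the demo"
fi
```

The extracted folder stays at services/api rather than moving to the top of the repository. If the folder should also become the new root, git filter-repo offers --subdirectory-filter, which combines the extraction with a rename.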
Use case 3: Remove a single folder, keeping history
Code:
git filter-repo --path path/to/folder --invert-paths
Motivation:
Removing outdated or unnecessary directories from a repository’s history might be needed when cleaning up clutter or preparing the repository for open-source release by removing proprietary code. Keeping the rest of the history intact ensures that you retain the valuable context and changes associated with the rest of your project.
Explanation:
git filter-repo: Initiates the repository rewriting procedure.
--path path/to/folder: Identifies the target folder that you wish to remove from the history.
--invert-paths: Inverts the selection, so the specified path is removed and everything else is kept.
Example output:
After running the command, the specified folder no longer exists anywhere in the repository’s history, while every other file is kept along with its history.
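One way to confirm that a folder is really gone is to ask Git for any commit that still touches it. The sketch below does this on a throwaway repository; the src and vendor folder names and the file contents are hypothetical, and --force is used only because the demo repository is not a fresh clone.

```shell
set -eu

# Run the demo only when git and git filter-repo are both available.
if git filter-repo --version >/dev/null 2>&1; then
  repo="$(mktemp -d)/demo"
  git init -q "$repo"
  cd "$repo"
  git config user.name "Demo User"
  git config user.email "demo@example.com"

  # Two commits: the second one changes only the folder to be removed.
  mkdir -p src vendor
  echo 'app' > src/app.txt
  echo 'lib v1' > vendor/lib.txt
  git add .
  git commit -q -m "Add app and vendored lib"
  echo 'lib v2' > vendor/lib.txt
  git commit -q -am "Update vendored lib"

  # Remove vendor from every commit and keep everything else.
  git filter-repo --force --path vendor --invert-paths

  # Prints nothing when no commit on any ref touches vendor anymore.
  git log --all --oneline -- vendor

  git ls-files        # only src/app.txt remains
else
  echo "git filter-repo is not installed; skipping the demo"
fi
```

Several --path options can be given together with a single --invert-paths to remove more than one path in the same run.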
Use case 4: Move everything from a sub-folder to the repository root
Code:
git filter-repo --path-rename path/to/folder/:
Motivation:
When restructuring a project, you may want to flatten the directory hierarchy by lifting the contents of a subdirectory out of its parent folders. This makes project components easier to find and access while preserving the history of the moved files.
Explanation:
git filter-repo: Commences the process of rewriting repository history.
--path-rename path/to/folder/: : Takes an argument of the form old:new. Here the old prefix is path/to/folder/ and the new prefix after the colon is empty, so everything under path/to/folder/ moves to the root of the repository. Files outside that folder stay where they are.
Example output:
Instead of residing inside path/to/folder/, files will now appear directly in the base of the repository, simplifying the directory layout without losing historical information about their changes.
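The rename can be checked on a small throwaway repository. In the sketch below, the app/src folder, the file names, and the commit message are hypothetical, and --force is used only because the demo repository is not a fresh clone. The empty text after the colon is what sends the files to the repository root.

```shell
set -eu

# Run the demo only when git and git filter-repo are both available.
if git filter-repo --version >/dev/null 2>&1; then
  repo="$(mktemp -d)/demo"
  git init -q "$repo"
  cd "$repo"
  git config user.name "Demo User"
  git config user.email "demo@example.com"

  # One file inside the nested folder and one at the top level.
  mkdir -p app/src
  echo 'main' > app/src/main.txt
  echo 'readme' > README.md
  git add .
  git commit -q -m "Initial layout"

  # Strip the app/src/ prefix; nothing follows the colon, so the
  # files under it land in the repository root.
  git filter-repo --force --path-rename app/src/:

  git ls-files        # README.md and main.txt, both at the root
else
  echo "git filter-repo is not installed; skipping the demo"
fi
```

README.md is still present because --path-rename only renames paths and does not filter anything out. Putting a path after the colon, as in app/src/:lib/, would move the files to that folder instead of the root.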
Conclusion
In conclusion, git filter-repo provides powerful options for managing and rewriting Git history to suit the needs of security, repository organization, and project evolution. By leveraging these functionalities, developers can ensure that their repositories remain secure, efficient, and structured in a manner that best facilitates ongoing development efforts.