
How to Use the Command 'gitleaks' (with examples)
Gitleaks scans Git repositories for secrets such as passwords, API keys, and tokens that have been committed by mistake. It checks the commit history as well as the current source code, so a credential that was added and later deleted is still reported. Running it regularly helps developers and teams catch leaked credentials before they reach shared or public repositories.
Use case 1: Scan a remote repository
Code:
git clone https://github.com/username/repository.git
gitleaks detect --source repository
Motivation:
Many developers and teams host their code on platforms like GitHub, GitLab, or Bitbucket, where a committed API key or password is visible to everyone with read access. Scanning a remote repository with Gitleaks reveals such leaks so that the affected credentials can be rotated and removed.
Explanation:
- git clone https://github.com/username/repository.git: Downloads the remote repository, including its history, into a local directory named repository. Gitleaks 8.x has no --repo-url option (that flag belonged to the 7.x releases, which did not have a detect subcommand), so the repository has to be cloned first.
- gitleaks detect: Invokes Gitleaks in detection mode, which searches the commit history for exposed secrets.
- --source repository: Points Gitleaks at the freshly cloned directory.
Example Output:
The report format depends on the Gitleaks version. An illustrative result for a leaked API key might look like this:
{
  "config": {
    "path": "/path/to/config.toml"
  },
  "secrets": [
    {
      "line": 10,
      "lineContent": "API_KEY=1234567890abcdef",
      "offender": "API_KEY",
      "commit": "abcdef1",
      "repo": "repository",
      "repoURL": "https://github.com/username/repository.git",
      "date": "2023-01-01T12:00:00Z",
      "rule": "Generic API Key",
      "commitMessage": "Initial commit",
      ...
    }
  ]
}
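The clone-then-scan flow can also be scripted. The following is a minimal sketch, assuming Gitleaks 8.x (where detect --source scans a local clone and exit code 0 means no leaks were reported) and that both git and gitleaks are on the PATH. The function name clone_and_scan and its runner parameter are hypothetical helpers introduced for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def clone_and_scan(repo_url, runner=subprocess.run):
    """Clone repo_url into a temporary directory and scan it with Gitleaks.

    Returns True when Gitleaks exits with code 0, False otherwise.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "repo"
        # A full clone (no --depth) keeps the history, so old commits are scanned too.
        runner(["git", "clone", "--quiet", repo_url, str(target)], check=True)
        # Assumption: Gitleaks 8.x, where exit code 0 means no leaks were found.
        result = runner(["gitleaks", "detect", "--source", str(target)])
        return result.returncode == 0
```

Calling clone_and_scan("https://github.com/username/repository.git") would return False if the scan reports leaks or fails. The temporary clone is deleted when the function returns.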
Use case 2: Scan a local directory
Code:
gitleaks detect --source path/to/repository
Motivation:
Developers usually work in local clones during development and testing, or when maintaining forks of upstream projects. Scanning a local repository with Gitleaks lets them find and fix leaks before pushing the changes to a remote repository.
Explanation:
- gitleaks detect: Again, this initiates the detection process with Gitleaks.
- --source path/to/repository: The path to the local Git repository. Gitleaks scans the commit history of the repository at this location.
Example Output:
Upon executing the scan, typical results might include alerts like:
{
  "secrets": [
    {
      "line": 42,
      "lineContent": "password=supersecret123",
      "offender": "password",
      "commit": "abcd123",
      "repo": "local-repo",
      ...
    }
  ]
}
Use case 3: Output scan results to a JSON file
Code:
gitleaks detect --source path/to/repository --report-path path/to/report.json
Motivation:
Teams and formal security processes often need to document findings, keep records of code audits, or share results with colleagues. Saving the scan results to a JSON file supports this and also allows further analysis with other tools or integration into security dashboards.
Explanation:
- gitleaks detect: Initiates the detection scan.
- --source path/to/repository: Indicates the local repository path to be scanned.
- --report-path path/to/report.json: Writes the findings to a file at the specified path. In Gitleaks 8.x the flag is --report-path (short form -r) and JSON is the default report format; --report-format selects csv or sarif instead. The 7.x releases used --report.
Example Output:
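The output file report.json is a JSON array with one object per finding. The field names vary by version (8.x uses capitalized keys such as RuleID, File, StartLine, and Commit); the overall shape is similar to: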
[
  {
    "line": 5,
    "lineContent": "TOKEN=abcdef123456",
    "offender": "TOKEN",
    "commit": "123abc",
    "repo": "example-repo",
    ...
  }
]
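A saved report can be processed with a few lines of code. The sketch below counts findings per rule; it assumes the report is a JSON array of objects and that the rule name is stored under "RuleID" (Gitleaks 8.x) or "rule" (the older shape). The function name summarize_report and the sample data are hypothetical.

```python
import json
from collections import Counter


def summarize_report(report_text):
    """Count findings per rule in a Gitleaks JSON report (an array of objects)."""
    findings = json.loads(report_text)
    counts = Counter()
    for finding in findings:
        # Assumption: the rule name is under "RuleID" (8.x) or "rule" (older
        # reports); adjust the keys to match the version in use.
        rule = finding.get("RuleID") or finding.get("rule") or "unknown"
        counts[rule] += 1
    return dict(counts)


# Inline sample data for illustration; in practice, read report.json from disk.
sample = (
    '[{"RuleID": "generic-api-key", "File": "app.py"},'
    ' {"RuleID": "generic-api-key", "File": "ci.yml"},'
    ' {"rule": "Generic API Key"}]'
)
print(summarize_report(sample))
```

A summary like this is a convenient starting point for dashboards or for failing a build only when certain rules fire.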
Use case 4: Use a custom rules file
Code:
gitleaks detect --source path/to/repository --config path/to/config.toml
Motivation:
Scanning needs differ between organizations and projects, for example because of internal token formats or compliance requirements. A custom rules file lets developers or security teams tailor Gitleaks to the patterns and keywords that matter in their environment.
Explanation:
- gitleaks detect: Triggers detection.
- --source path/to/repository: Local directory to be scanned.
- --config path/to/config.toml: Loads the rules from a custom configuration file in TOML format instead of the built-in defaults. In Gitleaks 8.x the flag is --config (short form -c); the 7.x releases called it --config-path. Custom rules can cover secrets that the default rule set does not know about, such as proprietary token formats.
Example Output:
If the custom configuration file targets specific proprietary tokens, the output might contain findings such as:
[
  {
    "line": 8,
    "lineContent": "PROPRIETARY_TOKEN=xyz098",
    "offender": "PROPRIETARY_TOKEN",
    ...
  }
]
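A custom rule is essentially a regular expression with an identifier, so it helps to try a pattern against sample lines before adding it to the rules file. The sketch below does this in Python. The rule id, the pattern, and the sample lines are hypothetical, and Gitleaks itself uses Go's regular expression engine, whose syntax differs from Python's in some details.

```python
import re

# Hypothetical rule for illustration; not part of the Gitleaks default rules.
RULE_ID = "proprietary-token"
PATTERN = r"PROPRIETARY_TOKEN=[A-Za-z0-9]{6,}"


def matching_lines(pattern, lines):
    """Return (line_number, line) pairs whose text matches the pattern."""
    compiled = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if compiled.search(line)
    ]


sample_lines = [
    "DEBUG=true",
    "PROPRIETARY_TOKEN=xyz098",
    "PROPRIETARY_TOKEN=",  # empty value, should not match
]
for number, line in matching_lines(PATTERN, sample_lines):
    print(f"{RULE_ID}: line {number}: {line}")
```

In a Gitleaks 8.x configuration, a pattern like this would go into the regex field of a [[rules]] entry, next to its id and description.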
Use case 5: Start scanning from a specific commit
Code:
gitleaks detect --source path/to/repository --log-opts="commit_id..HEAD"
Motivation:
When only a recent set of changes needs review, limiting the scan to the commits after a known point reduces both run time and noise. This is especially useful for checking a recent pull or merge, or the work done since the last audit.
Explanation:
- gitleaks detect: Starts the detection process.
- --source path/to/repository: Path to the local repository being scanned.
- --log-opts="commit_id..HEAD": Passes the quoted string to the git log command that Gitleaks runs internally. The range commit_id..HEAD selects the commits made after commit_id up to the current HEAD, so only those changes are scanned. Note that git log's --since option expects a date (for example --since=2023-01-01), not a commit ID, so it is the right choice for starting from a point in time rather than from a commit.
Example Output:
The results then include only findings from the selected commits, such as:
[
  {
    "line": 15,
    "lineContent": "session_key=zyx987",
    "offender": "session_key",
    "commit": "789xyz",
    ...
  }
]
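Git distinguishes between a revision range (commit..HEAD) and a date filter (--since=date), and the value handed to --log-opts has to use the right one. The sketch below builds the argument list for either case. It assumes the Gitleaks 8.x --log-opts flag; the function name detect_command and its parameters are hypothetical.

```python
def detect_command(source, since_commit=None, since_date=None):
    """Build the argument list for a history-limited 'gitleaks detect' run."""
    if since_commit and since_date:
        raise ValueError("pass either since_commit or since_date, not both")
    cmd = ["gitleaks", "detect", "--source", source]
    if since_commit:
        # Revision range: commits reachable from HEAD but not from since_commit.
        cmd.append(f"--log-opts={since_commit}..HEAD")
    elif since_date:
        # git log's --since filters by commit date, e.g. "2023-01-01".
        cmd.append(f"--log-opts=--since={since_date}")
    return cmd


print(detect_command("path/to/repository", since_commit="abc1234"))
print(detect_command("path/to/repository", since_date="2023-01-01"))
```

Passing the command as a list to subprocess.run avoids shell quoting problems with the --log-opts value.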
Use case 6: Scan uncommitted changes before a commit
Code:
gitleaks protect --staged
Motivation:
Before committing, it is worth verifying that new or changed code does not contain secrets. Scanning staged changes gives developers immediate feedback, so they can remove a credential before it becomes part of the commit history and is pushed to a remote repository.
Explanation:
- gitleaks protect: Runs Gitleaks in protect mode, which scans uncommitted changes rather than the commit history.
- --staged: Restricts the scan to staged changes (those added to the index via git add), that is, exactly what the next commit would contain.
Example Output:
Running this command might reveal sensitive data as follows:
Uncommitted changes detected:
- stage/test_file.py: SECRET_KEY found on line 27
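Because this check runs on staged changes, it fits naturally into a Git pre-commit hook. The following is a minimal sketch, assuming that gitleaks is on the PATH and that a non-zero exit code means leaks were found or the scan failed. The function name check_staged and its runner parameter are hypothetical.

```python
import subprocess
import sys


def check_staged(runner=subprocess.run):
    """Run 'gitleaks protect --staged' and return an exit status for a Git hook."""
    try:
        result = runner(["gitleaks", "protect", "--staged"])
    except FileNotFoundError:
        print("gitleaks is not installed or not on PATH", file=sys.stderr)
        return 1
    if result.returncode != 0:
        # Assumption: non-zero means leaks were found (or the scan failed).
        print("Commit blocked: remove the reported secrets and stage again.",
              file=sys.stderr)
        return 1
    return 0


# In an executable .git/hooks/pre-commit script, finish with:
#     sys.exit(check_staged())
```

Git aborts the commit when the hook exits with a non-zero status, so a leak never reaches the history.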
Use case 7: Display verbose output indicating which parts were identified as leaks during the scan
Code:
gitleaks protect --staged --verbose
Motivation:
Verbose output shows each finding in detail instead of only reporting that leaks exist. This makes it easier to locate the offending line, judge whether a match is a false positive, and see which rule triggered it.
Explanation:
- gitleaks protect: Enables protect mode, targeting uncommitted changes.
- --staged: Limits the scan to staged (i.e., ready-to-commit) changes.
- --verbose: Prints the details of each finding, such as the matched secret, the rule that fired, and the file and line, rather than only a summary.
Example Output:
The output becomes more descriptive:
Scanning started for staged changes...
File: stage/config.py, line 30: Detected secret in inline configuration.
Conclusion:
Gitleaks helps keep secrets and other sensitive information out of Git repositories. The use cases above cover remote repositories, local directories, limited ranges of history, and staged changes, which makes the tool a good fit for modern DevSecOps workflows. Running these scans automatically, for example in a pre-commit hook or a CI pipeline, catches leaked credentials early, before they spread to shared repositories.


