Semgrep OSS in CI
Semgrep OSS can be set up to run static application security testing (SAST) scans on repositories of any size.
This guide explains how to set up Semgrep OSS in your CI pipeline using entirely open source components, also known as a stand-alone CI setup. The preferred Semgrep OSS command is semgrep scan.
Prerequisites
- Sufficient permissions in your repository to:
  - Commit a CI configuration file.
  - Start or stop a CI job.
  - Optional: Create environment variables.
Ensure your scans use open source components
This setup uses only the LGPL 2.1 Semgrep CLI tool. It is not subject to the usage limits of Semgrep Pro. To remain strictly open source, you must ensure that the rules you run either use open source licenses or are your own custom Semgrep rules.
To verify a rule's license, read the license key under the metadata key of the Semgrep rule.
The following example shows a rule with a license key. The rule's last line contains the license: MIT key-value pair.
rules:
  - id: eslint.detect-object-injection
    patterns:
      - pattern: $O[$ARG]
      - pattern-not: $O["..."]
      - pattern-not: "$O[($ARG : float)]"
      - pattern-not-inside: |
          $ARG = [$V];
          ...
          <... $O[$ARG] ...>;
      - pattern-not-inside: |
          $ARG = $V;
          ...
          <... $O[$ARG] ...>;
      - metavariable-regex:
          metavariable: $ARG
          regex: (?![0-9]+)
    message: Bracket object notation with user input is present, this might allow an
      attacker to access all properties of the object and even it's prototype,
      leading to possible code execution.
    languages:
      - javascript
      - typescript
    severity: WARNING
    metadata:
      cwe: "CWE-94: Improper Control of Generation of Code ('Code Injection')"
      primary_identifier: eslint.detect-object-injection
      secondary_identifiers:
        - name: ESLint rule ID security/detect-object-injection
          type: eslint_rule_id
          value: security/detect-object-injection
      license: MIT
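The license check described above can be automated. The following is a minimal sketch, not part of Semgrep itself: it assumes the rule file has already been parsed into a Python dictionary (for example with a YAML parser, not shown here), and both the function name and the ALLOWED_LICENSES set are hypothetical placeholders.

```python
from typing import Any

# Hypothetical allowlist; replace it with the licenses your organization accepts.
ALLOWED_LICENSES = {"MIT", "LGPL-2.1-only", "Apache-2.0"}


def rules_without_allowed_license(rule_doc: dict[str, Any]) -> list[str]:
    """Return the IDs of rules whose metadata.license is missing or not allowed."""
    flagged = []
    for rule in rule_doc.get("rules", []):
        metadata = rule.get("metadata") or {}
        license_name = metadata.get("license")
        if license_name not in ALLOWED_LICENSES:
            flagged.append(rule.get("id", "<no id>"))
    return flagged


# A parsed rule file, trimmed to the keys this check reads.
# The second rule ID is made up for illustration.
example = {
    "rules": [
        {"id": "eslint.detect-object-injection", "metadata": {"license": "MIT"}},
        {"id": "third-party.unlicensed-rule", "metadata": {}},
    ]
}
print(rules_without_allowed_license(example))
```

Rules that you wrote yourself do not need a license key to keep the setup open source, so a real check would likely skip or separately list your own custom rules.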
For a comparison of the behavior between Semgrep OSS CI scans and Semgrep Pro scans, see Semgrep Pro versus Semgrep OSS.
Set up the CI job
Use template configuration files
Click the link of your CI provider to view a configuration file you can commit to your repository to create a Semgrep job:
Use other methods
Use either of the following methods to run Semgrep on other CI providers.
Direct Docker usage
Reference or add the semgrep/semgrep Docker image directly. The method to add the Docker image varies based on the CI provider. This method is used in the Bitbucket Pipelines code snippet.
Install semgrep within your CI job
If you cannot use the Semgrep Docker image, install Semgrep as a step or command within your CI job:
- Add pip3 install semgrep to the configuration file as a step or command, depending on your CI provider's syntax.
- Run any valid semgrep scan command, such as semgrep scan --config auto.
For an example, see the Azure Pipelines code snippet.
Configure your CI job
The following sections describe methods to customize your CI job.
Schedule your scans
The following table is a summary of methods and resources to set up schedules for different CI providers.
CI provider | Where to set schedule |
---|---|
GitHub Actions | See Sample CI configs for information on how to modify your semgrep.yml file |
GitLab CI/CD | Refer to GitLab documentation |
Jenkins | Refer to Jenkins documentation |
Bitbucket Pipelines | Refer to Bitbucket documentation |
CircleCI | Refer to CircleCI documentation |
Buildkite | Refer to Buildkite documentation |
Azure Pipelines | Refer to Azure documentation |
Customize rules and rulesets
Add rules to scan with semgrep scan
You can customize which rules run in your CI job. The rules and rulesets can come from the Semgrep Registry or be your own rules. Semgrep reads the rules to scan with from these sources:
- The value of the SEMGREP_RULES environment variable.
- The value passed after --config. You can use multiple --config arguments, one per value. For example: semgrep scan --config p/default --config p/comment.
The SEMGREP_RULES environment variable accepts a list of local and remote rules and rulesets to run. The SEMGREP_RULES list is delimited by a space if the variable is exported from a shell command or script block. For example, see the following Bitbucket Pipelines snippet:
# ...
script:
  - export SEMGREP_RULES="p/nginx p/ci no-exec.yml"
  - semgrep ci
# ...
The line defining SEMGREP_RULES defines three different sources, delimited by a space:
- export SEMGREP_RULES="p/nginx p/ci no-exec.yml"
The example references two rulesets from Semgrep Registry (p/nginx and p/ci) and a rule available in the repository (no-exec.yml).
If the SEMGREP_RULES environment variable is defined from a YAML block, the list of rules and rulesets to run is delimited by a newline. See the following example of a GitLab CI/CD snippet:
# ...
variables:
  SEMGREP_RULES: >-
    p/nginx
    p/ci
    no-exec.yml
# ...
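To illustrate how the two delimiters relate to the --config form, the sketch below turns a SEMGREP_RULES-style string into an equivalent semgrep scan argument list. It assumes each entry can be passed as its own --config value, as described in the list of rule sources above. The helper name build_scan_command is hypothetical, and the code only builds the command without running it.

```python
import shlex


def build_scan_command(semgrep_rules: str) -> list[str]:
    """Turn a SEMGREP_RULES-style string into a `semgrep scan` argument list."""
    command = ["semgrep", "scan"]
    # str.split() with no arguments splits on any run of whitespace,
    # so space-delimited and newline-delimited lists are handled alike.
    for source in semgrep_rules.split():
        command += ["--config", source]
    return command


space_delimited = "p/nginx p/ci no-exec.yml"
newline_delimited = "p/nginx\np/ci\nno-exec.yml"

print(shlex.join(build_scan_command(space_delimited)))
# semgrep scan --config p/nginx --config p/ci --config no-exec.yml
```

Both example strings produce the same argument list, which can be handed to subprocess.run in a CI helper script if you prefer explicit --config arguments over the environment variable.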
Write your own rules
Write custom rules to enforce your team's coding standards and security practices. Rules can be forked from existing community-written rules.
See Writing rules to learn how to write custom rules.
Ignore files
See Ignore files, folders, and code.
By default, semgrep ci skips files and directories such as tests/, node_modules/, and vendor/. It uses the default .semgrepignore file, which you can find in the Semgrep GitHub repository. This default is used when no explicit .semgrepignore file is found in the root of your repository.
Optional: Copy and commit the default .semgrepignore file to the root of your repository and extend it with your own entries, or write your .semgrepignore file from scratch. If Semgrep detects a .semgrepignore file within your repository, it does not append entries from the default .semgrepignore file.
For a complete example, see the .semgrepignore file in Semgrep’s source code.
.semgrepignore is only used by Semgrep. Integrations such as GitLab's Semgrep SAST Analyzer do not use it.
Save or export findings to a file
To save or export findings, pass file format options and send the formatted findings to a file.
For example, to save to a JSON file:
semgrep scan --json > findings.json
The JSON schema for Semgrep's CLI output can be found in semgrep/semgrep-interfaces.
You can also use the SARIF format:
semgrep scan --sarif > findings.sarif
Refer to the CLI reference for output formats.
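Once findings are saved as JSON, a CI step can post-process them. The following is a minimal sketch that counts findings per severity. It assumes the report has a top-level results list whose items carry the severity under extra.severity; check these field names against the schema in semgrep/semgrep-interfaces before relying on them. The function name and the sample data are illustrative only.

```python
import json
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Count findings per severity in a parsed Semgrep JSON report."""
    return Counter(
        finding.get("extra", {}).get("severity", "UNKNOWN")
        for finding in report.get("results", [])
    )


# Trimmed sample standing in for the contents of findings.json.
# Real reports contain more fields; the second rule ID is made up.
sample = json.loads("""
{"results": [
  {"check_id": "eslint.detect-object-injection", "path": "src/app.js",
   "start": {"line": 12}, "extra": {"severity": "WARNING"}},
  {"check_id": "example.hypothetical-rule", "path": "src/db.js",
   "start": {"line": 40}, "extra": {"severity": "ERROR"}}
]}
""")

counts = count_by_severity(sample)
print(dict(counts))
```

To use it on a real scan, load the file with json.load instead of the inline sample. A CI script could then, for example, exit with a non-zero status when counts["ERROR"] is greater than zero.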
Migrate to Semgrep AppSec Platform from a stand-alone CI setup
Migrate to Semgrep AppSec Platform to:
- View and manage findings in a centralized location. You can ignore false positives through triage actions, individually or in bulk.
- Configure rules and the actions to take when a rule generates a finding. You can take the following actions:
  - Audit the rule. This means that findings are kept within Semgrep's Findings page and are not surfaced to your team's SCM.
  - Show the finding to your team through the use of PR and MR comments.
  - Block the pull or merge request.
To migrate to Semgrep AppSec Platform:
- Create an account in Semgrep AppSec Platform.
- Click Projects > Scan New Project > Run scan in CI.
- Follow the steps in the setup page to complete your migration.
- Optional: Remove the old CI job that does not use Semgrep AppSec Platform.
Semgrep OSS jobs versus Semgrep Pro jobs
Feature | Semgrep Pro CI (semgrep ci) | Semgrep OSS CI (semgrep scan) |
---|---|---|
Customized SAST scans | ✔️ | ✔️ |
SCA (software composition analysis) scans | ✔️ | -- |
Secrets scans | ✔️ | -- |
PR (pull request) or MR (merge request) comments | ✔️ | -- |
Finding status tracked over lifetime | ✔️ | -- |