GitLab CI/CD
GitLab CI/CD docs: https://docs.gitlab.com/ci/
Validate the syntax of your configuration: https://docs.gitlab.com/ci/yaml/lint/
Predefined variables (available in every GitLab CI/CD pipeline): https://docs.gitlab.com/ci/variables/predefined_variables/
GitLab CI/CD keywords for YAML configuration files: https://docs.gitlab.com/ci/yaml/
CI/CD component examples: https://docs.gitlab.com/ci/components/examples/
CI/CD inputs: https://docs.gitlab.com/ci/inputs/
GitLab CI/CD Security course from GitLab University
What GitLab CI/CD is
GitLab CI/CD is GitLab’s built-in continuous integration, delivery, and deployment system. It automates:
Building code
Running tests
Packaging artifacts
Deploying to environments
Ensuring quality gates / approvals
It is configured through a single file at the root of the repo: .gitlab-ci.yml.
GitLab CI/CD is tightly integrated: repo → merge requests → pipelines → environments → deployments → observability.
🏗️ GitLab CI/CD Architecture (High-Level)
GitLab’s CI/CD architecture consists of five main components:
GitLab Server (Coordinator)
This includes GitLab Rails/Workhorse/Gitaly. It is responsible for:
Parsing .gitlab-ci.yml
Creating pipeline DAGs (jobs, stages, rules)
Storing pipeline metadata
Authenticating/authorizing runners
Scheduling CI jobs to available runners
Tracking job logs, statuses, artifacts
Think: brain of the CI system.
GitLab Runners
Runners are the compute nodes that actually execute jobs.
They can be:
Shared runners – provided by GitLab (SaaS) or by your company
Project/group runners – assigned to specific areas
Specific runners – dedicated to one project
Ephemeral runners – auto-scaled on cloud VMs or Kubernetes
Each runner is an agent registered with the GitLab coordinator.
Think of your .gitlab-ci.yml file as the blueprint. Runners are the machines that carry out the work. When a pipeline is triggered, available runners check in with GitLab to pick up jobs that perform various tasks, like running tests, building apps, or deploying changes.
GitLab’s runner system includes:
GitLab Runner (the software)
This is the application you actually install on a server or machine. Think of it as the engine: the binary executable that sits on your infrastructure waiting for work to do.
Runners (the agents)
These are the configured instances, or "workers", that the GitLab Runner software manages. Each runner is registered with your GitLab instance and can execute CI/CD pipeline jobs.
Each runner runs inside an environment defined by an executor like Docker or Shell.
Potential issues with Runners and their troubleshooting
When creating job definitions in your .gitlab-ci.yml file, you have the ability to specify which runners can execute those jobs. This capability is essential for guaranteeing that jobs execute in appropriate environments—with the necessary permissions and resources available.
Why Runner Selection Matters
Some jobs may require specific environments or resources.
You may want to reserve certain runners for specific job types.
Security requirements may limit which runners can be used.
Runner selection in pipelines
The runner's availability and access level
Tags assigned to runners
Protected runners for sensitive operations
Runner Availability
GitLab follows a specific hierarchy when selecting runners: it checks project-level runners first, then group-level runners (along with any parent groups), and finally instance-level runners. This ordering means more specific runners take precedence, giving you tailored environments where they matter most.
A typical CI/CD organization might use instance-level runners for standard microservices to minimize upkeep, while reserving project-specific runners for sensitive operations like payment processing. This strategy provides a good balance between convenience and security.
Instance-wide runners simplify administrative work, whereas project-dedicated runners can handle high-priority operations. Most teams can adopt this approach without changing their existing pipeline definitions.
Using Runner Tags
Tags function as descriptive labels on runners, indicating what they're equipped to handle—for instance, 'android' or 'xcode'. You can use these tags to direct jobs toward runners with the necessary capabilities, guaranteeing that builds happen in appropriate environments.
Consider a mobile development team at a CI/CD-enabled company: they use tags to route iOS builds to runners with Xcode and Android builds to runners with the Android SDK. This precision reduces configuration mistakes and makes better use of available resources.
Tags ensure jobs only execute where the required tooling exists. Teams gain better environment separation, and there's no risk of untagged runners accidentally claiming incompatible jobs.
Using Protected Runners
Protected runners only accept jobs from protected branches and tags, making them well suited for production pipelines. This restriction guarantees that only verified code reaches your live systems.
A company might configure a protected runner specifically for production releases to their Kubernetes infrastructure. This creates an additional security boundary around deployment operations.
With protected runners, only approved branches can initiate deployments. Sensitive credentials stay contained, and you can require manual sign-off as an extra safeguard for production changes.
Runner selection gives you precise control over the location, method, and circumstances under which your pipeline tasks execute.
Best practices for runner selection
Use specific tags to match jobs with appropriate environments. Example: Apply labels such as docker, linux, or android to ensure proper routing.
Establish consistent tagging conventions across your organization. Tip: Document your approved tags in a central location like a wiki or README for team reference.
Reserve protected runners for release pipelines. Benefit: Keeps sensitive credentials separate and restricts who can trigger critical deployments.
Avoid over-tagging your jobs. Best practice: Only specify the essential tags required for execution to prevent jobs from becoming unassignable.
Manage resource contention strategically when runner capacity is constrained. Strategy: Configure less critical jobs as interruptible so they can be preempted by high-priority work.
Diagnosing runner problems is a crucial skill for CI/CD practitioners. When jobs behave unexpectedly, recognizing typical runner-related issues enables quicker and more assured responses. If a job fails to launch or encounters unexpected failures, the underlying issue could stem from runner setup or connectivity challenges.
Common Runner-related Issues
Job Stuck in Pending Status
When jobs get stuck showing "pending," you'll often encounter messages such as "Waiting for an available runner…". This situation usually occurs because no runners match the specified tags, the job's tags don't align with any configured runners, or all suitable runners are currently occupied with other tasks.
In one case, a mobile development team inadvertently deleted the android tag from an important runner. Their build queue grew until they noticed and corrected the tag configuration.
Runner Connection Failures
Connection-related failures happen when a job begins but then stops due to connectivity issues. These problems can result from network interruptions between the runner and GitLab, runners going offline while executing jobs, or brief infrastructure outages.
A company experienced this when a self-hosted runner in their staging setup terminated unexpectedly during a job because of an incorrectly configured restart policy.
Resource Limitations
Jobs may fail with messages like "out of memory" or "process killed" when resource constraints are hit. This occurs when jobs demand more RAM or CPU than the runner provides, Docker executors exhaust available disk space, or build scripts spawn excessive parallel processes.
A test suite for a product recommendation system repeatedly failed until the team restructured it into smaller, more manageable test segments.
Identifying Runner Issues in Job Logs
When jobs fail in GitLab CI/CD, the job log provides your primary diagnostic resource. GitLab organizes logs into distinct sections that make it easier to determine whether issues stem from runner setup, network problems, or resource availability.
Job Start Section
This area displays the GitLab Runner version and identifies which runner accepted the job. Verify that the correct runner was selected—particularly important when jobs need specific capabilities (such as Docker or shell executors).

Preparation Section
This segment reveals how the runner configures the job environment. Failures in this phase often point to problems downloading images, setting up executors, or retrieving secured credentials.

Script Execution Look for system-level errors that indicate runner problems, such as:
Cannot allocate memory → Insufficient RAM on the runner
Connection reset by peer → Network connectivity loss
No space left on device → Exhausted disk capacity on the runner
These represent infrastructure limitations rather than flaws in your job's script logic.
Learning to read job logs effectively is critical for rapid runner troubleshooting, reducing delays, and maintaining development momentum.
Executors
The relationship between Runner and Executor:
While a runner picks up CI/CD jobs from GitLab, the executor determines how and where those jobs are run. Runners support many executors:
| Executor | Description |
| --- | --- |
| Shell | Runs directly on the machine (fast, insecure) |
| Docker | Most common; runs jobs in containers |
| Docker Machine | Auto-scales VMs |
| Kubernetes | One pod per job; cloud-native |
| Custom | Any custom environment |
| SSH | Executes commands on remote hosts |
Executors are critical because they determine isolation, scale, and portability.
A little more information about some Executor types:
Shell Executor
Commands execute directly on the host system with the Shell executor, offering a simple approach for tasks needing direct server access. A company may leverage this on their secure deployment infrastructure when releasing production updates.
When you need straightforward execution and direct host system interaction, this executor type works well.
Docker Executor
Each job gets its own fresh container when using the Docker executor, which isolates tasks from one another. A company may find this particularly valuable for frontend builds and testing, since it prevents jobs from interfering with each other.
Container-based execution provides both stronger security boundaries and more predictable behavior throughout the development workflow.
Kubernetes Executor
Jobs run inside individual pods with the Kubernetes executor, which makes it great for scaling in cloud environments. A company may use this approach for their mobile builds, where they need to handle many simultaneous jobs efficiently.
Teams operating in cloud-native setups benefit from this executor's ability to dynamically manage resources and performance.
SSH Executor
Remote command execution happens through SSH connections with this executor, giving you flexibility to work with distributed systems. A company may find it useful for their older infrastructure during a backend system migration.
This option helps bridge the gap when you're working with established systems while moving toward newer architectures.
Pipelines → Stages → Jobs DAG
A pipeline is a graph made of:
Pipeline (full execution)
Stages (sequential blocks, e.g. build → test → deploy)
Jobs (individual tasks)
Needs/DAG (modern approach, jobs run in parallel when dependencies are met)
In a classic pipeline, stages run strictly in sequence; with needs (a DAG), jobs start as soon as their individual dependencies finish.
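A minimal sketch of the needs-based DAG style (job names and scripts are hypothetical):

```yaml
stages: [build, test]

build-a:
  stage: build
  script:
    - ./build-a.sh

build-b:
  stage: build
  script:
    - ./build-b.sh

# With needs, test-a starts as soon as build-a finishes,
# without waiting for build-b to complete.
test-a:
  stage: test
  needs: ["build-a"]
  script:
    - ./test-a.sh
```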
Artifacts, Packages, Environments
GitLab CI/CD automatically manages outputs:
Artifacts – files produced in jobs (binaries, logs, reports)
Cache – dependency caches between jobs
Environment deployments – dev/staging/prod
Releases & packages – container registry, package registry
🔁 How a Pipeline Runs (Execution Flow)
1. Developer pushes code or opens MR
Triggers a pipeline based on rules:
on push
on merge request
schedule
manual
webhook
2. GitLab reads .gitlab-ci.yml
It parses:
stages
jobs
variables
rules
dependencies
Generates a directed acyclic graph (DAG).
3. Jobs wait in the queue
The GitLab coordinator places all pending jobs into a queue, matched by tag.
4. Runners pull jobs
Runners use a pull model:
They match using:
Tags (docker, k8s, gpu, linux)
Runner assignments
Resource permissions
5. Runner executes the job
Based on executor configuration.
Typical steps:
Checkout source code
Restore caches
Run job script
Save artifacts
Upload logs
6. GitLab updates pipeline + MR status
GitLab shows:
Success
Failure
Skipped
Manual action required
7. (Optional) Deployments + Observability
GitLab can:
Deploy to Kubernetes
Create an environment URL
Track deployments via GitLab Deployments API
Integrate with Metrics/Tracing
🧩 Key Concepts of GitLab CI/CD
Here are the essential pieces:
.gitlab-ci.yml
Example minimal pipeline:
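A minimal sketch of a .gitlab-ci.yml (job names and commands are illustrative):

```yaml
stages:
  - build
  - test

build-job:
  stage: build
  script:
    - echo "Building the app..."

test-job:
  stage: test
  script:
    - echo "Running tests..."
```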
Tags
Used for routing jobs:
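For instance, a job can be routed to runners tagged android (the build command is hypothetical):

```yaml
android-build:
  tags:
    - android          # only runners tagged "android" can pick this job up
  script:
    - ./gradlew assembleDebug
```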
Variables
Pipeline-level or job-level:
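A sketch showing both scopes (URLs are hypothetical):

```yaml
variables:             # pipeline-level: visible to all jobs
  API_URL: "https://api.example.com"

integration-test:
  variables:           # job-level: overrides the pipeline-level value
    API_URL: "https://staging.example.com"
  script:
    - curl "$API_URL/health"
```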
Rules
Modern conditional logic:
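A sketch of rules using predefined variables (the deploy script is hypothetical):

```yaml
deploy:
  stage: deploy
  script:
    - ./deploy.sh
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'   # deploy only from main...
      when: manual                        # ...and only after manual approval
    - when: never                         # skip on every other ref
```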
Artifacts & Cache
Saves files between jobs and stages. For example, a build job can save a /public folder that later jobs reuse.
Artifacts = persist between stages
Cache = speed up builds
Dependencies
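The dependencies keyword controls which earlier jobs' artifacts a job downloads; a sketch with hypothetical job names:

```yaml
deploy:
  stage: deploy
  dependencies:
    - build            # download artifacts only from the build job
  script:
    - ./deploy.sh dist/
```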
📦 Putting It All Together: Full Architecture Diagram (Text)
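A rough text sketch assembled from the execution flow described above:

```
Developer push / MR
        │
        ▼
GitLab Server (coordinator): parses .gitlab-ci.yml → builds DAG → queues jobs by tag
        │   (runners poll for work)
        ▼
GitLab Runner ──▶ Executor (shell / docker / kubernetes / ssh)
        │   checkout → restore cache → run script → save artifacts → upload logs
        ▼
Pipeline + MR status updated ──▶ (optional) deploy to environment + observability
```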
🧠 When to Use GitLab CI/CD
GitLab shines if you want:
A unified Git+CI system
Strong merge request workflows
Kubernetes-native deployments
Built-in security scanning (SAST, DAST, dependency, secret detection)
Self-hosted + multi-cloud flexibility
Complex DAG pipelines
It’s extremely popular for DevOps teams that want an all-in-one platform.
Security practices on GitLab
https://docs.gitlab.com/user/application_security/get-started-security/
Pipeline Types
https://docs.gitlab.com/ci/pipelines/pipeline_types/
Parent–Child Pipelines
A single main pipeline can trigger several smaller pipelines to run in parallel. For example, this becomes useful when a team breaks a large application into microservices, with each service having its own testing pipeline.
Multi-Project Pipelines
These pipelines span multiple repositories or projects and allow coordinated workflows. A common use case is when an organization introduces an additional service—such as a new analytics component—and wants deployments across both codebases to be synchronized.
Merge Request Pipelines
These run automatically whenever changes are pushed to a merge request. Teams often use them to speed up code reviews and detect bugs earlier in the development cycle.
Merge Trains
Merge trains queue and merge multiple merge requests safely and in a controlled order. This is especially helpful for teams where several developers push changes around the same time and want to avoid integration conflicts.
What is the include keyword?
GitLab allows you to modularize and share pipeline configurations using the include keyword.
This enables you to:
Eliminate duplicate logic across different files
Distribute pipeline components among multiple projects
Maintain templates from a single location for simpler updates
When you apply include, GitLab combines the referenced external YAML with your .gitlab-ci.yml during pipeline execution.
Before include, the same or similar configuration is repeated in every project. With include, the shared configuration is defined once and reused everywhere:
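A sketch of the reuse pattern (the project and file names are hypothetical):

```yaml
# In each consuming project's .gitlab-ci.yml:
include:
  - project: 'my-group/ci-templates'
    file: '/templates/test-job.yml'

# /templates/test-job.yml, maintained in one place:
test:
  stage: test
  script:
    - echo "Running the shared test job..."
```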
This approach lets you define a job a single time—then apply it across numerous projects—without duplicating the same YAML repeatedly. Let's examine the various methods for including files from different sources.
include: local
Use this method to reference a YAML file within the same repository:
When your /ci/test-jobs.yml contains:
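A sketch of both pieces (the job body is illustrative):

```yaml
# .gitlab-ci.yml
include:
  - local: '/ci/test-jobs.yml'

# /ci/test-jobs.yml
test:
  stage: test
  script:
    - echo "Running tests..."
```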
The test job will execute alongside any jobs specified in your primary .gitlab-ci.yml file.
Tip: GitLab's Pipelines section displays the fully expanded YAML configuration. This view helps with debugging or understanding how your included files merged into the complete pipeline definition.
include:project
This method lets you reference YAML files from different GitLab projects within your instance. It's valuable when distributing configurations among several projects.
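A sketch, assuming the shared file lives in the create-group/ci-config project:

```yaml
include:
  - project: 'create-group/ci-config'
    ref: main                      # a SHA or tag also works here
    file: '/ci/test-jobs.yml'
```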
Here, the test job from /ci/test-jobs.yml in the create-group/ci-config repository's main branch gets incorporated into your pipeline. The ref keyword accepts SHAs or tags as alternatives to branch names.
include:remote
This option enables you to pull in YAML files from external URLs beyond your GitLab instance. For instance:
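A sketch; the raw-file URL shape is illustrative:

```yaml
include:
  - remote: 'https://gitlab.com/example-group/example-project/-/raw/main/.gitlab-ci.yml'
```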
This example incorporates the .gitlab-ci.yml file from example-project hosted on GitLab.com into your pipeline.
CI/CD Components and CI/CD Catalog
What Are CI/CD Components?
CI/CD Components are reusable, versioned building blocks for pipelines. Consider them modular templates that are:
Versioned – ensuring updates don't disrupt your existing pipelines
Parameterized – allowing you to provide inputs for customized behavior
Self-contained – designed around a specific function like linting or testing
Discoverable – available through the GitLab CI/CD Catalog
How to Include a Component
To incorporate a component, use the include keyword—but specify a component: reference rather than local, project, or remote.
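A sketch of a component include (the exact component path and version are illustrative):

```yaml
include:
  - component: $CI_SERVER_FQDN/components/markdownlint/markdownlint@1.0.0
```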
This example utilizes the markdownlint component from the GitLab CI/CD Catalog. It comes prebuilt and ready for integration into any pipeline. By adding this component, you instantly access Markdown linting functionality without handling its configuration manually. We'll explore the details of including, using, and creating CI/CD Components in subsequent modules.
Understanding a Component Reference
A team lead introduces a developer to CI/CD Components—modular, version-controlled pipeline pieces that anyone in the organization can plug into their workflows.
They share this reference as an example:
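Reconstructed from the walkthrough that follows, the reference looks like:

```yaml
include:
  - component: $CI_SERVER_FQDN/components/yamllint/yamllint@1.4.3
```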
Together, they walk through what each part represents:
| Part | Meaning |
| --- | --- |
| $CI_SERVER_FQDN | Refers to the GitLab instance automatically; no need to hardcode the domain |
| components/yamllint | The group and project where the reusable component is stored |
| yamllint | The specific component provided by that project |
| @1.4.3 | The exact release version that the pipeline should pull in |
With a single line of configuration, the developer enables YAML linting across multiple repositories with consistent and centrally managed rules.
Enhancing Pipelines with Multiple Components
As their pipelines grow, they start combining several components—for code quality checks, container security, Go builds, and more.
Each referenced component adds a ready-to-use job or set of jobs, giving the team immediate advantages:
Automatically adopt established standards
Leverage expertise from specialists (security, language tooling, etc.)
Reduce duplication and keep .gitlab-ci.yml files clean and maintainable
GitLab CI/CD Catalog
In the GitLab CI/CD catalog, you can browse ready-made components for many common tasks, including:
Executing tests for a variety of languages
Building and pushing Docker images
Deploying to different types of environments
Running security and compliance scans
…and plenty of other workflow needs.
CI/CD Components must be stored in the /templates/ directory at the root of the repository. This standardized location makes components more discoverable and maintainable.
Adjusting a Component’s Behavior by adding Inputs
A development team recently added a component that runs static analysis to improve code quality. Everything works smoothly—except for one detail:
The component executes in the test stage by default, but the team already uses a dedicated lint stage:
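For instance, a stage layout like (the stages besides lint are illustrative):

```yaml
stages:
  - lint
  - test
  - deploy
```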
Instead of modifying or duplicating the component’s job, the team looks into whether the component can be configured.
Discovering Available Inputs
Checking the CI/CD Catalog, they learn that the static-analysis component supports a few adjustable inputs, including:
stage
image
With that information, they update their pipeline:
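A sketch of the updated include; the component path, version, and Python image are illustrative, while the stage and image inputs come from the component's documented inputs:

```yaml
include:
  - component: $CI_SERVER_FQDN/components/static-analysis/static-analysis@1.0.0
    inputs:
      stage: lint           # run in the team's existing lint stage
      image: python:3.12    # the desired Python version
```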
Now the analysis job runs in the correct stage and uses the desired Python version—no rewriting required.
Where to Find Input Documentation
Every component listed in the CI/CD Catalog includes documentation describing its configurable inputs. You can locate this information in several places:
The component's README: typically inside the repository where the component is defined.
The Catalog entry: the component's page in the CI/CD Catalog links directly to its README and lists supported inputs.
Usage examples: often included by the component maintainers in either the README or example subdirectories.
These resources help you understand how to customize each component to fit your team’s workflow.
Managing Component Versions
Why Versioning Matters
Controlling which version of a CI/CD Component your pipeline uses is essential. Locking the component to a specific release helps ensure consistent behavior—even as maintainers add new features or make changes.
An Unexpected Pipeline Failure
During a staging deployment, a pipeline that previously worked without issues suddenly starts failing—despite no recent code changes. After some investigation, the team discovers the root cause in their configuration:
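The offending reference looked something like this (the component path is hypothetical; the version specifier is the point):

```yaml
include:
  - component: $CI_SERVER_FQDN/components/deploy-tools/deploy@~latest   # always the newest release
```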
The component was referenced using @~latest, which automatically pulls the most recent version. A new release had added a required parameter, and because the pipeline wasn’t specifying it, the job failed.
The realization: “We didn’t notice we were tracking an always-updating version—and a breaking change slipped in.”
Selecting an Appropriate Version Reference
Different version references serve different purposes. Here’s how they compare:
| Reference | Example | When to use |
| --- | --- | --- |
| Commit SHA | a9f4cbd72318eef... | Maximum stability and auditability; ideal when you must guarantee exact behavior |
| Tag | @1.3.0 | Stable, released versions that won't change unexpectedly |
| Branch | main | Follows the latest development work; useful for internal components or testing edge features |
| @~latest | always fetches the newest release | Only for experimentation or non-critical pipelines where breakage is acceptable |
By choosing the right version reference, teams can strike a balance between stability, predictability, and agility.
GitLab CI/CD Global and Default Keywords
Full list of supported values for the default keyword
A small engineering group suddenly expands—from a handful of developers to a full-sized team. With more people contributing, the CI setups start drifting apart: different runtime versions, mismatched tools, and the usual “but it passes on my laptop” problems.
This is where GitLab’s global keywords make a difference. They allow you to define pipeline-wide defaults—ensuring every job starts with the same baseline configuration. Let’s break down why these keywords are so important and how they help maintain consistency across larger teams:
Inconsistent Runtime Versions
Example: Across a mid-sized engineering group, developers were unknowingly using a mix of Node.js versions—anything from 12 to 20. Some CI jobs failed purely because people copied old snippets from past repositories.
Multiple Dependency Install Methods
Example: Different developers preferred different commands:
npm ci, npm install, and even pnpm install.
Each approach produced different lockfile behavior and caching results, leading to unpredictable builds.
Mismatched Database Environments
Example: Local and CI test environments may run various Postgres versions—anywhere between 9.6 and 15—even though the production environment required a specific, newer release.
The Fix — The default Keyword
The default keyword lets you set shared configuration for all jobs in the pipeline unless a job overrides those settings. It's declared once, at the top of the .gitlab-ci.yml, and instantly unifies behavior across the entire pipeline.
For example:
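A sketch using the team's pain points from above (the exact versions are illustrative):

```yaml
default:
  image: node:20          # same base container for every job
  services:
    - postgres:15         # consistent database version
  before_script:
    - npm ci              # unified dependency installation
```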
With this approach, every job starts with:
The same base container
A consistent database version
A unified dependency installation method
GitLab provides several default-level keywords—covering images, services, scripts, timeouts, artifacts, and more. The documentation lists all available options, but the principle is simple: set the standard once and let every job benefit from it.
GitLab CI/CD: Artifacts and Cache
Even though GitLab pipelines run inside fresh, isolated containers, real-world CI/CD almost always needs to share files, results, or dependencies so work isn’t repeated unnecessarily. GitLab provides two mechanisms for this: artifacts and cache. They sound similar but serve very different purposes.
Let’s break them down.
Artifacts — Deliverables That Move Through the Pipeline
Artifacts Docs, Dependencies Docs
Artifacts are files or directories that a job explicitly hands off to later stages.
Think of them as pipeline outputs: something a job produces that another job depends on.
What Artifacts Are Used For
Use artifacts when you want to transfer something forward in the pipeline, such as:
Compiled frontend bundles
Built binaries
Test reports / coverage reports
Generated documentation
Scan results
A job in a later stage can download and use these artifacts.
Important Behavior
Artifacts are available only to jobs in later stages, not parallel jobs in the same stage
They’re stored by GitLab and can be downloaded via the UI
They expire after a default period unless you customize the duration with expire_in (or configure them to never expire)
Quick Example
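A sketch of the build-to-deploy handoff (commands are illustrative):

```yaml
stages: [build, deploy]

build:
  stage: build
  script:
    - npm run build        # writes dist/
  artifacts:
    paths:
      - dist/

deploy:
  stage: deploy
  script:
    - ./deploy.sh dist/    # consumes the artifact from build
```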
The dist/ folder created by the build job is passed to the deploy job.
Another example:
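A sketch of a test-report artifact; the test script is hypothetical and assumed to write report.xml:

```yaml
test:
  stage: test
  script:
    - ./run-tests.sh       # assumed to write report.xml
  artifacts:
    when: always           # keep the report even when tests fail
    expire_in: 1 week
    reports:
      junit: report.xml    # surfaced in the pipeline/MR test report UI
```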
Cache — Speed Boosters for Repeated Work
Docs, Caching visualized, Official caching examples
Cache is designed to speed up jobs, not transfer deliverables.
While artifacts are about sharing, cache is about avoiding re-downloading or recomputing.
What Cache Is Used For
Cache shines with data that:
Takes a long time to download
Doesn’t need to be versioned
Can be reused across jobs and pipelines
Examples:
node_modules/ (Node.js)
.m2/ (Maven repository)
venv/ (Python)
Docker build layers
Large dependency folders
Important Behavior
Cache is typically shared across jobs and pipelines within the same project
Not intended for build outputs
Cache keys control when caches are reused or invalidated
A job restores the cache before running, and uploads it afterwards (if changed)
Not guaranteed: Caches can be cleared or evicted, so your pipeline should still work without them (just slower)
Upload timing: Cache is uploaded after script succeeds, so failed jobs don't update the cache
Pull/Push policies: You can control whether a job downloads, uploads, or both:
pull: download only; the job can read the cache but never updates it
push: upload only; useful for the first job that installs dependencies
pull-push: both (the default); the job downloads the cache at the start and uploads it at the end
Quick Example for per-job cache
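Reconstructed to match the discussion below; the image, requirements file, and pytest invocation are illustrative:

```yaml
test:
  image: python:3.12
  variables:
    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.pip"   # keep pip's cache inside the project dir
  cache:
    paths:
      - .pip/
      - venv/
  before_script:
    - python -m venv venv
    - source venv/bin/activate
    - pip install -r requirements.txt       # assumed to include pytest + pytest-cov
  script:
    - pytest --cov --cov-report=html        # writes htmlcov/
  artifacts:
    paths:
      - htmlcov/index.html
```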
GitLab restores cached dependencies so tests start faster.
Cache (.pip/ and venv/) in the above example:
These directories are cached between pipeline runs
Next time this job runs, pip packages won't need to be re-downloaded from the internet
The virtual environment is preserved
No cache key specified = uses the default key (all jobs share this cache)
Makes subsequent pipeline runs faster
Artifacts (htmlcov/index.html) in the above example:
This is saved within the current pipeline run
The coverage report is preserved and displayed in GitLab's UI under the test report section
Available for download after the pipeline finishes
Not carried over to future pipeline runs
Another example comes with a predefined variable. CI_COMMIT_REF_SLUG is a GitLab predefined variable that contains a sanitized version of your branch or tag name, safe for use in URLs and file paths.
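For instance, a branch-scoped cache key (the cached paths are illustrative):

```yaml
cache:
  key: $CI_COMMIT_REF_SLUG   # e.g. "feature-login": one cache per branch
  paths:
    - node_modules/
```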
Another example with strategic caching:
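A sketch of a pipeline-wide cache set under default: (paths illustrative):

```yaml
default:
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
```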
When you set cache under default:, you’re saying:
“All jobs in this pipeline should use this cache configuration unless they choose to override it.”
What this last example means in practice
Every job will restore this cache before execution
Every job will update this cache after completion
If a job doesn't need Node.js, it will still unnecessarily cache node_modules/
Jobs may accidentally overwrite each other's cache
⚠️ This can cause cache pollution, because a job that shouldn’t be touching cache might still rewrite it.
Cache sharing with global or branch scopes
Sharing Cache Across Jobs
Using the Same Cache Key
To share cache across jobs, use the same cache key in all jobs that need to access it:
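A sketch with three jobs and a hypothetical key name:

```yaml
install:
  stage: build
  cache:
    key: npm-deps
    paths:
      - node_modules/
  script:
    - npm ci

test:
  stage: test
  cache:
    key: npm-deps
    paths:
      - node_modules/
  script:
    - npm test

lint:
  stage: test
  cache:
    key: npm-deps
    paths:
      - node_modules/
  script:
    - npm run lint
```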
All three jobs now share the same cache.
Sharing Across ALL Branches
Use a static key (no branch variables):
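For example (the key name is arbitrary):

```yaml
cache:
  key: global-deps-cache   # static: every branch and every job shares this cache
  paths:
    - node_modules/
```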
Without ${CI_COMMIT_REF_SLUG} or other dynamic variables, every branch and every job uses the same cache.
Global Cache Configuration
Define the cache under default: so every job inherits it unless a job overrides it.
Is Cache Automatically Applied?
Yes, but with conditions:
Same key required: The job must specify the same cache key (or inherit it globally)
Automatic download: GitLab automatically downloads and extracts the cache at the start of each job that requests it
Automatic upload: Cache is automatically uploaded after the job's script section succeeds
No explicit "restore" needed: You don't need to manually extract or apply it
Example flow: the job starts → GitLab downloads and extracts the matching cache → the script runs → on success, the updated cache is uploaded.
Important notes:
Cache is not guaranteed: If the cache is cleared, evicted, or doesn't exist yet, the job still runs (just slower). Always ensure your jobs can work without cache.
First run has no cache: The first time a pipeline runs with a new key, there's no cache to download. Subsequent runs will have it.
Policy control: You can optimize by having only one job push to cache:
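A sketch using the pull/push policies described earlier (key and commands are illustrative):

```yaml
install:
  cache:
    key: npm-deps
    paths:
      - node_modules/
    policy: pull-push   # only this job uploads the cache
  script:
    - npm ci

test:
  cache:
    key: npm-deps
    paths:
      - node_modules/
    policy: pull        # read-only: downloads but never uploads
  script:
    - npm test
```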
This prevents multiple jobs from trying to update the same cache simultaneously, which could cause conflicts.
Artifacts vs Cache — The Core Difference
| Feature | Artifacts | Cache |
| --- | --- | --- |
| Primary purpose | Pass files to later stages | Speed up jobs by reusing data |
| Scope | Within same pipeline | Across pipelines |
| Typical contents | Build outputs, reports | Dependencies, package caches |
| Persistence | Saved by GitLab and viewable | Can be overwritten frequently |
| Availability | Only to future stages | Any job that uses same cache key |
| Expiration | Configurable, default expires | Not stored forever |
How to Decide Which One You Need
Use artifacts when:
✔ Another stage needs the exact output of a job
✔ You want downloadable files in GitLab
✔ You're producing build or test results
Use cache when:
✔ You want to avoid reinstalling dependencies
✔ You're optimizing frequent repetitive work
✔ The data can be safely regenerated
In One Sentence
Artifacts = hand-off packages between stages.
Cache = reusable stash for speeding up work.
🚀 Pipelines in GitLab CI/CD
Variables docs, variable precedence
Pipelines are the backbone of GitLab’s automation system. They define what happens after you push code—building, testing, scanning, deploying, and everything in between. But as your project grows, so does pipeline complexity. Small configuration changes can snowball into hours of maintenance unless you design pipelines in a scalable, DRY, and predictable way.
One of the key tools GitLab gives you for that is variables.
🔧 Why Pipelines Need Variables
Imagine a team maintaining several GitLab pipeline files across multiple services. One morning, infrastructure updates the internal API endpoint:
“New internal API URL: services.internal.example.net”
That should have been a simple update… but instead, the team spends half a day searching for the old URL across multiple YAML files scattered across microservice repositories.
Later that day, the integration pipeline fails — because one job in one file still references the old URL.
The problem?
Hard-coded values buried deep in different pipeline definitions.
💡 Variables Fix This
With GitLab CI/CD variables, you replace repeated values with a single source of truth.
Before, the URL was hard-coded in every pipeline file; after introducing a variable, one change updates every job.
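A sketch of the two versions (the old URL and job name are hypothetical; the new URL comes from the scenario above):

```yaml
# Before: the URL is hard-coded in every job
integration-test:
  script:
    - curl https://old-services.internal.example.net/health
---
# After: one variable is the single source of truth
variables:
  API_URL: "https://services.internal.example.net"

integration-test:
  script:
    - curl "$API_URL/health"
```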
Change the variable once → every job automatically uses the updated value.
🧩 Types of CI/CD Variables
GitLab provides several kinds of variables, each with different use cases and scopes.
1. Predefined Variables
GitLab injects these automatically into every pipeline.
Examples:
CI_COMMIT_SHA — the commit’s full SHA
CI_COMMIT_REF_NAME — the branch or tag name
CI_PIPELINE_SOURCE — whether it was triggered by push, merge request, schedule, etc.
These are ideal for tagging images, tracking builds, and making pipelines dynamic.
2. Custom Variables
You define your own values—great for anything that:
changes between environments
appears multiple times
should be controlled from a single location
contains secrets (when masked/protected)
Examples:
URLs and API endpoints
Docker registry addresses
Feature flags
Version strings (e.g., `NODE_VERSION`, `TERRAFORM_VERSION`)
Where can you define them?
In the `.gitlab-ci.yml` file
In GitLab’s UI (project/group/instance level)
At runtime (manual pipeline triggers)
In child pipelines
In components or includes
🧠 Variable Precedence (Why It Matters)
GitLab allows the same variable name to appear in multiple places. But which one wins?
For example, if API_URL is defined:
in the GitLab UI
in the `.gitlab-ci.yml` inside a job
in a child pipeline
as a secret variable
GitLab has strict precedence rules to determine the final value. Higher-priority variables override lower ones.
Understanding precedence prevents extremely tricky bugs—like pipelines working in one branch but breaking in another because a variable value was overridden unintentionally.
(You can always check GitLab’s full precedence documentation when designing critical pipelines.)
Predefined variables
Common predefined variables
1. CI_COMMIT_SHA
What it is: The full commit hash of the current pipeline’s commit
Use case: Tagging builds or container images
Example:
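A minimal sketch — the image name and registry are hypothetical:

```yaml
build-image:
  script:
    - docker build -t registry.example.com/my-app:$CI_COMMIT_SHA .
    - docker push registry.example.com/my-app:$CI_COMMIT_SHA
```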
2. CI_COMMIT_SHORT_SHA
What it is: Shortened version of the commit SHA (usually first 8 characters)
Use case: Labeling artifacts or build folders for easier reference
Example:
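For instance (the `make` target is hypothetical):

```yaml
build:
  script:
    - make build OUTPUT=build-$CI_COMMIT_SHORT_SHA
  artifacts:
    paths:
      - build-$CI_COMMIT_SHORT_SHA/
```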
3. CI_COMMIT_REF_NAME
What it is: Name of the branch or tag for the current commit
Use case: Conditional deployments or environment routing
Example:
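A sketch of branch-based routing (the deploy script is hypothetical):

```yaml
deploy:
  script:
    - ./deploy.sh "$CI_COMMIT_REF_NAME"
  rules:
    - if: $CI_COMMIT_REF_NAME == "main" || $CI_COMMIT_REF_NAME == "staging"
```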
4. CI_COMMIT_MESSAGE
What it is: The commit message that triggered the pipeline
Use case: Include context in logs, notifications, or deployment messages
Example:
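For example, echoing the message into the job log:

```yaml
notify:
  script:
    - 'echo "Commit message: $CI_COMMIT_MESSAGE"'
```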
5. CI_PIPELINE_SOURCE
What it is: How the pipeline was triggered (`push`, `schedule`, `merge_request_event`, `manual`, etc.)
Use case: Run certain jobs only on scheduled or manual pipelines
Example:
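A sketch of a job that only runs on scheduled pipelines (the cleanup script is hypothetical):

```yaml
nightly-cleanup:
  script:
    - ./cleanup.sh
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
```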
6. CI_DEFAULT_BRANCH
What it is: The project’s default branch, usually `main` or `master`
Use case: Conditional logic for jobs that should run only on the default branch
Example:
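A sketch comparing the current branch against the default branch:

```yaml
release:
  script:
    - ./release.sh
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```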
7. CI_JOB_NAME
What it is: The name of the job currently running
Use case: Customize behavior, logging, or artifact naming per job
Example:
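For example, naming log artifacts per job (the test runner flags are hypothetical):

```yaml
test:
  script:
    - ./run-tests.sh --log-file "logs/$CI_JOB_NAME.log"
  artifacts:
    paths:
      - logs/
```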
Practical Patterns
Scheduled jobs: `$CI_PIPELINE_SOURCE == "schedule"` → run overnight tasks
Tagged deployments: `$CI_COMMIT_TAG` → deploy only tagged releases
Unique builds: `$CI_COMMIT_SHORT_SHA` → name builds and artifacts uniquely
These predefined variables give pipelines dynamic behavior without hard-coding values, making your CI/CD more maintainable and safe.
Custom variables in GitLab CI/CD
Custom variables let you manage project-specific or environment-specific configuration. Unlike predefined variables, they are created and maintained by your team. They are ideal for:
API endpoints
Database credentials
Feature flags
Version numbers
Deployment tokens
Custom variables help avoid hard-coding values in .gitlab-ci.yml and make pipelines more maintainable.
Where Custom Variables Are Defined
| Level | Typical use |
| --- | --- |
| Pipeline-level (`.gitlab-ci.yml`) | Non-sensitive values like build flags or version numbers, visible to everyone who can view the repo |
| Project-level (GitLab UI) | Sensitive variables like deployment tokens or API keys; only authorized users can see or modify them |
| Group-level (GitLab UI) | Shared values across multiple projects, e.g., company-wide Docker registry URLs or common deployment targets |
Example Usage
1. Pipeline-level variable
Visible in code
Used for build options, feature flags, or versions
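A sketch of pipeline-level variables (the variable names and values are illustrative):

```yaml
variables:
  NODE_VERSION: "20"
  BUILD_FLAGS: "--production"

build:
  image: node:${NODE_VERSION}
  script:
    - npm run build -- $BUILD_FLAGS
```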
2. Project-level variable
Stored securely in Project Settings
Only authorized users can view/modify
Ideal for secrets like deployment tokens or API keys
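The variable itself lives in the UI; the pipeline only references it by name (the `DEPLOY_TOKEN` name and deploy script are hypothetical):

```yaml
deploy:
  script:
    # DEPLOY_TOKEN is defined in Settings > CI/CD > Variables, not in this file
    - ./deploy.sh --token "$DEPLOY_TOKEN"
```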
3. Group-level variable
Shared across all projects in the group
Great for common registry URLs, environment URLs, or company-wide configurations
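A sketch assuming a group-level `GROUP_REGISTRY` variable (plus credentials) defined in the group's UI settings:

```yaml
build-image:
  script:
    - docker login -u "$REGISTRY_USER" -p "$REGISTRY_PASSWORD" "$GROUP_REGISTRY"
    - docker build -t "$GROUP_REGISTRY/my-app:$CI_COMMIT_SHORT_SHA" .
```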
Custom Variable Best Practices
Avoid hard-coding
Hard-coded values must be updated in multiple jobs and files when things change
Using custom variables centralizes updates
Environment scope
Restrict a variable to a specific environment (e.g., production)
Prevents accidental use in other environments
Protected variables
Only available on protected branches (e.g., main/master)
Prevents accidental exposure in feature branches
Masked variables
Hides the value in job logs
Job scripts can still use the variable safely
Prevents secrets from being accidentally printed
Masked + hidden
In addition to masking, the variable value is not visible in the UI
Can only be set when creating a new variable, not after creation
Real-world Scenario: API Migration
Imagine a team’s API provider changes all endpoints to a new domain.
Approaches:
Hard-coded URLs
Must update every job manually (build, test, deploy-dev, deploy-prod, integration tests)
Error-prone, tedious
Custom variables
Update the variable once → all jobs automatically use the new endpoint
Reduces maintenance, lowers risk of deployment failures
Environment-specific if statements
Can control which endpoint to use per environment
More flexible than hard-coding, but less centralized than variables
Takeaway: Custom variables make pipelines more maintainable, safer, and easier to update.
Rules
Rules control when jobs run based on conditions you define. They're the primary way to make your pipelines dynamic and efficient.
Basic Structure
The logic:
1. Define a `rules:` block
2. Add one or more conditions (`if`, `changes`, `exists`)
3. Specify what happens when the condition is true (`when`)
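The three steps above in a minimal job (the job name is illustrative):

```yaml
my-job:
  script:
    - echo "Running"
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: on_success
```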
The if Clause
Use if to check variables (including GitLab's predefined variables):
Common conditions:
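A sketch of typical `if` conditions:

```yaml
deploy:
  script:
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: manual
```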
The changes Clause
Run jobs only when specific files change - great for performance!
This prevents unnecessary work. If you only changed a README, why rebuild the entire application?
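For example, rebuilding only when source files change (the paths are illustrative):

```yaml
build-app:
  script:
    - npm run build
  rules:
    - changes:
        - src/**/*
        - package.json
```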
The exists Clause
Run jobs only if certain files exist:
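A sketch that builds an image only when a `Dockerfile` is present:

```yaml
docker-build:
  script:
    - docker build -t my-app .
  rules:
    - exists:
        - Dockerfile
```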
The when Keyword
Controls what happens when a rule matches:
Options:
`on_success` (default) - Run the job if all previous stages succeeded
`always` - Run the job regardless of previous job status
`never` - Don't run the job
`manual` - Job requires manual approval in the UI
`delayed` - Wait before running (with `start_in`)
Examples
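Two sketches of `when` in practice (the scripts are hypothetical):

```yaml
cleanup:
  script:
    - ./cleanup.sh
  rules:
    - when: always        # runs even if earlier jobs failed

deploy-prod:
  script:
    - ./deploy.sh production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual        # requires a click in the UI
```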
Multiple Rules (Evaluation Order)
Rules are evaluated top to bottom. The first match wins:
Logic:
1. If the branch is `main` → always deploy
2. Else if the branch is `staging` → require manual approval
3. Otherwise → never run
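That logic as a sketch (first matching rule wins):

```yaml
deploy:
  script:
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: on_success
    - if: $CI_COMMIT_BRANCH == "staging"
      when: manual
    - when: never
```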
Combining Conditions
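Within a single rule, `if` and `changes` are ANDed — both must hold (the job and paths are illustrative):

```yaml
deploy-docs:
  script:
    - ./publish-docs.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      changes:
        - docs/**/*
```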
Practical Real-World Example
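A sketch of a small pipeline using these ideas together (scripts and stage names are hypothetical):

```yaml
stages:
  - test
  - deploy

test:
  stage: test
  script:
    - npm test
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"

deploy:
  stage: deploy
  script:
    - ./deploy.sh production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
```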
Why Use Rules?
✅ Save time - Don't run unnecessary jobs
✅ Save resources - Fewer compute minutes = lower costs
✅ Faster feedback - Developers get results quicker
✅ Control deployment - Prevent accidental production deployments
✅ Improve efficiency - Only test what changed
More examples of using rules
These examples add a few important concepts:
1. Deploy Only from Main - The "Catch-All" Pattern
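A sketch of the pattern (the deploy script is hypothetical):

```yaml
deploy:
  script:
    - ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: on_success
    - when: never       # catch-all: block the job in every other case
```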
Key insight: The final when: never acts as a default/fallback. If no previous rule matches, the job won't run. This is a common pattern to explicitly block a job unless conditions are met.
2. Merge Request Trigger
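For example, a job that only runs in merge request pipelines:

```yaml
mr-checks:
  script:
    - npm run lint
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
```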
Key insight: The CI_PIPELINE_SOURCE variable identifies how the pipeline was triggered:
`"merge_request_event"` - From a merge request
`"web"` - From the GitLab web UI
`"push"` - From a git push
`"schedule"` - From a scheduled pipeline
`"api"` - From an API call
This is crucial for running jobs only in specific contexts (e.g., run extra validation only on MRs).
3. Workflow Rules - Pipeline-Level Control
Major new concept: workflow: applies rules to the entire pipeline, not just individual jobs.
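A sketch of such a `workflow:` block (the `-wip` suffix convention is an assumption of this example):

```yaml
workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /-wip$/
      when: never           # skip WIP commits entirely
    - if: $CI_COMMIT_TAG
      when: never           # no pipelines on tag creation
    - when: always
```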
What this does:
If the commit message ends with `-wip` → the entire pipeline is blocked
If triggered by a tag → the entire pipeline is blocked
Otherwise → the pipeline runs
Use cases:
Skip pipelines for work-in-progress commits
Prevent pipelines on tag creation
Only run pipelines on specific branches globally
Optimizing Job Run Order
Controlling how jobs run, Pipeline efficiency
There are two main strategies for optimizing pipeline performance: changing job execution order and parallelizing slow jobs.
1. The needs Keyword - Optimize Job Order
By default, GitLab runs jobs in stages sequentially - all jobs in one stage must complete before the next stage starts:
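A sketch of the default stage-based behavior (job names and `make` targets are illustrative):

```yaml
stages:
  - build
  - test

build:
  stage: build
  script: make build

docs:
  stage: build
  script: make docs

test:
  stage: test          # waits for EVERY job in the build stage
  script: make test
```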
Problem: Even if test only needs build to finish, it must wait for every job in the build stage.
Solution: Use needs
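With `needs`, the job declares exactly which jobs it depends on (a sketch; job names are illustrative):

```yaml
test:
  stage: test
  needs: [build]       # starts as soon as `build` finishes,
                       # ignoring any other jobs in the build stage
  script: make test
```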
Result: Jobs start as soon as their specific dependencies finish, not when the entire stage completes.
More Complex Example
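A sketch with independent backend/frontend tracks (the scripts are hypothetical):

```yaml
build-backend:
  stage: build
  script: ./build-backend.sh

build-frontend:
  stage: build
  script: ./build-frontend.sh

test-backend:
  stage: test
  needs: [build-backend]
  script: ./test-backend.sh

test-frontend:
  stage: test
  needs: [build-frontend]
  script: ./test-frontend.sh
```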
Without needs: test-backend waits for both backend AND frontend to build.
With needs: test-backend starts as soon as build-backend finishes, even if build-frontend is still running.
2. The parallel Keyword - Speed Up Individual Jobs
When a single job is slow because it has too much work, split it across multiple runners:
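The minimal form (the test script is hypothetical):

```yaml
test:
  parallel: 4          # run 4 copies of this job simultaneously
  script:
    - ./run-tests.sh
```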
How It Works
GitLab creates 4 identical runners and provides special variables:
`$CI_NODE_TOTAL` = `4` (total number of parallel runners)
`$CI_NODE_INDEX` = `1`, `2`, `3`, or `4` (which runner this is)
Your test framework uses these to split the work:
Each runner executes 1/4 of the tests simultaneously.
Real-World Scenario
Before (100 tests on 1 runner):
Time: 400 seconds
CPU: maxed out
After (100 tests on 4 runners):
Time: ~100 seconds (4x faster)
CPU per runner: ~25% each
Total job time: significantly reduced
Parallel Examples for Different Tools
Jest (JavaScript)
pytest (Python)
RSpec (Ruby)
Manual splitting
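Sketches for each tool — Jest's `--shard` flag and the `pytest-split` plugin exist as shown, but treat the RSpec/knapsack invocation and the manual-splitting flags as assumptions:

```yaml
test-jest:
  parallel: 4
  script:
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

test-pytest:
  parallel: 4
  script:
    - pip install pytest-split
    - pytest --splits $CI_NODE_TOTAL --group $CI_NODE_INDEX

test-rspec:
  parallel: 4
  script:
    - bundle exec rake knapsack:rspec   # knapsack gem reads CI_NODE_* itself

test-manual:
  parallel: 4
  script:
    - ./run-tests.sh --shard $CI_NODE_INDEX --of $CI_NODE_TOTAL
```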
Combining needs and parallel
You can use both strategies together for maximum optimization:
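A sketch combining the two (scripts are hypothetical; `needs: [test]` waits for all four parallel instances):

```yaml
stages: [build, test, deploy]

build:
  stage: build
  script: make build

test:
  stage: test
  needs: [build]       # start right after build, skip the stage barrier
  parallel: 4          # split the test load across 4 runners
  script: ./run-tests.sh --shard $CI_NODE_INDEX --of $CI_NODE_TOTAL

deploy:
  stage: deploy
  needs: [test]
  script: ./deploy.sh
```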
Benefits:
Tests start immediately after build (don't wait for each other)
Each test suite runs in parallel (faster execution)
Deploy starts as soon as last test finishes
Parallel Matrix Strategy
You can also use parallel:matrix to run the same job with different configurations:
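For example (version and database values are illustrative):

```yaml
test:
  parallel:
    matrix:
      - RUBY_VERSION: ["3.1", "3.2", "3.3"]
        DATABASE: ["postgres", "mysql"]
  image: ruby:${RUBY_VERSION}
  script:
    - ./run-tests.sh
```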
This creates 6 jobs (3 Ruby versions × 2 databases) that run simultaneously.
Key Takeaways
needs keyword:
✅ Optimizes job order
✅ Jobs start as soon as dependencies finish
✅ Reduces total pipeline time
parallel keyword:
✅ Optimizes individual job runtime
✅ Distributes work across multiple runners
✅ Reduces CPU bottlenecks
✅ Requires test framework support for sharding
Best practice: Use both together for maximum speed!
Managing Complexity in Gitlab CI/CD
As projects grow, CI/CD configurations can become unwieldy. GitLab provides several pipeline types to manage this complexity.
The Problem
Large projects face:
Huge config files (hundreds of lines in
.gitlab-ci.yml)Distributed teams wanting control over their own configuration
Unnecessary pipeline runs for commits that don't need CI
Solution 1: Parent-Child Pipelines
Split large configurations into smaller, manageable files within the same repository.
Example
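A sketch of a parent config that triggers child pipelines per area (paths are illustrative):

```yaml
# .gitlab-ci.yml (parent)
frontend:
  trigger:
    include: frontend/.gitlab-ci.yml
  rules:
    - changes:
        - frontend/**/*

backend:
  trigger:
    include: backend/.gitlab-ci.yml
  rules:
    - changes:
        - backend/**/*
```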
How It Works
Parent pipeline (main `.gitlab-ci.yml`) detects changes
Only the child pipeline for the affected area is triggered
Child pipeline runs independently with its own configuration
Benefits
✅ Modularity - Each team manages their own .gitlab-ci.yml
✅ Performance - Only relevant pipelines run (frontend changes don't trigger backend tests)
✅ Reduced complexity - Smaller, focused configuration files
✅ Parallel execution - Child pipelines run concurrently
✅ Easier to understand - Each file contains only relevant jobs
Solution 2: Multi-Project Pipelines
Trigger pipelines in different repositories - useful for microservices or split codebases.
Example
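A sketch of a cross-repo trigger (the project path matches the scenario below; the watched paths are illustrative):

```yaml
trigger-payment-tests:
  trigger:
    project: payments-team/payment-service
    branch: main
  rules:
    - changes:
        - payments/**/*
```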
Real-World Scenario
Your e-commerce site has:
Main application in `ecommerce/main-app`
Payment service in `payments-team/payment-service`
Shipping service in `logistics/shipping-service`
When you change payment-related code in the main app, it automatically triggers tests in the payment service repository to ensure compatibility.
Benefits
✅ Cross-repo coordination - Test dependencies across repositories
✅ Microservices architecture - Each service has its own repo and CI
✅ Team independence - Payment team controls their pipeline
✅ Integration testing - Verify services work together
Solution 3: Merge Request Pipelines
Run different jobs for merge requests vs. regular branch pushes.
Example
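A sketch of a job that runs in MR pipelines and on the default branch, but not on other pushes:

```yaml
unit-tests:
  script:
    - npm test
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"
```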
Common Patterns
Skip deployments in MRs:
Run extra checks only on MRs:
Different behavior for MRs vs. main:
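The three patterns above might look like this (job names and npm scripts are hypothetical):

```yaml
# Skip deployments in MRs
deploy-staging:
  script:
    - ./deploy.sh staging
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: never
    - if: $CI_COMMIT_BRANCH == "main"

# Run extra checks only on MRs
mr-lint:
  script:
    - npm run lint
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

# Different behavior for MRs vs. main
test:
  script:
    - |
      if [ "$CI_PIPELINE_SOURCE" = "merge_request_event" ]; then
        npm run test:quick
      else
        npm run test:full
      fi
```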
Benefits
✅ Faster feedback - Developers get quick results on MRs
✅ Save resources - Don't deploy to staging for every MR
✅ Targeted testing - Run different tests in different contexts
✅ Cost optimization - Skip expensive jobs when not needed
Combining All Three
Real-world complex setup:
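A sketch combining workflow rules, parent-child triggers, and a multi-project trigger (project and path names are hypothetical):

```yaml
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"

frontend:
  trigger:
    include: frontend/.gitlab-ci.yml
  rules:
    - changes:
        - frontend/**/*

payment-integration:
  trigger:
    project: payments-team/payment-service
  rules:
    - changes:
        - payments/**/*
```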
Pipeline Type Comparison
| Type | Best for | Characteristics |
| --- | --- | --- |
| Basic | Simple projects | Single file, sequential stages |
| With `needs` | Optimize dependencies | Single file, parallel execution |
| Parent-Child | Large monorepos, team separation | Same repo, multiple config files |
| Multi-Project | Microservices, cross-repo dependencies | Different repositories |
| Merge Request | Different behavior for MRs vs branches | Context-aware execution |
Key Takeaways
Parent-Child Pipelines:
Break up large configs
Team ownership of their pipeline
Only run what changed
Multi-Project Pipelines:
Coordinate across repositories
Test microservice integrations
Maintain service independence
Merge Request Pipelines:
Faster developer feedback
Skip unnecessary jobs
Context-specific testing
Best Practice: Use the simplest pipeline type that solves your problem. Start simple, add complexity only when needed.
Gitlab Registries
Course to learn about Package, Container, Terraform Registries
Docker in Docker