GitLab CI/CD


GitLab CI/CD docs: https://docs.gitlab.com/ci/

Validate the syntax of your configuration: https://docs.gitlab.com/ci/yaml/lint/

Predefined variables (available in every GitLab CI/CD pipeline): https://docs.gitlab.com/ci/variables/predefined_variables/

GitLab CI/CD keywords for YAML configuration files: https://docs.gitlab.com/ci/yaml/

CI/CD component examples: https://docs.gitlab.com/ci/components/examples/

CI/CD inputs: https://docs.gitlab.com/ci/inputs/

GitLab CI/CD Security course from GitLab University


What Gitlab CI/CD is

GitLab CI/CD is GitLab’s built-in continuous integration, delivery, and deployment system. It automates:

  • Building code

  • Running tests

  • Packaging artifacts

  • Deploying to environments

  • Ensuring quality gates / approvals

It is configured through a single file inside the repo: .gitlab-ci.yml.

GitLab CI/CD is tightly integrated: repo → merge requests → pipelines → environments → deployments → observability.


🏗️ GitLab CI/CD Architecture (High-Level)

GitLab’s CI/CD architecture consists of five main components:

GitLab Server (Coordinator)

This includes GitLab Rails/Workhorse/Gitaly. It is responsible for:

  • Parsing .gitlab-ci.yml

  • Creating pipeline DAGs (jobs, stages, rules)

  • Storing pipeline metadata

  • Authenticating/authorizing runners

  • Scheduling CI jobs to available runners

  • Tracking job logs, statuses, artifacts

Think: brain of the CI system.


GitLab Runners

Runners are the compute nodes that actually execute jobs.

They can be:

  • Shared runners – provided by GitLab (SaaS) or by your company

  • Project/group runners – assigned to specific areas

  • Specific runners – dedicated to one project

  • Ephemeral runners – auto-scaled on cloud VMs or Kubernetes

Each runner is an agent registered with the GitLab coordinator.

Think of your .gitlab-ci.yml file as the blueprint. Runners are the machines that carry out the work. When a pipeline is triggered, available runners check in with GitLab to pick up jobs that perform various tasks, like running tests, building apps, or deploying changes.

GitLab’s runner system includes:

  1. GitLab Runner (the software) This is the application/program you actually install on a server or machine. Think of it as the "engine" - it's the binary executable that sits on your infrastructure waiting for work to do.

  2. Runners (the agents) These are the configured instances or "workers" that the GitLab Runner software manages. Each runner is registered with your GitLab instance and can execute CI/CD pipeline jobs.

Each runner runs inside an environment defined by an executor like Docker or Shell.

Potential issues with Runners and their troubleshooting

When creating job definitions in your .gitlab-ci.yml file, you have the ability to specify which runners can execute those jobs. This capability is essential for guaranteeing that jobs execute in appropriate environments—with the necessary permissions and resources available.

Why Runner Selection Matters

  1. Some jobs may require specific environments or resources.

  2. You may want to reserve certain runners for specific job types.

  3. Security requirements may limit which runners can be used.

Runner selection in pipelines

GitLab considers several factors when matching a job to a runner:

  • The runner's availability and access level

  • Tags assigned to runners

  • Protected runners for sensitive operations

Runner Availability GitLab follows a specific hierarchy when selecting runners: it checks project-level runners first, then group-level runners (along with any parent groups), and finally instance-level runners. This ordering means more specific runners take precedence, giving you tailored environments where they matter most.

A typical CI/CD organization might use instance-level runners for standard microservices to minimize upkeep, while reserving project-specific runners for sensitive operations like payment processing. This strategy provides a good balance between convenience and security.

Instance-wide runners simplify administrative work, whereas project-dedicated runners can handle high-priority operations. Most teams can adopt this approach without changing their existing pipeline definitions.

Using Runner Tags Tags function as descriptive labels on runners, indicating what they're equipped to handle—for instance, 'android' or 'xcode'. You can use these tags to direct jobs toward runners with the necessary capabilities, guaranteeing that builds happen in appropriate environments.

Consider a mobile development team at a CI/CD-enabled company: they use tags to route iOS builds to runners with Xcode and Android builds to runners with the Android SDK. This precision reduces configuration mistakes and makes better use of available resources.

Tags ensure jobs only execute where the required tooling exists. Teams gain better environment separation, and there's no risk of untagged runners accidentally claiming incompatible jobs.

Using Protected Runners Protected runners only accept jobs from protected branches and tags, making them perfect for production pipelines. This restriction guarantees that only verified code reaches your live systems.

A company might configure a protected runner specifically for production releases to their Kubernetes infrastructure. This creates an additional security boundary around deployment operations.

With protected runners, only approved branches can initiate deployments. Sensitive credentials stay contained, and you can require manual sign-off as an extra safeguard for production changes.

Runner selection gives you precise control over the location, method, and circumstances under which your pipeline tasks execute.

Best practices for runner selection

  • Use specific tags to match jobs with appropriate environments. Example: Apply labels such as docker, linux, or android to ensure proper routing.

  • Establish consistent tagging conventions across your organization. Tip: Document your approved tags in a central location like a wiki or README for team reference.

  • Reserve protected runners for release pipelines. Benefit: Keeps sensitive credentials separate and restricts who can trigger critical deployments.

  • Avoid over-tagging your jobs. Best practice: Only specify the essential tags required for execution to prevent jobs from becoming unassignable.

  • Manage resource contention strategically when runner capacity is constrained. Strategy: Configure less critical jobs as interruptible so they can be preempted by high-priority work.

Diagnosing runner problems is a crucial skill for CI/CD practitioners. When jobs behave unexpectedly, recognizing typical runner-related issues enables quicker and more assured responses. If a job fails to launch or encounters unexpected failures, the underlying issue could stem from runner setup or connectivity challenges.

Identifying Runner Issues in Job Logs

When jobs fail in GitLab CI/CD, the job log provides your primary diagnostic resource. GitLab organizes logs into distinct sections that make it easier to determine whether issues stem from runner setup, network problems, or resource availability.

  • Job Start Section This area displays the GitLab Runner version and identifies which runner accepted the job. Verify that the correct runner was selected—particularly important when jobs need specific capabilities (such as Docker or shell executors).

  • Preparation Section This segment reveals how the runner configures the job environment. Failures in this phase often point to problems downloading images, setting up executors, or retrieving secured credentials.

  • Script Execution Look for system-level errors that indicate runner problems, such as:

    • Cannot allocate memory → Insufficient RAM on the runner

    • Connection reset by peer → Network connectivity loss

    • No space left on device → Exhausted disk capacity on the runner

These represent infrastructure limitations rather than flaws in your job's script logic.

Learning to read job logs effectively is critical for rapid runner troubleshooting, reducing delays, and maintaining development momentum.


Executors

The relationship between Runner and Executor:

While a runner picks up CI/CD jobs from GitLab, the executor determines how and where those jobs are run. Runners support many executors:

  • Shell: runs directly on the host machine (fast, but insecure, no isolation)

  • Docker: most common; runs each job in a container

  • Docker Machine: auto-scales VMs

  • Kubernetes: one pod per job; cloud-native

  • Custom: any custom environment

  • SSH: executes commands on remote hosts

Executors are critical because they determine isolation, scale, and portability.

A little more information about some executor types

Shell Executor Commands execute directly on the host system with the Shell executor, offering a simple approach for tasks needing direct server access. A company may leverage this on their secure deployment infrastructure when releasing production updates.

When you need straightforward execution and direct host system interaction, this executor type works well.

Docker Executor Each job gets its own fresh container when using the Docker executor, which isolates tasks from one another. A company may find this particularly valuable for frontend builds and testing, since it prevents jobs from interfering with each other.

Container-based execution provides both stronger security boundaries and more predictable behavior throughout the development workflow.

Kubernetes Executor Jobs run inside individual pods with the Kubernetes executor, which makes it great for scaling in cloud environments. A company may use this approach for their mobile builds, where they need to handle many simultaneous jobs efficiently.

Teams operating in cloud-native setups benefit from this executor's ability to dynamically manage resources and performance.

SSH Executor Remote command execution happens through SSH connections with this executor, giving you flexibility to work with distributed systems. A company may find it useful for their older infrastructure during a backend system migration.

This option helps bridge the gap when you're working with established systems while moving toward newer architectures.


Pipelines → Stages → Jobs DAG

A pipeline is a graph made of:

  • Pipeline (full execution)

  • Stages (sequential blocks, e.g. build → test → deploy)

  • Jobs (individual tasks)

  • Needs/DAG (modern approach, jobs run in parallel when dependencies are met)

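A minimal sketch of both styles, with illustrative job names. A stage-based pipeline runs each stage to completion before the next one starts:

```yaml
stages: [build, test, deploy]

build-app:
  stage: build
  script: make build

unit-tests:
  stage: test
  script: make test

deploy-prod:
  stage: deploy
  script: make deploy
```

With needs, a job starts as soon as the jobs it depends on have finished, instead of waiting for the entire previous stage:

```yaml
unit-tests:
  stage: test
  needs: [build-app]
  script: make test
```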


Artifacts, Packages, Environments

GitLab CI/CD automatically manages outputs:

  • Artifacts – files produced in jobs (binaries, logs, reports)

  • Cache – dependency caches between jobs

  • Environment deployments – dev/staging/prod

  • Releases & packages – container registry, package registry


🔁 How a Pipeline Runs (Execution Flow)

1. Developer pushes code or opens MR

Triggers pipeline based on rules:

  • on push

  • on merge request

  • schedule

  • manual

  • webhook

2. GitLab reads .gitlab-ci.yml

It parses:

  • stages

  • jobs

  • variables

  • rules

  • dependencies

Generates a directed acyclic graph (DAG).

3. Jobs wait in the queue

The GitLab coordinator places all pending jobs into a queue, matched to runners by tag.

4. Runners pull jobs

Runners use a pull model: they poll the GitLab server for pending jobs, so the server never needs an inbound connection into the runner's network.

They match jobs using:

  • Tags (docker, k8s, gpu, linux)

  • Runner assignments

  • Resource permissions

5. Runner executes the job

Based on executor configuration.

Typical steps:

  1. Checkout source code

  2. Restore caches

  3. Run job script

  4. Save artifacts

  5. Upload logs

6. GitLab updates pipeline + MR status

GitLab shows:

  • Success

  • Failure

  • Skipped

  • Manual action required

7. (Optional) Deployments + Observability

GitLab can:

  • Deploy to Kubernetes

  • Create an environment URL

  • Track deployments via GitLab Deployments API

  • Integrate with Metrics/Tracing


🧩 Key Concepts of GitLab CI/CD

Here are the essential pieces:

.gitlab-ci.yml

Example minimal pipeline:
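A minimal sketch (job names and commands are illustrative):

```yaml
stages:
  - build
  - test

build-job:
  stage: build
  script:
    - echo "Building..."

test-job:
  stage: test
  script:
    - echo "Running tests..."
```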

Tags

Used for routing jobs:
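A hedged sketch; the tag name and build command are illustrative:

```yaml
build-android:
  tags:
    - android            # only runners tagged "android" can pick this up
  script:
    - ./gradlew assembleDebug
```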

Variables

Pipeline-level or job-level:
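A sketch with a hypothetical API_URL variable:

```yaml
variables:               # pipeline-level, visible to all jobs
  API_URL: "https://staging.example.com"

smoke-test:
  variables:             # job-level, overrides the pipeline value
    API_URL: "https://dev.example.com"
  script:
    - curl "$API_URL/health"
```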

Rules

Modern conditional logic:
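A sketch using predefined variables (job name and script are illustrative):

```yaml
deploy:
  script: ./deploy.sh
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      when: never                        # skip on merge request pipelines
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual                       # require a manual click on main
    - when: on_success                   # otherwise run normally
```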

Artifacts & Cache

Saves files between jobs and stages. For example, a build job can save a public/ folder that later jobs reuse.

Artifacts = persist between stages. Cache = speed up builds.

Dependencies


📦 Putting It All Together: Full Architecture Diagram (Text)
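A text sketch assembled from the components described above:

```
Developer push / merge request
        │
        ▼
GitLab Server (Rails / Workhorse / Gitaly)
  - parses .gitlab-ci.yml
  - builds the pipeline DAG
  - queues jobs by tag
        │   (runners poll for work)
        ▼
GitLab Runner ──► Executor (Shell / Docker / Kubernetes / SSH)
  - checkout, restore cache, run scripts
  - save artifacts, upload logs
        │
        ▼
GitLab UI: pipeline + MR status, artifacts, environments, deployments
```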


🧠 When to Use GitLab CI/CD

GitLab shines if you want:

  • A unified Git+CI system

  • Strong merge request workflows

  • Kubernetes-native deployments

  • Built-in security scanning (SAST, DAST, dependency, secret detection)

  • Self-hosted + multi-cloud flexibility

  • Complex DAG pipelines

It’s extremely popular for DevOps teams that want an all-in-one platform.


Security practices on GitLab

https://docs.gitlab.com/user/application_security/get-started-security/


Pipeline Types

https://docs.gitlab.com/ci/pipelines/pipeline_types/

Parent–Child Pipelines

A single main pipeline can trigger several smaller pipelines to run in parallel. For example, this becomes useful when a team breaks a large application into microservices, with each service having its own testing pipeline.

Multi-Project Pipelines

These pipelines span multiple repositories or projects and allow coordinated workflows. A common use case is when an organization introduces an additional service—such as a new analytics component—and wants deployments across both codebases to be synchronized.

Merge Request Pipelines

These run automatically whenever changes are pushed to a merge request. Teams often use them to speed up code reviews and detect bugs earlier in the development cycle.

Merge Trains

Merge trains queue and merge multiple merge requests safely and in a controlled order. This is especially helpful for teams where several developers push changes around the same time and want to avoid integration conflicts.


What is the include keyword?

GitLab allows you to modularize and share pipeline configurations using the include keyword.

This enables you to:

  • Eliminate duplicate logic across different files

  • Distribute pipeline components among multiple projects

  • Maintain templates from a single location for simpler updates

When you apply include, GitLab combines the referenced external YAML with your .gitlab-ci.yml during pipeline execution.

Before using include, teams often repeat the same or similar configuration in every project:
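A hedged illustration: the same test job copy-pasted into each project's .gitlab-ci.yml (names are hypothetical):

```yaml
# repeated verbatim in project-a, project-b, project-c, ...
unit-tests:
  stage: test
  image: node:20
  script:
    - npm ci
    - npm test
```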

And here is the same configuration reused with include:
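A sketch of the shared setup; the project and file names are hypothetical:

```yaml
# .gitlab-ci.yml in each consuming project
include:
  - project: 'my-group/ci-templates'
    file: '/templates/test-jobs.yml'
```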

This approach lets you define a job a single time—then apply it across numerous projects—without duplicating the same YAML repeatedly. Let's examine the various methods for including files from different sources.

include: local

Use this method to reference a YAML file within the same repository:
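For example:

```yaml
include:
  - local: '/ci/test-jobs.yml'
```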

When your /ci/test-jobs.yml contains:
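A minimal sketch of what that file might contain:

```yaml
test:
  stage: test
  script:
    - echo "Running tests..."
```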

The test job will execute alongside any jobs specified in your primary .gitlab-ci.yml file.

Tip: GitLab's Pipelines section displays the fully expanded YAML configuration. This view helps with debugging or understanding how your included files merged into the complete pipeline definition.

include:project

This method lets you reference YAML files from different GitLab projects within your instance. It's valuable when distributing configurations among several projects.
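For example (ref may be a branch name, a tag, or a SHA):

```yaml
include:
  - project: 'create-group/ci-config'
    ref: main
    file: '/ci/test-jobs.yml'
```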

Here, the test job from /ci/test-jobs.yml in the create-group/ci-config repository's main branch gets incorporated into your pipeline. The ref keyword accepts SHAs or tags as alternatives to branch names.

include:remote

This option enables you to pull in YAML files from external URLs beyond your GitLab instance. For instance:
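A hedged sketch; the exact raw-file URL is illustrative:

```yaml
include:
  - remote: 'https://gitlab.com/example-project/-/raw/main/.gitlab-ci.yml'
```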

This example incorporates the .gitlab-ci.yml file from example-project hosted on GitLab.com into your pipeline.


CI/CD Components and CI/CD Catalog

What Are CI/CD Components?

Component for auto-versioning

CI/CD Components are reusable, versioned building blocks for pipelines. Consider them modular templates that are:

  • Versioned – ensuring updates don't disrupt your existing pipelines

  • Parameterized – allowing you to provide inputs for customized behavior

  • Self-contained – designed around a specific function like linting or testing

  • Discoverable – available through the GitLab CI/CD Catalog

How to Include a Component

To incorporate a component, use the include keyword—but specify a component: reference rather than local, project, or remote.
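A sketch of a component include; the component path and version here are illustrative:

```yaml
include:
  - component: $CI_SERVER_FQDN/components/markdownlint/markdownlint@1.0.0
```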

This example utilizes the markdownlint component from the GitLab CI/CD Catalog. It comes prebuilt and ready for integration into any pipeline. By adding this component, you instantly access Markdown linting functionality without handling its configuration manually. We'll explore the details of including, using, and creating CI/CD Components in subsequent modules.

Understanding a Component Reference

A team lead introduces a developer to CI/CD Components—modular, version-controlled pipeline pieces that anyone in the organization can plug into their workflows.

They share this reference as an example:
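The reference they discuss:

```yaml
include:
  - component: $CI_SERVER_FQDN/components/yamllint/yamllint@1.4.3
```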

Together, they walk through what each part represents:

  • $CI_SERVER_FQDN: refers to the GitLab instance automatically, so there is no need to hardcode the domain

  • components/yamllint: the group and project where the reusable component is stored

  • yamllint: the specific component provided by that project

  • @1.4.3: the exact release version that the pipeline should pull in
With a single line of configuration, the developer enables YAML linting across multiple repositories with consistent and centrally managed rules.


Enhancing Pipelines with Multiple Components

As their pipelines grow, they start combining several components—for code quality checks, container security, Go builds, and more.

Each referenced component adds a ready-to-use job or set of jobs, giving the team immediate advantages:

  • Automatically adopt established standards

  • Leverage expertise from specialists (security, language tooling, etc.)

  • Reduce duplication and keep .gitlab-ci.yml files clean and maintainable



GitLab CI/CD Catalog

In the GitLab CI/CD catalog, you can browse ready-made components for many common tasks, including:

  • Executing tests for a variety of languages

  • Building and pushing Docker images

  • Deploying to different types of environments

  • Running security and compliance scans

…and plenty of other workflow needs.

CI/CD Components must be stored in the /templates/ directory at the root of the repository. This standardized location makes components more discoverable and maintainable.

Adjusting a Component’s Behavior by adding Inputs

A development team recently added a component that runs static analysis to improve code quality. Everything works smoothly—except for one detail:

The component executes in the test stage by default, but the team already uses a dedicated lint stage:

Instead of modifying or duplicating the component’s job, the team looks into whether the component can be configured.


Discovering Available Inputs

Checking the CI/CD Catalog, they learn that the static-analysis component supports a few adjustable inputs, including:

  • stage

  • image

With that information, they update their pipeline:
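A hedged sketch; the component path, version, and image tag are illustrative:

```yaml
include:
  - component: $CI_SERVER_FQDN/components/static-analysis/static-analysis@1.0.0
    inputs:
      stage: lint          # run in the team's existing lint stage
      image: python:3.12   # hypothetical desired Python version
```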

Now the analysis job runs in the correct stage and uses the desired Python version—no rewriting required.


Where to Find Input Documentation

Every component listed in the CI/CD Catalog includes documentation describing its configurable inputs. You can locate this information in several places:

  • The component’s README Typically inside the repository where the component is defined.

  • The Catalog entry The component’s page in the CI/CD Catalog links directly to its README and lists supported inputs.

  • Usage examples Often included by the component maintainers in either the README or example subdirectories.

These resources help you understand how to customize each component to fit your team’s workflow.

Managing Component Versions

Why Versioning Matters

Controlling which version of a CI/CD Component your pipeline uses is essential. Locking the component to a specific release helps ensure consistent behavior—even as maintainers add new features or make changes.


An Unexpected Pipeline Failure

During a staging deployment, a pipeline that previously worked without issues suddenly starts failing—despite no recent code changes. After some investigation, the team discovers the root cause in their configuration:
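The offending reference looked something like this (the component path is illustrative):

```yaml
include:
  - component: $CI_SERVER_FQDN/components/security-scan/scan@~latest   # always tracks the newest release
```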

The component was referenced using @~latest, which automatically pulls the most recent version. A new release had added a required parameter, and because the pipeline wasn’t specifying it, the job failed.

The realization: “We didn’t notice we were tracking an always-updating version—and a breaking change slipped in.”


Selecting an Appropriate Version Reference

Different version references serve different purposes. Here’s how they compare:

  • Commit SHA (e.g. a9f4cbd72318eef...): maximum stability and auditability; ideal when you must guarantee exact behavior

  • Tag (e.g. @1.3.0): stable, released versions that won't change unexpectedly

  • Branch (e.g. main): follows the latest development work; useful for internal components or testing edge features

  • @~latest: always fetches the newest release; suitable only for experimentation or non-critical pipelines where breakage is acceptable

By choosing the right version reference, teams can strike a balance between stability, predictability, and agility.


GitLab CI/CD Global and Default Keywords

Full list of supported values for the default keyword

A small engineering group suddenly expands—from a handful of developers to a full-sized team. With more people contributing, the CI setups start drifting apart: different runtime versions, mismatched tools, and the usual “but it passes on my laptop” problems.

This is where GitLab’s global keywords make a difference. They allow you to define pipeline-wide defaults—ensuring every job starts with the same baseline configuration. Let’s break down why these keywords are so important and how they help maintain consistency across larger teams:

Inconsistent Runtime Versions

Example: Across a mid-sized engineering group, developers were unknowingly using a mix of Node.js versions—anything from 12 to 20. Some CI jobs failed purely because people copied old snippets from past repositories.

Multiple Dependency Install Methods

Example: Different developers preferred different commands: npm ci, npm install, and even pnpm install. Each approach produced different lockfile behavior and caching results, leading to unpredictable builds.

Mismatched Database Environments

Example: Local and CI test environments may run various Postgres versions—anywhere between 9.6 and 15—even though the production environment required a specific, newer release.


The Fix — The default Keyword

The default keyword lets you set shared configuration for all jobs in the pipeline unless a job overrides those settings. It’s declared once, at the top of the .gitlab-ci.yml, and instantly unifies behavior across the entire pipeline.

For example:

With this approach, every job starts with:

  • The same base container

  • A consistent database version

  • A unified dependency installation method

GitLab provides several default-level keywords—covering images, services, scripts, timeouts, artifacts, and more. The documentation lists all available options, but the principle is simple: set the standard once and let every job benefit from it.


GitLab CI/CD: Artifacts and Cache

Even though GitLab pipelines run inside fresh, isolated containers, real-world CI/CD almost always needs to share files, results, or dependencies so work isn’t repeated unnecessarily. GitLab provides two mechanisms for this: artifacts and cache. They sound similar but serve very different purposes.

Let’s break them down.


Artifacts — Deliverables That Move Through the Pipeline

Artifacts docs, Dependencies docs

Artifacts are files or directories that a job explicitly hands off to later stages.

Think of them as pipeline outputs: something a job produces that another job depends on.

What Artifacts Are Used For

Use artifacts when you want to transfer something forward in the pipeline, such as:

  • Compiled frontend bundles

  • Built binaries

  • Test reports / coverage reports

  • Generated documentation

  • Scan results

A job in a later stage can download and use these artifacts.

Important Behavior

  • Artifacts are available only to jobs in later stages, not parallel jobs in the same stage

  • They’re stored by GitLab and can be downloaded via the UI

  • They expire after a retention period unless you customize the duration with expire_in (set expire_in: never to keep them indefinitely)

Quick Example
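A sketch with illustrative job names:

```yaml
stages: [build, deploy]

build:
  stage: build
  script:
    - npm run build        # writes dist/
  artifacts:
    paths:
      - dist/

deploy:
  stage: deploy
  script:
    - ./deploy.sh dist/    # uses the artifact from the build job
```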

The dist/ folder created by the build job is passed to the deploy job.

Another example:
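A hedged sketch of keeping a test report around for a limited time:

```yaml
test:
  stage: test
  script:
    - npm test -- --coverage
  artifacts:
    when: always           # upload even if the tests fail
    expire_in: 1 week
    paths:
      - coverage/
```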


Cache — Speed Boosters for Repeated Work

Cache docs, Caching visualized, Official caching examples

Cache is designed to speed up jobs, not transfer deliverables.

While artifacts are about sharing, cache is about avoiding re-downloading or recomputing.

What Cache Is Used For

Cache shines with data that:

  • Takes a long time to download

  • Doesn’t need to be versioned

  • Can be reused across jobs and pipelines

Examples:

  • node_modules/

  • .m2/ Maven repository

  • Python venv/

  • Docker build layers

  • Large dependency folders

Important Behavior

  • Cache is typically shared across jobs and pipelines within the same project

  • Not intended for build outputs

  • Cache keys control when caches are reused or invalidated

  • A job restores the cache before running, and uploads it afterwards (if changed)

  • Not guaranteed: Caches can be cleared or evicted, so your pipeline should still work without them (just slower)

  • Upload timing: Cache is uploaded after script succeeds, so failed jobs don't update the cache

  • Pull/Push policies: You can control whether a job downloads, uploads, or both:

    • pull: the job only downloads the cache; it never uploads changes (for jobs that don't modify dependencies)

    • push: the job only uploads the cache; it never downloads it (for the job that installs dependencies)

    • pull-push: the default; the job downloads the cache at the start and uploads it at the end

Quick Example for per-job cache
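A sketch matching the explanation below (the image tag is illustrative):

```yaml
test:
  stage: test
  image: python:3.12
  variables:
    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.pip"
  cache:                   # no key specified, so the default key is used
    paths:
      - .pip/
      - venv/
  script:
    - python -m venv venv
    - source venv/bin/activate
    - pip install -r requirements.txt
    - pytest --cov --cov-report=html   # writes htmlcov/
  artifacts:
    paths:
      - htmlcov/index.html
```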

GitLab restores cached dependencies so tests start faster.

Cache (.pip/ and venv/) in the above example:

  • These directories are cached between pipeline runs

  • Next time this job runs, pip packages won't need to be re-downloaded from the internet

  • The virtual environment is preserved

  • No cache key specified = uses the default key (all jobs share this cache)

  • Makes subsequent pipeline runs faster

Artifacts (htmlcov/index.html) in the above example:
  • This is saved within the current pipeline run

  • The coverage report is preserved and displayed in GitLab's UI under the test report section

  • Available for download after the pipeline finishes

  • Not carried over to future pipeline runs

Another example comes with a predefined variable. CI_COMMIT_REF_SLUG is a GitLab predefined variable that contains a sanitized version of your branch or tag name, safe for use in URLs and file paths.
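For example, a per-branch cache:

```yaml
cache:
  key: $CI_COMMIT_REF_SLUG   # one cache per branch or tag
  paths:
    - node_modules/
```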

Another example with strategic caching:
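A sketch of a pipeline-wide cache declared under default::

```yaml
default:
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
```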

When you set cache under default:, you’re saying:

“All jobs in this pipeline should use this cache configuration unless they choose to override it.”

What this last example means in practice

  • Every job will restore this cache before execution

  • Every job will update this cache after completion

  • If a job doesn’t need Node.js, it will still unnecessarily cache node_modules/

  • Jobs may accidentally overwrite each other’s cache

⚠️ This can cause cache pollution, because a job that shouldn’t be touching cache might still rewrite it.

Cache sharing with global or branch scopes

Sharing Cache Across Jobs

Using the Same Cache Key

To share cache across jobs, use the same cache key in all jobs that need to access it:
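A sketch with illustrative job names and a hypothetical key:

```yaml
install:
  stage: build
  cache:
    key: deps-cache
    paths: [node_modules/]
  script: npm ci

lint:
  stage: test
  cache:
    key: deps-cache
    paths: [node_modules/]
  script: npm run lint

test:
  stage: test
  cache:
    key: deps-cache
    paths: [node_modules/]
  script: npm test
```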

All three jobs now share the same cache.


Sharing Across ALL Branches

Use a static key (no branch variables):
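For example:

```yaml
cache:
  key: one-global-cache    # static key, shared by every branch and job
  paths:
    - node_modules/
```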

Without ${CI_COMMIT_REF_SLUG} or other dynamic variables, every branch and every job uses the same cache.


Global Cache Configuration

Define cache globally so all jobs inherit it:
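A sketch of a top-level cache definition:

```yaml
# top level of .gitlab-ci.yml; every job inherits this cache
cache:
  key: $CI_COMMIT_REF_SLUG
  paths:
    - node_modules/
```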


Is Cache Automatically Applied?

Yes, but with conditions:

  1. Same key required: The job must specify the same cache key (or inherit it globally)

  2. Automatic download: GitLab automatically downloads and extracts the cache at the start of each job that requests it

  3. Automatic upload: Cache is automatically uploaded after the job's script section succeeds

  4. No explicit "restore" needed: You don't need to manually extract or apply it

Example flow:
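The flow annotated as comments on a single job (names are illustrative):

```yaml
test:
  cache:
    key: deps-cache
    paths:
      - node_modules/
  script:
    # 1. Before the script runs, GitLab downloads and extracts the
    #    cache stored under key "deps-cache" (if it exists).
    - npm ci               # 2. fast when node_modules/ was restored
    - npm test
    # 3. After the script succeeds, the cache is re-uploaded if it changed.
```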

Important notes:

Cache is not guaranteed: If the cache is cleared, evicted, or doesn't exist yet, the job still runs (just slower). Always ensure your jobs can work without cache.

First run has no cache: The first time a pipeline runs with a new key, there's no cache to download. Subsequent runs will have it.

Policy control: You can optimize by having only one job push to cache:
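A sketch where one job pushes and the rest only pull:

```yaml
install:
  stage: build
  cache:
    key: deps-cache
    paths: [node_modules/]
    policy: push           # only uploads the cache
  script: npm ci

test:
  stage: test
  cache:
    key: deps-cache
    paths: [node_modules/]
    policy: pull           # only downloads, never uploads
  script: npm test
```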

This prevents multiple jobs from trying to update the same cache simultaneously, which could cause conflicts.


Artifacts vs Cache — The Core Difference

  • Primary purpose: artifacts pass files to later stages; cache speeds up jobs by reusing data

  • Scope: artifacts live within the same pipeline; cache is shared across pipelines

  • Typical contents: artifacts hold build outputs and reports; cache holds dependencies and package caches

  • Persistence: artifacts are saved by GitLab and viewable in the UI; cache can be overwritten frequently

  • Availability: artifacts are available only to future stages; cache is available to any job that uses the same cache key

  • Expiration: artifacts expire by default (configurable); cache is not stored forever


How to Decide Which One You Need

Use artifacts when:

  • Another stage needs the exact output of a job

  • You want downloadable files in GitLab

  • You're producing build or test results

Use cache when:

  • You want to avoid reinstalling dependencies

  • You're optimizing frequent repetitive work

  • The data can be safely regenerated


In One Sentence

  • Artifacts = hand-off packages between stages.

  • Cache = reusable stash for speeding up work.


🚀 Pipelines in GitLab CI/CD

Variables docs, variable precedence docs

Pipelines are the backbone of GitLab’s automation system. They define what happens after you push code—building, testing, scanning, deploying, and everything in between. But as your project grows, so does pipeline complexity. Small configuration changes can snowball into hours of maintenance unless you design pipelines in a scalable, DRY, and predictable way.

One of the key tools GitLab gives you for that is variables.


🔧 Why Pipelines Need Variables

Imagine a team maintaining several GitLab pipeline files across multiple services. One morning, infrastructure updates the internal API endpoint:

“New internal API URL: services.internal.example.net”

That should have been a simple update… but instead, the team spends half a day searching for the old URL across multiple YAML files scattered across microservice repositories.

Later that day, the integration pipeline fails — because one job in one file still references the old URL.

The problem?

Hard-coded values buried deep in different pipeline definitions.


💡 Variables Fix This

With GitLab CI/CD variables, you replace repeated values with a single source of truth.

Before (hard-coded everywhere):
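A hedged illustration; the old hostname is hypothetical:

```yaml
integration-test:
  script:
    - curl "https://old-api.internal.example.net/health"   # hard-coded in many YAML files
```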

After (one change updates everything):
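The same jobs refactored around a single variable (names are illustrative):

```yaml
variables:
  API_URL: "https://services.internal.example.net"

test-integration:
  script:
    - curl "$API_URL/health"

deploy-staging:
  script:
    - ./deploy.sh --api-url "$API_URL"
```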

Change the variable once → every job automatically uses the updated value.


🧩 Types of CI/CD Variables

GitLab provides several kinds of variables, each with different use cases and scopes.


1. Predefined Variables

GitLab injects these automatically into every pipeline.

Examples:

  • CI_COMMIT_SHA — the commit’s full SHA

  • CI_COMMIT_REF_NAME — the branch or tag name

  • CI_PIPELINE_SOURCE — whether it was triggered by push, merge request, schedule, etc.

These are ideal for tagging images, tracking builds, and making pipelines dynamic.


2. Custom Variables

You define your own values—great for anything that:

  • changes between environments

  • appears multiple times

  • should be controlled from a single location

  • contains secrets (when masked/protected)

Examples:

  • URLs and API endpoints

  • Docker registry addresses

  • Feature flags

  • Version strings (e.g., NODE_VERSION, TERRAFORM_VERSION)

Where can you define them?

  • In the .gitlab-ci.yml

  • In GitLab’s UI (project/group/instance level)

  • At runtime (manual pipeline triggers)

  • In child pipelines

  • In components or includes


🧠 Variable Precedence (Why It Matters)

GitLab allows the same variable name to appear in multiple places. But which one wins?

For example, if API_URL is defined:

  • in the GitLab UI

  • in the .gitlab-ci.yml

  • inside a job

  • in a child pipeline

  • as a secret variable

GitLab has strict precedence rules to determine the final value. Higher-priority variables override lower ones.

Understanding precedence prevents extremely tricky bugs—like pipelines working in one branch but breaking in another because a variable value was overridden unintentionally.

(You can always check GitLab’s full precedence documentation when designing critical pipelines.)

Predefined variables

Common predefined variables

1. CI_COMMIT_SHA

  • What it is: The full commit hash of the current pipeline’s commit

  • Use case: Tagging builds or container images

  • Example:
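For instance, tagging a container image (registry path illustrative):

```yaml
build-image:
  script:
    - docker build -t registry.example.com/myapp:$CI_COMMIT_SHA .
    - docker push registry.example.com/myapp:$CI_COMMIT_SHA
```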


2. CI_COMMIT_SHORT_SHA

  • What it is: Shortened version of the commit SHA (usually first 8 characters)

  • Use case: Labeling artifacts or build folders for easier reference

  • Example:
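For instance, naming a build folder (paths illustrative):

```yaml
package:
  script:
    - mkdir -p builds/$CI_COMMIT_SHORT_SHA
    - cp app.tar.gz builds/$CI_COMMIT_SHORT_SHA/
  artifacts:
    paths:
      - builds/
```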


3. CI_COMMIT_REF_NAME

  • What it is: Name of the branch or tag for the current commit

  • Use case: Conditional deployments or environment routing

  • Example:
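For instance, deploying only from main (job and script names illustrative):

```yaml
deploy:
  script: ./deploy.sh
  rules:
    - if: $CI_COMMIT_REF_NAME == "main"
```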


4. CI_COMMIT_MESSAGE

  • What it is: The commit message that triggered the pipeline

  • Use case: Include context in logs, notifications, or deployment messages

  • Example:
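For instance, echoing context into a deployment log (sketch):

```yaml
notify:
  script:
    - echo "Deploying commit: $CI_COMMIT_MESSAGE"
```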


5. CI_PIPELINE_SOURCE

  • What it is: How the pipeline was triggered (push, schedule, merge_request, manual, etc.)

  • Use case: Run certain jobs only on scheduled or manual pipelines

  • Example:
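For instance, running a cleanup job only on scheduled pipelines (script name illustrative):

```yaml
nightly-cleanup:
  script: ./cleanup.sh
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
```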


6. CI_DEFAULT_BRANCH

  • What it is: The project’s default branch, usually main or master

  • Use case: Conditional logic for jobs that should run only on the default branch

  • Example:
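For instance, comparing the current ref to the default branch (sketch):

```yaml
release:
  script: ./release.sh
  rules:
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
```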


7. CI_JOB_NAME

  • What it is: The name of the job currently running

  • Use case: Customize behavior, logging, or artifact naming per job

  • Example:
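For instance, naming an artifact after the job (sketch):

```yaml
build:
  script:
    - tar -czf "$CI_JOB_NAME-output.tar.gz" dist/
  artifacts:
    paths:
      - "*.tar.gz"
```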


Practical Patterns

  • Scheduled jobs: $CI_PIPELINE_SOURCE == "schedule" → run overnight tasks

  • Tagged deployments: $CI_COMMIT_TAG → deploy only tagged releases

  • Unique builds: $CI_COMMIT_SHORT_SHA → name builds and artifacts uniquely

These predefined variables give pipelines dynamic behavior without hard-coding values, making your CI/CD more maintainable and safe.

Custom variables in GitLab CI/CD

Custom variables let you manage project-specific or environment-specific configuration. Unlike predefined variables, they are created and maintained by your team. They are ideal for:

  • API endpoints

  • Database credentials

  • Feature flags

  • Version numbers

  • Deployment tokens

Custom variables help avoid hard-coding values in .gitlab-ci.yml and make pipelines more maintainable.


Where Custom Variables Are Defined

| Scope | Example Use |
| --- | --- |
| Pipeline-level (in .gitlab-ci.yml) | Non-sensitive values like build flags or version numbers, visible to everyone who can view the repo |
| Project-level (GitLab UI) | Sensitive variables like deployment tokens or API keys; only authorized users can see or modify them |
| Group-level (GitLab UI) | Shared values across multiple projects, e.g., company-wide Docker registry URLs or common deployment targets |


Example Usage

1. Pipeline-level variable

  • Visible in code

  • Used for build options, feature flags, or versions
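A minimal sketch (variable names and values are illustrative):

```yaml
variables:
  NODE_VERSION: "20"
  BUILD_FLAGS: "--production"

build:
  image: node:$NODE_VERSION
  script:
    - npm run build -- $BUILD_FLAGS
```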


2. Project-level variable

  • Stored securely in Project Settings

  • Only authorized users can view/modify

  • Ideal for secrets like deployment tokens or API keys
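Referencing a UI-defined variable in a job — here DEPLOY_TOKEN and the endpoint are assumptions, with the token defined under Settings → CI/CD → Variables:

```yaml
deploy:
  script:
    # DEPLOY_TOKEN never appears in the repo; GitLab injects it at runtime
    - curl --header "Authorization: Bearer $DEPLOY_TOKEN" "https://deploy.example.com/api/release"
```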


3. Group-level variable

  • Shared across all projects in the group

  • Great for common registry URLs, environment URLs, or company-wide configurations
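Any project in the group can reference it — here DOCKER_REGISTRY is assumed to be a group-level variable:

```yaml
build-image:
  script:
    - docker build -t "$DOCKER_REGISTRY/myapp:latest" .
    - docker push "$DOCKER_REGISTRY/myapp:latest"
```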


Custom Variable Best Practices

  1. Avoid hard-coding

    • Hard-coded values must be updated in multiple jobs and files when things change

    • Using custom variables centralizes updates

  2. Environment scope

    • Restrict a variable to a specific environment (e.g., production)

    • Prevents accidental use in other environments

  3. Protected variables

    • Only available on protected branches (e.g., main/master)

    • Prevents accidental exposure in feature branches

  4. Masked variables

    • Hides the value in job logs

    • Job scripts can still use the variable safely

    • Prevents secrets from being accidentally printed

  5. Masked + hidden

    • In addition to masking, the variable value is not visible in the UI

    • Can only be set when creating a new variable, not after creation

Real-world Scenario: API Migration

Imagine a team’s API provider changes all endpoints to a new domain.

Approaches:

  1. Hard-coded URLs

    • Must update every job manually (build, test, deploy-dev, deploy-prod, integration tests)

    • Error-prone, tedious

  2. Custom variables

    • Update the variable once → all jobs automatically use the new endpoint

    • Reduces maintenance, lowers risk of deployment failures

  3. Environment-specific if statements

    • Can control which endpoint to use per environment

    • More flexible than hard-coding, but less centralized than variables

Takeaway: Custom variables make pipelines more maintainable, safer, and easier to update.


Rules

Quick reference for Rules

Rules control when jobs run based on conditions you define. They're the primary way to make your pipelines dynamic and efficient.


Basic Structure

The logic:

  • Define a rules: block

  • Add one or more conditions (if, changes, exists)

  • Specify what happens when the condition is true (when)
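Put together, a minimal sketch (job name illustrative):

```yaml
test:
  script: ./run-tests.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: on_success
```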


The if Clause

Use if to check variables (including GitLab's predefined variables):

Common conditions:
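A few frequently used if conditions (one rule per line):

```yaml
rules:
  - if: $CI_COMMIT_BRANCH == "main"                    # on the main branch
  - if: $CI_PIPELINE_SOURCE == "merge_request_event"   # in a merge request pipeline
  - if: $CI_COMMIT_TAG                                 # the commit is tagged
```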


The changes Clause

Run jobs only when specific files change - great for performance!
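For example (paths illustrative):

```yaml
build-frontend:
  script: npm run build
  rules:
    - changes:
        - frontend/**/*
```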

This prevents unnecessary work. If you only changed a README, why rebuild the entire application?


The exists Clause

Run jobs only if certain files exist:
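For example:

```yaml
docker-build:
  script: docker build -t myapp .
  rules:
    - exists:
        - Dockerfile
```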


The when Keyword

Controls what happens when a rule matches:

Options:

  • on_success (default) - Run the job if all previous stages succeeded

  • always - Run the job regardless of previous job status

  • never - Don't run the job

  • manual - Job requires manual approval in the UI

  • delayed - Wait before running (with start_in)

Examples
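A sketch combining several when values (job names illustrative):

```yaml
deploy-prod:
  script: ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual        # wait for approval in the UI
    - when: never         # otherwise, don't run at all

cleanup:
  script: ./cleanup.sh
  when: always            # run even if earlier jobs failed
```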


Multiple Rules (Evaluation Order)

Rules are evaluated top to bottom. The first match wins:
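A sketch (job name illustrative):

```yaml
deploy:
  script: ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: on_success
    - if: $CI_COMMIT_BRANCH == "staging"
      when: manual
    - when: never
```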

Logic:

  1. If branch is main → always deploy

  2. Else if branch is staging → require manual approval

  3. Otherwise → never run


Combining Conditions
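A single rule can combine if and changes; both must match for the rule to apply (paths illustrative):

```yaml
build-docs:
  script: ./build-docs.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      changes:
        - docs/**/*
```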


Practical Real-World Example
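A sketch pulling the pieces together (job and file names illustrative):

```yaml
test:
  script: npm test
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - src/**/*
        - package.json
    - if: $CI_COMMIT_BRANCH == "main"

deploy:
  script: ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
    - when: never
```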


Why Use Rules?

Save time - Don't run unnecessary jobs

Save resources - Fewer compute minutes = lower costs

Faster feedback - Developers get results quicker

Control deployment - Prevent accidental production deployments

Improve efficiency - Only test what changed

More examples of using rules

These examples add a few important concepts:


1. Deploy Only from Main - The "Catch-All" Pattern
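A sketch of the pattern (job name illustrative):

```yaml
deploy:
  script: ./deploy.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - when: never   # catch-all: block the job in every other case
```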

Key insight: The final when: never acts as a default/fallback. If no previous rule matches, the job won't run. This is a common pattern to explicitly block a job unless conditions are met.


2. Merge Request Trigger
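A sketch (job name illustrative):

```yaml
mr-checks:
  script: ./validate.sh
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
```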

Key insight: The CI_PIPELINE_SOURCE variable identifies how the pipeline was triggered:

  • "merge_request_event" - From a merge request

  • "web" - From the GitLab web UI

  • "push" - From a git push

  • "schedule" - From a scheduled pipeline

  • "api" - From API call

This is crucial for running jobs only in specific contexts (e.g., run extra validation only on MRs).


3. Workflow Rules - Pipeline-Level Control

Major new concept: workflow: applies rules to the entire pipeline, not just individual jobs.
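A sketch implementing the behavior described here (note the regex matches against $CI_COMMIT_TITLE, the first line of the commit message, which avoids trailing-newline surprises):

```yaml
workflow:
  rules:
    - if: $CI_COMMIT_TITLE =~ /-wip$/
      when: never          # skip work-in-progress commits
    - if: $CI_COMMIT_TAG
      when: never          # skip pipelines for tags
    - when: always         # otherwise, run the pipeline
```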

What this does:

  • If commit message ends with -wip → entire pipeline is blocked

  • If triggered by a tag → entire pipeline is blocked

  • Otherwise → pipeline runs

Use cases:

  • Skip pipelines for work-in-progress commits

  • Prevent pipelines on tag creation

  • Only run pipelines on specific branches globally


Optimizing Job Run Order

Docs: Controlling how jobs run, Pipeline efficiency

There are two main strategies for optimizing pipeline performance: changing job execution order and parallelizing slow jobs.


1. The needs Keyword - Optimize Job Order

By default, GitLab runs jobs in stages sequentially - all jobs in one stage must complete before the next stage starts:

Problem: Even if test only needs build to finish, it must wait for every job in the build stage.

Solution: Use needs
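A sketch (job names illustrative):

```yaml
stages: [build, test]

build-job:
  stage: build
  script: ./build.sh

lint-job:
  stage: build
  script: ./lint.sh

test-job:
  stage: test
  needs: [build-job]   # starts when build-job finishes, without waiting for lint-job
  script: ./test.sh
```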

Result: Jobs start as soon as their specific dependencies finish, not when the entire stage completes.

More Complex Example
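A sketch with two independent build/test tracks (names illustrative):

```yaml
build-backend:
  stage: build
  script: ./build-backend.sh

build-frontend:
  stage: build
  script: ./build-frontend.sh

test-backend:
  stage: test
  needs: [build-backend]    # doesn't wait for build-frontend
  script: ./test-backend.sh

test-frontend:
  stage: test
  needs: [build-frontend]   # doesn't wait for build-backend
  script: ./test-frontend.sh
```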

Without needs: test-backend waits for both backend AND frontend to build.

With needs: test-backend starts as soon as build-backend finishes, even if build-frontend is still running.


2. The parallel Keyword - Speed Up Individual Jobs

When a single job is slow because it has too much work, split it across multiple runners:
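A minimal sketch:

```yaml
test:
  script: ./run-tests.sh
  parallel: 4   # GitLab creates 4 copies of this job
```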

How It Works

GitLab creates 4 identical copies of the job (each can land on a different runner) and provides special variables:

  • $CI_NODE_TOTAL = 4 (total number of parallel jobs)

  • $CI_NODE_INDEX = 1, 2, 3, or 4 (which parallel job this is)

Your test framework uses these to split the work:
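A generic sketch — the --shard flag here is a placeholder for whatever splitting mechanism your framework actually provides:

```yaml
test:
  parallel: 4
  script:
    - ./run-tests.sh --shard "$CI_NODE_INDEX/$CI_NODE_TOTAL"
```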

Each parallel job executes 1/4 of the tests simultaneously.

Real-World Scenario

Before (100 tests on 1 runner):

  • Time: 400 seconds

  • CPU: maxed out

After (100 tests on 4 runners):

  • Time: ~100 seconds (4x faster)

  • CPU per runner: ~25% each

  • Total job time: significantly reduced


Parallel Examples for Different Tools

Jest (JavaScript)

pytest (Python)

RSpec (Ruby)

Manual splitting
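Hedged sketches per tool — each assumes the named tool's splitting support: Jest's --shard flag ships with Jest 28+, pytest needs the pytest-split plugin, the RSpec example uses the Knapsack Pro gem, and the manual version divides a file list by index:

```yaml
# Jest (JavaScript): built-in sharding (Jest 28+)
test-jest:
  parallel: 4
  script:
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# pytest (Python): requires the pytest-split plugin
test-pytest:
  parallel: 4
  script:
    - pytest --splits $CI_NODE_TOTAL --group $CI_NODE_INDEX

# RSpec (Ruby): Knapsack Pro reads CI_NODE_* automatically on GitLab
test-rspec:
  parallel: 4
  script:
    - bundle exec rake knapsack_pro:rspec

# Manual splitting: divide the test-file list yourself
test-manual:
  parallel: 4
  script:
    - FILES=$(find tests -name '*_test.sh' | sort | awk "NR % $CI_NODE_TOTAL == $CI_NODE_INDEX - 1")
    - for f in $FILES; do bash "$f"; done
```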


Combining needs and parallel

You can use both strategies together for maximum optimization:
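A sketch combining the two (the --shard flag is a placeholder for your framework's splitting mechanism):

```yaml
build:
  stage: build
  script: ./build.sh

test:
  stage: test
  needs: [build]       # start right after build, skip waiting for the stage
  parallel: 4          # split the suite across 4 parallel jobs
  script: ./run-tests.sh --shard "$CI_NODE_INDEX/$CI_NODE_TOTAL"

deploy:
  stage: deploy
  needs: [test]        # start as soon as all test shards finish
  script: ./deploy.sh
```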

Benefits:

  • Tests start immediately after build (don't wait for each other)

  • Each test suite runs in parallel (faster execution)

  • Deploy starts as soon as last test finishes


Parallel Matrix Strategy

You can also use parallel:matrix to run the same job with different configurations:
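A sketch (version and database values illustrative):

```yaml
test:
  parallel:
    matrix:
      - RUBY_VERSION: ["3.0", "3.1", "3.2"]
        DATABASE: ["postgres", "mysql"]
  image: ruby:$RUBY_VERSION
  script: ./test.sh --db "$DATABASE"
```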

This creates 6 jobs (3 Ruby versions × 2 databases) that run simultaneously.


Key Takeaways

needs keyword:

  • ✅ Optimizes job order

  • ✅ Jobs start as soon as dependencies finish

  • ✅ Reduces total pipeline time

parallel keyword:

  • ✅ Optimizes individual job runtime

  • ✅ Distributes work across multiple runners

  • ✅ Reduces CPU bottlenecks

  • ✅ Requires test framework support for sharding

Best practice: Use both together for maximum speed!


Managing Complexity in Gitlab CI/CD

As projects grow, CI/CD configurations can become unwieldy. GitLab provides several pipeline types to manage this complexity.


The Problem

Large projects face:

  • Huge config files (hundreds of lines in .gitlab-ci.yml)

  • Distributed teams wanting control over their own configuration

  • Unnecessary pipeline runs for commits that don't need CI


Solution 1: Parent-Child Pipelines

Split large configurations into smaller, manageable files within the same repository.

Example
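A sketch of a parent config that triggers per-area child pipelines (paths illustrative):

```yaml
# .gitlab-ci.yml (parent)
trigger-frontend:
  trigger:
    include: frontend/.gitlab-ci.yml   # child pipeline config in the same repo
  rules:
    - changes:
        - frontend/**/*

trigger-backend:
  trigger:
    include: backend/.gitlab-ci.yml
  rules:
    - changes:
        - backend/**/*
```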

How It Works

  1. Parent pipeline (main .gitlab-ci.yml) detects changes

  2. Only triggers child pipeline for the affected area

  3. Child pipeline runs independently with its own configuration

Benefits

Modularity - Each team manages their own .gitlab-ci.yml

Performance - Only relevant pipelines run (frontend changes don't trigger backend tests)

Reduced complexity - Smaller, focused configuration files

Parallel execution - Child pipelines run concurrently

Easier to understand - Each file contains only relevant jobs


Solution 2: Multi-Project Pipelines

Trigger pipelines in different repositories - useful for microservices or split codebases.

Example
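A sketch of a trigger job that starts a pipeline in another repository (project path and change paths illustrative):

```yaml
# in the main application's .gitlab-ci.yml
trigger-payment-tests:
  trigger:
    project: payments-team/payment-service   # full path to the downstream project
    branch: main
  rules:
    - changes:
        - payments/**/*
```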

Real-World Scenario

Your e-commerce site has:

  • Main application in ecommerce/main-app

  • Payment service in payments-team/payment-service

  • Shipping service in logistics/shipping-service

When you change payment-related code in the main app, it automatically triggers tests in the payment service repository to ensure compatibility.

Benefits

Cross-repo coordination - Test dependencies across repositories

Microservices architecture - Each service has its own repo and CI

Team independence - Payment team controls their pipeline

Integration testing - Verify services work together


Solution 3: Merge Request Pipelines

Run different jobs for merge requests vs. regular branch pushes.

Example
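A minimal sketch: run tests on MR pipelines and on main, nothing else (names illustrative):

```yaml
unit-tests:
  script: npm test
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"
```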

Common Patterns

Skip deployments in MRs:

Run extra checks only on MRs:

Different behavior for MRs vs. main:
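Sketches of the three patterns (job names and scripts illustrative):

```yaml
# Skip deployments in MRs
deploy-staging:
  script: ./deploy.sh staging
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: never
    - if: $CI_COMMIT_BRANCH == "main"

# Run extra checks only on MRs
lint:
  script: npm run lint
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

# Different behavior for MRs vs. main
test:
  script:
    - |
      if [ "$CI_PIPELINE_SOURCE" = "merge_request_event" ]; then
        npm run test:quick
      else
        npm run test:full
      fi
```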

Benefits

Faster feedback - Developers get quick results on MRs

Save resources - Don't deploy to staging for every MR

Targeted testing - Run different tests in different contexts

Cost optimization - Skip expensive jobs when not needed


Combining All Three

Real-world complex setup:
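One way the three types can combine (all names and paths illustrative):

```yaml
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"   # MR pipelines
    - if: $CI_COMMIT_BRANCH == "main"                    # and main-branch pipelines

frontend:
  trigger:
    include: frontend/.gitlab-ci.yml      # parent-child: child pipeline in the same repo
  rules:
    - changes:
        - frontend/**/*

payment-integration:
  trigger:
    project: payments-team/payment-service   # multi-project: pipeline in another repo
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - payments/**/*
```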


Pipeline Type Comparison

| Type | Use Case | Scope |
| --- | --- | --- |
| Basic | Simple projects | Single file, sequential stages |
| With needs | Optimize dependencies | Single file, parallel execution |
| Parent-Child | Large monorepos, team separation | Same repo, multiple config files |
| Multi-Project | Microservices, cross-repo dependencies | Different repositories |
| Merge Request | Different behavior for MRs vs branches | Context-aware execution |


Key Takeaways

Parent-Child Pipelines:

  • Break up large configs

  • Team ownership of their pipeline

  • Only run what changed

Multi-Project Pipelines:

  • Coordinate across repositories

  • Test microservice integrations

  • Maintain service independence

Merge Request Pipelines:

  • Faster developer feedback

  • Skip unnecessary jobs

  • Context-specific testing

Best Practice: Use the simplest pipeline type that solves your problem. Start simple, add complexity only when needed.


Gitlab Registries

Course to learn about Package, Container, and Terraform Registries


Docker in Docker

