The fun task of getting the results of Jenkins builds back into GitLab.

Today I had the fun task of getting the results of Jenkins builds back into GitLab.
This post hopefully describes some of the errors I saw (or didn’t see) and the multiple steps that were not clear or not noted in documentation or on the internetz.

So following the documentation here (https://github.com/jenkinsci/gitlab-plugin) I went through the normal steps of triggering builds from GitLab to Jenkins, in particular:

In Jenkins

* Create the Project
* Configure the SCM (Source Code Management) checkout of the GitLab repo as per normal for Jenkins (I’ll also add a pipeline example at the bottom)
* This usually involves adding a ‘deploy’ SSH key to the GitLab project, or however you have SSH keys configured in your Jenkins
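(If you need a fresh deploy key, something along these lines works - the key path and comment are just examples:)

ssh-keygen -t ed25519 -N "" -C "jenkins-deploy" -f ~/.ssh/jenkins_deploy_key
# Private key -> a Jenkins ‘SSH Username with private key’ credential
# Public key (~/.ssh/jenkins_deploy_key.pub) -> GitLab project > Settings > Repository > Deploy Keys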

In the Jenkins project tick ‘Build when a change is pushed to GitLab’

* Click Advanced > ‘Secret Token’ > ‘Generate’
* Note the ‘GitLab webhook URL’

In GitLab, for the project, go to: Settings > Integrations

* Input the GitLab webhook URL in the ‘URL’ box
* Input the ‘Secret Token’
* Trigger on ‘Push Events’
* Click ‘Add webhook’
* Then Test it
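(If the built-in test is being unhelpful, you can also poke the hook by hand - the URL and token below are placeholders for the values from the steps above. The skeleton payload may well be rejected as incomplete, but a 200 vs 403 response still tells you whether the secret token is being accepted:)

curl -X POST "https://jenkins.example.com/project/your-project" \
     -H "X-Gitlab-Token: your-secret-token" \
     -H "X-Gitlab-Event: Push Hook" \
     -H "Content-Type: application/json" \
     -d '{"object_kind": "push", "ref": "refs/heads/master", "commits": []}'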

If you’re following along with the doco - you will notice we are using the “Configuring per-project authentication” method.
That completes our “GitLab-to-Jenkins authentication”.

The next bit will be where I experienced the most issues, and hence this blog post - “Jenkins-to-GitLab authentication”.

Now the documentation is correct - and I will repeat it here with some additional verbiage.

1. Create a new (Jenkins) user in GitLab

* I won’t go into details here, but set the password to something long and random and forget the password, as no-one should log in as this user.

2. Give this user ‘Developer’ permissions on each repo you want Jenkins to send build status to

* Yes, BUT: with the default GitLab configuration, if you yolo your Git and always commit straight to the master branch, then ‘Developer’ cannot push (or post build statuses) directly to master, because master is a protected branch by default.
Failing to fix this results in the following errors in your Jenkins log:
“c.d.g.util.CommitStatusUpdater#updateCommitStatus: Failed to update Gitlab commit status for project ‘Your-Project-Name’”
“javax.ws.rs.ClientErrorException: HTTP 403 Forbidden”, followed by more Java stack trace.

* To resolve this error, in GitLab go to your Project > Settings > Repository (Settings) > Protected Branches (Expand)
From there either “Unprotect” the master branch or change the permissions to something better suited.
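(The same change can be made via the GitLab API if you prefer - the host, token and project ID are placeholders, and 30 is the ‘Developer’ access level:)

# Unprotect master entirely
curl -X DELETE -H "PRIVATE-TOKEN: your-admin-token" \
     "https://your.gitlab.server/api/v4/projects/<project-id>/protected_branches/master"

# Or re-protect it while allowing Developers to push
curl -X POST -H "PRIVATE-TOKEN: your-admin-token" \
     "https://your.gitlab.server/api/v4/projects/<project-id>/protected_branches?name=master&push_access_level=30"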

3. Log in as (or ‘Impersonate’) that (Jenkins) user in GitLab

* Click the user’s icon/avatar and choose Settings
* Click on ‘Impersonation Tokens’
* Create a token named e.g. ‘jenkins’ with ‘api’ scope; expiration is optional
* Copy/note the token immediately, it cannot be accessed again after you leave this page
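(A quick sanity check that the token works before touching Jenkins - the host and token are placeholders:)

curl -H "PRIVATE-TOKEN: your-impersonation-token" "https://your.gitlab.server/api/v4/user"
# Should return a JSON blob describing the jenkins user, not a 401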

4. On the Configure System (Manage Jenkins > Configure System) page in Jenkins, in the GitLab configuration section:

* Supply the ‘Connection Name’ - I recommend using underscores instead of spaces, especially if you have more than one connection
* Supply the GitLab host URL, e.g. https://your.gitlab.server
* Click the ‘Add’ button to add a credential, and choose ‘GitLab API token’ as the kind of credential
* For Scope select ‘Global (Jenkins, nodes, items, all child items etc)’
* If you select ‘System (Jenkins and nodes only)’ you will get the following error in your jobs: “Can’t submit build status: No GitLab connection configured”
* Paste your GitLab user’s API token (the impersonation token from step 3) into the ‘API token’ field
* I also recommend setting a human friendly ‘ID’ like jenkins-api-user-at-gitlab to use in your pipeline, especially if you have more than one connection
* Click ‘Add’ to save the credentials
* Click the ‘Test Connection’ button, it should succeed

5. Finally, scroll to the bottom of the page and click ‘Save’.

It should be noted that when re-visiting the ‘Configure System’ page, it’s common for the GitLab section to report “API Token for Gitlab access required” - this doesn’t seem to matter, as the credentials work fine.

That’s the authentication sorted both ways, along with some errors I encountered and resolved. All that’s left to do is trigger your job - either via a git commit or via Jenkins - and check for the update in GitLab!

As a bonus to anyone who got this far, here is a ‘Jenkinsfile’ that goes through a simple Docker container build and also triggers a script inside the container.
You will have to excuse the mess around dockerImage.withRun - as it’s rather difficult to get credentials into the container without committing them to the image etc.

#!groovy

def errorFriendly = "My Log Collector"
def imageName = "unittest"

def registryNamespace = "sysadmin"
// Our custom docker registry
def dockerRegistry = "reg.acme.com"
def dockerImageName = "${dockerRegistry}/${registryNamespace}/${imageName}"

def slackSendMessage(color, message) {
  slackSend channel: 'cicd-notifs',
  tokenCredentialId: 'slack-notifications-token',
  baseUrl: 'https://acme.slack.com/services/hooks/jenkins-ci/',
  color: color,
  message: message
}

node("sysadmin") {

  stage('GIT Clone DockerFile Config') {
    try {
      gitlabBuilds(builds: ['build_container', 'test_ansible_playbook']) {
      // Do nothing but notify GitLab about expected builds so that it sets them as pending
      }
      git branch: 'master',
          credentialsId: 'jenkins-gitlab-ssh-key',
              url: 'git@git.acme.com:SysAdmin/my-log-collector.git'
      sh "ls -lat"
      sh "pwd"
    } catch (Exception ex) {
        println("Unable to git clone: ${ex}")
        slackSendMessage("#FF0000", "Stage 1/3 - Git clone failed for ${errorFriendly}")
        error 'Git Clone failure'
    }
  }

  stage ('Build Docker Image') {
    try {
      updateGitlabCommitStatus name: 'build_container', state: 'running'
      dir("./${imageName}") {
        docker.withRegistry("https://${dockerRegistry}", 'sysadmin_jenkins_docker_reg') {
          def customImage = docker.build("${dockerImageName}:${env.BUILD_ID}")
        }
      }
      updateGitlabCommitStatus name: 'build_container', state: 'success'
    } catch (Exception ex) {
        println("Unable build image: ${ex}")
        slackSendMessage("#FF0000", "Stage 2/3 - Docker Container build failed for ${errorFriendly}")
        updateGitlabCommitStatus name: 'build_container', state: 'failed'
        error 'Docker Build failure'
    }
  }

  // Dont bother pushing the image as we are just testing a script and checking for results

  stage('Clone and Execute the playbook') {
    try {
        updateGitlabCommitStatus name: 'test_ansible_playbook', state: 'running'
        withCredentials(bindings: [sshUserPrivateKey(credentialsId: 'jenkins-gitlab-ssh-key', 
            keyFileVariable: 'myLogKey', 
            passphraseVariable: '', 
            usernameVariable: '')]) {
          // Damn Jenkins Docker devs - Cant stuff just work as advertised ie. .inside()!!?!!!
          docker.image("${dockerImageName}:${env.BUILD_ID}").withRun('-t -u root --entrypoint=cat -v "/srv/jenkins/workspace/$JOB_NAME@tmp:/srv/jenkins/workspace/$JOB_NAME@tmp" -e myLogKey=$myLogKey' ) { c ->
          sh "docker cp ./${imageName}/docker-entry.sh ${c.id}:/docker-entry.sh"
          sh "docker exec ${c.id} /docker-entry.sh"
          }
        }
        updateGitlabCommitStatus name: 'test_ansible_playbook', state: 'success'
    } catch (Exception ex) {
        println("Ansible-Playbook failure")
        slackSendMessage("#FF0000", "Stage 3/3 - UnitTest failed for ${errorFriendly}")
        updateGitlabCommitStatus name: 'test_ansible_playbook', state: 'failed'
        error 'Ansible-Playbook failure'
    }
  }
}

Mount image file under Linux

Sometimes you just *need* to mount an image file under Linux (ie. forensics and/or data recovery).
This isn’t always easy if you DD’d the whole disk, as you then need to work out the partition maths.

The easiest way is to ‘fdisk -l’ the image file:

root@HackerBox:~/forensics# fdisk -l /mnt/temp/ewf1
Disk /mnt/temp/ewf1: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x39bf39be

Device           Boot Start      End  Sectors Size Id Type
/mnt/temp/ewf1p1 *       63 20948759 20948697  10G  7 HPFS/NTFS/exFAT

From the above you should see ‘Sectors’ are 512 bytes (pretty normal for NTFS) and the partition starts at 63 (sectors) in.

So finally all we need to do is mount the image with the command:

mount /mnt/temp/ewf1 /mnt/temp1 -o ro,loop,show_sys_files,streams_interface=windows,offset=$((63*512))

You can possibly leave out the ‘show_sys_files,streams_interface=windows’ parameters if you aren’t doing forensics.
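(Alternatively, if your util-linux is new enough, losetup can do the partition maths for you - the --partscan flag creates loop partition devices you can mount directly. A rough sketch, device names will vary:)

losetup --find --show --partscan /mnt/temp/ewf1   # prints e.g. /dev/loop0
mount -o ro /dev/loop0p1 /mnt/temp1               # the partition shows up as loop0p1
losetup -d /dev/loop0                             # detach when finished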

Resize Linux partition while online

Previously I always had to fumble around with fdisk, doing the dangerous task of deleting the partition entry while in use, then re-adding it with the new, extended settings!
Too difficult and too danger prone (although it never failed on me…)

Anyway I just stumbled on a way easier process! Introducing ‘growpart’.
The process for resizing an EXT4 partition, ‘/dev/xvda1’, would be:

Backup the partition table (just in case)

sfdisk -d /dev/xvda > partition_backup

Resize the partition

growpart -v /dev/xvda 1

And finally resize the filesystem to make use of the larger partition

resize2fs /dev/xvda1
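(And should the resize go sideways, that dump from the first step can in theory be written back with sfdisk - a just-in-case note, I haven’t needed it:)

sfdisk /dev/xvda < partition_backup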

Happy resizing.

Note: a quick investigation suggests ‘growpart’ comes from the ‘cloud-utils-growpart’ RPM (‘cloud-guest-utils’ on Debian/Ubuntu), which ships on cloud images such as Amazon AWS hosted systems. So availability on a regular install may vary.
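(If it isn’t already installed, it should just be a package away - package names as assumed above:)

yum install cloud-utils-growpart    # RHEL / CentOS / Amazon Linux
apt-get install cloud-guest-utils   # Debian / Ubuntu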

Elasticsearch: search_context_missing_exception - No search context found for id

Consolidating daily logstash indexes into monthly logstash indexes sometimes results in the following error:

      {
        "index": "logstash-2018-11-01",
        "shard": 0,
        "node": "aodScdJuQ5OWLyucQ6Px5Q",
        "reason": {
          "type": "search_context_missing_exception",
          "reason": "No search context found for id [3708723]"
        }
      }

This error is typically caused by the following:

Reindexing uses the scroll api under the covers to read a “point-in-time” view of the source data.

This point in time consists of a set of segments (Lucene files) that are essentially locked and prevented from being deleted by the usual segment merging process that works in the background to reorganise the index in response to ongoing CUD (create/update/delete) operations.

It is costly to preserve this view of data, which is why users of the scroll API must renew their lock with each new request for a page of results. Locks will time out if the client fails to return within the timespan it said it would.
The error you are seeing is because the reindex function has requested another page of results but the scroll ID which represents a lock on a set of files is either:

* timed out (i.e the reindex client spent too long indexing the previous page) or
* lost because the node serving the scroll api was restarted or otherwise became unavailable

(Source: https://discuss.elastic.co/t/problem-when-reindexing-large-index/117421)

Looking at the Elasticsearch logs I can see:

[INFO ][o.e.t.LoggingTaskListener] 3406201 finished with response BulkByScrollResponse[took=2.1h,timed_out=false,sliceId=null,updated=0,created=37036000,deleted=0,batches=37036,versionConflicts=0,noops=0,retries=0,throttledUntil=0s,bulk_failures=[],search_failures=[{"shard":-1,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [1102015]"}}, {"shard":-1,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [1102016]"}}, {"shard":-1,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [1102023]"}}]]
[INFO ][o.e.m.j.JvmGcMonitorService] [server-ls2.local] [gc][40994] overhead, spent [335ms] collecting in the last [1s]

[WARN ][o.e.t.TransportService   ] [server-ls2.local] Received response for a request that has timed out, sent [51770ms] ago, timed out [21769ms] ago, action [internal:discovery/zen/fd/master_ping], node [{server-ls5.local}{aM_wxa2mTY2XI9P7bsobSg}{gsNCWS-GQ0md1u-yuhxSNw}{192.168.11.55}{192.168.11.55:9300}{site_id=rack1, ml.machine_memory=67279155200, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], id [11840408]

So there was a timeout issue - but the cause is unknown at this time….
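(One knob worth trying next time - an assumption on my part from the reindex documentation rather than something verified here - is the ‘scroll’ parameter, which keeps the search context alive longer, combined with running the reindex as a background task so progress can be checked afterwards. The index names below are just examples:)

curl -X POST "localhost:9200/_reindex?scroll=30m&wait_for_completion=false" -H 'Content-Type: application/json' -d'
{
  "source": { "index": "logstash-2018-11-*" },
  "dest":   { "index": "logstash-2018-11" }
}'

# Returns a task id - progress can then be checked with:
curl "localhost:9200/_tasks?detailed=true&actions=*reindex"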

Debian Lighttpd does infinite redirect loop and fails to connect

Just imagine you’re running a blog that requires zero maintenance and one day you access it and it doesn’t load!

You try Firefox and then Chrome and finally Edge (the new IE)

You notice that Firefox and Chrome seem to loop and then finally fail - You notice that Edge works….

You notice that cURL works.

Things are but aren’t working.

Finally you notice Firefox is trying to do TLS 1.3! Interesting, so how do I disable that on Debian 9 with Lighttpd? You can’t!

What’s the fix?

In lighttpd.conf, in your SSL section, add:

ssl.disable-client-renegotiation = "disable"

ssl.disable-client-renegotiation exists because of a TLS renegotiation bug back in 2009 - that bug has long been patched in newer versions of OpenSSL, so it is safe to turn client renegotiation back on.

Disabling this setting allowed you to find the answer to your troubles :-)
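(For the record, a quick way to see what protocol and cipher actually get negotiated - the hostname is a placeholder:)

openssl s_client -connect blog.example.com:443 -servername blog.example.com < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'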