Data Science & AI Support Specialist at Spektrum
Den Haag, Zuid-Holland, Netherlands - Full Time


Start Date

Immediate

Expiry Date

30 Aug, 25

Salary

0.0

Posted On

31 May, 25

Experience

3 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

SQL, English, R, Data Processing, Data Science, Computer Science, Languages, Data Analytics, Python, XML, Models

Industry

Information Technology/IT

Description

Spektrum have a wide range of exciting opportunities in several global locations.
We are always looking to add great new talent to our team and look forward to hearing from you.
Spektrum supports apex purchasers (NATO, UN, EU, and National Government and Defence) and their Tier 1 supplier ecosystem with a wide range of specialist services. We provide our clients with professional services, specialised aerospace and defence sales, delivery, and operational subject matter expertise. We are looking for personnel to join our team and support key client projects.

WHO WE ARE SUPPORTING

The NATO Communications and Information Agency (NCIA) is responsible for providing secure and effective communications and information technology (IT) services to NATO’s member countries and its partners. The agency was established in 2012 and is headquartered in Brussels, Belgium.

The NCIA provides a wide range of services, including:

  • Cyber Security: The NCIA provides advanced cybersecurity solutions to protect NATO’s communication networks and information systems against cyber threats.
  • Command and Control Systems: The NCIA develops and maintains the systems used by NATO’s military commanders to plan and execute operations.
  • Satellite Communications: The NCIA provides satellite communications services to enable secure and reliable communications between NATO forces.
  • Electronic Warfare: The NCIA provides electronic warfare services to support NATO’s mission to detect, deny, and defeat threats to its communication networks.
  • Information Management: The NCIA manages NATO’s information technology infrastructure, including its databases, applications, and servers.

Overall, the NCIA plays a critical role in ensuring the security and effectiveness of NATO’s communication and information technology capabilities.

ESSENTIAL SKILLS AND EXPERIENCE

  • At least 3 years of practical experience in data science and/or data analytics;
  • Experience using data processing/visualization/analytics software packages and development environments, preferably KNIME, VS Code, GitLab, Power BI, JupyterLab, and Docker-based APIs;
  • Experience processing Big Data, creating and using containerized building blocks, and running containers (APIs) on Kubernetes clusters;
  • Experience with programming/scripting in languages like Python, R and SQL, and working with data formats like CSV, XML and JSON;
  • Experience with content extraction from files/databases/systems, (LLM-based) embedding models, entity extraction, keyword extraction and content similarity measures;
  • Creative, flexible and proactive in overcoming obstacles;
  • Good drafting, communication and presentation skills in English, at both technical and non-technical levels;
  • High attention to detail and accuracy.

EDUCATION

  • Master’s degree in Computer Science, Engineering or a relevant field.
  • A higher degree in Data Science is preferred.

Responsibilities

ROLE BACKGROUND

The NATO Communications and Information Agency (NCIA), located in The Hague, Netherlands, is currently processing vast amounts of highly varied data coming from theatre for the purpose of efficient archiving. As part of these activities, the Exploiting Data Science and Artificial Intelligence (EDS&AI) team within the NCIA Chief Technology Office is tasked with applying Big Data and AI technology to prepare, run and adjust pipelines that process various source data into archiving formats and metadata, and prepare it for (semantic) search. NATO has an obligation to support national investigations into situations that occurred in theatre. To support the different teams involved as effectively as possible, the EDS&AI team brings the expertise to extract and exploit the vast and varied data at hand, using the Agency’s high-performance computing classified sandbox. The EDS&AI team provides the core data science skills and technology needed for big data analysis and AI, and applies innovative technology to data whenever conventional approaches cannot extract value.

ROLE DUTIES AND RESPONSIBILITIES

The services described below will be provided to the NCIA CTO/EDS&AI team, as they deliver specialised Data Science and AI results to their stakeholders in NATO Headquarters and NATO Allied Command Operations. Overarching objectives:

  • Make required documents from theatre accessible and searchable by archivists during execution
  • Capture document contents into long term preservation formats
  • Capture Functional Area System (FAS; back-up) contents into long term preservation formats
  • Identify (and remove) duplicate documents, records of temporary value and non-records that are not required for archiving
  • Provide (interim/final) data reports describing actions and results
  • Setting up / improving pipelines that process all required documents and that uniquely identify and trace decisions and processing steps. This is to be conducted on the provided classified sandbox environment, with the provided high-performance hardware and toolsets.
  • Implementing / improving (missing) pipeline steps for marking duplicate files, based on file attributes, path (structure) and content (similarity), and on rules for considering a file or structure a duplicate.
  • Extracting document-format records from Functional Area System (FAS) databases and from back-ups made by other means. Archiving SMEs and system SMEs are available for guidance on target formats, source system structure and data interpretation. Each FAS is processed separately; not all sprints touch upon this item.
  • Processing various office, image and video file types into the accepted archiving formats and monitoring progress, including extraction of metadata and preparation of semantic search indexes.
  • Automating the registration of all processed documents and their semantic indexes with the sandbox natural language search tool.
  • Automating the final copy of all non-duplicate and extracted archive documents with content and metadata to the NATO archiving system.
  • Reporting status, progress and statistics of the (raw) files being processed to archive formats, metadata and search indexes.
  • Delivering full reporting of results, a trace of pipeline steps taken and (stakeholder-)accepted failures.