Data Engineer - Content & Innovation

IT/Technical & Product Development
Requisition #: OPE001NR

In line with the Elsevier corporate strategy of greater content volume, types and sophistication, the services that Elsevier provides are increasingly powered by applications of data science. We are looking for a Data Engineer who can focus on preparing, integrating and normalizing data in order to design and create high-quality data sets. These data sets are the foundation for systems powered by machine learning. Data science in our context ranges from article submission and recommender systems to information extraction and other classification systems. In all workflows, meta-data and other structured data that authors or other human agents add to our publications must be captured and put to maximum effect.


As a Data Engineer you will be working with many groups developing our content and information offering to end customers. These services rely on existing text and data mining as well as content structuring and meta-data generation processes. Many of these processes rely on human interaction and creation. We want to introduce more automation by capturing human inputs and backing the submission and annotation systems with machine learning engines. Ultimately, machine learning tools may suggest annotations and structured meta-data that are as good as or better than human-generated data.


As a Data Engineer you have a very solid grounding in software engineering, optimization of processes, coding practices, data standards, storage options and cloud infrastructure. Good knowledge of state-of-the-art tooling for capturing content and translating human annotations to machine models is helpful, but ultimately it is the team collaboration with data scientists that unleashes the full potential of our data, which is your work. If you can show that you understand the product cycle, from front-end functionality to back-end requirements, that will make you a great addition to the Content & Innovation team.


You will be working between Elsevier Operations and Technology with a varied, cross-functional team of Technology and Product colleagues to pilot and develop new methods of extracting and surfacing information relevant to our customers for new product development. When successful, the Data Engineer will support the implementation of industry-scale, high-quality production systems. You will work closely with the publishing, content modelling and NLP teams. Sample projects may include article reference structuring / resolution, institution / author disambiguation, and concept / keyword suggestion and normalization.


As an information solutions provider, Elsevier is looking for someone who can work with information from internal and external sources that follows different (or no) data standards. The ideal candidate will have industry experience solving meta-data normalization problems and will apply that experience to all of the above areas.


What we are looking for:

  • Technical skills should include software development experience in a curly-brace language or Python, as well as scripting abilities. You should also be comfortable writing queries, handling data (ETL), and using *nix systems, open source software and libraries.
  • Excellent knowledge of, and proficiency in, one or more programming languages (e.g. Python, Java)
  • Experience in web development, understanding of web services, APIs
  • Database querying languages (SQL or similar)
  • Experience in environments for big data engineering and distributed computation (e.g. Spark environments such as DataBricks, Zeppelin)
  • Basic knowledge of software version control systems (git preferred)
  • Ability to write scripts for task automation
  • Experience in gathering requirements for software
  • Familiarity with cloud technologies, e.g. Amazon Web Services (AWS)
  • Openness to working with new technologies
  • Curiosity about algorithm development

Education, Knowledge, Skills and Experience:

  • University degree (Master's level) in computer science or a related area.
  • Experience working with ETL, data handling and cloud technology in (Big) Data environments.
  • Experience with data mining is very beneficial; industry experience is a big bonus.
  • Ability to drive new developments and implement process changes and disruptive technologies in the organization.
  • Familiarity with agile software development.
  • Good communication and documentation skills with the ability to convey complex technical concepts to non-technical professionals.
  • Ability to improve the efficiency of existing code, always considering performance factors.

Choosing to work at Elsevier is choosing to use your talents in a professionally challenging environment, where your personal development and technical expertise will be valued and rewarded. You will work on a truly worthwhile endeavour without sacrificing any career advantages, while actually enjoying greater professional opportunities. It’s about making an intelligent choice.
