We are looking for a Senior Data Analyst, Disclosure Evolution, to lead and deliver the goals of the Question to Score project. This project covers data science models and the use of third-party data enrichment for company and city disclosure on environmental issues. The successful candidate will lead this work across all thematic areas, ensure the data science models are aligned with the principles set out in the Question to Score project, and work closely with the Data & Insights Department to ensure cross-organization consistency. The candidate will work closely with the Senior Manager, Question to Score and will report to the Head of Data Science & Products.
This role requires some experience in handling environmental data and an understanding of how environmental issues can affect, and are affected by, businesses, cities, states, and regions. The successful candidate must have a working knowledge of machine learning, with specialized knowledge of NLP. This is a crucial role in building trend analyses and enabling similarity scoring across previous questionnaire responses. The successful candidate will need to demonstrate the ability to work and communicate effectively with others, including stakeholders and thematic teams, to ensure processes are followed, deliverables are aligned to milestones, and outputs are built to agreed quality standards.
Key responsibilities include:
Delivery of Question to Score evolution projects, focusing on the development of machine learning pipelines that leverage well-known NLP libraries to aid the automatic evaluation of textual responses within operational execution-time limits and business requirement guidelines. This includes:
o Working with the Data and Insights Department to shape the new questionnaires into an automatable state, while fulfilling the mission objectives for both cities and companies.
o Ideation, design, prototyping and deployment of Data Science products, powered by scalable machine learning models.
o Determining, with the help of the data science team, key data points to facilitate the assessment of trends in key environmental metrics and performance indicators against organizational environmental targets.
o Defining how targets perform against transition norms for organizations, using both quantitative and qualitative data points and helping establish model parameters that will help with these assessments.
Deliver presentations and data storyboards to communicate the value of your work to management and stakeholder audiences.
Help in the preparation and provision of curated datasets.
Promote data-driven decision making and educate on the meaning and principles of data science.
Productionise analysis pipelines through a cloud toolset and host static and dynamic presentations of the generated insight.
Required skills and experience:
MSc/PhD in Computer Science, Statistics, Mathematics or similar.
At least 2 years of experience using open-source programming languages for large-scale analysis (Python, R, PySpark), working with databases and data stores (MongoDB, DataStax, Teradata, Parquet, Hive), and using SQL to query databases.
A strong mathematical and statistical background with a deep understanding of statistical inference, experimental design, sampling, and simulation.
Some experience in the training and production of machine learning models using both structured and unstructured data in big data pipelines, for example on AWS, Azure, Google Cloud or others.
Excellent data visualization skills using Power BI or similar tools.
Experience with well-known code libraries for data preprocessing (pandas, dplyr, tidyr, scipy, feature-engine, beautiful soup, scrapy, spacy, nltk, TextBlob, fastText, polyglot, requests, json, functools).
Good technical communication and presentation skills in English.
Ability to work in a matrix environment within a virtual team.
Excellent problem-solving skills.
Project experience with NLP, text analytics and other relevant areas (e.g. text classification, topic detection, information extraction, named entity recognition, entity resolution, question answering, sentiment analysis, event detection, language modelling).
Desired skills and experience:
Familiarity with GitHub, Linux, and shell scripting (bash).
Experience working as part of a scrum team.
Experience with automated testing scripts and environment promotion pipelines (e.g. Azure DevOps pipelines).
A good understanding of GHG and sustainability data.
Knowledge of the financial system and capital markets.
An awareness of environmental issues, particularly as they relate to our core themes of Climate Change, Deforestation and Water Security.
Ambition to enable and coach colleagues as part of an expanding organization with a growing data science team.
This is a full-time, fixed-term 14-month contract, based remotely for the first 6-8 months, after which the role could move to the London office.
Salary and benefits: 40k-48k, 30 days holiday excluding bank holidays, flexible working opportunities and other benefits.
Interested applicants must be legally eligible to work in the United Kingdom.