Software Engineer in Data
at Speechmatics

London, England, United Kingdom

Start Date	Expiry Date	Salary	Posted On	Experience	Skills	Telecommute	Sponsor Visa
Immediate	19 Jul, 2024	Not Specified	19 Apr, 2024	N/A	Docker,Instrumentation,Speech,Code,Git,Sql,Pipeline Development,Beam,Airflow	No	No

Description:

Speechmatics is a cutting-edge applied AI research company that is breaking down cultural barriers by harnessing the power of speech. Our modelling pipelines turn millions of hours of audio into one of the world’s most accurate Speech Intelligence platforms. In the coming months, we’re aiming to gather millions more hours of data to grow the capabilities of our APIs. As we grow the number of languages we understand from 50 to over 100 and train our next-generation models, we’re looking to revolutionise the way we think about data. To boost this revolution, we’re looking for a talented Software Engineer in Data to join our team.
As a Software Engineer in Data, you will own the sourcing of audio and text data for a range of languages and diverse voices. This includes designing, deploying and maintaining much of our data tooling. You will also play a role in our understanding of new languages by working with native speakers to preprocess data and evaluate models. Working collaboratively with our Machine Learning Engineers, you will train speech recognition models and build tools and dashboards to analyse their performance. By sharing your insights with other teams and our external partners, you will drive the growth of Speech Intelligence and our mission to ‘Understand Every Voice’.

DESIRED EXPERIENCE INCLUDES:

Strong software engineering skills, e.g. Python, Git, CI/CD pipelines, Docker
ETL pipeline development for processing large datasets, particularly text or audio (e.g. Prefect, Airflow, Beam)
Data stores for large-scale datasets (such as Parquet, key-value databases, SQL)
Distributing code across HPC/Spark/Kubernetes clusters
Building dashboards and instrumentation to monitor pipeline performance
Previous experience with speech or text data in ML/NLP applications including deep learning frameworks like PyTorch; this is a plus but not required

WHO WE ARE:

Speechmatics is the leading expert in Speech Intelligence, and uses AI and Machine Learning to unlock business value in human speech worldwide. We work with an amazing mix of global companies, and our technology can integrate into our customers’ stack irrespective of their industry or use case – making it the go-to solution to harness useful information from speech. We have recently raised $62 million at Series B and continue to grow positively.
Joining us means working with some of the smartest minds around the world, focused on cutting-edge projects and deploying the latest techniques to disrupt the market. We believe in putting people first; we’ll do all we can to help you develop your skills and give you the tools you need to thrive. We support people to work wherever they work best and also understand the importance of coming together to collaborate, socialise and build relationships.
This is only the beginning; we’re looking for amazing people like you to continue our journey…
At Speechmatics, our mission is simple: understanding every voice out there. That’s not just about our tech – it’s the heart and soul of who we are.
We welcome different experiences, viewpoints, and identities. For us, it’s not just the right thing to do; it’s our catalyst for sparking innovation and creativity. Our teams thrive in an environment that celebrates and supports everyone – no matter their gender, identity or expression, race, disability, age, sexual orientation, religion, belief, marital status, national origin, veteran status, pregnancy, or maternity status.
But we don’t just open the door to diversity – we actively welcome it. Why? Because we believe every unique voice adds something special to our team, leading us to smarter solutions and a better workplace.
So, come as you are and join our Speechling community. We’re building a place where every voice not only gets heard but is also respected and valued.
For more information on us, please visit our website and follow Speechmatics on our social channels via Twitter, Facebook, LinkedIn, and YouTube.

Responsibilities:

YOU’LL THRIVE IN THIS ROLE IF YOU:

Have experience developing highly scalable ETL pipelines for preprocessing hundreds of TBs of data, including pipeline monitoring for performance metrics
Excel at taking ownership of projects from end-to-end, including data acquisition, ingestion and indexing
Enjoy diving deep into results to identify the strengths and weaknesses of models
Keep up-to-date with the latest developments in data preprocessing techniques for machine learning
Are a code-optimising guru, building tools to streamline workflows (when off-the-shelf solutions won’t do)

REQUIREMENT SUMMARY

Experience: Min: N/A, Max: 5.0 year(s)

Industry: Information Technology/IT

Functional area of job: IT Software - System Programming

Domain: Software Engineering

Qualifications: Graduate

English Proficiency: Proficient

Number of posts: 1

Address of job: London, United Kingdom
