Multimodal Embeddings for Telecom Images at Ericsson
Massy, Ile-de-France, France
Full Time


Start Date

Immediate

Expiry Date

19 Mar, 26

Salary

0.0

Posted On

19 Dec, 25

Experience

2 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

Telecommunication Systems, Network Architectures, Computer Vision, Multimodal Learning, Representation Learning, Transformer-Based Architectures, Vision Transformers, CLIP, Python, Deep Learning Frameworks, PyTorch, TensorFlow, Cloud-Based AI Platforms, AWS, Generative AI, Large Multimodal Models

Industry

Telecommunications

Description
Design evaluation benchmarks for telecom image understanding (e.g., retrieval, captioning, or visual grounding tasks). Analyze performance against generic VLMs and report improvements in domain alignment and factual grounding.

Basic understanding of telecommunication systems and network architectures (4G/5G or similar).
Interest in computer vision, multimodal learning, and representation learning.
Familiarity with Transformer-based architectures (Vision Transformers, CLIP, or similar).
Experience with Python and deep learning frameworks (PyTorch, TensorFlow, or similar).
Experience with cloud-based AI platforms, preferably AWS and Amazon Bedrock (or similar cloud LLM/VLM services).
Experience with version control and collaboration platforms (Git/GitLab, or similar).
Curiosity about Generative AI, Large Multimodal Models, and their applications in telecom.
Qualities of fast learning, critical thinking, autonomy, and teamwork.
Willingness to work in an inclusive, research-oriented, and multicultural environment.
Fluent English language skills in both writing and conversation.
Responsibilities
Design evaluation benchmarks for telecom image understanding tasks such as retrieval, captioning, or visual grounding. Analyze performance against generic VLMs and report improvements in domain alignment and factual grounding.