Principal Engineer, Data Engineering at Sandisk
Bengaluru, Karnataka, India
Full Time


Start Date

Immediate

Expiry Date

04 Feb, 26

Salary

0.0

Posted On

06 Nov, 25

Experience

10 year(s) or above

Remote Job

Yes

Telecommute

Yes

Sponsor Visa

No

Skills

HPC Administration, Linux/Unix System Administration, Shell Scripting, NFS, Storage Management, Backup Management, Networking Principles, Analytical Skills, Problem-Solving Skills, Communication Skills, Collaboration, HPC Cluster Software, Containerization Technologies, Workload Orchestration, Scripting Languages, Semiconductor Domain Experience

Industry

Semiconductor Manufacturing

Description
Company Description
Sandisk understands how people and businesses consume data, and we relentlessly innovate to deliver solutions that enable today’s needs and tomorrow’s next big ideas. With a rich history of groundbreaking innovations in Flash and advanced memory technologies, our solutions have become the beating heart of the digital world we’re living in and that we have the power to shape. Sandisk meets people and businesses at the intersection of their aspirations and the moment, enabling them to keep moving and pushing possibility forward. We do this through the balance of our powerhouse manufacturing capabilities and our industry-leading portfolio of products that are recognized globally for innovation, performance and quality.

Sandisk has two facilities recognized by the World Economic Forum as part of the Global Lighthouse Network for advanced 4IR innovations. These facilities were also recognized as Sustainability Lighthouses for breakthroughs in efficient operations. With our global reach, we ensure the global supply chain has access to the Flash memory it needs to keep our world moving forward.

Job Description
System Design and Deployment: Designing and deploying high-performance computing clusters and systems based on organizational requirements and industry best practices. Configuring hardware components, network and storage systems to optimize performance and reliability.

System Maintenance and Monitoring: Performing routine maintenance tasks such as software updates, patches, and system upgrades to ensure optimal performance and security. Monitoring system performance and resource utilization, and carrying out capacity planning, to proactively address potential issues and bottlenecks.

User Support and Training: Providing technical support and troubleshooting assistance to users of the HPC systems. Developing and delivering training sessions to educate users on best practices, usage guidelines, and efficient utilization of HPC resources.

Security and Compliance: Implementing and maintaining security protocols, access controls, and data protection measures to safeguard HPC infrastructure and sensitive data. Ensuring compliance with relevant regulatory requirements and organizational policies related to HPC operations.

Documentation and Reporting: Creating and maintaining comprehensive documentation including system configurations, operational procedures, and troubleshooting guides. Generating regular reports on system performance, usage statistics, and operational metrics for management and stakeholders.

Qualifications
Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
Proven experience (8+ years) as an HPC Administrator or in a similar role managing HPC systems in a production environment.
Proficiency in configuring and managing HPC cluster software such as Slurm, NC, LSF or Grid Engine.
Strong knowledge of Linux/Unix system administration and shell scripting.
Experience with NFS, storage (NetApp/Isilon), and backup management in HPC environments.
Familiarity with networking principles, including TCP/IP, VLANs, and InfiniBand.
Excellent analytical and problem-solving skills with the ability to troubleshoot complex issues independently.
Strong communication skills and the ability to collaborate effectively with cross-functional and cross-geography teams and end users.

Preferred Skills:
Bachelor’s degree in Computer Science, Engineering, or a related discipline.
Experience or certification in HPC technologies (e.g., HPC Systems Professional, Cray Certified System Administrator).
Knowledge of containerization technologies (e.g., Docker, Singularity) and workload orchestration frameworks (e.g., Kubernetes) is a plus.
Knowledge of scripting and automation tools such as shell and Ansible, commonly used in Unix administration, is a plus.
Knowledge of Dell/Cisco UCS servers in HPC environments.
Semiconductor domain experience is a must.

Additional Information
Sandisk thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Sandisk is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at [email protected] to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.
Responsibilities
The Principal Engineer will design and deploy high-performance computing clusters and systems, ensuring optimal performance and reliability. They will also provide user support, maintain security protocols, and generate reports on system performance.